ragxplorer's People

Contributors

aniruddha-adhikary, dan-s-mueller, eltociear, gabrielchua, ibrahimroshdy, jgalego, ruanwz, sweep-ai[bot], tedsecretsource, vince-lam

ragxplorer's Issues

Add a link or button to reset the form so you can upload a new document after analyzing an initial document

Is your feature request related to a problem? Please describe.
Currently, once you upload and analyze a document, you have to refresh the application to analyze a new document. This is a request to add a button or link to the form so you can more easily analyze a new document.

Describe the solution you'd like
Add a link to the upper left-hand corner (in the header) to go back to the main form.

Describe alternatives you've considered
Display the whole form below the graph.

Data Privacy

Details

Are PDFs uploaded to the hosted web app stored somewhere, or are they deleted immediately after the session ends?

Retrieved IDs unrelated to user query

Description

RAGxplorer returns seemingly random results for a query (see image). This happens because chromadb returns documents ordered lexicographically by ID ("1", "10", "100", and so on) rather than numerically ("0", "1", "2").
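The ordering problem described above can be reproduced without a database. The sketch below uses a made-up list of string IDs (not taken from RAGxplorer's code) to show how string sorting differs from numeric sorting, and two possible ways to get the expected order back: sorting with an integer key, or zero-padding the IDs when they are created.

```python
# Hypothetical chunk IDs, stored as strings the way a vector store keeps them.
ids = [str(i) for i in range(12)]

# Plain string sorting compares character by character, so "10" sorts before "2".
lexicographic = sorted(ids)

# Option 1: sort with an integer key to recover the order the chunks were created in.
numeric = sorted(ids, key=int)

# Option 2: zero-pad IDs at creation time so string order and numeric order agree.
# The width of 4 is an arbitrary choice for this example.
padded = [i.zfill(4) for i in ids]

print(lexicographic)  # ['0', '1', '10', '11', '2', '3', '4', '5', '6', '7', '8', '9']
print(numeric)
print(sorted(padded) == padded)
```

Either option works only if every consumer of the IDs applies it consistently; mixing padded and unpadded IDs would reintroduce the mismatch.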

Configuration

image

The graph has low accessibility (dark blue dots on a black background)

Describe the bug
The graph's contrast ratio is low (dark blue dots on a black background), which makes the points hard to see and reduces accessibility.

To Reproduce
Steps to reproduce the behavior:

  1. Upload a file
  2. Enter search terms and press Enter
  3. The graph displays on a black background with dark blue dots representing the vectors

Expected behavior
The contrast between the dots and the background should be high enough that the distribution of points is easy to see.
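To put a number on "great enough", one option is the WCAG contrast ratio, which compares the relative luminance of two sRGB colors. The sketch below implements that formula; the specific RGB values for "dark blue" and for the lighter alternative are assumptions made for illustration, since the issue does not state the exact colors used in the plot.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: tuple) -> float:
    """Relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: tuple, color_b: tuple) -> float:
    """WCAG contrast ratio, from 1 (identical) to 21 (black on white)."""
    lighter, darker = sorted(
        (relative_luminance(color_a), relative_luminance(color_b)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


BLACK = (0, 0, 0)
DARK_BLUE = (0, 0, 139)      # assumed stand-in for the current dot color
LIGHT_BLUE = (100, 181, 246)  # assumed example of a higher-contrast alternative

print(round(contrast_ratio(DARK_BLUE, BLACK), 2))
print(round(contrast_ratio(LIGHT_BLUE, BLACK), 2))
```

WCAG 2.1 success criterion 1.4.11 asks for at least 3:1 for graphical objects, so a check like this could be used to pick a dot color that clears that threshold against the plot background.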

Add more dimensionality reduction techniques

Is your feature request related to a problem? Please describe.

The package currently only applies UMAP dimensionality reduction. More dimensionality reduction techniques, like t-SNE and PCA, could be added to ragxplorer to improve functionality.

Describe the solution you'd like

Add t-SNE and PCA dimensionality reduction techniques to the package by updating the projections.py and ragxplorer.py scripts.

An additional parameter of dim_reduct will be added to the load_pdf and visualize_query methods. This parameter will have a default argument of UMAP and can also take t-SNE and PCA as inputs.
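A minimal sketch of how such a dim_reduct argument might be dispatched is shown below. The helper name make_reducer and its n_components and random_state arguments are hypothetical and are not part of ragxplorer; the sketch assumes scikit-learn and umap-learn are installed, and it imports them lazily so that only the selected backend is required.

```python
def make_reducer(dim_reduct: str = "UMAP", n_components: int = 2, random_state: int = 0):
    """Return an unfitted reducer for the requested technique.

    Hypothetical helper: accepts "UMAP", "t-SNE", or "PCA" (case-insensitive).
    """
    name = dim_reduct.strip().lower().replace("-", "")
    if name == "umap":
        import umap  # provided by the umap-learn package
        return umap.UMAP(n_components=n_components, random_state=random_state)
    if name == "tsne":
        from sklearn.manifold import TSNE
        return TSNE(n_components=n_components, random_state=random_state)
    if name == "pca":
        from sklearn.decomposition import PCA
        return PCA(n_components=n_components, random_state=random_state)
    raise ValueError(
        f"Unknown dim_reduct {dim_reduct!r}; expected 'UMAP', 't-SNE', or 'PCA'."
    )
```

Each returned object exposes fit_transform, so a call such as make_reducer("PCA").fit_transform(embeddings) yields 2-D coordinates. One design point to keep in mind: scikit-learn's TSNE has no separate transform method, so unlike UMAP and PCA it cannot project a new query onto an already fitted map; the query embedding would have to be included in the data passed to fit_transform.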

Feedback and Suggestions to Improve this Project

First and foremost, I want to express my heartfelt thanks to all of you for showing interest in this project. It's incredibly humbling and exciting to see others taking notice of something I built.

As this is my first time writing code that's being used by others, I am keenly aware that there's a lot I can learn and many ways in which the project can be improved. That's where I need your help!

I'm looking for suggestions on how best to carry this project forward and organize the code more effectively. If you have any ideas, best practices, or tips, please don't hesitate to share. Your insights will be invaluable in making this a better and more user-friendly project.

I also ask for your patience and understanding regarding the current state of the code. I'm aware that it may not be up to the professional standards yet, and I'm fully committed to learning and improving. Any constructive feedback or advice in this regard would be greatly appreciated.

Please feel free to post your suggestions, feedback, or any questions you might have as responses to this issue. I'm looking forward to reading your input and engaging in discussions that can lead to the betterment of this project.

Install, nothing worked

When I ran

pip install -r xxxxxx

nothing worked:

ERROR: Could not find a version that satisfies the requirement pysqlite3-binary (from versions: none)
ERROR: No matching distribution found for pysqlite3-binary
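The report does not say which operating system or Python version was used. One possible cause, stated here as an assumption, is that pysqlite3-binary publishes no wheel for that platform, in which case pip reports "from versions: none". If the package is only needed to supply a newer SQLite on Linux hosts, a sketch like the following lets the code fall back to the standard library sqlite3 module wherever pysqlite3 is not installed. The helper name is hypothetical, and whether ragxplorer performs a swap like this is not confirmed by the issue.

```python
import importlib
import sys


def use_pysqlite3_if_available() -> bool:
    """Swap in pysqlite3 for sqlite3 when it is installed; otherwise do nothing.

    Returns True if the swap happened, False if the standard library module is kept.
    """
    try:
        pysqlite3 = importlib.import_module("pysqlite3")
    except ImportError:
        return False
    # Later "import sqlite3" statements will now receive pysqlite3 instead.
    sys.modules["sqlite3"] = pysqlite3
    return True


swapped = use_pysqlite3_if_available()

import sqlite3  # resolves to pysqlite3 only if the swap above succeeded

print("using pysqlite3:", swapped)
print("SQLite version:", sqlite3.sqlite_version)
```

With a guard like this in place, the requirements file could restrict the dependency to platforms that have a wheel by using an environment marker, for example pysqlite3-binary; sys_platform == "linux".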

Re-Build Streamlit App

Is your feature request related to a problem? Please describe.

  • As of the next version, the Streamlit app has been removed.

Describe the solution you'd like

  • Add the Streamlit app back.

Describe alternatives you've considered

  • Host the Streamlit app in a separate repo.

Add ability to save visualization data and connect to existing database.

Is your feature request related to a problem? Please describe.
Having to load a PDF every time you use the tool can take a long time. It is also useful to look at multiple PDFs. Furthermore, since UMAP is stochastic, having the ability to reproduce results would be helpful for those performing studies.

Describe the solution you'd like
Add functionality to connect to existing chromadb databases.
Add export functionality for the fitted UMAP function and its projections, so that they can be loaded back in later.
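As a rough sketch of the export half of this request, the projections could be written to a JSON file together with the settings that produced them, then read back later without recomputing anything. The function names, the file layout, and the example values below are all hypothetical; the sketch covers only saving and reloading 2-D coordinates, not serializing the fitted UMAP model or connecting to a database.

```python
import json
import tempfile
from pathlib import Path


def save_projections(path, ids, projections, settings):
    """Write document IDs, their 2-D coordinates, and the reducer settings to JSON."""
    payload = {
        "settings": settings,
        "ids": list(ids),
        # Cast to plain floats so NumPy values, if any, serialize cleanly.
        "projections": [[float(v) for v in row] for row in projections],
    }
    Path(path).write_text(json.dumps(payload), encoding="utf-8")


def load_projections(path):
    """Read back what save_projections wrote: (ids, projections, settings)."""
    payload = json.loads(Path(path).read_text(encoding="utf-8"))
    return payload["ids"], payload["projections"], payload["settings"]


# Round trip with made-up data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "projections.json"
    save_projections(
        target,
        ids=["0", "1"],
        projections=[[0.5, -1.25], [2.0, 3.5]],
        settings={"method": "UMAP", "random_state": 42},
    )
    loaded_ids, loaded_coords, loaded_settings = load_projections(target)

print(loaded_ids, loaded_coords, loaded_settings)
```

Recording random_state alongside the coordinates matters for the reproducibility concern raised above: umap-learn accepts a random_state argument, so a stored seed lets a later run attempt to regenerate the same layout.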

Describe alternatives you've considered
I have created a forked version of this repository which performs these tasks: https://github.com/dsmueller3760/RAGxplorer/tree/load_db (under load_db branch).

Additional context
N/A
