IntelLabs / fastRAG
Efficient Retrieval Augmentation and Generation Framework
License: Apache License 2.0
Hey maintainers. I'm trying to curate an index of Haystack integrations and projects (which for now I've simply named 'integrations'). These are custom nodes, document stores, pipelines and so on that can be used with Haystack, or built with Haystack.
I've created a PR adding this repo to that index too (deepset-ai/haystack-integrations#3) and was wondering if any of you would like to take a look and let me know whether you're OK with having it added there. The idea would be to render the markdown page somewhere on our website later.
"We use GPT-3 Curie (Brown et al., 2020b) as the supervision LM to compute the LM likelihood."
How can one obtain or estimate token probabilities from the black-box GPT-3 API?
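For context, the legacy OpenAI Completions API could return per-token log-probabilities for the prompt itself when called with echo=True and a logprobs value, which is the usual way to score text under a black-box GPT-3 model (as I understand the legacy API, the first prompt token comes back with a null logprob and must be skipped). A minimal sketch of turning such per-token values into a sequence likelihood, using made-up numbers:

```python
import math

# Hypothetical per-token log-probabilities for a scored passage, e.g. taken
# from response["choices"][0]["logprobs"]["token_logprobs"] of the legacy
# Completions API called with echo=True (the leading None entry dropped).
token_logprobs = [-2.1, -0.4, -3.3, -0.9]

# The sequence log-likelihood is the sum of per-token log-probabilities.
log_likelihood = sum(token_logprobs)

# Perplexity is the exponentiated average negative log-likelihood.
perplexity = math.exp(-log_likelihood / len(token_logprobs))

print(round(log_likelihood, 6))
print(round(perplexity, 3))
```

The same arithmetic applies whatever LM produces the per-token logprobs; only the API call that obtains them is GPT-3-specific.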
I want to use passages in different languages; is that possible?
Hello,
I just discovered this repo. I was wondering if it's possible to plug in custom fine-tuned embedding models.
Is this repo being actively worked on? Any future roadmap that can be shared with the community would be very helpful.
Thanks in advance,
Karrtik
Thanks for your help!
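On the custom-embedding question: a dense retriever ultimately just needs a function that maps texts to vectors, so a fine-tuned embedding model plugs in at that seam (in Haystack v1, dense retrievers generally take an embedding_model argument that also accepts a local path; that is my understanding, not verified for every retriever type). A toy sketch of the seam, with a stand-in hashing embedder in place of a real model:

```python
import numpy as np

def embed(texts, dim=512):
    """Stand-in embedder: hashed bag-of-words vectors, L2-normalized.
    Replace this with your fine-tuned model, e.g.
    SentenceTransformer('path/to/your-model').encode(texts)."""
    vecs = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for tok in text.lower().split():
            vecs[i, hash(tok) % dim] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-9)

docs = [
    "fastRAG builds efficient retrieval pipelines",
    "bananas are a yellow tropical fruit",
]
doc_vecs = embed(docs)

# Rank documents by cosine similarity (dot product of unit vectors).
query_vec = embed(["efficient retrieval augmented pipelines"])[0]
scores = doc_vecs @ query_vec
best = docs[int(np.argmax(scores))]
print(best)
```

Everything except the body of embed() stays the same when you swap in a real fine-tuned model.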
Hello, amazing community!
I'm exploring the integration of two powerful libraries: MiniAutoGen and fastRAG, and I would greatly appreciate your help and insights!
MiniAutoGen is an innovative open-source library designed to take applications with Large Language Models (LLMs) to the next level. Its differentiators are its lightweight and flexible approach, which allows for a high degree of customization.
Here are some notable features of MiniAutoGen:
My Challenge: I'm seeking help from the community to develop new integrations and modules.
I Seek Your Help: Do you have examples, tips, or guidance on how I can accomplish this integration? Any insight or shared experience would be extremely valuable!
Check out MiniAutoGen on Google Colab: MiniAutoGen on Google Colab
And here is the GitHub repository for more information: GitHub - brunocapelao/miniAutoGen
I'm looking forward to your ideas and suggestions. Let's shape the future of AI conversations together!
See stanford-futuredata/ColBERT#111 and https://github.com/stanford-futuredata/ColBERT/blob/main/colbert/index_updater.py for the pathway.
Hello,
I'm relatively new to working with Large Language Models (LLMs) and am reaching out through the issues tab as I couldn't find a discussions section. I'm currently exploring the Intel/fastRAG repository to learn more about the implementation of RAG models with quantized LLMs, and I have encountered some challenges that I hope to get guidance on. I've been trying to run the examples/rag_with_quantized_llm.ipynb notebook on a GCP server (a c3-standard-8 instance with 8 vCPUs and 32 GB of memory, running Ubuntu 22.04).
I've successfully run the example using the facebook/opt-iml-max-1.3b model specified in the notebook. However, when attempting to experiment with other models, specifically openlm-research/open_llama_3b and openlm-research/open_llama_7b, I've run into the following issues:
With the open_llama_3b model, the process gets stuck at the "Quantizing" step without progressing further.
Attempting to use the open_llama_7b model results in the process being killed immediately after the "Saving external data to one file..." message. This is surprising, especially considering the relatively small model size.
Given my limited experience, I'm reaching out for some guidance. I'm curious if there are minimum hardware requirements for each model size or specific quantizing precision that I might not be aware of. Any insights or suggestions on how to successfully run these models, or adjustments to my setup that could help, would be greatly appreciated.
Thank you for your time and assistance.
Hey, I noticed that in your examples you ask the user to add a local path for a trained FiD model, but as far as I can tell, we can just provide a HF model name and it works. At least, I got the knowledge graph notebook to work like that, so maybe there's no need for the assertion. This is what I did:
from fastrag.readers import FiDReader

fid_model_path = None  # change this to the local FiD model
# assert fid_model_path is not None, "Please change fid_model_path to the path of your trained FiD model"

reader = FiDReader(
    input_converter_tokenizer_max_len=256,
    model_name_or_path="Intel/fid_flan_t5_base_nq",
    num_beams=1,
    min_length=2,
    max_length=100,
    use_gpu=False,
)
Let me know if I'm missing something :) - really cool project btw! It's great to see some custom Haystack nodes.
Problem
Hi, I am trying to reproduce the knowledge graph example you provided here
but I am getting the following error:
File "xxx/haystack1.py", line 6, in <module>
    from fastrag.readers import FiDReader
File "xxx/lib/python3.8/site-packages/fastrag/__init__.py", line 4, in <module>
    from fastrag import image_generators, kg_creators, rankers, readers, retrievers, stores
File "xxx/lib/python3.8/site-packages/fastrag/readers/__init__.py", line 6, in <module>
    from fastrag.readers.FiD import FiDReader
File "xxx/lib/python3.8/site-packages/fastrag/readers/FiD.py", line 8, in <module>
    from haystack.nodes import Seq2SeqGenerator
ImportError: cannot import name 'Seq2SeqGenerator' from 'haystack.nodes' (xxx/haystack/haystack/nodes/__init__.py)
/usr/lib/python3.8/tempfile.py:957: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpuj6o9jw1'>
_warnings.warn(warn_message, ResourceWarning)
Steps to reproduce
git clone https://github.com/deepset-ai/haystack.git
cd haystack
pip install --upgrade pip
pip install -e '.[all-gpu]'
pip install git+https://github.com/IntelLabs/fastRAG.git
HW specifications:
I used the steps below on a Windows 11 PC:
cd \fastrag\
conda create -n fastrag python=3.10.6
conda activate fastrag
git clone https://github.com/IntelLabs/fastRAG.git
cd fastRAG\
pip install .
pip install .[intel]
The failure is:
E:\fastrag\fastRAG (main -> origin)
(fastrag) λ pip install .[intel]
Processing e:\fastrag\fastrag
Preparing metadata (setup.py) ... done
Requirement already satisfied: farm-haystack==1.23.0 in e:\anaconda3\envs\fastrag\lib\site-packages (1.23.0)
Requirement already satisfied: transformers>=4.35.2 in e:\anaconda3\envs\fastrag\lib\site-packages (4.35.2)
Requirement already satisfied: datasets in e:\anaconda3\envs\fastrag\lib\site-packages (2.19.0)
Requirement already satisfied: evaluate in e:\anaconda3\envs\fastrag\lib\site-packages (0.4.1)
Requirement already satisfied: pandas in e:\anaconda3\envs\fastrag\lib\site-packages (2.2.2)
Requirement already satisfied: nltk in e:\anaconda3\envs\fastrag\lib\site-packages (3.8.1)
Requirement already satisfied: tqdm in e:\anaconda3\envs\fastrag\lib\site-packages (4.66.2)
Requirement already satisfied: numba in e:\anaconda3\envs\fastrag\lib\site-packages (0.59.1)
Requirement already satisfied: openpyxl in e:\anaconda3\envs\fastrag\lib\site-packages (3.1.2)
Requirement already satisfied: numpy in e:\anaconda3\envs\fastrag\lib\site-packages (1.26.4)
Requirement already satisfied: protobuf==3.20.2 in e:\anaconda3\envs\fastrag\lib\site-packages (3.20.2)
Requirement already satisfied: ujson in e:\anaconda3\envs\fastrag\lib\site-packages (5.9.0)
Requirement already satisfied: accelerate in e:\anaconda3\envs\fastrag\lib\site-packages (0.29.3)
Requirement already satisfied: fastapi in e:\anaconda3\envs\fastrag\lib\site-packages (0.110.2)
Requirement already satisfied: uvicorn in e:\anaconda3\envs\fastrag\lib\site-packages (0.29.0)
Requirement already satisfied: Pillow==10.1.0 in e:\anaconda3\envs\fastrag\lib\site-packages (10.1.0)
INFO: pip is looking at multiple versions of fastrag[intel] to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement intel-extension-for-pytorch (from fastrag[intel]) (from versions: none)
ERROR: No matching distribution found for intel-extension-for-pytorch
E:\fastrag\fastRAG (main -> origin)
(fastrag) λ
Packages and their versions installed in this venv are listed in the attached file.
fastrag packages.txt
Hi,
Thanks for this great repo!
Is there any way to use this pipeline in multilingual settings?
Are there multilingual versions of ColBERT, PLAID, and FiD? If not, how would you recommend proceeding?
I ran this separately just to test the fastRAG pipeline
python -m fastrag.rest_api.application --config=config/my_config.yaml --app_type qa
The output looks like this
(deprecation warning: use torch.utils._pytree.register_pytree_node instead)
config.json: 100% 794/794 [00:00<00:00, 353kB/s]
pytorch_model.bin: 100% 90.9M/90.9M [00:09<00:00, 9.75MB/s]
tokenizer_config.json: 100% 316/316 [00:00<00:00, 143kB/s]
vocab.txt: 100% 232k/232k [00:00<00:00, 482kB/s]
special_tokens_map.json: 100% 112/112 [00:00<00:00, 41.6kB/s]
The model 'FusionInDecoderForConditionalGeneration' is not supported for text2text-generation.
Supported models are ['BartForConditionalGeneration', 'BigBirdPegasusForConditionalGeneration', 'BlenderbotForConditionalGeneration', 'BlenderbotSmallForConditionalGeneration', 'EncoderDecoderModel', 'FSMTForConditionalGeneration', 'GPTSanJapaneseForConditionalGeneration', 'LEDForConditionalGeneration', 'LongT5ForConditionalGeneration', 'M2M100ForConditionalGeneration', 'MarianMTModel', 'MBartForConditionalGeneration', 'MT5ForConditionalGeneration', 'MvpForConditionalGeneration', 'NllbMoeForConditionalGeneration', 'PegasusForConditionalGeneration', 'PegasusXForConditionalGeneration', 'PLBartForConditionalGeneration', 'ProphetNetForConditionalGeneration', 'SeamlessM4TForTextToText', 'SwitchTransformersForConditionalGeneration', 'T5ForConditionalGeneration', 'UMT5ForConditionalGeneration', 'XLMProphetNetForConditionalGeneration'].
INFO: Started server process [67999]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:65140 - "GET /status HTTP/1.1" 404 Not Found
When I start the UI separately, it returns 404 Not Found on the endpoint, and a connection error is displayed as below. Running in notebooks works absolutely fine, though. Any help?
Hello,
Thanks for this fantastic repo! I have a question about the replug_parallel_reader.ipynb notebook. I follow the same code and settings as in replug_parallel_reader.ipynb, but for the question "Who is the main villain in Lord of the Rings?", my answer output is "Roman Empire". This seems incorrect compared to the answer ("Sauron") shown in the notebook.
If possible, could you please let me know what could be potential reasons for this? Thanks!
In the "GPT as both retriever and ranker" example, there is a line of code:
from fastrag.prompters.document_shapers.document_lister import DocumentLister
However, if we look at fastrag/prompters/document_shapers/__init__.py,
the file is empty; DocumentLister is not re-exported there.
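For what it's worth, the usual fix for this pattern is a one-line re-export in the package's __init__.py. A self-contained sketch with a throwaway package (all names here are hypothetical, not the actual fastrag layout) showing the failure and the fix:

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale bytecode when we edit __init__.py

# Throwaway package reproducing the symptom: a class defined in a submodule,
# but an empty package __init__.py that does not re-export it.
pkg_root = tempfile.mkdtemp()
pkg = os.path.join(pkg_root, "shapers_demo")
os.makedirs(pkg)
with open(os.path.join(pkg, "document_lister.py"), "w") as f:
    f.write("class DocumentLister:\n    pass\n")
open(os.path.join(pkg, "__init__.py"), "w").close()  # empty, as reported

sys.path.insert(0, pkg_root)
try:
    from shapers_demo import DocumentLister  # noqa: F401
    exported = True
except ImportError:
    exported = False
print(exported)  # False: the empty __init__.py exports nothing

# The fix: re-export the class from the package __init__.py.
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("from shapers_demo.document_lister import DocumentLister\n")

shapers_demo = importlib.reload(sys.modules["shapers_demo"])
print(hasattr(shapers_demo, "DocumentLister"))  # True
```

Importing directly from the submodule (as the example's line of code does) works either way; the re-export only matters if you want the shorter package-level import.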
Hi,
is there a plan in the roadmap to support/migrate to Haystack v2.0?
Due to file deletions in the library, the link to IndexUpdater in the README no longer works.
Can your examples include the Haystack installation lines (with compatible versions)? I get errors with the newest farm-haystack and haystack-ai packages, and when I downgraded to farm-haystack==1.17.2 (seen in another issue), the basic Haystack "getting started" code doesn't work. Are these projects kept in sync?
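Until the examples pin versions, one workaround is to pin what the project's own metadata resolves to. Based on the pip install log elsewhere on this page, a requirements fragment along these lines appeared to satisfy fastRAG's constraints (unverified against every notebook):

```
farm-haystack==1.23.0
transformers>=4.35.2
protobuf==3.20.2
Pillow==10.1.0
```

These would be installed before running pip install . from the fastRAG checkout.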
Tried to run the notebook 'client_inference_with_Llama_cpp.ipynb'. Got the following error:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In[1], line 2
1 from haystack import Pipeline
----> 2 from haystack.nodes.prompt import PromptNode
3 from haystack.nodes import PromptModel
4 from haystack.nodes.prompt.prompt_template import PromptTemplate
File ~/miniconda3/envs/jupyter/lib/python3.10/site-packages/haystack/nodes/__init__.py:1
----> 1 from haystack.nodes.base import BaseComponent
3 from haystack.nodes.document_classifier import BaseDocumentClassifier, TransformersDocumentClassifier
4 from haystack.nodes.extractor import EntityExtractor, simplify_ner_for_qa
File ~/miniconda3/envs/jupyter/lib/python3.10/site-packages/haystack/nodes/base.py:11
8 import logging
10 from haystack.schema import Document, MultiLabel
---> 11 from haystack.errors import PipelineSchemaError
12 from haystack.utils.reflection import args_to_kwargs
15 logger = logging.getLogger(__name__)
ImportError: cannot import name 'PipelineSchemaError' from 'haystack.errors' (/root/miniconda3/envs/jupyter/lib/python3.10/site-packages/haystack/errors.py)
It would be greatly beneficial to have an example demonstrating how to use the fine-tuning script with a custom proprietary dataset.
Thanks in advance.
The example you've provided, simple_odqa_pipeline.ipynb, works fine on the example list you've provided. However, when I changed that list to one of 23 elements, each a fairly long string, it took over 6.5 minutes to generate the result.
This overhead is infeasible to incorporate at runtime. Is there any solution for this? Or have you tried that?
Hi,
Have you load-tested the fastRAG pipeline with ColBERT, PLAID, and FiD on a CPU instance?
Could you provide examples of latency and RPS relative to CPU instance characteristics?
Hi there - is there any desire to add support for Chroma?