IntelLabs / fastRAG
Efficient Retrieval Augmentation and Generation Framework
License: Apache License 2.0
Hey maintainers. I'm trying to curate an index of Haystack integrations and projects (which for now I've simply named 'integrations'). These are custom nodes, document stores, pipelines and so on that can be used with Haystack, or built with Haystack.
I've created a PR adding this repo to that index too (deepset-ai/haystack-integrations#3) and was wondering if any of you would like to take a look and let me know whether you're OK with having it added there. The idea would be to render the markdown page somewhere on our website later.
"We use GPT-3 Curie (Brown et al., 2020b) as the supervision LM to compute the LM likelihood."
How can one obtain or estimate token probabilities from the black-box GPT-3 API?
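For context, the legacy OpenAI Completions API could return per-token log-probabilities for the prompt itself when called with echo=True and a logprobs value, which is the usual way to score text under a black-box GPT-3 model (as I understand the legacy API, the first prompt token comes back with a null logprob and must be skipped). A minimal sketch of turning such per-token values into a sequence likelihood, using made-up numbers:

```python
import math

# Hypothetical per-token log-probabilities for a scored passage, e.g. taken
# from response["choices"][0]["logprobs"]["token_logprobs"] of the legacy
# Completions API called with echo=True (the leading None entry dropped).
token_logprobs = [-2.1, -0.4, -3.3, -0.9]

# The sequence log-likelihood is the sum of per-token log-probabilities.
log_likelihood = sum(token_logprobs)

# Perplexity is the exponentiated average negative log-likelihood.
perplexity = math.exp(-log_likelihood / len(token_logprobs))

print(round(log_likelihood, 6))
print(round(perplexity, 3))
```

The same arithmetic applies whatever LM produces the per-token logprobs; only the API call that obtains them is GPT-3-specific.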
I want to use passages in different languages; is that possible?
Hello,
I just discovered this repo. I was wondering if it's possible to plug in custom fine-tuned embedding models.
Is this repo being actively worked on? Any future roadmap that can be shared with the community would be very helpful.
Thanks in advance,
Karrtik
Thanks for your help!
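On the custom-embedding question: a dense retriever ultimately just needs a function that maps texts to vectors, so a fine-tuned embedding model plugs in at that seam (in Haystack v1, dense retrievers generally take an embedding_model argument that also accepts a local path; that is my understanding, not verified for every retriever type). A toy sketch of the seam, with a stand-in hashing embedder in place of a real model:

```python
import numpy as np

def embed(texts, dim=512):
    """Stand-in embedder: hashed bag-of-words vectors, L2-normalized.
    Replace this with your fine-tuned model, e.g.
    SentenceTransformer('path/to/your-model').encode(texts)."""
    vecs = np.zeros((len(texts), dim))
    for i, text in enumerate(texts):
        for tok in text.lower().split():
            vecs[i, hash(tok) % dim] += 1.0
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    return vecs / np.maximum(norms, 1e-9)

docs = [
    "fastRAG builds efficient retrieval pipelines",
    "bananas are a yellow tropical fruit",
]
doc_vecs = embed(docs)

# Rank documents by cosine similarity (dot product of unit vectors).
query_vec = embed(["efficient retrieval augmented pipelines"])[0]
scores = doc_vecs @ query_vec
best = docs[int(np.argmax(scores))]
print(best)
```

Everything except the body of embed() stays the same when you swap in a real fine-tuned model.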
Hello, amazing community!
I'm exploring the integration of two powerful libraries: MiniAutoGen and fastRAG, and I would greatly appreciate your help and insights!
MiniAutoGen is an innovative open-source library designed to take applications with Large Language Models (LLMs) to the next level. Its differentiators are its lightweight and flexible approach, which allows for a high degree of customization.
Here are some notable features of MiniAutoGen:
My Challenge: I'm seeking help from the community to develop new integrations and modules.
I Seek Your Help: Do you have examples, tips, or guidance on how I can accomplish this integration? Any insight or shared experience would be extremely valuable!
Check out MiniAutoGen on Google Colab: MiniAutoGen on Google Colab
And here is the GitHub repository for more information: GitHub - brunocapelao/miniAutoGen
I'm looking forward to your ideas and suggestions. Let's shape the future of AI conversations together!
See stanford-futuredata/ColBERT#111 and https://github.com/stanford-futuredata/ColBERT/blob/main/colbert/index_updater.py for the pathway.
Hello,
I'm relatively new to working with Large Language Models (LLMs) and am reaching out through the issues tab as I couldn't find a discussions section. I'm currently exploring the Intel/fastRAG repository to learn more about the implementation of RAG models with quantized LLMs, and I have encountered some challenges that I hope to get guidance on. I've been trying to run the examples/rag_with_quantized_llm.ipynb notebook on a GCP server (a c3-standard-8 instance with 8 vCPUs and 32 GB of memory, running Ubuntu 22.04).
I've successfully run the example using the facebook/opt-iml-max-1.3b model specified in the notebook. However, when attempting to experiment with other models, specifically openlm-research/open_llama_3b and openlm-research/open_llama_7b, I've run into the following issues:
With the open_llama_3b model, the process gets stuck at the "Quantizing" step without progressing further.
Attempting to use the open_llama_7b model results in the process being killed immediately after the "Saving external data to one file..." message. This is surprising, especially considering the relatively small model size.
Given my limited experience, I'm reaching out for some guidance. I'm curious if there are minimum hardware requirements for each model size or specific quantizing precision that I might not be aware of. Any insights or suggestions on how to successfully run these models, or adjustments to my setup that could help, would be greatly appreciated.
Thank you for your time and assistance.
Hey, I noticed that in your examples you ask the user to add a local path for a trained FiD model, but as far as I can tell, we can just provide a HF model name and it works. At least, I got the knowledge graph notebook to work like that, so maybe there's no need for the assertion. This is what I did:
from fastrag.readers import FiDReader

fid_model_path = None  # change this to the local FiD model
# assert fid_model_path is not None, "Please change fid_model_path to the path of your trained FiD model"

reader = FiDReader(
    input_converter_tokenizer_max_len=256,
    model_name_or_path="Intel/fid_flan_t5_base_nq",
    num_beams=1,
    min_length=2,
    max_length=100,
    use_gpu=False,
)
Let me know if I'm missing something :) - really cool project btw! It's great to see some custom Haystack nodes.
Problem
Hi, I am trying to reproduce the knowledge graph example you provided here
but I am getting the following error:
File "xxx/haystack1.py", line 6, in <module>
    from fastrag.readers import FiDReader
File "xxx/lib/python3.8/site-packages/fastrag/__init__.py", line 4, in <module>
    from fastrag import image_generators, kg_creators, rankers, readers, retrievers, stores
File "xxx/lib/python3.8/site-packages/fastrag/readers/__init__.py", line 6, in <module>
    from fastrag.readers.FiD import FiDReader
File "xxx/lib/python3.8/site-packages/fastrag/readers/FiD.py", line 8, in <module>
    from haystack.nodes import Seq2SeqGenerator
ImportError: cannot import name 'Seq2SeqGenerator' from 'haystack.nodes' (xxx/haystack/haystack/nodes/__init__.py)
/usr/lib/python3.8/tempfile.py:957: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmpuj6o9jw1'>
_warnings.warn(warn_message, ResourceWarning)
Steps to reproduce
git clone https://github.com/deepset-ai/haystack.git
cd haystack
pip install --upgrade pip
pip install -e '.[all-gpu]'
pip install git+https://github.com/IntelLabs/fastRAG.git
HW specifications:
I used the steps below on a Windows 11 PC:
cd \fastrag\
conda create -n fastrag python=3.10.6
conda activate fastrag
git clone https://github.com/IntelLabs/fastRAG.git
cd fastRAG\
pip install .
pip install .[intel]
The failure is:
E:\fastrag\fastRAG (main -> origin)
(fastrag) λ pip install .[intel]
Processing e:\fastrag\fastrag
Preparing metadata (setup.py) ... done
Requirement already satisfied: farm-haystack==1.23.0 in e:\anaconda3\envs\fastrag\lib\site-packages (1.23.0)
Requirement already satisfied: transformers>=4.35.2 in e:\anaconda3\envs\fastrag\lib\site-packages (4.35.2)
Requirement already satisfied: datasets in e:\anaconda3\envs\fastrag\lib\site-packages (2.19.0)
Requirement already satisfied: evaluate in e:\anaconda3\envs\fastrag\lib\site-packages (0.4.1)
Requirement already satisfied: pandas in e:\anaconda3\envs\fastrag\lib\site-packages (2.2.2)
Requirement already satisfied: nltk in e:\anaconda3\envs\fastrag\lib\site-packages (3.8.1)
Requirement already satisfied: tqdm in e:\anaconda3\envs\fastrag\lib\site-packages (4.66.2)
Requirement already satisfied: numba in e:\anaconda3\envs\fastrag\lib\site-packages (0.59.1)
Requirement already satisfied: openpyxl in e:\anaconda3\envs\fastrag\lib\site-packages (3.1.2)
Requirement already satisfied: numpy in e:\anaconda3\envs\fastrag\lib\site-packages (1.26.4)
Requirement already satisfied: protobuf==3.20.2 in e:\anaconda3\envs\fastrag\lib\site-packages (3.20.2)
Requirement already satisfied: ujson in e:\anaconda3\envs\fastrag\lib\site-packages (5.9.0)
Requirement already satisfied: accelerate in e:\anaconda3\envs\fastrag\lib\site-packages (0.29.3)
Requirement already satisfied: fastapi in e:\anaconda3\envs\fastrag\lib\site-packages (0.110.2)
Requirement already satisfied: uvicorn in e:\anaconda3\envs\fastrag\lib\site-packages (0.29.0)
Requirement already satisfied: Pillow==10.1.0 in e:\anaconda3\envs\fastrag\lib\site-packages (10.1.0)
INFO: pip is looking at multiple versions of fastrag[intel] to determine which version is compatible with other requirements. This could take a while.
ERROR: Could not find a version that satisfies the requirement intel-extension-for-pytorch (from fastrag[intel]) (from versions: none)
ERROR: No matching distribution found for intel-extension-for-pytorch
E:\fastrag\fastRAG (main -> origin)
(fastrag) λ
Packages and their versions installed in this venv are listed in the attached file.
fastrag packages.txt
Hi,
Thanks for this great repo!
Is there any way to use this pipeline in multilingual settings?
Are there multilingual versions of ColBERT, PLAID, and FiD? If not, how would you recommend proceeding?
I ran this separately just to test the fastRAG pipeline
python -m fastrag.rest_api.application --config=config/my_config.yaml --app_type qa
The output looks like this
(deprecation warning: use torch.utils._pytree.register_pytree_node instead)
config.json: 100% 794/794 [00:00<00:00, 353kB/s]
pytorch_model.bin: 100% 90.9M/90.9M [00:09<00:00, 9.75MB/s]
tokenizer_config.json: 100% 316/316 [00:00<00:00, 143kB/s]
vocab.txt: 100% 232k/232k [00:00<00:00, 482kB/s]
special_tokens_map.json: 100% 112/112 [00:00<00:00, 41.6kB/s]
The model 'FusionInDecoderForConditionalGeneration' is not supported for text2text-generation.
Supported models are ['BartForConditionalGeneration', 'BigBirdPegasusForConditionalGeneration', 'BlenderbotForConditionalGeneration', 'BlenderbotSmallForConditionalGeneration', 'EncoderDecoderModel', 'FSMTForConditionalGeneration', 'GPTSanJapaneseForConditionalGeneration', 'LEDForConditionalGeneration', 'LongT5ForConditionalGeneration', 'M2M100ForConditionalGeneration', 'MarianMTModel', 'MBartForConditionalGeneration', 'MT5ForConditionalGeneration', 'MvpForConditionalGeneration', 'NllbMoeForConditionalGeneration', 'PegasusForConditionalGeneration', 'PegasusXForConditionalGeneration', 'PLBartForConditionalGeneration', 'ProphetNetForConditionalGeneration', 'SeamlessM4TForTextToText', 'SwitchTransformersForConditionalGeneration', 'T5ForConditionalGeneration', 'UMT5ForConditionalGeneration', 'XLMProphetNetForConditionalGeneration'].
INFO: Started server process [67999]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
INFO: 127.0.0.1:65140 - "GET /status HTTP/1.1" 404 Not Found
When I start the UI separately, it returns 404 Not Found on the endpoint, and a connection error is displayed as below. Running in notebooks works absolutely fine, though. Any help?
Hello,
Thanks for this fantastic repo! I have a question about the replug_parallel_reader.ipynb notebook. I follow the same code and settings as in replug_parallel_reader.ipynb, but for the question "Who is the main villain in Lord of the Rings?", my answer output is "Roman Empire". This seems incorrect compared to the answer ("Sauron") shown in the notebook.
If possible, could you please let me know what could be potential reasons for this? Thanks!
In the "GPT as both retriever and ranker" example, there is a line of code:
from fastrag.prompters.document_shapers.document_lister import DocumentLister
However, if we look at fastrag/prompters/document_shapers/__init__.py,
the file is empty; DocumentLister is not re-exported there.
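For what it's worth, the usual fix for this pattern is a one-line re-export in the package's __init__.py. A self-contained sketch with a throwaway package (all names here are hypothetical, not the actual fastrag layout) showing the failure and the fix:

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale bytecode when we edit __init__.py

# Throwaway package reproducing the symptom: a class defined in a submodule,
# but an empty package __init__.py that does not re-export it.
pkg_root = tempfile.mkdtemp()
pkg = os.path.join(pkg_root, "shapers_demo")
os.makedirs(pkg)
with open(os.path.join(pkg, "document_lister.py"), "w") as f:
    f.write("class DocumentLister:\n    pass\n")
open(os.path.join(pkg, "__init__.py"), "w").close()  # empty, as reported

sys.path.insert(0, pkg_root)
try:
    from shapers_demo import DocumentLister  # noqa: F401
    exported = True
except ImportError:
    exported = False
print(exported)  # False: the empty __init__.py exports nothing

# The fix: re-export the class from the package __init__.py.
with open(os.path.join(pkg, "__init__.py"), "w") as f:
    f.write("from shapers_demo.document_lister import DocumentLister\n")

shapers_demo = importlib.reload(sys.modules["shapers_demo"])
print(hasattr(shapers_demo, "DocumentLister"))  # True
```

Importing directly from the submodule (as the example's line of code does) works either way; the re-export only matters if you want the shorter package-level import.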
Hi,
is there a plan in the roadmap to support/migrate to Haystack v2.0?
Due to file deletions in the library, the link to IndexUpdater in the README no longer works.
Can your examples include the Haystack installation lines (with compatible versions)? I get errors with the newest farm-haystack and haystack-ai packages, and when I downgraded to farm-haystack==1.17.2 (seen in another issue), the basic Haystack "getting started" code doesn't work. Are these projects kept in sync?
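Until the examples pin versions, one workaround is to pin what the project's own metadata resolves to. Based on the pip install log elsewhere on this page, a requirements fragment along these lines appeared to satisfy fastRAG's constraints (unverified against every notebook):

```
farm-haystack==1.23.0
transformers>=4.35.2
protobuf==3.20.2
Pillow==10.1.0
```

These would be installed before running pip install . from the fastRAG checkout.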
Tried to run the notebook 'client_inference_with_Llama_cpp.ipynb'. Got the following error:
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
Cell In[1], line 2
1 from haystack import Pipeline
----> 2 from haystack.nodes.prompt import PromptNode
3 from haystack.nodes import PromptModel
4 from haystack.nodes.prompt.prompt_template import PromptTemplate
File ~/miniconda3/envs/jupyter/lib/python3.10/site-packages/haystack/nodes/__init__.py:1
----> 1 from haystack.nodes.base import BaseComponent
3 from haystack.nodes.document_classifier import BaseDocumentClassifier, TransformersDocumentClassifier
4 from haystack.nodes.extractor import EntityExtractor, simplify_ner_for_qa
File ~/miniconda3/envs/jupyter/lib/python3.10/site-packages/haystack/nodes/base.py:11
8 import logging
10 from haystack.schema import Document, MultiLabel
---> 11 from haystack.errors import PipelineSchemaError
12 from haystack.utils.reflection import args_to_kwargs
15 logger = logging.getLogger(__name__)
ImportError: cannot import name 'PipelineSchemaError' from 'haystack.errors' (/root/miniconda3/envs/jupyter/lib/python3.10/site-packages/haystack/errors.py)
It would be greatly beneficial to have an example demonstrating how to use the fine-tuning script with a custom proprietary dataset.
Thanks in advance.
The example you've provided, simple_odqa_pipeline.ipynb, works fine on the example list you've provided. However, when I changed that list to one of 23 elements, each a fairly long string, it took over 6.5 minutes to generate the result.
This overhead is infeasible to incorporate at runtime. Is there any solution for this? Or have you tried that?
Hi,
Have you load-tested the fastRAG pipeline with ColBERT, PLAID, and FiD on a CPU instance?
Could you provide examples of latency and RPS relative to CPU instance characteristics?
Hi there - is there any desire to add support for Chroma?