Welcome to the End-to-End Retrieval-Augmented Generation (RAG) Pipeline project! This repository provides a complete solution for building, deploying, and interacting with a RAG pipeline, leveraging various modern technologies including LangChain, Pinecone, OpenAI, and Streamlit.
Credit goes to: https://github.com/Vasanthengineer4949/End-to-End-RAG/tree/main
The End-to-End RAG Pipeline project covers loading documents, creating embeddings, storing them in a vector store, and running queries against that store using a large language model (LLM). It integrates these components so that a RAG pipeline can be built and queried from a single codebase.
- Document Loading: Load documents from web URLs using `WebBaseLoader`.
- Text Splitting: Efficiently split documents into chunks with `RecursiveCharacterTextSplitter`.
- Embedding Generation: Generate embeddings using OpenAI's models.
- Vector Store: Store embeddings in Pinecone for fast retrieval.
- Language Model Integration: Utilize Groq's LLM for processing queries.
- Guardrails: Ensure safe and effective interactions with NeMo Guardrails.
- Streamlit Interface: User-friendly interface for interacting with the pipeline.
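
The stages listed above (split, embed, store, retrieve, then prompt the LLM) can be illustrated with a standard-library toy. This is only a sketch of the data flow, not the project's code: `split_text`, `embed`, `InMemoryVectorStore`, and `build_prompt` are hypothetical names, the fixed-window splitter is a simplified stand-in for `RecursiveCharacterTextSplitter`, word counts stand in for OpenAI embeddings, and a Python list stands in for Pinecone.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=40):
    """Split text into overlapping character windows (simplified splitter)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


def embed(text):
    """Toy embedding: a bag-of-words count vector (NOT a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class InMemoryVectorStore:
    """List-backed stand-in for a hosted vector store."""

    def __init__(self):
        self._items = []  # (chunk, vector) pairs

    def add(self, chunks):
        for chunk in chunks:
            self._items.append((chunk, embed(chunk)))

    def search(self, query, k=2):
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self._items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM."""
    context = "\n---\n".join(context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    store = InMemoryVectorStore()
    store.add([
        "Pinecone stores embeddings for retrieval.",
        "Streamlit renders the chat interface.",
        "Groq serves the language model.",
    ])
    question = "Which component stores embeddings?"
    print(build_prompt(question, store.search(question, k=1)))
```

In the real pipeline each toy piece would be replaced by the corresponding component named in the list; the order of the calls is the part this sketch is meant to convey.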
- Python 3.8 or higher
- Pinecone API Key
- OpenAI API Key
- Groq API Key
- LangSmith API Key
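
A small preflight check can confirm these prerequisites before the app starts. The sketch below is an assumption about how one might do it, not part of the repository: `check_prerequisites` is a hypothetical helper, and the variable names are taken from the environment-variable step of this README.

```python
import os
import sys

# Variable names as used in this README's environment-variable step.
REQUIRED_KEYS = (
    "PINECONE_API_KEY",
    "OPENAI_API_KEY",
    "GROQ_API_KEY",
    "LANGSMITH_API_KEY",
)
MIN_PYTHON = (3, 8)


def check_prerequisites(env, version=None):
    """Return a list of problems; an empty list means everything is in place.

    `env` is any mapping of variable names to values (for example os.environ).
    `version` defaults to the running interpreter's (major, minor) version.
    """
    if version is None:
        version = tuple(sys.version_info[:2])
    problems = []
    if tuple(version) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]} or higher is required, "
            f"found {version[0]}.{version[1]}"
        )
    for key in REQUIRED_KEYS:
        # Treat empty or whitespace-only values as missing.
        if not env.get(key, "").strip():
            problems.append(f"{key} is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites(os.environ):
        print("Prerequisite problem:", problem)
```

Returning a list rather than raising on the first failure lets a user see every missing key in one run.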
1. **Clone the Repository**

       git clone https://github.com/your-username/end-to-end-rag.git
       cd end-to-end-rag

2. **Create and Activate Virtual Environment**

       python3 -m venv venv
       source venv/bin/activate # On Windows, use `venv\Scripts\activate`

3. **Install Dependencies**

       pip install -r requirements.txt

   or install from the `Pipfile` with Pipenv.

4. **Environment Variables**

   Add your API keys to the `.env` file:

       OPENAI_API_KEY=your_openai_api_key
       PINECONE_API_KEY=your_pinecone_api_key
       GROQ_API_KEY=your_groq_api_key
       LANGSMITH_API_KEY=your_langsmith_api_key

5. **Running the Streamlit App**
```sh
streamlit run app.py
# or, with Pipenv:
pipenv run streamlit run app.py
```
Project Structure:

    .
    ├── README.md
    ├── app.py
    ├── run.py
    ├── config
    │   ├── actions.py
    │   ├── config.py
    │   ├── config.yml
    │   ├── rails.co
    │   └── ...
    ├── requirements.txt
    └── .env