This project is a large language model (LLM) powered chatbot designed to comprehend and answer questions about an uploaded PDF file. It is built using Streamlit, LangChain, and HuggingFace LLMs.
The app uses HuggingFace Instruct embeddings to understand the content of the PDF and answer the user's queries. The document is split into chunks, an embedding is computed for each chunk, and these embeddings are then stored and used to answer any queries related to the document.
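The chunking step can be sketched as follows. This is a minimal stand-in for the text splitter the app uses via LangChain; the function name and the size/overlap values are illustrative, not the app's actual settings:

```python
def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split text into overlapping chunks so context isn't lost at chunk boundaries."""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step back by `overlap` so chunks share context
    return chunks
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence that straddles a boundary still appears whole in at least one chunk.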
If you upload a PDF, the application will:
- Extract the text from the PDF.
- Split the text into chunks.
- Compute embeddings for each chunk.
- Store these embeddings for later use.
- Accept user queries related to the PDF.
- Search for the chunks most relevant to the query.
- Use the LLM to generate a response to the query.
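The retrieval steps above can be sketched end to end. The bag-of-words embedding and cosine similarity below are deliberately simple stand-ins for the HuggingFace Instruct embeddings and vector store the app actually uses:

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: a word-count vector (stand-in for HuggingFace Instruct embeddings)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def most_relevant_chunks(query, chunks, k=2):
    """Rank stored chunks by similarity to the query and return the top k."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In the real app, the top-ranked chunks are passed to the LLM as context so that the generated answer is grounded in the uploaded PDF.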
To use the chatbot:
- Clone this repository.
- Replace `HUGGINGFACEHUB_API_TOKEN` with your HuggingFace API token in the `app.py` file.
- Run the Streamlit app using the command `streamlit run app.py`.
- Upload a PDF file using the file uploader in the Streamlit app.
- Ask a question related to your PDF file in the provided text input field.
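Rather than hard-coding the token, the app can also read it from the environment. The sketch below shows this common pattern; the environment-variable name matches the placeholder above, but how app.py actually loads the token may differ:

```python
import os

def get_hf_token():
    """Read the HuggingFace API token from the environment instead of hard-coding it."""
    token = os.environ.get("HUGGINGFACEHUB_API_TOKEN")
    if not token:
        raise RuntimeError("Set HUGGINGFACEHUB_API_TOKEN before running the app.")
    return token
```

With this in place, you would run `export HUGGINGFACEHUB_API_TOKEN=...` before `streamlit run app.py` instead of editing the file.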