Q&A Chatbot for PDF Documents

This repository contains a Streamlit application that uses LangChain and an OpenAI language model to provide a conversational Q&A chatbot. The chatbot answers questions about the content of PDF documents.

Installation

Before starting, make sure you have Python 3.8+ installed. To run the Q&A Chatbot, follow these steps:

  1. Clone this repository:

    git clone https://github.com/mrzaid/Chatbot-for-PDF.git
    cd Chatbot-for-PDF
    
  2. Install the necessary libraries:

    pip install -r requirements.txt
    

Usage

To use this application, first obtain an OpenAI API key. Do not expose this key publicly.

After obtaining your OpenAI API key, create a .env file in the root of your project directory. Then follow these steps:

  1. Add the following line to the .env file:

    OPENAI_API_KEY=your_openai_api_key
    

    Replace your_openai_api_key with your actual OpenAI API key.

  2. The application is set to automatically load environment variables from the .env file.

  3. Run the application:

    streamlit run app.py
    
  4. Open the Streamlit interface in your web browser.

  5. Upload a PDF file and ask questions about its content. The chatbot will generate answers based on the document's content.
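The steps above do not show how the .env file is read, so the following is only a minimal sketch of the idea using the Python standard library. The helper names parse_env_lines and apply_env are hypothetical, and the application itself may rely on a package such as python-dotenv instead (an assumption, not something this README confirms).

```python
import os


def parse_env_lines(lines):
    """Parse KEY=VALUE lines (hypothetical helper, not the app's own code)."""
    values = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def apply_env(values, environ=os.environ):
    """Copy parsed values into the environment without overriding existing ones."""
    for key, value in values.items():
        environ.setdefault(key, value)
    return environ


if __name__ == "__main__":
    # Assumes a .env file in the current directory, as described above.
    if os.path.exists(".env"):
        with open(".env", encoding="utf-8") as fh:
            apply_env(parse_env_lines(fh))
    print("OPENAI_API_KEY set:", "OPENAI_API_KEY" in os.environ)
```

Using setdefault means a key already exported in the shell takes precedence over the file, which is a common convention but a design choice of this sketch.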

Architecture

The chatbot works in several steps:

  1. Upload PDF: You upload the PDF file you want to ask questions about.

  2. Text Extraction: The bot uses the PyPDF2 library to read the PDF file and extract text from it.

  3. Text Splitting: The bot then splits the text into smaller chunks so that each piece fits within the language model's token limit.

  4. Embeddings Creation: Using OpenAIEmbeddings, the bot creates text embeddings from the chunks.

  5. Document Search Creation: The bot then stores these embeddings in a FAISS vector store, which serves as a searchable index of the document.

  6. Conversational Chain Creation: A LangChain ConversationalRetrievalChain is created using the OpenAI model and the document retriever.

  7. User Query: Finally, you enter your query. The bot will provide a response based on the contents of the uploaded PDF, also citing the source sections from the PDF.
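The splitting and retrieval steps above can be illustrated with a small standard-library sketch. This is not the application's code: a bag-of-words counter stands in for OpenAIEmbeddings, a sorted list stands in for the FAISS vector store, and the function names and the chunk_size and overlap values are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def split_text(text, chunk_size=200, overlap=50):
    """Split text into overlapping character windows (illustrative sizes)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_chunks(query, chunks, k=2):
    """Return the k chunks most similar to the query (stand-in for vector search)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


if __name__ == "__main__":
    pages = [
        "The refund policy allows returns within 30 days.",
        "Shipping takes five business days.",
    ]
    # The best-matching chunk would be passed to the language model as context
    # and can be shown to the user as the cited source section.
    print(top_chunks("What is the refund policy?", pages, k=1))
```

The overlap between neighbouring chunks keeps a sentence that straddles a chunk boundary retrievable from at least one chunk. In the real pipeline, the retrieved chunks and the chat history are handed to the ConversationalRetrievalChain rather than printed.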

Contributing

Contributions are welcome! To contribute, please follow these steps:

  1. Fork the repository.
  2. Create a new branch: git checkout -b your-branch-name.
  3. Make your changes and commit: git commit -m 'Add some feature'.
  4. Push to the branch: git push origin your-branch-name.
  5. Create a pull request.

For larger changes, please open an issue first to discuss what you would like to add.

License

This project is licensed under the MIT License. See LICENSE for more details.

Contact

For more information, feel free to reach out!
