PDFBot is a self-learning project that uses the OpenAI GPT (Generative Pre-trained Transformer) API to answer questions about uploaded PDF documents. You ask a question and PDFBot pulls the relevant information from the PDF, which makes it useful for research, document analysis, and similar tasks.
PDFBot is built using Python and leverages several libraries and technologies to achieve its functionality:
- Streamlit: The user interface is built with Streamlit, which provides a simple, interactive way to upload PDF files and ask questions.
- PyPDF2: PyPDF2 extracts text from the uploaded PDF files, making their content available for analysis.
- LangChain: LangChain is a library used for text processing and question-answering tasks. It handles text splitting, embeddings, and vector storage.
- OpenAI GPT API: The OpenAI GPT API is the heart of PDFBot, powering its question-answering capabilities. It processes the user's questions and generates responses based on the content of the uploaded PDF.
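To make the text-splitting step concrete, here is a minimal sketch of overlapping chunking in plain Python. It is a simplified stand-in for what a text splitter does, not LangChain's implementation and not necessarily PDFBot's code. The function name `split_text` and the default sizes are assumptions chosen for illustration.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> List[str]:
    """Split text into character windows of at most `chunk_size` characters.

    Consecutive chunks share `overlap` characters, so a sentence that
    straddles a boundary still appears whole in at least one chunk.
    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Tiny example with small sizes so the overlap is easy to see.
print(split_text("abcdefghij", chunk_size=4, overlap=1))
# → ['abcd', 'defg', 'ghij']
```

Each chunk would then be embedded and stored in a vector store, so that only the chunks relevant to a question are sent to the model.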
To use PDFBot:
- Upload a PDF File: Click the "Upload a PDF file" button to select and upload a PDF document.
- Ask Questions: Once the PDF is uploaded, enter your questions in the "Ask pdfBOT 🧠" input field.
- Get Answers: PDFBot will process your question and provide relevant answers based on the content of the PDF.
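Behind the "Get Answers" step, a question is typically matched against chunks of the PDF text, and the best matches are passed to the model as context. The sketch below illustrates that idea using only the standard library. It swaps real embeddings for a bag-of-words cosine similarity, so it is a toy approximation and not PDFBot's actual retrieval code. The names `bag_of_words`, `cosine_similarity`, `top_chunks`, and `build_prompt`, as well as the prompt wording, are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List


def bag_of_words(text: str) -> Counter:
    """Count lowercase alphanumeric tokens (a crude stand-in for an embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunks(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Return the k chunks most similar to the question, best first."""
    q = bag_of_words(question)
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(q, bag_of_words(chunk)),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Assemble the text that would be sent to the language model."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Made-up chunks standing in for text extracted from a PDF.
chunks = [
    "The invoice total is 420 dollars.",
    "Shipping takes five business days.",
    "The warranty lasts two years.",
]
question = "How long does the warranty last?"
print(build_prompt(question, top_chunks(question, chunks, k=1)))
```

In the real application the similarity search would run over embedding vectors in a vector store, and the assembled prompt would be sent to the OpenAI GPT API to produce the answer.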
To run PDFBot locally or make changes to the code, follow these steps:
- Clone the Repository:
git clone https://github.com/yourusername/PDFBot.git
PDFBot is an open-source project, and contributions are welcome! If you have ideas for improvements, bug fixes, or new features, please create a pull request or submit an issue on the GitHub repository.
This project is licensed under the MIT License.