Project for Intel Unnati Industrial Training 2024

PDF Chatbot using OpenVINO and RAG

A PDF-based chatbot leveraging OpenVINO and RAG techniques for efficient question-answering, developed as part of the Intel Unnati Industrial Training 2024.

Description

This project demonstrates how to build a chatbot that answers questions about a given PDF document using the Retrieval-Augmented Generation (RAG) technique. The chatbot is implemented in Google Colab and uses several libraries, including OpenVINO for efficient inference.

The project consists of a single notebook that performs the following tasks:

Reads and processes a PDF file
Generates a vector store from the PDF content
Uses a large language model (LLM) to answer questions based on the vector store

Components

Component	Description
PDF Processing	Extracts text from a PDF file using PyPDF2
Vector Store Generation	Creates a FAISS index from text chunks using sentence-transformers
LLM Integration	Uses TinyLlama model for generating responses
OpenVINO Optimization	Leverages Intel's OpenVINO toolkit for optimized model inference
User Interface	Implements a Gradio interface for easy interaction
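
As a rough illustration of the first two components, the sketch below extracts text with PyPDF2 and builds a FAISS index from sentence-transformers embeddings. The function names (`clean_page_text`, `extract_pdf_text`, `build_faiss_index`), the embedding model name `all-MiniLM-L6-v2`, and the choice of a flat L2 index are assumptions made for illustration; the notebook itself may do this differently.

```python
from typing import List


def clean_page_text(text: str) -> str:
    """Collapse runs of whitespace left over from PDF extraction."""
    return " ".join(text.split())


def extract_pdf_text(pdf_path: str) -> str:
    """Read every page of a PDF and return its text as one string."""
    import PyPDF2  # imported lazily so clean_page_text works without it

    reader = PyPDF2.PdfReader(pdf_path)
    # extract_text() may yield nothing for image-only pages, hence the `or ""`
    pages = [clean_page_text(page.extract_text() or "") for page in reader.pages]
    return "\n".join(p for p in pages if p)


def build_faiss_index(chunks: List[str], model_name: str = "all-MiniLM-L6-v2"):
    """Embed text chunks and store them in an exact (flat) L2 FAISS index."""
    import faiss
    import numpy as np
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer(model_name)  # assumed embedding model
    embeddings = np.asarray(embedder.encode(chunks), dtype="float32")
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(embeddings)
    return index, embedder
```

A flat index compares the query against every stored vector, which is usually adequate for a single PDF; an approximate index would only matter for much larger collections.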

How to Run

Open the notebook in Google Colab.
Run all cells in order.
Upload a PDF file when prompted.
Ask questions about the PDF content using the Gradio interface.
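
For orientation, a Gradio interface of the kind used in the last step could be wired up roughly as follows. This is a minimal sketch, not the notebook's code: `respond` and `build_interface` are hypothetical names, and `respond` returns a placeholder string where the real notebook would call its retrieval and generation steps.

```python
def respond(pdf_file, question: str) -> str:
    """Validate the inputs, then hand off to the question-answering pipeline."""
    if not pdf_file:
        return "Please upload a PDF first."
    if not question or not question.strip():
        return "Please enter a question."
    # Depending on the Gradio version, the upload arrives as a path string
    # or as a temp-file object with a .name attribute.
    pdf_path = getattr(pdf_file, "name", pdf_file)
    # Placeholder: the real pipeline (retrieve chunks, build prompt, generate) goes here.
    return f"Received question about {pdf_path}: {question.strip()}"


def build_interface():
    """Create the Gradio UI; call .launch() on the result to start it."""
    import gradio as gr  # imported lazily so respond() can be tested without Gradio

    return gr.Interface(
        fn=respond,
        inputs=[gr.File(label="PDF file"), gr.Textbox(label="Question")],
        outputs=gr.Textbox(label="Answer"),
        title="PDF Chatbot",
    )
```

In Colab, `build_interface().launch()` would display the interface inline in the notebook.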

Prerequisites

Before running the notebook, you need to install the required packages and import the necessary libraries. Run the following commands in a code cell:

# Install required packages (each listed once; the extra is quoted so the shell does not expand the brackets)
!pip install -q numpy PyPDF2 sentence-transformers faiss-cpu transformers nltk gradio openvino-nightly
!pip install -q "optimum[openvino]"

# Import required libraries
import numpy as np
import PyPDF2
from sentence_transformers import SentenceTransformer
import faiss
from transformers import AutoTokenizer
from optimum.intel import OVModelForCausalLM
import gc
import torch
import nltk
import gradio as gr
import tempfile
import os

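# Download NLTK sentence tokenizer data
nltk.download('punkt', quiet=True)
# Newer NLTK releases look for 'punkt_tab' instead of 'punkt'
nltk.download('punkt_tab', quiet=True)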

Functionality

PDF Processing: The notebook reads the uploaded PDF and extracts its text content.
Semantic Chunking: The extracted text is divided into semantic chunks for better context preservation.
Vector Store Creation: Chunks are embedded using a sentence transformer model and stored in a FAISS index for quick retrieval.
Question Answering: When a user asks a question, the system:
- Finds the most relevant chunks using semantic similarity
- Constructs a prompt with the question and relevant context
- Generates an answer using the TinyLlama model
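
The retrieval and prompt-construction steps above can be sketched in plain Python. This is an illustration under stated assumptions: toy 2-D vectors stand in for real sentence embeddings, a brute-force cosine-similarity ranking stands in for the FAISS search, and the names `cosine_similarity`, `top_k_chunks`, and `build_prompt` as well as the prompt wording are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k_chunks(query_vec, chunk_vecs, chunks: List[str], k: int = 3) -> List[str]:
    """Return the k chunks whose vectors are most similar to the query vector."""
    scored = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine_similarity(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in scored[:k]]


def build_prompt(question: str, context_chunks: List[str]) -> str:
    """Combine the retrieved context and the question into one prompt string."""
    context = "\n\n".join(context_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


# Toy data: in the notebook these vectors would come from the embedding model.
chunks = ["OpenVINO speeds up inference.", "FAISS stores vectors.", "Gradio builds UIs."]
chunk_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
best = top_k_chunks([0.9, 0.1], chunk_vecs, chunks, k=2)
prompt = build_prompt("What speeds up inference?", best)
```

The resulting `prompt` string is what would be passed to the language model in the final step.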

Configuration

The notebook uses default configurations, but you can modify the following:

Chunk size and overlap in the create_semantic_chunks function
Number of relevant chunks retrieved (k) in the chatbot function
Model used for embeddings and language generation
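
To show what chunk size and overlap control, here is a minimal chunking sketch. It is not the notebook's `create_semantic_chunks`: the function name `chunk_sentences`, its parameters `max_chars` and `overlap`, and the regex-based sentence splitting (used here instead of NLTK to keep the example dependency-free) are all assumptions.

```python
import re
from typing import List


def chunk_sentences(text: str, max_chars: int = 500, overlap: int = 1) -> List[str]:
    """Group sentences into chunks of roughly max_chars characters.

    The last `overlap` sentences of each finished chunk are repeated at the
    start of the next one so that context is not cut off at chunk borders.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: List[str] = []
    current: List[str] = []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > max_chars:
            chunks.append(" ".join(current))
            # Carry the tail of the finished chunk forward as overlap.
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


example = chunk_sentences("One. Two. Three. Four.", max_chars=10, overlap=1)
```

Larger chunks give the model more context per retrieved item but fewer, coarser retrieval targets; more overlap reduces the chance of splitting an answer across chunks at the cost of a larger index.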

OpenVINO Integration

This project leverages Intel's OpenVINO toolkit for optimized inference:

The TinyLlama model is loaded and exported to the OpenVINO format using OVModelForCausalLM from the optimum.intel package.
This allows for hardware-specific optimizations, potentially improving inference speed and efficiency, especially on Intel hardware.
OpenVINO optimizations are particularly beneficial for larger models or high-volume query processing.
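
A minimal sketch of this loading step is shown below. The checkpoint name `TinyLlama/TinyLlama-1.1B-Chat-v1.0`, the helper names `load_openvino_llm` and `generate_answer`, and the `max_new_tokens` default are assumptions; the README does not state which TinyLlama checkpoint or generation settings the notebook uses.

```python
MODEL_ID = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # assumed checkpoint, not confirmed by the README


def load_openvino_llm(model_id: str = MODEL_ID):
    """Load the tokenizer and an OpenVINO-backed causal language model."""
    from optimum.intel import OVModelForCausalLM
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # export=True converts the original weights to the OpenVINO format on load.
    model = OVModelForCausalLM.from_pretrained(model_id, export=True)
    return tokenizer, model


def generate_answer(tokenizer, model, prompt: str, max_new_tokens: int = 128) -> str:
    """Generate a completion and return only the newly produced text."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so the answer does not echo the prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Typical usage would be `tokenizer, model = load_openvino_llm()` once at startup, followed by `generate_answer(tokenizer, model, prompt)` for each question. The first call downloads and converts the model, so it takes noticeably longer than later calls.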

System Requirements

Google Colab environment (or local setup with similar specifications)
Internet connection for downloading models and libraries
For optimal performance with OpenVINO, an Intel CPU is recommended

Performance Notes

The use of OpenVINO optimizations may significantly improve performance, especially on Intel hardware
Performance benefits may be more noticeable with larger models or when processing many queries
The TinyLlama model is relatively small, which allows for quick responses but may limit the complexity of answers

Troubleshooting

If you encounter CUDA out-of-memory errors, try restarting the runtime or switching to a CPU-only runtime.
Ensure all required libraries are correctly installed. Check the error message for missing packages.
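
One quick way to check for missing packages before running the rest of the notebook is to test whether each module can be found. The sketch below uses only the standard library; the helper name `find_missing_modules` is hypothetical, and the module list mirrors the import names from the Prerequisites section.

```python
import importlib.util
from typing import Iterable, List

# Import names (not pip package names) used by the notebook's import cell.
REQUIRED_MODULES = [
    "numpy", "PyPDF2", "sentence_transformers", "faiss",
    "transformers", "optimum", "nltk", "gradio", "torch",
]


def find_missing_modules(modules: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules are importable.")
```

Note that import names can differ from pip names: for example, the `faiss` module is installed by the `faiss-cpu` package.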

Limitations

Performance depends on the quality and length of the uploaded PDF.
The chatbot uses a small language model (TinyLlama), which may limit response quality for complex queries.

Future Improvements

Support for multiple PDF uploads
Integration with more powerful language models
Implementation of conversation history and context awareness
Fine-tuning options for specific domains

Team Members and their contributions

This project was developed by a team of 5 members as part of the Intel Unnati Industrial Training 2024:

Swetakshi Nanda: Project lead, architecture design
Pratyush Pahari: LLM integration and OpenVINO optimization
Arpan Bag: PDF processing and embedding generation
Akashdeep Mitra: User interface development and integration
Tulika Chakraborty: Project documentation

We would like to thank our mentor Abhishek Nandy and the Intel Unnati program for their guidance and support throughout this project.

Contributing

Contributions to improve the chatbot are welcome. Feel free to fork the repository and submit a pull request, or open an issue to discuss your changes.

License

MIT License

Acknowledgements

This project uses several open-source libraries and models, including PyPDF2, sentence-transformers, FAISS, TinyLlama, OpenVINO, and Gradio.
