Giter Club home page Giter Club logo

intelunnati's Introduction

Project for Intel Unnati Industrial Training 2024

PDF Chatbot using OpenVINO and RAG

A PDF-based chatbot leveraging OpenVINO and RAG techniques for efficient question-answering, developed as part of the Intel Unnati Industrial Training 2024.

Description

This project demonstrates how to create a chatbot that can answer questions related to a given PDF document using the Retrieval Augmented Generation (RAG) technique. The chatbot is implemented in Google Colab and uses various libraries including OpenVINO for efficient inference.

The project consists of a single notebook that performs the following tasks:

  1. Reads and processes a PDF file
  2. Generates a vector store from the PDF content
  3. Uses a Language Model (LLM) to answer questions based on the vector store

Components

Component Description
PDF Processing Extracts text from a PDF file using PyPDF2
Vector Store Generation Creates a FAISS index from text chunks using sentence-transformers
LLM Integration Uses TinyLlama model for generating responses
OpenVINO Optimization Leverages Intel's OpenVINO toolkit for optimized model inference
User Interface Implements a Gradio interface for easy interaction

How to Run

  1. Open the notebook in Google Colab.
  2. Run all cells in order.
  3. Upload a PDF file when prompted.
  4. Ask questions about the PDF content using the Gradio interface.

Prerequisites

Before running the notebook, you need to install the required packages and import the necessary libraries. Run the following commands in a code cell:

# Install required packages
!pip install -q transformers sentence-transformers faiss-cpu PyPDF2 openvino-nightly
!pip install -q optimum[openvino]
!pip install numpy PyPDF2 sentence-transformers faiss-cpu optimum[intel] transformers nltk gradio

# Import required libraries
import numpy as np
import PyPDF2
from sentence_transformers import SentenceTransformer
import faiss
from transformers import AutoTokenizer
from optimum.intel import OVModelForCausalLM
import gc
import torch
import nltk
import gradio as gr
import tempfile
import os

# Download NLTK data
nltk.download('punkt', quiet=True)

Functionality

  1. PDF Processing: The notebook reads the uploaded PDF and extracts its text content.
  2. Semantic Chunking: The extracted text is divided into semantic chunks for better context preservation.
  3. Vector Store Creation: Chunks are embedded using a sentence transformer model and stored in a FAISS index for quick retrieval.
  4. Question Answering: When a user asks a question, the system:
    • Finds the most relevant chunks using semantic similarity
    • Constructs a prompt with the question and relevant context
    • Generates an answer using the TinyLlama model

Configuration

The notebook uses default configurations, but you can modify the following:

  • Chunk size and overlap in the create_semantic_chunks function
  • Number of relevant chunks retrieved (k) in the chatbot function
  • Model used for embeddings and language generation

OpenVINO Integration

This project leverages Intel's OpenVINO toolkit for optimized inference:

  • The TinyLlama model is loaded and exported using OVModelForCausalLM from the optimum.intel package.
  • This allows for hardware-specific optimizations, potentially improving inference speed and efficiency, especially on Intel hardware.
  • OpenVINO optimizations are particularly beneficial for larger models or high-volume query processing.

System Requirements

  • Google Colab environment (or local setup with similar specifications)
  • Internet connection for downloading models and libraries
  • For optimal performance with OpenVINO, Intel CPU is recommended

Performance Notes

  • The use of OpenVINO optimizations may significantly improve performance, especially on Intel hardware
  • Performance benefits may be more noticeable with larger models or when processing many queries
  • The TinyLlama model is relatively small, which allows for quick responses but may limit the complexity of answers

Troubleshooting

  • If CUDA out of memory errors is encountered, try restarting the runtime or using a CPU-only version.
  • Ensure all required libraries are correctly installed. Check the error message for missing packages.

Limitations

  • Performance depends on the quality and length of the uploaded PDF.
  • Uses a small language model (TinyLlama) which may limit response quality for complex queries.

Future Improvements

  • Support for multiple PDF uploads
  • Integration with more powerful language models
  • Implementation of conversation history and context awareness
  • Fine-tuning options for specific domains

Team Members and their contributions

This project was developed by a team of 5 members as part of the Intel Unnati Industrial Training 2024:

We would like to thank our mentor Abhishek Nandy and the Intel Unnati program for their guidance and support throughout this project.

Contributing

Contributions to improve the chatbot are welcome. Please feel free to fork the repository and submit a Pull Request or open an issue to discuss about your changes.

License

MIT License

Acknowledgements

This project uses several open-source libraries and models:

intelunnati's People

Contributors

paharipratyush avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.