AIDT RAG

Reproducibility kit for the paper "The JavaScript Package Selection Task: A Comparative Experiment Using an LLM-based Approach".

After configuring an appropriate Python environment (e.g., using conda), you can run:

  • a standalone version of the RAG system
  • the experiments reported in the paper

For a quick start with the RAG system, simply run:

import pandas as pd
from io import StringIO
from langchain_openai import ChatOpenAI
from rag import AIDTRag

# Configuring OpenAI (GPT)
OPENAI_API_KEY = "your OpenAI API KEY goes here"

# Database of technologies
GITHUB_DATASET = "./data/libraries_github.pkl"
technologies_df = pd.read_pickle(GITHUB_DATASET) 
# Ingestion of the technologies as documents for the database
documents = AIDTRag.load_documents(technologies_df)

TOP_K = 5
rag = AIDTRag(technologies_df, k=TOP_K)
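# Pass the key explicitly; otherwise ChatOpenAI falls back to the OPENAI_API_KEY environment variable
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.0, openai_api_key=OPENAI_API_KEY)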
rag.set_llm(llm)

query = "extract a barcode from an image"
print("Query:", query)
print()

print("Zero-shot:", TOP_K)
# Invoke zero-shot strategy
zeroshot_json = rag.search(query)
print(zeroshot_json)

print("Retrieval + GPT-3.5 (RAG):", TOP_K)
# Invoke RAG strategy
rag_json = rag.execute(query, rerank='gpt-3.5', explain=True)
print(rag_json)

print()
# ranking_df = pd.read_json(StringIO(zeroshot_json))
ranking_df = pd.read_json(StringIO(rag_json))
# Outside a notebook, the head of the ranking is only shown if printed
print(ranking_df.head())

To use Llama2 instead of GPT, first install Ollama and download the corresponding model.
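The Llama2 prerequisite can be checked from Python before running anything else. The following is a minimal sketch, not part of the repository: the helper names (`build_pull_command`, `ensure_model`) are hypothetical, and the model tag `llama2` is an assumption about which Ollama tag you want. It only relies on the `ollama pull` CLI command and the standard library.

```python
import shutil
import subprocess
from typing import List

# Assumption: "llama2" is the Ollama tag of the model you intend to use.
MODEL_TAG = "llama2"


def build_pull_command(model_tag: str = MODEL_TAG) -> List[str]:
    """Return the Ollama CLI command that downloads the given model."""
    return ["ollama", "pull", model_tag]


def ensure_model(model_tag: str = MODEL_TAG, dry_run: bool = True) -> bool:
    """Download the model if the Ollama CLI is available on PATH.

    Returns False when Ollama is not installed. Returns True when the
    pull command succeeded, or would be run in dry-run mode.
    """
    if shutil.which("ollama") is None:
        print("Ollama not found on PATH; install it first.")
        return False
    command = build_pull_command(model_tag)
    if dry_run:
        print("Would run:", " ".join(command))
        return True
    # check=True raises CalledProcessError if the download fails
    subprocess.run(command, check=True)
    return True


if __name__ == "__main__":
    ensure_model(dry_run=True)
```

Calling `ensure_model(dry_run=False)` performs the actual download; the dry-run default only reports what would be executed.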

Each experiment can be run individually, for example:

python experiment2_tool.py

Running an experiment automatically generates two CSV files containing the metrics and the rankings.

All the CSV files collected from the experiments can be analyzed with Jupyter notebooks.
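As a starting point for such a notebook, the generated CSV files can be loaded into pandas in one pass. This is a minimal sketch under assumptions: the function name `load_experiment_csvs` is hypothetical, the files are assumed to sit in a single directory, and nothing is assumed about their column names, so each file is kept as its own DataFrame.

```python
from pathlib import Path

import pandas as pd


def load_experiment_csvs(results_dir=".", pattern="*.csv"):
    """Load every CSV file in results_dir into a dict keyed by file name.

    The files are kept separate because the metrics and rankings files
    may not share the same columns.
    """
    return {
        csv_path.name: pd.read_csv(csv_path)
        for csv_path in sorted(Path(results_dir).glob(pattern))
    }


if __name__ == "__main__":
    # Assumption: the experiments wrote their CSV files to the current directory.
    tables = load_experiment_csvs(".")
    for name, df in tables.items():
        print(name, df.shape)
```

From there, each DataFrame can be inspected or plotted individually in the notebook.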
