🗂️ GPT Index

GPT Index is a project consisting of a set of data structures that are created using LLMs and can be traversed using LLMs in order to answer queries.

PyPI: https://pypi.org/project/gpt-index/.

Documentation: https://gpt-index.readthedocs.io/en/latest/.

🚀 Overview

NOTE: This README is not updated as frequently as the documentation. Please check out the documentation above for the latest updates!

Context

  • LLMs are a phenomenal piece of technology for knowledge generation and reasoning.
  • A major limitation of LLMs is context size. For example, OpenAI's davinci model for GPT-3 has a limit of 4096 tokens: large, but not infinite.
  • The ability to feed "knowledge" to LLMs is restricted to this limited prompt size and to the model weights.
  • Thought: What if LLMs could access a potentially much larger database of knowledge without retraining or fine-tuning?
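To make the context-size limit concrete, here is a minimal sketch of splitting a corpus into pieces that each fit a prompt budget. It is an illustration only, not GPT Index's own splitter: the function name `split_into_chunks` is hypothetical, and whitespace-separated words stand in for tokens (a real tokenizer such as tiktoken will count differently).

```python
from typing import List


def split_into_chunks(text: str, max_tokens: int) -> List[str]:
    """Split text into consecutive chunks of at most max_tokens tokens.

    Assumption: whitespace-separated words stand in for model tokens.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    words = text.split()
    # Step through the word list in strides of max_tokens.
    return [
        " ".join(words[start:start + max_tokens])
        for start in range(0, len(words), max_tokens)
    ]


# A corpus of 10,000 stand-in tokens does not fit in one 4096-token prompt.
corpus = "word " * 10_000
chunks = split_into_chunks(corpus, max_tokens=4096)
print(len(chunks))  # 3 chunks: 4096 + 4096 + 1808 words
```

Each chunk can then be sent to the model separately, which is the starting point for building an index over the whole corpus.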

Proposed Solution

That's where the GPT Index comes in. GPT Index is a simple, flexible interface between your external data and LLMs. It resolves the following pain points:

  • Provides simple data structures to resolve prompt size limitations.
  • Offers data connectors to your external data sources.
  • Offers a comprehensive toolset for trading off cost and performance.

At the core of GPT Index is a data structure. Instead of relying on world knowledge encoded in the model weights, a GPT Index data structure does the following:

  • Uses a pre-trained LLM primarily for reasoning/summarization instead of prior knowledge.
  • Takes a large corpus of text data as input and builds a structured index over it (using an LLM or heuristics).
  • Allows users to query the index in order to synthesize an answer to the question. This requires both a traversal of the index and a synthesis of the answer.
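The build-then-traverse idea above can be sketched with a toy tree index. This is a simplified illustration under stated assumptions, not the GPT Index implementation: `Node`, `build_tree`, `query_tree`, `concat_summary`, and `overlap_score` are hypothetical names, and the two places where a real system would call an LLM (summarizing child nodes and choosing which child to follow) are replaced by plain string concatenation and word overlap so that the example runs offline. The `child_branch_factor` argument is named after the parameter in the query call shown under Example Usage; its behavior here is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    text: str
    children: List["Node"] = field(default_factory=list)


def build_tree(chunks: List[str],
               summarize: Callable[[List[str]], str],
               num_children: int = 2) -> Node:
    """Build the tree bottom-up: group nodes, summarize each group into a parent."""
    if not chunks:
        raise ValueError("need at least one chunk")
    if num_children < 2:
        raise ValueError("num_children must be at least 2")
    level = [Node(chunk) for chunk in chunks]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), num_children):
            group = level[i:i + num_children]
            parents.append(Node(summarize([n.text for n in group]), group))
        level = parents
    return level[0]


def query_tree(root: Node,
               question: str,
               score: Callable[[str, str], float],
               child_branch_factor: int = 1) -> List[str]:
    """Walk from the root to the leaves, following the best-scoring children."""
    if not root.children:
        return [root.text]
    ranked = sorted(root.children,
                    key=lambda n: score(question, n.text),
                    reverse=True)
    results: List[str] = []
    for child in ranked[:child_branch_factor]:
        results.extend(query_tree(child, question, score, child_branch_factor))
    return results


def concat_summary(texts: List[str]) -> str:
    # Stand-in for an LLM summarization call (assumption of this sketch).
    return " ".join(texts)


def _words(text: str) -> set:
    return {w.strip(".,?!").lower() for w in text.split()}


def overlap_score(question: str, text: str) -> float:
    # Stand-in for asking an LLM which child is most relevant.
    return len(_words(question) & _words(text))


chunks = [
    "Paris is the capital of France.",
    "Berlin is the capital of Germany.",
    "Python is a programming language.",
    "Rust is a systems programming language.",
]
tree = build_tree(chunks, concat_summary)
answer = query_tree(tree, "What is the capital of Germany?", overlap_score)
print(answer)  # ['Berlin is the capital of Germany.']
```

In this toy version the retrieved leaf text is returned directly. The synthesis step described above would instead pass the retrieved text and the question to an LLM to produce the final answer.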

💡 Contributing

Interested in contributing? See our Contribution Guide for more details.

📄 Documentation

Full documentation can be found here: https://gpt-index.readthedocs.io/en/latest/.

Please check it out for the most up-to-date tutorials, how-to guides, references, and other resources!

💻 Example Usage

pip install gpt-index

Examples are in the examples folder. Indices are in the indices folder (see list of indices below).

To build a tree index, do the following:

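from gpt_index import GPTTreeIndex, SimpleDirectoryReader

# Load the files in the local 'data' directory as documents.
documents = SimpleDirectoryReader('data').load_data()
# Build a tree index over the loaded documents.
index = GPTTreeIndex(documents)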

To save the index to disk and load it back:

# save to disk
index.save_to_disk('index.json')
# load from disk
index = GPTTreeIndex.load_from_disk('index.json')

To query the index:

index.query("<question_text>?", child_branch_factor=1)

🔧 Dependencies

The main third-party package requirements are tiktoken, openai, and langchain.

All requirements should be contained within the setup.py file. To run the package locally without building the wheel, run pip install -r requirements.txt.

