TolyGPT

TolyGPT.com is a chatbot powered by GPT-4 and trained on the Solana validator codebase. You can use TolyGPT to ask questions about how the Solana validator works and receive highly specific answers with references back to code files.

This version of TolyGPT is kept here mostly for research and record-keeping purposes. If you're interested in something similar for your project, please see Autodoc, which contains an updated version of the core TolyGPT functionality designed for use with your own repositories.

Credit

This project was originally forked from Sean Sullivan's chatlangchain-js repository. The credit for the UI and core querying flow goes to him.

Other contributors include:

Getting Started

NOTE: The instructions below may not work. Please see Autodoc for an updated version. If you must use this project, do so at your own risk.

This is a Next.js project bootstrapped with create-next-app.

First, create a new .env file from .env.example and add your OpenAI API key found here.

cp .env.example .env
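As a quick sanity check after creating .env, a small helper can fail fast when the key is missing. This is only a sketch: the function name `requireEnv` is hypothetical, and the variable name `OPENAI_API_KEY` is an assumption, so use whatever name appears in .env.example.

```typescript
// Hypothetical helper (not part of this repo): read a required variable
// and throw a clear error if it is missing or blank.
function requireEnv(
  name: string,
  env: Record<string, string | undefined> = process.env
): string {
  const value = env[name];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing ${name}; add it to your .env file`);
  }
  return value;
}

// Usage (assumes the variable is called OPENAI_API_KEY):
// const apiKey = requireEnv("OPENAI_API_KEY");
```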

Prerequisites

  • Node.js (v16 or higher)
  • Yarn
  • wget (on macOS, you can install this with brew install wget)

Next, we'll need to load our data source.

Data Ingestion

Data ingestion happens in two steps.

First, you should run

sh download.sh

This will download our data source (in this case, the LangChain docs).

Next, install dependencies and run the ingestion script:

yarn && yarn ingest

Note: If on Node v16, use NODE_OPTIONS='--experimental-fetch' yarn ingest

This will parse the data, split text, create embeddings, store them in a vectorstore, and then save it to the data/ directory.
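To illustrate the "split text" step, here is a minimal sketch of fixed-size chunking with overlap. It is not the ingestion script's actual splitter; the function name `splitText` and the default sizes are assumptions chosen for illustration.

```typescript
// Hypothetical splitter: cut text into chunks of at most `chunkSize`
// characters, where consecutive chunks share `chunkOverlap` characters.
// The default values are illustrative, not the project's settings.
function splitText(text: string, chunkSize = 1000, chunkOverlap = 200): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in the range [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current chunk reaches the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

Overlap keeps sentences that straddle a chunk boundary visible in both neighbouring chunks, at the cost of embedding some text twice.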

We save it to a directory because we only want to run the (expensive) data ingestion process once.

The Next.js server relies on the presence of the data/ directory. Please make sure to run this before moving on to the next step.
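If you want to confirm that ingestion has run before starting the server, a check along these lines may help. It is a sketch only: `hasVectorStore` is a hypothetical name, and it only verifies that the directory exists and is non-empty, not that its contents are a valid vectorstore.

```typescript
import { existsSync, readdirSync, statSync } from "node:fs";

// Hypothetical check: true only if `dir` is an existing, non-empty directory.
function hasVectorStore(dir = "data"): boolean {
  if (!existsSync(dir)) return false;
  if (!statSync(dir).isDirectory()) return false;
  return readdirSync(dir).length > 0;
}

// Usage:
// if (!hasVectorStore()) {
//   console.error("data/ is missing or empty; run `yarn ingest` first");
// }
```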

Running the Server

Then, run the development server:

yarn dev

Open http://localhost:3000 with your browser to see the result.

Deploying the Server

The production version of this repo is hosted on Fly. To deploy your own server on Fly, you can use the provided fly.toml and Dockerfile as a starting point.

Note: Since this is a Next.js app, Vercel would seem like a natural place to host it. Unfortunately, using secure websockets via ws with Next.js requires a custom server, which cannot be hosted on Vercel. Server-sent events do not appear to help either, since Vercel's serverless functions seem to prohibit streaming responses (e.g. see here).
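For context on what a streamed response involves, the sketch below formats a single server-sent event frame following the standard text/event-stream wire format. This project streams over websockets, so this is not code from the repo; `toSseFrame` is a hypothetical helper.

```typescript
// Hypothetical helper: build one text/event-stream frame.
// Each line of the payload gets its own "data:" field, and a blank line
// terminates the frame.
function toSseFrame(data: string, event?: string): string {
  const lines: string[] = [];
  if (event) lines.push(`event: ${event}`);
  for (const line of data.split(/\r?\n/)) {
    lines.push(`data: ${line}`);
  }
  return lines.join("\n") + "\n\n";
}
```

A host that buffers the whole response before sending it would deliver all such frames at once, which defeats token-by-token streaming.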

Inspirations

This repo borrows heavily from

How To Run on Your Example

If you'd like to chat with your own data, you need to:

  1. Set up your own ingestion pipeline, and create a similar data/ directory with a vectorstore in it.
  2. Change the prompt used in pages/api/util.ts - right now this tells the chatbot to only respond to questions about LangChain, so in order to get it to work on your data you'll need to update it accordingly.

The server should work just the same 😄
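As a rough idea of what the prompt change in step 2 could look like, here is a sketch of a prompt parameterised by topic. The actual prompt in pages/api/util.ts may be structured differently; `buildPrompt` and its wording are hypothetical.

```typescript
// Hypothetical prompt builder: restricts the assistant to one topic and
// injects the retrieved context ahead of the user's question.
function buildPrompt(topic: string, context: string, question: string): string {
  return [
    `You are an AI assistant for ${topic}.`,
    `Only answer questions about ${topic}, using the context below.`,
    "If the answer is not in the context, say that you are not sure.",
    "",
    "Context:",
    context,
    "",
    `Question: ${question}`,
    "Answer:",
  ].join("\n");
}

// Usage:
// const prompt = buildPrompt("the Solana validator", retrievedDocs, userQuestion);
```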
