tim.nem.ai's Introduction

Tim Ferriss AI

Here's a link to try it out

To explore what's possible with OpenAI's latest embeddings model, text-embedding-ada-002, I spent the weekend building a Tim Ferriss AI that answers questions addressed to him or any of his past guests.

We can use it to get human-like answers based on what was said in any episode.

TL;DR

The site uses semantic search to find the chunks of text, across all episodes, that discuss what the question asks. It then uses a GPT-3 model to turn those chunks into a coherent answer.

Examples

A few examples of how it works:

caffeine deep-creative-work
dopamine habits
investments sleep

Run loop

When you pose a question, the following things happen:

  1. question text gets embedded
  2. that embedding gets matched to N closest embeddings across all transcript chunks
  3. the matched chunks get combined into a context string
  4. the context string and the question get combined into a prompt
  5. the prompt is sent to another AI model, which formulates a coherent answer
  6. a sorted-by-similarity list of episode links from all matched chunks is included with the answer (since all those episodes talk about what the question asked)
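
Step 2 is the core of the semantic search. As a minimal in-memory sketch of what "N closest embeddings" can mean, the snippet below ranks chunks by cosine similarity to the question embedding. The similarity metric, the function names, and the `{ text, episodeUrl, embedding }` chunk shape are assumptions made for illustration; the site itself delegates this search to the database.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction, so treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the n chunks most similar to the question embedding, best match first.
// Each chunk is assumed to look like { text, episodeUrl, embedding }.
function findClosestChunks(questionEmbedding, chunks, n) {
  return chunks
    .map((chunk) => ({
      ...chunk,
      similarity: cosineSimilarity(questionEmbedding, chunk.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, n);
}
```

A linear scan like this is only practical for small collections; for a full set of episode transcripts, a vector index in the database is the more realistic choice.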

Code

The loop above translates to the following code:

// question text gets embedded
const embedding = await getEmbedding(question);

// embedding gets matched to N closest embeddings across all transcript chunks
const transcriptChunks = await matchTranscriptChunks(question, embedding);

// matched chunks get combined into a context string
const context = combineChunksIntoContext(transcriptChunks);

// context string and the question get combined into a prompt
const prompt = buildPrompt(context, question);

// prompt is sent to another AI model to formulate into a coherent answer
const answer = await getAnswer(prompt);

// include a sorted-by-similarity list of episode links from all chunks
const sortedEpisodes = await getMatchedEpisodesSortedByRelevance(transcriptChunks);
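
The helper bodies are not shown here, so the following is only a guess at how three of them might look, not the project's actual implementation. It assumes each matched chunk carries `text`, `episodeUrl`, and `similarity` fields; the character budget, the separator, and the prompt wording are all made up for illustration.

```javascript
// Hypothetical: join matched chunk texts into one context string,
// stopping before an assumed character budget is exceeded.
function combineChunksIntoContext(chunks, maxChars = 6000) {
  const parts = [];
  let used = 0;
  for (const chunk of chunks) {
    if (used + chunk.text.length > maxChars) break;
    parts.push(chunk.text);
    used += chunk.text.length;
  }
  return parts.join("\n---\n");
}

// Hypothetical: wrap the context and the question in instructions for the model.
function buildPrompt(context, question) {
  return [
    "Answer the question using only the podcast excerpts below.",
    "If the excerpts do not contain the answer, say so.",
    "",
    "Excerpts:",
    context,
    "",
    `Question: ${question}`,
    "Answer:",
  ].join("\n");
}

// Hypothetical: one entry per episode, ordered by its best-matching chunk.
function getMatchedEpisodesSortedByRelevance(chunks) {
  const best = new Map();
  for (const { episodeUrl, similarity } of chunks) {
    if (!best.has(episodeUrl) || similarity > best.get(episodeUrl)) {
      best.set(episodeUrl, similarity);
    }
  }
  return [...best.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([episodeUrl, similarity]) => ({ episodeUrl, similarity }));
}
```

Deduplicating by episode matters because several matched chunks often come from the same episode, and a list of links should mention each episode once.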

Setup

I crawled most of the episode transcripts, chunked them into smaller, roughly paragraph-sized segments of text, and then used the embeddings model to embed each chunk into a 1536-dimensional vector.
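
A rough sketch of that preprocessing step follows. The chunking rule (split on blank lines, merge short paragraphs up to a size limit) and the names `chunkTranscript` and `embedChunk` are assumptions, since the actual crawler and chunker are not shown. The request goes to OpenAI's embeddings endpoint and assumes Node 18+ for the global `fetch`.

```javascript
// Split a transcript into roughly paragraph-sized chunks.
// Paragraphs are assumed to be separated by blank lines; short paragraphs
// are merged until adding one more would exceed maxChars (an assumed limit).
// A single paragraph longer than maxChars is kept whole as its own chunk.
function chunkTranscript(transcript, maxChars = 1000) {
  const paragraphs = transcript
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (current && candidate.length > maxChars) {
      chunks.push(current);
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}

// Embed one chunk with OpenAI's embeddings endpoint.
// The API key is passed in by the caller.
async function embedChunk(text, apiKey) {
  const response = await fetch("https://api.openai.com/v1/embeddings", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({ model: "text-embedding-ada-002", input: text }),
  });
  if (!response.ok) {
    throw new Error(`Embedding request failed with status ${response.status}`);
  }
  const json = await response.json();
  return json.data[0].embedding; // array of 1536 numbers for this model
}
```

Keeping chunks near paragraph size is a trade-off: smaller chunks match questions more precisely, while larger ones give the answering model more surrounding context.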

The frontend is a Next.js app, the data is stored in Supabase, and the embeddings search uses pgvector.
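
To give a feel for the pgvector side, here is a sketch of a nearest-neighbour query built in JavaScript. The table name `transcript_chunks`, its columns, and the function name `buildMatchQuery` are hypothetical, and using cosine distance is an assumption; the project's real schema and query are not shown.

```javascript
// Build a parameterized nearest-neighbour query for pgvector.
// Table and column names are hypothetical.
function buildMatchQuery(questionEmbedding, matchCount) {
  return {
    // `<=>` is pgvector's cosine distance operator, so 1 - distance
    // gives the cosine similarity.
    text: `
      SELECT id, episode_url, content,
             1 - (embedding <=> $1::vector) AS similarity
      FROM transcript_chunks
      ORDER BY embedding <=> $1::vector
      LIMIT $2
    `,
    // pgvector accepts a vector written as a bracketed list, e.g. '[0.1,0.2]'.
    values: [`[${questionEmbedding.join(",")}]`, matchCount],
  };
}
```

The returned `{ text, values }` object has the shape that node-postgres accepts as a query config. With Supabase, a query like this would more likely live in a Postgres function that the client calls, though that is a guess about the setup.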
