Fullstack Cloudflare RAG

This is a fullstack example of how to build a RAG (Retrieval-Augmented Generation) app with Cloudflare. It uses Cloudflare Workers, Pages, D1, KV, R2, Vectorize, AI Gateway and Workers AI.

Features:

  • Every interaction is streamed to the UI using Server-Sent Events
  • Hybrid RAG using Full-Text Search on D1 and Vector Search on Vectorize
  • Switchable between various providers (OpenAI, Groq, Anthropic) using AI Gateway with fallbacks
  • Per-IP rate limiting using Cloudflare KV
  • OCR runs inside a Cloudflare Worker using unpdf
  • Smart Placement automatically places your workloads in an optimal location that minimizes latency and speeds up your applications
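
As an illustration of the streaming feature, here is a minimal sketch of how a single Server-Sent Events message could be encoded before it is written to the response stream. The function name `formatSseEvent` and the event names in the usage note are hypothetical and not taken from this project; only the wire format (`event:` and `data:` fields terminated by a blank line) comes from the SSE standard.

```typescript
// Hypothetical helper, not part of this project's code.
// Encodes one Server-Sent Events message in the standard wire format.
function formatSseEvent(
  event: string,
  data: string | Record<string, unknown>,
): string {
  // Objects are serialized as JSON; strings are sent as they are.
  const payload = typeof data === "string" ? data : JSON.stringify(data);

  // A payload containing newlines must be split so that every line
  // gets its own "data:" field.
  const dataLines = payload
    .split(/\r?\n/)
    .map((line) => `data: ${line}`)
    .join("\n");

  // A blank line terminates the message.
  return `event: ${event}\n${dataLines}\n\n`;
}
```

A Worker could write `formatSseEvent("message", { text: "Searching documents" })` to a response served with the `text/event-stream` content type, and the browser's `EventSource` would deliver it as a `message` event.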

Development

Make sure you have Node, pnpm and wrangler CLI installed.

Install dependencies:

pnpm install # or npm install

Deploy necessary primitives:

./setup.sh

Then, in wrangler.toml, set d1_databases.database_id to your D1 database ID and kv_namespaces.rate_limiter to the ID of your rate limiter KV namespace.

Then, create a .dev.vars file with your API keys:

CLOUDFLARE_ACCOUNT_ID=your-cloudflare-account-id # Required
GROQ_API_KEY=your-groq-api-key # Optional
OPENAI_API_KEY=your-openai-api-key # Optional
ANTHROPIC_API_KEY=your-anthropic-api-key # Optional

If you don't provide the optional provider keys, /api/stream will fall back to Workers AI.
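
The fallback behaviour can be pictured with a small sketch. This is not the project's actual logic: the `providerChain` function, the `Provider` type, and the ordering of providers are assumptions made for illustration. It only shows the idea that providers without a key are skipped and Workers AI is always available as the last resort.

```typescript
// Hypothetical types and names, used only for illustration.
type Provider = "groq" | "openai" | "anthropic" | "workers-ai";

interface ProviderEnv {
  GROQ_API_KEY?: string;
  OPENAI_API_KEY?: string;
  ANTHROPIC_API_KEY?: string;
}

// Build an ordered list of providers to try.
// The order below is an assumption, not the project's documented order.
function providerChain(env: ProviderEnv): Provider[] {
  const chain: Provider[] = [];
  if (env.GROQ_API_KEY) chain.push("groq");
  if (env.OPENAI_API_KEY) chain.push("openai");
  if (env.ANTHROPIC_API_KEY) chain.push("anthropic");
  // Workers AI needs no third-party key, so it always closes the chain.
  chain.push("workers-ai");
  return chain;
}
```

With an empty `.dev.vars` (apart from the account ID), the chain contains only `"workers-ai"`.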

Run the dev server:

npm run dev

And access the app at http://localhost:5173/.

Deployment

With the necessary primitives set up, first set up the secrets:

npx wrangler secret put CLOUDFLARE_ACCOUNT_ID
npx wrangler secret put GROQ_API_KEY
npx wrangler secret put OPENAI_API_KEY
npx wrangler secret put ANTHROPIC_API_KEY

Then, deploy your app to Cloudflare Pages:

npm run deploy

Hybrid Search RAG

This project combines classical full-text search (sparse) against Cloudflare D1 with vector search over embeddings in Vectorize (dense), giving the LLM the most relevant context from both approaches.

The way it works is this:

  1. We take the user input and rewrite it into 5 different queries using an LLM
  2. We run each of these queries against both datastores: D1, using BM25 for full-text search, and Vectorize for dense retrieval
  3. We merge the results from both datastores using Reciprocal Rank Fusion, which gives us a single ranked list of results
  4. We take the top 10 results from this list and pass them to the LLM to generate a response
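
The merge in steps 3 and 4 can be sketched as follows. This is a minimal illustration of Reciprocal Rank Fusion, not the project's implementation: the `RankedResult` shape, the function name, and the constant `k = 60` (a commonly used default) are assumptions. Each result scores the sum of `1 / (k + rank)` over every list it appears in, with ranks starting at 1.

```typescript
// Hypothetical shape of a retrieved chunk; the project's real types may differ.
interface RankedResult {
  id: string;
  text: string;
}

// Merge several ranked lists (best result first) into one list
// using Reciprocal Rank Fusion, keeping only the topN results.
function reciprocalRankFusion(
  lists: RankedResult[][],
  k = 60, // assumed smoothing constant
  topN = 10,
): RankedResult[] {
  const scores = new Map<string, { item: RankedResult; score: number }>();

  for (const list of lists) {
    list.forEach((item, index) => {
      // index is 0-based, so the rank is index + 1.
      const contribution = 1 / (k + index + 1);
      const entry = scores.get(item.id);
      if (entry) {
        entry.score += contribution;
      } else {
        scores.set(item.id, { item, score: contribution });
      }
    });
  }

  return Array.from(scores.values())
    .sort((a, b) => b.score - a.score)
    .slice(0, topN)
    .map((entry) => entry.item);
}
```

In this setup, `lists` would hold one ranked list per rewritten query and datastore, so a chunk that both BM25 and vector search rank highly rises to the top of the fused list.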

License

This project is licensed under the terms of the MIT License.

Consulting

If you need help in building AI applications, please reach out to me on Twitter or via my website. Happy to help!

Contributors

  • rafalwilinski
