
Amazon Bedrock / Llama2 / Falcon with serverless RAG on the Amazon OpenSearch Serverless vector DB

Overview

A new wave of widespread AI adoption is on the way with generative AI, which has the potential to reinvent every aspect of customer experiences and applications. Generative AI is powered by very large machine learning models that are pre-trained on vast amounts of data, commonly referred to as foundation models (FMs). Large Language Models (LLMs) are a subset of FMs trained on trillions of words; they learn the patterns in language, allowing them to generate human-like responses to any query. However, because foundation models are trained on very general domain corpora, they are less effective for domain-specific tasks. That is where RAG comes in: with Retrieval Augmented Generation (RAG), you retrieve data from outside the foundation model and augment your prompts by adding the relevant retrieved data in context.

Text generation using RAG with LLMs enables you to generate domain-specific text outputs by supplying specific external data as part of the context fed to the LLM. With RAG, the external data used to augment your prompts can come from multiple data sources, such as document repositories, databases, or APIs. The first step is to convert your documents and any user queries into a compatible format so that a relevancy search can be performed. To make the formats compatible, a document collection (knowledge library) and user-submitted queries are converted to numerical representations using embedding language models. Embedding is the process by which text is given a numerical representation in a vector space. RAG model architectures compare the embeddings of user queries against the vectors in the knowledge library. The original user prompt is then appended with relevant context from similar documents within the knowledge library, and this augmented prompt is sent to the foundation model. You can update knowledge libraries and their relevant embeddings asynchronously.
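The retrieve-and-augment flow above can be sketched in a few lines of Python. The `embed` function below is a deliberately crude stand-in for a real embedding model (such as one hosted on Amazon Bedrock); the function names, toy documents, and trigram bucketing are all illustrative, not the solution's actual code:

```python
import math

def embed(text: str, dims: int = 64) -> list[float]:
    # Crude stand-in for a real embedding model: bucket character trigrams
    # into a fixed-size vector, then L2-normalise it.
    vec = [0.0] * dims
    for i in range(len(text) - 2):
        bucket = sum(ord(c) for c in text[i:i + 3]) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already normalised, so the dot product is cosine similarity.
    return sum(x * y for x, y in zip(a, b))

# The "knowledge library": documents embedded ahead of time (asynchronously).
docs = [
    "Amazon OpenSearch Serverless offers a vector engine for similarity search.",
    "Llama2 is a family of large language models released by Meta.",
]
index = [(doc, embed(doc)) for doc in docs]

def augment_prompt(query: str, top_k: int = 1) -> str:
    # Rank library documents by similarity to the query embedding, then
    # prepend the best matches as context for the foundation model.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    context = "\n".join(doc for doc, _ in ranked[:top_k])
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(augment_prompt("Which service provides a vector engine?"))
```

In the real stack, the embeddings live in the OpenSearch Serverless vector index rather than an in-memory list, and the augmented prompt is sent to the model on Bedrock or SageMaker.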

Amazon OpenSearch Serverless offers a vector engine to store embeddings for fast similarity searches. The vector engine provides simple, scalable, high-performing similarity search in Amazon OpenSearch Serverless, making it easy for you to build generative artificial intelligence (AI) applications without having to manage the underlying vector database infrastructure.
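For reference, a k-NN index mapping and query for the vector engine might look like the following. The field names (`embedding`, `text`), the 1536 dimension, and the method parameters are assumptions for illustration; the real values depend on your embedding model and on the engines your OpenSearch collection supports:

```python
# Illustrative k-NN index mapping for the OpenSearch vector engine; the field
# names and dimension are assumptions, not the project's actual schema.
index_body = {
    "settings": {"index": {"knn": True}},
    "mappings": {
        "properties": {
            "embedding": {
                "type": "knn_vector",
                "dimension": 1536,  # must match the embedding model's output size
                "method": {
                    "name": "hnsw",
                    "engine": "nmslib",
                    "space_type": "cosinesimil",
                },
            },
            "text": {"type": "text"},
        }
    },
}

def knn_query(query_vector: list[float], k: int = 4) -> dict:
    # Retrieve the k nearest neighbours of the query embedding.
    return {
        "size": k,
        "query": {"knn": {"embedding": {"vector": query_vector, "k": k}}},
    }
```

These bodies would be passed to an OpenSearch client (e.g. opensearch-py) via its index-creation and search calls.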

Project Updates

(13-Dec-2023):

  • Support for Meta Llama2 models on Amazon Bedrock. Support for Anthropic's latest Claude 2.1 model (200K context length).

(09-Nov-2023):

  • Support for Conversations with OpenSearch Serverless (BETA)

(27-Oct-2023):

  • Improved UI

(18-Oct-2023):

  • Support for French/German with Anthropic Claude on Amazon Bedrock
  • Support for a Redaction feature
  • Inbuilt text chunking feature with RecursiveTextSplitter from LangChain
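The chunking step can be pictured with a simplified recursive splitter in the spirit of LangChain's splitter: try progressively finer separators until each chunk fits. This is an illustrative re-implementation, not the library's actual code or defaults:

```python
def split_text(text, chunk_size=100, separators=("\n\n", "\n", " ", "")):
    # Recursively split on progressively finer separators until every
    # chunk fits within chunk_size characters.
    if len(text) <= chunk_size:
        return [text]
    sep, rest = separators[0], (separators[1:] or ("",))
    if sep == "":
        # Last resort: hard cut at chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate
            continue
        if current:
            chunks.append(current)
        if len(piece) > chunk_size:
            # The piece alone is still too big: recurse with finer separators.
            chunks.extend(split_text(piece, chunk_size, rest))
            current = ""
        else:
            current = piece
    if current:
        chunks.append(current)
    return chunks
```

Each resulting chunk is then embedded and indexed individually, so retrieval returns focused passages rather than whole documents.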

(03-Oct-2023): Support for Amazon Bedrock

  • Anthropic Claude V1/V2/Instant support over Amazon Bedrock
  • Support for Streaming ingestion with Anthropic Claude Models
  • Faster Stack Deployments
  • New Functionality (PII/Sentiment/Translations) added on the UI

(14-Sept-2023): Support for new LLMs

  • Llama2-7B (Existing G5.2xlarge)
  • Llama2-13B (G5.12xlarge)
  • Llama2-70B (G5.48xlarge)
  • Falcon-7B (G5.2xlarge)
  • Falcon-40B (G5.12xlarge)
  • Falcon-180B (p4de.24xlarge)

New UX/UI (13-Sept-2023): Index sample data across different domains. Support for multiple assistant behaviours (Normal/Pirate/Jarvis assistant modes)

Available Features

Multi-lingual Support


Sentiment Analysis


PII Data Detection


PII Data Redaction

Bedrock RAG Demo

Bedrock RAG Demo Video

Introducing Conversations with Opensearch Serverless
bedrock_conversations.mov
Translations / Sentiment Analysis / PII Identification and Redaction
github-final.mov
Llama2 RAG Demo

Llama2 RAG Demo

ImprovedVectorDB.mov

This solution demonstrates building a RAG (Retrieval Augmented Generation) solution with the Amazon OpenSearch Serverless vector DB and Amazon Bedrock, Llama2, and Falcon LLMs.

Prerequisites

Familiarity with the services below

For Llama2/Falcon models deployed on Amazon SageMaker

  • Amazon SageMaker
  • GPU instance of type ml.g5.2xlarge for endpoint usage
  • Supported Llama2 regions (us-east-1, us-east-2, us-west-2, eu-west-1, and ap-southeast-1)

Architecture

(architecture diagram)

Deploying the Solution to your AWS account with AWS Cloudshell

Create an Admin User to deploy this stack

Section 1 - Create an IAM user with Administrator permissions (OPTIONAL: if you already have an Admin user/role, you may skip this step)

  1. Search for the IAM service on the AWS Console, go to the IAM Dashboard, click on the "Users" tab under "Access Management", and click on "Create User".

  2. Give the IAM user a name and click "Next".

  3. Now click on "Attach Policies directly", choose "AdministratorAccess", and click "Next".

  4. Review the details and click on "Create User".

  5. Now we need to create credentials for this IAM user. Go to the "Users" tab again and you will see your new user listed there. Click on the username.

  6. Go to the "Security Credentials" tab and, under "Access Keys", click on "Create Access key".

  7. In the window that appears, choose the first option, "Command Line Interface", tick the checkbox at the bottom, and click "Next".

  8. The Tag is optional; you can leave it empty and click on "Create Access Key".

  9. Click on "Download .csv file" to download the credentials and click on "Done". Now let's proceed to Section 2.
Deploy the RAG-based solution (total deployment time: 40 minutes)

Section 2 - Deploy this RAG-based solution (the below commands should be executed in the region of deployment)

  1. Search for AWS CloudShell. Configure your AWS CLI environment with the access/secret keys of the new admin user using the below command in AWS CloudShell (optional if you have assumed an Administrator role):

       aws configure

  2. Git clone the serverless-rag-demo repository from aws-samples:

      git clone https://github.com/aws-samples/serverless-rag-demo.git

  3. Go to the directory with the downloaded files:

      cd serverless-rag-demo

  4. Fire the bash script that creates the RAG-based solution. Pass the environment and region for deployment; the environment can be dev, qa, or sandbox. Look at the Prerequisites to deploy to the correct region.

      sh creator.sh

  5. Select the LLM you want to deploy (sh creator.sh). Select Option 1 for the Amazon Bedrock service.

  6. When selecting Amazon Bedrock (Option 1), you should specify an API Key. The key should be at least 20 characters long.

  7. Press Enter to proceed with deployment of the stack, or Ctrl+C to exit.

  8. Total deployment takes around 40 minutes. Once the deployment is complete, head to API Gateway, search for the API named rag-llm-api-{env_name}, and get the invoke URL for the API.

  9. Invoke the API Gateway URL, which loads an HTML page for testing the RAG-based solution, as api-gateway-url/rag.

     • Do not forget to append "rag" at the end of the API-GW URL

     eg: https://xxxxxxx.execute-api.us-east-1.amazonaws.com/dev/rag

     Enter the API Key used during the Amazon Bedrock stack deployment to proceed with the demo.
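Once the stack is up, a quick way to sanity-check the endpoint outside the browser is a short script. The invoke URL below is a placeholder for your own rag-llm-api-{env_name} URL:

```python
import urllib.request

# Placeholder — substitute the invoke URL of your rag-llm-api-{env_name} API.
API_URL = "https://xxxxxxx.execute-api.us-east-1.amazonaws.com/dev"

def rag_page_url(invoke_url: str) -> str:
    # The test page is served under the /rag path of the stage URL.
    return invoke_url.rstrip("/") + "/rag"

# Fetching the page (only works once the stack is deployed):
# with urllib.request.urlopen(rag_page_url(API_URL)) as resp:
#     print(resp.status)
```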

serverless-rag-demo's People

Contributors

fraser27, aswinisuren, amazon-auto
