
gpt-azure-search-engine


Accelerator powered by Azure Cognitive Search + Azure OpenAI

Your organization needs a search engine that can make sense of many types of data stored in different locations, return links to similar documents, and, most importantly, answer the question. In other words, you want a private and secure ChatGPT for your organization that can interpret, comprehend, and answer questions about your business. The goal of the MVP workshop is to demonstrate the value of a smart search engine built with Azure services, using your own data in your own environment. For more information on the 2-day workshop, see the PowerPoint presentation below:

Accelerator Pitch Deck

Click "View raw" to download the PowerPoint presentation

Prerequisites for the Client 2-Day Workshop

  • Azure subscription
  • Approved application for Azure OpenAI access
  • Microsoft members need to be added as Guests in the client's Azure AD
  • A Resource Group (RG) must be created for this Workshop POC in the customer's Azure tenant
  • The customer team and the Microsoft team must have Contributor permissions on this resource group
  • A storage account must be set up in the RG
  • Data/documents must be uploaded to the blob storage account at least one week prior to the workshop date
  • An Azure Machine Learning Workspace must be deployed in the RG
  • Optional but recommended: a Databricks Workspace deployed in the RG

Architecture


Demo

https://webapp-cstevuxaqrxcm.azurewebsites.net/


🔧Features

  • Shows how to combine Azure OpenAI and Azure Cognitive Search into a smart, multilingual search engine that not only returns links to search results but also answers the question.
  • Solves roughly 80% of the use cases where companies want OpenAI to answer questions from their knowledge base for customers or employees, without retraining or hosting the models themselves.
  • All Azure services and configuration are deployed via Python code.
  • Uses Azure Cognitive Services to enrich documents: language detection, OCR of images, key-phrase extraction, and entity recognition (persons, emails, addresses, organizations, URLs).
  • Uses LangChain as a wrapper for interacting with Azure OpenAI, vector stores, and prompt construction.
  • Uses Streamlit to build the web application in Python.
  • (Coming soon) Recommends new searches based on users' history.
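The retrieve-then-answer pattern the features describe can be sketched in a few lines. This is illustrative only: the function name, prompt wording, and search-hit shape are assumptions, not the repo's actual code.

```python
# Sketch of the "search first, then ask the LLM" pattern (illustrative only).

def build_prompt(question, search_results):
    """Combine top search hits into a grounded prompt for the LLM."""
    context = "\n\n".join(
        f"[{hit['title']}] {hit['snippet']}" for hit in search_results
    )
    return (
        "Answer the question using ONLY the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

# Toy search hits standing in for Azure Cognitive Search results.
hits = [
    {"title": "HR Policy", "snippet": "Employees accrue 20 vacation days per year."},
    {"title": "Benefits FAQ", "snippet": "Unused days roll over up to 5 days."},
]
prompt = build_prompt("How many vacation days do I get?", hits)
```

The prompt would then be sent to the deployed chat model; grounding the answer in retrieved snippets is what makes the engine "answer the question" rather than just return links.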

Steps to Run the Accelerator

Note: (Prerequisite) You need to have an Azure OpenAI service already created.

  1. Fork this repo to your GitHub account.
  2. In Azure OpenAI Studio, deploy these two models (make sure the deployment name is the same as the model name):
    • "gpt-35-turbo" for the model "gpt-35-turbo (0301)"
    • "text-embedding-ada-002"
  3. Create a Resource Group where all the assets of this accelerator are going to be.
  4. Create an Azure Cognitive Search Service and Cognitive Services Account by clicking below:

Deploy To Azure

Note: If you have never created a Cognitive Services multi-service account before, create one manually in the Azure portal first in order to read and accept the Responsible AI terms. Once it is deployed, delete it and then use the deployment button above.

  5. Enable Semantic Search on your Azure Cognitive Search service:
    • On the left-nav pane, select Semantic Search (Preview).
    • Select either the Free plan or the Standard plan. You can switch between the two plans at any time.
  6. Install the dependencies on your machine:
pip install -r ./requirements.txt
  7. Edit app/credentials.py with your Azure services information.
  8. Run 01-Load-Data-ACogSearch.ipynb:
    • Loads data into your search engine and creates the index with AI skills.
  9. Run 02-Quering-AOpenAI.ipynb:
    • Runs queries in Azure Cognitive Search and shows how they compare when the experience is enhanced with Azure OpenAI.
  10. Go to the app/ folder and click the Deploy to Azure button to deploy the web application to Azure App Service. It takes a few minutes.
  • The deployment comes with CI/CD, so any change you commit/push to the code automatically triggers a deployment of the application.
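The app/credentials.py file referenced in the steps holds the service endpoints and keys. A hedged sketch of what it might contain follows; the variable names and API version are assumptions, so match them to the actual file in the repo and never commit real keys.

```python
# app/credentials.py -- placeholder values; variable names are illustrative.

AZURE_SEARCH_ENDPOINT = "https://<your-search-service>.search.windows.net"
AZURE_SEARCH_KEY = "<your-search-admin-key>"
AZURE_SEARCH_INDEX = "<your-index-name>"

AZURE_OPENAI_ENDPOINT = "https://<your-openai-resource>.openai.azure.com"
AZURE_OPENAI_KEY = "<your-openai-key>"
AZURE_OPENAI_API_VERSION = "<api-version>"  # e.g. the version your resource supports
```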

FAQs

  1. Why is the vector similarity done in memory using FAISS instead of in a separate vector database like RediSearch or Pinecone?

A: True, computing embeddings of the document pages every time there is a query is not efficient. The ideal scenario is to vectorize the document pages once (the first time they are needed) and then retrieve them from a database on subsequent queries, which requires a dedicated vector database. Better still would be for Azure Search to return the vectors as part of the search results along with the document pages; Azure Search is expected to support this in a few months. As of right now, the embedding process does not take much time or money, so it is worth waiting rather than adding another database just for vectors.
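The in-memory approach the answer describes can be sketched without FAISS: embed once, keep the vectors in a plain dict, and score by cosine similarity. The vectors below are toy 3-dimensional stand-ins, not real text-embedding-ada-002 output (which is 1536-dimensional).

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings" standing in for text-embedding-ada-002 vectors.
doc_vectors = {
    "page-1": [0.9, 0.1, 0.0],
    "page-2": [0.1, 0.9, 0.2],
}
query_vec = [0.85, 0.15, 0.05]

# Pick the page whose vector is closest to the query.
best = max(doc_vectors, key=lambda k: cosine(query_vec, doc_vectors[k]))
```

FAISS does the same nearest-neighbor lookup, just with optimized index structures; the trade-off discussed above is only about where the vectors live between queries.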

  1. Why use the REFINE chain type in LangChain instead of the STUFF type?

A: Because using the STUFF type with the full content of all pages as context consumes too many tokens. The best way to deal with large documents is to refine the answer by going through all of the search results, making multiple calls to the LLM in search of a refined answer. For more information on the difference between STUFF and REFINE, see the LangChain documentation.
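The refine strategy can be sketched as a loop: each LLM call sees only one document plus the running answer, so no single prompt has to fit all pages at once. Here fake_llm is a stand-in for the real Azure OpenAI call, and the prompt wording is illustrative rather than LangChain's actual refine prompt.

```python
def fake_llm(prompt):
    """Stand-in for an Azure OpenAI completion call (echoes prompt tail)."""
    return f"answer informed by: {prompt[-60:]}"

def refine_answer(question, documents):
    """REFINE pattern: one LLM call per document, carrying the answer forward."""
    answer = ""
    for doc in documents:
        if not answer:
            # First pass: answer from the first document alone.
            prompt = f"Q: {question}\nContext: {doc}\nAnswer:"
        else:
            # Later passes: improve the existing answer with new context.
            prompt = (
                f"Q: {question}\nExisting answer: {answer}\n"
                f"New context: {doc}\nRefine the answer:"
            )
        answer = fake_llm(prompt)
    return answer

result = refine_answer("What is the policy?", ["doc one text", "doc two text"])
```

With STUFF, all documents would be concatenated into one prompt; REFINE trades extra LLM calls for a bounded per-call token count.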

  1. Why use Azure Cognitive Search engine to provide the context for the LLM and not fine tune the LLM instead?

A: Quoting the OpenAI documentation: "GPT-3 has been pre-trained on a vast amount of text from the open internet. When given a prompt with just a few examples, it can often intuit what task you are trying to perform and generate a plausible completion. This is often called 'few-shot learning.' Fine-tuning improves on few-shot learning by training on many more examples than can fit in the prompt, letting you achieve better results on a wide number of tasks. Once a model has been fine-tuned, you won't need to provide examples in the prompt anymore. This saves costs and enables lower-latency requests."

So training/fine-tuning the model requires providing hundreds or thousands of prompt-and-completion tuples; in other words, samples of query-response pairs. For a company knowledge base of terabytes of information, this is not feasible: it is simply not possible to anticipate every tuple users might request. So a search engine is absolutely necessary for a company data search engine built on OpenAI.


Known Issues

  1. Error when sending question: "This model's maximum context length is 2047 tokens, however you requested xxxx tokens (xxxxx in your prompt; 0 for the completion). Please reduce your prompt; or completion length"

This error happens if your text-embedding-ada-002 embedding model has a limit of 2047 max tokens. Older versions of this model in Azure OpenAI have this reduced limit; newer versions have an 8192-token limit. Make sure you request the newer version, or, if that is not possible, reduce the chunk size of the Text Split skill in Azure Search indexing from 5000 (the default) to 3500.
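Reducing the split size as suggested can be sketched as a simple character-based splitter. This is a naive illustration of the idea; the actual Text Split skill in the Azure Search skillset is configured declaratively, and the 3500 figure comes from the note above.

```python
def split_text(text, chunk_size=3500):
    """Naive fixed-size splitter; real skillsets can also respect sentence bounds."""
    return [text[i : i + chunk_size] for i in range(0, len(text), chunk_size)]

# A 9000-character document becomes chunks of 3500, 3500, and 2000 characters,
# each small enough to embed without hitting the older 2047-token limit.
chunks = split_text("x" * 9000, chunk_size=3500)
```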

Contributors

pablomarin · giorgiosaez · juliaashleyh8
