A Solution Accelerator for the RAG pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. This includes most common requirements and best practices.

Home Page: https://azure.microsoft.com/products/search

License: MIT License

Python 73.13% HTML 0.04% TypeScript 5.73% CSS 1.56% Bicep 16.94% Makefile 0.29% Shell 0.86% Dockerfile 0.28% PowerShell 0.02% Jinja 1.15%

ai-search azd-templates azure azure-openai openai

chat-with-your-data-solution-accelerator's Issues

Branch Protection policy

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

At the moment, the main branch does not have any protection on it. Add a branch protection rule to prevent anyone from being able to merge to main.

Responsible AI Guidelines

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Update with any Responsible AI style and comments: https://styleguides.azurewebsites.net/StyleGuide/Read?id=2984

Expected/desired behavior

Double check if we are following the Responsible AI style guidelines.

Secure Speech Service Key/Region

Motivation

Ensure sensitive credentials cannot be exposed. We are currently returning the Azure Speech Key from the /api/config endpoint in plain text

chat-with-your-data-solution-accelerator/code/app.py

Lines 32 to 40 in ede0d6d

    @app.route("/api/config", methods=["GET"])
    def get_config():
        # Return the configuration data as JSON
        return jsonify(
            {
                "azureSpeechKey": env_helper.AZURE_SPEECH_KEY,
                "azureSpeechRegion": env_helper.AZURE_SPEECH_SERVICE_REGION,
            }
        )

How would you feel if this feature request was implemented?

Requirements

Stop returning the value of the Azure Speech Key from the /api/config endpoint. Consider using SecretStr from pydantic https://docs.pydantic.dev/2.6/examples/secrets/ to return a masked version of the secret
Ensure that the speech service from the frontend is still functional (it calls the /api/config endpoint) #501
Stretch: Improve this endpoint to return the full config with all secrets masked
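
The masking idea in the requirements could be sketched as follows. This is a minimal illustration using only the standard library, not the repository's code: `SECRET_KEYS`, `mask_secret`, `build_public_config`, and the sample values are all hypothetical. pydantic's SecretStr, suggested above, would be an alternative way to get the same effect.

```python
# Hypothetical helpers illustrating masked config output; not the repo's code.
SECRET_KEYS = {"azureSpeechKey"}  # assumed set of keys that must never leave the server


def mask_secret(value: str) -> str:
    """Return a fixed-width mask so neither the value nor its length leaks."""
    return "**********" if value else ""


def build_public_config(raw: dict) -> dict:
    """Copy the config, masking every key listed in SECRET_KEYS."""
    return {
        key: mask_secret(value) if key in SECRET_KEYS else value
        for key, value in raw.items()
    }


config = build_public_config(
    {"azureSpeechKey": "not-a-real-key", "azureSpeechRegion": "westeurope"}
)
print(config)
```

Note that a masked key cannot be used by the frontend to call the speech service, so the second requirement would still need a separate mechanism for the browser to authenticate.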

Tasks

To be filled in by the engineer picking up the issue

Request for updating ReadMe.md file

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Getting an error when running docker compose up on WSL

Any log messages given by the failure

Failed to load /home/alias/chat-with-your-data-solution-accelerator/docker/...env: open /home/alias/chat-with-your-data-solution-accelerator/docker/...env: no such file or directory.

Expected/desired behavior

OS and Version?

Linux (Ubuntu)

Versions

Mention any other details that might be useful

The issue is the way the path to that .env file is set here: https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator/blob/main/docker/docker-compose.yml#L9. It uses a backslash (\), which only works on Windows. We need to modify docker-compose.yml to use a forward slash (/).

It would be great if this could be updated in the ReadMe.md file. It would also be great if we could add an autodetect feature where the type of slash to use is detected based on the OS.
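
The separator problem described above can be reproduced with Python's pathlib, as a rough sketch. The path strings are hypothetical stand-ins, not the value from docker-compose.yml.

```python
from pathlib import PurePosixPath, PureWindowsPath

win_style = r"config\.env"   # hypothetical path written with a backslash
posix_style = "config/.env"  # the same path written with a forward slash

# On POSIX the backslash is an ordinary character, so the whole string is one file name.
print(PurePosixPath(win_style).parts)

# Windows path rules accept both separators, so the forward-slash form splits correctly.
print(PureWindowsPath(posix_style).parts)
print(PurePosixPath(posix_style).parts)
```

Because the forward-slash form parses the same way under both sets of rules, OS autodetection may not be needed, assuming Docker Compose on Windows accepts forward slashes as well.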

Thanks! We'll be in touch soon.

Make sure the streamed response isn't repeating tools unnecessarily

I believe the app.py in this repo is based off the sample-app app.py. In that case, I believe it has a performance bug due to this line:

https://github.com/Azure-Samples/azure-search-openai-solution-accelerator/blob/db254e0316d0c9031158bb70a0531159b4d111af/app.py#L139

If you check in the browser when it's streaming, you'll see that the "tools" gets repeated in every chunk streamed, which requires a lot of bandwidth and is unnecessary. You should only need to stream the tools once, and the frontend can process it when it sees it.
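
A minimal sketch of the suggested fix, assuming a newline-delimited JSON stream. The chunk shape (`content` and `tools` keys) and the `stream_chunks` helper are hypothetical and do not mirror the actual app.py.

```python
import json
from typing import Iterable, Iterator


def stream_chunks(tools: dict, deltas: Iterable[str]) -> Iterator[str]:
    """Yield NDJSON lines; the tool payload is attached to the first line only."""
    first = True
    for delta in deltas:
        chunk = {"content": delta}
        if first:
            chunk["tools"] = tools  # sent once, the frontend keeps it for later chunks
            first = False
        yield json.dumps(chunk) + "\n"


lines = list(stream_chunks({"citations": ["doc1.pdf"]}, ["Hel", "lo", "!"]))
for line in lines:
    print(line, end="")
```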

Issue when run locally

I get an error when I start chatting with the bot running locally. Please help me fix the error if you know the cause. Thank you!

Bug - Some questions are suddenly generating "list index out of range" error

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Ask a question on the chatbot

Any log messages given by the failure

Expected/desired behavior

An answer from my documents

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Update settings for faster local dev / frontend.

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

When developing locally, I want to be able to more quickly develop the frontend and backend. Help me start the app in dev mode so that changes will be hot reloaded without having to rebuild the frontend and restart every time.

Mention any other details that might be useful

Updating the Vite server configuration to proxy to the Flask app running locally will allow this.


Issue when testing chat: The API deployment for this resource does not exist. If you created..

Hi,

I was implementing the chat-with-your-data-solution-accelerator solution, but I get an error when testing the chat.
Description of the error:

/api/conversation/custom:1

Failed to load resource: the server responded with a status of 500 (INTERNAL SERVER ERROR)

{
"error": "415 Unsupported Media Type: Did not attempt to load JSON data because the request Content-Type was not 'application/json'."
}

Ensure we are using latest TLS version on Azure services

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Storage and App Service should specify a later minimum TLS version (1.2).

Pin the indirect dependencies as well

Currently requirements.txt pins only the direct dependencies (as far as I can see). That means that an indirect dependency can cause a bug, which can be really painful for developers. We unfortunately experienced that in the other repo.

See this PR for an example of how we changed to pinning all dependencies:

Azure-Samples/azure-search-openai-demo#693

Support for LangChain agent

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Create OpenAI resources as part of one click deploy

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

The OpenAI resource and associated deployments are created as part of the one click deployment

Unable to deploy the accelerator due to old application insights

Hi,
This solution accelerator uses Application Insights, which is not allowed by my organization.
Is there any way to edit the repo or template and provide the components we already have in our system?

Action required: migrate or opt-out of migration to GitHub inside Microsoft

Migrate non-Open Source or non-External Collaboration repositories to GitHub inside Microsoft

In order to protect and secure Microsoft, private or internal repositories in GitHub for Open Source which are not related to open source projects or require collaboration with 3rd parties (customer, partners, etc.) must be migrated to GitHub inside Microsoft a.k.a GitHub Enterprise Cloud with Enterprise Managed User (GHEC EMU).

Action

✍️ Please RSVP to opt-in or opt-out of the migration to GitHub inside Microsoft.

❗Only users with admin permission in the repository are allowed to respond. Failure to provide a response will result in your repository getting automatically archived.🔒

Instructions

Reply with a comment on this issue containing one of the following optin or optout command options below.

✅ Opt-in to migrate

@gimsvc optin --date <target_migration_date in mm-dd-yyyy format>

Example: @gimsvc optin --date 03-15-2023

❌ Opt-out of migration

@gimsvc optout --reason <staging|collaboration|delete|other>

Example: @gimsvc optout --reason staging

Options:

staging : This repository will ship as Open Source or go public

collaboration : Used for external or 3rd party collaboration with customers, partners, suppliers, etc.

delete : This repository will be deleted because it is no longer needed.

other : Other reasons not specified

Need more help? 🖐️

Email [email protected]. ✉️
Post your questions in GitHub inside Microsoft Team in Microsoft Teams. 🗨️

Improve developer experience

Improve the developer experience. As a developer I would like to clone the repository and be able to deploy this accelerator from my machine and run everything locally with the minimum of ceremony.

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

from a single command. (Happy to take ownership of this btw!)

OpenAI embedding token limit - reprocess all documents

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Take a sample PDF file of about 10 MB (the one I used is https://freetestdata.com/wp-content/uploads/2022/11/Free_Test_Data_10.5MB_PDF.pdf) and upload it.

This results in a success in the Azure Function Batch Push Results (no token limit exceeded).

Upload more than one copy of the file at the same time so that, combined, they surpass the token limit. Let the data ingestion run, and an error will be displayed in the Batch Push Results.

(The same error occurs if several documents in your Azure Blob Storage were fine when uploaded and ingested individually but together surpass your token limit. If you click "Reprocess all documents" on the admin page, you will encounter the same error. In fact, the same Azure Function is called, but with the parameter set to process all files in blob storage rather than only those without embeddings.)

Any log messages given by the failure

Result: Failure Exception: RateLimitError: Requests to the Embeddings_Create Operation under Azure OpenAI API version 2023-07-01-preview have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 3 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.

Expected/desired behavior

What I would like the app to do: when it reaches the token limit (or estimates that the next file or chunk will surpass it), wait until the current minute finishes and then continue. If this is not possible at chunk level, it should at least work at file level. That way ingestion will not fail due to tokens-per-minute limits.
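
The requested behavior could look roughly like the sketch below. It is an illustration only: the `TokenBudget` class is hypothetical, the caller is assumed to supply a token estimate per chunk, and each chunk is assumed to fit within the per-minute limit on its own. The clock and sleep functions are injectable so the logic can be exercised without real waiting.

```python
import time
from typing import Callable


class TokenBudget:
    """Track tokens spent in the current window and wait when the budget is exhausted."""

    def __init__(
        self,
        tokens_per_minute: int,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
        window_seconds: float = 60.0,
    ) -> None:
        self.limit = tokens_per_minute
        self.clock = clock
        self.sleep = sleep
        self.window = window_seconds
        self.window_start = clock()
        self.used = 0

    def acquire(self, tokens: int) -> None:
        """Block until `tokens` fit in the current window, then record them."""
        now = self.clock()
        if now - self.window_start >= self.window:
            # A new window has begun, so the budget is fresh.
            self.window_start, self.used = now, 0
        if self.used + tokens > self.limit:
            # Wait out the remainder of the window, then start a new one.
            self.sleep(self.window - (now - self.window_start))
            self.window_start, self.used = self.clock(), 0
        self.used += tokens
```

A caller would invoke `budget.acquire(estimated_tokens)` before each embedding request, so the batch slows down instead of failing with a RateLimitError.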

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Error: `No module named 'utilities'`

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

In Azure Portal, open chatsa01-website-admin > Environment variables

Change value for AZURE_OPENAI_RESOURCE from "https://charris-openai4.openai.azure.com/" to "charris-openai4"

Refresh Explore Data page

Receive errors below
Change the value back

Continue to receive errors

Restart web app

Continue to receive errors

Navigate back to Admin page (no errors)

Navigate to Explore Data page (no errors)

Any log messages given by the failure

ModuleNotFoundError: No module named 'utilities'
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 534, in _run_script
exec(code, module.__dict__)
File "/usr/local/src/myscripts/admin/pages/01_Ingest_Data.py", line 14, in <module>
from utilities.helpers.ConfigHelper import ConfigHelper

Expected/desired behavior

The config value can be corrected without breaking the app.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Naming convention for app components: Frontend, Backend, Admin

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

Suggestion for the naming conventions for the pieces of the app. Today it is shown as these main pieces:

Frontend (which includes the Flask app and the TS React UI)
Backend (which includes the admin app/UI)
Batch Processing Functions

For clarity's sake, I'd suggest we refer to the "backend" as the "admin" application, as I believe many will conceptualize the main app's Flask component as the "backend" and the React application or UI as the "frontend." It's just a naming convention, but would it make sense to organize as:

1. App
   a. Frontend (TS React)
   b. Backend (Flask Application)
2. Admin App (Streamlit)
3. Batch Processing Functions

Requesting Support of PowerPoint file types

Requesting support for the PowerPoint file type. Currently only the following file types are supported:

PDF
JPEG
JPG
PNG
TXT
HTML
MD (Markdown)
DOCX

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Citations for URLs should include full URL including domain

Motivation

Citations should be clickable to enable users to quickly see full references.

Minimal steps to reproduce

Use the Admin site to ingest a URL (https://legislature.idaho.gov/statutesrules/idstat/Title25/T25CH28/SECT25-2801/)

Ask a question that can be satisfied by the URL content (Who can make an order requiring dog owners to get a license?)

Click the Citation button in the response

Notice that the citation does not include the domain of the website

Requirements

Citations should have a full clickable link including domain

Tasks

To be filled in by the engineer picking up the issue

Task 1
Task 2
...

Cannot use embedding model deployment with a different name from the model itself

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Attempt to use an embedding deployment with any name besides the model.

Example: Instead of naming the deployment "text-embedding-ada-002", name it "ada002" and set the following env variable:
AZURE_OPENAI_EMBEDDING_MODEL=ada002

Any log messages given by the failure

2023-07-14T21:57:03.449736723Z: [ERROR] openai.error.InvalidRequestError: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.

Expected/desired behavior

The deployment and model name do not have to be the same

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Azure Functions have separate envs and do not pull from .env

We need to make sure that .env is read when running Azure Functions locally; otherwise the user has to configure everything twice, once in .env and once in local.settings.json.
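
One possible workaround is to generate local.settings.json from .env, sketched below. The `env_to_local_settings` helper and the `EXAMPLE_QUOTED` variable are hypothetical; the parser handles only simple KEY=VALUE lines, and a real setup would likely add entries such as the Functions runtime settings.

```python
import json


def env_to_local_settings(env_text: str) -> dict:
    """Convert KEY=VALUE lines into the IsEncrypted/Values layout of local.settings.json."""
    values = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return {"IsEncrypted": False, "Values": values}


sample = '# comment\nAZURE_OPENAI_MODEL=gpt-35-turbo\nEXAMPLE_QUOTED="some value"\n'
settings = env_to_local_settings(sample)
print(json.dumps(settings, indent=2))
```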

ARM Deploy or Bicep Deploy --- Issue

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Deploy the ARM manually using “Deploy a custom template”

Copy and paste the deployment.json content into the editor.

Update the 4 parameters (rg, prefix, AI key, AI name)

Deploy

All resources are successfully created.

Navigate to the Admin web site.

Select Explore Data.

Error is returned

Navigate to the User web site.

Type in a message.

Error is returned.

Any log messages given by the failure

Admin - Explore Data
Traceback (most recent call last):
  File "/usr/local/src/myscripts/pages/02_Explore_Data.py", line 36, in <module>
    search_client = vector_store_helper.get_vector_store().client
  File "/usr/local/src/myscripts/utilities/helpers/AzureSearchHelper.py", line 33, in get_vector_store
    vector_search_dimensions=len(llm_helper.get_embedding_model().embed_query("Text")),
  File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 536, in embed_query
    embedding = self._embedding_func(text, engine=self.deployment)
  File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 467, in _embedding_func
    return embed_with_retry(
  File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 107, in embed_with_retry
    return _embed_with_retry(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 104, in _embed_with_retry
    response = embeddings.client.create(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/openai/api_resources/embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 298, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 700, in _interpret_response
    self._interpret_response_line(
  File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 763, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.

User - Chat
Failed to load resource: the server responded with a status of 500 (INTERNAL SERVER ERROR)
/api/conversation/custom:1

Expected/desired behavior

Admin - Explore Data --- View the ingested data
User - Chat --- A response from the chat bot

OS and Version?

Windows 11

Versions

No published version

Mention any other details that might be useful

Based on the error messaging, the APIs for this solution were not deployed.

I experienced the same issue using the Bicep deployment with Azure CLI.


Upgrade Azure functions to use v2 programming model

From a quick look, it seems like the functions aren't yet using the new programming model, which we want everyone to move towards eventually.

Blog post here:
https://techcommunity.microsoft.com/t5/azure-compute-blog/azure-functions-v2-python-programming-model-is-generally/ba-p/3827474

The embeddings operation does not work with the specified model, gpt-35-turbo

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Trying to use the .env file with these parameters:

AZURE_OPENAI_MODEL=gpt-35-turbo
AZURE_OPENAI_MODEL_NAME=gpt-35-turbo
AZURE_OPENAI_EMBEDDING_MODEL=35

Any log messages given by the failure

Error message: openai.error.InvalidRequestError: The embeddings operation does not work with the specified model, gpt-35-turbo. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993.
stacktrace:

Traceback (most recent call last):
  File "c:\proj\chat-assistant-deployment-data\app.py", line 55, in <module>
    document_processor.process(source_url=file_sas, processors=processors)
  File "c:\proj\chat-assistant-deployment-data\utilities\helpers\DocumentProcessorHelper.py", line 22, in process
    vector_store = vector_store_helper.get_vector_store()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\proj\chat-assistant-deployment-data\utilities\helpers\AzureSearchHelper.py", line 33, in get_vector_store
    vector_search_dimensions=len(llm_helper.get_embedding_model().embed_query("Text")),
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cpalomar\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\embeddings\openai.py", line 506, in embed_query
    return self.embed_documents([text])[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Expected/desired behavior

It should work with gpt-35-turbo according to the documentation.

OS and Version?

Windows 11

Mention any other details that might be useful


Add Chat History

Motivation

There should be a chat history, as in the AOAI bring your own data deployment. This allows users to leave the session, return later, and retain their previous chat.

How would you feel if this feature request was implemented?

Requirements

Design how to best persist the chat history
Chat history should be persisted for a user

To be decided:

How long does the chat need to be persisted for?
How do we link a chat history to a user?

Tasks

To be filled in by the engineer picking up the issue

Task 1
Task 2
...

Response types have wrong mime-type

Response currently has "event/stream" which is the mime-type for SSE, but the response is not SSE - it's a streaming HTTP response using Transfer-Encoding: chunked.

I agree with the choice of response type (as I detail in my blog post @ http://blog.pamelafox.org/2023/08/fetching-json-over-streaming-http.html ), but the mime-type should be changed. I think it can simply be dropped entirely.

Right now, the browser shows an Event Source tab, thinking that SSE events are coming in, and then none come in. If you make the change, you should be able to see the response come in correctly with no Event Source tab in Developer Tools.

Azure click to deploy error - There was an error downloading the template...

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Navigate to https://github.com/Azure-Samples/azure-search-openai-solution-accelerator and select Deploy to Azure

Azure opens and shows the following error:

Any log messages given by the failure

There was an error downloading the template from URI 'https://raw.githubusercontent.com/Azure-Samples/azure-search-openai-solution-accelerator/main/infrastructure/deployment.json?token=GHSAT0AAAAAAB47C325DQBSNOF2UZNHQE2CZGTZSTA'. Ensure that the template is publicly accessible and that the publisher has enabled CORS policy on the endpoint. To deploy this template, download the template manually and paste the contents in the 'Build your own template in the editor' option below.

Expected/desired behavior

Deploy a Template would load with the deployment.json content

OS and Version?

Windows 11

Versions

No version published.

Mention any other details that might be useful


Deploy to Azure Template Broken

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

From the README, select "Deploy to Azure"

The "Form Recognizer Location" is empty (this used to be auto-populated with the Location variable)

Enter the required fields and select "Next"

Error: Validation failed. View error details.

Any log messages given by the failure

{"code":"InvalidTemplate","message":"Deployment template validation failed: 'The template resource 'StorageAccountName' at line '264' and column '27' is not valid: Unable to evaluate the template language function 'substring'. The index and length parameters must refer to a location within the string. The index parameter: '0', the length parameter: '24', the length of the string parameter: '22'. Please see https://aka.ms/arm-function-substring for usage details.. Please see https://aka.ms/arm-functions for usage details.'."}

Expected/desired behavior

Validation should be successful.

OS and Version?

Windows 11

Versions

This issue occurs using the most recent PR #76

Mention any other details that might be useful


Improvements to json.dumps(r).replace("\n", "\\n")

I discussed that line with the creator of the jsonlines package, an expert in NDJSON, and he indicated that newlines do not need to be escaped a second time (they already will be), and he recommended specifying ensure_ascii=False so that it's compatible with emoji and other unicode characters.
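
Both points can be checked with the standard json module. The `record` value below is a made-up example, not data from the app.

```python
import json

record = {"content": "line one\nline two 🙂"}

default_line = json.dumps(record)
unicode_line = json.dumps(record, ensure_ascii=False)

# json.dumps already escapes the newline, so each record stays on one physical line
# and the extra .replace("\n", "\\n") has nothing left to change.
assert "\n" not in default_line and "\\n" in default_line
assert default_line.replace("\n", "\\n") == default_line

# With ensure_ascii=False the emoji is emitted as-is instead of as \u escapes.
assert "🙂" in unicode_line and "🙂" not in default_line

# Both forms decode back to the same object.
assert json.loads(default_line) == json.loads(unicode_line) == record
```

When writing the `ensure_ascii=False` form to an HTTP response, the output needs to be encoded as UTF-8.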

See this PR for how I changed it in another repo:
Azure-Samples/openai-chat-app-quickstart#22

And these tests for ensure_ascii:
https://github.com/Azure-Samples/azure-search-openai-demo/pull/532/files#diff-a67cb1853203a6f1956991a9d9881d231c4d43557f5baffd45cc672a87e41cc6R207

doc-processing deployment fails: System topic source cannot be modified. (Code: InvalidRequest)

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

When deploying using deployment.bicep

Any log messages given by the failure

{
"code": "InvalidRequest",
"message": "System topic source cannot be modified."
}

Expected/desired behavior

Successful deployment

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Addition of "Delete Data" Page in Admin WebApp

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

An option to add a delete data section within the Admin app to remove documents that have been indexed previously or by mistake during testing.


Use async framework, not Flask

Motivation

See my blog post here:
http://blog.pamelafox.org/2023/09/best-practices-for-openai-chat-apps.html

We ported the other sample to Quart, as it was a more 1:1 mapping from Flask, but the more popular async framework is FastAPI.

Such a change will enable users to handle more requests with fewer resources / lower SKUs.
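
As a rough illustration of why async handling helps with I/O-bound chat requests, the sketch below uses only asyncio from the standard library. `fake_openai_call` is a hypothetical stand-in for waiting on a model response; it does not use Quart, FastAPI, or the OpenAI SDK.

```python
import asyncio
import time


async def fake_openai_call(delay: float) -> str:
    """Stand-in for a network call: the event loop is free while this waits."""
    await asyncio.sleep(delay)
    return "done"


async def handle_many(n: int, delay: float) -> float:
    """Run n simulated requests concurrently on one thread and return elapsed seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_openai_call(delay) for _ in range(n)))
    return time.perf_counter() - start


elapsed = asyncio.run(handle_many(n=20, delay=0.1))
print(f"20 simulated requests took {elapsed:.2f}s")
```

Handled one after another, the same 20 waits would take about 2 seconds. Overlapping them on a single worker is where the resource savings would come from.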

How would you feel if this feature request was implemented?

Requirements

Switch to a more efficient application framework

Tasks

To be filled in by the engineer picking up the issue

Task 1
Task 2
...

Add chat history

Are you planning on adding chat history stored in CosmosDB, similarly to this solution https://github.com/microsoft/sample-app-aoai-chatGPT ?

text embedding URL construction needs more validation

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Deploy using the "Deploy to Azure" button

Set azureOpenAIResource: https://charris-openai4.openai.azure.com/

Set azureOpenAIEmbeddingModel: text-embedding-ada-002

Visit the Ingest_Data page

Paste in a URL and press the button to process

Receive the following error:

Error communicating with OpenAI: HTTPSConnectionPool(host='https', port=443): Max retries exceeded with url: //charris-openai4.openai.azure.com/.openai.azure.com//openai/deployments/text-embedding-ada-002/embeddings?api-version=2023-07-01-preview (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x73dd80a32740>: Failed to resolve 'https' ([Errno -2] Name or service not known)")

It looks like I provided the wrong value for azureOpenAIResource, but the instructions were not clear and there was nothing that caught the misconfiguration.

Any log messages given by the failure

Expected/desired behavior

The page is processed without error.
If there is a configuration error, however, I would like to be able to correct it on the Configuration page.
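The error above suggests the app builds the endpoint by wrapping the configured value in `https://` and `.openai.azure.com`, so a full URL ends up doubled. A minimal validation sketch follows; `normalize_openai_endpoint` is a hypothetical helper, not a function in this repo, and the `<name>.openai.azure.com` host pattern is assumed from the URL in the report. It accepts either a bare resource name or a full endpoint and rejects anything else early.

```python
from urllib.parse import urlparse


def normalize_openai_endpoint(resource: str) -> str:
    """Return a base endpoint URL from a resource name or a full endpoint URL.

    Hypothetical helper: the accepted input shapes are assumptions for illustration.
    """
    value = resource.strip().rstrip("/")
    if not value:
        raise ValueError("azureOpenAIResource must not be empty")

    if "://" in value:
        # A full URL was supplied, e.g. "https://name.openai.azure.com/".
        parsed = urlparse(value)
        if parsed.scheme != "https" or not parsed.hostname:
            raise ValueError(f"Unsupported endpoint: {resource!r}")
        return f"https://{parsed.hostname}"

    if "." in value:
        # A host name without a scheme, e.g. "name.openai.azure.com".
        return f"https://{value.split('/')[0]}"

    # A bare resource name, e.g. "name".
    return f"https://{value}.openai.azure.com"
```

Running the check when the setting is saved, rather than at the first embedding call, would surface the misconfiguration on the Configuration page instead of during ingestion.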

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful

Thanks! We'll be in touch soon.

setting ORCHESTRATION_STRATEGY to langchain did not seem to be adhered to

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

We are deploying this repo with the gpt-35-turbo model, so we assumed ORCHESTRATION_STRATEGY should be set to langchain. We set that in the ARM template, but after deploying we got errors in the front-end app and noticed in the config tab of the admin app that ORCHESTRATION_STRATEGY was set to the OpenAI Functions option.

Any log messages given by the failure

2023-10-23T13:55:28.708618922Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 700, in _interpret_response
2023-10-23T13:55:28.708646822Z: [ERROR] self._interpret_response_line(
2023-10-23T13:55:28.708655022Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 763, in _interpret_response_line
2023-10-23T13:55:28.708663323Z: [ERROR] raise self.handle_error_response(
2023-10-23T13:55:28.708671423Z: [ERROR] openai.error.InvalidRequestError: Unrecognized request argument supplied: functions

Expected/desired behavior

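The ORCHESTRATION_STRATEGY value set in the ARM template should be honored after deployment. Manually changing ORCHESTRATION_STRATEGY in the config tab and hitting save fixed the issue, but maybe someone can check the code.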
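One way the expected behavior could look is sketched below. This is an assumption about the desired precedence, not the repo's actual logic: `resolve_strategy` is a hypothetical helper, and the strategy names `"langchain"` and `"openai_functions"` are illustrative spellings. A value saved from the admin config tab wins, then the ORCHESTRATION_STRATEGY environment variable, then a built-in default.

```python
from typing import Mapping, Optional

# Hypothetical strategy names; the real identifiers in the repo may differ.
SUPPORTED_STRATEGIES = ("openai_functions", "langchain")


def resolve_strategy(
    saved: Optional[str],
    env: Mapping[str, str],
    default: str = "openai_functions",
) -> str:
    """Pick the orchestration strategy: saved config, then env var, then default."""
    for candidate in (saved, env.get("ORCHESTRATION_STRATEGY")):
        if candidate:
            normalized = candidate.strip().lower()
            if normalized not in SUPPORTED_STRATEGIES:
                raise ValueError(f"Unknown orchestration strategy: {candidate!r}")
            return normalized
    return default
```

With no saved config, `resolve_strategy(None, os.environ)` would return the deployment-time value instead of falling straight through to the default, which is the behavior the report expected.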

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful

Thanks! We'll be in touch soon.

openai.error.InvalidRequestError: Unrecognized request argument supplied: functions

Hello,
I deployed the template (and fixed the contentsafety endpoint in the App Service configuration) and I'm facing the following error.

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Just initiate a conversation

Any log messages given by the failure

See below

2023-10-22T21:32:51.077875015Z: [ERROR] [pid: 1|app: 0|req: 18/18] 169.254.131.1 () {84 vars in 1953 bytes} [Sun Oct 22 21:32:51 2023] GET / => generated 0 bytes in 3 msecs (HTTP/1.1 304) 4 headers in 178 bytes (0 switches on core 0)
2023-10-22T21:32:51.180146360Z: [ERROR] [pid: 1|app: 0|req: 19/19] 169.254.131.1 () {84 vars in 2006 bytes} [Sun Oct 22 21:32:51 2023] GET /assets/index-0aaa15a4.js => generated 0 bytes in 3 msecs (HTTP/1.1 304) 4 headers in 188 bytes (0 switches on core 0)
2023-10-22T21:32:51.200854687Z: [ERROR] [pid: 1|app: 0|req: 20/20] 169.254.131.1 () {82 vars in 1961 bytes} [Sun Oct 22 21:32:51 2023] GET /assets/index-09fe54a5.css => generated 0 bytes in 1 msecs (HTTP/1.1 304) 4 headers in 187 bytes (0 switches on core 0)
2023-10-22T21:32:51.303081733Z: [ERROR] [pid: 1|app: 0|req: 21/21] 169.254.131.1 () {82 vars in 1996 bytes} [Sun Oct 22 21:32:51 2023] GET /assets/Azure-30d5e7c0.svg => generated 0 bytes in 1 msecs (HTTP/1.1 304) 4 headers in 187 bytes (0 switches on core 0)
2023-10-22T21:32:51.523325493Z: [ERROR] [pid: 1|app: 0|req: 22/22] 169.254.131.1 () {82 vars in 1940 bytes} [Sun Oct 22 21:32:51 2023] GET /favicon.ico => generated 0 bytes in 1 msecs (HTTP/1.1 304) 4 headers in 180 bytes (0 switches on core 0)
2023-10-22T21:32:53.600357142Z: [ERROR] [pid: 1|app: 0|req: 23/23] 169.254.131.1 () {82 vars in 1991 bytes} [Sun Oct 22 21:32:53 2023] GET /assets/Send-d0601aaa.svg => generated 0 bytes in 6 msecs (HTTP/1.1 304) 4 headers in 185 bytes (0 switches on core 0)
2023-10-22T21:32:54.052951162Z: [ERROR] Returning default config
2023-10-22T21:32:54.150731345Z: [ERROR] Returning default config
2023-10-22T21:32:54.152476530Z: [ERROR] New message id: 8e3dce5a-f483-4b34-99f6-6e737baab98c with tokens {'prompt': 0, 'completion': 0, 'total': 0}
2023-10-22T21:32:54.486134144Z: [ERROR] ERROR:root:Exception in /api/conversation/custom
2023-10-22T21:32:54.486178444Z: [ERROR] Traceback (most recent call last):
2023-10-22T21:32:54.486186044Z: [ERROR] File "app.py", line 271, in conversation_custom
2023-10-22T21:32:54.486190444Z: [ERROR] messages = message_orchestrator.handle_message(user_message=user_message, chat_history=chat_history, conversation_id=conversation_id, orchestrator=ConfigHelper.get_active_config_or_default().orchestrator)
2023-10-22T21:32:54.486224543Z: [ERROR] File "/usr/src/app/./backend/utilities/helpers/OrchestratorHelper.py", line 13, in handle_message
2023-10-22T21:32:54.486229443Z: [ERROR] return orchestrator.handle_message(user_message, chat_history, conversation_id)
2023-10-22T21:32:54.486233243Z: [ERROR] File "/usr/src/app/./backend/utilities/orchestrator/OrchestratorBase.py", line 33, in handle_message
2023-10-22T21:32:54.486237143Z: [ERROR] result = self.orchestrate(user_message, chat_history, **kwargs)
2023-10-22T21:32:54.486240743Z: [ERROR] File "/usr/src/app/./backend/utilities/orchestrator/OpenAIFunctions.py", line 78, in orchestrate
2023-10-22T21:32:54.486244743Z: [ERROR] result = llm_helper.get_chat_completion_with_functions(messages, self.functions, function_call="auto")
2023-10-22T21:32:54.486248543Z: [ERROR] File "/usr/src/app/./backend/utilities/helpers/LLMHelper.py", line 34, in get_chat_completion_with_functions
2023-10-22T21:32:54.486252443Z: [ERROR] return openai.ChatCompletion.create(
2023-10-22T21:32:54.486256043Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_resources/chat_completion.py", line 25, in create
2023-10-22T21:32:54.486259943Z: [ERROR] return super().create(*args, **kwargs)
2023-10-22T21:32:54.486263543Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
2023-10-22T21:32:54.486267443Z: [ERROR] response, _, api_key = requestor.request(
2023-10-22T21:32:54.486271043Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 298, in request
2023-10-22T21:32:54.486274843Z: [ERROR] resp, got_stream = self._interpret_response(result, stream)
2023-10-22T21:32:54.486278743Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 700, in _interpret_response
2023-10-22T21:32:54.486282443Z: [ERROR] self._interpret_response_line(
2023-10-22T21:32:54.486286643Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 763, in _interpret_response_line
2023-10-22T21:32:54.486290843Z: [ERROR] raise self.handle_error_response(
2023-10-22T21:32:54.486308143Z: [ERROR] openai.error.InvalidRequestError: Unrecognized request argument supplied: functions
2023-10-22T21:32:54.494254276Z: [ERROR] [pid: 1|app: 0|req: 24/24] 169.254.131.1 () {84 vars in 1946 bytes} [Sun Oct 22 21:32:53 2023] POST /api/conversation/custom => generated 62 bytes in 511 msecs (HTTP/1.1 500) 2 headers in 90 bytes (1 switches on core 0)

Any idea how to fix that?

What version of Python is required for this project to run locally?

I am having issues when I enter a prompt and the app tries to call the conversation/custom API: I just get an HTTP 500 server error. What version of Python is required for this project to run locally?

documentation issue or request

Minimal steps to reproduce

I am just following the local dev/debug install and run in the readme.

Running this with VSCode and Windows 11

An error occurred. Please try again. If the problem persists, please contact the site administrator.

When asking questions in the deployed app, I get the error "An error occurred. Please try again. If the problem persists, please contact the site administrator."

Can't deploy ARM template

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Copy paste the ARM template into the editor and deploy, as the instructions in the README state

Any log messages given by the failure

Forms recognizer resource fails to deploy
Resource Type: Microsoft.CognitiveServices/accounts

{
  "status": "Failed",
  "error": {
    "code": "OperationError",
    "message": "Failed to Create the resource. Provisioning state: failed",
    "details": [
      {
        "code": "BadRequest",
        "message": "The Encryption field is required."
      }
    ]
  }
}

Expected/desired behavior

ARM template deploys with no errors

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful

Thanks! We'll be in touch soon.

Some answers might be coming from the internet

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I asked when Microsoft was founded and was told 1975, with a reference to the 10-K. Then I asked again and this time was told a specific date, April 4, 1975. But this date does not appear in any of our sample data docs.

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful

Thanks! We'll be in touch soon.

Allow developers to deploy from their local code base and auto setup the environment values

#126
Split bicep file into modules
create .env with real values from a deployment

Use managed identity, not keys

app.py currently uses keys, not managed identity. Best practice for enterprises is to use identity tokens. See my blog post here:

http://blog.pamelafox.org/2023/09/best-practices-for-openai-chat-apps-go.html

Unfortunately, the dataSources API doesn't yet support managed identity, but we're told the change is coming soon.

standalone question prompts need to truncate history

The standalone question prompt should not include unlimited chat history; otherwise it will eventually exceed the model's token limit.
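A minimal sketch of one possible truncation approach follows. It is not the repo's implementation: `truncate_history` and `approx_tokens` are hypothetical names, and the four-characters-per-token estimate is a rough assumption that a real tokenizer should replace. The most recent messages are kept until a token budget is reached.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def approx_tokens(text: str) -> int:
    """Crude token estimate (assumption: about 4 characters per token)."""
    return max(1, len(text) // 4)


def truncate_history(
    history: List[Message],
    max_tokens: int,
    count_tokens: Callable[[str], int] = approx_tokens,
) -> List[Message]:
    """Keep the newest messages whose combined token count fits in max_tokens."""
    kept: List[Message] = []
    used = 0
    # Walk from newest to oldest so recent context is preserved first.
    for message in reversed(history):
        cost = count_tokens(message["content"])
        if used + cost > max_tokens:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # Restore chronological order.
    return kept
```

Passing a tokenizer-backed `count_tokens` would make the budget accurate for a given model; the default here only keeps the sketch dependency-free.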

() Invalid expression: Could not find a property named 'content' on type 'search.document'. Parameter name: $select Code: Message: Invalid expression: Could not find a property named 'content' on type 'search.document'. Parameter name: $select

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful

Thanks! We'll be in touch soon.

Uploaded documents are not showing up

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Upload a document, go to Explore Data, and click the dropdown.

Any log messages given by the failure

Expected/desired behavior

The uploaded document should show up on the Explore Data page.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful

Thanks! We'll be in touch soon.

OpenAI Error

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I completed the deployment and got to the web app front end. When I entered a public website to ingest, I received an OpenAI API error.

Any log messages given by the failure

openai.error.AuthenticationError: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.

Expected/desired behavior

a successful data ingestion

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful

I created my models in Azure OpenAI Studio and copied the API key from there.
I entered it in the API field before deploying the template.
My question now is: if I need to check the API key or change to a different one, where do I find it?
Which resource actually holds the API key, and how do I modify it?
Thanks.

Thanks! We'll be in touch soon.

Web chat error for all user input --- Included the resolution in the details section.

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Open the web chat (https://-website.azurewebsites.net/)
Type any message and send.
Receive the following message:

Any log messages given by the failure

Debugging locally, I was able to gather the following error message:
"error": "No connection adapters were found for '-contentsafety/contentsafety/text:analyze?api-version=2023-04-30-preview'"'

Expected/desired behavior

A valid response from the server.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
Windows 11

Versions

Mention any other details that might be useful

I was able to debug and determine the problem. The AZURE_CONTENT_SAFETY_ENDPOINT within the bicep file (and ARM template) is set to ContentSafetyName.

It should be set to the following: 'https://${Location}.api.cognitive.microsoft.com/'
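The "No connection adapters were found" message is what the `requests` library raises when a URL has no `http://` or `https://` scheme, which matches an endpoint setting that holds a resource name instead of a URL. A small fail-fast check along these lines could catch that at startup; `require_https_endpoint` is a hypothetical helper, not part of this repo.

```python
from urllib.parse import urlparse


def require_https_endpoint(name: str, value: str) -> str:
    """Validate that a setting holds an absolute https URL and return it with a trailing slash.

    Hypothetical helper for illustration; `name` is only used in the error message.
    """
    cleaned = value.strip()
    parsed = urlparse(cleaned)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError(f"{name} must be an absolute https URL, got {value!r}")
    return cleaned.rstrip("/") + "/"
```

Called once when settings are loaded, for example with the value of AZURE_CONTENT_SAFETY_ENDPOINT, this would turn the runtime failure on every chat message into a single clear configuration error.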

Thanks! We'll be in touch soon.

Deployment fails with "The account type 'ContentSafety' is either invalid or unavailable in given region."

Please provide us with the following information:

This issue is for a: (mark with an `x`)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Start deploying a custom deployment with "Build your own template in the editor".
Copy the template from https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator/blob/main/infrastructure/deployment.json
Set the required parameters
Validate step fails with the following error:
{
  "code": "InvalidTemplateDeployment",
  "details": [
    {
      "code": "InvalidApiSetId",
      "message": "The account type 'ContentSafety' is either invalid or unavailable in given region."
    }
  ],
  "message": "The template deployment 'Microsoft.Template-20231019202539' is not valid according to the validation procedure. The tracking id is '51f356dd-67f6-47a5-af7e-586ae02a3cf1'. See inner errors for details."
}

Note that I tried multiple regions where Content Safety is available (Canada East, West Europe) and the error is the same.

Any log messages given by the failure

Expected/desired behavior

Deployment succeeds.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful

Thanks! We'll be in touch soon.
