
azure-samples / chat-with-your-data-solution-accelerator


A Solution Accelerator for the RAG pattern running in Azure, using Azure AI Search for retrieval and Azure OpenAI large language models to power ChatGPT-style and Q&A experiences. This includes most common requirements and best practices.

Home Page: https://azure.microsoft.com/products/search

License: MIT License

Python 73.13% HTML 0.04% TypeScript 5.73% CSS 1.56% Bicep 16.94% Makefile 0.29% Shell 0.86% Dockerfile 0.28% PowerShell 0.02% Jinja 1.15%
ai-search azd-templates azure azure-openai openai

chat-with-your-data-solution-accelerator's Issues

Branch Protection policy

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

At the moment, the main branch does not have any protection on it. Add a branch protection rule to prevent anyone from merging directly to main.

Responsible AI Guidelines

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Update with any Responsible AI style and comments: https://styleguides.azurewebsites.net/StyleGuide/Read?id=2984

Expected/desired behavior

Double check if we are following the Responsible AI style guidelines.

Secure Speech Service Key/Region

Motivation

Ensure sensitive credentials cannot be exposed. We are currently returning the Azure Speech Key from the /api/config endpoint in plain text

@app.route("/api/config", methods=["GET"])
def get_config():
    # Return the configuration data as JSON
    return jsonify(
        {
            "azureSpeechKey": env_helper.AZURE_SPEECH_KEY,
            "azureSpeechRegion": env_helper.AZURE_SPEECH_SERVICE_REGION,
        }
    )

How would you feel if this feature request was implemented?

secure

Requirements

  • Stop returning the value of the Azure Speech Key from the /api/config endpoint. Consider using SecretStr from pydantic https://docs.pydantic.dev/2.6/examples/secrets/ to return a masked version of the secret
  • Ensure that the speech service from the frontend is still functional (it calls the /api/config endpoint) #501
  • Stretch: Improve this endpoint to return the full config with all secrets masked
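
One possible shape for the stretch goal, as a dependency-free sketch. The `mask_config` helper, the `SECRET_MARKERS` tuple and the name-based rule are hypothetical and not part of the repository; the mask string imitates what pydantic's `SecretStr` prints rather than using that class. A masked key on its own would presumably not let the frontend authenticate, so the second requirement still needs a separate answer.

```python
from typing import Mapping

# Hypothetical rule: any setting whose name contains one of these is secret.
SECRET_MARKERS = ("KEY", "SECRET", "PASSWORD", "TOKEN")
MASK = "**********"


def mask_config(config: Mapping[str, str]) -> dict:
    """Return a copy of config with secret-looking values replaced by a mask."""
    masked = {}
    for name, value in config.items():
        is_secret = any(marker in name.upper() for marker in SECRET_MARKERS)
        # Keep empty values visible so a missing secret is still noticeable.
        masked[name] = MASK if (is_secret and value) else value
    return masked
```

The endpoint could then return `mask_config(...)` over whatever settings it exposes, instead of the raw values.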

Tasks

To be filled in by the engineer picking up the issue

Request for updating ReadMe.md file

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Getting an error when running docker compose up on WSL

Any log messages given by the failure

Failed to load /home/alias/chat-with-your-data-solution-accelerator/docker/...env: open /home/alias/chat-with-your-data-solution-accelerator/docker/...env: no such file or directory.

Expected/desired behavior

OS and Version?

Linux (Ubuntu)

Versions

Mention any other details that might be useful

The issue is the way the path to the .env file is set here: https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator/blob/main/docker/docker-compose.yml#L9. It uses a backslash (\), which only works on Windows. We need to modify docker-compose.yml to use a forward slash (/).

It would be great if this could be updated in the ReadMe.md file. It would also be great to add an autodetect feature that picks the type of slash based on the OS.
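
Forward slashes are generally accepted as path separators on Windows too, so a single `/`-separated path should remove the need for OS detection. As a small illustration of the conversion, assuming a hypothetical helper name and example paths that are not taken from the repository:

```python
from pathlib import PureWindowsPath


def to_posix_path(path: str) -> str:
    """Convert a Windows-style relative path to forward slashes."""
    # PureWindowsPath treats both "\" and "/" as separators, so this is
    # safe to call on paths that are already in POSIX form.
    return PureWindowsPath(path).as_posix()
```

The same function leaves an already forward-slashed path unchanged, which is why a one-time edit of the compose file is likely simpler than detecting the OS at run time.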


Thanks! We'll be in touch soon.

Make sure the streamed response isn't repeating tools unnecessarily

I believe the app.py in this repo is based off the sample-app app.py. In that case, I believe it has a performance bug due to this line:

https://github.com/Azure-Samples/azure-search-openai-solution-accelerator/blob/db254e0316d0c9031158bb70a0531159b4d111af/app.py#L139

If you check in the browser when it's streaming, you'll see that the "tools" gets repeated in every chunk streamed, which requires a lot of bandwidth and is unnecessary. You should only need to stream the tools once, and the frontend can process it when it sees it.
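
A minimal sketch of the idea, not the app's actual streaming code: the generator below attaches the tools payload to the first streamed line only. The function name and the `content` and `tools` field names are assumptions made for illustration.

```python
import json
from typing import Iterable, Iterator


def stream_chunks(tools: list, deltas: Iterable[str]) -> Iterator[str]:
    """Yield one JSON line per delta, attaching tools to the first line only."""
    first = True
    for delta in deltas:
        payload = {"content": delta}
        if first:
            # Send the (potentially large) tools payload exactly once.
            payload["tools"] = tools
            first = False
        yield json.dumps(payload) + "\n"
```

The frontend would then cache the tools from the first line and append later `content` values as they arrive.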

Issue when run locally

I get an error when I start chatting with the bot running locally. Please help me fix the error if you know the cause. Thank you!
image

Bug - Some questions are suddenly generating "list index out of range" error

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Ask a question on the chatbot

Any log messages given by the failure

image

Expected/desired behavior

An answer from my documents

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful



Update settings for faster local dev / frontend.

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

When developing locally, I want to be able to more quickly develop the frontend and backend. Help me start the app in dev mode so that changes will be hot reloaded without having to rebuild the frontend and restart every time.

Mention any other details that might be useful

Updating the vite server configuration to proxy to the flask app running locally will allow this.



issue when test chat : The API deployment for this resource does not exist. If you created..

Hi,

I was implementing the chat-with-your-data-solution-accelerator solution, but I get an error during the chat test.
description of the error:

/api/conversation/custom:1

Failed to load resource: the server responded with a status of 500 (INTERNAL SERVER ERROR)

{
"error": "415 Unsupported Media Type: Did not attempt to load JSON data because the request Content-Type was not 'application/json'."
}

Ensure we are using latest TLS version on Azure services

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Storage and App Service should specify a newer minimum TLS version (1.2).

Support for LangChain agent

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful



Create OpenAI resources as part of one click deploy

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

The OpenAI resource and associated deployments are created as part of the one click deployment

Action required: migrate or opt-out of migration to GitHub inside Microsoft

Migrate non-Open Source or non-External Collaboration repositories to GitHub inside Microsoft

In order to protect and secure Microsoft, private or internal repositories in GitHub for Open Source which are not related to open source projects or require collaboration with 3rd parties (customer, partners, etc.) must be migrated to GitHub inside Microsoft a.k.a GitHub Enterprise Cloud with Enterprise Managed User (GHEC EMU).

Action

✍️ Please RSVP to opt-in or opt-out of the migration to GitHub inside Microsoft.

❗Only users with admin permission in the repository are allowed to respond. Failure to provide a response will result in your repository getting automatically archived.🔒

Instructions

Reply with a comment on this issue containing one of the following optin or optout command options below.

✅ Opt-in to migrate

@gimsvc optin --date <target_migration_date in mm-dd-yyyy format>

Example: @gimsvc optin --date 03-15-2023

OR

❌ Opt-out of migration

@gimsvc optout --reason <staging|collaboration|delete|other>

Example: @gimsvc optout --reason staging

Options:

  • staging : This repository will ship as Open Source or go public
  • collaboration : Used for external or 3rd party collaboration with customers, partners, suppliers, etc.
  • delete : This repository will be deleted because it is no longer needed.
  • other : Other reasons not specified

Need more help? 🖐️

Improve developer experience

Improve the developer experience. As a developer I would like to clone the repository and be able to deploy this accelerator from my machine and run everything locally, with the minimum of ceremony, from a single command. (Happy to take ownership of this btw!)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

OpenAI embedding token limit - reprocess all documents

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Take a sample PDF file of about 10 MB (I used https://freetestdata.com/wp-content/uploads/2022/11/Free_Test_Data_10.5MB_PDF.pdf) and upload it:
image

This results in a success in the Azure Function batch push results (no token limit exceeded):
image

Then upload more than one copy of the file at the same time, so that combined they exceed the token limit. Let the ingestion run, and an error will be displayed in the Batch Push Results.
image

image

(The same error occurs if there are several documents in your Azure Blob Storage that were fine when uploaded and ingested individually but that together exceed your token limit. If you click "reprocess all documents" on the admin page, you will encounter the same error. In fact the same Azure Function is called, but with the parameter set to process all files in blob storage, not only those without embeddings.)

Any log messages given by the failure

Result: Failure Exception: RateLimitError: Requests to the Embeddings_Create Operation under Azure OpenAI API version 2023-07-01-preview have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 3 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.

Expected/desired behavior

What I would like the app to do: when it reaches the token limit (or estimates that the next file or chunk will exceed it), wait until the minute has elapsed and then continue. If that is not possible at chunk level, then at least at file level. That way it would not fail because of token-per-minute limits.
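
The requested behaviour could be sketched roughly as follows. This is an illustration under stated assumptions, not the accelerator's code: `RateLimitError` here is a local stand-in for whatever exception the client library raises, and `call_with_retry` with its parameters is a hypothetical helper.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Stand-in for the client library's rate-limit exception (hypothetical)."""

    def __init__(self, retry_after: float = 1.0):
        super().__init__(f"rate limited, retry after {retry_after}s")
        self.retry_after = retry_after


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, pausing and retrying whenever it raises RateLimitError."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except RateLimitError as err:
            if attempt == max_attempts:
                raise
            # Wait at least as long as the service asked, backing off further
            # on repeated failures.
            sleep(err.retry_after * attempt)
    raise RuntimeError("max_attempts must be at least 1")
```

Wrapping each file's (or each chunk's) embedding call in such a helper would let a batch pause and resume instead of failing outright. The `sleep` parameter exists only so the wait can be replaced in tests.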

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful



Error: `No module named 'utilities'`

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. In Azure Portal, open chatsa01-website-admin > Environment variables
  2. Change value for AZURE_OPENAI_RESOURCE from "https://charris-openai4.openai.azure.com/" to "charris-openai4"
  3. Refresh Explore Data page
  4. Receive errors below
  5. Change the value back
  6. Continue to receive errors
  7. Restart web app
  8. Continue to receive errors
  9. Navigate back to Admin page (no errors)
  10. Navigate to Explore Data page (no errors)

Any log messages given by the failure

ModuleNotFoundError: No module named 'utilities'
Traceback:
File "/usr/local/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
File "/usr/local/src/myscripts/admin/pages/01_Ingest_Data.py", line 14, in <module>
    from utilities.helpers.ConfigHelper import ConfigHelper

image

Expected/desired behavior

the config value can be corrected without breaking the app.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful



Naming convention for app components: Frontend, Backend, Admin

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

Suggestion for the naming conventions for the pieces of the app. Today the app is presented as these main pieces:

  1. Frontend (which includes the Flask app and the TS React UI)
  2. Backend (which includes the admin app/UI)
  3. Batch Processing Functions

For clarity's sake, I'd suggest we refer to the "backend" as the "admin" application, as I believe many will conceptualize the main app's Flask component as the "backend" and the React application or UI as the "frontend." It is just a naming convention, but would it make sense to organize it as:

  1. App
    a. Frontend (TS React)
    b. Backend (Flask Application)
  2. Admin App (Streamlit)
  3. Batch Processing Functions

Requesting Support of PowerPoint file types

Requesting support for the PowerPoint file type. Currently only the following file types are supported:

PDF
JPEG
JPG
PNG
TXT
HTML
MD (Markdown)
DOCX

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful



Citations for URLs should include full URL including domain

Motivation

Citations should be clickable to enable users to quickly see full references.

Minimal steps to reproduce

  1. Use the Admin site to ingest a URL (https://legislature.idaho.gov/statutesrules/idstat/Title25/T25CH28/SECT25-2801/)
  2. Ask a question that can be satisfied by the URL content (Who can make an order requiring dog owners to get a license?)
  3. Click the Citation button in the response
  4. Notice that the citation does not include the domain of the website

image

Requirements

  • Citations should have a full clickable link including domain
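
One way the requirement might be met, sketched under the assumption that the citation holds a path relative to the ingested page. The function and parameter names are hypothetical; only the standard library's `urljoin` and `urlparse` are used.

```python
from urllib.parse import urljoin, urlparse


def absolute_citation_url(source_url: str, citation_path: str) -> str:
    """Resolve a citation path against the URL the content was ingested from."""
    if urlparse(citation_path).netloc:
        # Already absolute: leave it untouched.
        return citation_path
    return urljoin(source_url, citation_path)
```

With the Idaho statute URL from the steps above as `source_url`, a citation path of `/statutesrules/idstat/Title25/T25CH28/SECT25-2801/` resolves back to the full address including the domain.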

Tasks

To be filled in by the engineer picking up the issue

  • Task 1
  • Task 2
  • ...

Cannot use embedding model deployment with a different name from the model itself

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Attempt to use an embedding deployment with any name other than the model name.

Example: Instead of naming the deployment "text-embedding-ada-002", name it "ada002" and set the following env variable:
AZURE_OPENAI_EMBEDDING_MODEL=ada002

Any log messages given by the failure

2023-07-14T21:57:03.449736723Z: [ERROR] openai.error.InvalidRequestError: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.

Expected/desired behavior

The deployment and model name do not have to be the same

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful



ARM Deploy or Bicep Deploy --- Issue

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Deploy the ARM manually using “Deploy a custom template”
  2. Copy and paste the deployment.json content into the editor.
  3. Update the 4 parameters (rg, prefix, AI key, AI name)
  4. Deploy
  5. All resources are successfully created.
  6. Navigate to the Admin web site.
  7. Select Explore Data.
  8. Error is returned
    image
  9. Navigate to the User web site.
  10. Type in a message.
  11. Error is returned.
    image
    image
    image

Any log messages given by the failure

Admin - Explore Data
Traceback (most recent call last):
  File "/usr/local/src/myscripts/pages/02_Explore_Data.py", line 36, in <module>
    search_client = vector_store_helper.get_vector_store().client
  File "/usr/local/src/myscripts/utilities/helpers/AzureSearchHelper.py", line 33, in get_vector_store
    vector_search_dimensions=len(llm_helper.get_embedding_model().embed_query("Text")),
  File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 536, in embed_query
    embedding = self._embedding_func(text, engine=self.deployment)
  File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 467, in _embedding_func
    return embed_with_retry(
  File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 107, in embed_with_retry
    return _embed_with_retry(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
    return self(f, *args, **kw)
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
    do = self.iter(retry_state=retry_state)
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.9/site-packages/tenacity/__init__.py", line 382, in __call__
    result = fn(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/langchain/embeddings/openai.py", line 104, in _embed_with_retry
    response = embeddings.client.create(**kwargs)
  File "/usr/local/lib/python3.9/site-packages/openai/api_resources/embedding.py", line 33, in create
    response = super().create(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
    response, _, api_key = requestor.request(
  File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 298, in request
    resp, got_stream = self._interpret_response(result, stream)
  File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 700, in _interpret_response
    self._interpret_response_line(
  File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 763, in _interpret_response_line
    raise self.handle_error_response(
openai.error.InvalidRequestError: The API deployment for this resource does not exist. If you created the deployment within the last 5 minutes, please wait a moment and try again.

User - Chat
Failed to load resource: the server responded with a status of 500 (INTERNAL SERVER ERROR)
/api/conversation/custom:1

Expected/desired behavior

Admin - Explore Data --- View the ingested data
User - Chat --- A response from the chat bot

OS and Version?

Windows 11

Versions

No published version

Mention any other details that might be useful

Based on the error messaging, the APIs for this solution were not deployed.

I experienced the same issue using the Bicep deployment with Azure CLI.



The embeddings operation does not work with the specified model, gpt-35-turbo

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Trying to use the .env file with these parameters:

AZURE_OPENAI_MODEL=gpt-35-turbo
AZURE_OPENAI_MODEL_NAME=gpt-35-turbo
AZURE_OPENAI_EMBEDDING_MODEL=35

Any log messages given by the failure

Error message: openai.error.InvalidRequestError: The embeddings operation does not work with the specified model, gpt-35-turbo. Please choose different model and try again. You can learn more about which models can be used with each operation here: https://go.microsoft.com/fwlink/?linkid=2197993.
stacktrace:

Traceback (most recent call last):
  File "c:\proj\chat-assistant-deployment-data\app.py", line 55, in <module>
    document_processor.process(source_url=file_sas, processors=processors)
  File "c:\proj\chat-assistant-deployment-data\utilities\helpers\DocumentProcessorHelper.py", line 22, in process
    vector_store = vector_store_helper.get_vector_store()
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\proj\chat-assistant-deployment-data\utilities\helpers\AzureSearchHelper.py", line 33, in get_vector_store
    vector_search_dimensions=len(llm_helper.get_embedding_model().embed_query("Text")),
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\cpalomar\AppData\Local\Programs\Python\Python311\Lib\site-packages\langchain\embeddings\openai.py", line 506, in embed_query
    return self.embed_documents([text])[0]
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Expected/desired behavior

It should work with gpt-35-turbo according to the documentation.

OS and Version?

Windows 11

Mention any other details that might be useful


Add Chat History

Motivation

Provide a chat history like the one in the AOAI bring your own data deployment. This allows users to leave the session, return later, and still have their previous chat.

How would you feel if this feature request was implemented?

slaps

Requirements

  • Design how to best persist the chat history
  • Chat history should be persisted for a user

To be decided:

  • How long does the chat need to be persisted for?
  • How do we link a chat history to a user?

Tasks

To be filled in by the engineer picking up the issue

  • Task 1
  • Task 2
  • ...

Response types have wrong mime-type

Response currently has "text/event-stream", which is the mime-type for SSE, but the response is not SSE: it's a streaming HTTP response using Transfer-Encoding: chunked.

I agree with the choice of response type (as I detail in my blog post @ http://blog.pamelafox.org/2023/08/fetching-json-over-streaming-http.html ), but the mime-type should be changed. I think it can simply be dropped entirely.

Right now, the browser shows an Event Source tab, thinking that SSE events are coming in, and then none come in. If you make the change, you should be able to see the response come in correctly with no Event Source tab in Developer Tools.

Azure click to deploy error - There was an error downloading the template...

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Navigate to https://github.com/Azure-Samples/azure-search-openai-solution-accelerator and select Deploy to Azure
    image
  2. Azure opens and shows the following error:
    image

Any log messages given by the failure

There was an error downloading the template from URI 'https://raw.githubusercontent.com/Azure-Samples/azure-search-openai-solution-accelerator/main/infrastructure/deployment.json?token=GHSAT0AAAAAAB47C325DQBSNOF2UZNHQE2CZGTZSTA'. Ensure that the template is publicly accessible and that the publisher has enabled CORS policy on the endpoint. To deploy this template, download the template manually and paste the contents in the 'Build your own template in the editor' option below.

Expected/desired behavior

Deploy a Template would load with the deployment.json content

OS and Version?

Windows 11

Versions

No version published.

Mention any other details that might be useful



Deploy to Azure Template Broken

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. From the README, select "Deploy to Azure"
  2. The "Form Recognizer Location" is empty (this used to be auto-populated with the Location variable)
  3. Enter the required fields and select "Next"
  4. Error: Validation failed. View error details.

Any log messages given by the failure

{"code":"InvalidTemplate","message":"Deployment template validation failed: 'The template resource 'StorageAccountName' at line '264' and column '27' is not valid: Unable to evaluate the template language function 'substring'. The index and length parameters must refer to a location within the string. The index parameter: '0', the length parameter: '24', the length of the string parameter: '22'. Please see https://aka.ms/arm-function-substring for usage details.. Please see https://aka.ms/arm-functions for usage details.'."}

Expected/desired behavior

Validation should be successful.
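
The validation failure is an out-of-range substring: a length of 24 was requested from a 22-character string. The Python sketch below only models that behaviour from the error text; `arm_substring` and `safe_prefix` are illustrative names, not ARM or repository functions. Clamping the requested length to the string length avoids the failure. In the template itself, ARM's `take` function may be a simpler fit, though that has not been checked against this template.

```python
def arm_substring(value: str, start: int, length: int) -> str:
    """Mimic the strict bounds check reported in the validation error."""
    if start < 0 or length < 0 or start + length > len(value):
        raise ValueError("index and length must refer to a location within the string")
    return value[start:start + length]


def safe_prefix(value: str, max_length: int) -> str:
    """Take at most max_length characters without over-reading."""
    return arm_substring(value, 0, min(max_length, len(value)))
```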

OS and Version?

Windows 11

Versions

This issue occurs using the most recent PR #76

Mention any other details that might be useful

image



Improvements to json.dumps(r).replace("\n", "\\n")

I discussed that line with the creator of the jsonlines package, an expert in NDJSON, and he indicated that newlines do not need to be escaped a second time (they already will be), and he recommended specifying ensure_ascii=False so that it's compatible with emoji and other unicode characters.
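
A small sketch of the suggested change, using only the standard `json` module. The function name is made up for illustration and the record shape is arbitrary.

```python
import json


def to_ndjson_line(record: dict) -> str:
    """Serialize one record as a single NDJSON line."""
    # json.dumps already escapes newlines inside strings, so no extra
    # .replace("\n", "\\n") is needed; ensure_ascii=False keeps emoji and
    # other non-ASCII characters as-is instead of \uXXXX escapes.
    return json.dumps(record, ensure_ascii=False) + "\n"
```

Each returned string contains exactly one real newline, the terminator, so a reader can split the stream on line breaks and parse each line independently.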

See this PR for how I changed it in another repo:
Azure-Samples/openai-chat-app-quickstart#22

And these tests for ensure_ascii:
https://github.com/Azure-Samples/azure-search-openai-demo/pull/532/files#diff-a67cb1853203a6f1956991a9d9881d231c4d43557f5baffd45cc672a87e41cc6R207

doc-processing deployment fails: System topic source cannot be modified. (Code: InvalidRequest)

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

When deploying using deployment.bicep

Any log messages given by the failure

{
"code": "InvalidRequest",
"message": "System topic source cannot be modified."
}
image

Expected/desired behavior

Successful deployment

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful



Addition of "Delete Data" Page in Admin WebApp

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

An option to add a delete data section within the Admin app to remove documents that have been indexed previously or by mistake during testing.



Use async framework, not Flask

Motivation

See my blog post here:
http://blog.pamelafox.org/2023/09/best-practices-for-openai-chat-apps.html

We ported the other sample to Quart, as it was a more 1:1 mapping from Flask, but the more popular async framework is FastAPI.

Such a change will enable users to handle more requests with fewer resources / lower SKUs.
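
To illustrate why an async framework can serve more requests on the same hardware, here is a framework-free sketch using only `asyncio`. It is neither Quart nor FastAPI code; `fake_llm_call` is a hypothetical stand-in for a slow, I/O-bound call to a model endpoint.

```python
import asyncio


async def fake_llm_call(question: str, delay: float = 0.05) -> str:
    """Stand-in for a slow, I/O-bound request to a model endpoint."""
    await asyncio.sleep(delay)  # the event loop serves other requests meanwhile
    return f"answer to {question}"


async def handle_many(questions: list) -> list:
    """Serve several requests concurrently on a single thread."""
    return await asyncio.gather(*(fake_llm_call(q) for q in questions))
```

Because each call spends its time waiting, the calls overlap instead of queueing behind one another, which is the effect the issue is after.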

How would you feel if this feature request was implemented?

efficient

Requirements

  • Switch to a more efficient application framework

Tasks

To be filled in by the engineer picking up the issue

  • Task 1
  • Task 2
  • ...

text embedding URL construction needs more validation

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. deploy using "Deploy to Azure" button
  2. Set azureOpenAIResource: https://charris-openai4.openai.azure.com/
  3. Set azureOpenAIEmbeddingModel: text-embedding-ada-002
  4. Visit the Ingest_Data page
  5. Paste in a URL and press the button to process

Receive the following error:

Error communicating with OpenAI: HTTPSConnectionPool(host='https', port=443): Max retries exceeded with url: //charris-openai4.openai.azure.com/.openai.azure.com//openai/deployments/text-embedding-ada-002/embeddings?api-version=2023-07-01-preview (Caused by NameResolutionError("<urllib3.connection.HTTPSConnection object at 0x73dd80a32740>: Failed to resolve 'https' ([Errno -2] Name or service not known)")

It looks like I provided the wrong value for azureOpenAIResource, but the instructions were not clear and there was nothing that caught the misconfiguration.
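
The kind of validation asked for here could look roughly like this. It is a sketch, not the repository's code: the function names are hypothetical, and the character rule for resource names is a simplifying assumption. It accepts either a bare resource name or a full endpoint URL and rejects anything else early.

```python
from urllib.parse import urlparse

AZURE_OPENAI_SUFFIX = ".openai.azure.com"


def resource_name(value: str) -> str:
    """Accept either a bare resource name or a full endpoint URL."""
    value = value.strip()
    # urlparse only fills in netloc when a scheme is present.
    host = urlparse(value).netloc if "://" in value else value.strip("/")
    if host.endswith(AZURE_OPENAI_SUFFIX):
        host = host[: -len(AZURE_OPENAI_SUFFIX)]
    # Assumed rule: letters, digits and hyphens only.
    if not host or not all(c.isalnum() or c == "-" for c in host):
        raise ValueError(f"not a valid Azure OpenAI resource name: {value!r}")
    return host


def endpoint_url(value: str) -> str:
    """Build the endpoint URL from whatever form the user supplied."""
    return f"https://{resource_name(value)}{AZURE_OPENAI_SUFFIX}/"
```

With this, both `https://charris-openai4.openai.azure.com/` and `charris-openai4` produce the same endpoint, and a malformed value fails at configuration time with a readable message instead of a name resolution error.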

Any log messages given by the failure

image

Expected/desired behavior

The page is processed without error.
If there is a configuration error, I would like to be able to correct it on the Configuration page.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful



setting ORCHESTRATION_STRATEGY to langchain did not seem to be adhered to

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

We are deploying this repo with the gpt-35-turbo model, so we assumed ORCHESTRATION_STRATEGY should be set to langchain. We set that in the ARM template, but after deploying we got errors in the frontend app and noticed in the config tab of the admin site that ORCHESTRATION_STRATEGY was set to the OpenAI Functions option.

Any log messages given by the failure

2023-10-23T13:55:28.708618922Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 700, in _interpret_response
2023-10-23T13:55:28.708646822Z: [ERROR] self._interpret_response_line(
2023-10-23T13:55:28.708655022Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 763, in _interpret_response_line
2023-10-23T13:55:28.708663323Z: [ERROR] raise self.handle_error_response(
2023-10-23T13:55:28.708671423Z: [ERROR] openai.error.InvalidRequestError: Unrecognized request argument supplied: functions

Expected/desired behavior

Manually changing ORCHESTRATION_STRATEGY in the config tab and hitting save fixed the issue, but maybe someone can check the code.
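One way the reported behavior could arise is a default configuration that hard-codes the orchestrator instead of reading the deployment setting. The sketch below shows a resolution order in which a saved configuration wins, then the ORCHESTRATION_STRATEGY environment variable, then a default. The strategy values, the `orchestrator` key, and the function name are assumptions for illustration and may not match the repository.

```python
import os

# Assumed identifiers for the two strategies discussed in this issue.
SUPPORTED_STRATEGIES = {"openai_function", "langchain"}
DEFAULT_STRATEGY = "openai_function"


def resolve_orchestration_strategy(saved_config=None, environ=None):
    """Pick the strategy: saved config first, then env var, then default."""
    env = os.environ if environ is None else environ
    if saved_config and saved_config.get("orchestrator"):
        raw = saved_config["orchestrator"]
    else:
        # Fall back to the value set at deployment time, if any.
        raw = env.get("ORCHESTRATION_STRATEGY", DEFAULT_STRATEGY)
    strategy = raw.strip().lower()
    if strategy not in SUPPORTED_STRATEGIES:
        raise ValueError(f"Unsupported orchestration strategy: {raw!r}")
    return strategy
```

Under these assumptions, a fresh deployment with ORCHESTRATION_STRATEGY set to langchain and no saved configuration would start with langchain, which is the behavior the reporter expected.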

openai.error.InvalidRequestError: Unrecognized request argument supplied: functions

Hello,
I deployed the template (and fixed the Content Safety endpoint in the App Service configuration) and I'm facing the following error:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Just initiate a conversation

Any log messages given by the failure

See below

2023-10-22T21:32:51.077875015Z: [ERROR] [pid: 1|app: 0|req: 18/18] 169.254.131.1 () {84 vars in 1953 bytes} [Sun Oct 22 21:32:51 2023] GET / => generated 0 bytes in 3 msecs (HTTP/1.1 304) 4 headers in 178 bytes (0 switches on core 0)
2023-10-22T21:32:51.180146360Z: [ERROR] [pid: 1|app: 0|req: 19/19] 169.254.131.1 () {84 vars in 2006 bytes} [Sun Oct 22 21:32:51 2023] GET /assets/index-0aaa15a4.js => generated 0 bytes in 3 msecs (HTTP/1.1 304) 4 headers in 188 bytes (0 switches on core 0)
2023-10-22T21:32:51.200854687Z: [ERROR] [pid: 1|app: 0|req: 20/20] 169.254.131.1 () {82 vars in 1961 bytes} [Sun Oct 22 21:32:51 2023] GET /assets/index-09fe54a5.css => generated 0 bytes in 1 msecs (HTTP/1.1 304) 4 headers in 187 bytes (0 switches on core 0)
2023-10-22T21:32:51.303081733Z: [ERROR] [pid: 1|app: 0|req: 21/21] 169.254.131.1 () {82 vars in 1996 bytes} [Sun Oct 22 21:32:51 2023] GET /assets/Azure-30d5e7c0.svg => generated 0 bytes in 1 msecs (HTTP/1.1 304) 4 headers in 187 bytes (0 switches on core 0)
2023-10-22T21:32:51.523325493Z: [ERROR] [pid: 1|app: 0|req: 22/22] 169.254.131.1 () {82 vars in 1940 bytes} [Sun Oct 22 21:32:51 2023] GET /favicon.ico => generated 0 bytes in 1 msecs (HTTP/1.1 304) 4 headers in 180 bytes (0 switches on core 0)
2023-10-22T21:32:53.600357142Z: [ERROR] [pid: 1|app: 0|req: 23/23] 169.254.131.1 () {82 vars in 1991 bytes} [Sun Oct 22 21:32:53 2023] GET /assets/Send-d0601aaa.svg => generated 0 bytes in 6 msecs (HTTP/1.1 304) 4 headers in 185 bytes (0 switches on core 0)
2023-10-22T21:32:54.052951162Z: [ERROR] Returning default config
2023-10-22T21:32:54.150731345Z: [ERROR] Returning default config
2023-10-22T21:32:54.152476530Z: [ERROR] New message id: 8e3dce5a-f483-4b34-99f6-6e737baab98c with tokens {'prompt': 0, 'completion': 0, 'total': 0}
2023-10-22T21:32:54.486134144Z: [ERROR] ERROR:root:Exception in /api/conversation/custom
2023-10-22T21:32:54.486178444Z: [ERROR] Traceback (most recent call last):
2023-10-22T21:32:54.486186044Z: [ERROR] File "app.py", line 271, in conversation_custom
2023-10-22T21:32:54.486190444Z: [ERROR] messages = message_orchestrator.handle_message(user_message=user_message, chat_history=chat_history, conversation_id=conversation_id, orchestrator=ConfigHelper.get_active_config_or_default().orchestrator)
2023-10-22T21:32:54.486224543Z: [ERROR] File "/usr/src/app/./backend/utilities/helpers/OrchestratorHelper.py", line 13, in handle_message
2023-10-22T21:32:54.486229443Z: [ERROR] return orchestrator.handle_message(user_message, chat_history, conversation_id)
2023-10-22T21:32:54.486233243Z: [ERROR] File "/usr/src/app/./backend/utilities/orchestrator/OrchestratorBase.py", line 33, in handle_message
2023-10-22T21:32:54.486237143Z: [ERROR] result = self.orchestrate(user_message, chat_history, **kwargs)
2023-10-22T21:32:54.486240743Z: [ERROR] File "/usr/src/app/./backend/utilities/orchestrator/OpenAIFunctions.py", line 78, in orchestrate
2023-10-22T21:32:54.486244743Z: [ERROR] result = llm_helper.get_chat_completion_with_functions(messages, self.functions, function_call="auto")
2023-10-22T21:32:54.486248543Z: [ERROR] File "/usr/src/app/./backend/utilities/helpers/LLMHelper.py", line 34, in get_chat_completion_with_functions
2023-10-22T21:32:54.486252443Z: [ERROR] return openai.ChatCompletion.create(
2023-10-22T21:32:54.486256043Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_resources/chat_completion.py", line 25, in create
2023-10-22T21:32:54.486259943Z: [ERROR] return super().create(*args, **kwargs)
2023-10-22T21:32:54.486263543Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_resources/abstract/engine_api_resource.py", line 153, in create
2023-10-22T21:32:54.486267443Z: [ERROR] response, _, api_key = requestor.request(
2023-10-22T21:32:54.486271043Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 298, in request
2023-10-22T21:32:54.486274843Z: [ERROR] resp, got_stream = self._interpret_response(result, stream)
2023-10-22T21:32:54.486278743Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 700, in _interpret_response
2023-10-22T21:32:54.486282443Z: [ERROR] self._interpret_response_line(
2023-10-22T21:32:54.486286643Z: [ERROR] File "/usr/local/lib/python3.9/site-packages/openai/api_requestor.py", line 763, in _interpret_response_line
2023-10-22T21:32:54.486290843Z: [ERROR] raise self.handle_error_response(
2023-10-22T21:32:54.486308143Z: [ERROR] openai.error.InvalidRequestError: Unrecognized request argument supplied: functions
2023-10-22T21:32:54.494254276Z: [ERROR] [pid: 1|app: 0|req: 24/24] 169.254.131.1 () {84 vars in 1946 bytes} [Sun Oct 22 21:32:53 2023] POST /api/conversation/custom => generated 62 bytes in 511 msecs (HTTP/1.1 500) 2 headers in 90 bytes (1 switches on core 0)

Any idea how to fix that?
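The traceback shows the service rejecting the `functions` argument, which commonly means the deployed model version or API version does not support function calling. Besides switching the orchestration strategy, a defensive option is to retry without functions. The sketch below illustrates that pattern with a generic callable standing in for the completion call; `create_completion` and `chat_with_function_fallback` are hypothetical names, and the retry returns a plain completion with no function calls.

```python
# Error text copied from the log above.
FUNCTIONS_UNSUPPORTED = "Unrecognized request argument supplied: functions"


def chat_with_function_fallback(create_completion, messages, functions):
    """Call create_completion with functions; retry without them if rejected.

    create_completion is any callable accepting keyword arguments
    (a stand-in for the real completion call, which is not shown here).
    """
    try:
        return create_completion(
            messages=messages, functions=functions, function_call="auto"
        )
    except Exception as exc:
        # Only swallow the specific "functions not supported" rejection.
        if FUNCTIONS_UNSUPPORTED not in str(exc):
            raise
        return create_completion(messages=messages)
```

Matching on the error text is brittle, so this is best treated as a stopgap until the deployment and the orchestration strategy agree.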

What version of Python is required for this project to run locally?

I am having issues when I try to enter a prompt and the app calls the conversation/custom API: I just get an HTTP 500 server error. What version of Python is required for this project to run locally?

  • documentation issue or request

Minimal steps to reproduce

I am just following the local dev/debug install and run steps in the README.

image

Running this with VS Code on Windows 11.

Can't deploy ARM template

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Copy and paste the ARM template into the editor and deploy, as the instructions in the README state.

Any log messages given by the failure

The Form Recognizer resource fails to deploy.
Resource Type: Microsoft.CognitiveServices/accounts

{
  "status": "Failed",
  "error": {
    "code": "OperationError",
    "message": "Failed to Create the resource. Provisioning state: failed",
    "details": [
      {
        "code": "BadRequest",
        "message": "The Encryption field is required."
      }
    ]
  }
}

Expected/desired behavior

ARM template deploys with no errors

Some answers might be coming from the internet

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I asked when Microsoft was founded and was told 1975, with a reference to the 10-K. Then I asked again, and this time I was told a specific date: April 4, 1975. But this date does not appear in any of our sample data docs.

Any log messages given by the failure

image

Expected/desired behavior

() Invalid expression: Could not find a property named 'content' on type 'search.document'. Parameter name: $select Code: Message: Invalid expression: Could not find a property named 'content' on type 'search.document'. Parameter name: $select

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Uploaded documents are not showing up

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Upload a document, go to the Explore Data page, and click on the dropdown.

Any log messages given by the failure

Expected/desired behavior

The uploaded document should show up on the Explore Data page.

OpenAI Error

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I completed the deployment and got to the web app front end. When I entered a public website to ingest, I received an OpenAI API error.

Any log messages given by the failure

openai.error.AuthenticationError: Access denied due to invalid subscription key or wrong API endpoint. Make sure to provide a valid key for an active subscription and use a correct regional API endpoint for your resource.

Expected/desired behavior

a successful data ingestion

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

Mention any other details that might be useful

I created my models in Azure OpenAI Studio and copied the API key from there.
I provided it in the API key field before deploying the template.
If I now need to check the key or change to a different one, where do I find it?
Which resource actually holds the API key, and how do I modify it?
Thanks.


Thanks! We'll be in touch soon.

Web chat error for all user input --- Included the resolution in the details section.

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Open the web chat (https://-website.azurewebsites.net/)
  2. Type any message and send.
  3. Receive the following message:
    image

Any log messages given by the failure

Debugging locally, I was able to gather the following error message:
"error": "No connection adapters were found for '-contentsafety/contentsafety/text:analyze?api-version=2023-04-30-preview'"'

Expected/desired behavior

A valid response from the server.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
Windows 11

Versions

Mention any other details that might be useful

I was able to debug and determine the problem. The AZURE_CONTENT_SAFETY_ENDPOINT within the bicep file (and ARM template) is set to ContentSafetyName.
image
It should be set to the following: 'https://${Location}.api.cognitive.microsoft.com/'
image
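The "No connection adapters were found" message is what the `requests` library raises when a URL has no `http://` or `https://` scheme, which is consistent with the endpoint being set to a bare resource name. A minimal sketch of a startup check follows; the helper names are hypothetical, and the endpoint template is the one proposed in this issue.

```python
from urllib.parse import urlparse


def content_safety_endpoint(location: str) -> str:
    """Build the regional endpoint using the template proposed above."""
    return f"https://{location}.api.cognitive.microsoft.com/"


def require_http_url(setting_name: str, value: str) -> str:
    """Fail fast if a configured endpoint is not an absolute http(s) URL."""
    parsed = urlparse(value)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(
            f"{setting_name} must be an absolute http(s) URL, got {value!r}"
        )
    return value
```

Running such a check on AZURE_CONTENT_SAFETY_ENDPOINT when the app starts would surface the misconfiguration with a clear message instead of a failure on every chat request.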


Thanks! We'll be in touch soon.

Deployment fails with "The account type 'ContentSafety' is either invalid or unavailable in given region."

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Start deploying a custom deployment with "Build your own template in the editor".
  2. Copy the template from https://github.com/Azure-Samples/chat-with-your-data-solution-accelerator/blob/main/infrastructure/deployment.json
  3. Set the required parameters
  4. Validate step fails with the following error:
    {
      "code": "InvalidTemplateDeployment",
      "details": [
        {
          "code": "InvalidApiSetId",
          "message": "The account type 'ContentSafety' is either invalid or unavailable in given region."
        }
      ],
      "message": "The template deployment 'Microsoft.Template-20231019202539' is not valid according to the validation procedure. The tracking id is '51f356dd-67f6-47a5-af7e-586ae02a3cf1'. See inner errors for details."
    }

Note that I tried multiple regions where Content Safety is available (Canada East, West Europe) and the error is the same.

Any log messages given by the failure

Expected/desired behavior

Deployment succeeds.
