
azure-search-power-skills's Introduction


Important

As of November 15, 2023, Azure Cognitive Search has been renamed to Azure AI Search.

Azure AI Search Power Skills

Power Skills are a collection of useful functions to be deployed as custom skills for Azure AI Search. The skills can be used as templates or starting points for your own custom skills, or they can be deployed and used as they are if they happen to meet your requirements. We also invite you to contribute your own work by submitting a pull request.

Skills

This project provides the following custom skills:

Skill | Description | Type | Language | Environment | Deployment
GeoPointFromName | Retrieves coordinates from place names and addresses. | Geography | C# | functions | ARM Template
AcronymLinker | Provides definitions for known acronyms. | Text | C# | functions | ARM Template
Anonymizer | Uses Presidio to analyze and anonymize PII entities. | Text | python | docker | Manual
BingEntitySearch | Finds rich and structured information about public figures, locations, or organizations. | Text | C# | functions | ARM Template
CustomEntityLookup | Finds custom entity names in text. This is a custom skill implementation of the custom entity lookup skill; consider using the cognitive skill instead of this custom skill implementation. | Text | C# | functions | ARM Template
CustomNER | Extracts your custom entities, using Natural Language Processing with Text Analytics Custom NER. | Text | python | functions | ARM Template
CustomTextClassifier | Extracts your custom text classification, using Natural Language Processing with Text Analytics Custom Text Classification. | Text | python | functions | ARM Template
Distinct | De-duplicates a list of terms. | Text | C# | functions | ARM Template
Summarizer | Uses a HuggingFace/Facebook BART model (BART-Large-CNN) to summarize text. | Text | python | docker | Manual
TextAnalyticsForHealth | A wrapper for the Text Analytics for Health API. | Text | C# | functions | ARM Template
TextQualityWatchdog | Uses a pretrained language model to detect low-quality text extracted during document cracking. | Text | python | functions | Manual
Tokenizer | Extracts non-stop words from a text. | Text | C# | functions |
AbbyyOCR | Uses ABBYY Cloud OCR to extract text from images. | Vision | C# | functions | ARM Template
FormRecognizer | Uses Form Recognizer to analyze a document. Supports the following model types: Layout, Invoice, Receipt, ID, Business Card, General key value pairs, Custom Form. | Vision | python | functions | Manual
AutoMLVisionClassifier | Gets your latest Data Labelling AML AutoML Vision model and runs inference on it. | Vision | python | docker | Manual
CustomVision | Classifies documents using Custom Vision models. | Vision | C# | functions | ARM Template
HocrGenerator | Transforms the result of OCR into the hOCR format. | Vision | C# | functions | ARM Template
ImageClustering | Uses clustering to automatically group and label images. | Vision | python | docker | Manual
ImageSegmentation | Breaks down a full image or PDF page into subimages and uploads them to Azure Blob Storage. | Vision | python | functions | Manual
ImageSimilarity | Uses ResNet to find the top-n most similar images. | Vision | python | docker | Manual
P&ID Parser | Extracts equipment tags and text blocks from piping and instrumentation diagrams. | Vision | python | docker | Manual
DecryptBlobFile | Downloads, decrypts, and returns a file that was previously encrypted and stored in Azure Blob Storage. | Utility | C# | functions | ARM Template
GetFileExtension | Returns the filename and extension as separate values, allowing you to filter on document type. | Utility | C# | functions | ARM Template
ImageStore | Stores and fetches base64-encoded images to and from blob storage. The knowledge store is a cleaner implementation of the pattern to save images to storage. | Utility | C# | functions | ARM Template
Embeddings | Generates vector embeddings with the HuggingFace all-MiniLM-L6-v2 model. | Vector | python | functions | Manual
HelloWorld | A minimal skill that can be used as a starting point or template for your own skills. | Template | C# | functions | ARM Template
PythonFastAPI | A production web server and API scaffold for a python power skill. | Template | python | docker | Terraform template

Getting Started

Prerequisites

In order to use the functions in this project, you'll need an active Azure subscription. Most of the functions can be used on their own for quick evaluation and experimentation, but they are meant to be used as part of an Azure AI Search pipeline. Each function may also add its own specific requirements, such as API keys for services they leverage.

Visual Studio 2019 is recommended, but not required. You need a recent version of the C# compiler. Postman is highly recommended as a way to experiment and test skills.

Installation and deployment

If using Visual Studio with the Azure workload installed, no installation is required, and the functions can just be run locally using F5.

Deployment of a function to Azure can be done through Visual Studio, the Deploy to Azure button, or continuous deployment.

Some functions may require setting environment variables or configuration entries. Please refer to the readme file in the function's directory.

Quickstart

  1. Clone the repository
  2. Open the PowerSkills solution in Visual Studio
  3. Set the project for the function to test as the startup project
  4. Hit F5
  5. Experiment with calling the function using Postman

You can also create your own skills using our Hello World template skill as a starting point or, if you are using python, our FastAPI template skill.

Up for grabs

Here are a few suggestions of simple contributions to get you started:

Resources

azure-search-power-skills's People

Contributors

becayesoft, bleroy, careyjmac, csiebler, danglund, dependabot[bot], dereklegenzoff, gmndrg, graemefoster, hyoshioka0128, ignaciofls, jadrefke, jennifermarsman, jonathanserbent, liamca, microsoftopensource, msftgits, mtrilbybassett, neharanadee, rob-derosa, ross-p-smith, ruoccofabrizio, shanepeckham, shiranr, ssflynn77, stuartleeks, supernova-eng, toothache, vkurpad, willchen789


azure-search-power-skills's Issues

How to store 'Chunks' or Collection of String field?

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Upload a vector for a Collection of String field (representing document chunks).

Any log messages given by the failure

HttpResponseError: () The request is invalid. Details: An unexpected 'StartArray' node was found when reading from the JSON reader. A 'PrimitiveValue' node was expected.
Code:
Message: The request is invalid. Details: An unexpected 'StartArray' node was found when reading from the JSON reader. A 'PrimitiveValue' node was expected.

Expected/desired behavior

Mention any other details that might be useful

How can we embed and store vectors for a Collection of String field? My use case is 'chunking' a document into pieces of ~1000 tokens to stay within the token limits for OpenAI.

I was able to vectorize the filename, which works great! But the chunked collection of text is not working.
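
The error is consistent with a field that expects a flat list of numbers receiving a list of lists (one vector per chunk). One possible workaround, sketched here under assumptions, is to index one document per chunk so that each document carries a single flat vector. The field names (`parent_id`, `chunk`, `contentVector`) and the `embed` callable are hypothetical placeholders, not names taken from this repository.

```python
from typing import Callable, Dict, List


def chunk_documents(
    parent_id: str,
    chunks: List[str],
    embed: Callable[[str], List[float]],
) -> List[Dict[str, object]]:
    """Turn one chunked document into one index document per chunk."""
    docs = []
    for i, chunk in enumerate(chunks):
        docs.append(
            {
                # Hypothetical key scheme: parent id plus chunk position.
                "id": f"{parent_id}_{i}",
                "parent_id": parent_id,
                "chunk": chunk,
                # A single flat list of floats, never a list of lists.
                "contentVector": embed(chunk),
            }
        )
    return docs
```

With this shape, the original document can still be recovered at query time by filtering or grouping on `parent_id`.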


Thanks! We'll be in touch soon.

Text Summarisation PowerSkill is missing the Dockerfile

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

The Readme file https://github.com/Azure-Samples/azure-search-power-skills/blob/main/Text/TextSummarization/README.md
talks about building a Dockerfile and pushing it to your ACR. There is no Dockerfile in the repo, and the link to it returns a 404: https://github.com/Azure-Samples/azure-search-power-skills/blob/main/Text/TextSummarization/Dockerfile


Broken Link to a Sample

Hi, the link to a sample provided at the end of this document is broken:

Vector/EmbeddingGenerator/README.md


Text Summarisation downloads model each time

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ x] feature request
- [ x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

The original readme was hard to follow and error prone. Every time you ran the application, it would download the 1.6 GB HuggingFace model.

Any log messages given by the failure

Expected/desired behavior

Easy to follow, works in a dev container, installs all of the prerequisites, and deploys everything with a single command.

The Hugging Face model is cached.

OS and Version?

DevContainer - works on any Host OS


Add support for SAS tokens in HocrGenerator

I was also working on adding some skills to this repository, but saw that the HocrGenerator was already added. 👍

This issue is for a:

  • bug report -> please search issues before submitting
  • feature request
  • documentation issue or request
  • regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Currently the HocrGenerator adds the Uri without SAS token or other authentication options. I think it would be good to have the option to work with Hocr using a private blob container. It is possible to retrieve the tokens client side and add them to the Hocr output, however this requires a regex due to the structure.

Also discussed in microsoft/AzureSearch_JFK_Files#47.

Expected/desired behavior

  • Adding a clarification to the ReadMe about public permissions vs private permissions.
  • Add the possibility to output (indefinite) SAS tokens in the HocrPage output.

Sideways/slanted text highlighting

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

N/A

Any log messages given by the failure

N/A

Expected/desired behavior

When I search for a specific text, usually the results come with that text highlighted on the specific document. I understand this is achieved with hOCR, but it only highlights text that is horizontal. How can I have highlight enabled (or is it possible) if there is slanted text or sideways text? I can still search for that text, and it shows in the transcript, but there is no highlight on the document itself. Is that possible?

OS and Version?

any


Can we get an option to process multiple images in a single page?

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)


Web Api response contains data properties for a record with invalid id

error

Enrichment.WebApiSkill.#1
Could not execute skill because Web Api skill response is invalid.
Web Api response contains data properties for a record with invalid id 'r1'

from Skillset

...
{
  "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
  "name": "#1",
  "description": "",
  "context": "/document",
  "uri": "https://f10-function.azurewebsites.net/api/f10_http_trigger?code=[REDACTED]",
  "httpMethod": "POST",
  "timeout": "PT30S",
  "batchSize": 1,
  "degreeOfParallelism": 1,
  "inputs": [
    {
      "name": "Categoria",
      "source": "/document/Categoria"
    }
  ],
  "outputs": [
    {
      "name": "comes_from_Skillset",
      "targetName": "comes_from_Skillset"
    }
  ],
  "httpHeaders": {}
}
...

from Indexer

...
"outputFieldMappings": [
  {
    "sourceFieldName": "/document/comes_from_Skillset",
    "targetFieldName": "URL"
  }
],
...
//////////////////
Hello,

It looks like my python function (Custom.WebApiSkill) returns an invalid id.
Is there a python "hello world" function to start from for this use case?
How can this case be troubleshot?

Ty
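
One likely cause, assuming the function builds its own record ids, is that the response carries a `recordId` (here 'r1') that was not present in the request. A custom Web API skill response is expected to return, for each input record, the same `recordId` it received. The sketch below shows that shape in plain Python with no Azure Functions dependency; the function name `run_skill` and the greeting text are illustrative only, while the input and output names are taken from the skillset above.

```python
import json


def run_skill(request_body: str) -> str:
    """Build a custom Web API skill response from a raw JSON request body."""
    request = json.loads(request_body)
    results = []
    for record in request.get("values", []):
        data = record.get("data") or {}
        results.append(
            {
                # Echo the recordId exactly as received; do not invent a new one.
                "recordId": record["recordId"],
                "data": {"comes_from_Skillset": f"Hello, {data.get('Categoria', '')}"},
                "errors": None,
                "warnings": None,
            }
        )
    return json.dumps({"values": results})
```

Inside an HTTP-triggered function, the returned string would be sent back as the response body with an `application/json` content type.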

Broken link to Langchain text_splitter reference.

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

./Vector/EmbeddingGenerator/README.md has a link to LangChain's text splitter. It takes you to a 404 error page

Any log messages given by the failure

404 Not Found

Expected/desired behavior

Link should take you to https://api.python.langchain.com/en/latest/api_reference.html#module-langchain.text_splitter

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)

Versions

ALL

Could not execute skill because Web Api skill response is invalid.

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Deploy CNER function via template from https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Text/CustomNER. Test Code inside Azure Portal

Any log messages given by the failure

Could not execute skill because Web Api skill response is invalid.
Web Api response has invalid content type 'text/html'. Expected 'application/json'

Expected/desired behavior

No error message, successful processing of documents.
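
A 'text/html' response often means the function host returned an error page rather than the skill's JSON output, for example because of a wrong URL, a missing key, or a startup failure (these causes are guesses, not confirmed for this issue). A small checker like the following can be run against a captured response to narrow things down. The function name and the wording of the messages are hypothetical.

```python
import json


def check_skill_response(content_type: str, body: str, expected_ids: set) -> list:
    """Return a list of problems found in a custom skill HTTP response."""
    problems = []
    if not content_type.lower().startswith("application/json"):
        problems.append(f"content type is {content_type!r}, expected 'application/json'")
        return problems  # An HTML error page cannot be parsed as a skill response.
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        problems.append(f"body is not valid JSON: {exc}")
        return problems
    values = payload.get("values", []) if isinstance(payload, dict) else []
    returned_ids = {str(v.get("recordId")) for v in values}
    # Ids in the response must match the ids that were sent in the request.
    for unknown in sorted(returned_ids - expected_ids):
        problems.append(f"response contains unknown recordId {unknown!r}")
    for missing in sorted(expected_ids - returned_ids):
        problems.append(f"response is missing recordId {missing!r}")
    return problems
```

An empty list means the response passed these basic checks; it does not guarantee that the indexer will accept the data.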


Azure Search indexer status shows an error after adding a custom skill: Could not execute skill. WebApi skill response contains errors.

"errors": [
  {
    "key": "https://*****.core.windows.net/container1/Azure%20Search%20Pricing%20Table.xlsx",
    "statusCode": 400,
    "name": "Enrichment.WebApiSkill.technology",
    "errorMessage": "Could not execute skill. WebApi skill response contains errors.",
    "details": "WebApi response contains both data and errors. Will not process Data.;Cannot process record without the given key 'text' with a string value",
    "documentationLink": "https://go.microsoft.com/fwlink/?linkid=2100103"
  }

Index fields are as follows:
"fields": [
  {
    "name": "id",
    "type": "Edm.String",
    "searchable": true,
    "filterable": true,
    "retrievable": true,
    "sortable": true,
    "facetable": false,
    "key": true,
    "indexAnalyzer": null,
    "searchAnalyzer": null,
    "analyzer": null,
    "synonymMaps": []
  },
  {
    "name": "technology",
    "type": "Collection(Edm.String)",
    "searchable": true,
    "filterable": true,
    "retrievable": true,
    "sortable": false,
    "facetable": true,
    "key": false,
    "indexAnalyzer": null,
    "searchAnalyzer": null,
    "analyzer": "en.microsoft",
    "synonymMaps": []
  }
],

Skillset is as follows:
"skills":
[

{
  "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
  "textSplitMode": "pages",
  "maximumPageLength": 4000,
  "defaultLanguageCode": "en",
  "context": "/document",
  "inputs": [
    {
      "name": "text",
      "source": "/document/merged_text"
    }
  ],
  "outputs": [
    {
      "name": "textItems",
      "targetName": "pages"
    }
  ]
},
{
        "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
        "name": "technology",
        "description": "Technology Extraction Skill",
        "context": "/document",
        "uri": "https://******-azuresearch.azurewebsites.net/api/*************************************************",
        "httpMethod": "POST",
        "timeout": "PT90S",
        "batchSize": 1,
        "inputs": [
            {
                "name": "text",
                "source": "/document/merged_content",
                "inputs": []
            }
        ],
        "outputs": [
            {
                "name": "technology",
                "targetName": "technology"
            }
        ]
    }

]

Indexer field mappings are as follows:
"fieldMappings": [
  {
    "sourceFieldName": "metadata_storage_path",
    "targetFieldName": "id",
    "mappingFunction": {
      "name": "base64Encode",
      "parameters": null
    }
  }
],
"outputFieldMappings": [
  {
    "sourceFieldName": "/document/technology",
    "targetFieldName": "technology",
    "mappingFunction": null
  }
]
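
In the skillset above, the SplitSkill reads `/document/merged_text` while the WebApiSkill reads `/document/merged_content`. If one of those paths does not exist in the enrichment tree, the 'text' input would arrive missing, which would be consistent with the error shown (this is a guess from the posted configuration, not a confirmed diagnosis). Independently of the cause, the function can handle a missing input gracefully by returning a warning instead of an error. The sketch below assumes a hypothetical `extract_technology` helper standing in for the real extraction logic.

```python
from typing import Any, Dict, List


def extract_technology(text: str) -> List[str]:
    """Placeholder extractor (hypothetical): matches a few hard-coded terms."""
    known = ["Azure", "Python", "Docker"]
    return [term for term in known if term.lower() in text.lower()]


def process_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Process one skill record, tolerating a missing or empty 'text' input."""
    text = (record.get("data") or {}).get("text")
    if not isinstance(text, str) or not text.strip():
        # Return empty data plus a warning, so data and errors are never mixed.
        return {
            "recordId": record["recordId"],
            "data": {"technology": []},
            "errors": None,
            "warnings": [{"message": "Input 'text' was missing or empty."}],
        }
    return {
        "recordId": record["recordId"],
        "data": {"technology": extract_technology(text)},
        "errors": None,
        "warnings": None,
    }
```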

Execution failed

Please provide us with the following information:

This issue is for a: (mark with an x)

- [O] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Reset indexer & Execute

Any log messages given by the failure

This session was created from the following error:
Operation: Enrichment.WebApiSkill.#8
Message: Could not execute skill because Web Api skill response is invalid.
Details: Web Api skill response contains errors: 'Web Api response contains both data and errors. Will not process Data.;custom-vision - Error processing the request record : System.ArgumentNullException: Value cannot be null. (Parameter 'source')
at System.Linq.ThrowHelper.ThrowArgumentNullException(ExceptionArgument argument)
at System.Linq.Enumerable.Where[TSource](IEnumerable1 source, Func2 predicate)
at CustomVision.CustomVision.<>c__DisplayClass4_0.<b__0>d.MoveNext() in C:\Users\nikeda\Projects\QAW\DS\Azure\azure-search-power-skills\Vision\CustomVision\CustomVision.cs:line 59
--- End of stack trace from previous location where exception was thrown ---
at AzureCognitiveSearch.PowerSkills.Common.WebApiSkillHelpers.ProcessRequestRecordsAsync(String functionName, IEnumerable1 requestRecords, Func3 processRecord) in C:\Users\nikeda\Projects\QAW\DS\Azure\azure-search-power-skills\Common\WebAPISkillHelper.cs:line 67'.
Document key: localId=https%3a%2f%2fkmdocstore.blob.core.windows.net%2fdocuments%2f01_%25E6%2595%2599%25E8%2582%25B2%25E9%2596%25A2%25E9%2580%25A3%2f00_DX%25E9%2596%25A2%25E9%2580%25A3%2f%25E3%2580%2590%25E9%2585%258D%25E5%25B8%2583%25E7%2594%25A8%25E3%2580%2591DX%25E3%2581%25AE%25E6%259C%2580%25E6%2596%25B0%25E5%258B%2595%25E5%2590%2591%25E3%2581%25A8%25E3%2583%2587%25E3%2582%25B8%25E3%2582%25BF%25E3%2583%25AB%25E4%25BA%25BA%25E6%259D%2590%25E8%2582%25B2%25E6%2588%2590%25E3%2581%25AE%25E3%2583%259D%25E3%2582%25A4%25E3%2583%25B3%25E3%2583%2588ver1.031.pptx&documentKey=https%3a%2f%2fkmdocstore.blob.core.windows.net%2fdocuments%2f01_%25E6%2595%2599%25E8%2582%25B2%25E9%2596%25A2%25E9%2580%25A3%2f00_DX%25E9%2596%25A2%25E9%2580%25A3%2f%25E3%2580%2590%25E9%2585%258D%25E5%25B8%2583%25E7%2594%25A8%25E3%2580%2591DX%25E3%2581%25AE%25E6%259C%2580%25E6%2596%25B0%25E5%258B%2595%25E5%2590%2591%25E3%2581%25A8%25E3%2583%2587%25E3%2582%25B8%25E3%2582%25BF%25E3%2583%25AB%25E4%25BA%25BA%25E6%259D%2590%25E8%2582%25B2%25E6%2588%2590%25E3%2581%25AE%25E3%2583%259D%25E3%2582%25A4%25E3%2583%25B3%25E3%2583%2588ver1.031.pptx


Custom Entity Search - Deploy to Azure not working

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X ] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

From link: https://github.com/Azure-Samples/azure-search-power-skills/tree/master/Text/CustomEntitySearch

The Deploy to Azure button doesn't work correctly.

Any log messages given by the failure

image

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
Windows 10, Chrome


Vision HOCR Deploy Button Deploys wrong Project

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Click deploy to Azure button on readme page

Any log messages given by the failure

Expected/desired behavior

Vision\HocrGenerator\HocrGenerator.csproj to be deployed. Instead "Vision\ImageStore\ImageStore.csproj" is deployed

Detection skill that returns an image count and properties for cost estimation

Costing a workload is hard in Cognitive Search because customers don't know what's inside PDFs or blobs. Since image analysis is so expensive, it would be great to run an initial skillset that returns information about the data itself: image count, image type,

For text, I don't know if this is even possible, but something to indicate what level of granularity would work best for splitting, or suggestions of skills that might add value.

Open AI Embedding Skill Timeout Issue

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [X] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I implemented this sample skill for Open AI Embeddings (https://github.com/Azure-Samples/azure-search-power-skills/blob/main/Vector/EmbeddingGenerator/README.md).
I am running into a timeout with the skill. I increased the timeout in host.json and set "AzureFunctionsJobHost__functionTimeout", but I still hit the 30-second timeout.

Any idea on how to resolve this?

Any log messages given by the failure

Could not execute skill because it did not execute within the time limit '00:00:30'. This is likely transient. Please try again later. For custom skills, consider increasing the 'timeout' parameter on your skill in the skillset.

Expected/desired behavior

Embeddings and chunks will be updated after running the custom skillset.
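
The error text itself points at the 'timeout' parameter on the skill in the skillset, which is a separate setting from the function host timeout in host.json. The value is an ISO 8601 duration such as "PT30S", as seen in the skill definitions elsewhere on this page. The helper below is a minimal sketch for updating that value in a skill definition held as a Python dict; the function name is hypothetical, and the service may cap the allowed value, so check the Web API skill reference before choosing a number.

```python
import copy


def with_timeout(skill: dict, seconds: int) -> dict:
    """Return a copy of a skill definition with its 'timeout' set to PT<seconds>S."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    updated = copy.deepcopy(skill)  # Leave the caller's definition untouched.
    updated["timeout"] = f"PT{seconds}S"
    return updated
```

The updated definition still has to be saved back to the skillset (and the indexer rerun) for the new timeout to take effect.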


Query Regarding hOCR Generation Using Azure Vision JSON Format

I'm attempting to create the hOCR file by following the instructions outlined on this page (https://learn.microsoft.com/en-us/samples/azure-samples/azure-search-power-skills/azure-hocr-generator-sample/). After deployment, I conducted a successful test using a sample input provided on the same page. However, when I attempted to use a different JSON file that I generated using the Computer Vision API (https://eastus.api.cognitive.microsoft.com/computervision/imageanalysis:analyze?features=read&language=en&gender-neutral-caption=false&api-version=2023-04-01-preview), the hOCR generation process failed due to a disparity in the JSON formats. Does this imply that I always need to convert the JSON output from Azure Vision to match the format expected by the hOCR generator? Alternatively, is there a more straightforward method to directly input the Azure Vision JSON into the hOCR generator?

Azure Cog Search Skillset Error or bad doc?

Get this error when adding the Azure Function from this repo (https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Vector/EmbeddingGenerator) to the Azure Cognitive Search Service as a skillset, as noted in the documentation.

ERROR: The request is invalid. Details: Incompatible type kinds were found. The type 'Microsoft.Skills.Custom.WebApiSkill' was found to be of kind 'Complex' instead of the expected kind 'Entity'.

This issue is for a: (mark with an x)

- [X ] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Deploy the function as noted here: https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Vector/EmbeddingGenerator (no errors; I am able to run a Code + Test run in the Portal UI after it enabled CORS, and got the vector output successfully).
  2. Go to the function, click 'Code + Test', click Get Function URL, and paste it into the sample skill as provided in the EmbeddingGenerator README doc.
  3. Open Azure Cog Search (existing), click Skillsets, and add the code from the README doc for the search-power-skills to insert the Azure function into the Cog Search pipeline, changing the URL to the one copied.
  4. Click Save, get error noted above.

Any log messages given by the failure

Expected/desired behavior

It works

OS and Version?

N/a. Azure PaaS services

Versions

N/a


How to define "chunks" field of index for EmbeddingGenerator

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Select an index of Cognitive Search on Azure portal.
  2. Select "Edit JSON" button.
  3. Add the following field in fields array and save.
{
  "name": "chunks",
  "type": "Collection(Edm.ComplexType)",
  "analyzer": null,
  "synonymMaps": [],
  "fields": [
    {
        "name": "contentVector",
        "type": "Collection(Edm.Single)",
        "searchable": true,
        "retrievable": true,
        "dimensions": 1536,
        "vectorSearchConfiguration": "vectorConfig"
    }
  ]
},

-> cannot save

Any log messages given by the failure

Error message : "InvalidField: The field 'chunks/contentVector' cannot be a vector field. Only a top-level field of the index can be a vector field. Parameters: definition"

Expected/desired behavior

Please let me know how to define "chunks" field of index for EmbeddingGenerator. Maybe the above is wrong.
I think "chunks" field should be added for EmbeddingGenerator.

EmbeddingGenerator:
https://github.com/Azure-Samples/azure-search-power-skills/tree/main/Vector/EmbeddingGenerator
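
The error message states the rule directly: only a top-level field of the index can be a vector field. As a sketch, the check below walks an index field list (as a Python list of dicts) and reports any vector field that sits inside a complex type. It treats a field as a vector field when its type is `Collection(Edm.Single)` and it has a `dimensions` property, which is an assumption based on the definition posted above.

```python
from typing import Any, Dict, List


def nested_vector_fields(fields: List[Dict[str, Any]], prefix: str = "") -> List[str]:
    """Return paths of vector fields that are not at the top level of the index."""
    found = []
    for field in fields:
        path = f"{prefix}{field['name']}"
        is_vector = field.get("type") == "Collection(Edm.Single)" and "dimensions" in field
        if is_vector and prefix:
            # A non-empty prefix means the field is nested inside a complex type.
            found.append(path)
        found.extend(nested_vector_fields(field.get("fields", []), prefix=f"{path}/"))
    return found
```

Run against the definition in this issue, the check reports `chunks/contentVector`; moving `contentVector` to the top level of the index makes it pass.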

OS and Version?

Azure portal on windows Edge.


Image Clustering: Cleanup comments

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [X ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)


Store Image generates an exception if imageName is not specified

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Deploy as is and do not specify an imageName

Any log messages given by the failure

Web Api skill response contains errors: 'Web Api response contains both data and errors. Will not process Data.;image-store - Error processing the request record : System.Collections.Generic.KeyNotFoundException: The given key 'imageName' was not present in the dictionary.
at System.Collections.Generic.Dictionary2.get_Item(TKey key) at AzureCognitiveSearch.PowerSkills.Vision.ImageStore.ImageStore.<>c__DisplayClass3_0.<<RunStoreImage>b__0>d.MoveNext() in C:\home\site\repository\Vision\ImageStore\ImageStore.cs:line 52 --- End of stack trace from previous location where exception was thrown --- at AzureCognitiveSearch.PowerSkills.Common.WebApiSkillHelpers.ProcessRequestRecordsAsync(String functionName, IEnumerable1 requestRecords, Func`3 processRecord) in C:\home\site\repository\Common\WebAPISkillHelper.cs:line 67'.

Expected/desired behavior

should generate a GUID and succeed
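
The requested fallback can be sketched as follows. The skill itself is written in C#, so this Python version only illustrates the logic: look the key up without throwing, and generate a GUID when it is absent or empty. The function name is hypothetical.

```python
import uuid
from typing import Any, Dict


def resolve_image_name(data: Dict[str, Any]) -> str:
    """Return the supplied imageName, or a freshly generated GUID if none is given."""
    name = data.get("imageName")  # .get() avoids the KeyError seen in the stack trace.
    if isinstance(name, str) and name.strip():
        return name
    return str(uuid.uuid4())
```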


Errors in running the EmbeddingGenerator Azure function locally with VSCode

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Clone the code
  2. Go to the \Vector\EmbeddingGenerator folder
  3. Create local.settings.json per instruction.
  4. Execute func start --python

Any log messages given by the failure

[2023-08-10T23:48:18.670Z] There was an error performing a read operation on the Blob Storage Secret Repository.
[2023-08-10T23:48:18.671Z] Azure.Core: No connection could be made because the target machine actively refused it. (127.0.0.1:10000). System.Net.Http: No connection could be made because the target machine actively refused it. (127.0.0.1:10000). System.Net.Sockets: No connection could be made because the target machine actively refused it.
[2023-08-10T23:48:18.703Z] A host error has occurred during startup operation 'b816d542-5f6c-4417-b694-ff2100d8e6c5'.
[2023-08-10T23:48:18.705Z] Azure.Core: No connection could be made because the target machine actively refused it. (127.0.0.1:10000). System.Net.Http: No connection could be made because the target machine actively refused it. (127.0.0.1:10000). System.Net.Sockets: No connection could be made because the target machine actively refused it.
[2023-08-10T23:48:55.647Z] There was an error performing a read operation on the Blob Storage Secret Repository.
[2023-08-10T23:48:55.649Z] Azure.Core: No connection could be made because the target machine actively refused it. (127.0.0.1:10000). System.Net.Http: No connection could be made because the target machine actively refused it. (127.0.0.1:10000). System.Net.Sockets: No connection could be made because the target machine actively refused it.
Value cannot be null. (Parameter 'provider')

Expected/desired behavior

The function can run locally without errors
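
The log above shows connections to 127.0.0.1:10000 being refused. That port is the default blob endpoint of the local storage emulator (Azurite), so one plausible cause, assuming local.settings.json points the function at development storage, is that the emulator is not running. The sketch below is only a quick diagnostic for that assumption; `is_port_open` is a hypothetical helper, not part of the repository.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 10000, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on 127.0.0.1:10000.")
    else:
        print("Nothing is listening on 127.0.0.1:10000; the storage emulator may not be running.")
```

If the check fails, starting the emulator (or pointing the storage setting at a real storage account) before running func start is worth trying.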

OS and Version?

Windows 10

Versions

Mention any other details that might be useful

Form Recognizer 2.0 is no longer in preview.

Since Form Recognizer 2.0 is no longer in preview, the references to v2.0-preview need to be dropped from the URLs. For example:

/formrecognizer/v2.0-preview/custom/models/ should now be /formrecognizer/v2.0/custom/models/
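
As a minimal sketch of the change, the following Python helper rewrites a preview path to its released form. The function name `drop_preview_suffix` and the example host are hypothetical; only the two path fragments come from the issue.

```python
PREVIEW_SEGMENT = "/formrecognizer/v2.0-preview/"
RELEASED_SEGMENT = "/formrecognizer/v2.0/"


def drop_preview_suffix(url: str) -> str:
    """Replace the v2.0-preview path segment with v2.0; other URLs are returned unchanged."""
    return url.replace(PREVIEW_SEGMENT, RELEASED_SEGMENT)


if __name__ == "__main__":
    # The host below is a placeholder, not a real endpoint.
    old = "https://example.cognitiveservices.azure.com/formrecognizer/v2.0-preview/custom/models/"
    print(drop_preview_suffix(old))
```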

Issues with Sample JSON

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

The sample JSON for "chunk-embed" in https://github.com/Azure-Samples/azure-search-power-skills/blob/main/Vector/EmbeddingGenerator/README.md fails when used with the function's Python code because the recordId value is not enclosed in double quotes (e.g. "recordId": 1234 should be "recordId": "1234").

With the unquoted recordId, the Azure Function fails to parse the request when it is sent with Postman.

Actual: the Azure Function fails with error messages about a malformed body when accessed via Postman.

Expected: the Azure Function succeeds and returns embeddings when accessed via Postman.
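
A minimal Python sketch of the fix, assuming the standard custom skill request shape with a top-level "values" array. `normalize_record_ids` is a hypothetical helper and the "text" field in the example is a placeholder; neither is taken from the EmbeddingGenerator code.

```python
import json


def normalize_record_ids(body: str) -> str:
    """Return the request body with every recordId converted to a string."""
    payload = json.loads(body)
    for record in payload.get("values", []):
        if "recordId" in record:
            record["recordId"] = str(record["recordId"])
    return json.dumps(payload)


if __name__ == "__main__":
    # recordId is a bare number here, as in the sample JSON the issue describes.
    sample = '{"values": [{"recordId": 1234, "data": {"text": "hello"}}]}'
    print(normalize_record_ids(sample))
```

Correcting the sample in the README is the simpler remedy; a helper like this is only useful when the caller cannot be changed.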

OS and Version?

Win 10, VS.Code used to deploy function. Postman

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

HTTP response code 500 Internal Server Error

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Follow the instructions here - https://github.com/Azure-Samples/azure-search-power-skills/blob/main/Vision/FormRecognizer/AnalyzeDocument

Any log messages given by the failure

2023-10-12T14:56:01Z [Error] Executed 'Functions.AnalyzeDocument' (Failed, Id=b6d90265-b92f-4d30-a955-eb6d4e167210, Duration=13ms)

Expected/desired behavior

No error; successful execution.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). macOS (Yosemite? El Capitan? Sierra?)
Azure Portal -> Analyze Document | Code + Test

Versions

Mention any other details that might be useful

The function is deployed successfully, but testing it via Code + Test in the Azure Portal always returns a 500 error.
Subjective opinion: the cause might be the code in __init__.py, since testing the same Form Recognizer endpoint/key with the script at https://learn.microsoft.com/en-us/training/modules/build-form-recognizer-custom-skill-for-azure-cognitive-search/4-exercise-build-deploy (it seems the code was "re-engineered" from there) works successfully.


Thanks! We'll be in touch soon.

HocrGenerator wrong deployment link

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

The deployment link is incorrect - it is deploying an Image Store skill.

OS and Version?

Windows 10

CustomEntityLookup not working

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> custom-entity-lookup failing on request



### Minimal steps to reproduce
1. Right-click CustomEntityLookup in the Solution Explorer and choose "Set as StartUp Project".
2. Press F5. NOTE: You may need to allow the function to run.
3. Once the function is running, it should supply you with the URL to use for POST calls. Copy this URL.
4. Use Postman to send a POST request.

### Any log messages given by the failure
Postman Error 500
[2020-10-30T10:55:21.780] Executing 'custom-entity-lookup' (Reason='This function was programmatically called via the host APIs.', Id=9aac505f-12f4-4c98-84ec-2c927c19769c)
[2020-10-30T10:55:21.840] Custom Entity Search function: C# HTTP trigger function processed a request.
[2020-10-30T10:55:21.998] Executed 'custom-entity-lookup' (Failed, Id=9aac505f-12f4-4c98-84ec-2c927c19769c, Duration=329ms)
[2020-10-30T10:55:22.000] System.Private.CoreLib: Exception while executing function: custom-entity-lookup. CustomEntityLookup: Method not found: 'System.String System.IO.Path.Join(System.String, System.String)'.

### Expected/desired behavior
Response

### OS and Version?
Windows 10

### Versions
>

### Mention any other details that might be useful

> ---------------------------------------------------------------
> Thanks! We'll be in touch soon.

Deploy failed

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Deploy to some resource
デプロイ-Microsoft.Template-20210701160805.zip

Any log messages given by the failure

Conflict on deploy

Expected/desired behavior

The deployment succeeds.

OS and Version?

Windows 10

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.
