Comments (3)
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jpalvarezl @ralph-msft @trrwilson.
from azure-sdk-for-net.
Hello, @Freddeb! Thank you for getting in touch. I can share a quick way to get this information but would also love your feedback on how we could improve the discoverability of this information.
Where usage is: as you can see in OpenAI's API reference, `usage` is not provided per embedding entry (`data` array item), but rather per full response -- mirroring that, you can retrieve usage information from an `EmbeddingCollection` instance that you get from the multi-embedding-returning `GenerateEmbeddingsAsync()` ("Embeddings" with the 's') method:

```csharp
EmbeddingCollection embeddings = await client.GenerateEmbeddingsAsync(["hello, world!"]);
int totalTokensForResponse = embeddings.Usage.TotalTokens;
Embedding theOneActualEmbedding = embeddings[0];
```
Why?

- We created the singular `GenerateEmbedding[Async]()` method based on the observation that single-item cases were exceedingly common; much like Chat Completions rarely being given an `n` > 1 for multiple choices, the `data` array for Embeddings isn't particularly relevant in the majority of cases, where a single input at a time is provided.
- But we were concerned that, in those uncommon cases where multiple inputs are provided, there may be confusion if we provided the same `EmbeddingTokenUsage` instance on multiple `Embedding` instances; e.g. if we had the below:
```csharp
EmbeddingCollection multipleEmbeddings = await client.GenerateEmbeddingsAsync(
    [
        "hello, world!",
        "this is a test",
        "I'd like multiple embeddings this time"
    ]);

// this is what's present right now, and represents actual usage
int totalTokensForOperation = multipleEmbeddings.Usage.TotalTokens;
int tokensForAllInputs = multipleEmbeddings.Usage.InputTokens;

// if we had it on each embedding, this could be misleading -- usage information isn't actually provided per data item!
int maybeMisleadingTokens = multipleEmbeddings[0].Usage.TotalTokens;
int sameAsAboveTokens = multipleEmbeddings[1].Usage.TotalTokens;
Assert.That(multipleEmbeddings[0].Usage.TotalTokens, Is.EqualTo(multipleEmbeddings[2].Usage.TotalTokens));
```
Question: what would make this easier? We're concerned about the confusion case (especially since it could produce an alarming misrepresentation of what's being paid for), but you've clearly hit a very troublesome discoverability problem for the single-item case we're trying to optimize for.

- Documentation? It's basically a given that we need more, but would a README example for fetching usage have clarified this sooner?
- Duplicate and rename the properties? If you had multiple `Embedding` instances as shown above and they each had e.g. `TotalTokensForOperation`, then aside from seeming overly wordy, would you see that as clarifying the problem?
- A method on `Embedding`? A `GetOperationTokenUsage()` method might make it even more clear -- though also incrementally less discoverable (though still better than a different method entirely!). Would that have seemed clear?
- Another approach? We're fundamentally trying to make single-embedding calls easy while not misrepresenting that we can't provide usage information on a per-single-embedding-item basis. Whatever makes this clear is worth looking into!
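For concreteness, the `GetOperationTokenUsage()` option mentioned above could take a shape like the following. This is purely a hypothetical sketch of a proposed API, not anything that exists in the library today; the back-reference field and method are both assumptions:

```csharp
// Hypothetical sketch only: neither this method nor the parent back-reference
// exists in the current library.
public class Embedding
{
    // Each Embedding would keep a reference to the collection it came from...
    private readonly EmbeddingCollection _parentCollection;

    // ...so per-item code could still reach the operation-wide usage, with a
    // name that makes clear the numbers describe the whole request, not this item.
    public EmbeddingTokenUsage GetOperationTokenUsage() => _parentCollection.Usage;
}
```

The design trade-off is exactly as described: a method name carrying "Operation" signals the scope correctly, at the cost of being one step less discoverable than a property.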
Thanks again; hopefully this unblocks getting the information in the immediate term, and your input is greatly appreciated for making the medium term better!
Hello @trrwilson,
Once again, thank you for taking the time to answer my question and for providing a quick solution.
Here is my perspective as a developer.
The function `GenerateEmbeddingAsync` (without the 's'):
This function is my intuitive choice when I need to take the user's question from my bot's prompt and translate it into a single vector before submitting it to my vectorized database. Knowing the number of tokens used makes sense for `GenerateEmbeddingAsync`, even for a single item: I need to record the various token usages for a complete turn.
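The per-turn bookkeeping described above could be sketched as a small accumulator fed from each call's usage. This is an illustrative helper with hypothetical names; only the `InputTokens`/`TotalTokens` figures come from the thread:

```csharp
// Hypothetical helper for tallying token usage across one user turn
// (e.g. embedding call + chat call). The type and member names are illustrative.
public class TurnUsageTally
{
    public int InputTokens { get; private set; }
    public int TotalTokens { get; private set; }

    public void Add(int inputTokens, int totalTokens)
    {
        InputTokens += inputTokens;
        TotalTokens += totalTokens;
    }
}

// Per turn, assuming `embeddings.Usage` as shown earlier in the thread:
// tally.Add(embeddings.Usage.InputTokens, embeddings.Usage.TotalTokens);
```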
The function `GenerateEmbeddingsAsync` (with the 's'):
This function certainly makes a lot of sense when you want to prepare multiple questions for your vectorized database, or when you want to convert the chunks of a complete document.
To be honest, I always have apprehension about using this type of function with a large amount of data (e.g., a complete 150-page document split into multiple chunks).
What if the operation goes wrong and crashes in the middle of the process (e.g., network issue, problematic chunk, etc.)?
Would we still need to pay for the tokens used even if the function call ended with an exception? Am I wrong?
Therefore, to prepare my vectorized database, I tend to do "1 chunk = 1 API function call."
I run the process in a loop on the document chunks and go get myself a coffee; the conversion is quite fast.
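The "1 chunk = 1 API function call" loop described above might look like the following. This is a sketch assuming the same `client` as in the snippets earlier in the thread; `chunks` and `SaveVectorAsync` are hypothetical placeholders for the document-splitting and persistence steps:

```csharp
// Sketch: embed chunks one at a time so a mid-run failure only affects one chunk.
// `client`, `chunks`, and `SaveVectorAsync` are assumptions, not library APIs.
int totalTokensUsed = 0;
foreach (string chunk in chunks)
{
    try
    {
        // A single-item call via the plural method keeps usage discoverable,
        // as shown earlier in the thread.
        EmbeddingCollection result = await client.GenerateEmbeddingsAsync([chunk]);
        await SaveVectorAsync(chunk, result[0]); // hypothetical persistence step
        totalTokensUsed += result.Usage.TotalTokens;
    }
    catch (Exception ex)
    {
        // A failed chunk can be logged and retried later without redoing the rest.
        Console.WriteLine($"Chunk failed: {ex.Message}");
    }
}
```

Per-call usage also answers the "what did I pay for before the crash?" question: `totalTokensUsed` reflects only the calls that completed.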
Now, I'll stop elaborating and answer your question.
The answer you gave me (the first six lines) is clear and sufficient for me.
If I wanted to convert a few chunks with a single call to `GenerateEmbeddingsAsync`, I would be content with the total input tokens and total tokens retrieved from the `EmbeddingCollection` object.
Currently, I am interested in the token usage for a complete turn (user question -> bot response).
And if, ultimately, I really wanted to know the token usage per vector in a list of vectors, I find your second code snippet's proposal very good (InputTokens/TotalTokens on the collection and per item).
I don't think there could be any confusion if clear documentation accompanies it. There is always a way to draw attention with an IntelliCode message when the developer accesses the `EmbeddingCollection.Usage` or `Embedding.Usage` property.
I hope this response is helpful.
I don't know if it's possible via GitHub, but ideally, this is the type of problem for which I would submit different solutions to the developer community for a vote.
Thanks again.
Fred