Comments (3)
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @jpalvarezl @ralph-msft @trrwilson.
from azure-sdk-for-net.
Hello, @Freddeb! Thank you for getting in touch. I can share a quick way to get this information but would also love your feedback on how we could improve the discoverability of this information.
Where usage is: as you can see in OpenAI's API reference, `usage` is not provided per embedding entry (`data` array item), but rather per full response -- mirroring that, you can retrieve usage information from an `EmbeddingCollection` instance that you get from the multi-embedding-returning `GenerateEmbeddingsAsync()` ("Embeddings" with the 's') method:

```csharp
EmbeddingCollection embeddings = await client.GenerateEmbeddingsAsync(["hello, world!"]);
int totalTokensForResponse = embeddings.Usage.TotalTokens;
Embedding theOneActualEmbedding = embeddings[0];
```
Why?

- We created the singular `GenerateEmbedding[Async]()` method based on the observation that single-item cases were exceedingly common; much like Chat Completions rarely being given an `n` > 1 for multiple choices, the `data` array for Embeddings isn't particularly relevant in the majority of cases, where a single input at a time is provided.
- But we were concerned that, in those uncommon cases where multiple inputs are provided, there may be confusion if we provided the same `EmbeddingTokenUsage` instance on multiple `Embedding` instances; e.g. if we had the below:
```csharp
EmbeddingCollection multipleEmbeddings = await client.GenerateEmbeddingsAsync(
    [
        "hello, world!",
        "this is a test",
        "I'd like multiple embeddings this time"
    ]);

// this is what's present right now, and represents actual usage
int totalTokensForOperation = multipleEmbeddings.Usage.TotalTokens;
int tokensForAllInputs = multipleEmbeddings.Usage.InputTokens;

// if we had it on each embedding, this could be misleading -- usage information isn't actually provided per data item!
int maybeMisleadingTokens = multipleEmbeddings[0].Usage.TotalTokens;
int sameAsAboveTokens = multipleEmbeddings[1].Usage.TotalTokens;
Assert.That(multipleEmbeddings[0].Usage.TotalTokens, Is.EqualTo(multipleEmbeddings[2].Usage.TotalTokens));
```
Question: what would make this easier? We're concerned about the confusion case (especially since it could produce an alarming misrepresentation of what's being paid for), but you've clearly hit a very troublesome discoverability problem for the single-item case we're trying to optimize for.

- Documentation? It's basically a given that we need more, but would a README example for fetching usage have clarified this sooner?
- Duplicate and rename the properties? If you had multiple `Embedding` instances as shown above and they each had e.g. `TotalTokensForOperation`, then aside from seeming overly wordy, would you see that as clarifying the problem?
- A method on `Embedding`? A `GetOperationTokenUsage()` method might make it even more clear -- though also incrementally less discoverable (though still better than a different method entirely!). Would that have seemed clear?
- Another approach? We're fundamentally trying to make single-embedding calls easy while not misrepresenting that we can't provide usage information on a per-single-embedding-item basis. Whatever makes this clear is worth looking into!
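For concreteness, the `GetOperationTokenUsage()` option mentioned above could take a shape like the following. This is purely a hypothetical sketch of a proposed API, not anything that exists in the library today; the back-reference field and method are both assumptions:

```csharp
// Hypothetical sketch only: neither this method nor the parent back-reference
// exists in the current library.
public class Embedding
{
    // Each Embedding would keep a reference to the collection it came from...
    private readonly EmbeddingCollection _parentCollection;

    // ...so per-item code could still reach the operation-wide usage, with a
    // name that makes clear the numbers describe the whole request, not this item.
    public EmbeddingTokenUsage GetOperationTokenUsage() => _parentCollection.Usage;
}
```

The design trade-off is exactly as described: a method name carrying "Operation" signals the scope correctly, at the cost of being one step less discoverable than a property.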
Thanks again; hopefully this unblocks getting the information in the immediate term, and your input is greatly appreciated for making the medium term better!
Hello @trrwilson,
Once again, thank you for taking the time to answer my question and for providing a quick solution.
Here is my perspective as a developer.
The function `GenerateEmbeddingAsync` (without the 's'):
This function is my intuitive choice when I need to take the user's question from my bot's prompt and translate it into a single vector before submitting it to my vectorized database. Knowing the number of tokens used makes sense for `GenerateEmbeddingAsync`, even for a single item: I need to record the various token usages for a complete turn.
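The per-turn bookkeeping described above could be sketched as a small accumulator fed from each call's usage. This is an illustrative helper with hypothetical names; only the `InputTokens`/`TotalTokens` figures come from the thread:

```csharp
// Hypothetical helper for tallying token usage across one user turn
// (e.g. embedding call + chat call). The type and member names are illustrative.
public class TurnUsageTally
{
    public int InputTokens { get; private set; }
    public int TotalTokens { get; private set; }

    public void Add(int inputTokens, int totalTokens)
    {
        InputTokens += inputTokens;
        TotalTokens += totalTokens;
    }
}

// Per turn, assuming `embeddings.Usage` as shown earlier in the thread:
// tally.Add(embeddings.Usage.InputTokens, embeddings.Usage.TotalTokens);
```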
The function `GenerateEmbeddingsAsync` (with the 's'):
This function certainly makes a lot of sense when you want to prepare multiple questions for your vectorized database, or when you want to convert the chunks of a complete document.
To be honest, I always have apprehension about using this type of function with a large amount of data (e.g., a complete 150-page document split into multiple chunks).
What if the operation goes wrong and crashes in the middle of the process (e.g., network issue, problematic chunk, etc.)?
Would we still need to pay for the tokens used even if the function call ended with an exception? Am I wrong?
Therefore, to prepare my vectorized database, I tend to do "1 chunk = 1 API function call."
I run the process in a loop on the document chunks and go get myself a coffee; the conversion is quite fast.
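The "1 chunk = 1 API function call" loop described above might look like the following. This is a sketch assuming the same `client` as in the snippets earlier in the thread; `chunks` and `SaveVectorAsync` are hypothetical placeholders for the document-splitting and persistence steps:

```csharp
// Sketch: embed chunks one at a time so a mid-run failure only affects one chunk.
// `client`, `chunks`, and `SaveVectorAsync` are assumptions, not library APIs.
int totalTokensUsed = 0;
foreach (string chunk in chunks)
{
    try
    {
        // A single-item call via the plural method keeps usage discoverable,
        // as shown earlier in the thread.
        EmbeddingCollection result = await client.GenerateEmbeddingsAsync([chunk]);
        await SaveVectorAsync(chunk, result[0]); // hypothetical persistence step
        totalTokensUsed += result.Usage.TotalTokens;
    }
    catch (Exception ex)
    {
        // A failed chunk can be logged and retried later without redoing the rest.
        Console.WriteLine($"Chunk failed: {ex.Message}");
    }
}
```

Per-call usage also answers the "what did I pay for before the crash?" question: `totalTokensUsed` reflects only the calls that completed.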
Now, I'll stop elaborating and answer your question.
The answer you gave me (the first six lines) is clear and sufficient for me.
If I wanted to convert a few chunks with a single call to `GenerateEmbeddingsAsync`, I would be content with the total input tokens and total tokens retrieved from the `EmbeddingCollection` object.
Currently, I am interested in the token usage for a complete turn (user question -> bot response).
And if, ultimately, I really wanted to know the token usage per vector in a list of vectors, I find your second code snippet's proposal very good (InputTokens/TotalTokens on the collection and per item).
I don't think there could be any confusion if clear documentation accompanies it. There is always a way to draw attention with an IntelliCode message when the developer accesses the `EmbeddingCollection.Usage` or `Embedding.Usage` property.
I hope this response is helpful.
I don't know if it's possible via GitHub, but ideally, this is the type of problem for which I would submit different solutions to the developer community for a vote.
Thanks again.
Fred