Comments (16)
closing as unable to repro, but bump us to reopen if this turns out not to be a version issue
i'll try and add the langfuse version id logging today @Manouchehri
from litellm.
i don't see the `user` field in this call
what would you expect to happen here? @Manouchehri
It should be pulled from the key alias. I think it's more than just `user` that is racy. See the screenshot below; these are identical requests, but the name is changing at random sometimes.
![image](https://private-user-images.githubusercontent.com/7232674/335815973-faaef610-0ee4-4a97-ab42-a9d096496f0e.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTc2Mjg3MzEsIm5iZiI6MTcxNzYyODQzMSwicGF0aCI6Ii83MjMyNjc0LzMzNTgxNTk3My1mYWFlZjYxMC0wZWU0LTRhOTctYWI0Mi1hOWQwOTY0OTZmMGUucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDYwNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA2MDVUMjMwMDMxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9OTRmYmZlMWMyOTAwNDYzN2ZjYjZlN2VkM2I3OTY5M2MzNWQwOTJkOTkyYTY4ZGFhNzA5YmZlMmJiMGFmOTFhZCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.uZCebHWfiPbJ12F5yfNMZi1w1AizOvY57GjmOPQ0ViU)
![Screenshot 2024-06-01 at 10 08 21 AM](https://private-user-images.githubusercontent.com/17561003/335816633-c975235f-cf2a-4965-af9d-69e2fe8cd74f.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTc2Mjg3MzEsIm5iZiI6MTcxNzYyODQzMSwicGF0aCI6Ii8xNzU2MTAwMy8zMzU4MTY2MzMtYzk3NTIzNWYtY2YyYS00OTY1LWFmOWQtNjllMmZlOGNkNzRmLnBuZz9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA2MDUlMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwNjA1VDIzMDAzMVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTIxNzEyOTU1MzFkYjk0ZDUyNjc1ZmFhNGU2ZDNmZWQ1OWVlNDdhNGM0Y2UwM2NkNmIxMGM4MjA5NTc2ZjI4NzImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.zf-xY4Y3n1z2JIFq-0d-NdXp5imsAy2Q49hz49AyiVM)
i can see the line of code in the server; the only way this would happen is if:
- `user_id` is being passed by the request (seems unlikely)
- `user_id` isn't being passed by `user_api_key_auth` (possible)
do you have a consistent repro of this? @Manouchehri
Nope. Super confused why the `user_api_key_*` fields are sometimes missing; I don't see any pattern to it. Here are two identical requests I sent:
Good request shows this:
![image](https://private-user-images.githubusercontent.com/7232674/335816901-0dd8114b-d797-449e-bf26-f19f6482d069.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTc2Mjg3MzEsIm5iZiI6MTcxNzYyODQzMSwicGF0aCI6Ii83MjMyNjc0LzMzNTgxNjkwMS0wZGQ4MTE0Yi1kNzk3LTQ0OWUtYmYyNi1mMTlmNjQ4MmQwNjkucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDYwNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA2MDVUMjMwMDMxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZGI2YTIzZjUxOWM0ZmRlZGY3ZDMyZWEzYzUzNzA5ODg0ZDRiMDU4YzVmNGUwNzc0OTQ0MmJkN2RlZjgwYWM3NSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.G0k1oylNIfnngSsP8EavhTy0MN1OdHcWaNS6mLTwQNw)
Bad request shows this:
![image](https://private-user-images.githubusercontent.com/7232674/335816955-c0aa17c8-c83e-484f-9009-60aa4c86457c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTc2Mjg3MzEsIm5iZiI6MTcxNzYyODQzMSwicGF0aCI6Ii83MjMyNjc0LzMzNTgxNjk1NS1jMGFhMTdjOC1jODNlLTQ4NGYtOTAwOS02MGFhNGM4NjQ1N2MucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDYwNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA2MDVUMjMwMDMxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MTEwOGUzNzA4NmI2NmY1YThkNjg4MDQyN2VhMjdiNjQ5NTVhM2M3ZDdjYWYyMmRmOTdkMjk1NjUxM2I4ZjY5ZiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.9YNmUgKSvSxdhtsxi--7CHIthGbxKkbPwF0IiP4MWjc)
Are you on Cloud Run with 2 instances? It just looks like one instance has an older version.
I'm 98% sure all instances are up to date. Can't confirm retroactively though because of #3673. 😅
can you restart / re-deploy on Cloud Run and see if the issue persists? I'm unable to repro your problem locally.
Sorry, I confirmed that I am indeed only running 1 instance. I'm able to reproduce it with this loop:
```shell
for i in {1..50} ; do curl -v "${OPENAI_API_BASE}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo-0125",
    "max_tokens": 10,
    "seed": 31337,
    "messages": [
      {
        "role": "user",
        "content": "what is 1 plus 1?"
      }
    ],
    "cache": {
      "no-cache": true
    },
    "extra_headers": {
      "cf-skip-cache": "True"
    }
  }' ; done
```
The first request works, but not the following ones.
![image](https://private-user-images.githubusercontent.com/7232674/335817781-8ec90f97-1921-40b1-b3a3-d5ca61cf5705.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTc2Mjg3MzEsIm5iZiI6MTcxNzYyODQzMSwicGF0aCI6Ii83MjMyNjc0LzMzNTgxNzc4MS04ZWM5MGY5Ny0xOTIxLTQwYjEtYjNhMy1kNWNhNjFjZjU3MDUucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDYwNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA2MDVUMjMwMDMxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9OGIxMTlhMmY1MWM3M2M1MWJjOWQ1Y2Y0MDk5NTk3MmQwNTU4NTY0YTI2NGVlMDVlMTYwNjYzN2YzOTU3NDlmNiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.DAYk44AJ_B0g_Fon5BhAfxEHaKzGoqrUGx_y05-7nUk)
I am 100% using 1.39.6 everywhere; the Cloud Run instances don't stick around idle more than 15 minutes worst case, so that should never be an issue for me. =)
oh! that's interesting - thanks for this @Manouchehri
Seems like an API key caching issue of some sort?
```shell
for i in {1..10} ; do curl -v "${OPENAI_API_BASE}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gemini-1.5-flash-001",
    "max_tokens": 10,
    "messages": [
      {
        "role": "user",
        "content": "what is 1 plus 1?"
      }
    ],
    "cache": {
      "no-cache": true
    }
  }' ; done
```
This is not because of the OIDC caching I've added in the recent PRs. In the test case above, you can see I'm using `gemini-1.5-flash-001` on Vertex AI, which I haven't worked on at all for that feature. :)
![image](https://private-user-images.githubusercontent.com/7232674/335818179-55954747-aaeb-4ade-9c2e-be5b7753820c.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTc2Mjg3MzEsIm5iZiI6MTcxNzYyODQzMSwicGF0aCI6Ii83MjMyNjc0LzMzNTgxODE3OS01NTk1NDc0Ny1hYWViLTRhZGUtOWMyZS1iZTViNzc1MzgyMGMucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDYwNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA2MDVUMjMwMDMxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YjcyZGU3NGQ5ZmJhNDg0MWUxNmZmOGVmMjBkNjNiM2RiZGRkYWZmY2U2ODgzZmMxNThiNDI5NTg5ZTcxYTE3OCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.-A0__buRoOKYM-qb-5eZp69c8urkGQmF3oVqMH7trDI)
Are you caching anything for about a minute? It kinda looks like that's related to this issue. If I sleep for 61 seconds between requests, it works perfectly.
Repro that doesn't trigger the bug:
```shell
for i in {1..10} ; do sleep 61 && curl -v "${OPENAI_API_BASE}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gemini-1.5-flash-001",
    "max_tokens": 10,
    "messages": [
      {
        "role": "user",
        "content": "what is 1 plus 1?"
      }
    ],
    "cache": {
      "no-cache": true
    }
  }' ; done
```
![image](https://private-user-images.githubusercontent.com/7232674/335819005-388110bc-4896-439a-a6b4-81f39c5b32f6.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTc2Mjg3MzEsIm5iZiI6MTcxNzYyODQzMSwicGF0aCI6Ii83MjMyNjc0LzMzNTgxOTAwNS0zODgxMTBiYy00ODk2LTQzOWEtYTZiNC04MWYzOWM1YjMyZjYucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDYwNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA2MDVUMjMwMDMxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MTVmYzZlMTYyNzUzNjYxZTAxNGQyY2ZiZjgwZjM5MWQ2NTdiZjBlMzgyOTQ5YTVkZTMxNWU0MzFmNDlmMWMxZCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.ezhLIrnqzBfTGaGdcMj9lKfBpq7_GpS2aP2EEhD2cLE)
Repro that does trigger the bug:
```shell
for i in {1..10} ; do sleep 55 && curl -v "${OPENAI_API_BASE}/chat/completions" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gemini-1.5-flash-001",
    "max_tokens": 10,
    "messages": [
      {
        "role": "user",
        "content": "what is 1 plus 1?"
      }
    ],
    "cache": {
      "no-cache": true
    }
  }' ; done
```
See below how everything beyond the first request is broken?
![image](https://private-user-images.githubusercontent.com/7232674/335819253-c010bc4b-236c-48a6-bac3-c5bf64017f41.png?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MTc2Mjg3MzEsIm5iZiI6MTcxNzYyODQzMSwicGF0aCI6Ii83MjMyNjc0LzMzNTgxOTI1My1jMDEwYmM0Yi0yMzZjLTQ4YTYtYmFjMy1jNWJmNjQwMTdmNDEucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI0MDYwNSUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNDA2MDVUMjMwMDMxWiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YmU5N2Q5NGM1ZjI4MjIzMWNiODE1MjBjM2IyZjI0Y2Q2OWI0ODY0MTc1ZTI2YTBhZGQyYmNkZGYzMWU5MDVkOCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmYWN0b3JfaWQ9MCZrZXlfaWQ9MCZyZXBvX2lkPTAifQ.gKX50iZB5d_gizAyO4jS0xH76rfJ6k30Ecw8sv3wsPY)
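The 61s-works / 55s-fails split is exactly the pattern a fixed ~60 second in-memory TTL would produce. A minimal sketch of that behavior (illustrative only, not LiteLLM's actual cache class):

```python
import time


class TTLCache:
    """Minimal in-memory cache with a fixed TTL, for illustration only."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expiry timestamp)

    def set(self, key, value):
        self._store[key] = (value, time.monotonic() + self.ttl)

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            # Entry aged out: behave as a miss, forcing a fresh load.
            del self._store[key]
            return None
        return value


# With ttl=60, a request every 55s always hits the cached (possibly
# incomplete) entry; waiting 61s guarantees a miss and a fresh load.
```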
I am experiencing the same issue and it seems this is really a caching bug.
When the hashed API token does not exist as a key in the cache, an empty `LiteLLM_VerificationTokenView` object is created with only the token set (see `litellm/litellm/proxy/proxy_server.py`, lines 1932 and 2001 in e96d2e3).
Now it seems this cache entry, keyed by the API token, is used somewhere else when loading the API key object; I'm not 100% sure yet where exactly this happens. The object from the cache then only contains the token and api_key, missing all other data:
```
token='d514bb211d49d8c135f13c8bd9d4022b37e3a4a18c659b09892b1c62959e1088' key_name=None key_alias=None spend=0.00114 max_budget=None expires=None models=[] aliases={} config={} user_id=None team_id=None max_parallel_requests=None metadata={} tpm_limit=None rpm_limit=None budget_duration=None budget_reset_at=None allowed_cache_controls=[] permissions={} model_spend={} model_max_budget={} soft_budget_cooldown=False litellm_budget_table=None org_id=None team_spend=None team_alias=None team_tpm_limit=None team_rpm_limit=None team_max_budget=None team_models=[] team_blocked=False soft_budget=None team_model_aliases=None team_member_spend=None end_user_id=None end_user_tpm_limit=None end_user_rpm_limit=None end_user_max_budget=None api_key='8bb0465eb6a148ba9d969368fe55a2c23bdb63d937c61c5fa4749d95490eded3' user_role=<LitellmUserRoles.INTERNAL_USER: 'internal_user'> allowed_model_region=None
```
After 60s (the default `in_memory_cache_ttl`) this cache entry expires, and it works again for one single request.
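If that analysis is right, the failure mode can be sketched roughly like this (a sketch only; `authenticate`, `update_spend`, and `KeyRecord` are illustrative names, not LiteLLM's actual functions, and whether the auth path skips its own cache write is part of the guess): the spend-tracking path writes a mostly-empty record under the hashed token, and the auth path later reads it back as if it were complete.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class KeyRecord:
    token: str
    user_id: Optional[str] = None
    key_alias: Optional[str] = None


# Single shared cache, keyed by the hashed token in both code paths.
cache: dict = {}


def authenticate(hashed_token: str,
                 db_lookup: Callable[[str], KeyRecord]) -> KeyRecord:
    # Auth path: trusts the cache before hitting the DB, so for the next
    # TTL window it returns whatever was stored, however incomplete.
    cached = cache.get(hashed_token)
    if cached is not None:
        return cached
    return db_lookup(hashed_token)


def update_spend(hashed_token: str) -> None:
    # Spend-tracking path, run after each request: on a cache miss it
    # creates a bare record with only the token set and stores it under
    # the same key the auth path reads.
    if hashed_token not in cache:
        cache[hashed_token] = KeyRecord(token=hashed_token)  # user_id, alias missing!
```

Under this model the first request after a miss loads the full record from the DB and works, while every later request within the TTL gets the stripped-down record, matching the screenshots above.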
I will continue debugging; perhaps @krrishdholakia has an idea whether this is the right direction, where exactly the cached entry is used as the API key object, and how to fix this?
from litellm.
I think the cache retrieval happens here: `litellm/litellm/proxy/proxy_server.py`, line 816 in e96d2e3. The hashed token is used as a cache key there as well, and the full API key object is expected, but the `_update_key_cache` function set the incomplete object in the cache.
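One possible mitigation, assuming the analysis above is right (the key-prefix scheme here is hypothetical, not LiteLLM's actual fix), is to namespace the two cache entries so a spend-tracking record can never be read back as an auth object:

```python
def spend_cache_key(hashed_token: str) -> str:
    # Entries written by the spend-update path live under their own prefix.
    return f"spend:{hashed_token}"


def auth_cache_key(hashed_token: str) -> str:
    # Entries read during authentication use a separate prefix, so an
    # incomplete spend record is never returned as an API key object.
    return f"auth:{hashed_token}"
```

Alternatively, the spend path could merge its update into the existing complete object rather than caching a bare one.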