Comments (15)
it is a free key that i'm are using
So its not the key that are the problem.
from open-webui.
I have a suspicion that it's an issue on the LiteLLM-side, could you try isolating the issue by trying with LiteLLM only? Keep us updated!
from open-webui.
I have a suspicion that it's an issue on the LiteLLM-side, could you try isolating the issue by trying with LiteLLM only? Keep us updated!
I'm not so sure that it's LiteLLM to blame here, I've tried now with an older version of it and the same behaviour is happening on Claude 2 as well, from other clients than WebUI. I believe this behavior started the day that Claude 3 was released. By chance @bjornjorgensen are you using an Anthropic developer account like mine? I am beginning to wonder if they simply limit max_tokens
now for free dev keys.
from open-webui.
Update: now I'm not so sure where the blame lies. Testing in another chat app works fine with Claude 3 endpoints. So it could indeed be a LiteLLM issue then, but it also wasn't working with older versions of it that previously did work? Really need someone with actual paid API keys to test this I think.
from open-webui.
Yes I just tried it with Lobechat as well and got a full response. So ball seems to be back in LiteLLM's court but I need to do further testing with components in isolation to be really certain of this.
from open-webui.
So after futher testing... I can only observe this happening when WebUI is involved. So it seems it may be something to do with our code, I just cannot at the moment nail down what it could possibly be. It seemingly only affects Claude API via LiteLLM in Open WebUI
from open-webui.
are there any configs that over rights tolkens that it can print on outputs?
from open-webui.
@bjornjorgensen nah, in our testing we've checked we're not sending anything like that would limit the max tokens, but nonetheless the API response says the stop reason is length
which would indicate that it's been given one and reached it... very strange. Still being investigated and I hope we'll have an answer soon!
from open-webui.
Ladies and gentleman, we got em. Claude's API now requires that the max_tokens
param be sent in the payload, and LiteLLM will set a default of 256 tokens if you don't specify this. Currently the WebUI does not send a max_tokens
param when using external APIs, so the proposed fix would be to add that feature, or allow this parameter override to be set in the LiteLLM configuration UI. For now, it can be worked around by mounting and modifying the config.yaml
file as such:
- litellm_params:
api_key: your_api_key
model: anthropic/claude-3-sonnet-20240229
max_tokens: 4096
model_info:
id: 810226a0-61e2-4d97-9de0-822bd4300fcd
model_name: claude-3-sonnet
Note: the maximum value is 4096, you'll get an error from Anthropic's API if you request more.
from open-webui.
https://docs.anthropic.com/claude/docs/models-overview
from open-webui.
v0.1.111 (not merged to :main
yet) has a new field in the LiteLLM UI to configure the max_tokens
parameter override, which will make modifying your config.yaml
by hand unneccesary. This can be tested now in the :dev
branch.
from open-webui.
max_tokens: 4096 should be explicitly set from the settings!
from open-webui.
yes, I have to delete the old one and add it back.. but now it works :)
Thanks
from open-webui.
hmm.. have some issues to day with dev images.. i cant see whats wrong there but when I try main it works
but i have deleted my storage for openchat and now i have to set everything up again. I add claude3 opus without seting enything othere then the model name and api key
so must i set max_tokens: 4096 when i use claude-3 models? if so then it must be in a readme somewhere.
from open-webui.
@bjornjorgensen I haven't migrated this to the docs site yet, there's a thread:
max_tokens
must be 4096
to get the most out of Claude API, as noted there.
from open-webui.
Related Issues (20)
- enh: more mature support for external (non-SQLite) databases HOT 4
- Feat: Improve Visibility of Document Settings
- Feat: Refine Notification Behavior to Prevent Excessive Alerts During Task Generation HOT 2
- Feat: Conditional Display of Message Edit Options in Shared Chats HOT 2
- technical question about controlling a variable for speech playback during voice recording.
- feat: user cost tracking HOT 1
- feat: user throttling HOT 1
- New issues connecting to certain LiteLLM proxy models / "Expected last role to be one of" HOT 1
- A Json parse
- Offline Installation
- feat: prompt library
- Add option to send image / info to Automatic1111...
- Stopping generation doesn't actually stop interference
- bug: During pulling two models from ollama style of window is broken HOT 3
- How to turn the ollma on when the docker is build for OpenAI only
- installation problem: docker: layers from manifest don't match image configuration.
- enh: Efficient Image Handling for Multi-Modal Chats HOT 1
- Please add KoboldCPP support
- feat: Add Assistant/Custom GPT Shortcut to Side Panel HOT 1
- Cannot add CFG for AUTOMATIC 1111
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from open-webui.