Comments (6)
Does adding the chat template have an impact?
curl -X 'POST' \
'http://localhost:3000/generate' \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"inputs": "<BOS_TOKEN><|START_OF_TURN_TOKEN|><|USER_TOKEN|>My name is Olivier and I<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>",
"parameters": {
"best_of": 1,
"decoder_input_details": true,
"details": true,
"do_sample": true,
"frequency_penalty": 0.1,
"max_new_tokens": 20,
"repetition_penalty": 1.03,
"return_full_text": false,
"seed": null,
"stop": [
"photographer"
],
"temperature": 0.5,
"top_k": 10,
"top_n_tokens": 5,
"top_p": 0.95,
"truncate": null,
"typical_p": 0.95,
"watermark": true
}
}'
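For anyone who prefers to script the test, here is a minimal Python sketch of the same request. It assumes the server from the curl command above is listening on localhost:3000. The helper names `build_prompt`, `build_payload` and `generate` are hypothetical. The turn tokens are copied from the "inputs" string above, and only a subset of the parameters is sent.

```python
import json
import urllib.request

# Turn markers copied from the "inputs" string in the curl request above.
BOS = "<BOS_TOKEN>"
START = "<|START_OF_TURN_TOKEN|>"
END = "<|END_OF_TURN_TOKEN|>"
USER = "<|USER_TOKEN|>"
CHATBOT = "<|CHATBOT_TOKEN|>"


def build_prompt(user_message: str) -> str:
    """Wrap one user message in the turn tokens and open the chatbot turn."""
    return f"{BOS}{START}{USER}{user_message}{END}{START}{CHATBOT}"


def build_payload(user_message: str, max_new_tokens: int = 20) -> dict:
    """Build a /generate request body with a subset of the parameters above."""
    return {
        "inputs": build_prompt(user_message),
        "parameters": {
            "details": True,
            "do_sample": True,
            "max_new_tokens": max_new_tokens,
            "temperature": 0.5,
            "top_p": 0.95,
        },
    }


def generate(user_message: str, url: str = "http://localhost:3000/generate") -> dict:
    """POST the payload to the server (URL is an assumption) and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json", "accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

With a server running, `generate("My name is Olivier and I")` should return the decoded response body. Calling `build_prompt` on its own is enough to compare the templated string with the raw prompt.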
from text-generation-inference.
Maybe it's a CUDA graphs issue. Can you try disabling them?
Adding the chat template does not help (same output as before).
Setting --cuda-graphs 0 appears to be working! I'll keep experimenting and follow up, but I am getting full responses now.
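For reference, a minimal sketch of how the launch command might be assembled with that flag. The model ID below is an assumption (the thread does not say which checkpoint was served), and `launcher_command` is a hypothetical helper. Only the `--cuda-graphs 0` setting comes from this thread.

```python
import shlex


def launcher_command(model_id: str, cuda_graphs: str = "0") -> list[str]:
    """Build a text-generation-launcher argv with CUDA graphs set to the given value."""
    return [
        "text-generation-launcher",
        "--model-id", model_id,
        "--cuda-graphs", cuda_graphs,  # "0" is the setting reported to work in this thread
    ]


# Assumed model ID for illustration only; substitute the model you are serving.
command = launcher_command("CohereForAI/c4ai-command-r-plus")
print(shlex.join(command))
```

The printed line can be pasted into a shell, or the list passed to `subprocess.run` if the launcher is on the PATH.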
After testing, it appears that disabling CUDA graphs has fixed the issue.
@michaelthreet #1729 has fixed it if you want to try it out.
Hi, @OlivierDehaene
I'm seeing the same warnings in the log when serving command-r plus, but testing seems normal. Is that OK?
2024-04-25T02:46:33.000717Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|START_OF_TURN_TOKEN|>' was expected to have ID '255000' but was given ID 'None'
2024-04-25T02:46:33.000759Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|END_OF_TURN_TOKEN|>' was expected to have ID '255001' but was given ID 'None'
2024-04-25T02:46:33.000762Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|YES_TOKEN|>' was expected to have ID '255002' but was given ID 'None'
2024-04-25T02:46:33.000764Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|NO_TOKEN|>' was expected to have ID '255003' but was given ID 'None'
2024-04-25T02:46:33.000767Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|GOOD_TOKEN|>' was expected to have ID '255004' but was given ID 'None'
2024-04-25T02:46:33.000769Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|BAD_TOKEN|>' was expected to have ID '255005' but was given ID 'None'
2024-04-25T02:46:33.000771Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_TOKEN|>' was expected to have ID '255006' but was given ID 'None'
2024-04-25T02:46:33.000773Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|CHATBOT_TOKEN|>' was expected to have ID '255007' but was given ID 'None'
2024-04-25T02:46:33.000775Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|SYSTEM_TOKEN|>' was expected to have ID '255008' but was given ID 'None'
2024-04-25T02:46:33.000777Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_0_TOKEN|>' was expected to have ID '255009' but was given ID 'None'
2024-04-25T02:46:33.000779Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_1_TOKEN|>' was expected to have ID '255010' but was given ID 'None'
2024-04-25T02:46:33.000781Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_2_TOKEN|>' was expected to have ID '255011' but was given ID 'None'
2024-04-25T02:46:33.000802Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_3_TOKEN|>' was expected to have ID '255012' but was given ID 'None'
2024-04-25T02:46:33.000804Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_4_TOKEN|>' was expected to have ID '255013' but was given ID 'None'
2024-04-25T02:46:33.000806Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_5_TOKEN|>' was expected to have ID '255014' but was given ID 'None'
2024-04-25T02:46:33.000808Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_6_TOKEN|>' was expected to have ID '255015' but was given ID 'None'
2024-04-25T02:46:33.000810Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_7_TOKEN|>' was expected to have ID '255016' but was given ID 'None'
2024-04-25T02:46:33.000812Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_8_TOKEN|>' was expected to have ID '255017' but was given ID 'None'
2024-04-25T02:46:33.000815Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|USER_9_TOKEN|>' was expected to have ID '255018' but was given ID 'None'
2024-04-25T02:46:33.000817Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_0_TOKEN|>' was expected to have ID '255019' but was given ID 'None'
2024-04-25T02:46:33.000819Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_1_TOKEN|>' was expected to have ID '255020' but was given ID 'None'
2024-04-25T02:46:33.000821Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_2_TOKEN|>' was expected to have ID '255021' but was given ID 'None'
2024-04-25T02:46:33.000823Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_3_TOKEN|>' was expected to have ID '255022' but was given ID 'None'
2024-04-25T02:46:33.000825Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_4_TOKEN|>' was expected to have ID '255023' but was given ID 'None'
2024-04-25T02:46:33.000827Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_5_TOKEN|>' was expected to have ID '255024' but was given ID 'None'
2024-04-25T02:46:33.000829Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_6_TOKEN|>' was expected to have ID '255025' but was given ID 'None'
2024-04-25T02:46:33.000831Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_7_TOKEN|>' was expected to have ID '255026' but was given ID 'None'
2024-04-25T02:46:33.000834Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_8_TOKEN|>' was expected to have ID '255027' but was given ID 'None'
2024-04-25T02:46:33.000835Z WARN tokenizers::tokenizer::serialization: /usr/local/cargo/registry/src/index.crates.io-6f17d22bba15001f/tokenizers-0.19.1/src/tokenizer/serialization.rs:159: Warning: Token '<|EXTRA_9_TOKEN|>' was expected to have ID '255028' but was given ID 'None'
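To summarize a long run of warnings like the one above, a small sketch can pull out each token name and the ID it was expected to have. It relies only on the message format shown in these log lines. The names `parse_token_warnings` and `is_contiguous` are hypothetical. The sketch does not decide whether the warnings are harmless, it only shows which tokens and IDs are involved.

```python
import re

# Matches the message part of the tokenizers warning lines shown above.
WARNING_RE = re.compile(
    r"Token '(?P<token>[^']+)' was expected to have ID '(?P<expected>\d+)' "
    r"but was given ID '(?P<given>[^']+)'"
)


def parse_token_warnings(log_text: str) -> dict[str, int]:
    """Map each token named in a warning to the ID it was expected to have."""
    return {m["token"]: int(m["expected"]) for m in WARNING_RE.finditer(log_text)}


def is_contiguous(ids) -> bool:
    """Return True if the IDs form one unbroken ascending range with no gaps."""
    ordered = sorted(ids)
    if not ordered:
        return True
    return ordered == list(range(ordered[0], ordered[0] + len(ordered)))
```

Passing the full log text to `parse_token_warnings` and the resulting values to `is_contiguous` shows whether the affected IDs form a single block.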