Comments (7)
ref vLLM `include_stop_str_in_output`:
https://github.com/vllm-project/vllm/blob/c96fc067479453b02e92d9378eeeaebb6b3816de/vllm/sampling_params.py#L176-L183
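For context, vLLM's OpenAI-compatible server accepts this flag as a per-request field (assuming your vLLM build exposes it); a sketch of a completion request body, with the model name, prompt, and stop strings purely illustrative:

```json
{
  "model": "my-model",
  "prompt": "Q: What is LMDeploy?\nA:",
  "stream": true,
  "stop": ["\n\n"],
  "include_stop_str_in_output": true
}
```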
from lmdeploy.
Thank you for your feedback. I will take a look at `include_stop_str_in_output`. Since it is not clear whether OpenAI uses a 'stop str' or a 'stop token id', I will look into the behavior of their APIs with and without streaming.
from lmdeploy.
> it is not clear whether openai use 'stop str' or 'stop token id'

I forgot to mention, but `["the"]` works correctly as a `stop` param, so the fact that `["\n\n"]` does not work indicates to me that this issue is related to exact token matching/alignment.
from lmdeploy.
ref https://help.openai.com/en/articles/5072263-how-do-i-use-stop-sequences-in-the-openai-api
(screenshot: the stop-sequence example from the OpenAI help article linked above)
from lmdeploy.
Also, with vLLM, if `finish_reason` is `"stop"` and it was due to, for example, a `stop` parameter like `stop: ["\n"]`, then the event-stream message JSON also carries a `stop_reason` field, like this:

    ...
    "finish_reason": "stop",
    "stop_reason": "\n",
    ...

That is, it indicates which stop string caused the stop. This is a handy feature, and nice for compatibility with vLLM (ease of transition), but not strictly necessary if the `include_stop_str_in_output` feature is implemented.
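On the client side, that field can be read straight from the final streamed chunk; a minimal sketch (the chunk payload below mirrors the shape quoted above, with illustrative values):

```python
import json

# Final streamed chunk from vLLM when a stop string fired (shape as quoted above).
chunk = json.loads('{"finish_reason": "stop", "stop_reason": "\\n"}')

if chunk["finish_reason"] == "stop":
    # stop_reason names the stop string that ended generation.
    print(repr(chunk.get("stop_reason")))  # -> '\n'
```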
from lmdeploy.
For others who are hitting this issue but who desperately want to use LMDeploy: you can of course remove the `stop` parameter, manually check for the stop strings in the full generated text each time you receive a new token, and then manually abort the request when one of those stop strings is detected. That's my current workaround.
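The workaround above can be sketched as a small helper. The `token_stream` iterable here stands in for whatever streaming client you use (a hypothetical placeholder), and real code would also cancel the server-side request when the helper returns:

```python
def stream_until_stop(token_stream, stop_strings):
    """Accumulate streamed tokens and stop as soon as any stop string
    appears in the full generated text; return the text truncated at
    the earliest match (a real client should also abort the request)."""
    text = ""
    for token in token_stream:
        text += token
        cuts = [text.find(s) for s in stop_strings if s in text]
        if cuts:
            return text[:min(cuts)]
    return text

# "\n\n" spans two streamed tokens here, so per-token matching would
# miss it; matching against the accumulated text catches it.
tokens = ["Hello", " world", "\n", "\n", "ignored"]
print(stream_until_stop(tokens, ["\n\n"]))  # -> Hello world
```

Note the check runs against the accumulated text, not the latest token, precisely because a stop string like `"\n\n"` can straddle a token boundary.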
from lmdeploy.
In LMDeploy, each word in the `stop_words` list is expected to be tokenized to exactly ONE token id. Words that tokenize into multiple token ids are currently not supported as stop words. We have plans to resolve this, but it will take a while.
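A quick way to see whether a given stop word satisfies this single-token constraint is to count the token ids it encodes to. A minimal sketch with a toy greedy tokenizer (the vocabulary and `encode` function are assumptions for illustration; in practice you would use your model's real tokenizer):

```python
TOY_VOCAB = {"the": 1, "\n": 2}  # assumed vocabulary, for illustration only

def encode(text):
    """Greedy longest-match tokenization over the toy vocabulary."""
    ids, i = [], 0
    while i < len(text):
        for piece in sorted(TOY_VOCAB, key=len, reverse=True):
            if text.startswith(piece, i):
                ids.append(TOY_VOCAB[piece])
                i += len(piece)
                break
        else:
            i += 1  # unknown character: skip (toy behavior)
    return ids

def usable_stop_words(stop_words):
    """Keep only stop words that encode to exactly one token id."""
    return [w for w in stop_words if len(encode(w)) == 1]

print(usable_stop_words(["the", "\n\n"]))  # -> ['the']
```

Here `"the"` maps to one id and survives, while `"\n\n"` encodes to two ids and would be silently ineffective under the current constraint.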
from lmdeploy.