Comments (3)
Which version of vLLM are you using? Additionally, could you provide more detailed information about the error you're encountering?
from vllm.
Which version of vLLM are you using? Additionally, could you provide more detailed information about the error you're encountering?
vllm=0.5.4
Details: I used the llama-factory project to fine-tune qwen2-7b-instruct with LoRA, following LongLoRA, so the trainable parameters also include the norm and embed layers. I then deployed the base model plus the LoRA adapter (including norm and embed) with vLLM:
vllm serve /root/autodl-fs/llm_models/qwen/Qwen2-7B-Instruct --enable-lora --lora-modules bi-lora=/root/autodl-fs/saves/Qwen2-7B-Instruct/lora/sft/checkpoint-1500/ --port 6006
In the command above, /root/autodl-fs/saves/Qwen2-7B-Instruct/lora/sft/checkpoint-1500/ is the directory containing the LoRA weights (including norm and embed).
When I send the request body below, it fails with: RuntimeError: Loading lora /root/autodl-fs/saves/Qwen2-7B-Instruct/lora/sft/checkpoint-1500/ failed. Chatting with the base model works fine, however (with the model parameter set to /root/autodl-fs/llm_models/qwen/Qwen2-7B-Instruct).
{
    "model": "bi-lora",
    "prompt": "Who are you?",
    "temperature": 0.7,
    "top_p": 0.8,
    "repetition_penalty": 1.05,
    "max_tokens": 512
}
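For reference, a request body like the one above can be sent to vLLM's OpenAI-compatible completions endpoint as follows. This is a minimal sketch using only the standard library; it assumes the `vllm serve` command from earlier is running locally on port 6006 and that `bi-lora` is the adapter name registered via `--lora-modules`:

```python
import json
import urllib.request

# Payload mirroring the failing request above; "bi-lora" selects the
# LoRA adapter registered with --lora-modules in the serve command.
payload = {
    "model": "bi-lora",
    "prompt": "Who are you?",
    "temperature": 0.7,
    "top_p": 0.8,
    "repetition_penalty": 1.05,
    "max_tokens": 512,
}

req = urllib.request.Request(
    "http://localhost:6006/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would return the completion once the
# adapter loads; with the bug described above it surfaces the
# "Loading lora ... failed" RuntimeError from the server instead.
```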
from vllm.
This error is raised after catching an execution error from earlier code, so there should be error messages preceding this one in the log. Please check the log output again for the underlying root cause.
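To illustrate why the real failure appears earlier in the log: the pattern described above is Python exception chaining, where a generic error is raised *from* the original one. This is a hypothetical sketch, not vLLM's actual loading code:

```python
def load_lora(path: str) -> None:
    """Hypothetical loader that wraps the underlying failure,
    mimicking the 'Loading lora ... failed' message above."""
    try:
        # Stand-in for whatever actually failed (e.g. a missing tensor key).
        raise KeyError("embed_tokens.weight")
    except Exception as exc:
        raise RuntimeError(f"Loading lora {path} failed") from exc

try:
    load_lora("/root/autodl-fs/saves/Qwen2-7B-Instruct/lora/sft/checkpoint-1500/")
except RuntimeError as err:
    # The original exception survives as __cause__ and is printed
    # first in the traceback -- that is the message to investigate.
    root_cause = err.__cause__
    caught_message = str(err)
```

In the server log, the chained cause is printed above the final RuntimeError, which is why scrolling up in the log usually reveals what actually went wrong.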
from vllm.