Comments (3)
Which version of vLLM are you using? Additionally, could you provide more detailed information about the error you're encountering?
from vllm.
Which version of vLLM are you using? Additionally, could you provide more detailed information about the error you're encountering?
vllm=0.5.4
Details: I used the llama-factory project to fine-tune qwen2-7b-instruct with LoRA, following LongLoRA, so the trainable parameters also include the norm and embed layers. I then deployed the base model plus the LoRA adapter (including norm and embed) with vLLM:
vllm serve /root/autodl-fs/llm_models/qwen/Qwen2-7B-Instruct --enable-lora --lora-modules bi-lora=/root/autodl-fs/saves/Qwen2-7B-Instruct/lora/sft/checkpoint-1500/ --port 6006
In the command above, /root/autodl-fs/saves/Qwen2-7B-Instruct/lora/sft/checkpoint-1500/ is the directory containing the LoRA weights (including norm and embed).
When I send the request body below, it fails with: RuntimeError: Loading lora /root/autodl-fs/saves/Qwen2-7B-Instruct/lora/sft/checkpoint-1500/ failed. Chatting with the base model works fine, however (with the model parameter set to /root/autodl-fs/llm_models/qwen/Qwen2-7B-Instruct).
{
    "model": "bi-lora",
    "prompt": "Who are you?",
    "temperature": 0.7,
    "top_p": 0.8,
    "repetition_penalty": 1.05,
    "max_tokens": 512
}
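For reference, a request body like the one above can be sent to vLLM's OpenAI-compatible completions endpoint as follows. This is a minimal sketch using only the standard library; it assumes the `vllm serve` command from earlier is running locally on port 6006 and that `bi-lora` is the adapter name registered via `--lora-modules`:

```python
import json
import urllib.request

# Payload mirroring the failing request above; "bi-lora" selects the
# LoRA adapter registered with --lora-modules in the serve command.
payload = {
    "model": "bi-lora",
    "prompt": "Who are you?",
    "temperature": 0.7,
    "top_p": 0.8,
    "repetition_penalty": 1.05,
    "max_tokens": 512,
}

req = urllib.request.Request(
    "http://localhost:6006/v1/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# urllib.request.urlopen(req) would return the completion once the
# adapter loads; with the bug described above it surfaces the
# "Loading lora ... failed" RuntimeError from the server instead.
```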
from vllm.
This error is raised after catching an execution error from earlier code, so there should be error messages preceding this one in the log. Please check the log output again for the underlying root cause.
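To illustrate why the real failure appears earlier in the log: the pattern described above is Python exception chaining, where a generic error is raised *from* the original one. This is a hypothetical sketch, not vLLM's actual loading code:

```python
def load_lora(path: str) -> None:
    """Hypothetical loader that wraps the underlying failure,
    mimicking the 'Loading lora ... failed' message above."""
    try:
        # Stand-in for whatever actually failed (e.g. a missing tensor key).
        raise KeyError("embed_tokens.weight")
    except Exception as exc:
        raise RuntimeError(f"Loading lora {path} failed") from exc

try:
    load_lora("/root/autodl-fs/saves/Qwen2-7B-Instruct/lora/sft/checkpoint-1500/")
except RuntimeError as err:
    # The original exception survives as __cause__ and is printed
    # first in the traceback -- that is the message to investigate.
    root_cause = err.__cause__
    caught_message = str(err)
```

In the server log, the chained cause is printed above the final RuntimeError, which is why scrolling up in the log usually reveals what actually went wrong.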
from vllm.