Comments (1)
I had the same problem...
Traceback (most recent call last):
  File "./swift/demo_server_vllm_xyf.py", line 106, in get_all_component_res
    async for request_output in results_generator:
  File "./vllm/vllm/engine/async_llm_engine.py", line 673, in generate
    async for output in self._process_request(
  File "./vllm/vllm/engine/async_llm_engine.py", line 780, in _process_request
    raise e
  File "./vllm/vllm/engine/async_llm_engine.py", line 776, in _process_request
    async for request_output in stream:
  File "./vllm/vllm/engine/async_llm_engine.py", line 89, in __anext__
    raise result
  File "./vllm/vllm/engine/async_llm_engine.py", line 42, in _log_task_completion
    return_value = task.result()
  File "./vllm/vllm/engine/async_llm_engine.py", line 532, in run_engine_loop
    has_requests_in_progress = await asyncio.wait_for(
  File "/opt/conda/envs/infer/lib/python3.10/asyncio/tasks.py", line 445, in wait_for
    return fut.result()
  File "./vllm/vllm/engine/async_llm_engine.py", line 510, in engine_step
    self._request_tracker.process_request_output(
  File "./vllm/vllm/engine/async_llm_engine.py", line 130, in process_request_output
    self._request_streams[request_id].put(request_output)
KeyError: 'cc2580f508eb473285a9e1bb47a6714f'
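For anyone reading the last frame: the error is a plain dict lookup on a request id that is no longer (or was never) registered. Below is a small standalone sketch of that failure mode. It is not vLLM's code; `RequestTracker` and its methods are hypothetical names made up for illustration, and it is only an assumption that the id in this traceback was removed by an abort or client disconnect before the output arrived.

```python
import asyncio


class RequestTracker:
    """Toy stand-in (hypothetical) mapping request ids to output queues."""

    def __init__(self):
        self._request_streams = {}

    def add_request(self, request_id):
        stream = asyncio.Queue()
        self._request_streams[request_id] = stream
        return stream

    def abort_request(self, request_id):
        # Dropping the stream mimics an abort or a client disconnect.
        self._request_streams.pop(request_id, None)

    def process_output_unguarded(self, request_id, output):
        # Same shape as the failing line: a direct dict lookup,
        # which raises KeyError for an unknown request id.
        self._request_streams[request_id].put_nowait(output)

    def process_output_guarded(self, request_id, output):
        # Defensive variant: skip outputs whose stream is already gone.
        stream = self._request_streams.get(request_id)
        if stream is None:
            return False
        stream.put_nowait(output)
        return True


tracker = RequestTracker()
tracker.add_request("req-1")
tracker.abort_request("req-1")

try:
    tracker.process_output_unguarded("req-1", "late output")
except KeyError as exc:
    print("KeyError for request id:", exc)

print("guarded delivery succeeded:",
      tracker.process_output_guarded("req-1", "late output"))
```

If an output can arrive after its request id has been dropped, the unguarded lookup fails in the same way as the traceback above, while the guarded version just discards the late output.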