Related Issues (20)
- [Feature]: Chat Completion with Parallel Function Calling
- [Performance]: Llama 3 70B; vLLM does not scale beyond TP=4
- [Bug]: vLLM 0.5.5 and FlashInfer0.1.6
- [Bug]: dtype float16 Failure to use enable-chunked-prefill
- [Bug]: vllm cpu installation build from source error
- [Bug]: Unable to serve minicpm-v2.6 with GGUF quantization
- [Bug]: InternVL2-26B tensor_parallel_size=4, AssertionError: 25 is not divisible by 4
- [Installation]: Using Image to build from source get error
- [Bug]: When use `guided choice` feature, vllm.engine.async_llm_engine.AsyncEngineDeadError
- [Bug]: ValueError: Queue <multiprocessing.queues.Queue object at 0x7f5703d2d0f0> is closed;zipfile.BadZipFile: Bad magic number for file header
- [New Model]: ValueError: Model architectures ['UltravoxModel'] are not supported for now.
- [Performance]: Too slow when serving for large number of prompts.
- [Usage]: Can vLLM handle multi-turn and multi-instance at the same time?
- [Bug]: Docker image for 0.5.4 does not include package timm==0.9.10 to run MiniCPMV
- [Bug]: The error encountered when deploying the MiniCPM-2B model in a CPU environment using the VLLM framework
- [Usage]: How to output logprob for each possiable token about classification or determin task?
- [Bug]: JAMBA 1.5 - Beam Search Returns a few characters then stops early
- [Bug]: Mismatch in TTFT count and number of successful requests completed
- [Bug]: Loading GPTQ-quantized GPTBigCode fails in weight_loader_v2 of qptq_marlin
- [Doc]: `LLM.chat()` docstring incorrectly suggests multiple chats can be generated in one call