Related Issues (20)
- [New Model]: MiniCPM-V-2_6-int4
- [Usage]: Potential Hardware Failure when running vllm
- [New Model]: ValueError: Model architectures ['PhiMoEForCausalLM'] are not supported for now
- [Bug]: vLLM server not supporting stabilityai/stablelm-3b-4e1t model on CPU
- [Usage]: Is there an option to obtain attention matrices during inference, similar to the output_attentions=True parameter in the transformers package?
- [Usage]: About bitsandbytes
- [Feature]: phi-3.5 is a strong model for its size, including vision support. Has multi-image support, but vllm does not support
- [Usage]: Wait for the response for each prediction
- [Bug]: Requesting Prompt Logprobs with an MLP Speculator Crashes the Server
- [Installation]: Failed to build vLLM from source due to https://github.com/vllm-project/vllm/pull/7174 by bisecting the most recent changes
- [Bug]: llama 3.1 405B RuntimeError: start (1024) + length (256) exceeds dimension size (1024)
- [Performance]: MLP speculator
- [Usage]: How do I configure Phi-3-vision for high throughput?
- [Feature]: Integrate with `Formatron`
- Supporting new vision language model - https://huggingface.co/OpenGVLab/InternVL2-26B
- Include Llama-405B in nightly benchmarks?
- [Bug]: for mistral-7B, local batch inference mode causes OOM error, while serving mode does not cause error
- [Bug]: gpu-memory-utilization does not pickup enough GPU memory
- [RFC]: Keep a Changelog & Add FAQs in the Documentation
- [Misc]: How to force generate a fixed response from llama3