Comments (6)
I’m working on this and will hopefully upstream it next week. We now have EAGLE speculative decoding, but without tree decoding. Supporting tree decoding also requires kernel support.
from mlc-llm.
Initial support for Medusa was added in #2337. Tree decoding is not yet supported, as it requires more work.
+1
Hi @vinx13,
Will the tree decoding kernel be released next week, or will it take longer?
Glad to hear that, @vinx13, and thanks a bunch for your quick reply! Looking forward to seeing your pull request. Are you also working on tree-based attention?
Thanks a lot! We'll try Medusa list decoding first.