Anything you want to discuss about vllm. As a beginner, there are

[Misc]: need "first good issue" about vllm HOT 10 CLOSED

HarryWu99 commented on June 25, 2024

[Misc]: need "first good issue"

from vllm.

Comments (10)

robertgshaw2-neuralmagic commented on June 25, 2024

https://github.com/vllm-project/vllm/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22

robertgshaw2-neuralmagic commented on June 25, 2024

@HarryWu99 how big of a feature do you want to take on?

HarryWu99 commented on June 25, 2024

> how big of a feature do you want to take on?

@robertgshaw2-neuralmagic I don't know yet. 😂 I'm still getting familiar with the code. For now, I'm interested in sparse KV cache, and I hope to develop a related feature if my ability allows.

Before that, I'm also happy to take on some simple bug fixes.

robertgshaw2-neuralmagic commented on June 25, 2024

@HarryWu99 cool! A good one to take on eventually would be StreamingLLM in BlockManagerv2

https://arxiv.org/abs/2309.17453

When I was ramping up on the codebase, I found that looking into metrics / monitoring is a decent place to start, since you have to touch many pieces.

I have a half-finished branch that I could use some help pushing over the line.

robertgshaw2-neuralmagic commented on June 25, 2024

In particular, https://github.com/ronensc/vllm/pull/1/files

I merged some of this PR in today, but the rest of the new metrics still need to be rebased onto main and then tested for correctness.

If you want to pick it up to get the ball rolling let me know

HarryWu99 commented on June 25, 2024

@robertgshaw2-neuralmagic Yes, I'd love to give it a try.

robertgshaw2-neuralmagic commented on June 25, 2024

Cool - I’m going to pick this back up on Wednesday or so, so please keep me in the loop on your progress.

Kaiyang-Chen commented on June 25, 2024

> StreamingLLM in BlockManagerv2

@robertgshaw2-neuralmagic Hi, I am interested in implementing this feature. However, until the FlashInfer change that enables the attention kernel to apply rotary embeddings in place is merged, is there a decent way for us to do that?
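
For context on why rotary embeddings come up here: as described in the StreamingLLM paper linked earlier in the thread, the method keeps the first few tokens as "attention sinks" plus a rolling window of the most recent tokens, and it assigns positions by slot within the cache instead of by position in the original text. The Python sketch below only illustrates that bookkeeping. The function name and its parameters are hypothetical and chosen for illustration; nothing here is vLLM or FlashInfer API.

```python
def streaming_llm_kept_tokens(
    seq_len: int, num_sinks: int = 4, window: int = 8
) -> list[tuple[int, int]]:
    """Return (original_index, cache_position) pairs for the tokens kept.

    Hypothetical helper: keeps the first `num_sinks` tokens and the last
    `window` tokens, then renumbers them by their slot in the cache.
    """
    if seq_len <= num_sinks + window:
        # Everything still fits, so nothing is evicted.
        kept = list(range(seq_len))
    else:
        # Keep the attention sinks and the most recent `window` tokens.
        kept = list(range(num_sinks)) + list(range(seq_len - window, seq_len))
    # The cache position is the slot index, not the original text position.
    return [(orig, pos) for pos, orig in enumerate(kept)]


if __name__ == "__main__":
    # Six tokens, two sinks, window of three: token 2 is evicted.
    print(streaming_llm_kept_tokens(6, num_sinks=2, window=3))
    # [(0, 0), (1, 1), (3, 2), (4, 3), (5, 4)]
```

In this toy example, tokens 3, 4 and 5 move to cache positions 2, 3 and 4 once token 2 is evicted. Keys cached with rotary embeddings for their original positions would therefore no longer match, which is why the question above asks about applying rotary embeddings inside the attention kernel.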

HarryWu99 commented on June 25, 2024

> StreamingLLM in BlockManagerv2

I am interested in it, too.😆

robertgshaw2-neuralmagic commented on June 25, 2024

@HarryWu99 we also have some ongoing work for embedding models that could use a hand
