Comments (3)

haichuan1221 commented on July 17, 2024

Your current environment

I set the vLLM attention backend to FlashInfer, but I get the error below:

INFO:     127.0.0.1:38616 - "POST /v1/completions HTTP/1.1" 200 OK
ERROR:    Exception in ASGI application
Traceback (most recent call last):
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/responses.py", line 265, in __call__
    await wrap(partial(self.listen_for_disconnect, receive))
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/responses.py", line 261, in wrap
    await func()
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/responses.py", line 238, in listen_for_disconnect
    message = await receive()
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 553, in receive
    await self.message_event.wait()
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/asyncio/locks.py", line 226, in wait
    await fut
asyncio.exceptions.CancelledError: Cancelled by cancel scope 7f47bc642f40

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/uvicorn/protocols/http/httptools_impl.py", line 399, in run_asgi
    result = await app(  # type: ignore[func-returns-value]
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/uvicorn/middleware/proxy_headers.py", line 70, in __call__
    return await self.app(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/fastapi/applications.py", line 1054, in __call__
    await super().__call__(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/applications.py", line 123, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/middleware/errors.py", line 186, in __call__
    raise exc
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/middleware/errors.py", line 164, in __call__
    await self.app(scope, receive, _send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/middleware/cors.py", line 85, in __call__
    await self.app(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/middleware/exceptions.py", line 65, in __call__
    await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/routing.py", line 756, in __call__
    await self.middleware_stack(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/routing.py", line 776, in app
    await route.handle(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/routing.py", line 297, in handle
    await self.app(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/routing.py", line 77, in app
    await wrap_app_handling_exceptions(app, request)(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/_exception_handler.py", line 64, in wrapped_app
    raise exc
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
    await app(scope, receive, sender)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/routing.py", line 75, in app
    await response(scope, receive, send)
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/starlette/responses.py", line 265, in __call__
    await wrap(partial(self.listen_for_disconnect, receive))
  File "/mnt/harddisk/miniconda3/envs/flashinfer/lib/python3.9/site-packages/anyio/_backends/_asyncio.py", line 680, in __aexit__
    raise BaseExceptionGroup(
exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
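The traceback ends in an ExceptionGroup that reports "1 sub-exception" without showing it, so the underlying error is hidden. As a debugging aid, here is a minimal sketch of a helper that walks a (possibly nested) exception group and yields the leaf exceptions. The helper name `iter_leaf_exceptions` is hypothetical; it only assumes that group objects expose an `exceptions` attribute, as both the built-in `ExceptionGroup` and the `exceptiongroup` backport do.

```python
from typing import Iterator


def iter_leaf_exceptions(exc: BaseException) -> Iterator[BaseException]:
    """Yield the non-group exceptions contained in exc, depth first.

    Hypothetical helper: relies only on the `exceptions` attribute that
    exception groups carry, so it does not import any group class.
    """
    nested = getattr(exc, "exceptions", None)
    if nested is None:
        # Plain exception: it is its own leaf.
        yield exc
        return
    for inner in nested:
        yield from iter_leaf_exceptions(inner)
```

Calling `list(iter_leaf_exceptions(group))` inside an `except` block and logging each item should surface the error that the one-line summary above omits.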

🐛 Describe the bug

First, I set the vLLM attention backend to FlashInfer:

export VLLM_ATTENTION_BACKEND=FLASHINFER

Second, I started the vLLM server with:

python -m vllm.entrypoints.openai.api_server --model LLM-Research/Meta-Llama-3-70B-Instruct --tensor-parallel-size 4 --trust-remote-code --max-model-len 8192 --port 30000 --swap-space 16 --disable-log-requests --enable-prefix-caching --enforce-eager

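It turns out that the FlashInfer backend does not support prefix caching, so combining it with --enable-prefix-caching appears to be what triggers the error above.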
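To catch this combination before the server starts, a small pre-flight check can compare the environment variable against the launch arguments. The following is only a sketch based on the report in this thread: the function name `find_backend_conflict` is hypothetical and not part of vLLM, and whether the incompatibility still exists depends on the vLLM version in use.

```python
import os
from typing import Mapping, Optional, Sequence


def find_backend_conflict(
    argv: Sequence[str],
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return a warning if FlashInfer is combined with prefix caching.

    Hypothetical helper, not a vLLM API. It encodes only the single
    incompatibility reported in this thread.
    """
    env = os.environ if env is None else env
    backend = env.get("VLLM_ATTENTION_BACKEND", "").upper()
    prefix_caching = "--enable-prefix-caching" in argv
    if backend == "FLASHINFER" and prefix_caching:
        return (
            "VLLM_ATTENTION_BACKEND=FLASHINFER is set together with "
            "--enable-prefix-caching; this combination was reported to fail."
        )
    return None
```

For the launch command in this report, passing the server arguments and the current environment would return the warning. Dropping `--enable-prefix-caching` or unsetting `VLLM_ATTENTION_BACKEND` makes the check return `None`.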

haichuan1221 commented on July 17, 2024

problem solved

codevoyager1984 commented on July 17, 2024

"problem solved"

I am encountering the same errors here. Can you share how you solved this? Many thanks.
