
Comments (3)

alugowski commented on August 16, 2024

This error is in the profile_run used to determine memory usage. If I patch the code to ignore the error and, a few lines below, patch the model length to make raise_if_cache_size_invalid happy, the model starts. Sure, it won't reach the full 1M context, but it will work with 200k on an 80GB GPU.

This will be more pressing for users of small GPUs as the popular models increase their context lengths beyond 8k.

I'm happy to submit my monkey patches if there isn't already a plan to support large context models.

from vllm.

DarkLight1337 commented on August 16, 2024

You can manually set --max-model-len to reduce the context length.

Not sure whether it's a good idea to automatically limit the context length based on available memory. @simon-mo any thoughts?


alugowski commented on August 16, 2024

> You can manually set --max-model-len to reduce the context length.
>
> Not sure whether it's a good idea to automatically limit the context length based on available memory. @simon-mo any thoughts?

Agreed that a purely automatic setting may give folks the wrong impression that they can use the full context of the model even if their hardware won't allow it. One alternative is --max-model-len max that would start the model no matter what and report the actual max context in the logs.

Right now someone must start vllm, see the crash, parse the max context size out of the log, and pass that to --max-model-len. But that only works if profile_run() doesn't OOM with the exception in the OP. If it does, the user must guess the max model len, because the log message with the actual max is printed later and depends on profile_run() succeeding.
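
To make the --max-model-len max idea concrete, here is a rough sketch of how such a setting could be resolved. This is not vLLM code: KVCacheShape and resolve_max_model_len are hypothetical names, and the sketch assumes the KV cache is the only per-token cost (2 tensors per layer, key and value) and that the memory left for the cache is already known. The layer/head numbers in the example are illustrative only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class KVCacheShape:
    """Hypothetical description of a model's KV cache layout."""
    num_layers: int
    num_kv_heads: int
    head_dim: int
    dtype_bytes: int = 2  # assumption: fp16/bf16 cache entries

    def bytes_per_token(self) -> int:
        # Factor of 2: one key tensor and one value tensor per layer.
        return 2 * self.num_layers * self.num_kv_heads * self.head_dim * self.dtype_bytes


def resolve_max_model_len(
    requested: Union[int, str],
    model_max_len: int,
    kv_cache_bytes: int,
    shape: KVCacheShape,
) -> int:
    """Resolve a requested context length against the memory that is available.

    requested is either an integer or the literal string "max" (a hypothetical
    value, mirroring the proposal above).
    """
    fits = kv_cache_bytes // shape.bytes_per_token()
    if fits <= 0:
        raise ValueError("not enough memory for even one token of KV cache")
    if requested == "max":
        # Start no matter what, capped by both the model and the hardware.
        return min(model_max_len, fits)
    requested = int(requested)
    if requested > fits:
        raise ValueError(
            f"requested max model len {requested} exceeds the {fits} tokens "
            "that fit in the KV cache"
        )
    return requested


if __name__ == "__main__":
    # Illustrative numbers: 32 layers, 8 KV heads, head_dim 128, fp16 cache,
    # 24 GiB left for the KV cache, model advertises a 1M context.
    shape = KVCacheShape(num_layers=32, num_kv_heads=8, head_dim=128)
    actual = resolve_max_model_len("max", 1_048_576, 24 * 2**30, shape)
    print(f"actual max context: {actual} tokens")
```

With "max", the resolved value would be reported in the logs instead of raising, so nobody has to crash first to learn the number. An explicit integer that doesn't fit still fails loudly, which avoids giving the impression that the full context is usable.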

