Comments (5)

Andrey36652 avatar Andrey36652 commented on August 28, 2024

@Lissanro wouldn't that be killed by PCIe latency?

from segmoe.

Lissanro avatar Lissanro commented on August 28, 2024

I think PCI-E latency is only relevant during training (and even there it could be quite good if PCI-E 4.0 or PCI-E 5.0 with a sufficient number of lanes is used, or NVLink in the case of a pair of 3090 cards).

For inference, PCI-E latency should not matter much: the experts work independently once they are fully loaded into VRAM. This is how, for example, Mixtral (8x7B MoE) can run at 4-bit or higher quantization on 24GB cards. Since it cannot fit in the 24GB of a single card, it gets split across more than one GPU, and the speed is comparable to running on a single GPU.
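As a rough back-of-the-envelope check of the point above, the sketch below estimates weight storage at a given quantization width and how many 24GB cards that implies. The parameter count (about 46.7B in total for Mixtral 8x7B), the 4 GiB per-card reserve for activations, KV cache and runtime overhead, and the helper names `weight_gib` and `gpus_needed` are assumptions for illustration, not measurements.

```python
import math


def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GiB at the given quantization width."""
    return n_params * bits_per_weight / 8 / 2**30


def gpus_needed(n_params: float, bits_per_weight: float,
                card_gib: float = 24.0, reserve_gib: float = 4.0) -> int:
    """Cards needed if each card keeps `reserve_gib` free for everything
    that is not weights (the reserve size is an assumption)."""
    usable = card_gib - reserve_gib
    return math.ceil(weight_gib(n_params, bits_per_weight) / usable)


MIXTRAL_PARAMS = 46.7e9  # approximate total parameter count (assumption)

if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(bits, "bit:",
              round(weight_gib(MIXTRAL_PARAMS, bits), 1), "GiB of weights,",
              gpus_needed(MIXTRAL_PARAMS, bits), "x 24GB cards")
```

Real quantization formats also store scales and other metadata, so the effective bits per weight is usually somewhat above the nominal figure and these numbers are a lower bound.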

Potentially, it could be even better if parallelism across multiple GPUs were implemented (for the case where one expert is fully allocated on one GPU, another expert on a different GPU, and the gate network decides it needs both). In any case, even a naive sequential implementation (processing experts one by one even if they are on different GPUs) is still better than crashing with OOM, and in terms of speed it should be at least comparable to running on a single GPU with more VRAM.
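The naive sequential variant described above could look roughly like the following pure-Python sketch. It is not SegMoE's implementation: the names `PlacedExpert`, `place_round_robin`, `top_k_gate` and `moe_forward_sequential` are hypothetical, the experts are plain functions standing in for network blocks, and the device strings are only labels, so no GPU is required to run it.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class PlacedExpert:
    device: str                   # e.g. "cuda:0"; only a label in this sketch
    fn: Callable[[float], float]  # stand-in for the expert's forward pass


def place_round_robin(fns: Sequence[Callable[[float], float]],
                      devices: Sequence[str]) -> List[PlacedExpert]:
    """Assign whole experts to devices in round-robin order."""
    return [PlacedExpert(devices[i % len(devices)], fn)
            for i, fn in enumerate(fns)]


def top_k_gate(logits: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k largest gate logits and softmax-normalise over those k."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    peak = max(logits[i] for i in top)
    exps = [math.exp(logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward_sequential(x: float, experts: Sequence[PlacedExpert],
                           logits: Sequence[float], k: int = 2):
    """Run the selected experts one by one, wherever they are placed."""
    out, visited = 0.0, []
    for idx, weight in top_k_gate(logits, k):
        expert = experts[idx]
        visited.append(expert.device)
        # In a real model the input would be copied to expert.device here
        # and the result copied back before the weighted sum.
        out += weight * expert.fn(x)
    return out, visited


experts = place_round_robin(
    [lambda x: x + 1, lambda x: 2 * x, lambda x: x * x, lambda x: -x],
    ["cuda:0", "cuda:1"],
)
result, devices_used = moe_forward_sequential(3.0, experts, [0.0, 1.0, 1.0, -5.0])
```

The parallel variant would launch the selected experts on their respective devices at the same time and combine the outputs afterwards; the loop above simply visits them in turn, which is slower but never needs more than one expert's worth of work in flight.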

Warlord-K avatar Warlord-K commented on August 28, 2024

Thanks for the suggestion. We are working on optimizing the memory usage, but feel free to create a PR for multi-GPU usage.

g29times avatar g29times commented on August 28, 2024

@Warlord-K Hi Admin, would it be possible for the homepage README to state the GPU requirements or specifications?

Warlord-K avatar Warlord-K commented on August 28, 2024

@g29times I have added the GPU requirements, thanks for the suggestion!
