Comments (8)
I still have this problem, even with commit 0ac38d3. Most recently, it happens with text-generation-launcher --model-id facebook/galactica-120b --num-shard 1 --quantize for me.
@ScientiaEtVeritas if you are using a 4000-series card, try setting the NCCL_P2P_DISABLE=1 environment variable. I was able to get further with that. It's a software bug on RTX 4000-series cards: the driver reports P2P as available, but it does not actually work on that hardware.
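The workaround above can be sketched as a launch command. This is a minimal example assuming the same model and flags as the report in this thread; NCCL_P2P_DISABLE=1 is a standard NCCL environment variable that turns off peer-to-peer GPU transfers:

```shell
# Disable NCCL peer-to-peer transfers before launching: RTX 4000-series
# cards report P2P as available even though it does not work, which makes
# NCCL hang. With P2P disabled, NCCL falls back to shared-memory transport.
NCCL_P2P_DISABLE=1 text-generation-launcher \
    --model-id facebook/galactica-120b \
    --num-shard 1 \
    --quantize
```

The variable only needs to be set for the launcher process; it is inherited by the sharded workers it spawns.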
from text-generation-inference.
What commit are you on? It doesn't seem to be the latest one.
I'm on the latest release at least (v0.3.0, c720555).
I published a new release. Can you try with this one?
I was unable to reproduce your issue on my end.
Closing as stale.
Same issue here. While it's trying to load, GPU usage is maxed out but VRAM usage stays low (around 1.5 GB).
Yes, if you see that only some of the GPUs are at 100% utilization while the others sit idle, it is usually an NCCL issue.
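The symptom described here can be checked from the command line. This is a sketch, not an official diagnostic from the project: the nvidia-smi query flags are standard, and NCCL_DEBUG=INFO is a standard NCCL environment variable; the --num-shard value is just an illustrative multi-GPU setting:

```shell
# Snapshot of per-GPU utilization and memory. During a healthy sharded
# load every GPU shows activity; an NCCL hang leaves one GPU pinned at
# 100% while the others sit idle with low memory use.
nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv

# To see what NCCL is actually doing, relaunch with its debug logging on;
# the log shows which transports (P2P, shared memory, network) it picked.
NCCL_DEBUG=INFO text-generation-launcher \
    --model-id facebook/galactica-120b \
    --num-shard 2 \
    --quantize
```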
from text-generation-inference.