b-fission commented on July 28, 2024

You say you selected the Multi-GPU checkbox? Just unselect it, and it should be able to launch with the given GPU IDs in the CUDA_VISIBLE_DEVICES var for you. If it's already unselected, it sounds like the option might be turned on elsewhere like the default config for accelerate.

from kohya_ss.

rafstahelin commented on July 28, 2024

Makes sense. Will try now. BTW, do you know a command to follow each GPU's individual process, the way I can monitor the steps when I launch the CUDA process in an activated env?

b-fission commented on July 28, 2024

I don't know of any easy way to track multiple training processes.

Maybe you could launch several instances of kohya gui in separate terminal shells and note which port numbers they use (7860, 7861, etc), then launch training from each gui session, and observe from each terminal.

But if you're manually running the training commands from a shell instead of the web gui, have you tried using multiplexers like screen or tmux? It'd allow you to have split-screen terminals without requiring a full desktop session or window manager.
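If you go the manual route, one way to keep the runs apart is to start one process per GPU and give each its own log file. The following is only a minimal sketch under assumptions: `launch_per_gpu`, `wait_all`, the `logs` directory, and the stand-in `demo_command` are hypothetical names for illustration, and the stand-in simply prints the GPU it can see rather than running a real training command.

```python
import os
import subprocess
import sys


def launch_per_gpu(gpu_ids, command, log_dir="logs"):
    """Start one copy of `command` per GPU, each writing to its own log file."""
    os.makedirs(log_dir, exist_ok=True)
    jobs = []
    for gpu_id in gpu_ids:
        # Each child sees only its own GPU through CUDA_VISIBLE_DEVICES.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_id))
        log_path = os.path.join(log_dir, f"gpu{gpu_id}.log")
        log_file = open(log_path, "w")
        proc = subprocess.Popen(
            command, env=env, stdout=log_file, stderr=subprocess.STDOUT
        )
        jobs.append((gpu_id, proc, log_file, log_path))
    return jobs


def wait_all(jobs):
    """Wait for every job to finish and return {gpu_id: exit_code}."""
    codes = {}
    for gpu_id, proc, log_file, _log_path in jobs:
        codes[gpu_id] = proc.wait()
        log_file.close()
    return codes


# Hypothetical stand-in for a real training command: it only reports
# which GPU ID the child process was given.
demo_command = [
    sys.executable,
    "-c",
    "import os; print('visible GPU:', os.environ['CUDA_VISIBLE_DEVICES'])",
]

jobs = launch_per_gpu([0, 1], demo_command)
print(wait_all(jobs))
```

Each log can then be followed in its own screen or tmux pane, for example with `tail -f logs/gpu1.log`.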

rafstahelin commented on July 28, 2024

I unchecked Multi GPU and selected GPU ID 1 (I currently have two GPUs on runpod, so 0 and 1).

But I get the following message, which doesn't make sense:

The following values were not passed to `accelerate launch` and had defaults used instead:
                More than one GPU was found, enabling multi-GPU training.
                If this was unintended please pass in `--num_processes=1`.
To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.

This worked correctly. It automatically assigned GPU 1 instead of 0.

But to follow the exposed port, I suppose I would enter the port under Main process Port.

My only question is how I would then open that port's terminal.

b-fission commented on July 28, 2024

> But I get the following message, which doesn't make sense:
>
> The following values were not passed to `accelerate launch` and had defaults used instead:
>                 More than one GPU was found, enabling multi-GPU training.
>                 If this was unintended please pass in `--num_processes=1`.
> To avoid this warning pass in values for each of the problematic parameters or run `accelerate config`.

Did you set the "Number of processes" option to 1?

[image: accel]
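For anyone running from a shell rather than the GUI, the same two settings (one visible GPU, one process) can be sketched as follows. This is only an illustration under assumptions: `build_single_gpu_launch` is a hypothetical helper and `train_script.py` is a placeholder script name. The `--num_processes=1` flag is the one named in the warning above. The sketch only builds the environment and command line; it does not run anything.

```python
import os
import shlex


def build_single_gpu_launch(gpu_id, train_script, script_args=()):
    """Return (env, argv) for a one-process accelerate run pinned to one GPU."""
    env = dict(os.environ)
    # Only the listed GPU is visible to the child process;
    # inside that process it is enumerated as device 0.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    # --num_processes=1 is the flag the accelerate warning asks for.
    argv = ["accelerate", "launch", "--num_processes=1", train_script, *script_args]
    return env, argv


# "train_script.py" is a placeholder, not a real file name from this thread.
env, argv = build_single_gpu_launch(1, "train_script.py")
print(f"CUDA_VISIBLE_DEVICES={env['CUDA_VISIBLE_DEVICES']} {shlex.join(argv)}")
# prints: CUDA_VISIBLE_DEVICES=1 accelerate launch --num_processes=1 train_script.py
```

The returned pair could be passed to `subprocess.run(argv, env=env)` to actually start the run.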

> But to follow the exposed port, I suppose I would enter the port under Main process Port.
>
> My only question is how I would then open that port's terminal.

What are you using to get a terminal? Is it SSH? If so, you should be able to open more SSH sessions and start additional kohya gui instances (without needing to change any server configs). That's essentially what I suggested earlier with multiple ports and terminal shells.

b-fission commented on July 28, 2024

I think you've got the right idea. Each kohya instance serves its gui on a new port number (and URL), which is listed in the log output.

I don't know how it behaves on runpod (haven't used it), but if I run a local instance of the gui, the gradio port number starts at 7860. Running more instances bumps it to 7861 and 7862, each with its own URL, all of which I can access directly in my web browser.

Using shell=True when running external commands...
IMPORTANT: You are using gradio version 4.26.0, however version 4.29.0 is available, please upgrade.
--------
Running on local URL:  http://127.0.0.1:7860
Running on public URL: https://_______.gradio.live
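The port bumping can be pictured with a small sketch: try 7860 first and move up until a port is free. This is only an illustration of the behaviour described above, not Gradio's actual implementation. `first_free_port` and its `tries` limit are hypothetical.

```python
import socket


def first_free_port(start=7860, tries=100, host="127.0.0.1"):
    """Return the first port at or above `start` that can be bound on `host`."""
    for port in range(start, start + tries):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                sock.bind((host, port))
            except OSError:
                # Port is already in use (for example by another gui instance).
                continue
            return port
    raise RuntimeError(f"no free port in {start}..{start + tries - 1}")


# Hold one port open to stand in for a running instance, then ask again.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as held:
    held.bind(("127.0.0.1", first_free_port()))
    held.listen()
    print("first instance on", held.getsockname()[1])
    print("a second instance would get", first_free_port())
```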

Now if you're talking about what to do with the "Main process port" option, you can ignore that. It's the port accelerate uses to coordinate processes in distributed training (multiple GPUs, possibly across different machines), which is not what we're doing here.
