Comments (9)

FrankLeeeee commented on May 13, 2024

The commands for ViT Hybrid Parallel were updated in this PR.

from colossalai.

kurisusnowdeng commented on May 13, 2024

> Hi @kurisusnowdeng, can you fix this issue in the benchmark repository? I have fixed the one in the example repository.

@FrankLeeeee In the benchmarks it is basically consistent: add --from_torch if using torchrun; otherwise, Colossal-AI launches in the standard way. However, in my opinion, we had better make Docker the first choice for running benchmarks and examples, since that also makes it easier to keep the environment consistent. What do you think?
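A rough sketch of the `--from_torch` distinction described above: under torchrun, the rank and world size come from environment variables that torchrun exports (`RANK`, `LOCAL_RANK`, `WORLD_SIZE`, `MASTER_ADDR`, `MASTER_PORT`); otherwise they have to be supplied explicitly. The helper name `resolve_launch_settings` and every flag other than `--from_torch` are hypothetical, and the snippet only collects the settings. It does not call Colossal-AI and is not the benchmark repository's actual code.

```python
import argparse
import os


def build_parser():
    parser = argparse.ArgumentParser()
    # --from_torch is the flag mentioned in this thread.
    parser.add_argument("--from_torch", action="store_true")
    # The flags below are hypothetical, used only for illustration.
    parser.add_argument("--rank", type=int, default=0)
    parser.add_argument("--world_size", type=int, default=1)
    parser.add_argument("--host", type=str, default="127.0.0.1")
    parser.add_argument("--port", type=int, default=29500)
    return parser


def resolve_launch_settings(argv, environ=None):
    """Return the distributed settings a training script would launch with."""
    environ = os.environ if environ is None else environ
    args = build_parser().parse_args(argv)
    if args.from_torch:
        # torchrun exports these variables for every worker process.
        return {
            "rank": int(environ["RANK"]),
            "local_rank": int(environ["LOCAL_RANK"]),
            "world_size": int(environ["WORLD_SIZE"]),
            "host": environ["MASTER_ADDR"],
            "port": int(environ["MASTER_PORT"]),
        }
    # Without torchrun, fall back to values passed on the command line.
    return {
        "rank": args.rank,
        "local_rank": None,
        "world_size": args.world_size,
        "host": args.host,
        "port": args.port,
    }
```

A script could then pass the resulting values to whichever launch function it uses, so the same entry point works with and without torchrun.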

FrankLeeeee commented on May 13, 2024

@kurisusnowdeng Can you fix the benchmark README?

FrankLeeeee commented on May 13, 2024

Hi @kurisusnowdeng, can you fix this issue in the benchmark repository? I have fixed the one in the example repository.

FrankLeeeee commented on May 13, 2024

I think what @binmakeswell means is that we should provide sample commands for the different launchers, for clarity. I am OK with Docker if the goal is to give users an environment with pre-installed dependencies. The problem with Docker is that it can only run on a single node if we provide a pre-defined entrypoint command. In a multi-node environment, we still need to use srun or mpirun to start the container, and this may conflict with the entrypoint command.
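To illustrate what "sample commands for the different launchers" could look like, here is a minimal sketch that assembles one command line per launcher. The script name `train.py`, the helper `sample_command`, and the default node and GPU counts are assumptions for illustration only; the launcher options used (`--nnodes` and `--nproc_per_node` for torchrun, `--nodes` and `--ntasks-per-node` for srun, `-np` for mpirun) are standard ones, but a real multi-node run would likely need further settings such as a rendezvous address.

```python
import shlex


def sample_command(launcher, script="train.py", nodes=1, gpus_per_node=4):
    """Build an example launch command string for the given launcher."""
    total_procs = nodes * gpus_per_node
    if launcher == "torchrun":
        # --from_torch tells the script to read its settings from torchrun.
        cmd = [
            "torchrun",
            f"--nnodes={nodes}",
            f"--nproc_per_node={gpus_per_node}",
            script,
            "--from_torch",
        ]
    elif launcher == "srun":
        # One Slurm task per GPU on each node.
        cmd = [
            "srun",
            f"--nodes={nodes}",
            f"--ntasks-per-node={gpus_per_node}",
            "python",
            script,
        ]
    elif launcher == "mpirun":
        # MPI takes the total number of processes.
        cmd = ["mpirun", "-np", str(total_procs), "python", script]
    else:
        raise ValueError(f"unknown launcher: {launcher}")
    return shlex.join(cmd)


if __name__ == "__main__":
    for name in ("torchrun", "srun", "mpirun"):
        print(sample_command(name, nodes=2, gpus_per_node=4))
```

With srun and mpirun the launcher itself starts every process, which is why a fixed container entrypoint that also tries to launch the job could get in the way.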

kurisusnowdeng commented on May 13, 2024

> I think what @binmakeswell means is that we should provide sample commands for the different launchers, for clarity. I am OK with Docker if the goal is to give users an environment with pre-installed dependencies. The problem with Docker is that it can only run on a single node if we provide a pre-defined entrypoint command. In a multi-node environment, we still need to use srun or mpirun to start the container, and this may conflict with the entrypoint command.

It seems @binmakeswell is mainly concerned that users don't know how to use the python commands with Slurm. But I think Docker may already be the most convenient way for users to run our code. Also, we already have a tutorial that shows how to use Slurm, so maybe what we need to do is make that tutorial cover more cases, rather than explain how to run Slurm everywhere.

FrankLeeeee commented on May 13, 2024

I think adding a link to launch colossalai will do. We have provided a Dockerfile in the Colossal-AI repository; do you mean changing the Docker entrypoint command for the examples?

kurisusnowdeng commented on May 13, 2024

> I think adding a link to launch colossalai will do. We have provided a Dockerfile in the Colossal-AI repository; do you mean changing the Docker entrypoint command for the examples?

Yes. Maybe we can provide a Dockerfile that packages each example. Then users just build and run the image.

FrankLeeeee commented on May 13, 2024

OK, my opinion is that a Dockerfile is usually for complex environment setup. If an example requires a complicated setup, then a Dockerfile will be good.
