Comments (4)

philtrade commented on May 27, 2024

Hello @bowenroom,

I have released another library, mpify, to support distributed training for fastai2 in Jupyter notebooks.

Here is the link: https://github.com/philtrade/mpify/tree/v0.1.0

If you have any questions, please raise an issue in the new repo above. Thank you for your patience.

bowenroom commented on May 27, 2024

Have you tested this on fastai2 yet? Looking forward to it!

philtrade commented on May 27, 2024

Hello @bowenroom, thank you for the inquiry.

Support for fastai2 is very much a work in progress: some pieces work already, but not all. I can't promise when it'll be ready, but it's on my radar 😅

philtrade commented on May 27, 2024

Update: I checked in a prototype notebook for fastai v2, v2_multiprocess_ddp_05_pet_breeds.ipynb, that allows distributed-data-parallel training in a Jupyter notebook.

It is an entirely different design from the current Ddip. It does not use the %load_ext Ddip mechanism, nor does it depend on ipyparallel.

It needs only a module called multiprocess (install with pip install multiprocess), which serializes objects and functions more reliably than Python's default multiprocessing or torch.multiprocessing, specifically in the Jupyter notebook environment. See the excellent explanation here.
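To make the serialization point concrete, here is a minimal sketch (not code from the notebook). It assumes that multiprocess relies on the dill serializer, and it uses a lambda as a stand-in for a function defined interactively in a notebook cell. The helper name can_pickle is hypothetical.

```python
import pickle


def can_pickle(obj, dumps=pickle.dumps):
    """Return True if `dumps` can serialize `obj`, False otherwise."""
    try:
        dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


# A lambda stands in for a function defined in a notebook cell.
square = lambda x: x * x

# The stdlib pickler stores functions by importable name, which a lambda lacks.
stdlib_ok = can_pickle(square)

try:
    import dill  # assumption: present because `multiprocess` depends on it

    # dill serializes the function body itself rather than a name reference.
    dill_ok = can_pickle(square, dumps=dill.dumps)
except ImportError:
    dill_ok = None  # dill is not installed in this environment

print("pickle:", stdlib_ok, "dill:", dill_ok)
```

The standard multiprocessing module sends work to child processes through pickle, so functions that pickle cannot serialize are a likely source of the trouble described above.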

This on-demand distributed training within the interactive notebook environment may be more suitable for v2, because there are many hidden race conditions outside of the learner.fit() loop that the old design (the big hammer of a persistent ipyparallel cluster) has to wrestle with, e.g. in the learning rate finder, or in anything that may involve file system reads/writes.

The middle of the notebook shows the raw steps: create a group of new distributed learners on the fly, using explicit dataloader creation functions etc., and train for N epochs, all in one cell. To do: allow restoring from checkpoint weights/learner states before training, and refactor these steps into an easier-to-use API, possibly a context manager.
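As a rough illustration of what a context-manager API might look like, here is a sketch only; it is not the design used in the notebook or in mpify. The name distributed_env and its parameters are hypothetical. It temporarily sets the environment variables that torch.distributed reads when initialized with the env:// method, and restores the previous values on exit.

```python
import os
from contextlib import contextmanager


@contextmanager
def distributed_env(rank, world_size, addr="127.0.0.1", port=29500):
    """Temporarily set the env vars used by torch.distributed's env:// init."""
    new = {
        "RANK": str(rank),
        "WORLD_SIZE": str(world_size),
        "MASTER_ADDR": addr,
        "MASTER_PORT": str(port),
    }
    # Remember the previous values so they can be restored afterwards.
    saved = {key: os.environ.get(key) for key in new}
    os.environ.update(new)
    try:
        yield new
    finally:
        for key, value in saved.items():
            if value is None:
                os.environ.pop(key, None)
            else:
                os.environ[key] = value


with distributed_env(rank=0, world_size=2) as env:
    # A worker process would initialize its process group and train here.
    inside = os.environ["WORLD_SIZE"]
```

Each worker process would enter the context with its own rank, so the distributed settings never leak into the rest of the notebook session.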
