homura

homura document

homura is a library for fast prototyping DL research.

🔥🔥🔥🔥 homura (焰) means flame or blaze in Japanese. 🔥🔥🔥🔥

Notice: homura v2019.11+ introduces backward-incompatible changes

For older versions, install as pip install git+https://github.com/moskomule/homura@v2019.10 etc.

Requirements

minimal requirements

Python>=3.8
PyTorch>=1.5.0
torchvision>=0.6.0
tqdm # automatically installed
tensorboard # automatically installed
hydra-core # automatically installed

optional

colorlog (to log with colors)
faiss (for faster kNN)
accimage (for faster image pre-processing)
horovod (for easier distributed training)
cupy

If horovod is available, homura tries to use it for distributed training. To disable horovod and use torch.distributed instead, set HOMURA_DISABLE_HOROVOD=1.
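For example, to force torch.distributed even when horovod is installed, run HOMURA_DISABLE_HOROVOD=1 python imagenet.py root=/path/to/imagenet/root distributed.on=true (the script and flags here are the ones used in the Examples section below).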

test

pytest .

Installation

pip install git+https://github.com/moskomule/homura

or

git clone https://github.com/moskomule/homura
cd homura
pip install -e .

horovod installation

conda install gxx_linux-64
pip install horovod

APIs

Basics

homura aims at simple, abstract (e.g., device-agnostic) prototyping.

from homura import optim, lr_scheduler
from homura import trainers, callbacks, reporters
from torchvision.models import resnet50
from torch.nn import functional as F

# User does not need to care about the device
resnet = resnet50()
# Model is registered in optimizer lazily. This is convenient for distributed training and other complicated scenes.
optimizer = optim.SGD(lr=0.1, momentum=0.9)
scheduler = lr_scheduler.MultiStepLR(milestones=[30,80], gamma=0.1)

# `homura` has callbacks
c = [callbacks.AccuracyCallback(),
     reporters.TensorboardReporter(".")]

# `train_loader` and `test_loader` are ordinary PyTorch DataLoaders
with trainers.SupervisedTrainer(resnet, optimizer, loss_f=F.cross_entropy,
                                callbacks=c, scheduler=scheduler) as trainer:
    # epoch-based training
    epochs = 100  # for example
    for _ in range(epochs):
        trainer.train(train_loader)
        trainer.test(test_loader)

    # otherwise, iteration-based training
    trainer.run(train_loader, test_loader,
                total_iterations=1_000, val_intervals=10)

Users can customize the iteration of the trainer as follows.

from typing import Mapping, Tuple

import torch
from homura.trainers import TrainerBase, SupervisedTrainer
from homura.utils.containers import TensorMap

trainer = SupervisedTrainer(...)

def iteration(trainer: TrainerBase,
              data: Tuple[torch.Tensor, torch.Tensor]) -> Mapping[str, torch.Tensor]:
    input, target = data
    output = trainer.model(input)
    loss = trainer.loss_f(output, target)
    results = TensorMap(loss=loss, output=output)
    if trainer.is_train:
        trainer.optimizer.zero_grad()
        loss.backward()
        trainer.optimizer.step()
    # iteration returns at least (loss, output);
    # registered values can be accessed in callbacks
    results.user_value = user_value  # `user_value` stands for any user-defined tensor
    return results

SupervisedTrainer.iteration = iteration
# or
trainer.update_iteration(iteration)

callbacks.Callback can access the parameters of models, the loss, the outputs of models, and other user-defined values.

In most cases, callbacks.metric_callback_decorator is useful. The returned values are accumulated over iterations.

from homura import callbacks

@callbacks.metric_callback_decorator
def user_value(data):
    return data["user_value"]
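Presumably the decorated function is itself a callback, so it can be registered alongside the built-in ones; a minimal sketch reusing the names above (the registration pattern is an assumption based on the Basics example):

from homura import callbacks, reporters

c = [user_value,  # the decorated metric defined above
     callbacks.AccuracyCallback(),
     reporters.TensorboardReporter(".")]
# then pass `callbacks=c` to the trainer as in the Basics example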

callbacks.Callback has the methods before_all, before_iteration, before_epoch, after_all, after_iteration and after_epoch. For example, callbacks.WeightSave looks like:

from homura.callbacks import Callback
class WeightSave(Callback):
    ...

    def after_epoch(self, data: Mapping):
        self._epoch = data["epoch"]
        self._step = data["step"]
        if self.save_freq > 0 and data["epoch"] % self.save_freq == 0:
            self.save(data, f"{data['epoch']}.pkl")

    def after_all(self, data: Mapping):
        if self.save_freq == -1:
            self.save(data, "weight.pkl")
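Following the same pattern, a minimal custom callback is a subclass overriding only the hooks it needs; a hedged sketch (EpochPrinter is hypothetical, and the "epoch" key in data is assumed from the WeightSave excerpt above):

from homura.callbacks import Callback

class EpochPrinter(Callback):
    # hypothetical example: print the epoch counter after each epoch
    def after_epoch(self, data):
        print(f"finished epoch {data['epoch']}")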

Dictionaries of models, optimizers, and loss functions are also supported:

trainer = CustomTrainer({"generator": generator, "discriminator": discriminator},
                        {"generator": gen_opt, "discriminator": dis_opt},
                        {"reconstruction": recon_loss, "generator": gen_loss},
                        **kwargs)

Distributed training

Easy distributed initializer homura.init_distributed() is available. See imagenet.py as an example.
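A minimal sketch of the intended flow (hedged: init_distributed's exact arguments and defaults are not documented here, so treat this as an assumption and see imagenet.py for the authoritative usage):

import homura
from homura import optim, trainers
from torch.nn import functional as F
from torchvision.models import resnet50

homura.init_distributed()  # initialize the distributed backend before building the trainer

model = resnet50()
optimizer = optim.SGD(lr=0.1, momentum=0.9)
# `train_loader` and `test_loader` should be distributed-aware DataLoaders
with trainers.SupervisedTrainer(model, optimizer, loss_f=F.cross_entropy) as trainer:
    trainer.run(train_loader, test_loader, total_iterations=1_000, val_intervals=10)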

Reproducibility

These context managers make randomness deterministic within their scope.

from homura.utils.reproducibility import set_deterministic, set_seed
with set_deterministic(seed):
    something()

with set_seed(seed):
    other_thing()

Examples

See examples.

  • cifar10.py: training ResNet-20 or WideResNet-28-10 with random crop on CIFAR10
  • imagenet.py: training a CNN on ImageNet with multiple GPUs (single and multi process)

For imagenet.py, if you want

  • single node single gpu
  • single node multi gpus

run python imagenet.py root=/path/to/imagenet/root.

If you want

  • single node multi process multi gpus

run python -m torch.distributed.launch --nproc_per_node=$NUM_GPUS imagenet.py root=/path/to/imagenet/root distributed.on=true.

If you want

  • multi nodes multi process multi gpus,

run

  • python -m torch.distributed.launch --nnodes=$NUM_NODES --node_rank=0 --master_addr=$MASTER_IP --master_port=$MASTER_PORT --nproc_per_node=$NUM_GPUS imagenet.py root=/path/to/imagenet/root distributed.on=true on the master node
  • python -m torch.distributed.launch --nnodes=$NUM_NODES --node_rank=$RANK --master_addr=$MASTER_IP --master_port=$MASTER_PORT --nproc_per_node=$NUM_GPUS imagenet.py root=/path/to/imagenet/root distributed.on=true on the other nodes

Here, 0 < $RANK < $NUM_NODES.

Citing

@misc{homura,
    author = {Ryuichiro Hataya},
    title = {homura},
    year = {2018},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://GitHub.com/moskomule/homura}},
}
