tmabraham / upit

A fastai/PyTorch package for unpaired image-to-image translation.

Home Page: https://tmabraham.github.io/UPIT

License: Apache License 2.0

Jupyter Notebook 99.76% Python 0.23% CSS 0.01%

pytorch cyclegan image-to-image-translation fastai nbdev deep-learning

upit's Issues

How can I use the model for prediction and how to save the model?

I am using this amazing code you have written for my job. I wanted to know how I can save the model and how to use it for prediction. I would like to create a UI with it!

multi-GPU support

I need to check whether fastai's multi-GPU support works with my package and, if not, what needs to be modified to get it to work. Additionally, I may need to add a simpler interface for DDP, or at least clear examples/documentation. This will enable quicker model training on multi-GPU servers, like those at USF.

Add more unit tests

Here are some tests of interest:

Test the loss (example fake and real images for the reconstruction loss and discriminator)
Check the batch independence and the model parameter updates
Test successful overfitting on a single batch
Test for rotation invariance and other invariance properties

'str' object has no attribute '__stored_args__'

Pops up when I try to use cycle_learner.

Expose tfms in dataset generation

Hey there,

I think it would be a good idea to expose the tfms of the Datasets in the get_dls function, so users can set their own.

UPIT/upit/data/unpaired.py

Line 30 in c6c769c

 dsets = Datasets(filesA, tfms=[[PILImage.create, ToTensor, Resize(load_size),RandomCrop(crop_size)], 

The same goes for the dataloaders' batch_tfms (e.g. if I want to disable flipping, I have to rewrite the dataloaders):

UPIT/upit/data/unpaired.py

Line 33 in c6c769c

 batch_tfms = [IntToFloatTensor, Normalize.from_stats(mean=0.5, std=0.5), FlipItem(p=0.5)] 

Validate with Existing Model Trained on Both Classes

This is really great work! I noticed you appear to be using this code for creating new examples of pathology images. I am doing something similar, just not pathology. Let's say I already have a separate model trained to classify these types of images and I want to run it against a validation set every epoch and determine how well the cyclegan is performing in generating new examples that fool an existing classifier. I'm trying to figure out where the best place might be to add code which does that. I can see a few places where it would probably work, but was curious if you have already thought about adding this functionality to monitor training progress?

Thanks,

Bob

Add generated images to progress bar at the end of each epoch

I want to be able to view the progress of training by viewing example generated images at the end of each epoch.

The fastai GAN code had some incomplete code for achieving this. This is enabled by the fastprogress bar, which allows us to show images, diagrams, etc.

I will try to follow this example and modify the CycleGANTrainer callback in order to add this feature.

How to show images after fit?

Hi. The learner has a method learn.progress.show_cycle_gan_imgs. However, how do I plot its output with matplotlib's plt.show() when I use the Python REPL?
learn.progress.show_cycle_gan_imgs takes an event_name argument.
I would like to call it after fit:

>>> learn.fit_flat_lin(1,1,2e-4)
epoch     train_loss  id_loss_A  id_loss_B  gen_loss_A  gen_loss_B  cyc_loss_A  cyc_loss_B  D_A_loss  D_B_loss  time    
/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastprogress/fastprogress.py:74: UserWarning: Your generator is empty.
  warn("Your generator is empty.")
0         10.809340   1.621755   1.690260   0.420364    0.452442    3.359353    3.504819    0.370692  0.370692  00:08     
1         9.847495    1.283465   1.510985   0.353303    0.349454    2.682495    3.135504    0.253919  0.253919  00:07

UPIT/upit/train/cyclegan.py

Lines 167 to 176 in 020f8e2

def after_epoch(self):
    "Update images"
    if (self.learn.epoch+1) % self.show_img_interval == 0:
        if self.imgA: self.imgA_result = torch.cat((self.learn.xb[0][1].detach(),self.learn.pred[0].detach()),dim=-1); self.last_gen=self.imgA_result
        if self.imgB: self.imgB_result = torch.cat((self.learn.xb[0][0].detach(),self.learn.pred[1].detach()),dim=-1); self.last_gen=self.imgB_result
        if self.imgA and self.imgB : self.last_gen = torch.cat((self.imgA_result,self.imgB_result),dim=-2)
        img = TensorImage(self.learn.dls.after_batch.decode(TensorImage(self.last_gen[0]))[0])
        self.imgs.append(img)
        self.titles.append(f'Epoch {self.learn.epoch}')
        self.progress.mbar.show_imgs(self.imgs, self.titles,imgsize=10)

Add metrics and test model tracking callbacks

I want to add support for metrics, and even potentially include some common metrics, like FID, mi-FID, KID, and segmentation metrics (for paired) etc.

Additionally, by monitoring the losses and metrics, I want to be able to use fastai's built-in callbacks for saving the best model, early stopping, and reducing the LR on plateau.

This shouldn't be too hard to include. A major part of this feature is finding good PyTorch/numpy implementations of some of these metrics and getting them to work.
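As a rough illustration of the tracking logic that best-model saving and early stopping share, here is a minimal, framework-free sketch. It is not fastai's callback API: the `PatienceTracker` class, its parameters, and the metric values are hypothetical and assume a metric where lower is better (as with FID).

```python
class PatienceTracker:
    """Track a lower-is-better metric and signal when to save or stop (hypothetical helper)."""

    def __init__(self, patience=3, min_delta=0.0):
        self.patience = patience      # epochs without improvement tolerated before stopping
        self.min_delta = min_delta    # minimum decrease that counts as an improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def update(self, value):
        """Record one epoch's metric; return (is_best, should_stop)."""
        is_best = value < self.best - self.min_delta
        if is_best:
            self.best, self.bad_epochs = value, 0
        else:
            self.bad_epochs += 1
        return is_best, self.bad_epochs >= self.patience


# Made-up metric values, for illustration only.
tracker = PatienceTracker(patience=2)
for epoch, metric in enumerate([120.0, 95.0, 97.0, 96.0, 90.0]):
    is_best, should_stop = tracker.update(metric)
    if is_best:
        print(f"epoch {epoch}: new best {metric}, a checkpoint would be saved here")
    if should_stop:
        print(f"epoch {epoch}: no improvement for {tracker.patience} epochs, stopping")
        break
```

In practice the same decisions would be driven by whichever loss or metric the learner records at the end of each epoch.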

AttributeError: 'Learner' object has no attribute 'pred'

Hi. I am getting the following error:

from upit.data.unpaired import *
from upit.models.cyclegan import *
from upit.train.cyclegan import *
from fastai.vision.all import *

horse2zebra = untar_data('https://people.eecs.berkeley.edu/~taesung_park/CycleGAN/datasets/horse2zebra.zip')


folders = horse2zebra.ls().sorted()
trainA_path = folders[2]
trainB_path = folders[3]
testA_path = folders[0]
testB_path = folders[1]

dls = get_dls(trainA_path, trainB_path,num_A=100)
cycle_gan = CycleGAN(3,3,64)
learn = cycle_learner(dls, cycle_gan,show_img_interval=1)
learn.show_training_loop()
learn.lr_find()

AttributeError: 'Learner' object has no attribute 'pred'

Weights and Biases Callback

I want to have an easy integration with Weights and Biases, making it easy to track and analyze experiments. I tried the WandbCallback but I noticed a few issues:

The dataset path needs to be set manually for it to log the dataset. It also will log all the folders in the dataset, when I would rather have it log only the folders used for the experiment.
Sample predictions are not logged during training (this raises an error)
Logging the model requires the SaveModelCallback

I have decided to write my own callback based on the WandbCallback but better suited for the experiment setups we have with UPIT and unpaired image-to-image translation problems.

How do I turn fake_A and fake_B into images and save them?

I would like to see what fake_A and fake_B look like at this step in the process saved as images. I can't seem to figure out how to convert them properly.

def forward(self, output, target):
    """
    Forward function of the CycleGAN loss function. The generated images are passed in as output (which comes from the model)
    and the generator loss is returned.
    """
    fake_A, fake_B, idt_A, idt_B = output
    # Save and look at png images of fake_A and fake_B here
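One way to think about the conversion: the batch transforms quoted elsewhere in this listing normalize with mean 0.5 and std 0.5, so generator outputs live roughly in [-1, 1] and must be mapped back before saving. The sketch below shows only that arithmetic in plain Python; the helper names `denormalize` and `to_uint8` are hypothetical, and it assumes the default normalization is in use.

```python
def denormalize(value, mean=0.5, std=0.5):
    """Invert Normalize(mean, std): map a normalized value back to the [0, 1] range."""
    return value * std + mean


def to_uint8(value):
    """Convert one generator output value in [-1, 1] to an 8-bit pixel intensity."""
    scaled = denormalize(value) * 255.0
    # Clamp so that slightly out-of-range outputs do not wrap around.
    return int(min(255, max(0, round(scaled))))


# Illustrative values only: darkest, mid-grey, brightest, and an out-of-range output.
print([to_uint8(v) for v in (-1.0, 0.0, 1.0, 1.3)])
```

On real tensors the same steps would be applied elementwise to a detached copy of fake_A or fake_B (moved to the CPU) before handing it to an image writer.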

Is to_fp16() Implemented Already?

BTW, the more I use and investigate this code, the more I am impressed with its quality! I really appreciate you making such a solid and well thought out implementation. You have no idea how helpful this has been in speeding up my work.

I tried adding to_fp16() to this:
learn = cycle_learner(dls, cycle_gan,opt_func=partial(Adam,mom=0.5,sqr_mom=0.999)).to_fp16()

Note: After posting this I realized you already have this as a future enhancement. So I will investigate and hopefully I can help you with a PR to take some work off of your plate. I would love to start contributing PRs to this project if you don't mind.

But received this:

RuntimeError                              Traceback (most recent call last)
~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/learner.py in _with_events(self, f, event_type, ex, final)
    154     def _with_events(self, f, event_type, ex, final=noop):
--> 155         try:       self(f'before_{event_type}')       ;f()
    156         except ex: self(f'after_cancel_{event_type}')

~/anaconda3/envs/fastai/lib/python3.7/site-packages/fastai/learner.py in _do_one_batch(self)
    165         self('after_pred')
--> 166         if len(self.yb): self.loss = self.loss_func(self.pred, *self.yb)
    167         self('after_loss')

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(

~/anaconda3/envs/fastai/lib/python3.7/site-packages/upit/train/cyclegan.py in forward(self, output, target)
     69         #Generators are trained to trick the discriminators so the following should be ones
---> 70         self.gen_loss_A = self.crit(self.cgan.D_A(fake_A), 1)
     71         self.gen_loss_B = self.crit(self.cgan.D_B(fake_B), 1)

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/container.py in forward(self, input)
    116         for module in self:
--> 117             input = module(input)
    118         return input

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
    721         else:
--> 722             result = self.forward(*input, **kwargs)
    723         for hook in itertools.chain(

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/conv.py in forward(self, input)
    418     def forward(self, input: Tensor) -> Tensor:
--> 419         return self._conv_forward(input, self.weight)
    420 

~/anaconda3/envs/fastai/lib/python3.7/site-packages/torch/nn/modules/conv.py in _conv_forward(self, input, weight)
    415         return F.conv2d(input, weight, self.bias, self.stride,
--> 416                         self.padding, self.dilation, self.groups)
    417 

RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same

I'm sure I can track down where to make the other necessary changes, but was curious if you already have this built in somewhere or know which changes would need to be made already. The biggest difficulty I'm facing is GPU memory so to_fp16() would improve speed and memory use quite significantly.

Disable Identity Loss

Hey,
thanks for your awesome work.
If I want to set l_idt of the CycleGANLoss to zero, how would I do this? Can I pass some argument to cycle_learner for this?
At a quick look this seems to be hardcoded in cycle_learner, right?
So I would have to write "my own" cycle_learner?

Thanks for the answers to my - most likely - stupid questions!

Add mixed precision support

Currently, for some reason, fastai's to_fp16 and to_native_fp16 do not work. I need to figure out why this is not working and fix it.

Add demo to HuggingFace Spaces

This Gradio demo should be transferred to HuggingFace Spaces, now that it is the main place to freely host ML demos.

How to make predictions?

With an image classifier I usually do:

test_dl = object.dls.test_dl("n02381460_1052.jpg") # object is model/learner
predictions = object.get_preds(dl = test_dl)

However, it throws:

  TypeError: 'NoneType' object cannot be interpreted as an integer

Traceback (most recent call last):
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 155, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 161, in all_batches
    for o in enumerate(self.dl): self.one_batch(*o)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 179, in one_batch
    self._with_events(self._do_one_batch, 'batch', CancelBatchException)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 155, in _with_events
    try:       self(f'before_{event_type}')       ;f()
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 133, in __call__
    def __call__(self, event_name): L(event_name).map(self._call_one)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastcore/foundation.py", line 226, in map
    def map(self, f, *args, gen=False, **kwargs): return self._new(map_ex(self, f, *args, gen=gen, **kwargs))
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastcore/basics.py", line 537, in map_ex
    return list(res)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastcore/basics.py", line 527, in __call__
    return self.fn(*fargs, **kwargs)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 137, in _call_one
    [cb(event_name) for cb in sort_by_run(self.cbs)]
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 137, in <listcomp>
    [cb(event_name) for cb in sort_by_run(self.cbs)]
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/callback/core.py", line 44, in __call__
    if self.run and _run: res = getattr(self, event_name, noop)()
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/upit/train/cyclegan.py", line 112, in before_batch
    self.learn.xb = (self.learn.xb[0],self.learn.yb[0]),
IndexError: tuple index out of range

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/torch_core.py", line 268, in to_concat
    try:    return retain_type(torch.cat(xs, dim=dim), xs[0])
TypeError: expected Tensor as element 0 in argument 0, but got NoneType

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 235, in get_preds
    self._do_epoch_validate(dl=dl)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 188, in _do_epoch_validate
    with torch.no_grad(): self._with_events(self.all_batches, 'validate', CancelValidException)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 157, in _with_events
    finally:   self(f'after_{event_type}')        ;final()
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 133, in __call__
    def __call__(self, event_name): L(event_name).map(self._call_one)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastcore/foundation.py", line 226, in map
    def map(self, f, *args, gen=False, **kwargs): return self._new(map_ex(self, f, *args, gen=gen, **kwargs))
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastcore/basics.py", line 537, in map_ex
    return list(res)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastcore/basics.py", line 527, in __call__
    return self.fn(*fargs, **kwargs)
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 137, in _call_one
    [cb(event_name) for cb in sort_by_run(self.cbs)]
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/learner.py", line 137, in <listcomp>
    [cb(event_name) for cb in sort_by_run(self.cbs)]
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/callback/core.py", line 44, in __call__
    if self.run and _run: res = getattr(self, event_name, noop)()
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/callback/core.py", line 123, in after_validate
    if not self.save_preds: self.preds   = detuplify(to_concat(self.preds, dim=self.concat_dim))
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/torch_core.py", line 270, in to_concat
    for i in range_of(o_)) for o_ in xs], L())
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastai/torch_core.py", line 270, in <listcomp>
    for i in range_of(o_)) for o_ in xs], L())
  File "/home/turgut/.local/share/r-miniconda/envs/r-reticulate/lib/python3.6/site-packages/fastcore/basics.py", line 425, in range_of
    return list(range(a,b,step) if step is not None else range(a,b) if b is not None else range(a))
TypeError: 'NoneType' object cannot be interpreted as an integer

Add HuggingFace Hub integration

Pretrained models should be available on HuggingFace Hub, and users should be able to save their own models to HuggingFace Hub.

Relevant links:

Inference - Can not load state_dict

Hey,

me again. Sorry to bother you again!

So I am trying to do inference on a trained model (with default values).
I exported the generator with the export_generator function. Now I am trying to load my generator as shown in the Web App Example.

But I get errors when loading the state_dict. The state dict seems to have extra keys for nine extra layers, if I understand the error message correctly:

Error Message

model.load_state_dict(torch.load("generator.pth", map_location=device))
  File "/Applications/Utilities/miniconda3/envs/ml/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1051, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for Sequential:
        Missing key(s) in state_dict: "10.conv_block.5.weight", "10.conv_block.5.bias", "11.conv_block.5.weight", "11.conv_block.5.bias", "12.conv_block.5.weight", "12.conv_block.5.bias", "13.conv_block.5.weight", "13.conv_block.5.bias", "14.conv_block.5.weight", "14.conv_block.5.bias", "15.conv_block.5.weight", "15.conv_block.5.bias", "16.conv_block.5.weight", "16.conv_block.5.bias", "17.conv_block.5.weight", "17.conv_block.5.bias", "18.conv_block.5.weight", "18.conv_block.5.bias". 
        Unexpected key(s) in state_dict: "10.conv_block.6.weight", "10.conv_block.6.bias", "11.conv_block.6.weight", "11.conv_block.6.bias", "12.conv_block.6.weight", "12.conv_block.6.bias", "13.conv_block.6.weight", "13.conv_block.6.bias", "14.conv_block.6.weight", "14.conv_block.6.bias", "15.conv_block.6.weight", "15.conv_block.6.bias", "16.conv_block.6.weight", "16.conv_block.6.bias", "17.conv_block.6.weight", "17.conv_block.6.bias", "18.conv_block.6.weight", "18.conv_block.6.bias".
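To read an error like this more easily, it can help to diff the two key sets directly. The sketch below uses a hypothetical helper, `diff_state_dict_keys`, and rebuilds the key names from the message above. Every key differs only in the index inside conv_block (5 versus 6), which suggests the exported blocks contain one more layer before the final convolution than the freshly built ones. What that extra layer is remains an assumption to check against the training configuration.

```python
def diff_state_dict_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, mirroring what load_state_dict reports."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # expected by the model, absent from the file
    unexpected = sorted(checkpoint_keys - model_keys)   # present in the file, unknown to the model
    return missing, unexpected


# Key names reconstructed from the error message: blocks 10..18, weight and bias each.
model_keys = [f"{i}.conv_block.5.{p}" for i in range(10, 19) for p in ("weight", "bias")]
checkpoint_keys = [f"{i}.conv_block.6.{p}" for i in range(10, 19) for p in ("weight", "bias")]

missing, unexpected = diff_state_dict_keys(model_keys, checkpoint_keys)
print(len(missing), "missing;", len(unexpected), "unexpected")
```

With real objects, the two arguments would be `model.state_dict().keys()` and the keys of the dictionary returned by `torch.load`.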

Minimal Working Example

import torch
from upit.models.cyclegan import resnet_generator
import torchvision.transforms
from PIL import Image


device = torch.device("cpu")
model = resnet_generator(ch_in=3, ch_out=3)
model.load_state_dict(torch.load("generator.pth", map_location=device))
model.eval()


totensor = torchvision.transforms.ToTensor()
normalize_fn = torchvision.transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
topilimage = torchvision.transforms.ToPILImage()


def predict(input):
    im = normalize_fn(totensor(input))
    print(im.shape)
    preds = model(im.unsqueeze(0)) / 2 + 0.5
    print(preds.shape)
    return topilimage(preds.squeeze(0).detach().cpu())


im = predict(Image.open("test.jpg"))
im.save("out.jpg")

Thanks again for your support!

Possible typo in the loss

UPIT/upit/train/cyclegan.py

Line 132 in 1f272ea

self.learn.loss_func.D_B_loss = loss_D_A.detach().cpu()

Was this intended or was this line supposed to be:
self.learn.loss_func.D_B_loss = loss_D_B.detach().cpu() ?

Documentation Mismatch on get_preds_cyclegan

Hey there quick note,

there is a mismatch between the documentation and the code for the suffix argument of get_preds_cyclegan. According to the documentation the default is png, but in the code the default value is tif.

I think it would also be beneficial to check whether CUDA/GPU is available here, as inference also works well on a CPU (at least for me), but CUDA is required here. What do you think?
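A minimal sketch of such a check is shown below. The helper name `pick_device` is hypothetical and this is not how get_preds_cyclegan is currently written; it only illustrates falling back to the CPU when CUDA (or PyTorch itself) is unavailable.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to query; default to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()
print(f"Running inference on: {device}")
```

The returned string could then be passed wherever the model and input batches are moved to a device, instead of calling `.cuda()` unconditionally.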

