boschresearch / unetgan


Official Implementation of the paper "A U-Net Based Discriminator for Generative Adversarial Networks" (CVPR 2020)

Home Page: https://openaccess.thecvf.com/content_CVPR_2020/papers/Schonfeld_A_U-Net_Based_Discriminator_for_Generative_Adversarial_Networks_CVPR_2020_paper.pdf

License: GNU Affero General Public License v3.0

Languages: Python 98.43%, Shell 1.57%
Topics: gan, image-generation, cvpr2020, computer-vision, machine-learning, ffhq, unet-gan, biggan, bcai, u-net

unetgan's Introduction

U-Net GAN PyTorch

PyTorch implementation of the CVPR 2020 paper "A U-Net Based Discriminator for Generative Adversarial Networks". The paper and supplementary material can be found here. Don't forget to have a look at the supplementary material as well; the TensorFlow FIDs can be found there (Table S1). The code allows users to reproduce and extend the results reported in the study. Please cite the above paper when reporting, reproducing or extending the results.

Setup

Create the conda environment "unetgan" from the provided unetgan.yml file. The experiments can be reproduced with the scripts provided in the folder training_scripts (the experiment folder and dataset folder have to be set manually).
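
For orientation, a typical setup might look like the following minimal sketch (the conda commands are standard; the experiment script name is a placeholder, since the exact file names in training_scripts may differ):

    # Create and activate the conda environment from the provided file
    conda env create -f unetgan.yml
    conda activate unetgan

    # Pick one of the provided experiment scripts, set the experiment folder
    # (--base_root) and dataset folder (--data_folder) inside it, then run it
    sh training_scripts/<your_experiment_script>.sh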

Argument Explanation
--unconditional: Use this if the dataset does not have classes (e.g. CelebA).
--unet_mixup: Use CutMix.
--slow_mixup: Use a warmup for the CutMix-augmentation loss.
--slow_mixup_epochs: Number of epochs for the warmup.
--full_batch_mixup: If True, a coin is tossed at every training step. With a certain probability the whole batch is mixed, and the CutMix augmentation loss and consistency loss are the only losses computed for this batch. The probability increases from 0 to 0.5 over the course of the specified warmup epochs. If False, the CutMix augmentation and consistency losses are computed for every batch and added to the default GAN loss. In the case of a warmup, the augmentation loss is multiplied by a factor that increases from 0 to 1 over the course of the specified warmup epochs.
--consistency_loss: Compute only the CutMix consistency loss, but not the CutMix augmentation loss (can increase stability but might perform worse).
--consistency_loss_and_augmentation: Compute both the CutMix augmentation loss and the consistency loss.
--base_root: Path to the folder (path/to/folder_for_results) where all experimental results are saved.
--data_folder: Path to the dataset (/path/to/dataset). For FFHQ, this folder contains the 69 subfolders that can be downloaded here. In our case, the images were downscaled to resolution 256x256 before training. For CelebA, the folder should contain all images with their 6-digit number as file name (e.g. 016685.png).
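
To illustrate how these flags fit together, here is a sketch of an unconditional CelebA run. The flag names are taken from the list above and train.py is the training entry point mentioned in the Details section; the warmup epoch count and the paths are placeholders, and the many additional BigGAN-PyTorch arguments (resolution, batch size, etc.) used by the actual scripts in training_scripts are omitted:

    python train.py \
      --unconditional \
      --unet_mixup \
      --slow_mixup --slow_mixup_epochs 20 \
      --consistency_loss_and_augmentation \
      --base_root /path/to/folder_for_results \
      --data_folder /path/to/celeba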

Details

This implementation of U-Net GAN is based on the PyTorch code for BigGAN (https://github.com/ajbrock/BigGAN-PyTorch). The main differences are that (1) we use our own data loader, which does not require HDF5 pre-processing, (2) we changed the generator and discriminator classes in BigGAN.py, and (3) we modified train.py and train_fns.py. If you want to turn your own GAN into a U-Net GAN, make sure to follow the tips outlined in how_to_unetgan.pdf.

Graphical Overview of the U-Net Discriminator Architecture

Video Summary


Metrics

The inception metrics (FID and IS) are measured in the same way as in the parent repository (https://github.com/ajbrock/BigGAN-PyTorch). They can be computed on the fly during training, using the pre-computed inception moments (see the original BigGAN repository). For convenience, we included the pre-computed inception moments for CelebA and FFHQ in this repository. This means that during training the model automatically reads the .npz files in the main folder and computes the FID. The on-the-fly FID scores are saved in the logs folder of the output directory as a pickle named inception_metrics_<experiment_name>.p. Note that training on CelebA is perfectly stable, but on FFHQ we observe frequent collapse (which is also the case for the underlying BigGAN, which is unstable on FFHQ). The FID curve for a successful FFHQ run will look like the one on the left, while the more common failed runs will look like the curve on the right. The collapse occurs early in training and is easily detectable (in contrast, for BigGAN it usually occurs later).

Pretrained Model

First, download the checkpoint for FFHQ from https://www.dropbox.com/sh/7vql1ao1u853wwf/AAABs7JO27da49_GveaEXl4Ma?dl=0 and copy the files into the folder pretrained_model. To load the pre-trained model for FFHQ, execute sh training_scripts/load_pretrained_ffhq.sh. The script will load the checkpoint and resume training from this point. Of course, you still need to set the path to the dataset and the desired output folder in this script. If you just want to quickly check the outputs of the pre-trained model, set the --sample_every flag to a small number of training steps, such as 30.
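
In practice, the steps look roughly like this (a sketch: the source path for the downloaded checkpoint files is a placeholder, and the flags, including --sample_every, are assumed to be set inside the script as described above):

    # Copy the downloaded FFHQ checkpoint files into pretrained_model/
    mkdir -p pretrained_model
    cp /path/to/downloaded_checkpoint/* pretrained_model/

    # Set the dataset path and output folder (and optionally a small
    # --sample_every value, e.g. 30) inside the script, then resume training
    sh training_scripts/load_pretrained_ffhq.sh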

Other Implementations

While our method is based on BigGAN, the following repository combines the U-Net discriminator with StyleGAN2: U-Net StyleGAN-2 (https://github.com/lucidrains/unet-stylegan2). Check it out!

Citation

If you use this work, please cite:

@inproceedings{schonfeld2020u,
  title={A u-net based discriminator for generative adversarial networks},
  author={Schonfeld, Edgar and Schiele, Bernt and Khoreva, Anna},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8207--8216},
  year={2020}
}

License

U-Net GAN PyTorch is open-sourced under the AGPL-3.0 license. See the LICENSE file for details.

For a list of other open source components included in unetgan, see the file 3rd-party-licenses.txt.

Purpose of the project

This software is a research prototype, solely developed for and published as part of the publication. It will neither be maintained nor monitored in any way.

Contact

If you have questions or need help, feel free to write an email to [email protected].

unetgan's People

Contributors

edgarschnfld


unetgan's Issues

About loss function

In the paper, the hinge loss is mentioned as the loss used in the network. So why do you use the BCE loss in the code?

Unable to set up Conda environment on Colab

Hello, I have been trying to implement this project for educational purposes. To make use of the GPUs provided by Colab, I tried to set up the environment there. However, the numpy and numpy-base packages cannot be found.

So I updated the .yml file as follows:
name: unetgan
channels:
- pytorch
- bioconda
- conda-forge
- defaults
dependencies:
- asn1crypto=1.2.0=py36_0
- backcall=0.1.0=py36_0
- backports=1.0=py_2
- backports.shutil_get_terminal_size=1.0.0=py36_2
- blas=1.0=mkl
- blosc=1.12.0=he42ba99_0
- bzip2=1.0.6=h14c3975_5
- ca-certificates=2020.4.5.1=hecc5488_0
- cairo=1.14.12=h77bcde2_0
- certifi=2020.4.5.1=py36h9f0ad1d_0
- cffi=1.12.3=py36h2e261b9_0
- chardet=3.0.4=py36_1003
- cloudpickle=1.2.1=py_0
- cryptography=2.3.1=py36hc365091_0
- cudatoolkit=9.0=h13b8566_0
- cudnn=7.6.0=cuda9.0_0
- curl=7.60.0=h84994c4_0
- cycler=0.10.0=py36_0
- cytoolz=0.10.0=py36h7b6447c_0
- dask-core=2.3.0=py_0
- dbus=1.13.2=hc3f9b76_0
- decorator=4.4.0=py36_1
- dill=0.3.1.1=py36_0
- dominate=2.4.0=py_0
- expat=2.2.6=he6710b0_0
- ffmpeg=4.0=h04d0a96_0
- fontconfig=2.12.6=h49f89f6_0
- freeglut=3.0.0=hf484d3e_5
- freetype=2.8=h52ed37b_0
- get_terminal_size=1.0.0=haa9412d_0
- glib=2.53.6=hc861d11_1
- gmp=6.1.2=hb3b607b_0
- graphite2=1.3.11=h16798f4_2
- gst-plugins-base=1.12.4=h33fb286_0
- gstreamer=1.12.4=hb53b477_0
- h5py=2.8.0=py36h39dcb92_0
- harfbuzz=1.7.6=hc5b324e_0
- hdf5=1.8.18=h525d4c3_0
- icu=58.2=h211956c_0
- idna=2.8=py36_0
- imageio=2.5.0=py36_0
- intel-openmp=2018.0.0=8
- ipython=7.8.0=py36h39e3cac_0
- ipython_genutils=0.2.0=py36_0
- jasper=1.900.1=4
- jbig=2.1=hdba287a_0
- jedi=0.15.1=py36_0
- joblib=0.13.2=py36_0
- jpeg=9b=habf39ab_1
- kiwisolver=1.1.0=py36he6710b0_0
- libcurl=7.60.0=h1ad7b7a_0
- libedit=3.1.20170329=h6b74fdf_2
- libffi=3.2.1=h4deb6c0_3
- libgcc-ng=8.2.0=hdf63c60_1
- libgfortran-ng=7.2.0=hdf63c60_3
- libopencv=3.4.1=h62359dd_1
- libopus=1.3=h7b6447c_0
- libpng=1.6.34=hb9fc6fc_0
- libprotobuf=3.5.2=hd28b015_1
- libsodium=1.0.16=h1bed415_0
- libssh2=1.8.0=h9cfc8f7_4
- libstdcxx-ng=8.2.0=hdf63c60_1
- libtiff=4.0.9=he85c1e1_1
- libtool=2.4.6=h7b6447c_5
- libvpx=1.7.0=h439df22_0
- libxcb=1.13=h1bed415_1
- libxml2=2.9.8=h26e45fe_1
- libxslt=1.1.32=h1312cb7_0
- lmdb=0.9.23=he6710b0_0
- lzo=2.10=h1bfc0ba_1
- matplotlib=2.2.2=py36h0e671d2_1
- mkl=2018.0.3=1
- mkl-service=2.0.2=py36h7b6447c_0
- mkl_fft=1.0.1=py36h3010b51_0
- mkl_random=1.0.1=py36h629b387_0
- mpc=1.0.3=hf803216_4
- mpfr=3.1.5=h12ff648_1
- ncurses=6.1=hf484d3e_0
- networkx=2.3=py_0
- ninja=1.9.0=py36hfd86e86_0
- npyscreen=4.10.5=pyh4bbf42b_2
- numpy=1.14.3=py36hcd700cb_1
- numpy-base=1.14.3=py36h9be14a7_1
- olefile=0.46=py36_0
- opencv=3.4.1=py36h40b0b35_2
- openssl=1.0.2u=h516909a_0
- pandas=0.25.3=py36he6710b0_0
- pandoc=1.19.2.1=hea2e7c5_1
- pango=1.41.0=hd475d92_0
- parso=0.5.1=py_0
- patchelf=0.9=hf79760b_2
- pcre=8.42=h439df22_0
- pexpect=4.7.0=py36_0
- pickleshare=0.7.5=py36_0
- pillow=5.1.0=py36h3deb7b8_0
- pip=19.1.1=py36_0
- pixman=0.38.0=h7b6447c_0
- prompt_toolkit=2.0.9=py36_0
- ptyprocess=0.6.0=py36_0
- py-opencv=3.4.1=py36hf78e8e8_1
- pycparser=2.19=py36_0
- pygments=2.4.2=py_0
- pynvml=8.0.4=py_0
- pyopenssl=19.0.0=py36_0
- pyparsing=2.4.0=py_0
- pyqt=5.9.2=py36h751905a_0
- pysocks=1.7.1=py36_0
- python=3.6.6=h6e4f718_2
- python-dateutil=2.8.0=py36_0
- python-lmdb=0.96=py36he1b5a44_0
- python_abi=3.6=1_cp36m
- pytorch=1.1.0=py3.6_cuda9.0.176_cudnn7.5.1_0
- pytz=2019.1=py_0
- pywavelets=1.0.3=py36hdd07704_1
- qt=5.9.4=h4e5bff0_0
- readline=7.0=h7b6447c_5
- requests=2.22.0=py36_1
- scikit-image=0.15.0=py36he6710b0_0
- scikit-learn=0.21.2=py36hd81dba3_0
- scipy=1.3.0=py36h7c811a0_0
- setuptools=41.0.1=py36_0
- sip=4.19.13=py36he6710b0_0
- six=1.12.0=py36_0
- snappy=1.1.6=h4f0f562_2
- sqlite=3.26.0=h7b6447c_0
- tbb4py=2018.0.5=py36h6bb024c_0
- tk=8.6.8=hbc83047_0
- toolz=0.10.0=py_0
- torchvision=0.3.0=py36_cu9.0.176_1
- tornado=6.0.3=py36h7b6447c_0
- tqdm=4.32.1=py_0
- traitlets=4.3.2=py36_0
- urllib3=1.25.7=py36_0
- wcwidth=0.1.7=py36_0
- wheel=0.33.4=py36_0
- xz=5.2.4=h14c3975_4
- zlib=1.2.11=h7b6447c_3
- zstd=1.3.7=h0b5b093_0
- pip:
  - beautifulsoup4==4.9.0
  - click==7.1.2
  - efficientnet-pytorch==0.6.3
  - future==0.17.1
  - gtts==2.1.1
  - gtts-token==1.1.3
  - hyperopt==0.1.2
  - pymongo==3.9.0
  - soupsieve==2.0
  - torch==1.1.0
prefix: /usr/local/envs/unetgan

After doing this, I thought there should not be any problem, but the following error shows up:
/usr/local/condabin/conda
conda environments:

base /usr/local
unetgan * /usr/local/envs/unetgan

Collecting package metadata (repodata.json): ...working... done
Solving environment: ...working...
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
Solving environment: ...working...
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
# packages in environment at /usr/local/envs/unetgan:

# Name Version Build Channel

UnsatisfiableError: The following specifications were found to be incompatible with each other:

Output in format: Requested package -> Available versions

Package sqlite conflicts for:
mkl_random==1.0.1=py36h629b387_0 -> python[version='>=3.6,<3.7.0a0'] -> sqlite[version='3.13.|3.20.|>=3.24.0,<4.0a0|>=3.25.2,<4.0a0|>=3.25.3,<4.0a0|>=3.26.0,<4.0a0|>=3.28.0,<4.0a0|>=3.30.1,<4.0a0|>=3.32.3,<4.0a0|>=3.33.0,<4.0a0|>=3.34.0,<4.0a0|>=3.31.1,<4.0a0|>=3.29.0,<4.0a0|>=3.23.1,<4.0a0|>=3.22.0,<4.0a0|>=3.20.1,<4.0a0']
pyopenssl==19.0.0=py36_0 -> python[version='>=3.6,<3.7.0a0'] -> sqlite[version='3.13.|3.20.|>=3.24.0,<4.0a0|>=3.25.2,<4.0a0|>=3.25.3,<4.0a0|>=3.26.0,<4.0a0|>=3.28.0,<4.0a0|>=3.30.1,<4.0a0|>=3.32.3,<4.0a0|>=3.33.0,<4.0a0|>=3.34.0,<4.0a0|>=3.31.1,<4.0a0|>=3.29.0,<4.0a0|>=3.23.1,<4.0a0|>=3.22.0,<4.0a0|>=3.20.1,<4.0a0']
backports.shutil_get_terminal_size==1.0.0=py36_2 -> python[version='>=3.6,<3.7.0a0'] -> sqlite[version='3.13.|3.20.|>=3.24.0,<4.0a0|>=3.25.2,<4.0a0|>=3.25.3,<4.0a0|>=3.26.0,<4.0a0|>=3.28.0,<4.0a0|>=3.30.1,<4.0a0|>=3.32.3,<4.0a0|>=3.33.0,<4.0a0|>=3.34.0,<4.0a0|>=3.31.1,<4.0a0|>=3.29.0,<4.0a0|>=3.23.1,<4.0a0|>=3.22.0,<4.0a0|>=3.20.1,<4.0a0']
dill==0.3.1.1=py36_0 -> python[version='>=3.6,<3.7.0a0'] -> sqlite[version='3.13.|3.20.|>=3.24.0,<4.0a0|>=3.25.2,<4.0a0|>=3.25.3,<4.0a0|>=3.26.0,<4.0a0|>=3.28.0,<4.0a0|>=3.30.1,<4.0a0|>=3.32.3,<4.0a0|>=3.33.0,<4.0a0|>=3.34.0,<4.0a0|>=3.31.1,<4.0a0|>=3.29.0,<4.0a0|>=3.23.1,<4.0a0|>=3.22.0,<4.0a0|>=3.20.1,<4.0a0']
prompt_toolkit==2.0.9=py36_0 -> python[version='>=3.6,<3.7.0a0'] -> sqlite[version='3.13.|3.20.|>=3.24.0,<4.0a0|>=3.25.2,<4.0a0|>=3.25.3,<4.0a0|>=3.26.0,<4.0a0|>=3.28.0,<4.0a0|>=3.30.1,<4.0a0|>=3.32.3,<4.0a0|>=3.33.0,<4.0a0|>=3.34.0,<4.0a0|>=3.31.1,<4.0a0|>=3.29.0,<4.0a0|>=3.23.1,<4.0a0|>=3.22.0,<4.0a0|>=3.20.1,<4.0a0']
. . .
. . .
(The output is too long and probably mentions every package noted in the .yml file above.)

conda --version
python --version
#conda 4.9.2
#Python 3.6.12 :: Anaconda, Inc
sys.path
['',
'/env/python',
'/usr/lib/python36.zip',
'/usr/lib/python3.6',
'/usr/lib/python3.6/lib-dynload',
'/usr/lib/python3/dist-packages',
'/usr/local/lib/python3.6/dist-packages/IPython/extensions',
'/root/.ipython',
'/usr/local/lib/python3.6/site-packages'
]
# I even tried removing dist-packages to make it work.

Code to create environment:
conda create -p /usr/local/envs/unetgan --yes
conda env update --file unetgan.yml

Please help.

Training with COCO-Animals dataset

Is the COCO-Animals dataset you trained with available for download?

First, the classes do not line up:
The classes in your GitHub code are ['bird','cat','dog','horse','sheep','cow','elephant','monkey','zebra','giraffe']

I've downloaded the full COCO dataset from https://cocodataset.org/, filtered it for animals, and verified the result using the COCO image explorer. Note that the paper cites the entire COCO dataset in the references but refers to COCO-Animals:
['bird','cat','dog','horse','sheep','cow','elephant','bear','zebra','giraffe'] # bear instead of monkey

I also downloaded the following dataset from http://cs231n.stanford.edu/coco-animals.zip:
['bird','cat','dog','horse','sheep','bear','zebra','giraffe'] # no cow, no elephant, bear instead of monkey

Second, in PytorchDatasets.py in the class CocoAnimals(VisionDataset) you are using the following pickle files. Are they available somewhere?

        with open(os.path.join(root, "merged_bbdict_v2.p"), "rb") as h:
            self.bbox = pickle.load(h)

        with open(os.path.join(root, "coco_ann_dict.p"), "rb") as h:
            self.mask_coco = pickle.load(h)

Any help would be appreciated.

Thanks,
Jay Urbain

GAN training is a little unstable.

Hey there,
Awesome work, the results on FFHQ 256x256 are stunning. I was able to get this code to converge on FFHQ to a quality similar to that shown in the paper (after 3 or so random seeds that diverged). On the other hand, I have not had luck with other datasets, such as LSUN tower/cat. I tried "warm-starting" the GAN with the FFHQ pre-trained model (a trick that normally works pretty well), which at least generated recognizable images, but it would not improve after about 50k iterations.
I'd love to use this method as a competitor/alternative to StyleGAN2, but I'm afraid it is too fickle at the moment. Any tricks I'm missing?

My Docker image is pytorch/pytorch latest and I use the FFHQ training script in training_scripts. I'm using single-GPU training on a 1080 Ti (with 11 GB of GPU RAM), so I've set the batch size to 4. Perhaps this is the source of the instability.

about generator

If this structure is added to the generator, will it have a good effect? Is there any ablation experiment in this regard?

I'm confused about why the model needs Consistency Regularization.

Hi! Thanks for your great work; I want to try to use it in my own work!
But I do not understand why the model needs Consistency Regularization, although I have read the paper carefully.
For example, the paper states: "The per-pixel decision of the well-trained D discriminator should be equivariant under any class-domain-altering transformations of images."
What is the meaning of "any class-domain-altering transformations of images"?
That is, I do not know what the problems are, and what causes them, if Consistency Regularization is not used.

Implementation of UNet in BiGAN like Architecture

Hi, thanks for the work.
I wanted to know whether we can integrate the U-Net-style architecture with a BiGAN-like design. I want to train BiGAN on the COCO dataset, and it seems that using a local mapping, as is done in the U-Net, makes more sense than using only the global latent representation.

Can you give some pointers on integrating the U-Net with BiGAN, where I want my inference network to learn representations for downstream tasks, as in BigBiGAN?

It seems it would be a good experiment to try.
