
began's Introduction

BEGAN: Boundary Equilibrium Generative Adversarial Networks

This is an implementation of the paper on Boundary Equilibrium Generative Adversarial Networks (Berthelot, Schumm and Metz, 2017).

Dependencies

  • Python 3+
  • numpy
  • Tensorflow
  • tqdm
  • h5py
  • scipy (optional)

What are Boundary Equilibrium Generative Adversarial Networks?

Unlike standard generative adversarial networks (Goodfellow et al. 2014), boundary equilibrium generative adversarial networks (BEGAN) use an auto-encoder as the discriminator. An auto-encoder loss is defined, and an approximation of the Wasserstein distance is then computed between the pixelwise auto-encoder loss distributions of real and generated samples.
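
The pixelwise auto-encoder loss from the paper is

L(v) = |v − D(v)|^η,   η ∈ {1, 2}

where D is the discriminating auto-encoder and v is an image (either a real sample x or a generated sample G(z)).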

With the auto-encoder loss defined (above), the Wasserstein distance approximation simplifies to a loss function wherein the discriminating auto-encoder aims to perform well on real samples and poorly on generated samples, while the generator aims to produce adversarial samples which the discriminator can't help but perform well upon.

Additionally, a hyper-parameter gamma is introduced which gives the user the power to control sample diversity by balancing the discriminator and generator.
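
Concretely, the paper defines gamma as the diversity ratio

γ = E[L(G(z))] / E[L(x)],   γ ∈ [0, 1]

where lower values of gamma lead to lower image diversity, because the discriminator then focuses more heavily on auto-encoding real images.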

Gamma is put into effect through the use of a weighting parameter k which gets updated while training to adapt the loss function so that our output matches the desired diversity. The overall objective for the network is then:
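
L_D = L(x) − k_t · L(G(z_D))
L_G = L(G(z_G))
k_{t+1} = k_t + λ_k · (γ · L(x) − L(G(z_G)))

where t indexes the training step and λ_k is the learning rate for k (the paper uses λ_k = 0.001 and initialises k_0 = 0).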

Unlike most generative adversarial network architectures, where we need to update G and D independently, the Boundary Equilibrium GAN has the nice property that we can define a global loss and train the network as a whole (though we still have to make sure to update the parameters of each network with respect to its respective loss function).
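
As an illustration only (this is not the code in main.py), here is a minimal TensorFlow 1.x sketch of that structure, with a single dummy trainable variable standing in for each network so the snippet runs end to end:

import tensorflow as tf

# Illustrative sketch only -- not the graph built in this repository.
with tf.variable_scope('discriminator'):
    d_theta = tf.get_variable('theta', initializer=0.1)
with tf.variable_scope('generator'):
    g_theta = tf.get_variable('theta', initializer=0.1)

# Stand-ins for the pixelwise auto-encoder losses L(x) and L(G(z)).
L_real = tf.abs(d_theta)
L_fake = tf.abs(d_theta * g_theta)

gamma, lambda_k = 0.5, 0.001           # typical values from the paper
k = tf.Variable(0.0, trainable=False)  # control variable k_t

D_loss = L_real - k * L_fake           # discriminator objective
G_loss = L_fake                        # generator objective

d_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'discriminator')
g_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, 'generator')

# One global train_op, but each loss only updates its own network's variables.
d_step = tf.train.AdamOptimizer(1e-4).minimize(D_loss, var_list=d_vars)
g_step = tf.train.AdamOptimizer(1e-4).minimize(G_loss, var_list=g_vars)
k_step = tf.assign(k, tf.clip_by_value(
    k + lambda_k * (gamma * L_real - L_fake), 0.0, 1.0))
train_op = tf.group(d_step, g_step, k_step)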

The final contribution of the paper is a derived convergence measure M, which gives a good indication of how the network is doing. We use this measure to track performance, as well as to control the learning rate.
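
As defined in the paper,

M_global = L(x) + |γ · L(x) − L(G(z_G))|

so M is small when real samples are reconstructed well and the equilibrium γ · L(x) = L(G(z_G)) approximately holds.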

The overall result is a surprisingly effective model which produces samples well beyond the previous state of the art.

128x128 samples generated from random points in Z, from (Berthelot, Schumm and Metz, 2017).

Usage

Data Preprocessing

You might want to use the CelebA dataset (Liu et al. 2015), which can be downloaded from the project website; make sure to download the 'Aligned and Cropped' version. You can, however, adapt these instructions to use an alternative dataset.

(Note: if the CelebA Dropbox is down you can alternatively use their Google Drive).

The images then need to be packed into an HDF5 file, which can be done as follows:

from glob import glob
import os

import numpy as np
import h5py
from tqdm import tqdm
from scipy.misc import imread, imresize  # note: removed in newer SciPy versions

filenames = glob(os.path.join("img_align_celeba", "*.jpg"))
filenames = np.sort(filenames)
w, h = 64, 64  # Change this if you wish to use larger images
data = np.zeros((len(filenames), w * h * 3), dtype=np.uint8)

# This preprocessing is appropriate for CelebA but should be adapted
# (or removed entirely) for other datasets.

def get_image(image_path, w=64, h=64):
    # Resize to width w (preserving aspect ratio), then centre-crop to h rows.
    im = imread(image_path).astype(np.float)
    orig_h, orig_w = im.shape[:2]
    new_h = int(orig_h * w / orig_w)
    im = imresize(im, (new_h, w))
    margin = int(round((new_h - h) / 2))
    return im[margin:margin + h]

for n, fname in tqdm(enumerate(filenames), total=len(filenames)):
    image = get_image(fname, w, h)
    data[n] = image.flatten()

os.makedirs('datasets', exist_ok=True)  # make sure the output directory exists
with h5py.File('datasets/celeba.h5', 'w') as f:
    f.create_dataset("images", data=data)
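
As an optional sanity check (this snippet is just illustrative, not part of the repository), you can re-open the file and inspect the stored array:

import h5py

# Re-open the dataset and look at the first image.
with h5py.File('datasets/celeba.h5', 'r') as f:
    images = f['images']
    print(images.shape)                           # expect (num_images, 64 * 64 * 3)
    first = images[0].reshape(64, 64, 3)
    print(first.dtype, first.min(), first.max())  # uint8 values in [0, 255]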

Training

After your dataset has been created through the method above, change the file config.py to point to your dataset and to your desired checkpoint directory.

E.g., if your dataset is stored at /home/user/data/dataset.hdf5, then alter config.py to read:

dataset_path = '/home/user/data/dataset.hdf5'
checkpoint_path = './checkpoints'

You can then begin training:

python main.py --start-epoch=0 --add-epochs=100 --save-every 5

If you have limited RAM you might need to limit the number of images loaded into memory at once, e.g.

python main.py --start-epoch=0 --add-epochs=100 --save-every 5 --max-images 20000

I have 12 GB of RAM, which works for around 60,000 images.

You can specify GPU id with the --gpuid argument. If you want to run on CPU (not recommended!) use --gpuid -1

Other parameters can be tuned if you wish (run python main.py --help for the full list). The default values are the same as in the paper (though the authors point out that their choices aren't necessarily optimal).

The main difference between this implementation's defaults and the original paper is the use of batch normalisation: we found that not using batch normalisation made training much slower.

Running

After you've trained a model and want to generate some samples, simply run

python main.py --start-epoch=N --add-epochs=0 --train=0

where N is the epoch number of the checkpoint you want to load. Samples will be saved to ./outputs/ by default (pass the optional --outdir argument for an alternative location).

Tracking Progress

As discussed previously, the convergence measure gives a very nice way of tracking progress. This is implemented in the code via the dictionary loss_tracker, under the key convergence_measure.
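
For example, if you grab the loss_tracker dictionary at the end of training (e.g. by returning it or dumping it to disk), the measure can be plotted with matplotlib; the values below are just placeholders:

import matplotlib.pyplot as plt

# Stand-in for the dictionary built during training.
loss_tracker = {'convergence_measure': [0.9, 0.75, 0.62, 0.55, 0.51]}

plt.plot(loss_tracker['convergence_measure'])
plt.xlabel('training step')
plt.ylabel('convergence measure M')
plt.show()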

Berthelot, Schumm and Metz show that it is a true-to-reality metric to use:

Convergence measure over training epochs, with generator outputs showed above (Berthelot, Schumm and Metz, 2017).

Issues / Contributing / Todo

Feel free to raise any issues in the project issue tracker, or make a pull-request if there is something you want to add.

My next plan is to upload some pre-trained weights so beginners can run the model out-of-the-box.

References

  • Berthelot, D., Schumm, T. and Metz, L. (2017). BEGAN: Boundary Equilibrium Generative Adversarial Networks. arXiv:1703.10717.
  • Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. and Bengio, Y. (2014). Generative Adversarial Nets. NIPS.
  • Liu, Z., Luo, P., Wang, X. and Tang, X. (2015). Deep Learning Face Attributes in the Wild. ICCV.

began's People

Contributors

akash9182, artcg


began's Issues

How many epochs are needed for 64x64 images?

I am able to run the program without errors, and the convergence measure is displayed, but nothing is plotted. Please help me get further!
image size = 64;
total images = 20000;
epochs = 1000;
checkpoint: 975;
number of images (at once): 1000;
other parameters are unchanged.
(attached: figure_1, figure_2)
Please help!

problem downloading the dataset

Hello,

UPDATE: trying the "Baidu Drive" option.
UPDATE 2: downloading big files from Baidu Drive seems to require installing their executable, which is not an option for me. Any ideas, anyone?

This is not a problem with the code, just a problem downloading the CelebA dataset:
http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
Dropbox doesn't like all the traffic it generated and has blocked access to the zip files.

Does anyone here know of a way to still get the dataset? Any mirror or copy of the file anywhere?

different size images

Unless I'm misunderstanding something, it looks like there are some 64 px settings hard-coded into the model, e.g. `grep -n 64 *.py`:

config.py:2:checkpoint_prefix = 'BEGAN_64_64'
discriminator.py:23:        D_I: a batch of images [batch_size, 64 x 64 x 3]
discriminator.py:30:            batch_size * 64 * 64 * 3
discriminator.py:39:                   .reshape([-1, 64, 64, 3]))  # '-1' is batch size
discriminator.py:136:                          .apply(tf.image.resize_nearest_neighbor, [64, 64]))
generator.py:31:            batch_size * 64 * 64 * 3
generator.py:83:                   .apply(tf.image.resize_nearest_neighbor, [64, 64]))
main.py:90:                hidden_size=2048, dim=(64, 64, 3), gpu_id='/gpu:0',
main.py:171:                  ('Generated 64x64 samples.', 'Random training images.'),
main.py:273:            im_to_save = im[n].reshape([64, 64, 3])

I'd like to use 128x128 px images -- what do you think would be the best way to do that? Could I just replace all the 64s with 128s? Or would another layer be necessary?

bug? test time: only constant gray images

Hi Artur,

there were some minor fixes necessary:

add-epochs=0 -> --add-epochs=0
--train=False -> --train=0
also save the images as png, not jpg

But even then the images were constant grey, as
plt.imshow(im_to_save)
plt.show()
could verify.

Values were all around 0.49...0.50; normalizing them to 0...1 gives only colour clouds.

Typo in execution method

If you have limited RAM you might need to limit the number of images loaded into memory at once, e.g.

`python main.py --start-epoch=0, add-epochs=100 --save-every 5 --max-images 20000`

it should be,

`python main.py --start-epoch=0 --add-epochs=100 --save-every 5 --max-images 20000`

monitoring models during training

I added a couple of lines to monitor the convergence of the model during training:

logging_data = np.array([loss_tracker['generator'], loss_tracker['discriminator'], loss_tracker['convergence_measure']])
logging_data = logging_data.T
np.savetxt("convergence_measure.txt", logging_data, fmt='%.15e', header="            generator         discriminator   convergence_measure", comments='')

and I see what I'd more or less expect in the generator and discriminator columns, but the convergence_measure is always all zeros.

I see that you're appending zeros to the convergence_measure in the loss tracker in lines 158-160:

            loss_tracker['generator'].append(G_loss_)
            loss_tracker['discriminator'].append(D_loss_)
            loss_tracker['convergence_measure'].append(0)

Why is that? Is there a (better) way to access this info?

TF model doesn't stay in memory with --save-every > 1

Hi, I'm running into a problem where training fails after the first epoch (or checkpoint, I'm not quite sure) with this message:

https://gist.github.com/donovanr/f1e728814776e3b26a70ad545093998c

It looks like maybe the file ./checkpoints/BEGAN_64_64_0001.tfmod isn't being found or created properly.

I'm running with: python main.py --start-epoch 0 --add-epochs 500 --save-every 5 in a slightly modified tensorflow docker container, if that's useful information.

Any suggestions?

Error generating images

Hi,
Thanks for this code.
I was successful in training the model, but when I tried to make some images with command

python main.py --start-epoch=96 --add-epochs=0 --train=0

I got an error message

File "main.py", line 257, in
start_learn_rate=args.start_learn_rate)
File "main.py", line 173, in began_train
batch = dataIterator([images], batch_size).next()
UnboundLocalError: local variable 'images' referenced before assignment

Could you help me, what should I do to get the image generation working, thanks

data processing

When I run the data-processing code, I get OSError: Unable to create file (unable to open file: name = 'datasets/celeba.h5', errno = 13, error message = 'Permission denied', flags = 13, o_flags = 302). How can I solve this?

How to Reduce The Training Time?

Hi,

Thank you for sharing the code. I have run it successfully. However, since I am running on CPU, I really need to reduce the training time; all I need is to reach convergence for evaluation.

I have tried reducing 'num_image' (in main.py) to only 100, and also deleting images manually, but neither method works well: each epoch still takes around one hour. Is this normal, or am I misunderstanding something? Is there any way to reduce the training time on CPU?

Thank you!

How to set N? Is N a number (0, 1, ...)?

Thanks, author!
I am having some issues: I trained a model and now want to generate some samples.
When I execute "python main.py --start-epoch=N add-epochs=0 --train=False", some errors appear.
Can you share the exact command line to run?
I am a little confused about N, the checkpoint: is it a number, or the produced model file? I have produced model files, e.g. BEGAN_64_64_0005.tfmod.data-00000-of-00001.
Can you give some guidance?
Thanks again!

generated images quality?

Thanks for sharing!

In the images that you generate, are you seeing similar quality to the paper?
(From what I understand, in the paper they train on a larger dataset.)

typo in prepare_celeba.py

on README.md :
def get_image(image_path, w=64, h=64):

on datasets/prepare_celeba.py :
def get_image(image_path):

This causes an error:
TypeError: get_image() takes 1 positional argument but 3 were given

datasets/celeba.h5

Hi, thank you for sharing.
I can't find celeba.h5; the error message is 'no such file or directory: datasets/celeba.h5'.

Practical Application?

Hi! I'm very interested in your project, I'm currently working on an open source project that should be able to interpolate the face (with expression) in video frames of person a into something that looks more like person b.

What I have so far:

Dataset Generation

  • Extracting frames from many youtube videos of person a or person b
  • Running facial recognition and alignment to save every usable training region of interest, sorted by person, into personA and personB folders

Translation Utility

  • Provide low fps video of person a, for each frame extract the region of interest (face)
  • Run it through some kind of image translation (I would like to use your project for that)
  • Then save the result back into the original region of interest. (possibly add color correction, etc of modified roi with pix2pix)

What I want to use your project for:

Image Translation
personA->personB conversion (or at least a high similarity, like in the male->female projects) of regions of interest in individual videoframes

What I don't understand

How to use your project for this purpose ;) Can you point me in a direction, so that I can set up your project for this specific application?

Training error

When I run with --start-epoch=0, I get: ValueError: 'a' must be greater than 0 unless no samples are taken

Bug suspicion on the discriminator

Just reading the code and I think I found something strange:

discriminator.py +62:
I read:
layer_3 = custom_conv2d(conv_2, 3 * num_filters, k_h=3, k_w=3, d_h=2, d_w=2, scope='el3')

Shouldn't it be:
layer_3 = custom_conv2d(conv_4, 3 * num_filters, k_h=3, k_w=3, d_h=2, d_w=2, scope='el3')
