
fauxtograph's Introduction

Changelog

v1.0.x

  • Now has three model classes available (VAE, GAN, VAEGAN).

  • All models have both convolutional and linear architecture modes.

  • Python 3 compatibility.

  • Updated to use Chainer 1.6.0.

  • Outputs intermediate generated images during training so that users can inspect progress when running in a Jupyter notebook.

fauxtograph

This package contains classes for training three different unsupervised, generative image models: Variational Auto-encoders (VAE), Generative Adversarial Networks (GAN), and the newly developed combination of the two (VAE/GAN). Descriptions of the inner workings of these algorithms can be found in

  1. Kingma, Diederik P., and Max Welling. "Auto-Encoding Variational Bayes." arXiv preprint arXiv:1312.6114 (2013).
  2. Radford, Alec, et al. "Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks." arXiv preprint arXiv:1511.06434 (2015).
  3. Larsen, Anders Boesen Lindbo, et al. "Autoencoding Beyond Pixels Using a Learned Similarity Metric." arXiv preprint arXiv:1512.09300 (2015).

respectively.

All models take in a series of images and can be trained to perform either an encoding transform step or a generative inverse_transform step (or both). The package is built on top of the Chainer framework and has an easy-to-use command-line interface for training and generating images with a Variational Auto-encoder.
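For readers who want the Python API rather than the CLI, here is a minimal hedged sketch. The class and method names (VAE, load_images, fit, transform, inverse_transform, flag_gpu) all appear elsewhere on this page, but the exact signatures may differ between versions:

    import glob
    from fauxtograph import VAE

    paths = glob.glob('./images/*.jpg')
    vae = VAE(flag_gpu=False)           # flag_gpu appears in the code excerpts below
    x_all = vae.load_images(paths)      # load the image files into one array
    vae.fit(x_all, n_epochs=200)        # train; 200 epochs is the default seen in the logs below
    z = vae.transform(x_all[:10])       # encode images into latent vectors
    recon = vae.inverse_transform(z)    # decode latent vectors back into images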

Both the module and the training script are available by installing this package through PyPI. Otherwise, the module containing the main classes that do the heavy lifting is in fauxtograph/fauxtograph.py (with dependencies in fauxtograph/vaegan.py), while the training/generation CLI script is in fauxtograph/fauxto.py.

To learn more about the command line tool functionality and to get a better sense of how one might use it, please see the blog post on the Stitch Fix tech blog, multithreaded.

Installation

The simplest way to use the module is to install it via pip:

$ pip install fauxtograph

This should additionally grab all necessary dependencies, including the main backend NN framework, Chainer. However, if you plan on using CUDA to train models on a GPU, you'll need to additionally install the Chainer CUDA dependencies with

$ pip install chainer-cuda-deps

Usage

To get started, you can either supply your own image set or use the downloading tool to grab some of the Hubble/ESA space images, which I've found make for interesting results.

To grab the images and place them in an images folder run

$ fauxtograph download ./images

This process can take some time depending on your internet connection.

Then you can train a model and output it to disk with

$ fauxtograph train --kl_ratio 0.005 ./images ./models/model_name 

Finally, you can generate new images based on your trained model with

$ fauxtograph generate ./models/model_name_model.h5 ./models/model_name_opt.h5 ./models/model_name_meta.json ./generated_images_folder

Each command comes with a --help option to see possible optional arguments.

Tips

Using the CLI

  • To get the best results for generated images, you'll need either a rather large number of images (say, on the order of several hundred thousand or more) or images that are all quite similar with minimal backgrounds.

  • As the model trains, you should see the per-batch averages of both the KL divergence and the reconstruction loss. You may wish to adjust the ratio of these two terms with the --kl_ratio option to get better performance, should you find that the learning rate is driving one term or the other to zero too quickly (or too slowly); see the objective sketch after this list.

  • If you have a CUDA-capable NVIDIA GPU, use it. The model can train over 10 times faster by taking advantage of GPU processing.

  • Sometimes you will want to brighten your images when saving them, which can be done with the --image_multiplier argument.

  • If you manage to train a particularly interesting model and generate some neat images, then we'd like to see them. Use #fauxtograph if you decide to put them up on social media.
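As a rough sketch of the objective being balanced here (assuming, from the option name alone, that --kl_ratio simply scales the KL term against the reconstruction term):

\[
\mathcal{L} = \mathcal{L}_{\mathrm{rec}} + \texttt{kl\_ratio} \cdot D_{\mathrm{KL}}\!\left(q(z \mid x)\;\|\;\mathcal{N}(0, I)\right)
\]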

Generally

  • GAN and VAEGAN models are highly sensitive to the relative learning rates of their subnetworks, particularly the learning rate of the generator relative to the discriminator. If you notice highly oscillatory behavior in your training losses, it may help to turn down the Adam alpha and beta1 parameters of one network (usually the discriminator) so that the two train at a similar rate; a sketch follows below.
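A minimal sketch of that adjustment with the Chainer 1.6-era optimizer API (the specific alpha/beta1 values here are illustrative assumptions, not recommendations):

    from chainer import optimizers

    # Slow the discriminator's Adam updates relative to the generator's.
    opt_gen = optimizers.Adam(alpha=2e-4, beta1=0.5)
    opt_disc = optimizers.Adam(alpha=1e-5, beta1=0.1)  # smaller alpha/beta1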

ENJOY

fauxtograph's People

Contributors

cemoody, dribnet, genekogan, tanelp, tjtorres


fauxtograph's Issues

pip installation on py3 fails

Looks like the BeautifulSoup dependency does not work on py3. Is there a reason for this specific BS version as opposed to bs4?

pip install fauxtograph
Collecting fauxtograph
  Downloading fauxtograph-1.0.3.tar.gz
Collecting chainer==1.6.0 (from fauxtograph)
  Downloading chainer-1.6.0.tar.gz (904kB)
    100% |████████████████████████████████| 905kB 273kB/s
Requirement already satisfied (use --upgrade to upgrade): pillow in ./anaconda/envs/py34/lib/python3.4/site-packages (from fauxtograph)
Collecting joblib (from fauxtograph)
  Downloading joblib-0.9.4-py2.py3-none-any.whl (112kB)
    100% |████████████████████████████████| 114kB 1.7MB/s
Collecting tqdm (from fauxtograph)
  Downloading tqdm-3.8.0-py2.py3-none-any.whl
Collecting BeautifulSoup (from fauxtograph)
  Downloading BeautifulSoup-3.2.1.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 20, in <module>
      File "/private/var/folders/b1/jbv4n2f56mz8m3hxkdtc51cr0000gn/T/pip-build-9v2hhigl/BeautifulSoup/setup.py", line 22
        print "Unit tests have failed!"
                                      ^
    SyntaxError: Missing parentheses in call to 'print'
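For reference, a hedged sketch of the dependency swap the report is suggesting; the requirement list is read off the pip output above, and beautifulsoup4 is the Python 3 compatible successor to BeautifulSoup 3:

    # setup.py (sketch)
    install_requires = [
        'chainer==1.6.0',
        'pillow',
        'joblib',
        'tqdm',
        'beautifulsoup4',  # instead of 'BeautifulSoup' (BS3, Python 2 only)
    ]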

Parameter set for training over Hubble dataset

Saw your Deep Style presentation and the images generated by the model trained on the Hubble dataset. I was wondering what parameter set you used to train on the Hubble dataset. I used the default parameters and only got black images generated at the end! I also played with the image_multiplier argument and got a bunch of similar black images with a gray (lighter) circle in the center.
Any tips here?

Problem with VAE_GAN.ipynb

Hi, @jwhitmire @billeisenhauer @bjallen

When I tried the .ipynb on my dataset, I got the following error:

nvcc warning : The 'compute_20', 'sm_20', and 'sm_21' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc fatal : Could not open output file '/tmp/tmpxft_00004d81_00000000'

Traceback (most recent call last):
  File "see.py", line 17, in <module>
    vg.fit(x_all, save_freq=2, pic_freq=30, n_epochs=4, model_path=m_path, img_path=im_path)  #, mirroring=True)#, flag_gpu=False)
  File "/usr/local/lib/python2.7/dist-packages/fauxtograph/fauxtograph.py", line 1342, in fit
    kl_loss, dif_l, disc_rec, disc_batch, disc_samp = self._forward(x_batch)
  File "/usr/local/lib/python2.7/dist-packages/fauxtograph/fauxtograph.py", line 1099, in _forward
    encoded, means, ln_vars = self._encode(batch, test=test)
  File "/usr/local/lib/python2.7/dist-packages/fauxtograph/fauxtograph.py", line 1082, in _encode
    x = self.enc(data, test=test)
  File "/usr/local/lib/python2.7/dist-packages/fauxtograph/vaegan.py", line 127, in __call__
    batch = self.bn1(batch, test=test)
  File "/usr/local/lib/python2.7/dist-packages/chainer/links/normalization/batch_normalization.py", line 93, in __call__
    ret = func(x, self.gamma, self.beta)
  File "/usr/local/lib/python2.7/dist-packages/chainer/function.py", line 105, in __call__
    outputs = self.forward(in_data)
  File "/usr/local/lib/python2.7/dist-packages/chainer/functions/normalization/batch_normalization.py", line 53, in forward
    mean = x.mean(axis=axis)
  File "cupy/core/core.pyx", line 679, in cupy.core.core.ndarray.mean (cupy/core/core.cpp:13034)
  File "cupy/core/core.pyx", line 687, in cupy.core.core.ndarray.mean (cupy/core/core.cpp:12899)
  File "cupy/core/reduction.pxi", line 252, in cupy.core.core.simple_reduction_function.__call__ (cupy/core/core.cpp:44135)
  File "cupy/util.pyx", line 36, in cupy.util.memoize.decorator.ret (cupy/util.cpp:1194)
  File "cupy/core/reduction.pxi", line 181, in cupy.core.core._get_simple_reduction_function (cupy/core/core.cpp:42698)
  File "cupy/core/reduction.pxi", line 108, in cupy.core.core._get_simple_reduction_kernel (cupy/core/core.cpp:40479)
  File "cupy/core/carray.pxi", line 87, in cupy.core.core.compile_with_cache (cupy/core/core.cpp:26123)
  File "/usr/local/lib/python2.7/dist-packages/cupy/cuda/compiler.py", line 112, in compile_with_cache
    base = _empty_file_preprocess_cache[env] = preprocess('', options)
  File "/usr/local/lib/python2.7/dist-packages/cupy/cuda/compiler.py", line 75, in preprocess
    pp_src = _run_nvcc(cmd, root_dir)
  File "/usr/local/lib/python2.7/dist-packages/cupy/cuda/compiler.py", line 37, in _run_nvcc
    return subprocess.check_output(cmd, cwd=cwd)
  File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['nvcc', '--preprocess', '/var/tmp/tmphEoIKW/kern.cu']' returned non-zero exit status 1

Please help..

Error during training on Hubble dataset: could not broadcast input array from shape (300,300,3) into shape (300)

Running on Mac for Hubble dataset:
$ fauxtograph train ./images ./models/model_name
Traceback (most recent call last):
  File "/usr/local/bin/fauxtograph", line 9, in <module>
    load_entry_point('fauxtograph==1.0.2', 'console_scripts', 'fauxtograph')()
  File "/Library/Python/2.7/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/Library/Python/2.7/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Library/Python/2.7/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Library/Python/2.7/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/fauxtograph/fauxto.py", line 122, in train
    x_all = vae.load_images(file_paths)
  File "/Library/Python/2.7/site-packages/fauxtograph/fauxtograph.py", line 239, in load_images
    x_all = np.array([read(fname) for fname in tqdm.tqdm(filepaths)])
ValueError: could not broadcast input array from shape (300,300,3) into shape (300)
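A hedged note on a likely cause (inferred from the error, not from the fauxtograph source): np.array() can only stack images of identical shape, so a single grayscale or oddly sized file in the folder produces exactly this broadcast failure. Pre-converting everything to RGB at one size sidesteps it; read_rgb below is a hypothetical helper, not part of fauxtograph:

    import glob
    import numpy as np
    from PIL import Image

    def read_rgb(fname, size=(300, 300)):
        # Force three channels and a common size so np.array can stack them.
        return np.asarray(Image.open(fname).convert('RGB').resize(size))

    x_all = np.array([read_rgb(f) for f in glob.glob('./images/*')])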

A runtime error with PNG images.

I don't know where this error is coming from. I have a tar.gz file here of a few images that I was hoping to train with your program: https://www.dropbox.com/s/zfekwp1cspfj9r2/images.tar.gz?dl=0

$ fauxtograph train ./images ./models/model_name
Loading Image Files...
Image Files Loaded!

 Training for 200 epochs.

epoch: 1
  0%|                                          | 0/1 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/Users/jonathan/anaconda/bin/fauxtograph", line 11, in <module>
    sys.exit(fauxtograph())
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/click/core.py", line 700, in __call__
    return self.main(*args, **kwargs)
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/click/core.py", line 680, in main
    rv = self.invoke(ctx)
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/click/core.py", line 1027, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/click/core.py", line 873, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/click/core.py", line 508, in invoke
    return callback(*args, **kwargs)
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/fauxtograph/fauxto.py", line 116, in train
    iae.fit(batch_size=batch, n_epochs=epoch)
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/fauxtograph/fauxtograph.py", line 269, in fit
    r_loss, kl_div, out = self._forward(x_batch)
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/fauxtograph/fauxtograph.py", line 176, in _forward
    encoded = self._encode(batch)
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/fauxtograph/fauxtograph.py", line 162, in _encode
    batch = F.relu(getattr(self.model, 'encode_%i' % i)(batch))
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/chainer/function.py", line 164, in __call__
    self._check_data_type_forward(in_data)
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/chainer/function.py", line 191, in _check_data_type_forward
    self.check_type_forward(in_type)
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/chainer/functions/linear.py", line 105, in check_type_forward
    type_check.Variable(self.W.shape[1], 'W.shape[1]')),
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/chainer/utils/type_check.py", line 457, in expect
    expr.expect()
  File "/Users/jonathan/anaconda/lib/python2.7/site-packages/chainer/utils/type_check.py", line 428, in expect
    '{0} {1} {2}'.format(left, self.inv, right))
chainer.utils.type_check.InvalidType: Expect: prod(in_types[0].shape[1:]) == W.shape[1]
Actual: 30000 != 22500
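A hedged reading of the mismatch (an inference, not a confirmed diagnosis): 30000/22500 = 4/3, which is consistent with RGBA PNGs (four channels) being flattened against a weight matrix sized for RGB (three channels). Stripping the alpha channel before training would avoid it:

    from PIL import Image

    # Hypothetical preprocessing step: drop the alpha channel from each PNG.
    img = Image.open('example.png').convert('RGB')  # RGBA -> RGB
    img.save('example_rgb.png')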

sample z from VAE_GAN

Hi,
I'm a bit confused about the latent variable z.

According to the code in class VAEGAN,

    def _forward(self, batch, test=False):

        encoded, means, ln_vars = self._encode(batch, test=test)
        rec = self._decode(encoded, test=test)
        normer = reduce(lambda x, y: x*y, means.data.shape)
        kl_loss = F.gaussian_kl_divergence(means, ln_vars)/normer

        samp_p = np.random.standard_normal(means.data.shape).astype('float32')
        z_p = chainer.Variable(samp_p)

This means that z is first sampled, VAE-style, from:

encoded, means, ln_vars = self._encode(batch, test=test)

    def encode(self, data, test=False):
        x = self.enc(data, test=test)
        mean, ln_var = F.split_axis(x, 2, 1)
        samp = np.random.standard_normal(mean.data.shape).astype('float32')
        samp = Variable(samp)
        if self.flag_gpu:
            samp.to_gpu()
        z = samp * F.exp(0.5*ln_var) + mean

which means that the sampled z satisfies

    z = mean + exp(0.5 * ln_var) * eps,   eps ~ N(0, I),   i.e.   z ~ N(mean, exp(ln_var))

then, the sampled z is input to the decoder network to generate images, according to

rec = self._decode(encoded, test=test)

then, another z is sampled by:

        samp_p = np.random.standard_normal(means.data.shape).astype('float32')
        z_p = chainer.Variable(samp_p)

which means it satisfies

    z_p ~ N(0, I)

and the generated images are created by:

rec_p = self._decode(z_p)

My question: since the two types of z are sampled from different distributions,
why are they fed into the same decoder network?
And which decoder output counts as the generated images, since both are created by sampling z from a distribution?

This is just my understanding.
Please feel free to correct me if I've made a mistake.
Thank you.
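For context, a hedged sketch of why both samples pass through one decoder, following the VAE/GAN paper cited above (reference 3): the decoder doubles as the GAN generator, so training feeds it both posterior samples (for reconstructions) and prior samples (as the discriminator's fake examples), while generation afterwards draws only from the prior:

\[
x_{\mathrm{rec}} = \mathrm{dec}(z), \quad z \sim q(z \mid x); \qquad
x_{p} = \mathrm{dec}(z_{p}), \quad z_{p} \sim \mathcal{N}(0, I)
\]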

Automatic saving does not save meta data

The call to self.save in line #347 of fauxtograph.py always sends save_meta as False.

It appears that save_counter should be incremented only when epoch % save_freq == 0; a sketch follows.
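Something along these lines, perhaps (a sketch only; the surrounding loop is paraphrased and self.save's exact signature is assumed):

    # Only save, and only advance the counter, on epochs that hit save_freq.
    if epoch % save_freq == 0:
        self.save(model_path, save_meta=True)  # currently always sent as False
        save_counter += 1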

The same issue is present in GAN and VAEGAN.

different shape/size = black

I'm testing out the --shape param, attempting to train/generate higher-resolution images.
It seems not to like square formats, unfortunately resulting in black images.
I'm also getting a bunch of these errors: "RuntimeWarning: overflow encountered in exp" in self.y = 1 / (1 + numpy.exp(-x[0])).

Any tips here?

installation instructions do not appear to work?

I'm on Ubuntu.
pip install fauxtograph appears to work fine; however,
fauxtograph download images/
says fauxtograph is undefined.
I tried git cloning the code, then cd'ing in and executing
python fauxtograph.py download ./images
and it pauses, but does nothing.
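A hedged note for anyone hitting this: per the file layout described in the README above, the CLI commands live in fauxtograph/fauxto.py, while fauxtograph/fauxtograph.py only defines the model classes, so invoking the latter as a script does nothing. From a clone, something like the following seems more likely to work (assuming fauxto.py is runnable as a script in your version):

$ python fauxtograph/fauxto.py download ./images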

Chainer error during training: type mismatch

Followed the steps outlined in the blog. Got an error during training:

fauxtograph train images/ models/model_out
Loading Image Files...
Image Files Loaded!

Training for 200 epochs.

epoch: 1
0%| | 0/12 [00:00<?, ?it/s]Traceback (most recent call last):
  File "/usr/local/bin/fauxtograph", line 8, in <module>
    load_entry_point('fauxtograph==0.1.8', 'console_scripts', 'fauxtograph')()
  File "/Library/Python/2.7/site-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/Library/Python/2.7/site-packages/click/core.py", line 1060, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/Library/Python/2.7/site-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Library/Python/2.7/site-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/Library/Python/2.7/site-packages/fauxtograph/fauxto.py", line 116, in train
    iae.fit(batch_size=batch, n_epochs=epoch)
  File "/Library/Python/2.7/site-packages/fauxtograph/fauxtograph.py", line 271, in fit
    loss.backward()
  File "/Library/Python/2.7/site-packages/chainer/variable.py", line 171, in backward
    self.grad = numpy.ones_like(self.data)
  File "/Library/Python/2.7/site-packages/chainer/variable.py", line 110, in grad
    % (type(self.data), type(g), error_msg))
TypeError: Type of data and grad mismatch: <type 'numpy.ndarray'> != <type 'numpy.float32'>
This error is occured in two cases. The first case is when the user manually
sets the Variable.grad incorrectly. The second case is when some Function
implementation has a bug.
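Worth noting (an observation, not a confirmed fix): the traceback shows fauxtograph==0.1.8, while the changelog at the top of this page describes the 1.0.x line, so upgrading may be worth a try:

$ pip install --upgrade fauxtograph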

image loss function?

VAE and VAEGAN code is currently using mean squared error as the reconstruction loss function. In most papers / implementations, I'm more used to seeing binary cross entropy with numbers reported in nats.

Curious what we think would be best here. I took a quick look in the Chainer docs but didn't see binary cross entropy listed as one of the built-in loss functions.
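For what it's worth, a hedged sketch: Chainer does ship a Bernoulli negative log-likelihood, F.bernoulli_nll, which amounts to binary cross entropy against pre-sigmoid logits and is reported in nats; verify its availability in the Chainer version you're on:

    import chainer.functions as F

    # x_batch: target pixels scaled to [0, 1]; y_logits: raw (pre-sigmoid)
    # decoder output. bernoulli_nll applies the sigmoid internally and
    # returns the summed negative log-likelihood in nats.
    rec_loss = F.bernoulli_nll(x_batch, y_logits) / x_batch.shape[0]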
