eriklindernoren / keras-gan
Keras implementations of Generative Adversarial Networks.
License: MIT License
Hi,
You really have a great repo here. As I was looking into it, I noticed that you have folders for saved models. Could you please upload these models to Google Drive (if you have already trained them)? It would be really helpful for those who don't want to train and are only interested in the results.
Thanks
# For the adversarial_autoencoder model we will only train the generator
self.discriminator.trainable = False
# Train the discriminator
d_loss_real = self.discriminator.train_on_batch(latent_real, valid)
d_loss_fake = self.discriminator.train_on_batch(latent_fake, fake)
d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
Why train it if it is not trainable?
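As far as I can tell, Keras captures the effect of the trainable flag per model at compile() time: the discriminator was compiled before the flag was flipped, so its own train_on_batch still updates its weights, and only the combined model (compiled afterwards) sees it frozen. A minimal sketch of that behaviour with hypothetical toy models:

from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import Adam

discriminator = Sequential([Dense(1, activation='sigmoid', input_dim=8)])
discriminator.compile(loss='binary_crossentropy', optimizer=Adam())   # compiled while still trainable

discriminator.trainable = False            # only affects models compiled *after* this line
combined = Sequential([Dense(8, input_dim=4), discriminator])
combined.compile(loss='binary_crossentropy', optimizer=Adam())        # frozen inside `combined`

# discriminator.train_on_batch(...) still updates the discriminator's weights,
# while combined.train_on_batch(...) only updates the first Dense layer.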
Hi,
Really great stuff on a new and difficult subject like GANs. I have been scratching my head going through loads of documentation, code and my own tests, and it is still a bit of a blur :( ...
One question before I start playing around with your code: does it work with ANY image size?
Thanks,
Thu Sinh
Two minor improvements to your WGAN code:
d_loss = 0.5 * np.add(d_loss_fake, d_loss_real)
print ("%d [D loss: %f] [G loss: %f]" % (epoch, 1 - d_loss[0], 1 - g_loss[0]))
The DCGAN paper and related papers all say they use Conv2DTranspose rather than Conv2D.
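For reference, a sketch of the two upsampling variants: the repo uses UpSampling2D followed by Conv2D, while the papers describe a learned transposed (fractionally strided) convolution. The shapes and filter counts here are purely illustrative:

from keras.layers import Input, UpSampling2D, Conv2D, Conv2DTranspose

inp = Input(shape=(16, 16, 128))

# Variant used in this repo: fixed nearest-neighbour upsampling, then a convolution
x = UpSampling2D(size=2)(inp)
x = Conv2D(128, kernel_size=3, padding='same', activation='relu')(x)

# Variant described in the DCGAN paper: a single learned transposed convolution
y = Conv2DTranspose(128, kernel_size=3, strides=2, padding='same', activation='relu')(inp)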
I am trying the DCGAN for my project. I am wondering if there is any way callbacks can be included with Keras' model.train_on_batch so that I can trace the G-loss and D-loss in TensorBoard?
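Not sure whether there is an official way, but one workaround is to skip Keras callbacks entirely and write scalar summaries by hand after each train_on_batch call (this assumes the TF 1.x summary API):

import tensorflow as tf

writer = tf.summary.FileWriter('./logs')

def log_scalar(tag, value, step):
    # Write a single scalar value to TensorBoard
    summary = tf.Summary(value=[tf.Summary.Value(tag=tag, simple_value=value)])
    writer.add_summary(summary, step)
    writer.flush()

# inside the training loop, with d_loss / g_loss as returned by train_on_batch:
# log_scalar('d_loss', d_loss[0], epoch)
# log_scalar('g_loss', g_loss, epoch)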
It looks like u1 is not connected to u2 and u2 is not connected to u3 at line 90.
# Downsampling
d1 = conv2d(img, self.gf, bn=False)
d2 = conv2d(d1, self.gf*2)
d3 = conv2d(d2, self.gf*4)
d4 = conv2d(d3, self.gf*8)
# Upsampling
u1 = deconv2d(d4, d3, self.gf*4)
u2 = deconv2d(d3, d2, self.gf*2)
u3 = deconv2d(d2, d1, self.gf)
u4 = UpSampling2D(size=2)(u3)
Unsure if this is a bug (or feature?)
Hi,
How can I train the CycleGAN on the GPU instead of the CPU?
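I am not the author, but with the TensorFlow backend Keras picks up the GPU automatically once tensorflow-gpu is installed; a quick way to check which devices TensorFlow actually sees (TF 1.x assumed):

# pip install tensorflow-gpu   (matching your CUDA/cuDNN versions)
from tensorflow.python.client import device_lib

# Lists the CPU and GPU devices visible to TensorFlow; if a GPU shows up here,
# cyclegan.py should run on it without any code changes.
print([d.name for d in device_lib.list_local_devices()])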
In cgan/cgan.py, line 38 has noise = Input(shape=(100,)).
I expected noise = Input(shape=(self.latent_dim,)).
Is that right?
Hi @eriklindernoren, thanks for your repo.
I have really learned a lot. But something confuses me: why do you use MSE as the loss function for the discriminator rather than cross-entropy?
I thought cross-entropy would give a much bigger gradient; isn't that helpful for training?
I tried to run your SGAN code on a GPU but it is giving me
[D loss: nan, acc: 0.00%, op_acc: 6.25%] [G loss: nan]
However, it works fine on my local CPU machine. Do you know why I am getting this error?
Hi Erik,
Have you ever met with the error "module 'tensorflow.python.ops.nn' has no attribute 'leaky_relu'"?
This is probably because my TensorFlow backend is an older version, 1.0.0. But it looks like I can only update TensorFlow outside of Keras, rather than updating the "tensorflow backend" within Keras.
Do you have a clue? THANKS!
Di
The paper says "no critic batch normalization", but you add batch normalization to every layer. Why?
This seems like an interesting and useful model for generating time series using GANs. Perhaps it would be a worthy addition to this library?
If not, I can do it myself. Thanks!
Please let me know either way in order to make this repository turn-key.
Hi Erik! Great work with the repo.
I was following the CGAN code along with the paper and made some alterations to try to match the models to exactly what the paper says. I knew the models presented in the paper would be less productive than your code, but I just wanted to replicate them and see. For example, the paper describes the generator model like this:
In the generator net, a noise prior z with dimensionality 100 was drawn from a uniform distribution
within the unit hypercube. Both z and y are mapped to hidden layers with Rectified Linear Unit
(ReLu) activation [4, 11], with layer sizes 200 and 1000 respectively, before both being mapped to
second, combined hidden ReLu layer of dimensionality 1200. We then have a final sigmoid unit
layer as our output for generating the 784-dimensional MNIST samples.
So with this I altered the code like this:
def build_generator(self):

    model = Sequential()

    model.add(Dense(200, input_dim=self.latent_dim))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Dense(1000))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Dense(1200, input_dim=self.latent_dim))
    model.add(Activation('relu'))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Dropout(0.5))
    model.add(Dense(np.prod(self.img_shape), activation='sigmoid'))
    model.add(Reshape(self.img_shape))

    model.summary()

    noise = Input(shape=(self.latent_dim,))
    label = Input(shape=(1,), dtype='int32')
    label_embedding = Flatten()(Embedding(self.num_classes, self.latent_dim)(label))
    model_input = multiply([noise, label_embedding])
    img = model(model_input)

    return Model([noise, label], img)
But I still think this is not exactly what the paper means. What I understand from the paper is that the noise and labels are first fed into two different layers and then combined into one layer.
Does this mean that there should be three separate models inside the generator? Or am I mistaken in thinking that? I would like to hear your thoughts.
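To make my reading concrete, this is roughly what I think the paper describes: separate ReLU branches for z (200 units) and y (1000 units), concatenated into a combined 1200-unit layer. This sketch uses the functional API rather than your Sequential style, keeps your Embedding trick for the label (the paper itself uses a one-hot y), and assumes Concatenate is imported from keras.layers:

def build_generator(self):
    noise = Input(shape=(self.latent_dim,))
    label = Input(shape=(1,), dtype='int32')
    label_embedding = Flatten()(Embedding(self.num_classes, self.num_classes)(label))

    # Separate hidden layers for z (200 units) and y (1000 units), as in the paper
    z_branch = Dense(200, activation='relu')(noise)
    y_branch = Dense(1000, activation='relu')(label_embedding)

    # Combined 1200-unit hidden layer, then the 784-dimensional sigmoid output
    h = Concatenate()([z_branch, y_branch])
    h = Dense(1200, activation='relu')(h)
    img = Dense(np.prod(self.img_shape), activation='sigmoid')(h)
    img = Reshape(self.img_shape)(img)

    return Model([noise, label], img)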
Thanks!
Hey there...GANs are pretty new to me. Thank you for all the examples.
Two questions:
Reading your code it seems like you train on one batch per epoch. Is that intentional? What's the motivation for that?
It seems that training the discriminator on half batches works better than training on a single batch where you concatenate real and fake. Do you know why that is?
Hello. How do you find classification accuracy for InfoGAN? I am not sure I see that in the code. Any help would be greatly appreciated. Thank you!
Hi Erik,
I am encountering a problem with the scipy method imread, which I believe has been deprecated as of the recent version 1.0.0. It would probably work with an earlier version such as 0.19.0; however, that causes a lot of conflicts with Keras and TensorFlow. I am using Anaconda with Python 3.6.4. Please see more details below:
Using TensorFlow backend.
Traceback (most recent call last):
File "cyclegan.py", line 244, in <module>
gan.train(epochs=30000, batch_size=2, save_interval=200)
File "cyclegan.py", line 161, in train
imgs_A = self.data_loader.load_data(domain="A", batch_size=half_batch)
File "/home/emma/Research/GAN/keras_GAN/Keras-GAN/cyclegan/data_loader.py", line 18, in load_data
img = self.imread(img_path)
File "/home/emma/Research/GAN/keras_GAN/Keras-GAN/cyclegan/data_loader.py", line 39, in imread
return scipy.misc.imread(path, mode='RGB').astype(np.float)
AttributeError: module 'scipy.misc' has no attribute 'imread'
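A workaround I am trying (not sure whether it is the fix you would recommend) is to replace the removed scipy.misc.imread with imageio inside data_loader.py:

# Hypothetical drop-in replacement for the imread method in cyclegan/data_loader.py
import numpy as np
import imageio   # pip install imageio

def imread(self, path):
    # pilmode='RGB' mirrors the old scipy mode='RGB' behaviour (imageio 2.x Pillow plugin)
    return imageio.imread(path, pilmode='RGB').astype(np.float)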
Thanks very much,
Emma
The code written here -> https://github.com/eriklindernoren/Keras-GAN/blob/master/gan/gan.py
sets discriminator.trainable = False once, after compiling the discriminator model.
I found that even after compiling the model, the trainable property set on the original model object still applies. When I ran this code, the discriminator was not being trained at all. The way I got around this was by explicitly switching discriminator training off before the combined training step and restoring it right before training the discriminator, as in the sketch below.
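For concreteness, my workaround looks roughly like this (a hypothetical excerpt; the variable names mirror gan.py, and since in Keras the flag only takes effect when a model's train function is built, this may depend on the version):

for epoch in range(epochs):

    # --- Train the discriminator: make sure it is trainable again ---
    discriminator.trainable = True
    d_loss_real = discriminator.train_on_batch(imgs, valid)
    d_loss_fake = discriminator.train_on_batch(gen_imgs, fake)

    # --- Train the generator through the combined model: freeze the discriminator ---
    discriminator.trainable = False
    g_loss = combined.train_on_batch(noise, valid)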
I got the following error when I tried to apply the code to my own data set:
ValueError: A Concatenate layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 64, 64, 128), (None, 63, 63, 128)]
Any assistance will be greatly appreciated.
I tried to run the script for InfoGAN applied to MNIST. However, the generated images at the end of the training phase are not even vaguely similar to digit shapes.
Has anyone else tried the script? Do you have the same problem? Is there anything I should try to make it work?
I have already tried using more training epochs and reducing the learning rate, but nothing has worked so far...
Hi! Thanks for your awesome work! I have a doubt:
In https://github.com/eriklindernoren/Keras-GAN/blob/master/ccgan/ccgan.py#56,
'self.combined.compile(loss=['mse', 'binary_crossentropy'],'
The paper doesn't say that an MSE loss between the generated in-painted image and the real image is used, does it?
Hi there,
I've changed WGAN-GP to use CIFAR10 instead of MNIST and the result was quite poor. Have a look:
https://drive.google.com/file/d/0B5vxICNG1z3pd2NSdzZqMHN1R2ZIX2o2V3FqajlwYjlwQzRZ/view?usp=sharing
Here is my adaptation:
https://gist.github.com/thiagolcks/a08d76b33f2a37fe2f3806253c609528
Am I missing something?
Cheers
flake8 testing of https://github.com/eriklindernoren/Keras-GAN on Python 3.6.3
$ flake8 . --count --select=E901,E999,F821,F822,F823 --show-source --statistics
./gan/gan_rgb.py:204:50: E999 TabError: inconsistent use of tabs and spaces in indentation
d_loss_logs_f_a = np.array(d_loss_logs_f)
^
1 E999 TabError: inconsistent use of tabs and spaces in indentation
1
Python 3 treats these like syntax errors https://docs.python.org/3/library/exceptions.html#TabError
Traceback (most recent call last):
File "improved_wgan.py", line 251, in
wgan = ImprovedWGAN()
File "improved_wgan.py", line 84, in init
loss_weights=[1, 1, 10])
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 830, in compile
sample_weight, mask)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 429, in weighted
score_array = fn(y_true, y_pred)
File "improved_wgan.py", line 109, in gradient_penalty_loss
gradients = K.gradients(y_pred, averaged_samples)[0]
File "/usr/local/lib/python2.7/dist-packages/keras/backend/theano_backend.py", line 1240, in gradients
return T.grad(loss, variables)
File "/usr/local/lib/python2.7/dist-packages/theano/gradient.py", line 487, in grad
raise TypeError("cost must be a scalar.")
TypeError: cost must be a scalar.
if __name__ == '__main__':
aae = AdversarialAutoencoder()
Layer (type) Output Shape Param #
=================================================================
dense_10 (Dense) (None, 512) 51712
_________________________________________________________________
leaky_re_lu_7 (LeakyReLU) (None, 512) 0
_________________________________________________________________
batch_normalization_7 (Batch (None, 512) 2048
_________________________________________________________________
dense_11 (Dense) (None, 512) 262656
_________________________________________________________________
leaky_re_lu_8 (LeakyReLU) (None, 512) 0
_________________________________________________________________
batch_normalization_8 (Batch (None, 512) 2048
_________________________________________________________________
dense_12 (Dense) (None, 1) 513
=================================================================
Total params: 318,977
Trainable params: 316,929
Non-trainable params: 2,048
_________________________________________________________________
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten_2 (Flatten) (None, 784) 0
_________________________________________________________________
dense_13 (Dense) (None, 512) 401920
_________________________________________________________________
leaky_re_lu_9 (LeakyReLU) (None, 512) 0
_________________________________________________________________
batch_normalization_9 (Batch (None, 512) 2048
_________________________________________________________________
dense_14 (Dense) (None, 512) 262656
_________________________________________________________________
leaky_re_lu_10 (LeakyReLU) (None, 512) 0
_________________________________________________________________
batch_normalization_10 (Batc (None, 512) 2048
_________________________________________________________________
dense_15 (Dense) (None, 100) 51300
=================================================================
Total params: 719,972
Trainable params: 717,924
Non-trainable params: 2,048
_________________________________________________________________
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_16 (Dense) (None, 512) 51712
_________________________________________________________________
leaky_re_lu_11 (LeakyReLU) (None, 512) 0
_________________________________________________________________
batch_normalization_11 (Batc (None, 512) 2048
_________________________________________________________________
dense_17 (Dense) (None, 512) 262656
_________________________________________________________________
leaky_re_lu_12 (LeakyReLU) (None, 512) 0
_________________________________________________________________
batch_normalization_12 (Batc (None, 512) 2048
_________________________________________________________________
dense_18 (Dense) (None, 784) 402192
_________________________________________________________________
reshape_2 (Reshape) (None, 28, 28, 1) 0
=================================================================
Total params: 720,656
Trainable params: 718,608
Non-trainable params: 2,048
aae.train(epochs=2000, batch_size=32, save_interval=200)
---------------------------------------------------------------------------
InvalidArgumentError Traceback (most recent call last)
<ipython-input-6-159ebfef0877> in <module>()
----> 1 aae.train(epochs=2000, batch_size=32, save_interval=200)
<ipython-input-2-57f851a1818e> in train(self, epochs, batch_size, save_interval)
113
114 # Generate a half batch of new images
--> 115 latent_fake, gen_imgs = self.generator.predict(imgs)
116
117 latent_real = np.random.normal(size=(half_batch, self.encoded_dim))
/usr/local/lib/python2.7/dist-packages/Keras-2.0.8-py2.7.egg/keras/engine/training.pyc in predict(self, x, batch_size, verbose, steps)
1715 f = self.predict_function
1716 return self._predict_loop(f, ins, batch_size=batch_size,
-> 1717 verbose=verbose, steps=steps)
1718
1719 def train_on_batch(self, x, y,
/usr/local/lib/python2.7/dist-packages/Keras-2.0.8-py2.7.egg/keras/engine/training.pyc in _predict_loop(self, f, ins, batch_size, verbose, steps)
1267 else:
1268 ins_batch = _slice_arrays(ins, batch_ids)
-> 1269 batch_outs = f(ins_batch)
1270 if not isinstance(batch_outs, list):
1271 batch_outs = [batch_outs]
/usr/local/lib/python2.7/dist-packages/Keras-2.0.8-py2.7.egg/keras/backend/tensorflow_backend.pyc in __call__(self, inputs)
2255 updated = session.run(self.outputs + [self.updates_op],
2256 feed_dict=feed_dict,
-> 2257 **self.session_kwargs)
2258 return updated[:len(self.outputs)]
2259
/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
893 try:
894 result = self._run(None, fetches, feed_dict, options_ptr,
--> 895 run_metadata_ptr)
896 if run_metadata:
897 proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)
/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
1122 if final_fetches or final_targets or (handle and feed_dict_tensor):
1123 results = self._do_run(handle, final_targets, final_fetches,
-> 1124 feed_dict_tensor, options, run_metadata)
1125 else:
1126 results = []
/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
1319 if handle is None:
1320 return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1321 options, run_metadata)
1322 else:
1323 return self._do_call(_prun_fn, self._session, handle, feeds, fetches)
/usr/local/lib/python2.7/dist-packages/tensorflow/python/client/session.pyc in _do_call(self, fn, *args)
1338 except KeyError:
1339 pass
-> 1340 raise type(e)(node_def, op, message)
1341
1342 def _extend_graph(self):
InvalidArgumentError: Reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non-zero
[[Node: sequential_5/flatten_2/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/gpu:0"](_arg_input_5_0_1/_213, sequential_5/flatten_2/stack)]]
Caused by op u'sequential_5/flatten_2/Reshape', defined at:
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/usr/local/lib/python2.7/dist-packages/ipykernel_launcher.py", line 16, in <module>
app.launch_new_instance()
File "/usr/local/lib/python2.7/dist-packages/traitlets/config/application.py", line 658, in launch_instance
app.start()
File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelapp.py", line 477, in start
ioloop.IOLoop.instance().start()
File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/ioloop.py", line 177, in start
super(ZMQIOLoop, self).start()
File "/usr/local/lib/python2.7/dist-packages/tornado/ioloop.py", line 888, in start
handler_func(fd_obj, events)
File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
self._handle_recv()
File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
self._run_callback(callback, msg)
File "/usr/local/lib/python2.7/dist-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
callback(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
return fn(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
return self.dispatch_shell(stream, msg)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
handler(stream, idents, msg)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
user_expressions, allow_stdin)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
res = shell.run_cell(code, store_history=store_history, silent=silent)
File "/usr/local/lib/python2.7/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
interactivity=interactivity, compiler=compiler, result=result)
File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
if self.run_code(code, result):
File "/usr/local/lib/python2.7/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-4-3d96210dd081>", line 2, in <module>
aae = AdversarialAutoencoder()
File "<ipython-input-2-57f851a1818e>", line 18, in __init__
self.generator = self.build_generator()
File "<ipython-input-2-57f851a1818e>", line 54, in build_generator
encoded_repr = encoder(img)
File "build/bdist.linux-x86_64/egg/keras/engine/topology.py", line 602, in __call__
output = self.call(inputs, **kwargs)
File "build/bdist.linux-x86_64/egg/keras/models.py", line 532, in call
return self.model.call(inputs, mask)
File "build/bdist.linux-x86_64/egg/keras/engine/topology.py", line 2058, in call
output_tensors, _, _ = self.run_internal_graph(inputs, masks)
File "build/bdist.linux-x86_64/egg/keras/engine/topology.py", line 2209, in run_internal_graph
output_tensors = _to_list(layer.call(computed_tensor, **kwargs))
File "build/bdist.linux-x86_64/egg/keras/layers/core.py", line 484, in call
return K.batch_flatten(inputs)
File "build/bdist.linux-x86_64/egg/keras/backend/tensorflow_backend.py", line 1918, in batch_flatten
x = tf.reshape(x, tf.stack([-1, prod(shape(x)[1:])]))
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 2619, in reshape
name=name)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
InvalidArgumentError (see above for traceback): Reshape cannot infer the missing input size for an empty tensor unless all specified input sizes are non-zero
[[Node: sequential_5/flatten_2/Reshape = Reshape[T=DT_FLOAT, Tshape=DT_INT32, _device="/job:localhost/replica:0/task:0/gpu:0"](_arg_input_5_0_1/_213, sequential_5/flatten_2/stack)]]
When I run your CycleGAN code on some datasets it works well (e.g. cityscapes, apple2orange),
but for others it shows the following error. Please help me resolve it.
Also, what should I change in your code to see the results in Google Colab, since I don't have a GPU?
Traceback (most recent call last):
  File "cyclegan.py", line 258, in <module>
    gan.train(epochs=200, batch_size=1, sample_interval=200)
  File "cyclegan.py", line 170, in train
    for batch_i, (imgs_A, imgs_B) in enumerate(self.data_loader.load_batch(batch_size)):
  File "/content/project1211/CycleGan/project3/project344/Keras-GAN/cyclegan/data_loader.py", line 42, in load_batch
    path_A = np.random.choice(path_A, total_samples, replace=False)
  File "mtrand.pyx", line 1126, in mtrand.RandomState.choice
ValueError: a must be non-empty
I tried to use TensorBoard callbacks to observe d_loss, g_loss and acc, and I found that d_loss and acc are iterable but g_loss is not. Can you explain the reason, or how to change the code?
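My guess (happy to be corrected) is that the discriminator is compiled with metrics=['accuracy'], so its train_on_batch returns a [loss, accuracy] list, while the combined model is compiled with a loss only and returns a bare float. Normalising both before logging works for me (a sketch reusing the script's variable names):

d_loss = self.discriminator.train_on_batch(X, y)       # -> [loss, accuracy]
g_loss = self.combined.train_on_batch(noise, valid)    # -> loss (scalar)

# Wrap the scalar so both values can be indexed the same way when logging
if not isinstance(g_loss, list):
    g_loss = [g_loss]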
Not too familiar with Keras, but it seems like you don't add the noise to the input of the generator as described in the paper.
Is there a particular reason?
This repo is amazing!
I've read the blogs and other posts on pix2pix, and your code seems to be the clearest, but I have some questions about what exactly is going on:
def build_generator(self):
    """U-Net Generator"""

    def conv2d(layer_input, filters, f_size=4, bn=True):
        """Layers used during downsampling"""
        d = Conv2D(filters, kernel_size=f_size, strides=2, padding='same')(layer_input)
        d = LeakyReLU(alpha=0.2)(d)
        if bn:
            d = BatchNormalization(momentum=0.8)(d)
        return d

    def deconv2d(layer_input, skip_input, filters, f_size=4, dropout_rate=0):
        """Layers used during upsampling"""
        u = UpSampling2D(size=2)(layer_input)
        u = Conv2D(filters, kernel_size=f_size, strides=1, padding='same', activation='relu')(u)
        if dropout_rate:
            u = Dropout(dropout_rate)(u)
        u = BatchNormalization(momentum=0.8)(u)
        u = Concatenate()([u, skip_input])
        return u

    # Image input
    d0 = Input(shape=self.img_shape)

    # Downsampling
    d1 = conv2d(d0, self.gf, bn=False)
    d2 = conv2d(d1, self.gf*2)
    d3 = conv2d(d2, self.gf*4)
    d4 = conv2d(d3, self.gf*8)
    d5 = conv2d(d4, self.gf*8)
    d6 = conv2d(d5, self.gf*8)
    d7 = conv2d(d6, self.gf*8)
# Calculate output shape of D (PatchGAN)
patch = int(self.img_rows / 2**4)
self.disc_patch = (patch, patch, 1) #has dim 16,16,1
def build_discriminator(self):

    def d_layer(layer_input, filters, f_size=4, bn=True):
        """Discriminator layer"""
        d = Conv2D(filters, kernel_size=f_size, strides=2, padding='same')(layer_input)
        d = LeakyReLU(alpha=0.2)(d)
        if bn:
            d = BatchNormalization(momentum=0.8)(d)
        return d

    img_A = Input(shape=self.img_shape)
    img_B = Input(shape=self.img_shape)

    # Concatenate image and conditioning image by channels to produce input
    combined_imgs = Concatenate(axis=-1)([img_A, img_B])

    d1 = d_layer(combined_imgs, self.df, bn=False)
    d2 = d_layer(d1, self.df*2)
    d3 = d_layer(d2, self.df*4)
    d4 = d_layer(d3, self.df*8)

    validity = Conv2D(1, kernel_size=4, strides=1, padding='same')(d4)

    return Model([img_A, img_B], validity)
def train(self, epochs, batch_size=1, sample_interval=50):

    start_time = datetime.datetime.now()

    # Adversarial loss ground truths
    valid = np.ones((batch_size,) + self.disc_patch)
    fake = np.zeros((batch_size,) + self.disc_patch)

    for epoch in range(epochs):
        for batch_i, (imgs_A, imgs_B) in enumerate(self.data_loader.load_batch(batch_size)):

            # ---------------------
            #  Train Discriminator
            # ---------------------

            # Condition on B and generate a translated version
            fake_A = self.generator.predict(imgs_B)

            # Train the discriminators (original images = real / generated = Fake)
            d_loss_real = self.discriminator.train_on_batch([imgs_A, imgs_B], valid)
            d_loss_fake = self.discriminator.train_on_batch([fake_A, imgs_B], fake)
            d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
Thanks for your help!
Hi, good work mate. Are you planning on adding SeqGAN?
Paper : https://arxiv.org/abs/1609.05473
TF code : https://github.com/LantaoYu/SeqGAN
I am extending the WGAN-GP to be conditional. I am concatenating a label input to both the discriminator input and noise input for the generator.
However, I am getting stuck in the final part where I build the combined model.
# The generator takes noise and the target label (states) as input
# and generates the corresponding samples of that label
noise = Input(shape=(self.latent_size, ), name="noise")
label = Input(shape=(self.label_size, ), name="labels")
real_samples = Input(shape=(self.input_size,), name="real")
self.discriminator = self.build_discriminator()
self.generator = self.build_generator([noise, label])
# First we train the discriminator
self.generator.trainable = False
fake_samples = self.generator([noise, label])
fake = self.discriminator([fake_samples, label])
valid = self.discriminator([real_samples, label])
interpolated = Lambda(self.random_weighted_average)([real_samples, fake_samples])
valid_interp = self.discriminator([interpolated, label])
# The combined model (stacked generator and discriminator)
# Trains generator to fool discriminator
self.d_model = Model([real_samples, noise, label],
[valid, fake, valid_interp],
name="discriminator")
# Time to train the generator
self.discriminator.trainable = False
self.generator.trainable = True
noise_gen = Input(shape=(self.latent_size,), name="noise_gen")
fake_samples = self.generator([noise_gen, label])
valid = self.discriminator([fake_samples, label])
self.g_model = Model([noise_gen, label], valid, name="generator")
self.g_model.compile(loss=self.wasserstein_loss, optimizer=optimizer)
I don't think this is the right way to create the final model. How would I create the combined model so that it also includes the label correctly? I'm assuming that the noise input should actually be the generated output of the generator? Any help?
Can you please explain why d_loss[0] is the discriminator loss while d_loss[1] is the accuracy?
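If it helps, the split simply comes from compiling the discriminator with an accuracy metric; train_on_batch then returns [loss, metric] in the order given by metrics_names. A tiny standalone check:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

model = Sequential([Dense(1, activation='sigmoid', input_dim=4)])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

print(model.metrics_names)     # e.g. ['loss', 'acc'] -> d_loss[0] is the loss, d_loss[1] the accuracy
out = model.train_on_batch(np.random.rand(8, 4), np.ones((8, 1)))
print(out)                     # [loss_value, accuracy_value]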
Hello Erik, thanks for sharing these kinds of GANs. I wonder how to apply this when my data is 3-channel color images like CIFAR; if I just change the size and channels, I get the error 'number of input channels does not match corresponding dimension of filter, 1 != 3'.
The adversarial autoencoder doesn't seem to be training the discriminator. The discriminator accuracy remains stuck at around 50%. In addition, fake images that I generated from random Gaussian codes did not look anything like MNIST digits. In fact, they did not look any different from fake images when setting the discriminator's loss weight to zero in the autoencoder model, which is equivalent to a simple non-adversarial autoencoder.
The problem seems to be related to the batch normalization layers in the discriminator: after removing both of them, the discriminator accuracy fluctuates during training and fake images generated from random Gaussian codes look much more like real digits (I also increased the discriminator's loss weight in the autoencoder model).
Hi,
Thanks very much for your comprehensive collection of GANs. I am learning about SRGAN, and I found that the original paper uses sub-pixel layers in the generator, while in your code the deconvolution layers are UpSampling2D + Conv2D. I'd like to know what the difference is; do the results stay the same?
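For comparison, a rough sketch of the two variants as I understand them; the filter counts are just illustrative and the sub-pixel block assumes the TensorFlow backend (tf.depth_to_space):

import tensorflow as tf
from keras.layers import Input, Conv2D, UpSampling2D, Lambda, Activation

inp = Input(shape=(64, 64, 64))

# Repo's variant: nearest-neighbour upsampling followed by a convolution
u = UpSampling2D(size=2)(inp)
u = Conv2D(256, kernel_size=3, padding='same', activation='relu')(u)

# Paper's sub-pixel variant: convolution, then pixel shuffle (depth_to_space) by a factor of 2
s = Conv2D(256, kernel_size=3, padding='same')(inp)
s = Lambda(lambda x: tf.depth_to_space(x, 2))(s)
s = Activation('relu')(s)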
Thank you for the great work here; it has helped me jump-start my own project! Would it be possible to adapt the cGAN generator and discriminator networks to handle label inputs of shape (x,), where x is a positive integer greater than 1? This way the label could be an array of values (i.e. in a case where x=4 the label could look something like [1, 0, 1, 1]), allowing the cGAN to be trained for more complex classification tasks.
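Not the author, but one way this could work (a sketch, assuming a label vector of length x=4 and the repo's latent_dim of 100) is to drop the Embedding/multiply trick and project the label vector with a Dense layer before concatenating it with the noise:

from keras.layers import Input, Dense, Concatenate

latent_dim = 100
label_size = 4                      # x = 4, e.g. a label like [1, 0, 1, 1]

noise = Input(shape=(latent_dim,))
label = Input(shape=(label_size,))  # multi-hot / real-valued label vector

# Project the label into the same space as the noise, then concatenate the two
label_proj = Dense(latent_dim, activation='relu')(label)
model_input = Concatenate()([noise, label_proj])
# ... feed model_input into the existing generator stack and build Model([noise, label], img)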
Hello there,
As I said, self.generator.predict(noise) returns nan after g_loss = self.combined.train_on_batch(noise, validity, class_weight=[cw1, cw2]) has run once:
gen_imgs = self.generator.predict(noise)
print(gen_imgs) # [[[[ 0.06864215] [ 0.00251574] [ 0.035886] ...,
# Train the generator
g_loss = self.combined.train_on_batch(noise, validity, class_weight=[cw1, cw2])
gen_imgs = self.generator.predict(noise)
print(gen_imgs) # [[[[ nan][ nan][ nan]...,
Could you please specify the versions (Python, NumPy, Matplotlib, Keras, TensorFlow) you are using? :) That would be awesome! Have a nice day and thank you very much for your effort!
@eriklindernoren as always thank you for the amazing library!
Can you explain what this code is doing here for the encoder, starting from the variable "mu"? I'm not sure which model this corresponds to in the paper. Are you using this merge layer as a "hack" to make the latent code closer to a normal distribution, so that training is easier on the network?
I'm really curious about your interpretation of the merge layer... This seems like a good trick, but I want to make sure I understand what is going on and the why behind it.
def build_encoder(self):
    # Encoder
    img = Input(shape=self.img_shape)

    h = Flatten()(img)
    h = Dense(512)(h)
    h = LeakyReLU(alpha=0.2)(h)
    h = Dense(512)(h)
    h = LeakyReLU(alpha=0.2)(h)
    mu = Dense(self.latent_dim)(h)
    log_var = Dense(self.latent_dim)(h)
    latent_repr = merge([mu, log_var],
        mode=lambda p: p[0] + K.random_normal(K.shape(p[0])) * K.exp(p[1] / 2),
        output_shape=lambda p: p[0])

    return Model(img, latent_repr)
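My own reading, which I would love to have confirmed, is that this is the VAE-style reparameterization trick: the latent code is sampled as z = mu + exp(log_var / 2) * eps with eps ~ N(0, I), which keeps the sampling step differentiable. Since merge is deprecated, an equivalent Lambda version (assuming that reading, and reusing mu and log_var from above) would be:

from keras.layers import Lambda
from keras import backend as K

def sample_z(args):
    # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
    mu, log_var = args
    eps = K.random_normal(K.shape(mu))
    return mu + K.exp(log_var / 2) * eps

latent_repr = Lambda(sample_z)([mu, log_var])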
Concerning the training loop code, I notice you do the following when training the discriminator, but doesn't this adversely affect performance? Isn't it better to feed nets data randomly (e.g. send a mini-batch with both positives and negatives in it and call train_on_batch once)?
# Train the discriminator
d_loss_real = self.discriminator.train_on_batch(latent_real, valid)
d_loss_fake = self.discriminator.train_on_batch(latent_fake, fake)
d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
Can you explain your rationale for this type of setup? I'm NOT questioning your methods, just trying to understand.
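For reference, the mixed-batch alternative I had in mind looks like this (a sketch reusing the variables from the snippet above), i.e. one shuffled mini-batch containing both real and fake latent codes per discriminator update:

import numpy as np

x = np.concatenate([latent_real, latent_fake], axis=0)
y = np.concatenate([valid, fake], axis=0)
idx = np.random.permutation(len(x))   # shuffle so real and fake samples are interleaved

d_loss = self.discriminator.train_on_batch(x[idx], y[idx])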
I've tried running a few of the models but I'm missing images or json files for them. What do I need to have in place in order to run them?
WGAN gives an error when I run the code:
Traceback (most recent call last):
File "wgan.py", line 190, in
wgan = WGAN()
File "wgan.py", line 33, in init
self.critic = self.build_critic()
File "wgan.py", line 109, in build_critic
img = Input(shape=img_shape)
NameError: name 'img_shape' is not defined
Is the code finished?
Hello,
I've been running the CCGAN code and the terminal output looks quite strange.
Is this normal for CCGAN?
409 [D loss: nan, acc: 50.00%, op_acc: 26.56%] [G loss: nan, mse: nan]
410 [D loss: nan, acc: 50.00%, op_acc: 21.88%] [G loss: nan, mse: nan]
411 [D loss: nan, acc: 50.00%, op_acc: 29.69%] [G loss: nan, mse: nan]
412 [D loss: nan, acc: 50.00%, op_acc: 21.88%] [G loss: nan, mse: nan]
413 [D loss: nan, acc: 50.00%, op_acc: 23.44%] [G loss: nan, mse: nan]
414 [D loss: nan, acc: 50.00%, op_acc: 28.12%] [G loss: nan, mse: nan]
415 [D loss: nan, acc: 50.00%, op_acc: 31.25%] [G loss: nan, mse: nan]
416 [D loss: nan, acc: 50.00%, op_acc: 20.31%] [G loss: nan, mse: nan]
In the adversarial autoencoder, lines 139 and 141 are:
latent_fake = self.encoder.predict(imgs)
latent_real = np.random.normal(size=(half_batch, self.encoded_dim))
Shouldn't this be reversed?
In the ccgan subfolder, I notice that you pass the valid images to the discriminator for both the valid-image and the fake-image updates:
# Train the discriminator
d_loss_real = self.discriminator.train_on_batch(imgs, [valid, labels], class_weight=class_weights)
d_loss_fake = self.discriminator.train_on_batch(imgs, [valid, fake_labels], class_weight=class_weights)
d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
Is that correct, or did you intend to do so?
When I change the dataset from MNIST to CIFAR-10 (with 32x32x3 images) and change the code as follows:
self.img_rows = 32 # 28 to 32
self.img_cols = 32 # 28 to 32
self.channels = 3 # 1 to 3
I get an error:
ValueError: Error when checking input: expected input_1 to have shape (32, 32, 3) but got array with shape (28, 28, 3)
I found that it's because the shape of gen_imgs is (28, 28, 3) where gen_imgs = self.generator.predict(noise) in def train().
Could you tell me how to fix it?
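I am not the author, but the likely cause is that the generator is hard-wired to start from a 7x7 feature map and upsample twice to 28x28; for 32x32 CIFAR-10 images the head needs to start from 8x8. A hypothetical tweak to the first layers of build_generator() (assuming the DCGAN-style model):

model.add(Dense(128 * 8 * 8, activation="relu", input_dim=self.latent_dim))
model.add(Reshape((8, 8, 128)))   # 8x8 -> 16x16 -> 32x32 after the two UpSampling2D layers
# ... keep the rest of the stack, and make sure the final Conv2D uses self.channels (3)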
I receive this error when running srgan.py:
Traceback (most recent call last):
File "srgan.py", line 263, in <module>
gan.train(epochs=30000, batch_size=1, sample_interval=50)
File "srgan.py", line 192, in train
imgs_hr, imgs_lr = self.data_loader.load_data(batch_size)
File "/home/ubuntu/Keras-GAN/srgan/data_loader.py", line 16, in load_data
batch_images = np.random.choice(path, size=batch_size)
File "mtrand.pyx", line 1126, in mtrand.RandomState.choice
ValueError: a must be non-empty
Additionally, the example for srgan.py references 'steps at the top of' the file. However, I cannot find such steps:
$ cd srgan/
<follow steps at the top of srgan.py>
$ python3 srgan.py
Hello, it seems that the context encoder implementation has some flaws compared to the paper: https://arxiv.org/abs/1611.06430
First of all, the same optimizer is used for both the generator and the discriminator. The paper describes the generator and discriminator as having different learning rates, with the generator's being 10 times larger.
Second, one of the main points of the generator network is the fully connected layer at the bottleneck, which is also missing from the implementation.
Thanks
How can I deploy this model on the Android platform? As far as I know, a frozen .pb file and the proper input and output node names are needed for deploying and testing the model on Android. Would you kindly help, please?
Thanks in advance.
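I can't give a definitive recipe, but the usual route for TF 1.x is to freeze the trained Keras generator into a .pb with constant weights and then load it from the Android TensorFlow runtime. A rough sketch, assuming the TensorFlow backend and a built Keras model object named model:

import tensorflow as tf
from keras import backend as K

# `model` is assumed to be the trained Keras generator you want to ship
sess = K.get_session()
output_names = [out.op.name for out in model.outputs]

# Convert variables to constants so the graph file is self-contained
frozen_graph = tf.graph_util.convert_variables_to_constants(
    sess, sess.graph.as_graph_def(), output_names)

# Write the frozen graph; note the input/output node names for the Android side
tf.train.write_graph(frozen_graph, './export', 'generator.pb', as_text=False)
print('inputs:', [inp.op.name for inp in model.inputs], 'outputs:', output_names)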