
transparent_latent_gan's Introduction

TL-GAN: transparent latent-space GAN

This is the repository of my three-week project: "Draw as you can tell: controlled image synthesis and edit using TL-GAN"

Resources:

[Demo GIF: interactively editing a generated face]

A high-quality video of the above GIF is available on YouTube

Core ideas

  • This project provides a novel method to control the generation process of a generative model trained without supervision, such as a GAN (generative adversarial network).
  • GANs can generate photo-realistic images from random noise vectors in the latent space (see the stunning examples from Nvidia's PG-GAN), but we have no control over the features of the generated images.
  • Since each image is determined by its noise vector in the latent space, understanding the latent space gives us control over the generation process.
  • For an already well-trained GAN generator, I made its latent space transparent by discovering feature axes in it. When a vector moves along a feature axis in the latent space, the corresponding image morphs along that feature, which enables controlled synthesis and editing (see the sketch below).
  • This is achieved by leveraging a coupled feature extractor network (a CNN in this demo, but any other computer-vision technique would work), which lets us find correlations between noise vectors and image features.
  • Advantages of this method over conditional GAN and AC-GAN:
    • Efficiency: to add a new controller (knob) to the generator, you do not have to re-train the GAN model; with our method, adding 40 knobs takes less than one hour.
    • Flexibility: you can use feature extractors trained on different datasets to add knobs to an already well-trained GAN.
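To make the core idea concrete, here is a minimal numpy sketch of the latent-space arithmetic. The axes below are random stand-ins for axes the method actually discovers, and feeding the edited vector to the generator is not shown:

    import numpy as np

    latent_dim = 512                         # PG-GAN's latent size
    z = np.random.randn(latent_dim)          # a random point in latent space (one random face)

    # Stand-ins for two discovered feature axes (unit vectors in latent space)
    beard_axis = np.random.randn(latent_dim)
    beard_axis /= np.linalg.norm(beard_axis)
    male_axis = np.random.randn(latent_dim)
    male_axis /= np.linalg.norm(male_axis)

    # Moving z along an axis morphs the generated image along that feature:
    # the generator output for z_edited is the same face, but with more beard.
    z_edited = z + 2.0 * beard_axis

    # "Locking" a feature (e.g. Male while editing Beard) corresponds to first
    # projecting the locked axis out, so the move does not change that feature.
    beard_only = beard_axis - beard_axis.dot(male_axis) * male_axis
    beard_only /= np.linalg.norm(beard_only)
    z_edited_locked = z + 2.0 * beard_only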

1. Instructions on the online demo

1.1 Why the demo is hosted on Kaggle

I host the demo as a Kaggle notebook instead of a more convenient web app due to cost considerations.

Kaggle generously provides kernels with GPUs for free! Alternatively, a web app with a backend running on an AWS GPU instance would cost ~$600 per month. Thanks to Kaggle, everyone can play with the model without downloading code or data to a local machine!

1.2 To use the demo

Open this link from your web browser: https://www.kaggle.com/summitkwan/tl-gan-demo

  1. Make sure you have a Kaggle account. If not, please register one (this takes only seconds by linking your Google or Facebook account). Having a Kaggle account is rewarding in its own right, since it allows you to participate in numerous data science challenges and to join a knowledgeable and friendly community.
  2. Fork the current notebook
  3. Run the notebook by pressing the double-right-arrow button at the bottom left of the web page. If something does not work right, try restarting the kernel by pressing the circular-arrow button at the bottom right, then rerun the notebook
  4. Go to the bottom of the notebook and play with the image interactively
  5. You are all set; play with the model:
    • Press the "-/+" buttons to adjust each feature
    • Toggle a feature's name to lock that feature, e.g. lock "Male" while playing with "Beard"

2. Instructions on running the code on your machine

Tested on an Nvidia K80 GPU with CUDA 9.0 and Anaconda Python 3.6

2.1 Set up the code and environment

  1. Clone this repository
  2. cd to the root directory of the project (the folder containing the README.md)
  3. Install dependencies by running pip install -r requirements.txt in a terminal. Consider using a virtual environment to avoid modifying your current Python environment.

2.2 Use the trained model on your machine

  1. Manually download the pre-trained PG-GAN model (provided by Nvidia), the trained feature extractor network, and the discovered feature axes from my personal Dropbox link

  2. Decompress the downloaded files and put them in the project directory with the following layout, where (d) marks a directory

    root(d):
      asset_model(d):
        karras2018iclr-celebahq-1024x1024.pkl   # pretrained GAN from Nvidia
        cnn_face_attr_celeba(d):
          model_20180927_032934.h5              # trained feature extractor network
      asset_results(d):
        pg_gan_celeba_feature_direction_40(d):
          feature_direction_20181002_044444.pkl # feature axes
    
  3. Run the interactive demo by first entering an interactive Python shell from the terminal (make sure you are at the project root directory), and then running the following command in Python

    exec(open('./src/tl_gan/script_generation_interactive.py').read())

    Alternatively, you can run the interactive demo from the Jupyter Notebook at ./src/notebooks/tl_gan_ipywidgets_gui.ipynb

  4. An interactive GUI will pop up; play with the model. For a rough idea of what the script does when it loads the model, see the sketch below.
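As a reference, here is a rough sketch of the model-loading and generation step. It assumes Nvidia's PG-GAN pickle interface (Gs.run, Gs.input_shapes) and requires Nvidia's tfutil module to be importable when unpickling; the sys.path entry below is illustrative, not the repository's actual layout:

    import pickle
    import sys

    import numpy as np
    import tensorflow as tf

    sys.path.append('./src/model/pggan')  # illustrative: the pickle needs Nvidia's tfutil on the path

    tf.InteractiveSession()
    with open('./asset_model/karras2018iclr-celebahq-1024x1024.pkl', 'rb') as file:
        G, D, Gs = pickle.load(file)      # Gs: the long-term-averaged generator

    latents = np.random.randn(1, *Gs.input_shapes[0][1:])  # one random latent vector
    labels = np.zeros([1] + Gs.input_shapes[1][1:])        # the CelebA-HQ model takes empty labels
    images = Gs.run(latents, labels)                       # NCHW array, values roughly in [-1, 1]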

2.3 Instructions on training the model on your own

  1. Download the celebA dataset by running python ./src/ingestion/process_celeba.py celebA
  2. to be continued...

3. Project structure

  • src : all production source code, in a structured directory
  • data : a small amount of example data included in the GitHub repository so tests can be run to validate the installation
  • static : any images or content to include in the README or in the web framework, if part of the pipeline
  • to be continued...


transparent_latent_gan's Issues

Generating Feature Axes?

Hi,

This is great work, very interesting. I've managed to train my own GAN and feature extractor (I have my own labelled dataset), but am not having success with creating the feature axes.

It's not clear to me what inputs src/tl-gan/script_label_regression.py requires in order to generate the feature_direction_*.pkl file. Any help would be much appreciated.

Unable to run the demo on local machine

Hi,

As per your instructions, I downloaded the code and ran it from the command prompt. The following error was encountered:

    deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
    from ._conv import register_converters as _register_converters
    Traceback (most recent call last):
    File "", line 1, in
    File "", line 17, in
    File "src/tl_gan/feature_axis.py", line 15
    SyntaxError: Non-ASCII character '\xe2' in file src/tl_gan/feature_axis.py on line 16, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Please suggest the way forward.

FYI:
Python version 2.7
tensorflow version 1.7.0

Thanks

How to cite your work?

Hi, you did a brilliant job. We would like to leverage your structure, using a network to compute the gradient and then manipulate the noise (image). Is there any paper published about your work? When our work is done, how can we cite yours?

ImportError

File "I:/TL-gan/transparent_latent_gan-master/src/tl_gan/script_generation_interactive.py", line 63, in
G, D, Gs = pickle.load(file)
ImportError: No module named 'tfutil'

but we don't used tfuil

Is there a paper for TL-GAN?

Dear Mr. Guan,
while doing research, I came across your Medium article about TL-GAN and this corresponding implementation. I find your idea really exciting and would like to use it for some experiments. Did you publish a paper about your work, so that I could dive deeper into the concept? If not, what would be your preferred way to be cited in a possible follow-up publication?
Kind regards,
Silvan Mertes

Fail to download the dataset

It seems that we should create the folder ./data/raw first.
However, after doing that, I still cannot download the data with the given instruction:
(screenshot)

Try it out on Kubeflow?

This project is fantastic.

If anyone would be interested in trying this project out on Kubeflow, let me know (kubeflow.slack.com); I'd be happy to support that by providing a Kubeflow cluster.

It would be great to understand how well the following works

Try to run the sample notebook on Kubeflow

  • Deploy Kubeflow
  • Navigate to JupyterHub
  • Launch Jupyter
  • Grab the notebook (e.g. by cloning the repo into your notebook pod)
  • Try running the notebook

Try training the model on Kubeflow

  • Upload the data to object storage (e.g. a GCS bucket)
  • Create a Docker image containing the training code
  • Create a TFJob to train the model (instructions)

Deploy the interactive web app on Kubeflow

NCHW issue with Conv2D

My system does not have an Nvidia GPU, so the CPU is used. This results in the following error:

tensorflow.python.framework.errors_impl.UnimplementedError: Generic conv implementation only supports NHWC tensor format for now.
	 [[Node: G_paper_1/Run/G_paper_1/4x4/Conv/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](G_paper_1/Run/G_paper_1/4x4/Dense/PixelNorm/mul, G_paper_1/Run/G_paper_1/4x4/Conv/mul)]]

Further investigation shows that on CPU only the NHWC tensor format is supported, not NCHW. It seems there is a solution though: https://stackoverflow.com/questions/37689423/convert-between-nhwc-and-nchw-in-tensorflow
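For anyone experimenting along those lines, the conversion itself is just a transpose of the tensor axes. A generic TensorFlow snippet, not a tested patch for this repository:

    import tensorflow as tf

    x_nchw = tf.zeros([1, 3, 1024, 1024])             # batch, channels, height, width
    x_nhwc = tf.transpose(x_nchw, perm=[0, 2, 3, 1])  # NCHW -> NHWC (CPU-friendly)
    x_back = tf.transpose(x_nhwc, perm=[0, 3, 1, 2])  # NHWC -> NCHW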

Application stuck at GUI loading

Hello!

It almost works, but gets stuck when the GUI loads. Pic included. Using the latest version of tensorflow, CUDA 9.0, Windows 10. I'm sort of bad at Python, so I'm unsure how to debug it properly.

I tried the online kaggle notebook, super cool stuff!

Thanks in advance!

Pic: where it gets stuck (screenshot: issue_not_loading)

UPDATE: I ran the notebook instead of trying to start the individual GUI, and that worked just fine.

Encoder Trainable?

Is the image encoder somehow trainable in your repo, or can we only use the pre-trained model?

Generate Noise from Image

Hi @SummitKwan
Thanks for setting up the kaggle demo of your project. The project is just awesome & works great with the random images.

I see that the 'Gs' parameter fetched in the line of code below

    G, D, Gs = pickle.load(file)

is being used to generate images from noise.

Can you please tell me the way to create noise from images? That would be a great help.
I am trying to test the project on an existing image.

Thanks
Akash

Should the linear classifier be replaced by a multilabel classifier?

As shown in the demo, many attributes of a generated image can be adjusted. I am preparing my own dataset, CelebA-HQ, where every sample has 40 attributes (Gray_Hair, Heavy_Makeup, High_Cheekbones, Male, ...). This is obviously a multilabel classification problem. As in the current repo, a simple LinearRegression seems to work at the beginning but fails as the dataset size increases (from 4000 to 8000 image samples). Some of my results come from a LinearRegression model trained on 4000 image samples of CelebA-HQ:

  • original image (ori_00036)
  • with eyeglasses (Eyeglasses_00036_old)
  • with beard (No_Beard_00036_old)
  • smiling (Smiling_00036)

I tried to change some other attributes, but they all failed.
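For readers following along: the axis-discovery step being discussed boils down to fitting a linear model from latent vectors to attribute scores and reading the coefficient rows as feature directions. A minimal scikit-learn sketch with random placeholder data; whether the repository normalizes the axes exactly this way is an assumption:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    n_samples, latent_dim, n_attributes = 4000, 512, 40
    z = np.random.randn(n_samples, latent_dim)   # latent vectors of generated images
    y = np.random.rand(n_samples, n_attributes)  # attribute scores from the feature extractor

    reg = LinearRegression().fit(z, y)
    feature_axes = reg.coef_                     # shape (40, 512): one direction per attribute
    feature_axes /= np.linalg.norm(feature_axes, axis=1, keepdims=True)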
