
transparent_latent_gan's Introduction

TL-GAN: transparent latent-space GAN

This is the repository of my three-week project: "Draw as you can tell: controlled image synthesis and edit using TL-GAN"

Resources:

[Demo GIF: interactively editing a generated face]

A high-quality video of the above GIF is available on YouTube

Core ideas

  • This project provides a novel method to control the generation process of a generative model trained without supervision, such as a GAN (generative adversarial network).
  • GANs can generate photo-realistic images from random noise vectors in the latent space (see the stunning examples from Nvidia's PG-GAN), but we have no control over the features of the generated images.
  • Since each image is determined by its noise vector in the latent space, understanding the latent space gives us control over the generation process.
  • For an already well-trained GAN generator, I made its latent space transparent by discovering feature axes in it. When a vector moves along a feature axis in the latent space, the corresponding image morphs along that feature, which enables controlled synthesis and editing (see the sketch below).
  • This is achieved by leveraging a coupled feature extractor network (a CNN in this demo, but any other computer-vision technique would work), which lets us find correlations between noise vectors and image features.
  • Advantages of this method over conditional GAN and AC-GAN:
    • Efficiency: to add a new controller (knob) to the generator, you do not have to re-train the GAN model; with our method, adding 40 knobs takes less than one hour.
    • Flexibility: you can use feature extractors trained on different datasets to add knobs to an already well-trained GAN.
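To make the core idea concrete, here is a minimal numpy sketch of the latent-space arithmetic. The axes below are random stand-ins for axes the method actually discovers, and feeding the edited vector to the generator is not shown:

    import numpy as np

    latent_dim = 512                         # PG-GAN's latent size
    z = np.random.randn(latent_dim)          # a random point in latent space (one random face)

    # Stand-ins for two discovered feature axes (unit vectors in latent space)
    beard_axis = np.random.randn(latent_dim)
    beard_axis /= np.linalg.norm(beard_axis)
    male_axis = np.random.randn(latent_dim)
    male_axis /= np.linalg.norm(male_axis)

    # Moving z along an axis morphs the generated image along that feature:
    # the generator output for z_edited is the same face, but with more beard.
    z_edited = z + 2.0 * beard_axis

    # "Locking" a feature (e.g. Male while editing Beard) corresponds to first
    # projecting the locked axis out, so the move does not change that feature.
    beard_only = beard_axis - beard_axis.dot(male_axis) * male_axis
    beard_only /= np.linalg.norm(beard_only)
    z_edited_locked = z + 2.0 * beard_only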

1. Instructions on the online demo

1.1 Why the demo is hosted on Kaggle

I host the demo as a Kaggle notebook instead of a more convenient web app due to cost considerations.

Kaggle generously provides kernels with GPUs for free! Alternatively, a web app with a backend running on an AWS GPU instance would cost ~$600 per month. Thanks to Kaggle, everyone can play with the model without downloading code or data to a local machine!

1.2 To use the demo

Open this link from your web browser: https://www.kaggle.com/summitkwan/tl-gan-demo

  1. Make sure you have a Kaggle account. If not, please register one (this takes only seconds by linking your Google or Facebook account). Having a Kaggle account is rewarding in its own right, since it allows you to participate in numerous data science challenges and to join a knowledgeable and friendly community.
  2. Fork the current notebook
  3. Run the notebook by pressing the double-right-arrow button at the bottom left of the web page. If something does not work right, try restarting the kernel by pressing the circular-arrow button at the bottom right, then rerun the notebook
  4. Go to the bottom of the notebook and play with the image interactively
  5. You are all set; play with the model:
    • Press the "-/+" buttons to adjust each feature
    • Toggle a feature's name to lock that feature, e.g. lock "Male" while playing with "Beard"

2. Instructions on running the code on your machine

Tested on an Nvidia K80 GPU with CUDA 9.0 and Anaconda Python 3.6

2.1 Set up the code and environment

  1. Clone this repository
  2. cd to the root directory of the project (the folder containing the README.md)
  3. Install dependencies by running pip install -r requirements.txt in a terminal. Consider using a virtual environment to avoid modifying your current Python environment.

2.2 Use the trained model on your machine

  1. Manually download the pre-trained PG-GAN model (provided by Nvidia), the trained feature extractor network, and the discovered feature axes from my personal Dropbox link

  2. Decompress the downloaded files and put them in the project directory with the following layout, where (d) marks a directory

    root(d):
      asset_model(d):
        karras2018iclr-celebahq-1024x1024.pkl   # pretrained GAN from Nvidia
        cnn_face_attr_celeba(d):
          model_20180927_032934.h5              # trained feature extractor network
      asset_results(d):
        pg_gan_celeba_feature_direction_40(d):
          feature_direction_20181002_044444.pkl # feature axes
    
  3. Run the interactive demo by first entering an interactive Python shell from the terminal (make sure you are at the project root directory), and then running the following command in Python

    exec(open('./src/tl_gan/script_generation_interactive.py').read())

    Alternatively, you can run the interactive demo from the Jupyter Notebook at ./src/notebooks/tl_gan_ipywidgets_gui.ipynb

  4. An interactive GUI will pop up; play with the model. For a rough idea of what the script does when it loads the model, see the sketch below.
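As a reference, here is a rough sketch of the model-loading and generation step. It assumes Nvidia's PG-GAN pickle interface (Gs.run, Gs.input_shapes) and requires Nvidia's tfutil module to be importable when unpickling; the sys.path entry below is illustrative, not the repository's actual layout:

    import pickle
    import sys

    import numpy as np
    import tensorflow as tf

    sys.path.append('./src/model/pggan')  # illustrative: the pickle needs Nvidia's tfutil on the path

    tf.InteractiveSession()
    with open('./asset_model/karras2018iclr-celebahq-1024x1024.pkl', 'rb') as file:
        G, D, Gs = pickle.load(file)      # Gs: the long-term-averaged generator

    latents = np.random.randn(1, *Gs.input_shapes[0][1:])  # one random latent vector
    labels = np.zeros([1] + Gs.input_shapes[1][1:])        # the CelebA-HQ model takes empty labels
    images = Gs.run(latents, labels)                       # NCHW array, values roughly in [-1, 1]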

2.3 Instructions on training the model on your own

  1. Download the celebA dataset by running python ./src/ingestion/process_celeba.py celebA
  2. to be continued...

3. Project structure

  • src : all production source code, in a structured directory
  • data : a small amount of example data included in the GitHub repository so tests can be run to validate the installation
  • static : any images or content to include in the README or in the web framework, if part of the pipeline
  • to be continued...


transparent_latent_gan's Issues

Generating Feature Axes?

Hi,

This is great work, very interesting. I've managed to train my own GAN and feature extractor (I have my own labelled dataset), but am not having success with creating the feature axes.

It's not clear to me what inputs src/tl-gan/script_label_regression.py requires in order to generate the feature_direction_*.pkl file. Any help would be much appreciated.

Unable to run the demo on local machine

Hi,

As per your instructions, I downloaded the code and ran it from the command prompt. The following error was encountered:

    deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
    from ._conv import register_converters as _register_converters
    Traceback (most recent call last):
    File "", line 1, in
    File "", line 17, in
    File "src/tl_gan/feature_axis.py", line 15
    SyntaxError: Non-ASCII character '\xe2' in file src/tl_gan/feature_axis.py on line 16, but no encoding declared; see http://python.org/dev/peps/pep-0263/ for details

Please suggest the way forward.

FYI:
Python version 2.7
tensorflow version 1.7.0

Thanks

How to cite your work?

Hi, you did a brilliant job. We would like to leverage your structure, using a network to compute the gradient and then manipulate the noise (image). Is there any paper published about your work? When our work is done, how can we cite yours?

ImportError

File "I:/TL-gan/transparent_latent_gan-master/src/tl_gan/script_generation_interactive.py", line 63, in
G, D, Gs = pickle.load(file)
ImportError: No module named 'tfutil'

but we don't used tfuil

Is there a paper for TL-GAN?

Dear Mr. Guan,
while doing research, I came across your Medium article about TL-GAN and this corresponding implementation. I find your idea really exciting and would like to use it for some experiments. Did you publish a paper about your work, so that I could dive deeper into the concept? If not, what would be your preferred way to be cited in a possible follow-up publication?
Kind regards,
Silvan Mertes

Fail to download the dataset

It seems that we should create the folder ./data/raw first.
However, after doing that, I still cannot download the data with the given instruction:
(screenshot)

Try it out on Kubeflow?

This project is fantastic.

If anyone would be interested in trying this project out on Kubeflow, let me know (kubeflow.slack.com); I'd be happy to support that by providing a Kubeflow cluster.

It would be great to understand how well the following works

Try to run the sample notebook on Kubeflow

  • Deploy Kubeflow
  • Navigate to JupyterHub
  • Launch Jupyter
  • Grab the notebook (e.g. by cloning the repo into your notebook pod)
  • Try running the notebook

Try training the model on Kubeflow

  • Upload the data to object storage (e.g. a GCS bucket)
  • Create a Docker image containing the training code
  • Create a TFJob to train the model (instructions)

Deploy the interactive web app on Kubeflow

NCHW issue with Conv2D

My system does not have an Nvidia GPU, so the CPU is used. This results in the following error:

tensorflow.python.framework.errors_impl.UnimplementedError: Generic conv implementation only supports NHWC tensor format for now.
	 [[Node: G_paper_1/Run/G_paper_1/4x4/Conv/Conv2D = Conv2D[T=DT_FLOAT, data_format="NCHW", dilations=[1, 1, 1, 1], padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](G_paper_1/Run/G_paper_1/4x4/Dense/PixelNorm/mul, G_paper_1/Run/G_paper_1/4x4/Conv/mul)]]

Further investigation shows that on CPU only the NHWC tensor format is supported, not NCHW. It seems there is a solution though: https://stackoverflow.com/questions/37689423/convert-between-nhwc-and-nchw-in-tensorflow
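For anyone experimenting along those lines, the conversion itself is just a transpose of the tensor axes. A generic TensorFlow snippet, not a tested patch for this repository:

    import tensorflow as tf

    x_nchw = tf.zeros([1, 3, 1024, 1024])             # batch, channels, height, width
    x_nhwc = tf.transpose(x_nchw, perm=[0, 2, 3, 1])  # NCHW -> NHWC (CPU-friendly)
    x_back = tf.transpose(x_nhwc, perm=[0, 3, 1, 2])  # NHWC -> NCHW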

Application stuck at GUI loading

Hello!

It almost works, but gets stuck when the GUI loads. Pic included. Using the latest version of tensorflow, CUDA 9.0, Windows 10. I'm sort of bad at Python, so I'm unsure how to debug it properly.

I tried the online kaggle notebook, super cool stuff!

Thanks in advance!

Pic: where it gets stuck (screenshot: issue_not_loading)

UPDATE: I ran the notebook instead of trying to start the individual GUI, and that worked just fine.

Encoder Trainable?

Is the image encoder somehow trainable in your repo, or can we only use the pre-trained model?

Generate Noise from Image

Hi @SummitKwan
Thanks for setting up the kaggle demo of your project. The project is just awesome & works great with the random images.

I see that the 'Gs' parameter fetched in the line of code below

    G, D, Gs = pickle.load(file)

is being used to generate images from noise.

Can you please tell me the way to create noise from images? That would be a great help.
I am trying to test the project on an existing image.

Thanks
Akash

Should the linear classifier be replaced by a multilabel classifier?

As shown in the demo, many attributes of a generated image can be adjusted. I am preparing my own dataset, CelebA-HQ, where every sample has 40 attributes (Gray_Hair, Heavy_Makeup, High_Cheekbones, Male, ...). This is obviously a multilabel classification problem. As in the current repo, a simple LinearRegression seems to work at the beginning but fails as the dataset size increases (from 4000 to 8000 image samples). Some of my results come from a LinearRegression model trained on 4000 image samples of CelebA-HQ:

  • original image (ori_00036)
  • with eyeglasses (Eyeglasses_00036_old)
  • with beard (No_Beard_00036_old)
  • smiling (Smiling_00036)

I tried to change some other attributes, but they all failed.
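For readers following along: the axis-discovery step being discussed boils down to fitting a linear model from latent vectors to attribute scores and reading the coefficient rows as feature directions. A minimal scikit-learn sketch with random placeholder data; whether the repository normalizes the axes exactly this way is an assumption:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    n_samples, latent_dim, n_attributes = 4000, 512, 40
    z = np.random.randn(n_samples, latent_dim)   # latent vectors of generated images
    y = np.random.rand(n_samples, n_attributes)  # attribute scores from the feature extractor

    reg = LinearRegression().fit(z, y)
    feature_axes = reg.coef_                     # shape (40, 512): one direction per attribute
    feature_axes /= np.linalg.norm(feature_axes, axis=1, keepdims=True)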
