akanimax / t2f


T2F: text to face generation using Deep Learning

License: MIT License

Python 57.52% Shell 1.28% sed 1.43% Jupyter Notebook 39.78%

gan generative-adversarial-network adversarial-machine-learning progressively-growing-gan text-to-image

t2f's Introduction

! Attention !

This project is unfortunately no longer being worked on.

Please head over to the following much cooler project that takes the idea of Text-2-Image generation to the next level:

DallE: Original PyTorch

⭐ [NEW] ⭐

T2F - 2.0 Teaser (coming soon ...)

Please note that all the faces in the above samples are generated. T2F 2.0 will use MSG-GAN instead of ProGAN for the image generation module. Please refer to the link for more information about MSG-GAN. This update to the repository is coming soon 👍.

T2F

Text-to-Face generation using Deep Learning. This project combines two recent architectures, StackGAN and ProGAN, to synthesize faces from textual descriptions.
The project uses the Face2Text dataset, which contains 400 facial images with textual captions for each of them. The data can be obtained by contacting either the RIVAL group or the authors of the aforementioned paper.

Some Examples:

Architecture:

The textual description is encoded into a summary vector using an LSTM network. This summary vector, i.e. the Embedding (psy_t) shown in the diagram, is passed through the Conditioning Augmentation block (a single linear layer) to obtain the textual part of the GAN's input latent vector, using a VAE-like reparameterization technique. The second part of the latent vector is random Gaussian noise. The resulting latent vector is fed to the generator, while the embedding is fed to the final layer of the discriminator for conditional distribution matching. Training proceeds exactly as described in the ProGAN paper, i.e. layer by layer at increasing spatial resolutions. Each new layer is introduced using the fade-in technique to avoid destroying what the previous layers have learned.
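The latent-vector construction described above can be sketched in a few lines. This is a dependency-free illustration in plain Python, not the repository's PyTorch code: the function names are hypothetical, the embedding is a random stand-in for the LSTM summary, and treating the second half of the linear layer's output as a log standard deviation is an assumption (one common way to write the reparameterization). The sizes are taken from the sample configuration.

```python
import math
import random


def linear(x, weight, bias):
    """Apply a dense layer; `weight` holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def conditioning_augmentation(embedding, weight, bias, rng):
    """Map a text embedding to a sampled conditioning vector.

    The single linear layer emits 2 * ca_out_size values. Assumption: the
    first half is the mean and the second half the log standard deviation.
    """
    stats = linear(embedding, weight, bias)
    half = len(stats) // 2
    mean, log_sigma = stats[:half], stats[half:]
    # Reparameterization: c = mean + sigma * eps, with eps ~ N(0, 1).
    return [m + math.exp(s) * rng.gauss(0.0, 1.0) for m, s in zip(mean, log_sigma)]


def build_latent(embedding, weight, bias, latent_size, rng):
    """Concatenate the conditioning vector with Gaussian noise."""
    c = conditioning_augmentation(embedding, weight, bias, rng)
    noise = [rng.gauss(0.0, 1.0) for _ in range(latent_size - len(c))]
    return c + noise


rng = random.Random(0)
hidden_size, ca_out_size, latent_size = 256, 178, 256  # sample configuration values
embedding = [rng.uniform(-1.0, 1.0) for _ in range(hidden_size)]  # stand-in for the LSTM summary
weight = [[rng.gauss(0.0, 0.01) for _ in range(hidden_size)] for _ in range(2 * ca_out_size)]
bias = [0.0] * (2 * ca_out_size)

latent = build_latent(embedding, weight, bias, latent_size, rng)
print(len(latent))  # 178 conditioning values + 78 noise values = 256
```

With these sizes the noise part has 256 - 178 = 78 dimensions, so the text conditioning makes up most of the generator's input.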

Running the code:

The code is in the implementation/ subdirectory and is written with the PyTorch framework. To run it, please install PyTorch version 0.4.0 before continuing.

Code organization:
configs: contains the configuration files for training the network. (You can use any one, or create your own)
data_processing: package containing data processing and loading modules
networks: package containing the network implementations
processed_annotations: directory that stores the output of the process_text_annotations.py script
process_text_annotations.py: processes the captions and stores the output in the processed_annotations/ directory. (There is no need to run this script; the pickle file is included in the repo.)
train_network.py: script for training the network

Sample configuration:

# All paths to different required data objects
images_dir: "../data/LFW/lfw"
processed_text_file: "processed_annotations/processed_text.pkl"
log_dir: "training_runs/11/losses/"
sample_dir: "training_runs/11/generated_samples/"
save_dir: "training_runs/11/saved_models/"

# Hyperparameters for the Model
captions_length: 100
img_dims:
  - 64
  - 64

# LSTM hyperparameters
embedding_size: 128
hidden_size: 256
num_layers: 3  # number of stacked LSTM layers in the encoder network

# Conditioning Augmentation hyperparameters
ca_out_size: 178

# Pro GAN hyperparameters
depth: 5
latent_size: 256
learning_rate: 0.001
beta_1: 0
beta_2: 0
eps: 0.00000001
drift: 0.001
n_critic: 1

# Training hyperparameters:
epochs:
  - 160
  - 80
  - 40
  - 20
  - 10

# % of epochs for fading in the new layer
fade_in_percentage:
  - 85
  - 85
  - 85
  - 85
  - 85

batch_sizes:
  - 16
  - 16
  - 16
  - 16
  - 16

num_workers: 3
feedback_factor: 7  # number of logs generated per epoch
checkpoint_factor: 2  # save the models after these many epochs
use_matching_aware_discriminator: True  # use the matching aware discriminator
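To make the per-stage lists in this configuration easier to read, here is a small sketch of how they could map onto progressive training. It assumes the usual ProGAN convention of starting at 4x4 and doubling the resolution at each stage, that the first list entry belongs to the lowest resolution, and a linear fade-in ramp; the function names are hypothetical and this is not the training loop from the repository.

```python
def stage_resolutions(depth, base=4):
    """Resolution of each progressive stage, doubling from `base` (assumed 4x4)."""
    return [base * 2 ** i for i in range(depth)]


def fade_in_alpha(step, total_steps, fade_in_percentage):
    """Blend weight of the new layer at a given step within one stage.

    Assumed schedule: ramp linearly from 0 to 1 over the first
    `fade_in_percentage` percent of the stage, then stay at 1.
    """
    fade_steps = max(1, int(total_steps * fade_in_percentage / 100))
    return min(1.0, step / fade_steps)


def blend(old_upsampled, new, alpha):
    """Mix the upsampled output of the previous stage with the new layer's output."""
    return [(1.0 - alpha) * o + alpha * n for o, n in zip(old_upsampled, new)]


depth = 5
epochs = [160, 80, 40, 20, 10]
fade_in_percentage = [85, 85, 85, 85, 85]

for res, n_epochs, pct in zip(stage_resolutions(depth), epochs, fade_in_percentage):
    print(f"{res}x{res}: {n_epochs} epochs, fade-in over the first {pct}%")

print(fade_in_alpha(step=0, total_steps=1000, fade_in_percentage=85))    # 0.0
print(fade_in_alpha(step=425, total_steps=1000, fade_in_percentage=85))  # 0.5
print(fade_in_alpha(step=900, total_steps=1000, fade_in_percentage=85))  # 1.0
```

Under these assumptions a depth of 5 ends at 64x64, which agrees with the img_dims entry, and each of the epochs, fade_in_percentage and batch_sizes lists needs one value per stage.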

Use the requirements.txt to install all the dependencies for the project.

$ workon [your virtual environment]
$ pip install -r requirements.txt

Sample run:

$ mkdir -p training_runs/11/generated_samples training_runs/11/losses training_runs/11/saved_models
$ python train_network.py --config=configs/11.conf

#TODO:

1.) Create a simple demo.py for running inference on the trained models


t2f's Issues

RuntimeError: cuDNN version incompatibility: PyTorch was compiled against 7401 but linked against 7301

Since I was having issues with PyTorch 0.4.0, mainly with importing interpolate, I decided to install PyTorch 1.0.0. However when I run the train_network.py file, I get this error.

How do I fix this?

Use this repo or wait for v2?

Hi,
Thank you so much for your work! I just obtained v2 of the dataset, which has 10 times the images (4000 now), and wanted to get started on this task. Would it still be a good idea to use this repo, or is T2F v2 right around the corner? Or can you suggest changes I can make to bring this implementation as close to v2 as possible?
Thanks

Mode collapse

I replaced ProGAN with MSG-StyleGAN as you mentioned before. I used the 400 images from the RIVAL group, and mode collapse always happens. Any idea why?
Thanks.

Broken code in train_network.py

I installed pro_gan_pytorch from your other github repo and ran train_network.py. This is the error I got.
Traceback (most recent call last):
  File "train_network.py", line 427, in <module>
    main(parse_arguments())
  File "train_network.py", line 307, in main
    from pro_gan_pytorch.PRO_GAN import ConditionalProGAN
ModuleNotFoundError: No module named 'pro_gan_pytorch'

Find/Create an open dataset

The closed nature of the dataset used makes it hard for outside contributors, including me, to debug and improve the project.
If an open alternative exists, it should be linked; if not, one should be [collaboratively] created.

Image generated not good, not even comparable.

@akanimax @AhmedHani I trained the model up to depth 6 with epochs (640, 320, 160, 80, 40, 20) and a batch size of 16 at each depth, but the output from the final .pth file wasn't up to the mark. It also generates 16 images for a single description. Can you explain what exactly those 16 images are, and why there isn't a single image per description? The generated images aren't good either. What can I do to improve them? Could you provide the trained augmentation, encoder, gen and dis models, if they generate good images, like you have provided for celeb using msg?
I tried various depths and resolutions, but the output isn't clear.

Description:
he is an old man with a wrinkled face , gray hair and no beard he has dark eyes and seems happy about something

Image:

How do we give input to the model, and where are the processed images stored?

I have trained the model, but now I need to test it.
I took the demo.py as inspiration for the new demo, and am trying to give my custom caption as input. However I do not know how to do so.

# load the model for the demo
gen = th.nn.DataParallel(pg.Generator(depth=9))
gen.load_state_dict(th.load("GAN_GEN_SHADOW_8.pth", map_location=str(device)))

How do I change the above code for making my trained model work?

TypeError: __init__() got an unexpected keyword argument 'embedding_size'

When I run train_network.py,

I get the error:


Traceback (most recent call last):
  File "train_network.py", line 428, in <module>
    main(parse_arguments())
  File "train_network.py", line 381, in main
    device=device
TypeError: __init__() got an unexpected keyword argument 'embedding_size'

Does anyone know how to solve this?

Include the pytorch model of the final trained stage in the repo!

Please include the pytorch model in the repo, so that it is easy to obtain inference using demo.py or any such program.

Slack Group for GAN / Deep RL enthusiasts.

Dear watchers,

I have created a Slack group for GAN and Deep RL enthusiasts. I hope we can discuss problems faced while running code or training a GAN in general, or even potential new project ideas. My hope is that if I am not available, then perhaps someone in the group who has faced the same problem could help the ones in need. Proactive participation in the group will really benefit us all. I hope this group helps.

link to the group -> https://join.slack.com/t/amlrldl/shared_invite/enQtNDcyMTIxODg3NjIzLTA3MTlmMDg0YmExYjY5OTgyZTg4MTg5ZGE1YzRlYjljZmM4MzI0MTg1OTcxOTc5NDQ4ZTcwMGVkZjBjZmU5ZWM

Best regards,
Animesh

p.s. This issue will be closed in a week

Code error

Traceback (most recent call last):
  File "C:/Users/zhoug/Desktop/T2F-master/implementation/train_network.py", line 427, in <module>
    main(parse_arguments())
  File "C:/Users/zhoug/Desktop/T2F-master/implementation/train_network.py", line 380, in main
    device=device
TypeError: __init__() got an unexpected keyword argument 'embedding_size'

Could you please tell me how to deal with this ERROR?

Trying a new data set which has around 200K face images and attributes.

Dataset=> http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html
The data set is very large with respect to the LFW data set which only has 400 images in it.
I wonder if it can be used for training the model.
Given that T2F 2.0 is around the corner, the model might perform better with a larger image dataset.

TypeError: __init__() got an unexpected keyword argument 'embedding_size'

When I run train_network.py,

I get the error:

Traceback (most recent call last):
  File "train_network.py", line 428, in <module>
    main(parse_arguments())
  File "train_network.py", line 381, in main
    device=device
TypeError: __init__() got an unexpected keyword argument 'embedding_size'

Does anyone know how to solve this?

RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.LongTensor for argument #3 'index'

I am facing this error. Please help me to train the network.

RuntimeError: Expected object of type torch.cuda.LongTensor but found type torch.LongTensor for argument #3 'index'

python version incompatibility

NumPy needs Python 2.7, while TensorFlow needs Python 3.5.*, 3.6.* or 3.7.*, so the required Python versions conflict. What should be done in this scenario?

Generator Evaluation Metric

How would we go about implementing an evaluation metric for the generator?

I have tried loading a pre-trained discriminator in addition to the discriminator being trained, and tested the generator against the pre-trained discriminator at the end of each epoch. But I am not sure if this is a good way to measure the generator's performance.

Are there any feasible methods? I have done a small literature survey on methods for evaluating GANs, but they are usually meant for datasets with defined classes. Since we do not have classes in T2F (or do we?), I am having a hard time implementing methods such as Inception Score, Frechet Inception Distance, etc.

One method that I found is CrossLID (https://arxiv.org/abs/1905.00643), which also has an implementation on GitHub. However, I have not tried to implement it yet, as I am unsure whether it is suitable for this dataset and model.

I am getting this error and I checked out 1 parameter was passed in Losses.py. Can you help

Traceback (most recent call last):
  File "train_network.py", line 427, in <module>
    main(parse_arguments())
  File "train_network.py", line 380, in main
    device=device
  File "/home/mukesh/env1/local/lib/python2.7/site-packages/pro_gan_pytorch/PRO_GAN.py", line 523, in __init__
    self.loss = self.__setup_loss(loss)
  File "/home/mukesh/env1/local/lib/python2.7/site-packages/pro_gan_pytorch/PRO_GAN.py", line 552, in __setup_loss
    loss = losses.CondWGAN_GP(self.device, self.dis, self.drift, use_gp=True)
  File "/home/mukesh/env1/local/lib/python2.7/site-packages/pro_gan_pytorch/Losses.py", line 136, in __init__
    super().__init__(device, dis)
TypeError: super() takes at least 1 argument (0 given)
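A note on this traceback: the package paths contain python2.7, and the zero-argument form of super() exists only in Python 3, which is consistent with the error message. The sketch below uses hypothetical stand-in classes (they are not the classes from pro_gan_pytorch) to show the explicit two-argument form that both interpreter versions accept. Running the project under Python 3 would be the other way around the problem.

```python
class Loss(object):
    """Hypothetical base class standing in for a loss with two constructor arguments."""

    def __init__(self, device, dis):
        self.device = device
        self.dis = dis


class CondLoss(Loss):
    """Hypothetical subclass that forwards its arguments to the base class."""

    def __init__(self, device, dis, drift=0.001):
        # Explicit two-argument form: accepted by Python 2.7 and Python 3 alike.
        # The zero-argument form `super().__init__(...)` is Python 3 only.
        super(CondLoss, self).__init__(device, dis)
        self.drift = drift


loss = CondLoss(device="cpu", dis=None)
print(loss.device, loss.drift)
```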

should use torch==0.4.1 rather than 0.4.0

After installing torch 0.4.0 as required, I encountered the error: module 'torch.nn.functional' has no attribute 'interpolate'. interpolate was introduced in 0.4.1.

missing json

Hi @akanimax, I downloaded the LFW dataset but failed to find clean.json. It seems that you have cleaned the LFW dataset. Can you tell me how, or provide the clean.json file? Thanks very much.

RuntimeError: Expected object of backend CUDA but got backend CPU for argument #3 'index'

I am getting this error and don't know how to solve it, can you help please?

Starting the training process ...
Traceback (most recent call last):
  File "train_network.py", line 426, in <module>
    main(parse_arguments())
  File "train_network.py", line 420, in main
    use_matching_aware_dis=config.use_matching_aware_discriminator
  File "train_network.py", line 138, in train_networks
    fixed_embeddings = encoder(fixed_captions)
  File "C:\Users\Giacobbe\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "D:\Download\T2F-master\T2F-master\implementation\networks\TextEncoder.py", line 42, in forward
    output, (_, _) = self.network(x)
  File "C:\Users\Giacobbe\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Giacobbe\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\container.py", line 92, in forward
    input = module(input)
  File "C:\Users\Giacobbe\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\Giacobbe\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\modules\sparse.py", line 117, in forward
    self.norm_type, self.scale_grad_by_freq, self.sparse)
  File "C:\Users\Giacobbe\AppData\Local\Programs\Python\Python37\lib\site-packages\torch\nn\functional.py", line 1506, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected object of backend CUDA but got backend CPU for argument #3 'index'