
SofGAN (TOG 2022)

This repository contains the official PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling. We propose SofGAN, a portrait image generator that decouples the latent space of portraits into two subspaces: a geometry space and a texture space. Experiments show that our system can generate high-quality portrait images with independently controllable geometry and texture attributes.

Teaser

Colab Demo

Here we provide a Colab demo, which demonstrates the capability of style transfer and free-viewpoint portrait rendering.

Installation


Install environment:

git clone https://github.com/apchenstu/sofgan.git --recursive
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.2 -c pytorch
pip install tqdm argparse scikit-image lmdb configargparse dlib
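
To verify the environment is usable, here is an optional sanity check (SofGAN targets PyTorch 1.7.1 with CUDA 10.2, per the conda command above):

import torch
import torchvision

print(torch.__version__, torchvision.__version__)  # expect 1.7.1 / 0.8.2
print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())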

Training

Please see each subsection for training on different datasets. Available training datasets: FFHQ and CelebA.

We also provide our pre-processed FFHQ and CelebA segmaps (in our class labels). You may also want to re-train the SOF model based on your own multi-view segmaps.

Run

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port=9999 train.py \
    --num_worker 4 --resolution 1024 \
    --name $exp_name \
    --iter 10000000 \
    --batch 1 --mixing 0.9 \
    path/to/your/image/folders \
    --condition_path path/to/your/segmap/folders

In our experiments, 4x NVIDIA 2080 Ti GPUs take around 20 days to reach 10000k iterations. Adjust the image resolution and maximum iterations to suit your own dataset. Empirically, for datasets like FFHQ and CelebA (resolution 1024x1024), the network converges after 1000k iterations and achieves good results.

Notice: training on non-pairwise data (image/segmap) is encouraged, since it is one of the key features of SofGAN.

Rendering

We provide a rendering script in renderer.ipynb, where you can restyle your own photos and videos and generate free-viewpoint portrait images while maintaining geometry consistency. Just download our checkpoints and unzip them to the root folder.
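
If you want to inspect the downloaded checkpoints outside the notebook, here is a minimal sketch (the key layout is an assumption; renderer.ipynb is the authoritative reference for how the checkpoints are actually consumed):

import torch

ckpt = torch.load("./ckpts/generator.pt", map_location="cpu")
# State-dict checkpoints are usually plain dicts; print the top-level keys.
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))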

UI Illustration

The painter is included in the Painter folder; you can pull it down and draw on-the-fly. Before that, you need to install its environment with pip install -r ./Painter/requirements.txt

UI

iOS App

You can download and try Wand, an iOS app developed by Deemos.


Online Demo


Relevant Works

StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows (TOG 2021)
Rameen Abdal, Peihao Zhu, Niloy Mitra, Peter Wonka

SEAN: Image Synthesis With Semantic Region-Adaptive Normalization (CVPR 2020)
Peihao Zhu, Rameen Abdal, Yipeng Qin, Peter Wonka

StyleRig: Rigging StyleGAN for 3D Control over Portrait Images (CVPR 2020)
A. Tewari, M. Elgharib, G. Bharaj, F. Bernard, H.P. Seidel, P. Pérez, M. Zollhöfer, Ch. Theobalt

StyleGAN2: Analyzing and Improving the Image Quality of StyleGAN (CVPR 2020)
Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila

SPADE: Semantic Image Synthesis with Spatially-Adaptive Normalization (CVPR 2019)
Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu

Citation

If you find our code or paper helpful, please cite:

@article{sofgan,
  title={Sofgan: A portrait image generator with dynamic styling},
  author={Chen, Anpei and Liu, Ruiyang and Xie, Ling and Chen, Zhang and Su, Hao and Yu, Jingyi},
  journal={ACM Transactions on Graphics (TOG)},
  volume={41},
  number={1},
  pages={1--26},
  year={2022},
  publisher={ACM New York, NY}
}


sofgan's Issues

Problem with changing textures in the W+ space

@apchenstu, hello. I'd like to ask about the texture-style replacement for real images described in the paper. Following your steps, when I take the reconstructed W+ latent [18,512] and the [1,18,512] slice of a random w+ [2,18,512], concatenate them into a new W+ [2,18,512], and edit the style of the hair region, the bangs in the resulting image have a red edge, as shown below:
image
But when I take the reconstructed W latent [512] and the [1,512] slice of random styles [2,512], concatenate them into a new style, and convert it to W+ [2,18,512], the problem above does not appear. However, the image reconstructed from the W space differs greatly from the original; it is hard to tell they are the same person, as shown below:
image

Which step of my region-wise texture editing on real images went wrong, and what should I do? Looking forward to your reply, and thank you for your hard work!
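
(For readers following along, the tensor shapes the question describes look roughly like this; a sketch of the mixing step only, with generator-specific details omitted:)

import torch

w = torch.randn(512)                          # W latent from inversion
w_plus_recon = w.unsqueeze(0).repeat(18, 1)   # broadcast W -> W+ [18, 512]

w_plus_rand = torch.randn(2, 18, 512)[1]      # the [1,18,512] slice -> [18, 512]
mixed = torch.stack([w_plus_recon, w_plus_rand])  # new W+ [2, 18, 512]
print(mixed.shape)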

How to project real-captured photos into your texture space?

Hi,

I am trying to use this model to edit real-captured photos, but I think the existing renderer.ipynb file only uses random styles. In the paper, there are visualization results for regional editing of real photos (Fig. 27).
So I would like to ask how to project real-captured photos into your texture space. And after that, is it right that the scatter_to_mask function should be used, and the generated style_masks then used to control the editing region?
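
(For reference, projecting a real photo into a GAN's latent space is typically done by optimizing a latent code to reconstruct it. A minimal sketch of W+ optimization with a pixel loss, assuming a generator callable that maps a [1, 18, 512] W+ code to an image; the actual call signature lives in this repo's model code, and in practice a perceptual loss such as LPIPS is usually added:)

import torch
import torch.nn.functional as F

def project(generator, target, steps=500, lr=0.05):
    # target: (1, 3, H, W) real photo, already in the generator's value range.
    w_plus = torch.zeros(1, 18, 512, requires_grad=True)  # better: init at the mean W
    opt = torch.optim.Adam([w_plus], lr=lr)
    for _ in range(steps):
        img = generator(w_plus)          # assumed signature
        loss = F.mse_loss(img, target)   # add a perceptual term in practice
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w_plus.detach()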

Why doesn't the Generator use the real images in train.py?

I looked through train.py and didn't find the Generator using the real images.
I wonder if the Generator loads a pre-trained model that has learned from a lot of real images.
Now I want to use sofgan to generate anime photos; can I achieve that by training a model using only real anime pictures and their segmaps?

My training speed is much slower than yours

How can I improve the training speed?

It takes 2 hours to train 1000 iterations on 2x GeForce RTX 3090, so 10000k iterations would need 833 days, while your training took only 20 days.
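
(The extrapolation in the question checks out:)

# 1000 iterations take 2 hours, so 10,000k iterations take:
hours = 10_000_000 / 1000 * 2
print(hours / 24)  # ~833 days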

My train command is as follows:

python -m torch.distributed.launch --nproc_per_node=2 --master_port=9999 train.py --num_worker 4 --resolution 1024 --name Jeric --iter 1000 --batch 1 --mixing 0.9 path-to-your-image-folders --condition_path path-to-your-segmap-folders

  • path-to-your-image-folders is set to the CelebA-HQ-img folder of the CelebA dataset.

  • path-to-your-segmap-folders is set to the CelebAMask-HQ folder downloaded from your pre-processed ffhq and celeba segmaps.

  • trained on Windows 10

Thanks.

The style_mask has no effect

When I pass a style_mask to generate, the fake_img changes as if the style_mask were not there. Did I do something wrong?

About the anime face parsing

Hello! Thanks for your amazing work; I am very interested in it. I am working on some anime projects, and on the project homepage (https://apchenstu.github.io/sofgan/) I saw the editing effect of an animation (Video 5b: Generation from drawing). Where did the animation parsing data come from? Can you share a link to the anime data?

Scripts for texture style transfer

Hi,
Thanks for your solid work and the released code. I wonder if there are any scripts for reproducing the texture styling results of Fig. 12 in your paper, i.e. given a reference image A and a target image B, transfer A's texture to B while keeping B's shape.

About the segNet model

May I ask how the segNet-20Class.pth model was trained? Also, what is the actual meaning of the segmaps folder used for the condition img?

Painter won't run

After running Painter/run_UI.py I can draw, but no image appears on the right. When I click Render or change style, it prompts me to specify a load checkpoint.
In run_UI.py, loading a checkpoint requires a model at modules/sofgan.pt, but I could not find a download path for this .pt file in the project, and the checkpoint archive downloaded for the Rendering section does not contain it either. How should I proceed?

Irises repeat themselves to fill out eyes

When drawing admittedly unrealistic eye sizes, sofgan will repeat an iris and eye corners inside the eye instead of enlarging a single iris to fill the eye shape. This is a problem when trying to draw stylized faces or caricatures.

Perhaps warping the photos and segmaps with PyTorch's grid sampler to bend, twist, shrink, expand, fisheye, etc. would encourage the model to fill eyes with a single iris that fits the eye shape and size.
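
(A minimal sketch of such a warp augmentation with torch.nn.functional.grid_sample; the radial-distortion formula and strength are illustrative assumptions, and the same grid should be applied to the segmap with mode="nearest" so labels stay discrete:)

import torch
import torch.nn.functional as F

def radial_warp(img, strength=0.15):
    # img: (N, C, H, W). Build a fisheye-like sampling grid in [-1, 1] coords.
    n, _, h, w = img.shape
    ys = torch.linspace(-1, 1, h, device=img.device)
    xs = torch.linspace(-1, 1, w, device=img.device)
    gy, gx = torch.meshgrid(ys, xs)           # (H, W) each, "ij" layout
    r2 = gx ** 2 + gy ** 2
    scale = 1 + strength * r2                 # push samples outward with radius
    grid = torch.stack((gx * scale, gy * scale), dim=-1)  # (H, W, 2), (x, y) order
    grid = grid.unsqueeze(0).expand(n, -1, -1, -1)
    return F.grid_sample(img, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)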

Batch size?

Hi, this is truly a great contribution. You mentioned that training the model that predicts 1024x1024 images took approximately 20 days on 4x RTX 2080 Ti; may I know what batch size you used?

Your pre-processed ffhq and celeba segmaps zip file contains duplicate files

Hi, thank you for your amazing work!
I found an error in the zip file you provided: "segmaps.zip" from https://drive.google.com/file/d/1_gSENMI5hYj-JTjqtn14PkoLLnEp94oY/view?usp=sharing
This zip file contains two directories: "FFHQ" and "CelebAMask-HQ". However, the two directories contain completely duplicate files. It seems that the "CelebAMask-HQ" directory contains the wrong content.
Could you please provide the pre-processed segmaps for CelebAMask-HQ?

Which function should I use to generate the "region-wise distance map P"?

Hi, I have some questions about the "region-wise distance map P".

If we use the "scatter_to_mask" function to generate style_mask, the region-wise distance map can only be 0 or 1:

style_mask = scatter_to_mask(condition_img.clone(), labels)

image

Fig 1. style_mask generated by "scatter_to_mask" function

If we use the "scatter_to_mask_perregion" function to generate style_mask, the region-wise distance map can be a float in [0,1]:

style_mask = scatter_to_mask_perregion(condition_img.clone(), labels)

image

Fig 2. style_mask generated by "scatter_to_mask_perregion" function

My question is: I found that the released code uses "scatter_to_mask", but the "style-mixing" figure in the paper is not a 0-1 binary mask. So could you please tell me which function I should use? Thank you very much!

I guess you use scatter_to_mask because it gives better training results, while scatter_to_mask_perregion is used for visualization. haha.

image

Fig 3. "style mixing"
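
(For what it's worth, the difference the question points at can be illustrated independently of the repo code: a hard mask is 0/1 per region, while a per-region distance map falls off smoothly inside each region. A rough illustration using scipy, which is an assumed extra dependency; the repo's scatter_to_mask* functions are the real implementations:)

import numpy as np
from scipy.ndimage import distance_transform_edt

def binary_masks(seg, labels):
    # One hard 0/1 channel per label, as in Fig. 1.
    return np.stack([(seg == l).astype(np.float32) for l in labels])

def per_region_distance_maps(seg, labels):
    # Soft [0, 1] channels: distance to the region boundary, normalized per region.
    maps = []
    for l in labels:
        d = distance_transform_edt(seg == l)
        maps.append((d / d.max() if d.max() > 0 else d).astype(np.float32))
    return np.stack(maps)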

Can you add the ./ckpts/* files?

I haven't found the following 2 files when running renderer.ipynb:

  • ./ckpts/generator.pt
  • ./ckpts/segNet-20Class.pth

I have one GTX 1660 Super with 1408 stream processors, so it takes too long to train this model myself.
Can you help me?

Avenue to explore: greenscreen background

Because this model can isolate the background, it seems trivial to produce a green-screen matte. This is big, because when you drop a GAN-generated image into a video you generally get artifacts, boundary borders, or a visible box.

In other words: take a video, run it through ffmpeg to get all the frames, run a face-detection pass, have sofgan spit out an updated image, and run background isolation/green-screening on it; you could then have a high-quality replacement face. A rough sketch of the first two passes follows below.

fyi - @Norod
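
(A rough sketch of the frame-extraction and face-detection passes described above; it assumes ffmpeg is on the PATH and an existing frames/ directory, dlib and scikit-image are in the install list, and the sofgan inference and green-screen compositing steps are left as comments:)

import glob
import subprocess

import dlib
from skimage import io

# 1. Extract frames from the video.
subprocess.run(["ffmpeg", "-i", "input.mp4", "frames/%06d.png"], check=True)

# 2. Face-detection pass over every frame.
detector = dlib.get_frontal_face_detector()
for path in sorted(glob.glob("frames/*.png")):
    img = io.imread(path)
    for box in detector(img, 1):  # upsample once to catch small faces
        # 3. Crop the face, run it through sofgan, segment out the background,
        #    and composite the result onto a green screen (not shown).
        print(path, box.left(), box.top(), box.right(), box.bottom())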
