
SofGAN (TOG 2022)

This repository contains the official PyTorch implementation for the paper: SofGAN: A Portrait Image Generator with Dynamic Styling. We propose SofGAN, a portrait image generator that decouples the latent space of portraits into two subspaces: a geometry space and a texture space. Experiments show that our system can generate high-quality portrait images with independently controllable geometry and texture attributes.

Teaser

Colab Demo

Here we provide a Colab demo, which demonstrates the capability of style transfer and free-viewpoint portrait rendering.

Installation


Install environment:

git clone https://github.com/apchenstu/sofgan.git --recursive
conda install pytorch==1.7.1 torchvision==0.8.2 cudatoolkit=10.2 -c pytorch
pip install tqdm argparse scikit-image lmdb configargparse dlib
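
To verify the environment is usable, here is an optional sanity check (SofGAN targets PyTorch 1.7.1 with CUDA 10.2, per the conda command above):

import torch
import torchvision

print(torch.__version__, torchvision.__version__)  # expect 1.7.1 / 0.8.2
print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())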

Training

Please see each subsection for training on different datasets. Available training datasets: FFHQ and CelebA.

We also provide our pre-processed FFHQ and CelebA segmaps (in our class labels). You may also want to re-train the SOF model based on your own multi-view segmaps.

Run

CUDA_VISIBLE_DEVICES=0,1,2,3 python -m torch.distributed.launch --nproc_per_node=4 --master_port=9999 train.py \
    --num_worker 4 --resolution 1024 \
    --name $exp_name \
    --iter 10000000 \
    --batch 1 --mixing 0.9 \
    path/to/your/image/folders \
    --condition_path path/to/your/segmap/folders

In our experiments, 4x NVIDIA 2080 Ti GPUs take around 20 days to reach 10000k iterations. Adjust the image resolution and maximum iterations to suit your own dataset. Empirically, for datasets like FFHQ and CelebA (resolution 1024x1024), the network converges after 1000k iterations and achieves good results.

Notice: training on non-pairwise data (image/segmap) is encouraged, since it is one of the key features of SofGAN.

Rendering

We provide a rendering script in renderer.ipynb, where you can restyle your own photos and videos and generate free-viewpoint portrait images while maintaining geometry consistency. Just download our checkpoints and unzip them to the root folder.
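
If you want to inspect the downloaded checkpoints outside the notebook, here is a minimal sketch (the key layout is an assumption; renderer.ipynb is the authoritative reference for how the checkpoints are actually consumed):

import torch

ckpt = torch.load("./ckpts/generator.pt", map_location="cpu")
# State-dict checkpoints are usually plain dicts; print the top-level keys.
print(list(ckpt.keys()) if isinstance(ckpt, dict) else type(ckpt))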

UI Illustration

The painter is included in the Painter folder; you can pull it down and draw on-the-fly. Before that, you need to install its environment with pip install -r ./Painter/requirements.txt

UI

iOS App

You can download and try Wand, an iOS app developed by Deemos.


Online Demo


Relevant Works

StyleFlow: Attribute-conditioned Exploration of StyleGAN-Generated Images using Conditional Continuous Normalizing Flows (TOG 2021)
Rameen Abdal, Peihao Zhu, Niloy Mitra, Peter Wonka

SEAN: Image Synthesis With Semantic Region-Adaptive Normalization (CVPR 2020)
Peihao Zhu, Rameen Abdal, Yipeng Qin, Peter Wonka

StyleRig: Rigging StyleGAN for 3D Control over Portrait Images (CVPR 2020)
A. Tewari, M. Elgharib, G. Bharaj, F. Bernard, H.P. Seidel, P. Pérez, M. Zollhöfer, Ch. Theobalt

StyleGAN2: Analyzing and Improving the Image Quality of StyleGAN (CVPR 2020)
Tero Karras, Samuli Laine, Miika Aittala, Janne Hellsten, Jaakko Lehtinen, Timo Aila

SPADE: Semantic Image Synthesis with Spatially-Adaptive Normalization (CVPR 2019)
Taesung Park, Ming-Yu Liu, Ting-Chun Wang, Jun-Yan Zhu

Citation

If you find our code or paper helpful, please cite:

@article{sofgan,
  title={Sofgan: A portrait image generator with dynamic styling},
  author={Chen, Anpei and Liu, Ruiyang and Xie, Ling and Chen, Zhang and Su, Hao and Yu, Jingyi},
  journal={ACM Transactions on Graphics (TOG)},
  volume={41},
  number={1},
  pages={1--26},
  year={2022},
  publisher={ACM New York, NY}
}


sofgan's Issues

Problem with changing textures in the W+ space

@apchenstu, hello. I'd like to ask about the texture-style replacement for real images described in the paper. Following your steps, when I take the reconstructed W+ latent [18,512] and the [1,18,512] slice of a random w+ [2,18,512], concatenate them into a new W+ [2,18,512], and edit the style of the hair region, the bangs in the resulting image have a red edge, as shown below:
image
But when I take the reconstructed W latent [512] and the [1,512] slice of random styles [2,512], concatenate them into a new style, and convert it to W+ [2,18,512], the problem above does not appear. However, the image reconstructed from the W space differs greatly from the original; it is hard to tell they are the same person, as shown below:
image

Which step of my region-wise texture editing on real images went wrong, and what should I do? Looking forward to your reply, and thank you for your hard work!
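
(For readers following along, the tensor shapes the question describes look roughly like this; a sketch of the mixing step only, with generator-specific details omitted:)

import torch

w = torch.randn(512)                          # W latent from inversion
w_plus_recon = w.unsqueeze(0).repeat(18, 1)   # broadcast W -> W+ [18, 512]

w_plus_rand = torch.randn(2, 18, 512)[1]      # the [1,18,512] slice -> [18, 512]
mixed = torch.stack([w_plus_recon, w_plus_rand])  # new W+ [2, 18, 512]
print(mixed.shape)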

How to project real-captured photos into your texture space?

Hi,

I am trying to use this model to edit real-captured photos, but I think the existing renderer.ipynb file only uses random styles. In the paper, there are visualization results for regional editing of real photos (Fig. 27).
So I would like to ask how to project real-captured photos into your texture space. And after that, is it right that the scatter_to_mask function should be used, and the generated style_masks then used to control the editing region?
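
(For reference, projecting a real photo into a GAN's latent space is typically done by optimizing a latent code to reconstruct it. A minimal sketch of W+ optimization with a pixel loss, assuming a generator callable that maps a [1, 18, 512] W+ code to an image; the actual call signature lives in this repo's model code, and in practice a perceptual loss such as LPIPS is usually added:)

import torch
import torch.nn.functional as F

def project(generator, target, steps=500, lr=0.05):
    # target: (1, 3, H, W) real photo, already in the generator's value range.
    w_plus = torch.zeros(1, 18, 512, requires_grad=True)  # better: init at the mean W
    opt = torch.optim.Adam([w_plus], lr=lr)
    for _ in range(steps):
        img = generator(w_plus)          # assumed signature
        loss = F.mse_loss(img, target)   # add a perceptual term in practice
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w_plus.detach()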

Why doesn't the Generator use the real images in train.py?

I looked through train.py and didn't find the Generator using the real images.
I wonder if the Generator loads a pre-trained model that has learned from a lot of real images.
Now I want to use sofgan to generate anime photos; can I achieve that by training a model using only real anime pictures and their segmaps?

My training speed is much slower than yours

How can I improve the training speed?

It takes 2 hours to train 1000 iterations on 2x GeForce RTX 3090, so 10000k iterations would need 833 days, while your training took only 20 days.
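
(The extrapolation in the question checks out:)

# 1000 iterations take 2 hours, so 10,000k iterations take:
hours = 10_000_000 / 1000 * 2
print(hours / 24)  # ~833 days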

My train command is as follows:

python -m torch.distributed.launch --nproc_per_node=2 --master_port=9999 train.py --num_worker 4 --resolution 1024 --name Jeric --iter 1000 --batch 1 --mixing 0.9 path-to-your-image-folders --condition_path path-to-your-segmap-folders

  • path-to-your-image-folders is set to the CelebA-HQ-img folder of the CelebA dataset.

  • path-to-your-segmap-folders is set to the CelebAMask-HQ folder downloaded from your pre-processed ffhq and celeba segmaps.

  • trained on Windows 10

Thanks.

The style_mask has no effect

When I pass a style_mask to generate, the fake_img changes as if the style_mask were not there. Did I do something wrong?

About the anime face parsing

Hello! Thanks for your amazing work; I am very interested in it. I am working on some anime projects, and on the project homepage (https://apchenstu.github.io/sofgan/) I saw the editing effect of an animation (Video 5b: Generation from drawing). Where did the animation parsing data come from? Can you share a link to the anime data?

Scripts for texture style transfer

Hi,
Thanks for your solid work and the released code. I wonder if there are any scripts for reproducing the texture styling results of Fig. 12 in your paper, i.e. given a reference image A and a target image B, transfer A's texture to B while keeping B's shape.

About the segNet model

May I ask how the segNet-20Class.pth model was trained? Also, what is the actual meaning of the segmaps folder used for the condition img?

Painter won't run

After running Painter/run_UI.py I can draw, but no image appears on the right. When I click Render or change style, it prompts me to specify a load checkpoint.
In run_UI.py, loading a checkpoint requires a model at modules/sofgan.pt, but I could not find a download path for this .pt file in the project, and the checkpoint archive downloaded for the Rendering section does not contain it either. How should I proceed?

Irises repeat themselves to fill out eyes

When drawing admittedly unrealistic eye sizes, sofgan will repeat an iris and eye corners inside the eye instead of enlarging a single iris to fill the eye shape. This is a problem when trying to draw stylized faces or caricatures.

Perhaps warping the photos and segmaps with PyTorch's grid sampler to bend, twist, shrink, expand, fisheye, etc. would encourage the model to fill eyes with a single iris that fits the eye shape and size.
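
(A minimal sketch of such a warp augmentation with torch.nn.functional.grid_sample; the radial-distortion formula and strength are illustrative assumptions, and the same grid should be applied to the segmap with mode="nearest" so labels stay discrete:)

import torch
import torch.nn.functional as F

def radial_warp(img, strength=0.15):
    # img: (N, C, H, W). Build a fisheye-like sampling grid in [-1, 1] coords.
    n, _, h, w = img.shape
    ys = torch.linspace(-1, 1, h, device=img.device)
    xs = torch.linspace(-1, 1, w, device=img.device)
    gy, gx = torch.meshgrid(ys, xs)           # (H, W) each, "ij" layout
    r2 = gx ** 2 + gy ** 2
    scale = 1 + strength * r2                 # push samples outward with radius
    grid = torch.stack((gx * scale, gy * scale), dim=-1)  # (H, W, 2), (x, y) order
    grid = grid.unsqueeze(0).expand(n, -1, -1, -1)
    return F.grid_sample(img, grid, mode="bilinear",
                         padding_mode="border", align_corners=True)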

Batch size?

Hi, this is truly a great contribution. You mentioned that training the model that predicts 1024x1024 images took approximately 20 days on 4x RTX 2080 Ti; may I know what batch size you used?

Your pre-processed ffhq and celeba segmaps zip file contains duplicate files

Hi, thank you for your amazing work!
I found an error in the zip file you provided: "segmaps.zip" from https://drive.google.com/file/d/1_gSENMI5hYj-JTjqtn14PkoLLnEp94oY/view?usp=sharing
This zip file contains two directories: "FFHQ" and "CelebAMask-HQ". However, the two directories contain completely duplicate files. It seems that the "CelebAMask-HQ" directory contains the wrong content.
Could you please provide the pre-processed segmaps for CelebAMask-HQ?

Which function should I use to generate the "region-wise distance map P"?

Hi, I have some questions about the "region-wise distance map P".

If we use the "scatter_to_mask" function to generate style_mask, the region-wise distance map can only be 0 or 1:

style_mask = scatter_to_mask(condition_img.clone(), labels)

image

Fig 1. style_mask generated by "scatter_to_mask" function

If we use the "scatter_to_mask_perregion" function to generate style_mask, the region-wise distance map can be a float in [0,1]:

style_mask = scatter_to_mask_perregion(condition_img.clone(), labels)

image

Fig 2. style_mask generated by "scatter_to_mask_perregion" function

My question is: I found that the released code uses "scatter_to_mask", but the "style-mixing" figure in the paper is not a 0-1 binary mask. So could you please tell me which function I should use? Thank you very much!

I guess you use scatter_to_mask because it gives better training results, while scatter_to_mask_perregion is used for visualization. haha.

image

Fig 3. "style mixing"
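
(For what it's worth, the difference the question points at can be illustrated independently of the repo code: a hard mask is 0/1 per region, while a per-region distance map falls off smoothly inside each region. A rough illustration using scipy, which is an assumed extra dependency; the repo's scatter_to_mask* functions are the real implementations:)

import numpy as np
from scipy.ndimage import distance_transform_edt

def binary_masks(seg, labels):
    # One hard 0/1 channel per label, as in Fig. 1.
    return np.stack([(seg == l).astype(np.float32) for l in labels])

def per_region_distance_maps(seg, labels):
    # Soft [0, 1] channels: distance to the region boundary, normalized per region.
    maps = []
    for l in labels:
        d = distance_transform_edt(seg == l)
        maps.append((d / d.max() if d.max() > 0 else d).astype(np.float32))
    return np.stack(maps)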

Can you add the ./ckpts/* files?

I haven't found the following 2 files when running renderer.ipynb:

  • ./ckpts/generator.pt
  • ./ckpts/segNet-20Class.pth

I have one GTX 1660 Super with 1408 stream processors, so it takes too long to train this model myself.
Can you help me?

Avenue to explore: greenscreen background

Because this model can isolate the background, it seems trivial to produce a green-screen matte. This is big, because when you drop a GAN-generated image into a video you generally get artifacts, boundary borders, or a visible box.

In other words: take a video, run it through ffmpeg to get all the frames, run a face-detection pass, have sofgan spit out an updated image, and run background isolation/green-screening on it; you could then have a high-quality replacement face. A rough sketch of the first two passes follows below.

fyi - @Norod
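
(A rough sketch of the frame-extraction and face-detection passes described above; it assumes ffmpeg is on the PATH and an existing frames/ directory, dlib and scikit-image are in the install list, and the sofgan inference and green-screen compositing steps are left as comments:)

import glob
import subprocess

import dlib
from skimage import io

# 1. Extract frames from the video.
subprocess.run(["ffmpeg", "-i", "input.mp4", "frames/%06d.png"], check=True)

# 2. Face-detection pass over every frame.
detector = dlib.get_frontal_face_detector()
for path in sorted(glob.glob("frames/*.png")):
    img = io.imread(path)
    for box in detector(img, 1):  # upsample once to catch small faces
        # 3. Crop the face, run it through sofgan, segment out the background,
        #    and composite the result onto a green screen (not shown).
        print(path, box.left(), box.top(), box.right(), box.bottom())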
