HFGI: High-Fidelity GAN Inversion for Image Attribute Editing (CVPR 2022)

https://tengfei-wang.github.io/HFGI/

Update: We released the inference code and the pre-trained model on Oct. 31. The training code is coming soon.
Update: We provided a Colab notebook to play with.
Update: We released the training code.

paper | supp. | project website | demo video | Colab | online demo

Introduction

We present a novel high-fidelity GAN inversion framework that enables attribute editing with image-specific details well-preserved (e.g., background, appearance and illumination).

To Do

  • Release the inference code
  • Release the pretrained model
  • Release the training code

Set up

Installation

git clone https://github.com/Tengfei-Wang/HFGI.git
cd HFGI

Environment

The environment can be set up simply with Anaconda (only tested for inference):

conda create -n HFGI python=3.7
conda activate HFGI
pip install torch==1.6.0+cu101 torchvision==0.7.0+cu101 -f https://download.pytorch.org/whl/torch_stable.html
pip install matplotlib
conda install ninja
conda install -c 3dhubs gcc-5

Alternatively, you can set up the environment from the provided environment.yml:

conda env create -f environment.yml

Quick Start

Pretrained Models

Please download our pre-trained model and put it in ./checkpoint.

Model          Description
Face Editing   Trained on FFHQ.

Prepare Images

We put some images from CelebA-HQ in ./test_imgs, so you can try them quickly (along with other images from CelebA-HQ or FFHQ).
For customized images, we encourage you to pre-process (align & crop) them first and then edit them with our model. See FFHQ for alignment details.

Inference

Modify inference.sh according to the following instructions, and run (the first run may be slow):

bash inference.sh
Arg               Description
--images_dir      path of the input images.
--n_sample        number of images to infer.
--edit_attribute  one of 'inversion', 'age', 'smile', 'eyes', 'lip' and 'beard'.
--edit_degree     controls the degree of editing (only for 'age' and 'smile').
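For reference, inference.sh is a thin wrapper around a Python entry script. A hypothetical version using the arguments above might look like the sketch below; the entry-point path ./scripts/inference.py is an assumption, so check the shipped inference.sh for the actual script name and any additional required arguments (e.g. the checkpoint path):

```shell
# Hypothetical inference.sh -- the entry-point path is an assumption;
# only the four flags documented above come from this README.
python ./scripts/inference.py \
    --images_dir=./test_imgs \
    --n_sample=4 \
    --edit_attribute='smile' \
    --edit_degree=1.0
```

Setting --edit_attribute='inversion' reconstructs the inputs without applying any edit.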

Training

Preparation

  1. Download datasets and modify the dataset path in ./configs/paths_config.py accordingly.
  2. Download some pretrained models and put them in ./pretrained.
Model                          Description
StyleGAN2 (FFHQ)               Pretrained face generator on FFHQ, from rosinality.
e4e (FFHQ)                     Pretrained initial encoder on FFHQ, from omertov.
Feature extractor (for face)   Pretrained IR-SE50 model from TreB1eN, for ID loss calculation.
Feature extractor (for car)    Pretrained ResNet-50 model from omertov, for ID loss calculation.
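For step 1 above, the dataset paths live in ./configs/paths_config.py. The sketch below illustrates the kind of edit involved; the dictionary and key names are assumptions modeled on e4e-style repositories, so keep whatever keys the repo's file already defines and only swap in your local paths:

```python
# Illustrative sketch of ./configs/paths_config.py.
# Dictionary and key names are assumptions -- match them to the keys
# actually present in the file; only the paths should change.
dataset_paths = {
    'ffhq': '/data/ffhq/images1024',        # training images
    'celeba_test': '/data/celeba_hq/test',  # evaluation images
}

model_paths = {
    'stylegan_ffhq': './pretrained/stylegan2-ffhq-config-f.pt',
    'ir_se50': './pretrained/model_ir_se50.pth',
}
```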

Start Training

Modify the options and training.sh accordingly, then run:

bash train.sh

Video Editing

The source videos and edited results in our paper can be found at this link.
For video editing, we first pre-process (align & crop) each frame, and then perform editing with the pre-trained model.

More Results

Citation

If you find this work useful for your research, please cite:

@inproceedings{wang2021HFGI,
  title={High-Fidelity GAN Inversion for Image Attribute Editing},
  author={Wang, Tengfei and Zhang, Yong and Fan, Yanbo and Wang, Jue and Chen, Qifeng},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2022}
}

Acknowledgement

Thanks to omertov for sharing their code.


hfgi's Issues

Problem about the results of pose editing

Thank you for the great work! I have tried the inference code with the pretrained checkpoint for pose editing, but there are obvious artifacts in the edited images. Could you please double check that the checkpoint is correct?
Also, why is pose editing not included in the inference code or the playground notebook?

Reverse Generation

Excuse me, why do inversions come out correctly only on cuda:0, while images inverted on other devices are all solid colors?

Could you provide a FFHQ 256x256 model?

Hello, I'm struggling to train a model on the FFHQ 256x256 dataset. I trained an Encoder4Editing model on the entire FFHQ dataset (66k images for training, 4k for validation), and the results look comparable to the ones in the Encoder4Editing paper. Then I trained an HFGI model based on that e4e checkpoint, with good results as well. But when I project an image, the inversion looks noticeably different from the input image. This problem doesn't appear when I use your pretrained FFHQ 1024x1024 model. I assume it should be possible to train a 256x256 model with comparable quality.

Could you share a FFHQ 256x256 checkpoint so that I can validate my results? Thank you!

The code of ADA's training

Thank you very much for sharing!
I am very interested in your ADA module and want to use it in my work.
Could you open source the training code for this model?
Thank you very much.

HFGI_playground.ipynb

Excuse me, why is the inverted image solid color when I run the playground.ipynb notebook?

question about generating edited codes

What if I want to use this model to put a mask on a person, instead of modifying age or smile? How can I generate the edited latent codes for a masked-face attribute?
Thanks!

Usage of discriminator for adversarial loss

In your code, you do not use a discriminator or an additional adversarial loss for better reconstruction.
This is different from what is written in the paper.
Is there another version of the code that leverages a well-trained discriminator, or are the checkpoint results based on the official code without a discriminator?

About Editing Hair

Thanks for your great work! However, when I apply it with StyleCLIP for hair editing, after the step of adding conditions to the generator, it not only fine-tunes the face but also adds the original hair back. Could you give me some suggestions on that? Thanks!

About inference

When I try to use a checkpoint trained on my own dataset, I get this error:

RuntimeError: Error(s) in loading state_dict for Encoder4Editing: Unexpected key(s) in state_dict: "styles.14.convs.0.weight", "styles.14.convs.0.bias", "styles.14.convs.2.weight", "styles.14.convs.2.bias", "styles.14.convs.4.weight", "styles.14.convs.4.bias", "styles.14.convs.6.weight", "styles.14.convs.6.bias", "styles.14.convs.8.weight", "styles.14.convs.8.bias", "styles.14.convs.10.weight", "styles.14.convs.10.bias", "styles.14.linear.weight", "styles.14.linear.bias", "styles.15.convs.0.weight", "styles.15.convs.0.bias", "styles.15.convs.2.weight", "styles.15.convs.2.bias", "styles.15.convs.4.weight", "styles.15.convs.4.bias", "styles.15.convs.6.weight", "styles.15.convs.6.bias", "styles.15.convs.8.weight", "styles.15.convs.8.bias", "styles.15.convs.10.weight", "styles.15.convs.10.bias", "styles.15.linear.weight", "styles.15.linear.bias", "styles.16.convs.0.weight", "styles.16.convs.0.bias", "styles.16.convs.2.weight", "styles.16.convs.2.bias", "styles.16.convs.4.weight", "styles.16.convs.4.bias", "styles.16.convs.6.weight", "styles.16.convs.6.bias", "styles.16.convs.8.weight", "styles.16.convs.8.bias", "styles.16.convs.10.weight", "styles.16.convs.10.bias", "styles.16.linear.weight", "styles.16.linear.bias", "styles.17.convs.0.weight", "styles.17.convs.0.bias", "styles.17.convs.2.weight", "styles.17.convs.2.bias", "styles.17.convs.4.weight", "styles.17.convs.4.bias", "styles.17.convs.6.weight", "styles.17.convs.6.bias", "styles.17.convs.8.weight", "styles.17.convs.8.bias", "styles.17.convs.10.weight", "styles.17.convs.10.bias", "styles.17.linear.weight", "styles.17.linear.bias".

But if I use the checkpoint you provided, there is no problem.

Question about ADA

Thanks for sharing your code and your excellent work!
I have a question about how ADA works on X_edit. I notice that when training the ADA module, the low-fidelity X_o is taken as the target image I for alignment, but X_edit is never used. Thanks for your reply.

Generated image issues

May I ask why the images generated after changing the encoder are not very high-definition?

The resolution of consultation branch

Hi, thanks for sharing the code!
I have a question about the resolution of the consultation branch. The default resolution is 64x64, at layer 7. Have you tested higher resolutions, e.g. layer 11 for 256 or layer 9 for 128, as in the comment below?

#11 for 256, 9 for 128, 7 for 64

Usually, a higher resolution and a later layer might improve the details.
Hope for your reply~
