StyleGAN2: projecting images

The goal of this Google Colab notebook is to project images to latent space with StyleGAN2.

Usage

To discover how to project a real image using the original StyleGAN2 implementation, run:

To project a batch of images, using either W(1,*) (original) or W(18,*) (extended), run:

To edit latent vectors of projected images, run:

For more information about W(1,*) and W(18,*), please refer to the original paper (section 5 on page 7):

Inverting the synthesis network $g$ is an interesting problem that has many applications. Manipulating a given image in the latent feature space requires finding a matching latent code $w$ for it first.

The following is about W(18,*):

Previous research suggests that instead of finding a common latent code $w$, the results improve if a separate $w$ is chosen for each layer of the generator. The same approach was used in an early encoder implementation.

The following is about W(1,*), which is the approach used in the original implementation:

While extending the latent space in this fashion finds a closer match to a given image, it also enables projecting arbitrary images that should have no latent representation. Instead, we concentrate on finding latent codes in the original, unextended latent space, as these correspond to images that the generator could have produced.
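The difference between the two latent spaces can be sketched with array shapes. This is an illustrative NumPy snippet, not the official API: a W(1,*) code is a single 512-dimensional vector shared by every generator layer, while W(18,*) stores one vector per layer of the 1024x1024 generator.

```python
import numpy as np

num_layers = 18                                   # layers of the 1024x1024 generator
w_single = np.random.randn(1, 512)                # W(1,*): one shared code
w_extended = np.tile(w_single, (num_layers, 1))   # W(18,*): one code per layer

assert w_extended.shape == (18, 512)
# During W(18,*) projection each of the 18 rows is optimized independently,
# so the rows generally drift apart instead of staying identical copies.
```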

Data

Data consists of:

Original image of the French president

Pre-processing

There are two possible pre-processing methods:

  • either center-cropping (to 1024x1024 resolution) as sole pre-processing,

Center-cropping

  • or the same pre-processing as for the FFHQ dataset:
    1. first, an alignment based on 68 face landmarks returned by dlib,
    2. then reproduce recreate_aligned_images(), as detailed in FFHQ pre-processing code.
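The simpler of the two pre-processing options, center-cropping, can be sketched as plain array slicing. This is a minimal illustration (the `center_crop` helper is hypothetical, not part of the repository):

```python
import numpy as np

def center_crop(image: np.ndarray, size: int = 1024) -> np.ndarray:
    """Center-crop an H x W x C image to size x size."""
    h, w = image.shape[:2]
    top = max((h - size) // 2, 0)
    left = max((w - size) // 2, 0)
    return image[top:top + size, left:left + size]

# Example: crop a hypothetical 1365x2048 photo down to 1024x1024.
photo = np.zeros((1365, 2048, 3), dtype=np.uint8)
assert center_crop(photo).shape == (1024, 1024, 3)
```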

Face landmarks

FFHQ pre-processing

Finally, the pre-processed image can be projected to the latent space of the StyleGAN2 model trained with configuration f on the Flickr-Faces-HQ (FFHQ) dataset.
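Projection itself is an iterative optimization of the latent code. The toy sketch below replaces the synthesis network with a fixed random linear map and the VGG16 perceptual (LPIPS-like) loss with a plain squared error, so that the loop runs stand-alone; only the overall structure (descend a reconstruction loss with respect to the latent code) mirrors the real projector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in "generator": a fixed random linear map from a 512-dim latent
# code to a small flattened image (the real projector uses StyleGAN2's
# synthesis network and a perceptual loss instead).
G = rng.standard_normal((64, 512)) * 0.05
target = rng.standard_normal(64)

w = np.zeros(512)                     # initial latent code
for _ in range(200):                  # projection = iterative optimization
    residual = G @ w - target
    grad = G.T @ residual             # gradient of 0.5 * ||G w - target||^2
    w -= 0.5 * grad                   # plain gradient-descent step

loss = 0.5 * np.sum((G @ w - target) ** 2)
```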

Results: influence of pre-processing

NB: results differ between runs, even when the same pre-processing is used.

With center-cropping as sole pre-processing

The result below is obtained with center-cropping as sole pre-processing, hence some issues with the projection.

Projection (with issues) as GIF

From left to right: the target image, the result obtained at the start of the projection, and the final result of the projection.

Target image | Projected image n°1/5 | Projected image n°5/5

From left to right: the target image, the result obtained at the start of the projection, intermediate results, and the final result.

Projection results (with issues) as PNG

The background, the hair, the ears, and the suit are relatively well reproduced, but the face is wrong: in particular, the neck (in the original image) is confused with the chin (in the projected images). It is possible that the face is too small relative to the rest of the image, compared to the FFHQ training dataset, which would explain the poor projection results.

With the same pre-processing as for the FFHQ dataset

The result below is obtained with the same pre-processing as for the FFHQ dataset, which avoids the projection issues mentioned above.

Projection (without issues) as GIF

From left to right: the target image, the result obtained at the start of the projection, and the final result of the projection.

Target image | Projected image n°1/5 | Projected image n°5/5

From left to right: the target image, the result obtained at the start of the projection, intermediate results, and the final result.

Projection results as PNG

Results: comparison with the extended projection

For the rest of the repository, the same pre-processing as for the FFHQ dataset is used.

Shared data on Google Drive

Additional projection results are shown on the Wiki.

To make it easier to download them, they are also shared on Google Drive.

The directory structure is as follows:

stylegan2_projections/
├ aligned_images/
├ └ emmanuel-macron_01.png    # FFHQ-aligned image
├ generated_images_no_tiled/  # projections with `W(18,*)`
├ ├ emmanuel-macron_01.npy    # - latent code
├ └ emmanuel-macron_01.png    # - projected image
├ generated_images_tiled/     # projections with `W(1,*)`
├ ├ emmanuel-macron_01.npy    # - latent code
├ └ emmanuel-macron_01.png    # - projected image
├ aligned_images.tar.gz             # folder archive
├ generated_images_no_tiled.tar.gz  # folder archive
└ generated_images_tiled.tar.gz     # folder archive

Projection results

The images below compare results obtained with the original projection W(1,*) and the extended projection W(18,*).

A projected image obtained with W(18,*) is expected to be closer to the target image, at the expense of semantics.

If image fidelity is very important, W(18,*) can be run for a higher number of iterations (default is 1000 steps), but truncation might be needed for later applications.
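The truncation mentioned above can be sketched as pulling a latent code toward the average latent `w_avg` with strength `psi` (with `psi = 1` meaning no truncation). This is a hedged illustration; `w_avg` here is a placeholder for the model's actual dlatent average.

```python
import numpy as np

def truncate(w: np.ndarray, w_avg: np.ndarray, psi: float = 0.7) -> np.ndarray:
    """Truncation trick: interpolate a latent code toward the average code."""
    return w_avg + psi * (w - w_avg)

w_avg = np.zeros(512)   # placeholder for the model's average dlatent
w = np.ones(512)
w_trunc = truncate(w, w_avg, psi=0.7)
assert np.allclose(w_trunc, 0.7)
```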

French politicians

From top to bottom: aligned target image, projection with W(1,*), projection with W(18,*).

Aligned target image | Aligned target image | Aligned target image

W1 projected image | W1 projected image | W1 projected image

W18 projected image | W18 projected image | W18 projected image

From top to bottom: aligned target image, projection with W(1,*), projection with W(18,*).

Aligned target image | Aligned target image | Aligned target image

W1 projected image | W1 projected image | W1 projected image

W18 projected image | W18 projected image | W18 projected image

Art

From top to bottom: aligned target image, projection with W(1,*), projection with W(18,*).

Aligned target image | Aligned target image | Aligned target image

W1 projected image | W1 projected image | W1 projected image

W18 projected image | W18 projected image | W18 projected image

Applications

In the following, we assume that real images have been projected, so that we have access to their latent codes, of shape (1, 512) or (18, 512) depending on the projection method.
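A saved latent code's shape tells you which projection produced it. The round-trip below is a hypothetical illustration using the file naming from the shared Google Drive archive:

```python
import numpy as np
import tempfile, os

# Pretend this code came out of a W(18,*) projection run.
code = np.random.randn(18, 512)
path = os.path.join(tempfile.mkdtemp(), "emmanuel-macron_01.npy")
np.save(path, code)

loaded = np.load(path)
# (18, 512) -> extended projection W(18,*) ; (1, 512) -> original W(1,*)
is_extended = loaded.shape[0] == 18
assert is_extended
```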

There are three main applications:

  1. morphing (linear interpolation),
  2. style transfer (crossover),
  3. expression transfer (adding a vector and a scaled difference vector).

Shared data on Google Drive

Results corresponding to each application are:

The directory structure is as follows:

stylegan2_editing/
├ expression/                   # expression transfer
| ├ no_tiled/                   # - `W(18,*)`
| | └ expression_01_age.jpg     # face n°1 ; age
| └ tiled/                      # - `W(1,*)`
|   └ expression_01_age.jpg
├ morphing/                     # morphing
| ├ no_tiled/                   # - `W(18,*)`
| | └ morphing_07_01.jpg        # face n°7 to face n°1
| └ tiled/                      # - `W(1,*)`
|   └ morphing_07_01.jpg
├ style_mixing/                 # style transfer
| ├ no_tiled/                   # - `W(18,*)`
| | └ style_mixing_42-07-10-29-41_42-07-22-39.jpg
| └ tiled/                      # - `W(1,*)`
|   └ style_mixing_42-07-10-29-41_42-07-22-39.jpg
├ video_style_mixing/           # style transfer
| ├ no_tiled/                   # - `W(18,*)`
| | └ video_style_mixing_000.000.jpg
| ├ tiled/                      # - `W(1,*)`
| | └ video_style_mixing_000.000.jpg
| ├ no_tiled_small.mp4          # with 2 reference faces
| ├ no_tiled.mp4                # with 4 reference faces
| ├ tiled_small.mp4
| └ tiled.mp4
├ expression_transfer.tar.gz    # folder archive
├ morphing.tar.gz               # folder archive
├ style_mixing.tar.gz           # folder archive
└ video_style_mixing.tar.gz     # folder archive

1. Morphing

Morphing consists of a linear interpolation between two latent vectors (two faces).
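The interpolation can be sketched in a few lines of NumPy; the `morph` helper below is illustrative, not the repository's actual function:

```python
import numpy as np

def morph(w_a: np.ndarray, w_b: np.ndarray, t: float) -> np.ndarray:
    """Linear interpolation between two latent codes; t in [0, 1]."""
    return (1.0 - t) * w_a + t * w_b

w_a = np.zeros((18, 512))             # latent code of face A
w_b = np.ones((18, 512))              # latent code of face B
frames = [morph(w_a, w_b, t) for t in np.linspace(0.0, 1.0, 5)]
# Each interpolated code would then be fed to the generator to render a frame.
assert np.allclose(frames[2], 0.5)    # midpoint is the average of both codes
```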

Results are shown on the Wiki.

With the original projection W(1,*)

Morphing | Morphing

Morphing | Morphing

With the extended projection W(18,*)

Morphing | Morphing

Morphing | Morphing

2. Style transfer

Style transfer consists of a crossover of latent vectors at the layer level (cf. this piece of code).

There are 18 layers for the generator. The latent vector of the reference face is used for the first 7 layers. The latent vector of the face whose style has to be copied is used for the remaining 11 layers.
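The layer-level crossover described above can be sketched as row slicing on a (18, 512) code; this is an illustrative snippet, not the repository's exact implementation:

```python
import numpy as np

CROSSOVER_LAYER = 7   # reference face keeps layers 0-6; style donor fills 7-17

def style_mix(w_ref: np.ndarray, w_style: np.ndarray) -> np.ndarray:
    """Crossover of two W(18,*) codes at CROSSOVER_LAYER."""
    mixed = w_ref.copy()
    mixed[CROSSOVER_LAYER:] = w_style[CROSSOVER_LAYER:]
    return mixed

w_ref = np.zeros((18, 512))           # face whose identity is kept
w_style = np.ones((18, 512))          # face whose style is copied
mixed = style_mix(w_ref, w_style)
assert np.all(mixed[:7] == 0) and np.all(mixed[7:] == 1)
```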

Results are shown on the Wiki.

With the original projection W(1,*)

Thanks to morphing of the faces whose style is copied, style transfer can be watched as a video.

Style Transfer

With the extended projection W(18,*)

Thanks to morphing of the faces whose style is copied, style transfer can be watched as a video.

Style Transfer

3. Expression transfer

Expression transfer consists of adding together:

  • a latent vector (a face),
  • a scaled difference vector (an expression).
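That addition is a one-liner on the latent code; the snippet below is a hedged sketch in which `smile_direction` stands in for one of the learnt direction vectors listed below:

```python
import numpy as np

def transfer_expression(w: np.ndarray, direction: np.ndarray,
                        coeff: float) -> np.ndarray:
    """Shift a projected face along a learnt direction, scaled by coeff."""
    return w + coeff * direction

w_face = np.zeros((18, 512))          # latent code of a projected face
smile_direction = np.ones((18, 512))  # placeholder for a learnt direction
w_smiling = transfer_expression(w_face, smile_direction, coeff=1.5)
assert np.allclose(w_smiling, 1.5)
```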

Expressions were defined, learnt, and shared on GitHub by a Chinese-speaking contributor:

  1. age
  2. angle_horizontal
  3. angle_pitch
  4. beauty
  5. emotion_angry
  6. emotion_disgust
  7. emotion_easy
  8. emotion_fear
  9. emotion_happy
  10. emotion_sad
  11. emotion_surprise
  12. eyes_open
  13. face_shape
  14. gender
  15. glasses
  16. height
  17. race_black
  18. race_white
  19. race_yellow
  20. smile
  21. width

Results are shown on the Wiki.

With the original projection W(1,*)

  • Age: Expression Transfer

  • Smile: Expression Transfer

  • Age: Expression Transfer

  • Smile: Expression Transfer

With the extended projection W(18,*)

  • Age: Expression Transfer

  • Smile: Expression Transfer

  • Age: Expression Transfer

  • Smile: Expression Transfer

References


