
stargan-v2's Issues

IndexError: index 2 is out of bounds for dimension 1 with size 2

File "main.py", line 182, in
main(args)
File "main.py", line 59, in main
solver.train(loaders)
File "stargan-v2/core/solver.py", line 110, in train
nets, args, x_real, y_org, y_trg, z_trg=z_trg, masks=masks)
File "stargan-v2/core/solver.py", line 212, in compute_d_loss
s_trg = nets.mapping_network(z_trg, y_trg)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in call
result = self.forward(*input, **kwargs)
File "stargan-v2/core/model.py", line 221, in forward
s = out[idx, y] # (batch, style_dim)
IndexError: index 2 is out of bounds for dimension 1 with size 2

How can I solve this problem? Please help.
The tensor shapes are as follows:
idx, y tensor([0, 1, 2, 3, 4, 5, 6, 7]) tensor([2, 1, 1, 1, 1, 1, 1, 1])
out.shape torch.Size([8, 2, 64])
idx.shape, y.shape torch.Size([8]) torch.Size([8])
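
For context, here is a minimal sketch of what the failing line does, using the shapes from the report (the assert is an added sanity check, not repo code). out is (batch, num_domains, style_dim), so y must hold domain indices strictly below num_domains; the label 2 with num_domains=2 is exactly what triggers the IndexError, which usually means the dataset has more domain folders than --num_domains.

    # Minimal reproduction sketch of "s = out[idx, y]" from core/model.py;
    # the assert fires here, mirroring the reported crash.
    import torch

    batch, num_domains, style_dim = 8, 2, 64
    out = torch.randn(batch, num_domains, style_dim)
    idx = torch.arange(batch)                       # tensor([0, 1, ..., 7])
    y = torch.tensor([2, 1, 1, 1, 1, 1, 1, 1])      # the reported labels
    assert int(y.max()) < num_domains, "a label exceeds --num_domains"
    s = out[idx, y]                                 # (batch, style_dim)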

domain translation using latents

Is it possible to easily translate a source image from one domain to another using latents rather than reference images?

I see a function translate_using_latent(nets, args, x_src, y_trg_list, z_trg_list, psi, filename): in utils.py (line 78), but it is never used, and I am not sure what "y_trg_list" and "z_trg_list" are supposed to look like.
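
For what it's worth, here is a hedged sketch of one way the arguments could be built, inferred from how nets.mapping_network(z, y) is called elsewhere (z: (N, latent_dim), y: (N,)); the exact shapes are assumptions, not confirmed repo behavior, and nets, args, and x_src come from the surrounding repo context.

    # Hypothetical call sketch for translate_using_latent; shapes inferred.
    import torch

    N = x_src.size(0)  # number of source images
    # one tensor of target-domain labels per domain
    y_trg_list = [torch.tensor([d] * N).to(x_src.device)
                  for d in range(args.num_domains)]
    # several random style codes per domain, shared across the batch
    z_trg_list = torch.randn(args.num_outs_per_domain, 1, args.latent_dim) \
                      .repeat(1, N, 1).to(x_src.device)
    # psi is a truncation-style knob; 0.7 is an arbitrary example value
    translate_using_latent(nets, args, x_src, y_trg_list, z_trg_list,
                           psi=0.7, filename='out.jpg')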

Question in total GAN loss

Hi,

thanks for your paper and code. I was wondering if you could explain why you defined your total GAN loss as: loss = loss_adv + args.lambda_sty * loss_sty - args.lambda_ds * loss_ds + args.lambda_cyc * loss_cyc. Why are the adversarial, style, and cycle losses added while the diversity loss is subtracted? What is the intuition behind that?

Thanks for your time!
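
For readers skimming this thread: written out, the generator objective is

    \mathcal{L} = \mathcal{L}_{adv} + \lambda_{sty}\,\mathcal{L}_{sty} - \lambda_{ds}\,\mathcal{L}_{ds} + \lambda_{cyc}\,\mathcal{L}_{cyc}

The diversity term enters with a minus sign because it is something to maximize: pushing L_ds up encourages different latent codes to yield visibly different outputs, while the adversarial, style-reconstruction, and cycle terms are all minimized.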

Is there any rule for the number of images, or the validation set size, when using a custom dataset?

I re-ran the training with the provided dataset and training code, and I guess the previous errors are due to some 'number' mismatch between my custom dataset and AFHQ or CelebA.

Is there any mandatory, fixed number of files in each domain of the val folder, or in the representative folder?
I matched many numbers (number of domains, image size, etc.), but my image counts are quite small (train: 4-600 per domain, val: 100 per domain), and the representative folders also contain fewer images than the examples.

[Q] keeping background from the content image

@yunjey @youngjung
First, I want to thank you for your good works including the high-quality paper and very organized codes.
StarGAN v2 can generate realistic synthetic images that follow the given reference images. However, the generated image has the background of the reference image. Do you have any ideas for maintaining not only the pose and identity of the content image but also its background in the generated results?

Working on expr/results/celeba_hq/video_ref.mp4 Killed

Hey!

I am following the readme tutorial but when I run the command

python main.py --mode sample --num_domains 2 --resume_iter 100000 --w_hpf 1 \
               --checkpoint_dir expr/checkpoints/celeba_hq \
               --result_dir expr/results/celeba_hq \
               --src_dir assets/representative/celeba_hq/src \
               --ref_dir assets/representative/celeba_hq/ref

The video generation seems to go well, but after reaching 100% it just prints a "Killed" message and the video is not generated:

Working on expr/results/celeba_hq/reference.jpg...
/home/ubuntu/anaconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/nn/functional.py:2506: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
Working on expr/results/celeba_hq/video_ref.mp4...
video_ref: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [04:01<00:00, 7.54s/it]
Killed

eval bug

When computing FID and LPIPS with mode = "latent", I get this bug:
UnboundLocalError: local variable 'loader_ref' referenced before assignment

Curiosity about Adam Beta1=0

Hi,
first of all, congrats on the paper. I am curious about your choice of Adam parameters. Could you give more insight into beta1=0? Why don't you use momentum?

Thank you!
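
For anyone reproducing this, the setting in question is just the betas tuple of the optimizer (a sketch; lr and weight decay mirror the repo defaults visible in the namespace dump further down this page, and the stand-in module is hypothetical):

    # beta1 = 0 disables Adam's first-moment (momentum) accumulation, a common
    # choice in GAN training, where the loss surface shifts every step.
    import torch

    net = torch.nn.Linear(8, 8)  # stand-in module for illustration
    opt = torch.optim.Adam(net.parameters(), lr=1e-4,
                           betas=(0.0, 0.99), weight_decay=1e-4)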

Get output image

Is there an option to get only the output image? When running the model, I get a video as the output.
I only want to save the generated images.

Thank you for open sourcing such an amazing work.

Score numbers differ from the paper

Hello, the numbers you report here differ from the paper.

  1. Is the version of the paper using the heatmaps? If yes, do you have numbers without heatmaps? They would help a lot.
  2. Are these new numbers using the same experimental framework reported in the paper? Batch size, number of iterations, etc.?
  3. I also found that some of the architectures do not match the ones reported in the paper. For instance, the discriminator max_dim goes up to 512 (the paper says 1024), and the mapping network is different as well. Is there going to be an updated arXiv version?
  4. [EDIT] If AFHQ does not use heatmaps, why are the numbers also different from the paper?

Thank you.

Generating images at resolutions higher than 256

Is it possible to generate images at 512 or 1024 resolution? I tried changing the img_size argument in main.py to 512, yet I got the following errors. It seems the model doesn't support other resolutions?

RuntimeError: Error(s) in loading state_dict for Generator:
Missing key(s) in state_dict: "encode.3.conv1x1.weight", "encode.7.conv1.weight", "encode.7.conv1.bias", "encode.7.conv2.weight", "encode.7.conv2.bias", "encode.7.norm1.weight", "encode.7.norm1.bias", "encode.7.norm2.weight", "encode.7.norm2.bias", "decode.7.conv1.weight", "decode.7.conv1.bias", "decode.7.conv2.weight", "decode.7.conv2.bias", "decode.7.norm1.fc.weight", "decode.7.norm1.fc.bias", "decode.7.norm2.fc.weight", "decode.7.norm2.fc.bias", "decode.7.conv1x1.weight".
size mismatch for from_rgb.weight: copying a param with shape torch.Size([64, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 3, 3, 3]).
size mismatch for from_rgb.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
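
A likely explanation (hedged, not an official answer): the released checkpoints were trained at img_size=256, and changing --img_size changes the number and width of the generator's blocks, so the 256-px state_dict no longer matches the 512-px architecture; the missing encode.7/decode.7 keys above are the extra blocks of the larger model. Training from scratch at the higher resolution, roughly along these lines, avoids the load error, though memory use grows steeply:

    python main.py --mode train --img_size 512 --num_domains 2 --w_hpf 1 \
                   --train_img_dir data/celeba_hq/train \
                   --val_img_dir data/celeba_hq/val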

Error

I'm getting this issue in the Windows 10 CMD.
I tried uninstalling and reinstalling ffmpeg.
Nothing works.

Error message:

Traceback (most recent call last):
File "main.py", line 182, in <module>
main(args)
File "main.py", line 37, in main
solver = Solver(args)
File "C:\Users\xxx\stargan-v2\core\solver.py", line 58, in __init__
self.ckptios = [CheckpointIO(ospj(args.checkpoint_dir, '{:06d}_nets_ema.ckpt'), **self.nets_ema)]
File "C:\Users\xxx\stargan-v2\core\checkpoint.py", line 17, in __init__
os.makedirs(os.path.dirname(fname_template), exist_ok=True)
File "C:\Users\xxx.conda\envs\stargan-v2\lib\os.py", line 220, in makedirs
mkdir(name, mode)
FileNotFoundError: [WinError 3] The system cannot find the path specified: '{:'

different update frequencies of different modules

Hi~ thanks a lot for your awesome work!

I noticed that during each iteration, D is optimized twice:

d_loss, d_losses_latent = compute_d_loss(
    nets, args, x_real, y_org, y_trg, z_trg=z_trg, masks=masks)
self._reset_grad()
d_loss.backward()
optims.discriminator.step()

d_loss, d_losses_ref = compute_d_loss(
    nets, args, x_real, y_org, y_trg, x_ref=x_ref, masks=masks)
self._reset_grad()
d_loss.backward()
optims.discriminator.step()

G is also optimized twice:

g_loss, g_losses_latent = compute_g_loss(
    nets, args, x_real, y_org, y_trg, z_trgs=[z_trg, z_trg2], masks=masks)
self._reset_grad()
g_loss.backward()
optims.generator.step()
optims.mapping_network.step()
optims.style_encoder.step()

g_loss, g_losses_ref = compute_g_loss(
    nets, args, x_real, y_org, y_trg, x_refs=[x_ref, x_ref2], masks=masks)
self._reset_grad()
g_loss.backward()
optims.generator.step()

However, mapping_network and style_encoder are only optimized once.

Could you explain this? Many thanks again.

multi-gpu training

Hi, I found that the actual training time is longer than the time mentioned in the paper. Could you release the multi-GPU code? Or are there any tips for changing this code to multi-GPU? (I have tried to make the change, but there were some problems; maybe the Munch library does not support multi-GPU operation.)

Thanks.
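
One hedged starting point (not the authors' code): Munch is just a dict subclass, so each sub-network can be wrapped in nn.DataParallel individually after build_model(); attribute access like nets.generator keeps working, though any code reaching into module internals would then need .module.

    # Sketch: wrap every sub-network in DataParallel. Assumes nets is the
    # Munch returned by build_model and multiple GPUs are visible.
    import torch.nn as nn

    def parallelize(nets):
        for name, net in nets.items():
            nets[name] = nn.DataParallel(net)
        return nets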

HighPass

Hi, there isn't any explanation about HighPass in the paper. Could you tell me what role this function plays?
Thanks!
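
For readers landing here: HighPass appears to be a fixed edge-extraction filter used together with the facial masks so that high-frequency source details (e.g., eyes) survive the translation. A sketch of such a filter, assuming a standard 3x3 Laplacian kernel and treating the exact w_hpf scaling as an assumption:

    import torch
    import torch.nn.functional as F

    class HighPass(torch.nn.Module):
        def __init__(self, w_hpf=1.0):
            super().__init__()
            kernel = torch.tensor([[-1., -1., -1.],
                                   [-1.,  8., -1.],
                                   [-1., -1., -1.]]) / w_hpf
            self.register_buffer('filter', kernel)

        def forward(self, x):
            # depthwise conv: the same kernel applied to every channel
            f = self.filter.expand(x.size(1), 1, 3, 3)
            return F.conv2d(x, f, padding=1, groups=x.size(1))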

Segmentation fault when align custom images

Hi @yunjey , thanks for great work!
I followed your instructions to manually crop my own image and ran the wing alignment, yet I get a segmentation fault without any further error message. Please help.

Error below:
UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
Segmentation fault

update generator

When updating the generator, the discriminator parameters should be fixed, but I found that you did not fix them.

This part is in solver.py (lines 239-241):
x_fake = nets.generator(x_real, s_trg, masks=masks)
out = nets.discriminator(x_fake, y_trg)
loss_adv = adv_loss(out, 1)

I think maybe the following is right:

x_fake = nets.generator(x_real, s_trg, masks=masks)
with torch.no_grad():
    out = nets.discriminator(x_fake, y_trg)
loss_adv = adv_loss(out, 1)

Could you tell me if this is right? Thanks.

mode of sample and align

When I execute the command

python main.py --mode sample --num_domains 2 --resume_iter 100000 --w_hpf 1 \
               --checkpoint_dir expr/checkpoints/celeba_hq \
               --result_dir expr/results/celeba_hq \
               --src_dir assets/representative/celeba_hq/src \
               --ref_dir assets/representative/celeba_hq/ref

or

python main.py --mode align \
               --inp_dir assets/representative/custom/female \
               --out_dir assets/representative/celeba_hq/src/female

The following error occurs:
Traceback (most recent call last):
File "main.py", line 182, in
main(args)
File "main.py", line 37, in main
solver = Solver(args)
File "/root/work/stargan-v2/core/solver.py", line 34, in init
self.nets, self.nets_ema = build_model(args)
File "/root/work/stargan-v2/core/model.py", line 300, in build_model
fan = FAN(fname_pretrained=args.wing_path).eval()
File "/root/work/stargan-v2/core/wing.py", line 213, in init
self.load_pretrained_weights(fname_pretrained)
File "/root/work/stargan-v2/core/wing.py", line 217, in load_pretrained_weights
checkpoint = torch.load(fname) # map_location=torch.device('cpu'))
File "/root/anaconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/serialization.py", line 526, in load
if _is_zipfile(opened_file):
File "/root/anaconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/serialization.py", line 76, in _is_zipfile
if ord(magic_byte) != ord(read_byte):
TypeError: ord() expected a character, but string of length 0 found

I have installed all the dependencies and downloaded the corresponding datasets and checkpoints as described in the repository. Could you tell me how to solve this problem? Thanks a lot.

No module named 'core.data_loader'

When starting to train on my data with "python3 main.py --mode train..............", I get:

Traceback (most recent call last):
File "main.py", line 18, in
from core.data_loader import get_train_loader
ImportError: No module named 'core.data_loader'

About new datasets

Hello, I would like to ask: if I use a new dataset, how do I prepare it? How is the data in the assets folder selected? And if I want to test an entire test dataset, could you provide some guidance?

Heatmaps

Hello, nice work.
I have a couple of doubts regarding the heatmaps.

  1. Could you please elaborate on these values? Why resize and shift the heatmaps, and why those particular numbers for different regions of the face (x and x2)? In the main paper there is nothing about heatmaps or keypoints, so I am trying to understand the intuition.

  2. Are the wing.ckpt pre-trained weights the same as in this work, or do they differ in some way?

  3. Does CelebA_HQ work without heatmaps?

Thanks :)

Training hangs on fetching images and labels

Hi, when training with the AFHQ train script mentioned in your README, the code seems to hang for me on fetching a batch of images and labels. It never actually begins to train, as it gets stuck on line 101 of solver.py:

inputs = next(fetcher)

Any ideas about this problem?

Thanks in advance!

Sam
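
A common first experiment when a DataLoader stalls like this (hedged; --num_workers appears to be a real flag, per the namespace dump further down this page) is disabling worker processes so data loading runs in the main process:

    python main.py --mode train --num_domains 3 --w_hpf 0 --num_workers 0 \
                   --train_img_dir data/afhq/train \
                   --val_img_dir data/afhq/val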

About AdaIN

Hello! I have a question about the AdaIN class.
In your implementation, you used (1 + gamma) * self.norm(x) + beta. Why don't you use gamma * self.norm(x) + beta? Thank you.
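
A plausible reading (not an official answer): with (1 + gamma), a style branch whose fc output starts near zero leaves the layer acting as plain instance normalization, an identity-like starting point, whereas gamma * self.norm(x) would begin by zeroing the features. A minimal sketch of the layer as quoted:

    import torch
    import torch.nn as nn

    class AdaIN(nn.Module):
        def __init__(self, style_dim, num_features):
            super().__init__()
            self.norm = nn.InstanceNorm2d(num_features, affine=False)
            self.fc = nn.Linear(style_dim, num_features * 2)

        def forward(self, x, s):
            h = self.fc(s).view(s.size(0), -1, 1, 1)   # (N, 2C, 1, 1)
            gamma, beta = torch.chunk(h, chunks=2, dim=1)
            return (1 + gamma) * self.norm(x) + beta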

Please help me

I'm new to Python; I've been studying it for a while. I found your project by chance and it really impressed me. I want to figure out how everything works and test it myself, but following the instructions, nothing works. When I run the command:
bash download.sh celeba-hq-dataset

download.sh: line 9:
StarGAN v2
Copyright (c) 2020-present NAVER Corp.

This work is licensed under the Creative Commons Attribution-NonCommercial
4.0 International License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc/4.0/ or send a letter to
Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
: No such file or directory
download.sh: line 38: wget: command not found
unzip: cannot find or open ./data/celeba_hq.zip, ./data/celeba_hq.zip.zip or ./data/celeba_hq.zip.ZIP.
rm: ./data/celeba_hq.zip: No such file or directory

Update step with reference images

Hi, I was wondering why the style encoder isn't updated when computing the losses with reference images.
Thank you for your work.

Inference on GPU

I am trying to test the model, but it runs on the CPU and uses 16 GB of memory. How can we run the model on the GPU?
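
A first thing to check (a hedged sketch; the repo normally picks the device automatically): whether PyTorch can see the GPU at all. If not, everything silently falls back to the CPU.

    import torch

    print(torch.cuda.is_available())   # False => CPU-only install or driver issue
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')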

--style_dim', type=int, default=64

--style_dim', type=int, default=64. It's a great honor to see the project developed by your team. I wonder how these 64 style dimensions capture face styles. For example, I need to use this model to modify the hair color, face color, and skin color of a person in a photo.

Path error upon running the example you provided to transform a custom image

This is probably caused by Windows; I edited line 58 of the solver to link directly to the checkpoint instead.

Traceback (most recent call last):
File "main.py", line 182, in <module>
main(args)
File "main.py", line 37, in main
solver = Solver(args)
File "D:\Documents\Desktop\StarGAN\core\solver.py", line 58, in __init__
self.ckptios = [CheckpointIO(ospj(args.checkpoint_dir, '{:06d}_nets_ema.ckpt'), **self.nets_ema)]
File "D:\Documents\Desktop\StarGAN\core\checkpoint.py", line 17, in __init__
os.makedirs(os.path.dirname(fname_template), exist_ok=True)
File "D:\Dev\Python\lib\os.py", line 220, in makedirs
mkdir(name, mode)
FileNotFoundError: [WinError 3] Path not found: '{:'

RuntimeError: CUDA out of memory

I'm running the training with default --batch_size 8 and I get:

RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 15.75 GiB total capacity; 14.58 GiB already allocated; 22.88 MiB free; 14.75 GiB reserved in total by PyTorch)

Server details:

  • GPU: 1 x NVIDIA Tesla V100
  • n1-highmem-4 (4 vCPU, 26 GB memory)

I am running this training on Google Cloud Platform.
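
A common mitigation (hedged; it changes the effective setup relative to the paper's batch size of 8) is simply lowering --batch_size, e.g.:

    python main.py --mode train --batch_size 4 \
                   --train_img_dir data/celeba_hq/train \
                   --val_img_dir data/celeba_hq/val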

some problems about updating style encoder network

train the generator

g_loss, g_losses_latent = compute_g_loss(nets, args, x_real, y_org, y_trg, z_trgs=[z_trg, z_trg2], masks=masks, face_mask=face_mask)
self._reset_grad()
g_loss.backward()
optims.generator.step()
optims.mapping_network.step()
optims.style_encoder.step()

g_loss, g_losses_ref = compute_g_loss(nets, args, x_real, y_org, y_trg, x_refs=[x_ref, x_ref2], masks=masks, face_mask=face_mask)
self._reset_grad()
g_loss.backward()
optims.generator.step()

Hi, I was very curious why the style encoder isn't updated when computing the loss with reference images.

Questions about the batch size 4 model

Hello, following the source code you provided, I kept the same parameters except for the batch size, which I changed to 4, and trained a model.
However, I found that the batch-size-4 model was quite different from the official model, especially in terms of hairstyle and style diversity.
Could you please help me analyze the reason? Is it the batch size, or are the parameters used by the official model different from the source code?
Thanks a million!

confusion about training custom and AFHQ data.

When I run AFHQ training code

python main.py --mode train --num_domains 3 --w_hpf 0 \
                --lambda_reg 1 --lambda_sty 1 --lambda_ds 2 --lambda_cyc 1 \
                --train_img_dir data/afhq/train \
                --val_img_dir data/afhq/val

The printed namespace includes CelebA values (inp_dir, lm_path, out_dir, ref_dir, and src_dir below).
Is this normal? I see the same phenomenon when training on my custom dataset, and I suspect it is a possible cause of the errors.

Namespace(batch_size=8, beta1=0.0, beta2=0.99, checkpoint_dir='expr/checkpoints', ds_iter=100000, eval_dir='expr/eval', eval_every=50000, f_lr=1e-06, hidden_dim=512, img_size=256, inp_dir='assets/representative/custom/female', lambda_cyc=1.0, lambda_ds=2.0, lambda_reg=1.0, lambda_sty=1.0, latent_dim=16, lm_path='expr/checkpoints/celeba_lm_mean.npz', lr=0.0001, mode='train', num_domains=3, num_outs_per_domain=10, num_workers=4, out_dir='assets/representative/celeba_hq/src/female', print_every=10, randcrop_prob=0.5, ref_dir='assets/representative/celeba_hq/ref', result_dir='expr/results', resume_iter=0, sample_dir='expr/samples', sample_every=5000, save_every=10000, seed=777, src_dir='assets/representative/celeba_hq/src', style_dim=64, total_iters=100000, train_img_dir='data/afhq/train', val_batch_size=32, val_img_dir='data/afhq/val', w_hpf=0.0, weight_decay=0.0001, wing_path='expr/checkpoints/wing.ckpt')

how to use different dataset

How do I use a different dataset? How should I arrange it inside the data folder?
For example, I want to translate cat images to dogs.
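
For what it's worth, the training commands elsewhere on this page assume an ImageFolder-style layout with one subfolder per domain, whose names serve as the labels; under that assumption, cat-to-dog translation just needs something like the following (file names are placeholders):

    data/custom/train/cat/xxx.jpg
    data/custom/train/dog/yyy.jpg
    data/custom/val/cat/zzz.jpg
    data/custom/val/dog/www.jpg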

UnBoundLocalError

Similar to issue #11

After training, an eval error happens when using a 4-domain dataset.

Should I adjust lambda_ds to #domains-1, or adjust some other variable to fit the number of domains?

python main.py --mode train --num_domains 4 --w_hpf 0 \
               --lambda_reg 1 --lambda_sty 1 --lambda_ds 1 --lambda_cyc 1 \
               --train_img_dir data/custom/train \
               --val_img_dir data/custom/val
Calculating evaluation metrics...
Number of domains: 4
Preparing DataLoader for the evaluation phase...
Traceback (most recent call last):
  File "main.py", line 182, in <module>
    main(args)
  File "main.py", line 59, in main
    solver.train(loaders)
  File "/home/ipsych/ML/Stargan_v2/core/solver.py", line 170, in train
    calculate_metrics(nets_ema, args, i+1, mode='latent')
  File "/home/ipsych/.conda/envs/Pytorch_1_4_0/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "/home/ipsych/ML/Stargan_v2/metrics/eval.py", line 61, in calculate_metrics
    iter_ref = iter(loader_ref)
UnboundLocalError: local variable 'loader_ref' referenced before assignment
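
The message itself means loader_ref is only assigned on some code path (presumably a mode == 'reference' branch) before line 61 reads it. A hypothetical defensive patch for metrics/eval.py, to make the failure explicit rather than to fix the root cause:

    # Hypothetical guard, not the repo's actual fix.
    loader_ref = None
    # ... existing setup that assigns loader_ref in reference mode ...
    if loader_ref is None:
        raise RuntimeError('loader_ref was never built; check mode and domains')
    iter_ref = iter(loader_ref)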

Manage domains (attributes) as in StarGAN v1

If you look at StarGAN v1, one image can belong to multiple domains; it handles attributes via a txt file.
My datasets are set up that way now (every image belongs to multiple domains): the images are in one folder, and the labels are saved in a CSV file. Should I save the images by domain? Is there a good way to solve my problem? Thanks.

Hello. Since you are Korean, I am leaving the question in Korean as well, in case you see it. As I asked above, my images are currently in one folder, and each image belongs to multiple domains. (Photo A belongs to the female domain and, at the same time, to the blond domain, so I want to use photo A when training both domains.) To store the data the way you suggest, the same photo would have to be saved in multiple folders, which is cumbersome and takes up a lot of space, so I am wondering if there is a better way. Thank you. :)
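
One storage-friendly possibility (a sketch, not repo functionality; column names and the transform are assumptions): keep a single image folder plus a CSV with one (filename, domain) row per membership, and let a custom Dataset expand that into samples, so no file is ever duplicated on disk:

    # Hypothetical CSV-backed dataset for multi-domain images.
    import csv
    from pathlib import Path
    from PIL import Image
    from torch.utils.data import Dataset

    class MultiDomainDataset(Dataset):
        def __init__(self, root, csv_path, transform=None):
            self.root = Path(root)
            with open(csv_path) as f:
                # each row: filename,domain_index (one row per membership)
                self.samples = [(r[0], int(r[1])) for r in csv.reader(f)]
            self.transform = transform

        def __len__(self):
            return len(self.samples)

        def __getitem__(self, i):
            fname, label = self.samples[i]
            img = Image.open(self.root / fname).convert('RGB')
            if self.transform is not None:
                img = self.transform(img)
            return img, label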

Needs help!

Hi,

First, it's a great project, and thank you for sharing the documentation and code. However, I have encountered the two issues listed below. Thanks again for your help in advance!

  1. x264 cannot be installed?
    (stargan-v2) C:\Users\xxx\stargan-v2>conda install x264=='1!152.20180717' ffmpeg=4.0.2 -c conda-forge

CondaValueError: invalid package specification: x264=='1152.20180717

  2. When running the script in Windows 10 CMD, the error shows as below:

python main.py --mode sample --num_domains 2 --resume_iter 100000 --w_hpf 1 --checkpoint_dir expr/checkpoints/celeba_hq --result_dir expr/results/celeba_hq --src_dir assets/representative/celeba_hq/src --ref_dir assets/representative/celeba_hq/ref

Error message:
Traceback (most recent call last):
File "main.py", line 182, in <module>
main(args)
File "main.py", line 37, in main
solver = Solver(args)
File "C:\Users\xxx\stargan-v2\core\solver.py", line 58, in __init__
self.ckptios = [CheckpointIO(ospj(args.checkpoint_dir, '{:06d}_nets_ema.ckpt'), **self.nets_ema)]
File "C:\Users\xxx\stargan-v2\core\checkpoint.py", line 17, in __init__
os.makedirs(os.path.dirname(fname_template), exist_ok=True)
File "C:\Users\xxx.conda\envs\stargan-v2\lib\os.py", line 220, in makedirs
mkdir(name, mode)
FileNotFoundError: [WinError 3] The system cannot find the path specified: '{:'

PILLOW_VERSION is missing on pillow==7.0.0

Running the command to generate images fails with:

File "/home/bobi/.local/lib/python3.6/site-packages/torchvision/transforms/functional.py", line 5, in <module>
    from PIL import Image, ImageOps, ImageEnhance, PILLOW_VERSION
ImportError: cannot import name 'PILLOW_VERSION'
(stargan-v2) bobi@strix:~/Desktop/stargan-v2$ python 
Python 3.6.7 |Anaconda, Inc.| (default, Oct 23 2018, 19:16:44) 
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import PIL
>>> PIL.PILLOW_VERSION
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'PIL' has no attribute 'PILLOW_VERSION'

It works with

pip install "pillow<7"

about pre-trained model on AFHQ

I have downloaded the pre-trained models trained on the AFHQ dataset, and I wonder why the ckpt file still contains the weights of the FAN network. Isn't it unused when training models on AFHQ?

Preserving identity

Hello,
Thanks for the awesome code and application.

I have a question about training many (6-8) domains, like the StarGAN RaFD implementation on GitHub.

When I put all domains into separate folders and train with -ds 8, the resulting model totally mixes up the identity of the photo. The pre-trained CelebA-HQ model works far better at generating results, even when I supply photos from my domains as male or female sources (i.e., making the person smile while preserving the identity).

Is this related to the changed GAN structure, and to gender being a far more vivid feature compared to expressions?
