clovaai / stargan-v2
StarGAN v2 - Official PyTorch Implementation (CVPR 2020)
License: Other
  File "main.py", line 182, in <module>
    main(args)
  File "main.py", line 59, in main
    solver.train(loaders)
  File "stargan-v2/core/solver.py", line 110, in train
    nets, args, x_real, y_org, y_trg, z_trg=z_trg, masks=masks)
  File "stargan-v2/core/solver.py", line 212, in compute_d_loss
    s_trg = nets.mapping_network(z_trg, y_trg)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "stargan-v2/core/model.py", line 221, in forward
    s = out[idx, y] # (batch, style_dim)
IndexError: index 2 is out of bounds for dimension 1 with size 2
How can I solve this problem? Please help.
The tensor shapes are as follows:
idx, y tensor([0, 1, 2, 3, 4, 5, 6, 7]) tensor([2, 1, 1, 1, 1, 1, 1, 1])
out.shape torch.Size([8, 2, 64])
idx.shape, y.shape torch.Size([8]) torch.Size([8])
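The shapes above show `out` with only two domain slots (size 2 along dimension 1) while `y` contains the label 2, so the lookup `out[idx, y]` fails. A minimal sketch of a pre-flight check, written in plain Python so it runs without PyTorch; the helper name `find_out_of_range_labels` is hypothetical and not part of the repository:

```python
def find_out_of_range_labels(labels, num_domains):
    """Return (position, label) pairs whose label cannot index `num_domains` slots."""
    return [(i, y) for i, y in enumerate(labels) if not 0 <= y < num_domains]


# Values copied from the report: eight samples, two domain slots.
y = [2, 1, 1, 1, 1, 1, 1, 1]
bad = find_out_of_range_labels(y, num_domains=2)
print(bad)  # [(0, 2)]
```

A label of 2 implies at least three domains in the data, so either --num_domains or the number of domain subfolders would need to change; the report does not say which of the two is wrong.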
FileNotFoundError: [Errno 2] No such file or directory: 'expr/checkpoints/wing.ckpt'
Colab notebook?
Is it possible to easily translate a source image from one domain to another using latent and not ref images?
I see a function translate_using_latent(nets, args, x_src, y_trg_list, z_trg_list, psi, filename) in utils.py (line 78), but it is never used, and I am not sure what "y_trg_list" and "z_trg_list" are supposed to be.
Hi,
thanks for your paper and code. I was wondering if you could explain why you defined your total GAN loss as: loss = loss_adv + args.lambda_sty * loss_sty - args.lambda_ds * loss_ds + args.lambda_cyc * loss_cyc. Why are the adversarial, style, and cycle losses added while the diversity loss is subtracted? What is the intuition behind that?
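The sign convention in the quoted expression can be checked numerically. A minimal sketch with made-up loss values (the function name and the numbers are illustrative assumptions, not repository code): because the whole expression is minimized, the term carrying a minus sign is effectively maximized.

```python
def generator_objective(loss_adv, loss_sty, loss_ds, loss_cyc,
                        lambda_sty=1.0, lambda_ds=1.0, lambda_cyc=1.0):
    """Combine the four terms exactly as in the expression quoted above."""
    return (loss_adv
            + lambda_sty * loss_sty
            - lambda_ds * loss_ds
            + lambda_cyc * loss_cyc)


low_diversity = generator_objective(0.5, 0.25, loss_ds=0.0, loss_cyc=0.25)
high_diversity = generator_objective(0.5, 0.25, loss_ds=0.5, loss_cyc=0.25)
print(low_diversity, high_diversity)  # 1.0 0.5
```

A larger loss_ds lowers the total, so an optimizer that minimizes the total is pushed toward a larger loss_ds.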
Thanks for your time!
I re-ran the training with the provided dataset and training code, and I guess the previous errors were due to some 'number' mismatch between my custom dataset and AFHQ or CelebA.
Is there a mandatory fixed number of files in each modality of the val folder or the representative folder?
I matched many numbers (number of domains, image size, etc.), but the number of images is quite small (train: 4-600 per domain, val: 100 per domain), and the images in the representative folders are also smaller than the examples.
@yunjey @youngjung
First, I want to thank you for your good work, including the high-quality paper and very well-organized code.
StarGAN v2 can generate realistic synthetic images that follow the given reference images. However, the generated image has the background of the reference image. Do you have any ideas for preserving not only the pose and identity of the content image but also its background in the generated results?
Hi, may I ask a question about the training step implemented in solver.py? Why is the number of iterations (steps) used there instead of the number of epochs?
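The two ways of counting are related by the batch size and the dataset size. A minimal sketch of the conversion; `iterations_to_epochs` is a hypothetical helper, and the dataset size below is a made-up figure:

```python
def iterations_to_epochs(total_iters, batch_size, dataset_size):
    """Approximate number of passes over the data made by `total_iters` steps."""
    return total_iters * batch_size / dataset_size


# total_iters and batch_size match the defaults printed elsewhere on this page;
# dataset_size is a made-up figure for illustration.
print(iterations_to_epochs(total_iters=100000, batch_size=8, dataset_size=16000))  # 50.0
```

Counting in iterations keeps the amount of optimization fixed regardless of how many images a dataset contains, whereas a fixed epoch count would scale with dataset size.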
Hey!
I am following the readme tutorial but when I run the command
python main.py --mode sample --num_domains 2 --resume_iter 100000 --w_hpf 1 \
--checkpoint_dir expr/checkpoints/celeba_hq \
--result_dir expr/results/celeba_hq \
--src_dir assets/representative/celeba_hq/src \
--ref_dir assets/representative/celeba_hq/ref
The video generation seems to go well, but after reaching 100% it just prints a "Killed" message and the video is not generated:
Working on expr/results/celeba_hq/reference.jpg...
/home/ubuntu/anaconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/nn/functional.py:2506: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
Working on expr/results/celeba_hq/video_ref.mp4...
video_ref: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 32/32 [04:01<00:00, 7.54s/it]
Killed
When computing FID and LPIPS with mode = "latent", this error occurs:
UnboundLocalError: local variable 'loader_ref' referenced before assignment
Hi,
first of all, congrats on the paper. I am curious about your choice of Adam parameters. Could you give more insight into beta1=0? Why don't you use momentum?
Thank you!
Is there an option to get only the output image? When I run the model, I get a video as the output.
I only want to save the generated images.
Thank you for open sourcing such an amazing work.
Hello, the numbers you report here differ from the paper.
Thank you.
Is it possible to generate images at a resolution of 512 or 1024? I tried changing the img_size argument in main.py to 512, but I got the following errors. It seems the model doesn't support other resolutions?
RuntimeError: Error(s) in loading state_dict for Generator:
Missing key(s) in state_dict: "encode.3.conv1x1.weight", "encode.7.conv1.weight", "encode.7.conv1.bias", "encode.7.conv2.weight", "encode.7.conv2.bias", "encode.7.norm1.weight", "encode.7.norm1.bias", "encode.7.norm2.weight", "encode.7.norm2.bias", "decode.7.conv1.weight", "decode.7.conv1.bias", "decode.7.conv2.weight", "decode.7.conv2.bias", "decode.7.norm1.fc.weight", "decode.7.norm1.fc.bias", "decode.7.norm2.fc.weight", "decode.7.norm2.fc.bias", "decode.7.conv1x1.weight".
size mismatch for from_rgb.weight: copying a param with shape torch.Size([64, 3, 3, 3]) from checkpoint, the shape in current model is torch.Size([32, 3, 3, 3]).
size mismatch for from_rgb.bias: copying a param with shape torch.Size([64]) from checkpoint, the shape in current model is torch.Size([32]).
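The two size-mismatch lines are consistent with a first-layer width that halves when the image size doubles. A minimal sketch, assuming the generator sets its base channel count to 2**14 // img_size (an assumption inferred from the 64 versus 32 shapes in the error, not confirmed here); `base_channels` is a hypothetical name:

```python
def base_channels(img_size):
    """Assumed rule: first-layer width is 2**14 divided by the image size."""
    return 2 ** 14 // img_size


for size in (256, 512):
    print(size, base_channels(size))
# 256 64
# 512 32
```

Under that assumption, a checkpoint trained at 256 stores 64-channel from_rgb weights and cannot be loaded into a model built for 512, so a higher resolution would mean training a new model rather than reusing the released weights.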
I am getting this issue in Windows 10 CMD.
I tried uninstalling and reinstalling ffmpeg.
Nothing works.
Error message:
Traceback (most recent call last):
  File "main.py", line 182, in <module>
    main(args)
  File "main.py", line 37, in main
    solver = Solver(args)
  File "C:\Users\xxx\stargan-v2\core\solver.py", line 58, in __init__
    self.ckptios = [CheckpointIO(ospj(args.checkpoint_dir, '{:06d}_nets_ema.ckpt'), **self.nets_ema)]
  File "C:\Users\xxx\stargan-v2\core\checkpoint.py", line 17, in __init__
    os.makedirs(os.path.dirname(fname_template), exist_ok=True)
  File "C:\Users\xxx.conda\envs\stargan-v2\lib\os.py", line 220, in makedirs
    mkdir(name, mode)
FileNotFoundError: [WinError 3] The system cannot find the path specified: '{:'
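The path '{:' in the error can be reproduced with Python's Windows path rules alone, because a name whose second character is a colon is parsed as starting with a drive specifier. A minimal sketch using the standard `ntpath` module (importable on any OS); the workaround at the end is one possible option, not a fix taken from the repository:

```python
import ntpath  # Windows path rules; importable on any OS

checkpoint_dir = "expr/checkpoints/celeba_hq"
name = "{:06d}_nets_ema.ckpt"

# ntpath.join treats the leading "{:" as a drive specifier (like "C:"),
# so the directory part is discarded.
joined = ntpath.join(checkpoint_dir, name)
print(joined)                  # {:06d}_nets_ema.ckpt
print(ntpath.dirname(joined))  # {:

# One possible workaround: concatenate with "/" instead of calling join.
template = checkpoint_dir + "/" + name
print(ntpath.dirname(template))  # expr/checkpoints/celeba_hq
print(template.format(100000))   # expr/checkpoints/celeba_hq/100000_nets_ema.ckpt
```

On POSIX systems `os.path.join` has no drive concept, which would explain why the same command works on Linux.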
Hello, does the generator need tanh? I cannot find it in this project.
@yunjey @youngjung
Hi~ thanks a lot for your awesome work!
I find that during each iteration, D is optimized twice:
d_loss, d_losses_latent = compute_d_loss(
    nets, args, x_real, y_org, y_trg, z_trg=z_trg, masks=masks)
self._reset_grad()
d_loss.backward()
optims.discriminator.step()
d_loss, d_losses_ref = compute_d_loss(
    nets, args, x_real, y_org, y_trg, x_ref=x_ref, masks=masks)
self._reset_grad()
d_loss.backward()
optims.discriminator.step()
G is also optimized twice:
g_loss, g_losses_latent = compute_g_loss(
    nets, args, x_real, y_org, y_trg, z_trgs=[z_trg, z_trg2], masks=masks)
self._reset_grad()
g_loss.backward()
optims.generator.step()
optims.mapping_network.step()
optims.style_encoder.step()
g_loss, g_losses_ref = compute_g_loss(
    nets, args, x_real, y_org, y_trg, x_refs=[x_ref, x_ref2], masks=masks)
self._reset_grad()
g_loss.backward()
optims.generator.step()
However, mapping_network and style_encoder are only optimized once.
Could you explain this to me? Many thanks again.
Hi, I found that the actual training time was longer than the time mentioned in the paper. Could you release the multi-GPU code, or are there any tips for changing this code to multi-GPU? (I have tried to make the change, but there are some problems; maybe the Munch library does not support multi-GPU operation.)
tks
Hi, there is no explanation of HighPass in the paper. Could you tell me what role this function plays?
tks!
Hi @yunjey , thanks for great work!
I followed your instructions to manually crop my own image and run the wing alignment, yet I get a segmentation fault without any further error message. Please help.
Error below:
UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode))
Segmentation fault
When updating the generator, the discriminator parameters should be fixed, but I found that you did not do this.
This part in solver.py (lines 239-241):
x_fake = nets.generator(x_real, s_trg, masks=masks)
out = nets.discriminator(x_fake, y_trg)
loss_adv = adv_loss(out, 1)
I think the following may be right:
x_fake = nets.generator(x_real, s_trg, masks=masks)
with torch.no_grad():
    out = nets.discriminator(x_fake, y_trg)
loss_adv = adv_loss(out, 1)
Could you tell me whether this is right? Thanks.
When I execute the command
python main.py --mode sample --num_domains 2 --resume_iter 100000 --w_hpf 1 \
--checkpoint_dir expr/checkpoints/celeba_hq \
--result_dir expr/results/celeba_hq \
--src_dir assets/representative/celeba_hq/src \
--ref_dir assets/representative/celeba_hq/ref
or
python main.py --mode align \
--inp_dir assets/representative/custom/female \
--out_dir assets/representative/celeba_hq/src/female
the following error occurred:
Traceback (most recent call last):
  File "main.py", line 182, in <module>
    main(args)
  File "main.py", line 37, in main
    solver = Solver(args)
  File "/root/work/stargan-v2/core/solver.py", line 34, in __init__
    self.nets, self.nets_ema = build_model(args)
  File "/root/work/stargan-v2/core/model.py", line 300, in build_model
    fan = FAN(fname_pretrained=args.wing_path).eval()
  File "/root/work/stargan-v2/core/wing.py", line 213, in __init__
    self.load_pretrained_weights(fname_pretrained)
  File "/root/work/stargan-v2/core/wing.py", line 217, in load_pretrained_weights
    checkpoint = torch.load(fname) # map_location=torch.device('cpu'))
  File "/root/anaconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/serialization.py", line 526, in load
    if _is_zipfile(opened_file):
  File "/root/anaconda3/envs/stargan-v2/lib/python3.6/site-packages/torch/serialization.py", line 76, in _is_zipfile
    if ord(magic_byte) != ord(read_byte):
TypeError: ord() expected a character, but string of length 0 found
I have installed all dependencies and downloaded the corresponding datasets and checkpoints as described in the repository. Could you tell me how to solve this problem? Thanks very much.
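The final frames show torch.load reading a byte from the file and getting nothing back, which is what an empty or truncated file would produce. A minimal sketch of a check to run on the path passed as wing_path before loading; `checkpoint_status` is a hypothetical helper, and the zero-byte file is a stand-in for a failed download:

```python
import os
import tempfile


def checkpoint_status(path):
    """Classify a checkpoint file as 'missing', 'empty', or 'present'."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty"
    return "present"


# Demonstration with a zero-byte stand-in for a failed download.
with tempfile.TemporaryDirectory() as tmp:
    fake_ckpt = os.path.join(tmp, "wing.ckpt")
    open(fake_ckpt, "wb").close()
    print(checkpoint_status(fake_ckpt))                        # empty
    print(checkpoint_status(os.path.join(tmp, "other.ckpt")))  # missing
```

If the real file turns out to be empty, downloading it again would be the first thing to try; a 'missing' result corresponds to the FileNotFoundError for expr/checkpoints/wing.ckpt reported elsewhere on this page.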
When I start training on my data by running "python3 main.py --mode train.............." I get:
Traceback (most recent call last):
  File "main.py", line 18, in <module>
    from core.data_loader import get_train_loader
ImportError: No module named 'core.data_loader'
Hello, I would like to ask: if I use a new dataset, how do I prepare it, and how is the data in the assets folder selected? If I want to test on an entire test dataset, can you provide guidance?
Hello, nice work.
I have a couple of doubts regarding the heatmaps.
Could you please elaborate on these values? Why resize and shift the heatmaps, and why those numbers for different regions of the face (x and x2)? In the main paper, there is nothing about heatmaps or keypoints, so I am trying to understand the intuition.
Are the wing.ckpt pre-trained weights the same as in this work, or do they differ in some way?
Does CelebA_HQ work without heatmaps?
Thanks :)
Hi, when training with the AFHQ train script mentioned in your README, the code seems to hang on fetching a batch of images and labels. It never actually begins to train, as it gets stuck on line 101 of solver.py:
inputs = next(fetcher)
Any ideas about this problem?
Thanks in advance!
Sam
Hello! I have a question about class AdaIN.
In your implementation, you used (1 + gamma) * self.norm(x) + beta. Why don't you use gamma * self.norm(x) + beta? Thank you.
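One arithmetic difference between the two forms is what happens when the predicted gamma is near zero. A minimal sketch on a plain list of numbers; the function names are hypothetical, and this illustrates the arithmetic only, not the authors' stated reasoning:

```python
import statistics


def normalize(xs):
    """Zero-mean, unit-variance normalization of a 1-D list (population std)."""
    mean = statistics.mean(xs)
    std = statistics.pstdev(xs)
    return [(x - mean) / std for x in xs]


def affine_residual(xs, gamma, beta):
    """The form quoted above: (1 + gamma) * norm(x) + beta."""
    return [(1 + gamma) * v + beta for v in normalize(xs)]


def affine_plain(xs, gamma, beta):
    """The alternative from the question: gamma * norm(x) + beta."""
    return [gamma * v + beta for v in normalize(xs)]


x = [1.0, 2.0, 3.0]
print(affine_residual(x, gamma=0.0, beta=0.0) == normalize(x))        # True
print(all(v == 0.0 for v in affine_plain(x, gamma=0.0, beta=0.0)))    # True
```

With gamma and beta at zero, the (1 + gamma) form passes the normalized features through unchanged, while the plain form erases them.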
I'm new to Python and have been studying it for a while. I found your project by chance and it really impressed me. I want to figure out how everything works and test it myself, but nothing works when I follow the instructions. When I run the command:
bash download.sh celeba-hq-dataset
download.sh: line 9:
StarGAN v2
Copyright (c) 2020-present NAVER Corp.
This work is licensed under the Creative Commons Attribution-NonCommercial
4.0 International License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-nc/4.0/ or send a letter to
Creative Commons, PO Box 1866, Mountain View, CA 94042, USA.
: No such file or directory
download.sh: line 38: wget: command not found
unzip: cannot find or open ./data/celeba_hq.zip, ./data/celeba_hq.zip.zip or ./data/celeba_hq.zip.ZIP.
rm: ./data/celeba_hq.zip: No such file or directory
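The "wget: command not found" line means the script could not find that program, and the unzip and rm errors follow from the archive never being downloaded. A minimal sketch for checking which command-line tools are available before running the script; `missing_tools` is a hypothetical helper:

```python
import shutil


def missing_tools(tools):
    """Return the tools from `tools` that are not found on PATH."""
    return [t for t in tools if shutil.which(t) is None]


# wget and unzip are the two commands named in the error output above.
print(missing_tools(["wget", "unzip"]))
```

An empty list means both tools were found; any name printed would need to be installed with the system's package manager first.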
After training, can I generate images from my own input without running eval?
For example, by specifying an input .jpg or folder, in case I only have some cat images that I want to translate to dogs.
Hi, I was wondering why the style encoder isn't updated when computing the losses with reference images.
Thank you for your work.
I am trying to test the model, but it runs on the CPU and takes 16 GB of memory. How can we run the model on the GPU?
--style_dim, type=int, default=64. It's a great honor to see the project developed by your team. I wonder how these 64 style dimensions relate to facial style. For example, I need to use this model to modify the hair color, face color, and skin color of a person in a photo.
This is probably caused by Windows; I edited line 58 of solver.py to link directly to the checkpoint instead.
Traceback (most recent call last):
  File "main.py", line 182, in <module>
    main(args)
  File "main.py", line 37, in main
    solver = Solver(args)
  File "D:\Documents\Desktop\StarGAN\core\solver.py", line 58, in __init__
    self.ckptios = [CheckpointIO(ospj(args.checkpoint_dir, '{:06d}_nets_ema.ckpt'), **self.nets_ema)]
  File "D:\Documents\Desktop\StarGAN\core\checkpoint.py", line 17, in __init__
    os.makedirs(os.path.dirname(fname_template), exist_ok=True)
  File "D:\Dev\Python\lib\os.py", line 220, in makedirs
    mkdir(name, mode)
FileNotFoundError: [WinError 3] Path not found: '{:'
I am confused about why the output of the ResBlk functions is divided by math.sqrt(2).
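One common motivation for such a factor is variance preservation: adding two independent unit-variance signals doubles the variance, and dividing by sqrt(2) brings it back to about one. A minimal numerical sketch under that independence assumption; it illustrates the arithmetic and is not a statement of the authors' intent:

```python
import math
import random
import statistics

random.seed(0)
n = 20000
# Stand-ins for the shortcut and residual branches: independent N(0, 1) samples.
shortcut = [random.gauss(0.0, 1.0) for _ in range(n)]
residual = [random.gauss(0.0, 1.0) for _ in range(n)]

summed = [a + b for a, b in zip(shortcut, residual)]
scaled = [s / math.sqrt(2) for s in summed]

var_summed = statistics.pvariance(summed)  # close to 2
var_scaled = statistics.pvariance(scaled)  # close to 1
print(var_summed, var_scaled)
```

Without the division, stacking several such blocks would let the activation scale grow with depth.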
@yunjey @youngjung
I'm running the training with default --batch_size 8
and I get:
RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB (GPU 0; 15.75 GiB total capacity; 14.58 GiB already allocated; 22.88 MiB free; 14.75 GiB reserved in total by PyTorch)
Server details:
running this training on Google Cloud Platform.
g_loss, g_losses_latent = compute_g_loss(nets, args, x_real, y_org, y_trg, z_trgs=[z_trg, z_trg2], masks=masks, face_mask=face_mask)
self._reset_grad()
g_loss.backward()
optims.generator.step()
optims.mapping_network.step()
optims.style_encoder.step()
g_loss, g_losses_ref = compute_g_loss(nets, args, x_real, y_org, y_trg, x_refs=[x_ref, x_ref2], masks=masks, face_mask=face_mask)
self._reset_grad()
g_loss.backward()
optims.generator.step()
Hi, I was very curious why the style encoder isn't updated when computing the loss with reference images.
Hello, using the source code you provided, I trained a model with the same parameters except that the batch size was changed to 4.
However, I found that the batch-size-4 model was quite different from the official model, especially in terms of hairstyle and style diversity.
Could you please help me analyze the reason? Is it the batch size, or are the parameters used by the official model different from those in the source code?
Thanks a million!
When I run AFHQ training code
python main.py --mode train --num_domains 3 --w_hpf 0 \
--lambda_reg 1 --lambda_sty 1 --lambda_ds 2 --lambda_cyc 1 \
--train_img_dir data/afhq/train \
--val_img_dir data/afhq/val
The printed namespace includes CelebA values (bold parts below).
Is this normal? I see the same thing when training on a custom dataset, and I suspect it is a possible cause of errors.
Namespace(batch_size=8, beta1=0.0, beta2=0.99, checkpoint_dir='expr/checkpoints', ds_iter=100000, eval_dir='expr/eval', eval_every=50000, f_lr=1e-06, hidden_dim=512, img_size=256, **inp_dir='assets/representative/custom/female'**, lambda_cyc=1.0, lambda_ds=2.0,
lambda_reg=1.0, lambda_sty=1.0, latent_dim=16, **lm_path='expr/checkpoints/celeba_lm_mean.npz'**, lr=0.0001, mode='train', num_domains=3, num_outs_per_domain=10, num_workers=4, **out_dir='assets/representative/celeba_hq/src/female'**, print_every=10, randcrop_prob=0.5, **ref_dir='assets/representative/celeba_hq/ref'**, result_dir='expr/results', resume_iter=0, sample_dir='expr/samples', sample_every=5000, save_every=10000, seed=777, **src_dir='assets/representative/celeba_hq/src'**, style_dim=64, total_iters=100000, train_img_dir='data/afhq/train', val_batch_size=32, val_img_dir='data/afhq/val', w_hpf=0.0, weight_decay=0.0001, wing_path='expr/checkpoints/wing.ckpt')
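With argparse, every flag that is not passed on the command line still appears in the Namespace with its default value. A minimal sketch with a hypothetical two-flag parser; the --src_dir default is copied from the Namespace above, while the --train_img_dir default is a placeholder:

```python
import argparse

# Hypothetical two-flag parser. The --src_dir default is copied from the
# Namespace above; the --train_img_dir default is a placeholder.
parser = argparse.ArgumentParser()
parser.add_argument("--train_img_dir", type=str, default="data/placeholder/train")
parser.add_argument("--src_dir", type=str,
                    default="assets/representative/celeba_hq/src")

args = parser.parse_args(["--train_img_dir", "data/afhq/train"])
print(args.train_img_dir)  # data/afhq/train
print(args.src_dir)        # assets/representative/celeba_hq/src
```

Seeing CelebA-HQ paths in the printout therefore does not by itself show that the training run reads them; that depends on whether the selected mode uses those arguments.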
How do I use a different dataset? How should I arrange it inside the data folder?
For example, I want to translate cat images to dogs.
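The training commands on this page point --train_img_dir and --val_img_dir at folders such as data/afhq/train, and other reports here describe one subfolder per domain. A minimal sketch, assuming that layout, that lists each domain folder with its image count; `summarize_domains` and the cat/dog folders are hypothetical:

```python
import os
import tempfile

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")


def summarize_domains(root):
    """Map each subfolder of `root` (one per domain) to its image count."""
    summary = {}
    with os.scandir(root) as it:
        entries = sorted(it, key=lambda e: e.name)
    for entry in entries:
        if entry.is_dir():
            files = os.listdir(entry.path)
            summary[entry.name] = sum(
                f.lower().endswith(IMAGE_EXTENSIONS) for f in files)
    return summary


# Build a tiny throwaway layout with two domain folders: cat and dog.
with tempfile.TemporaryDirectory() as tmp:
    for domain, count in (("cat", 3), ("dog", 2)):
        os.makedirs(os.path.join(tmp, domain))
        for i in range(count):
            open(os.path.join(tmp, domain, f"{i}.jpg"), "wb").close()
    print(summarize_domains(tmp))  # {'cat': 3, 'dog': 2}
```

The number of keys is the value to compare against --num_domains; hidden or stray subfolders would show up here as extra keys.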
Similar to issue #11.
After training, an eval error happens when using a 4-domain dataset.
Should I set lambda_ds to (number of domains - 1), or adjust another variable to fit the number of domains?
python main.py --mode train --num_domains 4 --w_hpf 0 \
--lambda_reg 1 --lambda_sty 1 --lambda_ds 1 --lambda_cyc 1 \
--train_img_dir data/custom/train \
--val_img_dir data/custom/val
Calculating evaluation metrics...
Number of domains: 4
Preparing DataLoader for the evaluation phase...
Traceback (most recent call last):
File "main.py", line 182, in <module>
main(args)
File "main.py", line 59, in main
solver.train(loaders)
File "/home/ipsych/ML/Stargan_v2/core/solver.py", line 170, in train
calculate_metrics(nets_ema, args, i+1, mode='latent')
File "/home/ipsych/.conda/envs/Pytorch_1_4_0/lib/python3.7/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
return func(*args, **kwargs)
File "/home/ipsych/ML/Stargan_v2/metrics/eval.py", line 61, in calculate_metrics
iter_ref = iter(loader_ref)
UnboundLocalError: local variable 'loader_ref' referenced before assignment
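The traceback shows `iter(loader_ref)` being reached while mode='latent'. A minimal, repository-independent sketch of how this message arises when a local name is assigned in only one branch; the function names are hypothetical and this is not the code in metrics/eval.py:

```python
def make_ref_iterator(mode, refs=()):
    """Simplified stand-in: `loader_ref` is bound only in 'reference' mode."""
    if mode == "reference":
        loader_ref = list(refs)
    return iter(loader_ref)  # unbound when mode == 'latent'


def make_ref_iterator_guarded(mode, refs=()):
    """Same logic, but the name always exists and latent mode returns None."""
    loader_ref = list(refs) if mode == "reference" else None
    return iter(loader_ref) if loader_ref is not None else None


try:
    make_ref_iterator("latent")
except UnboundLocalError as err:
    print(type(err).__name__)  # UnboundLocalError

print(make_ref_iterator_guarded("latent"))  # None
```

In the real code, the question to answer is why the reference-only line runs in latent mode at all; the guard above only makes the failure easier to read.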
In StarGAN v1, one image can belong to multiple domains; attributes are handled through a txt file.
My dataset is currently set up that way (every image belongs to multiple domains): the images are in one folder and the labels are saved in a CSV file. Should I save the images by domain? Is there a good way to solve my problem? Thanks.
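Hello. Since I am Korean, I am also leaving my question in Korean in case you read it. As asked above, my images are currently in a single folder and each image belongs to multiple domains. (Photo A belongs to the female domain and also to the blond domain, so I want to use photo A for training both domains.) To store the data the way you suggest, the same photo would have to be saved in multiple folders, which is cumbersome and takes up a lot of space, so I wonder whether there is a better way. Thank you. :)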
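One way to avoid storing copies is to build the per-domain folders out of symbolic links. A minimal sketch, assuming a CSV with one `filename,domain` row per membership (a hypothetical format) and a data loader that follows file symlinks; `link_images_by_domain` is not part of the repository:

```python
import csv
import os
import tempfile


def link_images_by_domain(csv_path, image_dir, out_root):
    """Create out_root/<domain>/<filename> symlinks from rows of filename,domain."""
    with open(csv_path, newline="") as f:
        for filename, domain in csv.reader(f):
            domain_dir = os.path.join(out_root, domain)
            os.makedirs(domain_dir, exist_ok=True)
            link = os.path.join(domain_dir, filename)
            if not os.path.lexists(link):
                os.symlink(os.path.abspath(os.path.join(image_dir, filename)), link)


# Throwaway demonstration: one image that belongs to two domains.
with tempfile.TemporaryDirectory() as tmp:
    images = os.path.join(tmp, "images")
    os.makedirs(images)
    open(os.path.join(images, "a.jpg"), "wb").close()
    labels = os.path.join(tmp, "labels.csv")
    with open(labels, "w", newline="") as f:
        csv.writer(f).writerows([["a.jpg", "female"], ["a.jpg", "blond"]])
    out = os.path.join(tmp, "train")
    link_images_by_domain(labels, images, out)
    print(sorted(os.listdir(out)))  # ['blond', 'female']
```

This only solves the storage problem: each attribute still becomes its own domain, and whether overlapping domains train well is a separate question. Creating symlinks on Windows may require extra privileges.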
Hi,
First, it's a great project, and thank you for sharing the documentation and code. However, I have encountered the two issues listed below. Thanks in advance for your help!
CondaValueError: invalid package specification: x264=='1152.20180717
python main.py --mode sample --num_domains 2 --resume_iter 100000 --w_hpf 1 --checkpoint_dir expr/checkpoints/celeba_hq --result_dir expr/results/celeba_hq --src_dir assets/representative/celeba_hq/src --ref_dir assets/representative/celeba_hq/ref
Error message:
Traceback (most recent call last):
  File "main.py", line 182, in <module>
    main(args)
  File "main.py", line 37, in main
    solver = Solver(args)
  File "C:\Users\xxx\stargan-v2\core\solver.py", line 58, in __init__
    self.ckptios = [CheckpointIO(ospj(args.checkpoint_dir, '{:06d}_nets_ema.ckpt'), **self.nets_ema)]
  File "C:\Users\xxx\stargan-v2\core\checkpoint.py", line 17, in __init__
    os.makedirs(os.path.dirname(fname_template), exist_ok=True)
  File "C:\Users\xxx.conda\envs\stargan-v2\lib\os.py", line 220, in makedirs
    mkdir(name, mode)
FileNotFoundError: [WinError 3] The system cannot find the path specified: '{:'
Running command to generate images fails on:
File "/home/bobi/.local/lib/python3.6/site-packages/torchvision/transforms/functional.py", line 5, in <module>
from PIL import Image, ImageOps, ImageEnhance, PILLOW_VERSION
ImportError: cannot import name 'PILLOW_VERSION'
(stargan-v2) bobi@strix:~/Desktop/stargan-v2$ python
Python 3.6.7 |Anaconda, Inc.| (default, Oct 23 2018, 19:16:44)
[GCC 7.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import PIL
>>> PIL.PILLOW_VERSION
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
AttributeError: module 'PIL' has no attribute 'PILLOW_VERSION'
It works with
pip install "pillow<7"
Hi @yunjey ,
Is it possible to transfer only certain attributes of the reference image, like hair color or skin color, instead of its whole style?
Thanks
I have downloaded the pre-trained models trained on the AFHQ dataset. Why does the ckpt file still contain the weights of the FAN network? Isn't it unused when training models on AFHQ?
In the README file, the linked repo is not this one.
Hello,
Thanks for the awesome code and application.
I have a question about training many (6~8) domains, like the StarGAN RaFD implementation on GitHub.
When I put all domains in separate folders and train with -ds 8, the resulting model totally mixes up the identity of the photo. The pre-trained CelebA-HQ model works far better at generating results, even when I use photos from my domains as male or female sources (i.e., letting the person smile while preserving the identity).
Is this related to the changed GAN structure, and to gender being a far more vivid feature than expressions?