
TransGAN's Introduction

TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up

Code used for TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up.

Implementation

  • Gradient checkpointing via torch.utils.checkpoint (see the sketch after this list)
  • 16-bit precision training
  • Distributed training (faster!)
  • IS/FID evaluation
  • Gradient accumulation
  • Stronger data augmentation
  • Self-Modulation
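
As a reference for the first two items, here is a minimal sketch of how gradient checkpointing and 16-bit training are usually wired up in PyTorch (generic usage, not this repo's exact training loop; model, batch, compute_loss and optimizer are placeholders):

import torch
from torch.utils.checkpoint import checkpoint

def run_blocks(blocks, x, use_checkpoint=True):
    # Recompute each block's activations during backward instead of storing them.
    for blk in blocks:
        x = checkpoint(blk, x) if use_checkpoint else blk(x)
    return x

scaler = torch.cuda.amp.GradScaler()
with torch.cuda.amp.autocast():           # run the forward pass in 16-bit precision
    loss = compute_loss(model, batch)     # placeholder for the actual GAN losses
scaler.scale(loss).backward()             # scale the loss to avoid fp16 gradient underflow
scaler.step(optimizer)
scaler.update()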

Guidance

CIFAR-10 training script

python exp/cifar_train.py

I disabled evaluation during the training job because it causes a strange bug. Please launch a separate evaluation job in parallel by copying the checkpoint path into the test script.

CIFAR-10 test

First download the CIFAR-10 checkpoint and put it in ./cifar_checkpoint. Then run the following script.

python exp/cifar_test.py

Main Pipeline

(Figure: main pipeline of TransGAN.)

Representative Visual Results

(Figure: CIFAR-10 visual results.)

README awaits further updates.

Acknowledgement

Codebase adapted from AutoGAN and pytorch-image-models.

Citation

If you find this repo helpful, please cite:

@article{jiang2021transgan,
  title={Transgan: Two pure transformers can make one strong gan, and that can scale up},
  author={Jiang, Yifan and Chang, Shiyu and Wang, Zhangyang},
  journal={Advances in Neural Information Processing Systems},
  volume={34},
  year={2021}
}

TransGAN's People

Contributors

yifanjiang19


TransGAN's Issues

The model code

I want to know if you could push the code of the generator and discriminator.

Linear Unflatten layer

As I understand it, your paper aims to completely remove convolutional layers. But in the code, the linear unflatten layer (which produces the RGB image) uses conv2d. Why do you use this? Is there any way to get the RGB image without conv2d?
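
Worth noting (an observation, not the authors' answer): if the conv in question is a 1×1 conv2d, it is mathematically a per-pixel linear layer applied across channels, so no spatial convolution is reintroduced. A quick check:

import torch
import torch.nn as nn

x = torch.randn(2, 384, 32, 32)                      # (B, C, H, W) feature grid
conv = nn.Conv2d(384, 3, kernel_size=1)              # 1x1 "to RGB" projection
lin = nn.Linear(384, 3)
lin.weight.data = conv.weight.data.view(3, 384)      # reuse the exact same parameters
lin.bias.data = conv.bias.data

out_conv = conv(x)
out_lin = lin(x.permute(0, 2, 3, 1)).permute(0, 3, 1, 2)
print(torch.allclose(out_conv, out_lin, atol=1e-6))  # True: identical up to float error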

Training for CIFAR-10

Hello,

I've been having a lot of trouble running the most basic model training for CIFAR-10. Is this the correct usage?

python exps/cifar_train.py

The code appeared to stall, and rerunning a second time resulted in "Address already in use". I wasn't able to find options to run on a single GPU or without multiprocessing-distributed. I've tried editing cifar_train.py to specify only a single GPU, and otherwise calling torch.cuda.set_device(...), but the training never goes through.

I did have to change the version of tensorboard from requirements.txt, though I don't see how that would result in these issues. Do you have any advice on how to make the training run and complete?

Thank you!
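
A likely cause of the "Address already in use" error (generic torch.distributed behavior, not specific to this repo): a stale worker from the first, stalled run is still holding the rendezvous port from dist_url. Killing leftover processes, or pointing --dist-url at a free port, usually clears it, e.g.:

pkill -f cifar_train.py                                      # kill stale workers first
python exps/cifar_train.py --dist-url tcp://localhost:23457  # then rerun on a free port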

Checkpoint doesn't match

Commit 9442615 breaks the checkpoint loading:

RuntimeError: Error(s) in loading state_dict for DataParallel:
        Unexpected key(s) in state_dict: "module.to_rgb.0.weight", "module.to_rgb.0.bias", "module.to_rgb.0.running_mean", "module.to_rgb.0.running_var", "module.to_rgb.0.num_batches_tracked". 

Could you upload the updated checkpoint? Thanks.
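
A possible stopgap until a new checkpoint is uploaded (a sketch only; the checkpoint's dict layout and the 'generator_state_dict' key are assumptions, adjust to the actual file): drop the unexpected to_rgb keys and load non-strictly.

import torch

ckpt = torch.load('cifar_checkpoint.pth', map_location='cpu')
state_dict = ckpt['generator_state_dict']          # hypothetical key name
# Drop the keys the current model no longer defines (the old to_rgb BatchNorm).
state_dict = {k: v for k, v in state_dict.items()
              if not k.startswith('module.to_rgb.0.')}
gen_net.load_state_dict(state_dict, strict=False)  # gen_net: the DataParallel-wrapped model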

loss

Can you share the loss curve during training? I found that the losses do not converge during training.

training code

Hello! When will the training code be updated, or how should I write it myself? Could you please guide me? Thank you very much.

Error occurs when training on Cifar-10

Excuse me. I tried to train TransGAN on the CIFAR-10 dataset using the script exps/cifar_train.py, but it fails with RuntimeError: The size of tensor a (64) must match the size of tensor b (256) at non-singleton dimension 3, raised at attn = attn + relative_position_bias.unsqueeze(0) via real_validity = dis_net(real_imgs).
I printed the shapes of the tensors 'attn' and 'relative_position_bias': (64, 4, 64, 64) and (4, 256, 256). I haven't modified the discriminator code.
I don't know how to solve this problem.
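
For reference, a quick sanity check of where those shapes come from, assuming windowed attention with window size 8 over a 16×16 token grid (numbers inferred from the error message, not from the repo):

window_size = 8
tokens_per_window = window_size ** 2   # 64 -> matches attn: (windows*batch, 4 heads, 64, 64)
grid = 16
tokens_total = grid * grid             # 256 -> matches relative_position_bias: (4 heads, 256, 256)
# A bias indexed over all 256 tokens cannot be added to attention computed per
# 64-token window; the bias must be built per window (64 x 64) instead.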

The result FID on CIFAR10 could not be reimplemented.

Hi,

My environment configuration is the same as requirements.txt, except that tensorflow is not installed.

I run python exps/cifar_train.py with 4 GeForce GTX TITAN (12G).

The lowest FID is 10.72, which is much higher than the 9.26 reported in the paper.

My training log:
2021-08-01 11:50:07,095 Namespace(D_downsample='avg', accumulated_times=1, arch=None, baseline_decay=0.9, batch_size=16, beta1=0.0, beta2=0.99, bottom_width=8, channels=3, controller='controller', ctrl_lr=0.00035, ctrl_sample_batch=1, ctrl_step=30, d_act='gelu', d_depth=3, d_heads=4, d_lr=0.0001, d_mlp=4, d_norm='ln', d_spectral_norm=False, d_window_size=8, data_path='./data', dataset='cifar10', df_dim=384, diff_aug='translation,cutout,color', dis_batch_size=16, dis_model='ViT_custom_scale2', dist_backend='nccl', dist_url='tcp://localhost:14256', distributed=True, dropout=0.0, dynamic_reset_threshold=0.001, dynamic_reset_window=500, ema=0.9999, ema_kimg=500, ema_warmup=0.1, entropy_coeff=0.001, eval_batch_size=8, exp_name='cifar_train', fade_in=0.0, fid_stat='None', g_accumulated_times=1, g_act='gelu', g_depth='5,4,2', g_lr=0.0001, g_mlp=4, g_norm='ln', g_spectral_norm=False, g_window_size=8, gen_batch_size=32, gen_model='ViT_custom', gf_dim=1024, gpu=0, grow_step1=25, grow_step2=55, grow_steps=[0, 0], hid_size=100, img_size=32, init_type='xavier_uniform', latent_dim=256, latent_norm=False, load_path=None, loca_rank=-1, loss='wgangp-eps', lr_decay=False, max_epoch=2558.0, max_iter=500000, max_search_iter=90, ministd=False, multiprocessing_distributed=True, n_classes=0, n_critic=4, num_candidate=10, num_eval_imgs=20000, num_landmarks=64, num_workers=4, optimizer='adam', patch_size=2, path_helper={'prefix': 'logs/cifar_train_2021_08_01_11_50_07', 'ckpt_path': 'logs/cifar_train_2021_08_01_11_50_07/Model', 'log_path': 'logs/cifar_train_2021_08_01_11_50_07/Log', 'sample_path': 'logs/cifar_train_2021_08_01_11_50_07/Samples'}, phi=1.0, print_freq=50, random_seed=12345, rank=0, rl_num_eval_img=5000, seed=12345, shared_epoch=15, show=False, topk=5, val_freq=20, wd=0.001, world_size=4)
2021-08-01 16:00:22,196 => calculate inception score
2021-08-01 16:03:21,008 Inception score: 0, FID score: 74.7215576171875 || @ epoch 20.
2021-08-01 20:01:06,548 => calculate inception score
2021-08-01 20:04:05,234 Inception score: 0, FID score: 49.187530517578125 || @ epoch 40.
2021-08-02 00:01:34,926 => calculate inception score
2021-08-02 00:04:33,541 Inception score: 0, FID score: 41.36199951171875 || @ epoch 60.
2021-08-02 04:01:46,366 => calculate inception score
2021-08-02 04:04:45,042 Inception score: 0, FID score: 34.57147216796875 || @ epoch 80.
2021-08-02 08:02:00,014 => calculate inception score
2021-08-02 08:04:58,443 Inception score: 0, FID score: 28.511077880859375 || @ epoch 100.
2021-08-02 12:02:32,033 => calculate inception score
2021-08-02 12:05:30,596 Inception score: 0, FID score: 23.330780029296875 || @ epoch 120.
2021-08-02 16:02:47,431 => calculate inception score
2021-08-02 16:05:46,199 Inception score: 0, FID score: 19.77392578125 || @ epoch 140.
2021-08-02 20:09:35,037 => calculate inception score
2021-08-02 20:12:33,914 Inception score: 0, FID score: 16.3189697265625 || @ epoch 160.
2021-08-03 00:10:55,077 => calculate inception score
2021-08-03 00:13:53,645 Inception score: 0, FID score: 14.2860107421875 || @ epoch 180.
2021-08-03 04:12:10,663 => calculate inception score
2021-08-03 04:15:09,299 Inception score: 0, FID score: 13.2266845703125 || @ epoch 200.
2021-08-03 08:13:34,201 => calculate inception score
2021-08-03 08:16:32,938 Inception score: 0, FID score: 12.24041748046875 || @ epoch 220.
2021-08-03 12:14:29,228 => calculate inception score
2021-08-03 12:17:27,896 Inception score: 0, FID score: 11.8358154296875 || @ epoch 240.
2021-08-03 16:15:31,432 => calculate inception score
2021-08-03 16:18:51,338 Inception score: 0, FID score: 11.555419921875 || @ epoch 260.
2021-08-03 20:17:01,392 => calculate inception score
2021-08-03 20:20:00,398 Inception score: 0, FID score: 11.2257080078125 || @ epoch 280.
2021-08-04 00:17:57,224 => calculate inception score
2021-08-04 00:20:56,055 Inception score: 0, FID score: 10.96759033203125 || @ epoch 300.
2021-08-04 04:18:56,332 => calculate inception score
2021-08-04 04:21:55,526 Inception score: 0, FID score: 10.788848876953125 || @ epoch 320.
2021-08-04 08:19:56,183 => calculate inception score
2021-08-04 08:22:58,059 Inception score: 0, FID score: 10.72705078125 || @ epoch 340.
2021-08-04 12:20:55,906 => calculate inception score
2021-08-04 12:23:54,770 Inception score: 0, FID score: 11.013397216796875 || @ epoch 360.
2021-08-04 16:21:42,184 => calculate inception score
2021-08-04 16:24:40,722 Inception score: 0, FID score: 11.45196533203125 || @ epoch 380.
2021-08-04 20:22:26,842 => calculate inception score
2021-08-04 20:25:25,717 Inception score: 0, FID score: 11.9810791015625 || @ epoch 400.
2021-08-05 00:23:18,124 => calculate inception score
2021-08-05 00:26:17,206 Inception score: 0, FID score: 12.340728759765625 || @ epoch 420.
2021-08-05 04:24:06,863 => calculate inception score
2021-08-05 04:27:05,850 Inception score: 0, FID score: 13.225006103515625 || @ epoch 440.
2021-08-05 08:25:24,152 => calculate inception score
2021-08-05 08:28:23,146 Inception score: 0, FID score: 13.89190673828125 || @ epoch 460.
2021-08-05 12:27:29,588 => calculate inception score
2021-08-05 12:30:28,551 Inception score: 0, FID score: 14.103118896484375 || @ epoch 480.
2021-08-05 16:28:49,395 => calculate inception score
2021-08-05 16:31:48,632 Inception score: 0, FID score: 14.109130859375 || @ epoch 500.
2021-08-05 20:30:04,791 => calculate inception score
2021-08-05 20:33:03,678 Inception score: 0, FID score: 14.337677001953125 || @ epoch 520.
2021-08-06 00:31:27,460 => calculate inception score
2021-08-06 00:34:26,133 Inception score: 0, FID score: 14.567657470703125 || @ epoch 540.
2021-08-06 04:32:47,586 => calculate inception score
2021-08-06 04:35:46,843 Inception score: 0, FID score: 14.43280029296875 || @ epoch 560.
2021-08-06 08:34:08,444 => calculate inception score
2021-08-06 08:37:07,550 Inception score: 0, FID score: 14.98822021484375 || @ epoch 580.
2021-08-06 12:35:32,232 => calculate inception score
2021-08-06 12:38:31,877 Inception score: 0, FID score: 14.719512939453125 || @ epoch 600.
2021-08-06 16:37:05,458 => calculate inception score
2021-08-06 16:40:04,417 Inception score: 0, FID score: 14.872100830078125 || @ epoch 620.
2021-08-06 20:38:30,419 => calculate inception score
2021-08-06 20:41:29,706 Inception score: 0, FID score: 14.83355712890625 || @ epoch 640.
2021-08-07 00:39:53,802 => calculate inception score
2021-08-07 00:42:52,606 Inception score: 0, FID score: 15.34588623046875 || @ epoch 660.
2021-08-07 04:41:16,504 => calculate inception score
2021-08-07 04:44:14,945 Inception score: 0, FID score: 15.1998291015625 || @ epoch 680.
2021-08-07 08:42:53,671 => calculate inception score
2021-08-07 08:45:52,648 Inception score: 0, FID score: 15.45361328125 || @ epoch 700.

About the patchsize in the Generator

Dear author:
Thanks for your work. In the model_search code for CIFAR and the 256-resolution model, I cannot find where the patch size is used in the Generator; I only see it in the Discriminator. Can you tell me where it is in your code? Thank you very much.

models_search

In the "models_search" folder, why is there no "building_blocks_search"?

run Transgan with single gpu

Hello, your work is very exciting, but I need to modify your code to run it successfully on a single GPU. I have tried many times without success. May I ask for your advice?

Training on single GPU

What script should I use to train TransGAN (celeba_hq dataset) on a single GPU? I am using a PC with 4GB NVIDIA GTX 1650.
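
For both single-GPU questions above, a plausible starting point (inferred from the cfg.py flags quoted further down this page; an unverified recipe, not the authors' instructions): with the standard torch.distributed template, leaving out --multiprocessing-distributed and keeping world-size at its default should run a single process on one device, and the batch-size flags can be lowered for a small card:

CUDA_VISIBLE_DEVICES=0 python exps/cifar_train.py --gpu 0 -gen_bs 16 -dis_bs 8   # example values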

Self-Attention

Why, in the self-attention layer, is there a linear projection after multiplying the value matrix by the attention weights?
In this overview, as far as I can tell, it doesn't appear: https://arxiv.org/pdf/1906.01529.pdf.
In some implementations it appears and in some it doesn't; what is its impact on the attention?
Does it add weights to the attention for better expressiveness?

self.proj = nn.Linear(dim, dim)

in the Attention module:

import torch.nn as nn

# matmul and get_attn_mask are helper functions defined elsewhere in this repo.

class Attention(nn.Module):
    def __init__(self, dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0., proj_drop=0., is_mask=0):
        super().__init__()
        self.num_heads = num_heads
        head_dim = dim // num_heads
        # NOTE scale factor was wrong in my original version, can set manually to be compat with prev weights
        self.scale = qk_scale or head_dim ** -0.5

        # Fused projection producing queries, keys and values in one matmul.
        self.qkv = nn.Linear(dim, dim * 3, bias=qkv_bias)
        self.attn_drop = nn.Dropout(attn_drop)
        # Output projection applied after the heads are concatenated.
        self.proj = nn.Linear(dim, dim)
        self.proj_drop = nn.Dropout(proj_drop)
        self.mat = matmul()
        self.is_mask = is_mask
        self.remove_mask = False
        # Precomputed local-attention masks for several token-grid sizes.
        self.mask_4 = get_attn_mask(is_mask, 4)
        self.mask_5 = get_attn_mask(is_mask, 5)
        self.mask_6 = get_attn_mask(is_mask, 6)
        self.mask_7 = get_attn_mask(is_mask, 7)
        self.mask_8 = get_attn_mask(is_mask, 8)
        self.mask_10 = get_attn_mask(is_mask, 10)
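
For context, this projection is the standard output matrix W^O of multi-head attention ("Attention Is All You Need"): after the per-head outputs are concatenated, one final linear layer mixes information across heads; without it, the heads would remain independent subspaces. A minimal forward pass in the usual ViT/timm style (an illustrative sketch, not this repo's exact forward, which additionally applies the masks above):

def forward(self, x):
    B, N, C = x.shape
    # (B, N, 3C) -> (3, B, heads, N, C // heads)
    qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads).permute(2, 0, 3, 1, 4)
    q, k, v = qkv[0], qkv[1], qkv[2]
    attn = (q @ k.transpose(-2, -1)) * self.scale      # scaled dot-product scores
    attn = self.attn_drop(attn.softmax(dim=-1))
    x = (attn @ v).transpose(1, 2).reshape(B, N, C)    # concatenate the heads
    x = self.proj(x)                                   # W^O: mixes information across heads
    return self.proj_drop(x)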

Training script

Hello!
Thanks for your work on TransGAN, it's amazing! When do you plan to post the training scripts?

FID score

Why is the FID score always NaN on CIFAR-10? I also calculated the FID between two identical datasets, and it is NaN as well.
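
For reference, NaNs in FID typically come from the matrix square root of the covariance product. A minimal sketch of the standard computation in the pytorch-fid style, including the usual epsilon fallback when sqrtm returns non-finite values (mu1/sigma1 and mu2/sigma2 are the Inception statistics of the two image sets):

import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2, eps=1e-6):
    diff = mu1 - mu2
    # Matrix square root of sigma1 @ sigma2; this is the step that can go NaN.
    covmean, _ = linalg.sqrtm(sigma1.dot(sigma2), disp=False)
    if not np.isfinite(covmean).all():
        # Nearly singular product: nudge the diagonals and retry.
        offset = np.eye(sigma1.shape[0]) * eps
        covmean = linalg.sqrtm((sigma1 + offset).dot(sigma2 + offset))
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return diff.dot(diff) + np.trace(sigma1) + np.trace(sigma2) - 2 * np.trace(covmean)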

Generated Images Have Some Blocking Artifact

Due to the patch-wise generation of TransGAN, I found some blocking artifacts in your generations. I think the authors are already aware of this phenomenon. Are there any tricks to eliminate these artifacts?

Can this model generate 128 or 256 resolution images?

Can this model generate 128- or 256-resolution images? If so, what is the corresponding cfg?
I tried 128 with the parameter "bottom_width" set to 16, but a single GPU card with 24 GB seems not enough.

celeba

Your CelebA dataset comes in sizes 64×64, 128×128, and 256×256, so how do you process the data to get the different sizes?

Image sizes don't match

Commit 1c51d9f breaks the size of some of the layers in the checkpoint when running the STL10 test:

RuntimeError: Error(s) in loading state_dict for DataParallel:
        size mismatch for module.pos_embed: copying a param with shape torch.Size([1, 145, 384]) from checkpoint, the shape in current model is torch.Size([1, 2305, 384]).
        size mismatch for module.patch_embed.weight: copying a param with shape torch.Size([384, 3, 4, 4]) from checkpoint, the shape in current model is torch.Size([384, 3, 1, 1]).

Would you be able to upload the new checkpoint? Thank you.
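
If a matching checkpoint never appears, the pos_embed size mismatch can sometimes be bridged by 2-D interpolation of the grid portion of the embedding, a standard ViT trick (a sketch, assuming the first token is a class/extra token; the patch_embed mismatch, 4×4 vs 1×1 kernels, cannot be fixed this way):

import torch
import torch.nn.functional as F

def resize_pos_embed(pos_embed, old_grid, new_grid):
    # pos_embed: (1, 1 + old_grid**2, dim); the first token is kept as-is.
    extra, grid = pos_embed[:, :1], pos_embed[:, 1:]
    dim = grid.shape[-1]
    grid = grid.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    grid = F.interpolate(grid, size=(new_grid, new_grid), mode='bicubic', align_corners=False)
    grid = grid.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([extra, grid], dim=1)

# here: 145 = 1 + 12*12 tokens -> 2305 = 1 + 48*48 tokens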

Where is implementation of MT-CT?

Hello, when I looked at the generator training code in functions.py, I did not see the part corresponding to multi-task co-training (MT-CT). Did I miss it?

Thanks!

About the role of function “get_attn_mask”.

Thanks for your work!

While looking at the code in model/Celeba64_TransGAN.py, I noticed that the function "get_attn_mask" seems to play a role in the training process. Can you point out the specific role of this function?

Thank you~~

Training fails

I've tried using the function train in functions.py, and training seems to fail:

  File "functional.py", line 86, in adam
    exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
RuntimeError: output with shape [768] doesn't match the broadcast shape [3, 48, 1, 768]

Question about GAN training.

Hi, thanks for the work. Following the training code in functions.py, it seems that you did not freeze D when training G. When dis_optimizer.step() runs, the gradients from the G step would also be used to update D's parameters. I wonder whether I missed something, or whether this is a bug. Thanks a lot!
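
For reference, the usual pattern that makes this safe even without freezing D (a generic sketch of alternating GAN updates, not this repo's exact functions.py; d_loss_fn/g_loss_fn are placeholders): as long as dis_optimizer.zero_grad() runs at the start of every D step, any gradients that leaked into D during the G update are discarded before D is actually stepped.

# Discriminator step
dis_optimizer.zero_grad()                      # discards stale grads from the last G step
d_loss = d_loss_fn(dis_net(real_imgs), dis_net(fake_imgs.detach()))
d_loss.backward()
dis_optimizer.step()

# Generator step
gen_optimizer.zero_grad()
g_loss = g_loss_fn(dis_net(gen_net(z)))        # backward() fills D's .grad as well
g_loss.backward()
gen_optimizer.step()                           # but only G's parameters are updated here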

Model Size

Could you please report the model size of the proposed TransGAN, compared with StyleGAN and StyleGAN2 at resolution 256?

train

Hi, I'm trying to train the model on CelebA 64×64. As you said, the mask plays a big role in the training stage; could you please tell me how to set the "is_mask" argument? Also, could you please tell me how to set "drop", "attn_drop" and "drop_path" when initializing the "Block" class? Thank you very much!

Question

May I ask why the mask is used in the Transformer Encoder? It seems that the original implementation only uses the mask in the Transformer Decoder. Thanks.

Is the function `get_attn_mask` the same one used in the paper on arxiv?

Hi,
When I try to recreate the attention mask using the function get_attn_mask in models/TransGAN_8_8_G2_1.py or models/TransGAN_8_8_1.py, I end up with the image below.
The image on the left is the output of get_attn_mask with N=32*32, w=25.
The image on the right reshapes the mask's 495th row (or pixel) into a 32x32 image (like in the paper).
The grey color represents that pixel.

As is visible, the outputs do not match the paper for any value of w.

I appreciate any help that you can provide. Thanks!
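
For anyone else trying to reproduce this, below is one plausible construction of a grid-local attention mask (my own guess at what get_attn_mask might compute, not verified against the repo): a token may attend to another only if their 2-D grid coordinates differ by less than w in both axes.

import torch

def local_attn_mask(grid, w):
    # grid*grid tokens laid out on a 2-D lattice
    ys, xs = torch.meshgrid(torch.arange(grid), torch.arange(grid), indexing='ij')
    coords = torch.stack([ys.flatten(), xs.flatten()], dim=1)   # (N, 2)
    diff = (coords[:, None, :] - coords[None, :, :]).abs()      # (N, N, 2) coordinate gaps
    return (diff < w).all(dim=-1)                               # (N, N) boolean mask

mask = local_attn_mask(32, 5)            # N = 32*32 tokens
row_img = mask[495].reshape(32, 32)      # the 495th pixel's receptive field, as in the paper figure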

(PS: Added the image from the paper for reference.)

celeba image size

In your dataset, what is the size of the CelebA data you use?

Using Conv?

Hello, thank you for uploading the code for your awesome work.

I'm looking at the code and there is a convolutional layer in the generator (self.deconv). Is there a reason for using it?
In the paper it says there are no convolutional layers, so I'm a little confused.

Thank you

RuntimeError: The size of tensor a (5) must match the size of tensor b (4097) at non-singleton dimension 1

Before the first iteration, I run into the error in the title. What should I do?

And here is my cfg.py:

import argparse

def str2bool(v):
    if v.lower() in ('yes', 'true', 't', 'y', '1'):
        return True
    elif v.lower() in ('no', 'false', 'f', 'n', '0'):
        return False
    else:
        raise argparse.ArgumentTypeError('Boolean value expected.')

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--world-size', default=-1, type=int, help='number of nodes for distributed training')
    parser.add_argument('--rank', default=-1, type=int, help='node rank for distributed training')
    parser.add_argument('--loca_rank', default=-1, type=int, help='node rank for distributed training')
    parser.add_argument('--dist-url', default='tcp://224.66.41.62:23456', type=str, help='url used to set up distributed training')
    parser.add_argument('--dist-backend', default='nccl', type=str, help='distributed backend')
    parser.add_argument('--seed', default=12345, type=int, help='seed for initializing training')
    parser.add_argument('--gpu', default=0, type=int, help='GPU id to use')
    parser.add_argument('--multiprocessing-distributed', action='store_true',
                        help='Use multi-processing distributed training to launch N processes per node, '
                             'which has N GPUs. This is the fastest way to use PyTorch for either single '
                             'node or multi node data parallel training')
    parser.add_argument('--max_epoch', type=int, default=200, help='number of epochs of training')
    parser.add_argument('--max_iter', type=int, default=10, help='set the max iteration number')
    parser.add_argument('-gen_bs', '--gen_batch_size', type=int, default=4, help='size of the batches')
    parser.add_argument('-dis_bs', '--dis_batch_size', type=int, default=4, help='size of the batches')
    parser.add_argument('--g_lr', type=float, default=0.0002, help='adam: gen learning rate')
    parser.add_argument('--wd', type=float, default=0, help='adamw: gen weight decay')
    parser.add_argument('--d_lr', type=float, default=0.0002, help='adam: disc learning rate')
    parser.add_argument('--ctrl_lr', type=float, default=3.5e-4, help='adam: ctrl learning rate')
    parser.add_argument('--lr_decay', action='store_true', help='learning rate decay or not')
    parser.add_argument('--beta1', type=float, default=0.0, help='adam: decay of first order momentum of gradient')
    parser.add_argument('--beta2', type=float, default=0.9, help='adam: decay of second order momentum of gradient')
    parser.add_argument('--num_workers', type=int, default=0, help='number of cpu threads to use during batch generation')
    parser.add_argument('--latent_dim', type=int, default=128, help='dimensionality of the latent space')
    parser.add_argument('--img_size', type=int, default=256, help='size of each image dimension')
    parser.add_argument('--channels', type=int, default=3, help='number of image channels')
    parser.add_argument('--n_critic', type=int, default=1, help='number of training steps for discriminator per iter')
    parser.add_argument('--val_freq', type=int, default=20, help='interval between each validation')
    parser.add_argument('--print_freq', type=int, default=100, help='interval between each verbose')
    parser.add_argument('--load_path', type=str, help='The reload model path')
    parser.add_argument('--exp_name', type=str, default='Test', help='The name of exp')
    parser.add_argument('--d_spectral_norm', type=str2bool, default=False, help='add spectral_norm on discriminator?')
    parser.add_argument('--g_spectral_norm', type=str2bool, default=False, help='add spectral_norm on generator?')
    parser.add_argument('--dataset', type=str, default='cifar10', help='dataset type')
    parser.add_argument('--data_path', type=str, default='./data', help='The path of data set')
    parser.add_argument('--init_type', type=str, default='normal', choices=['normal', 'orth', 'xavier_uniform', 'false'], help='The init type')
    parser.add_argument('--gf_dim', type=int, default=64, help='The base channel num of gen')
    parser.add_argument('--df_dim', type=int, default=64, help='The base channel num of disc')
    parser.add_argument('--gen_model', type=str, default='ViT_custom_rp', help='path of gen model')
    parser.add_argument('--dis_model', type=str, default='ViT_custom_rp', help='path of dis model')
    parser.add_argument('--controller', type=str, default='controller', help='path of controller')
    parser.add_argument('--eval_batch_size', type=int, default=100)
    parser.add_argument('--num_eval_imgs', type=int, default=50000)
    parser.add_argument('--bottom_width', type=int, default=4, help='the base resolution of the GAN')
    parser.add_argument('--random_seed', type=int, default=12345)

    # search
    parser.add_argument('--shared_epoch', type=int, default=15, help='the number of epoch to train the shared gan at each search iteration')
    parser.add_argument('--grow_step1', type=int, default=25, help='which iteration to grow the image size from 8 to 16')
    parser.add_argument('--grow_step2', type=int, default=55, help='which iteration to grow the image size from 16 to 32')
    parser.add_argument('--max_search_iter', type=int, default=90, help='max search iterations of this algorithm')
    parser.add_argument('--ctrl_step', type=int, default=30, help='number of steps to train the controller at each search iteration')
    parser.add_argument('--ctrl_sample_batch', type=int, default=1, help='sample size of controller of each step')
    parser.add_argument('--hid_size', type=int, default=100, help='the size of hidden vector')
    parser.add_argument('--baseline_decay', type=float, default=0.9, help='baseline decay rate in RL')
    parser.add_argument('--rl_num_eval_img', type=int, default=5000, help='number of images to be sampled in order to get the reward')
    parser.add_argument('--num_candidate', type=int, default=10, help='number of candidate architectures to be sampled')
    parser.add_argument('--topk', type=int, default=5, help='preserve topk models architectures after each stage')
    parser.add_argument('--entropy_coeff', type=float, default=1e-3, help='to encourage the exploration')
    parser.add_argument('--dynamic_reset_threshold', type=float, default=1e-3, help='var threshold')
    parser.add_argument('--dynamic_reset_window', type=int, default=500, help='the window size')
    parser.add_argument('--arch', nargs='+', type=int, help='the vector of a discovered architecture')
    parser.add_argument('--optimizer', type=str, default='adam', help='optimizer')
    parser.add_argument('--loss', type=str, default='hinge', help='loss function')
    parser.add_argument('--n_classes', type=int, default=0, help='classes')
    parser.add_argument('--phi', type=float, default=1, help='wgan-gp phi')
    parser.add_argument('--grow_steps', nargs='+', type=int, default=[50, 100, 150], help='iterations at which to grow the image size')
    parser.add_argument('--D_downsample', type=str, default='avg', help='downsampling type')
    parser.add_argument('--fade_in', type=float, default=1, help='fade in step')
    parser.add_argument('--d_depth', type=int, default=7, help='Discriminator Depth')
    parser.add_argument('--g_depth', type=str, default='5,4,2', help='Generator Depth')
    parser.add_argument('--g_norm', type=str, default='ln', help='Generator Normalization')
    parser.add_argument('--d_norm', type=str, default='ln', help='Discriminator Normalization')
    parser.add_argument('--g_act', type=str, default='gelu', help='Generator activation Layer')
    parser.add_argument('--d_act', type=str, default='gelu', help='Discriminator activation layer')
    parser.add_argument('--patch_size', type=int, default=4, help='patch size')
    parser.add_argument('--fid_stat', type=str, default='./fid_stat/fid_camera.npz', help='path of the FID statistics file')
    parser.add_argument('--diff_aug', type=str, default='None', help='differentiable augmentation type')
    parser.add_argument('--accumulated_times', type=int, default=1, help='gradient accumulation')
    parser.add_argument('--g_accumulated_times', type=int, default=1, help='gradient accumulation')
    parser.add_argument('--num_landmarks', type=int, default=64, help='number of landmarks')
    parser.add_argument('--d_heads', type=int, default=4, help='number of heads')
    parser.add_argument('--dropout', type=float, default=0., help='dropout ratio')
    parser.add_argument('--ema', type=float, default=0.995, help='ema')
    parser.add_argument('--ema_warmup', type=float, default=0., help='ema warm up')
    parser.add_argument('--ema_kimg', type=int, default=500, help='ema thousand images')
    parser.add_argument('--latent_norm', action='store_true', help='latent vector normalization')
    parser.add_argument('--ministd', action='store_true', help='mini batch std')
    parser.add_argument('--g_mlp', type=int, default=4, help='generator mlp ratio')
    parser.add_argument('--d_mlp', type=int, default=4, help='discriminator mlp ratio')
    parser.add_argument('--g_window_size', type=int, default=8, help='generator window size')
    parser.add_argument('--d_window_size', type=int, default=8, help='discriminator window size')
    parser.add_argument('--show', action='store_true', help='show')

    opt = parser.parse_args()

    return opt

fid

What is 'fid_stats_celeba_hq_256.npz'? How do I obtain it? And is computing the FID statistics necessary?
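
Files like this typically hold precomputed Inception statistics (the mean and covariance of pool3 activations) over the real dataset, in the pytorch-fid convention. A sketch of how such an .npz is usually produced (the 'mu'/'sigma' key names are an assumption about this repo; acts is a placeholder for the computed activations):

import numpy as np

# acts: (N, 2048) InceptionV3 pool3 activations computed over the real images
mu = np.mean(acts, axis=0)
sigma = np.cov(acts, rowvar=False)
np.savez('fid_stats_celeba_hq_256.npz', mu=mu, sigma=sigma)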

Evolution of the generations during training

Hello, I was wondering if you have available anywhere some samples of the GAN's outputs at different stages of training: before training, after N epochs, and so on. This would give an idea of how the model reaches its goal and what to expect during training, so I can tell whether I'm way off the path. I have been trying to build something very similar with limited success, and I always have to fall back to some convolutional layers (like putting a Conv2D after every attention layer) to get any relevant results...

Thanks!

Question about Output of dimension

I tried to use the generator in ViT_scale3_local_new_rp.py, but the output looks like noise.
Meanwhile, I also find that the dimensions of the different stages differ from the paper.
Would you mind helping me solve this problem?
