
stylegan2-tf-2.x's People

Contributors

moono

stylegan2-tf-2.x's Issues

What are the "labels" in the generator and discriminator?

Thank you for this awesome repo! I am struggling to understand what the labels in the generator and discriminator are supposed to be. Are they actually unused, since labels_dim is 0 by default for training? (cf. labels in g_logistic_ns_pathreg)
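
For context, here is a minimal sketch of what such a labels tensor could look like. It assumes this port follows the official StyleGAN2 convention, where labels are one-hot class-conditioning vectors of shape [batch, labels_dim]. The helper name make_labels is hypothetical and not part of the repo. With labels_dim = 0 every row is empty, so the labels carry no information.

```python
def make_labels(class_ids, labels_dim):
    """Return one-hot rows of length labels_dim, one per sample.

    Hypothetical helper for illustration only; not part of the repo.
    """
    labels = []
    for class_id in class_ids:
        row = [0.0] * labels_dim
        if labels_dim > 0:
            row[class_id] = 1.0  # mark the sample's class
        labels.append(row)
    return labels


# Conditional case: 3 classes, batch of 2.
conditional = make_labels([2, 0], labels_dim=3)

# Unconditional case (labels_dim = 0): the shape is [batch, 0].
unconditional = make_labels([0, 0], labels_dim=0)
```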

different resolution pkl file

Hi,

When I try to run inference_from_official_weights with checkpoint files (in the official-pretrained folder) for a resolution other than 1024x1024, for example the StyleGAN2 pkl file for 256x256 (config d), it doesn't work. Any hint as to why this happens?

when training in 256x256, the output is 256x512 pixels

First, thanks for this great repository. It is very useful for studying the StyleGAN2 architecture!

When training at 256x256 resolution, the output images have a size of 256x512 (h x w). These are in fact two images stacked on top of each other. I can easily 'unstack' this output by reshaping the tensor, but I wonder why it happens. If my batch size is 2, I get 4 outputs. This will become problematic when I increase the resolution and need to generate just a single 512x512 image; I don't want the system to actually generate a 512x1024 one.

The two 'stacked' images are also quite similar.
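
The 'unstacking' mentioned above can be sketched as follows. This is a plain-Python illustration on nested lists, assuming the two images sit next to each other along the width axis (as the 256x512 h x w size suggests). The function name split_along_width is hypothetical. It does not explain why the combined output is produced in the first place.

```python
def split_along_width(image, n_parts):
    """Split an H x W image (a list of rows) into n_parts equal-width images."""
    width = len(image[0])
    if width % n_parts != 0:
        raise ValueError("width must be divisible by n_parts")
    part_w = width // n_parts
    return [
        [row[i * part_w:(i + 1) * part_w] for row in image]
        for i in range(n_parts)
    ]


# Tiny stand-in for a 256 x 512 output: 2 rows, 4 columns.
combined = [[1, 2, 3, 4],
            [5, 6, 7, 8]]
left, right = split_along_width(combined, 2)
```

With real tensors the same split would be done with the framework's own slicing or split operation rather than list comprehensions.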

Memory leak

Hi,

My dataset is only around 25 GB in memory, but after training for a few hours the memory usage is already more than 100 GB, and it keeps increasing slowly but steadily. I'm using a custom dataset created with tf.data.Dataset.from_generator. Do you know where the problem could be?

Using 1 GPU (3090), with custom CUDA, a batch size of 4, and no labels. The rest is more or less the original code.
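
One way to narrow down growth like this on the Python side is to compare tracemalloc snapshots taken some steps apart. The sketch below uses only the standard library. The list named retained is a stand-in for a few training steps, and top_growth is a hypothetical helper. Note that tracemalloc only sees allocations made through Python's allocator, so memory held by TensorFlow's native runtime would not show up here.

```python
import tracemalloc


def top_growth(before, after, limit=5):
    """Return the source lines whose allocations changed the most between snapshots."""
    stats = after.compare_to(before, "lineno")
    return stats[:limit]


tracemalloc.start()
before = tracemalloc.take_snapshot()

# Stand-in for a few training steps: deliberately keep some objects alive.
retained = [bytes(1024) for _ in range(1000)]

after = tracemalloc.take_snapshot()
growth = top_growth(before, after)
for stat in growth:
    print(stat)  # file, line number, and size difference
tracemalloc.stop()
```

If the generator passed to from_generator shows up near the top of such a report, it is a reasonable place to look first.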

Training details

Hi,
This is exactly what I was looking for. Thank you.
However, it is not clear to me how training should be run. Could you please help me with that?

NHWC Support

Hi, thanks for the great repo. I am trying to convert the generator model to tflite, and I get this error: "Unexpected value for attribute 'data_format'. Expected 'NHWC'." Do you have an option for NHWC? Or do you have any other idea for converting the generator to a tflite file? Thanks.

Training crash randomly

A few thousand steps after training starts, I get the following error:

ValueError: in user code:

    /content/drive/My Drive/stylegan2-master/train.py:198 dist_d_train_step  *
        per_replica_losses = strategy.run(fn=self.d_train_step, args=(inputs,))
    /content/drive/My Drive/stylegan2-master/train.py:129 d_train_step  *
        d_loss = d_logistic(real_images, self.generator, self.discriminator, self.g_params['z_dim'])
    /content/drive/My Drive/stylegan2-master/losses.py:12 d_logistic  *
        real_scores = discriminator([real_images, labels], training=True)
    /content/drive/My Drive/stylegan2-master/stylegan2_ref/discriminator.py:129 call  *
        x = self.last_block(x)
    /content/drive/My Drive/stylegan2-master/stylegan2_ref/discriminator.py:85 call  *
        x = self.minibatch_std(x)
    /content/drive/My Drive/stylegan2-master/stylegan2_ref/custom_layers.py:158 call  *
        y = tf.reshape(x, [group_size, -1, self.num_new_features, s[1] // self.num_new_features, s[2], s[3]])
    /usr/local/lib/python3.6/dist-packages/tensorflow/python/util/dispatch.py:201 
    [...]  
    raise ValueError(str(e))

ValueError: Dimension size must be evenly divisible by 32768 but is 49152 for '{{node discriminator/4x4/minibatchstd/Reshape}} = Reshape[T=DT_FLOAT, Tshape=DT_INT32](discriminator/8x8/mul, discriminator/4x4/minibatchstd/Reshape/shape)' with input shapes: [6,512,4,4], [6] and with input tensors computed as partial shapes: input[1] = [4,?,1,512,4,4].

The model is learning and I can generate images, but a restart is needed every 10 minutes.
Any idea how to fix it?
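
The numbers in the error message can be reproduced with simple arithmetic: the reshape in the minibatch standard deviation layer only works when the batch size is a multiple of the group size. The sketch below is plain Python and reshape_is_valid is a hypothetical helper. Why a batch of 6 reaches the layer is not stated in the report, so no cause is assumed here.

```python
def reshape_is_valid(batch, channels, height, width, group_size, num_new_features=1):
    """Check whether [batch, C, H, W] can be reshaped to
    [group_size, -1, num_new_features, C // num_new_features, H, W]."""
    total = batch * channels * height * width
    fixed = (group_size * num_new_features
             * (channels // num_new_features) * height * width)
    return total % fixed == 0, total, fixed


# Values from the error message: input [6, 512, 4, 4], target [4, ?, 1, 512, 4, 4].
ok, total, fixed = reshape_is_valid(6, 512, 4, 4, group_size=4)
print(ok, total, fixed)  # False 49152 32768

# A batch that is a multiple of the group size reshapes cleanly.
ok_full, _, _ = reshape_is_valid(8, 512, 4, 4, group_size=4)
```

So any step that feeds the discriminator a batch not divisible by 4 would trigger this error, while batches of 4 or 8 would not.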

Error

Your ResizeConv2D does not work when upsampling, so I can't use this module.

10 errors detected in the compilation of upfirdn_2d.cu

Thank you very much for your great work and contribution!

I am trying to get the code running with CUDA 11.6, TensorFlow 2.8, and cuDNN 8303.

When calling the discriminator through train.py I get several NVCC errors in upfirdn_2d.cu.

Tensorflow version: 2.8.0
2022-11-17 09:21:46.087225: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX AVX2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-11-17 09:21:46.430941: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 13626 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
1 Physical GPUs, 1 Logical GPUs
2022-11-17 09:21:56.034905: I tensorflow/stream_executor/cuda/cuda_blas.cc:1786] TensorFloat-32 will be used for the matrix multiplication. This will only be logged once.
2022-11-17 09:21:56.258153: I tensorflow/stream_executor/cuda/cuda_dnn.cc:368] Loaded cuDNN version 8303

Setting up TensorFlow plugin "upfirdn_2d.cu": PreprocessingC:... 2022-11-17 09:22:04.023226: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1525] Created device /device:GPU:0 with 13626 MB memory: -> device: 0, name: NVIDIA GeForce RTX 3080 Laptop GPU, pci bus id: 0000:01:00.0, compute capability: 8.6
CompilingC:... Failed!
Traceback (most recent call last):
File "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in run_code
exec(code, run_globals)
File "c:\Users\XX.vscode\extensions\ms-python.python-2022.18.2\pythonFiles\lib\python\debugpy_main
.py", line 39, in
cli.main()
File "c:\Users\XX.vscode\extensions\ms-python.python-2022.18.2\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 430, in main
run()
File "c:\Users\XX.vscode\extensions\ms-python.python-2022.18.2\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 284, in run_file
runpy.run_path(target, run_name="main")
File "c:\Users\XX.vscode\extensions\ms-python.python-2022.18.2\pythonFiles\lib\python\debugpy_vendored\pydevd_pydevd_bundle\pydevd_runpy.py", line 321, in run_path
return _run_module_code(code, init_globals, run_name,
File "c:\Users\XX.vscode\extensions\ms-python.python-2022.18.2\pythonFiles\lib\python\debugpy_vendored\pydevd_pydevd_bundle\pydevd_runpy.py", line 135, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "c:\Users\XX.vscode\extensions\ms-python.python-2022.18.2\pythonFiles\lib\python\debugpy_vendored\pydevd_pydevd_bundle\pydevd_runpy.py", line 124, in _run_code
exec(code, run_globals)
File "C:...\train.py", line 438, in
main()
File "C:...\train.py", line 432, in main
trainer = Trainer(training_parameters, name=f'stylegan2-ffhq-{args["train_res"]}x{args["train_res"]}')
File "C:...\train.py", line 63, in init
self.discriminator, self.generator, self.g_clone = initiate_models(self.g_params,
File "C:...\train.py", line 16, in initiate_models
generator = load_generator(g_params=g_params, is_g_clone=False, ckpt_dir=None, custom_cuda=use_custom_cuda)
File "C:...\load_models.py", line 25, in load_generator
_ = generator([test_latent, test_labels])
File "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\site-packages\keras\utils\traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "C:...\stylegan2\generator.py", line 104, in call
image_out = self.synthesis(w_broadcasted)
File "C:...\stylegan2\layers\synthesis_block.py", line 144, in call
x = block([x, w0, w1])
File "C:...\stylegan2\layers\synthesis_block.py", line 83, in call
x = self.conv_0([x, w0])
File "C:...\stylegan2\layers\modulated_conv2d.py", line 71, in call
x = upsample_conv_2d(x, self.in_res, w, self.kernel, self.kernel, self.pad0, self.pad1, self.k)
File "C:...\stylegan2\layers\cuda\upfirdn_2d_v2.py", line 88, in upsample_conv_2d
return _simple_upfirdn_2d(x, new_x_res, k, pad0=pad0, pad1=pad1)
File "C:...\stylegan2\layers\cuda\upfirdn_2d_v2.py", line 106, in _simple_upfirdn_2d
y = upfirdn_2d_cuda(y, k, upx=up, upy=up, downx=down, downy=down, padx0=pad0, padx1=pad1, pady0=pad0, pady1=pad1)
File "C:...\stylegan2\layers\cuda\upfirdn_2d_v2.py", line 146, in upfirdn_2d_cuda
return func(x)
File "C:...\stylegan2\layers\cuda\upfirdn_2d_v2.py", line 138, in func
y = _get_plugin().up_fir_dn2d(x=x, k=kc, upx=upx, upy=upy, downx=downx, downy=downy, padx0=padx0, padx1=padx1, pady0=pady0, pady1=pady1)
File "C:...\stylegan2\layers\cuda\upfirdn_2d_v2.py", line 10, in _get_plugin
return custom_ops.get_plugin(os.path.join(loc, cu_fn))
File "C:...\stylegan2\layers\cuda\custom_ops.py", line 148, in get_plugin
_run_cmd(nvcc_cmd + ' "%s" --shared -o "%s" --keep --keep-dir "%s"' % (cuda_file, tmp_file, tmp_dir))
File "C:...\stylegan2\layers\cuda\custom_ops.py", line 62, in _run_cmd
raise RuntimeError('NVCC returned an error. See below for full command line and output log:\n\n%s\n\n%s' % (cmd, output))
RuntimeError: Exception encountered when calling layer "conv_0" (type ModulatedConv2D).

NVCC returned an error. See below for full command line and output log:

nvcc --std=c++11 -DNDEBUG "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\python_pywrap_tensorflow_internal.lib" --gpu-architecture=sm_86 --use_fast_math --disable-warnings --include-path "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\include" --include-path "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\include\external\protobuf_archive\src" --include-path "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\include\external\com_google_absl" --include-path "C:\Users\XX\AppData\Local\Programs\Python\Python310\lib\site-packages\tensorflow\include\external\eigen_archive" --compiler-bindir "C:/Program Files/Microsoft Visual Studio/2022/Community/VC/Tools/MSVC/14.31.31103/bin/Hostx64/x64" 2>&1 "C:...\stylegan2\layers\cuda\upfirdn_2d.cu" --shared -o "C:\Users\XX\AppData\Local\Temp\tmp7cgcj6sd\upfirdn_2d_tmp.dll" --keep --keep-dir "C:\Users\XX\AppData\Local\Temp\tmp7cgcj6sd"

C:...\stylegan2\layers\cuda\upfirdn_2d.cu(310): error: expected an expression
C:...\stylegan2\layers\cuda\upfirdn_2d.cu(310): error: no instance of constructor "tensorflow::register_op::OpDefBuilderWrapper::OpDefBuilderWrapper" matches the argument list
argument types are: (const char [10], __nv_bool)
C:...\stylegan2\layers\cuda\upfirdn_2d.cu(323): error: expected an expression
C:...\stylegan2\layers\cuda\upfirdn_2d.cu(323): error: expected an expression
C:...\stylegan2\layers\cuda\upfirdn_2d.cu(323): error: expected a type specifier
C:...\stylegan2\layers\cuda\upfirdn_2d.cu(323): error: expected an expression
C:...\stylegan2\layers\cuda\upfirdn_2d.cu(324): error: expected an expression
C:...\stylegan2\layers\cuda\upfirdn_2d.cu(324): error: expected an expression
C:...\stylegan2\layers\cuda\upfirdn_2d.cu(324): error: expected a type specifier
C:...\stylegan2\layers\cuda\upfirdn_2d.cu(324): error: expected an expression

10 errors detected in the compilation of "w:/Entwicklung/300_Neural_Network/331_StyleGAN_Keras/stylegan2/layers/cuda/upfirdn_2d.cu".
nvcc warning : The -std=c++11 flag is not supported with the configured host compiler. Flag will be ignored.
_pywrap_tensorflow_internal.lib
upfirdn_2d.cu

Call arguments received:
• inputs=['tf.Tensor(shape=(1, 512, 4, 4), dtype=float32)', 'tf.Tensor(shape=(1, 512), dtype=float32)']
• training=None
• mask=None

Do you have any idea how to fix it?

Please update readme

Hello,

This is a very useful repo. Can you please update readme so one can figure out how to train on custom datasets?

Thank you,
Siavash

Training on custom dataset failed

Hello, I was trying to run the code with the following command:

python train.py --tfrecord_dir=../datasets/butterfly-dataset --train_res=1024

and got the following error message:

2020-09-30 19:31:44.655206: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at conv_grad_input_ops.cc:1103 : Resource exhausted: OOM when allocating tensor with shape[4,64,511,511] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
Traceback (most recent call last):
File "train.py", line 438, in
main()
File "train.py", line 433, in main
trainer.train(dist_dataset, strategy)
File "train.py", line 261, in train
d_loss = dist_d_train_step((real_images, ))
File "/media/chembiodep/Storage/GREAT/penv38/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 780, in call
result = self._call(*args, **kwds)
File "/media/chembiodep/Storage/GREAT/penv38/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
return self._stateless_fn(*args, **kwds)
File "/media/chembiodep/Storage/GREAT/penv38/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2829, in call
return graph_function._filtered_call(args, kwargs) # pylint: disable=protected-access
File "/media/chembiodep/Storage/GREAT/penv38/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1843, in _filtered_call
return self._call_flat(
File "/media/chembiodep/Storage/GREAT/penv38/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1923, in _call_flat
return self._build_call_outputs(self._inference_function.call(
File "/media/chembiodep/Storage/GREAT/penv38/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 545, in call
outputs = execute.execute(
File "/media/chembiodep/Storage/GREAT/penv38/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InternalError: 3 root error(s) found.
(0) Internal: cudaErrorNoKernelImageForDevice
[[node replica_1/discriminator/1024x1024/skip/UpFirDn2D (defined at :98) ]]
[[mul/_222]]
(1) Internal: cudaErrorNoKernelImageForDevice
[[node replica_1/discriminator/1024x1024/skip/UpFirDn2D (defined at :98) ]]
(2) Internal: cudaErrorNoKernelImageForDevice
[[node replica_1/discriminator/1024x1024/skip/UpFirDn2D (defined at :98) ]]
[[AddN_94/_234]]
0 successful operations.
0 derived errors ignored. [Op:__inference_dist_d_train_step_61417]

Errors may have originated from an input operation.
Input Source operations connected to node replica_1/discriminator/1024x1024/skip/UpFirDn2D:
replica_1/discriminator/1024x1024/skip/Reshape (defined at /media/chembiodep/Storage/GREAT/butterfly/stylegan2-tf-2.x/stylegan2/layers/cuda/upfirdn_2d_v2.py:105)
replica_1/discriminator/1024x1024/skip/Const (defined at /media/chembiodep/Storage/GREAT/butterfly/stylegan2-tf-2.x/stylegan2/layers/cuda/upfirdn_2d_v2.py:129)

Input Source operations connected to node replica_1/discriminator/1024x1024/skip/UpFirDn2D:
replica_1/discriminator/1024x1024/skip/Reshape (defined at /media/chembiodep/Storage/GREAT/butterfly/stylegan2-tf-2.x/stylegan2/layers/cuda/upfirdn_2d_v2.py:105)
replica_1/discriminator/1024x1024/skip/Const (defined at /media/chembiodep/Storage/GREAT/butterfly/stylegan2-tf-2.x/stylegan2/layers/cuda/upfirdn_2d_v2.py:129)

Input Source operations connected to node replica_1/discriminator/1024x1024/skip/UpFirDn2D:
replica_1/discriminator/1024x1024/skip/Reshape (defined at /media/chembiodep/Storage/GREAT/butterfly/stylegan2-tf-2.x/stylegan2/layers/cuda/upfirdn_2d_v2.py:105)
replica_1/discriminator/1024x1024/skip/Const (defined at /media/chembiodep/Storage/GREAT/butterfly/stylegan2-tf-2.x/stylegan2/layers/cuda/upfirdn_2d_v2.py:129)

Function call stack:
dist_d_train_step -> dist_d_train_step -> dist_d_train_step

Here is the file structure:

butterfly/
├── datasets/
│   └── butterfly-dataset
│       └── ...
└── stylegan2-tf-2.x
    ├── train.py
    └── ...

Spec:
Quadro P6000
TITAN V
CUDA version: 11.1
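
As a side note on the OOM warning at the top of the log, the size of the tensor it names can be worked out directly. The sketch below is plain arithmetic, assuming 4 bytes per float32 element; tensor_bytes is a hypothetical helper. It concerns only the OOM line and says nothing about the cudaErrorNoKernelImageForDevice failure that follows it.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Return the memory needed for a dense tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


# Shape from the log: [4, 64, 511, 511], type float (assumed to be float32).
size = tensor_bytes((4, 64, 511, 511))
size_mib = size / 2**20
print(size, round(size_mib, 1))  # 267387904 255.0
```

A single activation of this shape therefore takes about 255 MiB, and training at 1024x1024 holds many tensors of similar size at once.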

convert official weights

Dear Moono,

Thank you for sharing this great implementation. I really appreciate it.

If I want to convert the weights from TF1 to TF2, I should download the weights from the StyleGAN2 website (stylegan2-ffhq-config-f.pkl) and place them under the official-pretrained folder. Is this correct?

Then I ran inference_from_official_weights.py. It output:

DataLossError: Unable to open table file ./official-pretrained/stylegan2-ffhq-config-f.pkl: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator?

Could you tell me how to solve this issue?

Thank you again for your help.

Best Wishes,

Alex
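
The error text indicates that the .pkl file was opened as a TensorFlow checkpoint table. The official StyleGAN2 .pkl files are Python pickles, which is a different format, so a checkpoint restore pointed at one would be expected to fail in this way. The sketch below shows a simple heuristic for telling the two apart, using only the standard library on a throwaway file. The name looks_like_pickle is hypothetical, and how this repo expects the weights to be converted is not covered here.

```python
import os
import pickle
import tempfile


def looks_like_pickle(path):
    """Heuristic: pickles written with protocol 2 or newer start with byte 0x80."""
    with open(path, "rb") as f:
        return f.read(1) == b"\x80"


# Self-contained demo on a temporary file; the real .pkl is not touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "demo.pkl")
    with open(demo_path, "wb") as f:
        pickle.dump({"weights": [0.1, 0.2]}, f, protocol=pickle.HIGHEST_PROTOCOL)
    is_pickle = looks_like_pickle(demo_path)
```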
