naturomics / dlf
Code for reproducing results in "Generative Model with Dynamic Linear Flow"
Home Page: https://arxiv.org/abs/1905.03239
Hello authors,
Thank you for your paper! Do you have any idea as to when the pretrained models will be released? It would be great to hear an approximate timeframe if possible.
Could you double-check that your small ImageNet datasets are the same as http://image-net.org/small/train_32x32.tar and http://image-net.org/small/valid_32x32.tar? As far as I know, different preprocessing of ImageNet can greatly affect the likelihood you get. For example, using the ImageNet 32x32 dataset from https://patrykchrabaszcz.github.io/Imagenet32/ easily yields a bpd of around 3.80 for flow models. It is strange that your model has such a big advantage on ImageNet but not on CIFAR-10.
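For reference, the bpd numbers being compared are just the per-image negative log-likelihood in nats divided by the number of dimensions times ln 2. A minimal sketch of the conversion (the function name is mine, not from the DLF code):

```python
import math

def bits_per_dim(nll_nats, num_dims):
    """Convert a per-image negative log-likelihood (in nats) to bits/dim."""
    return nll_nats / (num_dims * math.log(2))

# A 32x32x3 image has 3072 dimensions; an NLL of 3072 * ln(2) nats
# corresponds to exactly 1.0 bits/dim.
print(bits_per_dim(3072 * math.log(2), 32 * 32 * 3))  # → 1.0
```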
Dear @naturomics ,
Do you think DLF would work well when applied to TTS models? For example, PortaSpeech has achieved high-quality output using a simple VP-Flow and affine coupling. I'm wondering whether DLF could bring its benefits to this kind of model. Would love to see your comments!
Thanks,
Max
I am very interested in your paper. It would help me a lot if you could give more training details.
How many GPUs did you use for training?
How long did it take for one epoch?
I followed the settings you give for training ImageNet 32x32, but it takes about 5 hours per epoch on a single 2080Ti GPU.
By the way, you claim the results are obtained in 50 epochs and that your model is more than 10 times as efficient as Glow. However, your definition of an epoch differs from Glow's.
More specifically, in Glow an epoch covers n_train images (default 50000), whereas, if I understand correctly, in your paper one epoch means processing all images in the training set. Taking ImageNet as an example, one epoch then means 1.28M images are processed.
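To make the gap between the two definitions concrete, a quick back-of-the-envelope comparison (my arithmetic, using the 50000 and 1.28M figures above, not anything from the DLF code):

```python
# Glow-style epoch vs. full-pass epoch (rough arithmetic)
glow_epoch_images = 50_000     # Glow's n_train default
full_pass_images = 1_280_000   # ~1.28M ImageNet training images
ratio = full_pass_images / glow_epoch_images
print(ratio)                   # → 25.6
# So 50 full-pass epochs see as much data as 50 * 25.6 = 1280 Glow-style epochs.
```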
How do you evaluate the efficiency of your model?
Thanks a lot.
Greetings, and thanks for the nice work. I attempted to run the unconditional MNIST example as described in the README, but I hit the error: Incompatible shapes: [256] vs. [96].
In more detail, when I run:
python main.py \
--problem mnist \
--results_dir results/mnist_noCond \
--num_levels 2 \
--width 128 \
--batch_size 256
I obtain:
Number of trainable parameters: 1828648
Train from scratch
epoch, step, loss, bits_x, bits_y, l2_loss, speed(samples/sec)
0.000, 248.73601
Incompatible shapes: [256] vs. [96]
[[node flow_0/block_0/dynamic_linear/add_1 (defined at /scratch0/pepope/DLF/layers.py:100) = Add[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](flow_0/block_0/invconv/add, flow_0/block_0/dynamic_linear/Sum)]]
[[{{node loss/total_loss/_315}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_85349_loss/total_loss", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Caused by op 'flow_0/block_0/dynamic_linear/add_1', defined at:
File "main.py", line 343, in <module>
main(hps=get_arguments())
File "main.py", line 244, in main
train(model, dataloader, sess, hps)
File "main.py", line 99, in train
model.train(images, labels)
File "/scratch0/pepope/DLF/model.py", line 149, in train
condition=condition if condition is None else condition[start:end])
File "/scratch0/pepope/DLF/model.py", line 233, in _single_tower
self.encode(inputs, labels, condition)
File "/scratch0/pepope/DLF/model.py", line 65, in encode
z, objective, eps = codec(inputs, cond=condition, objective=objective, hps=self.hps, reverse=False)
File "/scratch0/pepope/DLF/model.py", line 22, in codec
hps=hps, name="flow_%s" % str(level), reuse=reuse)
File "/scratch0/pepope/DLF/layers.py", line 28, in revnet2d
logdet=logdet, hps=hps, reverse=reverse)
File "/scratch0/pepope/DLF/layers.py", line 100, in revnet2d_step
logdet += obj
File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 866, in binary_op_wrapper
return func(x, y, name=name)
File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 301, in add
"Add", x=x, y=y, name=name)
File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
op_def=op_def)
File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
return func(*args, **kwargs)
File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
op_def=op_def)
File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
self._traceback = tf_stack.extract_stack()
InvalidArgumentError (see above for traceback): Incompatible shapes: [256] vs. [96]
[[node flow_0/block_0/dynamic_linear/add_1 (defined at /scratch0/pepope/DLF/layers.py:100) = Add[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](flow_0/block_0/invconv/add, flow_0/block_0/dynamic_linear/Sum)]]
[[{{node loss/total_loss/_315}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_85349_loss/total_loss", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
It appears to be a size mismatch occurring at
File "/scratch0/pepope/DLF/layers.py", line 100, in revnet2d_step
logdet += obj
Does it work for you? I'm running tensorflow-gpu==1.12.0, installed with pip in a conda virtual environment.
In the top layer, when ycond=True, the prior distribution uses h = conv(0) + embedding as the mean and std.
It seems that the conv layer is unnecessary.
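To illustrate the point with a toy sketch (my own example, not the repo's code): a convolution applied to an all-zero input outputs only its bias at every position, so conv(0) + embedding collapses to a learned bias plus the embedding.

```python
# A 1x1 "conv" applied to a zero pixel, in pure Python: output is the bias.
W = [[0.5, -1.0], [2.0, 0.25], [-0.75, 1.5]]  # kernel: in_ch=3 -> out_ch=2
b = [0.1, -0.2]                               # conv bias
x = [0.0, 0.0, 0.0]                           # all-zero input, 3 channels
out = [sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j] for j in range(2)]
print(out)  # → [0.1, -0.2], exactly the bias
```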
Thanks for your nice paper and implementation. I read your paper and have a question regarding the channel number; I'd appreciate it if you could help.
If I understand correctly, the transformations in a normalizing flow should be bijective; only then does the change-of-variables formula work. A bijective transform does not change the dimension, which means the total number of elements in a tensor cannot change (except that the split operation drops half of the tensor on purpose).
In this case, since you use the squeeze operation, each spatial dimension is halved while the channel count increases 4x, so the total number of elements in the tensor does not change. But then how can you arbitrarily choose 512 (128) channels in your experiments? The channel count at level L should be 3x4^(L-1), and 512 % 3 != 0. I might have misunderstood your model; please help me clarify. Thanks a lot.
In Figure 2 of your paper, you show that K=2 is the better choice of K. Is there any difference between your model and Glow when K=2? And when K=4 or 6, what is the result of the inverse dynamic linear transformation?