dlf's People

Contributors

naturomics

dlf's Issues

Pretrained models

Hello authors,

Thank you for your paper! Do you have any idea as to when the pretrained models will be released? It would be great to hear an approximate timeframe if possible.

ImageNet dataset

Could you double-check that your small ImageNet datasets are the same as http://image-net.org/small/train_32x32.tar and http://image-net.org/small/valid_32x32.tar? As far as I know, different preprocessing of ImageNet can greatly affect the likelihood you get. For example, using the ImageNet 32x32 dataset from https://patrykchrabaszcz.github.io/Imagenet32/ easily yields a bpd of around 3.80 for flow models. It is strange that your model has such a big advantage on ImageNet but not on CIFAR-10.
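
For reference, bpd is just the negative log-likelihood rescaled by the number of dimensions, so numbers computed on differently preprocessed datasets are not comparable even though the metric looks the same. The standard conversion (my own snippet, not from the repo):

import numpy as np

def bits_per_dim(nll_nats, image_shape=(32, 32, 3)):
    # Per-image negative log-likelihood in nats, divided by the number
    # of dimensions and ln(2), gives bits per dimension.
    num_dims = np.prod(image_shape)   # 3072 for a 32x32x3 image
    return nll_nats / (num_dims * np.log(2))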

DLF4TTS

Dear @naturomics,

Do you think DLF would work well when applied to TTS models? For example, PortaSpeech has successfully produced high-quality output using a simple VP-Flow and affine coupling. I'm wondering whether DLF could contribute its benefits to this kind of model. Would love to see your comments!

Thanks,

Max

More training details

I am very interested in your paper. It would help me a lot if you could share more training details.
How many GPUs did you use for training?
How long did one epoch take?
I followed the settings you give for training ImageNet 32x32, but it takes about 5 hours per epoch on a single 2080 Ti GPU.
By the way, you claim the results are obtained in 50 epochs and that your model is more than 10 times as efficient as Glow. However, the epoch you define is different from the one Glow uses.
More specifically, in Glow every epoch covers n_train samples (default 50000). On the other hand, if I understand correctly, in your paper one epoch means processing all images in the training set. Taking ImageNet as an example, one epoch means 1.28M images are processed.
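
To make the gap concrete, a quick back-of-the-envelope comparison (my own numbers):

glow_epoch = 50000             # Glow: n_train samples per "epoch" (default)
full_pass = 1280000            # one full pass over the ImageNet 32x32 train set
print(full_pass / glow_epoch)  # 25.6: one full-pass epoch ~= 25.6 Glow epochs
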
How do you evaluate the efficiency of your model?
Thanks a lot.

MNIST example: `Incompatible shapes: [256] vs. [96]`

Greetings, thanks for the nice work. I attempted to run the unconditional MNIST example as described in the README, but I hit the error `Incompatible shapes: [256] vs. [96]`.

In more detail, when I run:

python main.py \
  --problem mnist \
  --results_dir results/mnist_noCond \
  --num_levels 2 \
  --width 128 \
  --batch_size 256

I obtain:

Number of trainable parameters: 1828648
Train from scratch
 epoch, step, loss, bits_x, bits_y, l2_loss, speed(samples/sec)
0.000, 248.73601
Incompatible shapes: [256] vs. [96]
         [[node flow_0/block_0/dynamic_linear/add_1 (defined at /scratch0/pepope/DLF/layers.py:100)  = Add[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](flow_0/block_0/invconv/add, flow_0/block_0/dynamic_linear/Sum)]]
         [[{{node loss/total_loss/_315}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_85349_loss/total_loss", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

Caused by op 'flow_0/block_0/dynamic_linear/add_1', defined at:
  File "main.py", line 343, in <module>
    main(hps=get_arguments())
  File "main.py", line 244, in main
    train(model, dataloader, sess, hps)
  File "main.py", line 99, in train
    model.train(images, labels)
  File "/scratch0/pepope/DLF/model.py", line 149, in train
    condition=condition if condition is None else condition[start:end])
  File "/scratch0/pepope/DLF/model.py", line 233, in _single_tower
    self.encode(inputs, labels, condition)
  File "/scratch0/pepope/DLF/model.py", line 65, in encode
    z, objective, eps = codec(inputs, cond=condition, objective=objective, hps=self.hps, reverse=False)
  File "/scratch0/pepope/DLF/model.py", line 22, in codec
    hps=hps, name="flow_%s" % str(level), reuse=reuse)
  File "/scratch0/pepope/DLF/layers.py", line 28, in revnet2d
    logdet=logdet, hps=hps, reverse=reverse)
  File "/scratch0/pepope/DLF/layers.py", line 100, in revnet2d_step
    logdet += obj
  File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/ops/math_ops.py", line 866, in binary_op_wrapper
    return func(x, y, name=name)
  File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/ops/gen_math_ops.py", line 301, in add
    "Add", x=x, y=y, name=name)
  File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/util/deprecation.py", line 488, in new_func
    return func(*args, **kwargs)
  File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 3274, in create_op
    op_def=op_def)
  File "/scratch0/pepope/envs/DLF/lib/python3.6/site-packages/tensorflow/python/framework/ops.py", line 1770, in __init__
    self._traceback = tf_stack.extract_stack()

InvalidArgumentError (see above for traceback): Incompatible shapes: [256] vs. [96]
         [[node flow_0/block_0/dynamic_linear/add_1 (defined at /scratch0/pepope/DLF/layers.py:100)  = Add[T=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:GPU:0"](flow_0/block_0/invconv/add, flow_0/block_0/dynamic_linear/Sum)]]
         [[{{node loss/total_loss/_315}} = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/device:CPU:0", send_device="/job:localhost/replica:0/task:0/device:GPU:0", send_device_incarnation=1, tensor_name="edge_85349_loss/total_loss", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]

It appears to be a size mismatch occurring at

  File "/scratch0/pepope/DLF/layers.py", line 100, in revnet2d_step
    logdet += obj

Does it work for you? I'm running tensorflow-gpu==1.12.0 installed with pip in a conda virtual environment.
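
For what it's worth, 60000 % 256 == 96, so the [96] side looks like the last partial batch of MNIST, while the [256] side was presumably built with the static batch_size flag. A minimal sketch of the mismatch I have in mind (my own guess, not the repo's actual code; the variable names are hypothetical):

import tensorflow as tf  # tensorflow-gpu==1.12.0, as above

# 60000 % 256 == 96: the last MNIST training batch has only 96 samples.
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
logdet_static = tf.zeros([256])              # fixed to the batch_size flag;
                                             # breaks on the 96-sample batch
logdet_dynamic = tf.zeros([tf.shape(x)[0]])  # follows the actual batch size

If that is the cause, either building logdet from the dynamic shape as above or dropping the remainder batch in the input pipeline (e.g. tf.data's batch(..., drop_remainder=True)) should avoid the error.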

about the prior distribution

In the top layer, the prior distribution uses h = conv(0) + embedding as the mean and std in the case of 'ycond=True'.
It seems that the conv layer is unnecessary.
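
To illustrate the point: applied to an all-zeros input, a convolution reduces to its bias term, a learned constant that the embedding could absorb. A minimal sketch (my own, not the repo's code):

import tensorflow as tf  # TF 1.x style, matching this repo

# Every weight multiplies zero, so the output is just the bias broadcast
# over space, i.e. a learned constant offset.
zeros = tf.zeros([1, 4, 4, 8])
h = tf.layers.conv2d(zeros, filters=16, kernel_size=3, padding="same")
# h[0, i, j, :] equals the bias vector at every spatial position (i, j)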

constraints on channel number

Thanks for your nice paper and implementation. I read your paper and have a question regarding the channel number; I would appreciate it if you could help.
If I understand correctly, the transformations in a normalizing flow should be bijective; only then does the change-of-variables formula work. A bijective transform does not change the dimension, which means the total number of elements in a tensor cannot change (except that the split operation drops half of the tensor on purpose).
Since you are using the squeeze operation, where the spatial dimensions are halved and the channel count increases by 4, the total number of elements in a tensor does not change. But how can you arbitrarily choose 512 (128) channels in your experiments? The channel number at level L should be 3x4^(L-1), and 512 % 3 != 0. I might be misunderstanding your model; please help me clarify. Thanks a lot.
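
To spell out the arithmetic I have in mind, here is a standard squeeze (my own sketch, not your implementation):

import numpy as np

# Squeeze halves H and W and multiplies channels by 4, so the total
# number of elements (and hence bijectivity) is preserved.
x = np.zeros((1, 32, 32, 3))                  # NHWC input
n, h, w, c = x.shape
y = (x.reshape(n, h // 2, 2, w // 2, 2, c)
      .transpose(0, 1, 3, 5, 2, 4)
      .reshape(n, h // 2, w // 2, c * 4))
print(y.shape)  # (1, 16, 16, 12): 32*32*3 == 16*16*12

Starting from 3 channels, every count reachable this way is divisible by 3, which is why 512 and 128 puzzle me.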

Partitions K

In Figure 2 of your paper, you show that K=2 is the better choice of K. So is there any difference between your model and Glow when K=2? And when K=4 or 6, what is the result of the inverse dynamic linear transformation?
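
For context, my reading of the dynamic linear transformation is roughly the following sketch, where param_net is a hypothetical stand-in for the learned network and the first partition is assumed to pass through unchanged, as in coupling layers (please correct me if this is wrong):

import numpy as np

def dynamic_linear_forward(parts, param_net):
    # parts: the input split into K partitions along the channel axis.
    out = [parts[0]]                  # first partition passes through
    for i in range(1, len(parts)):
        # parameters for partition i come from all earlier partitions
        log_scale, shift = param_net(np.concatenate(out, axis=-1), i)
        out.append(parts[i] * np.exp(log_scale) + shift)
    return out

Under this reading, K=2 looks like a single affine coupling step (hence my Glow question), and the inverse must recover the partitions sequentially, in K-1 steps.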
