
Comments (3)

ZhihaoDU avatar ZhihaoDU commented on May 29, 2024

In summary, with different batch sizes, the output codecs should be very similar: the number of differing tokens should be less than 3 for each quantizer. In my case, I tested 10 utterances from the Librispeech test-clean subset with batch sizes of 1, 4 and 8, and the codec outputs were identical. Here are some insights that may help you figure out your problem:

  1. To enable batchified inference, utterances in a mini-batch are padded at the end using numpy's wrap mode. You can find more details at https://github.com/alibaba-damo-academy/FunCodec/blob/master/funcodec/bin/codec_inference.py#L260 and https://github.com/alibaba-damo-academy/FunCodec/blob/master/funcodec/modules/nets_utils.py#L65

  2. To speed up data loading at the inference stage, a multi-worker torch DataLoader is employed. Therefore, if you set num_workers larger than 0 in the encoding_decoding.sh script (the default value is 4), the utterance order of the outputs may differ due to the nondeterministic scheduling of the DataLoader workers. If you want to maintain the utterance order, please set the num_workers parameter to 0.
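The wrap-mode padding mentioned in point 1 can be sketched with plain numpy (the utterance values below are made up for illustration):

```python
import numpy as np

# Two hypothetical utterances of different lengths.
utt_a = np.array([1.0, 2.0, 3.0])
utt_b = np.array([4.0, 5.0, 6.0, 7.0, 8.0])

max_len = max(len(utt_a), len(utt_b))

# mode="wrap" pads the end of each utterance by repeating the signal
# from its own beginning, instead of appending zeros.
batch = np.stack(
    [np.pad(u, (0, max_len - len(u)), mode="wrap") for u in (utt_a, utt_b)]
)
print(batch[0])  # [1. 2. 3. 1. 2.]
```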
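The ordering effect in point 2 is not specific to FunCodec: any multi-worker pipeline that collects results as they complete can reorder its outputs. A minimal stand-alone analogy (not the actual DataLoader code; the worker function is a stand-in for per-utterance inference):

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def encode(utt_id):
    # Stand-in for per-utterance codec inference with variable runtime.
    time.sleep(random.random() * 0.01)
    return utt_id

utt_ids = list(range(8))

# num_workers > 0 analogue: results arrive in completion order,
# which generally differs from submission order.
with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(encode, i) for i in utt_ids]
    parallel_order = [f.result() for f in as_completed(futures)]

# num_workers = 0 analogue: sequential processing preserves the order.
sequential_order = [encode(i) for i in utt_ids]
```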

If your test cases are still very different after checking the points above, please provide a reproducible recipe and I will look into it. Thanks.

from funcodec.

hertz-pj avatar hertz-pj commented on May 29, 2024

The variations in outcomes are not significantly noticeable in terms of quality. I am just keen to understand the reasons behind the differences across batch sizes. From my understanding, proper masking should avoid any inconsistency caused by different batch sizes.


ZhihaoDU avatar ZhihaoDU commented on May 29, 2024

Since there are only convolutions and uni-directional LSTM layers in the VAE-RVQ model, I didn't implement batchified inference with masking; instead, I use proper padding (numpy's wrap mode). I think the different padding lengths may cause the very limited inconsistencies in the ending codes, while the other codes should be identical across batch sizes.
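This causality argument can be checked with a toy example: for a causal convolution, samples padded at the end cannot change the outputs for the original frames, so only the codes in the padded tail can differ. A minimal numpy sketch (the kernel values are made up, and this is not the actual FunCodec model):

```python
import numpy as np

def causal_conv(x, k):
    # Left-pad with zeros so that output[t] depends only on x[<= t].
    xp = np.concatenate([np.zeros(len(k) - 1), x])
    return np.array([xp[t:t + len(k)] @ k[::-1] for t in range(len(x))])

kernel = np.array([0.5, 0.3, 0.2])
utt = np.array([1.0, 2.0, 3.0, 4.0])

# Same utterance, two different end-padding lengths (wrap mode).
y_pad2 = causal_conv(np.pad(utt, (0, 2), mode="wrap"), kernel)
y_pad4 = causal_conv(np.pad(utt, (0, 4), mode="wrap"), kernel)

# The outputs over the original 4 frames are identical regardless of
# how much padding was appended; only the tail outputs can differ.
assert np.allclose(y_pad2[:4], y_pad4[:4])
```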

