fatchord / fftnet

86.0 86.0 20.0 395 KB

Pytorch Implementation of FFTNet

Jupyter Notebook 99.67% Python 0.33%

fftnet's Issues

Did you use this repo to train a vocoder?

@fatchord Hi, happy to see you again! I'm also working on FFTNet, but in my experiments I cannot get results similar to those on the paper's demo page, mainly for conditional sampling and post-denoising. Have you tried to reproduce their results? Thanks.

Auxiliary Input to Network

Hi, I'm wondering if you could help me. I'm trying to build a WaveNet-style vocoder in TensorFlow which uses acoustic features as auxiliary input, similar to FFTNet, but I'm struggling to understand how the auxiliary input is fed to the network. Is it processed in parallel with the sample values (two parallel layers), with the outputs combined at a later layer? If you can point me to a textbook or article on auxiliary input / conditioning networks I would be eternally grateful. I've looked many times and I can't find anything that gives a general understanding of this.
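For readers with the same question, here is a minimal NumPy sketch of one way the conditioning can be wired in, based on my reading of the FFTNet paper: within each layer the auxiliary features `h` are split into halves just like the signal `x`, each half gets its own 1x1 convolution, and everything is summed before the nonlinearity, i.e. `z = W_L*x_L + W_R*x_R + V_L*h_L + V_R*h_R`. The function names, the dictionary of weights, and the channel sizes are assumptions made for illustration; this is not code from this repo.

```python
import numpy as np

def conv1x1(weight, x):
    # A 1x1 convolution is a matrix multiply applied independently at each time step.
    # weight: (out_channels, in_channels), x: (in_channels, time)
    return weight @ x

def conditioned_layer(x, h, p):
    """One FFTNet-style layer with auxiliary input (illustrative sketch).

    x: (channels, T) signal features
    h: (aux_channels, T) auxiliary features, already aligned sample-by-sample with x
    p: dict of weight matrices (hypothetical names)
    """
    half = x.shape[1] // 2
    x_l, x_r = x[:, :half], x[:, half:2 * half]
    h_l, h_r = h[:, :half], h[:, half:2 * half]
    # Signal path and conditioning path are summed inside the layer,
    # before the nonlinearity, rather than merged at a later layer.
    z = conv1x1(p["W_L"], x_l) + conv1x1(p["W_R"], x_r)
    z = z + conv1x1(p["V_L"], h_l) + conv1x1(p["V_R"], h_r)
    z = np.maximum(z, 0.0)                           # ReLU
    return np.maximum(conv1x1(p["W_O"], z), 0.0)     # 1x1 conv + ReLU

rng = np.random.default_rng(0)
channels, aux_channels, T = 4, 3, 8
p = {
    "W_L": rng.normal(size=(channels, channels)),
    "W_R": rng.normal(size=(channels, channels)),
    "V_L": rng.normal(size=(channels, aux_channels)),
    "V_R": rng.normal(size=(channels, aux_channels)),
    "W_O": rng.normal(size=(channels, channels)),
}
x = rng.normal(size=(channels, T))
h = rng.normal(size=(aux_channels, T))
out = conditioned_layer(x, h, p)
print(out.shape)  # (4, 4): each layer halves the time axis
```

The same pattern should carry over to TensorFlow by replacing the matrix multiplies with 1x1 `Conv1D` layers; the key point is only that the conditioning term is added to the pre-activation of every layer.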

Specific audio generation

Hi, is it possible to use this model, conditioned on the first few samples, to generate a specific audio clip? Say I want to generate audio1.wav: after training on my dataset, would I be able to produce that audio given its first N samples?

Thank you for your time
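To make the idea in this question concrete, below is a small sketch of what "conditioning on the first N samples" looks like mechanically for an autoregressive model: the seed fills the initial context and the model extends it one sample at a time. `predict_next` is a hypothetical stand-in for a trained model, and `receptive_field` is an assumed parameter; neither is taken from this repo. Note that if the next sample is drawn randomly from the predicted distribution, two runs from the same seed will generally differ, so a seed alone does not pin down one specific recording.

```python
def generate_from_seed(seed, predict_next, total_len, receptive_field):
    """Autoregressively extend `seed` until it contains `total_len` samples.

    seed:            list of already-known samples (the "first N samples")
    predict_next:    callable mapping a context window to the next sample
                     (hypothetical stand-in for a trained model)
    receptive_field: how many past samples the model is allowed to see
    """
    out = list(seed)
    while len(out) < total_len:
        context = out[-receptive_field:]   # only the most recent samples matter
        out.append(predict_next(context))
    return out

# Toy deterministic "model": next class index is previous + 1, wrapping at 256.
toy_model = lambda context: (context[-1] + 1) % 256

print(generate_from_seed([0, 1, 2], toy_model, total_len=6, receptive_field=2))
# [0, 1, 2, 3, 4, 5]
```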

IndexError: Target -9 is out of bounds.

I tried running on a different audio file but I keep getting this error. Why?

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-81-d644dae333d9> in <module>()
      3 #data = torch.zeros((50000,))
      4 print(data.shape)
----> 5 train(data, model, optimizer, batch_size=2, seq_len=5000, lr=1e-5, steps=10_000)

~/dev/FFTNet/fftnet.py in train(data, model, optimizer, batch_size, seq_len, lr, steps)
    138         print(y_hat[:, :, 1:])
    139         print(y.unsqueeze(-1))
--> 140         loss = criterion(y_hat[:, :, 1:], y.unsqueeze(-1))
    141 
    142         running_loss += loss.item()

/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py in __call__(self, *input, **kwargs)
    548             result = self._slow_forward(*input, **kwargs)
    549         else:
--> 550             result = self.forward(*input, **kwargs)
    551         for hook in self._forward_hooks.values():
    552             hook_result = hook(self, input, result)

/usr/local/lib/python3.7/site-packages/torch/nn/modules/loss.py in forward(self, input, target)
    203 
    204     def forward(self, input, target):
--> 205         return F.nll_loss(input, target, weight=self.weight, ignore_index=self.ignore_index, reduction=self.reduction)
    206 
    207 

/usr/local/lib/python3.7/site-packages/torch/nn/functional.py in nll_loss(input, target, weight, size_average, ignore_index, reduce, reduction)
   2115         ret = torch._C._nn.nll_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2116     elif dim == 4:
-> 2117         ret = torch._C._nn.nll_loss2d(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
   2118     else:
   2119         # dim == 3 or dim > 4

IndexError: Target -9 is out of bounds.
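A possible reading of this traceback: `F.nll_loss` expects every target to be a class index in `[0, C-1]`, where `C` is the size of the class dimension of the input, so a target of -9 suggests the new audio file was passed in as raw signed sample values rather than quantized class labels. The sketch below shows one simple way to check and fix that, assuming 16-bit PCM input and a model with 256 output classes; both are assumptions, and the helper names are hypothetical, since the thread does not show how `data` was prepared.

```python
import numpy as np

def to_class_indices(samples, bits=8):
    """Map signed 16-bit PCM samples to class labels in [0, 2**bits - 1]."""
    samples = np.asarray(samples, dtype=np.int64)
    shifted = samples + 2 ** 15        # [-32768, 32767] -> [0, 65535]
    return shifted >> (16 - bits)      # keep only the top `bits` bits

def check_targets(labels, num_classes):
    """Fail early, with a readable message, if any label is not a valid class."""
    lo, hi = int(np.min(labels)), int(np.max(labels))
    if lo < 0 or hi >= num_classes:
        raise ValueError(
            f"targets span [{lo}, {hi}] but must lie in [0, {num_classes - 1}]"
        )

raw = np.array([-32768, -9, 0, 32767])   # raw signed samples, including the -9
labels = to_class_indices(raw, bits=8)
print(labels)                            # [  0 127 128 255]
check_targets(labels, num_classes=256)   # passes; check_targets(raw, 256) would raise
```

Linear quantization is used here only because it is the shortest thing that works; mu-law companding is a common alternative for speech. Whichever is chosen, the same mapping has to be applied to every file used for training.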

How can I help?

Hey There!

You've talked a bit about working on a new vocoder algorithm. You have also worked on WaveRNN and FFTNet. Would love to assist you and contribute.

Who I am:

Here is one of my latest projects in the NLP space: https://github.com/PetrochukM/PyTorch-NLP
I do research at the Allen Institute for Artificial Intelligence (AI2); we're one of the foremost research labs in NLP and Vision.
