A Torch implementation of SampleRNN: An Unconditional End-to-End Neural Audio Generation Model.
Sample output can be heard here. Feel free to submit links to any interesting samples you generate as a pull request.
The following packages are required to run SampleRNN_torch:
- nn
- cunn
- cudnn
- rnn
- optim
- audio
- xlua
- gnuplot
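Assuming Torch and LuaRocks are already installed (and that each package above is available as a rock of the same name; `cunn` and `cudnn` additionally require a working CUDA setup), the dependencies can be installed with `luarocks`. The sketch below prints the install commands as a dry run rather than executing them:

```shell
# Print (dry-run) the luarocks install command for each required package.
# Pipe the output to `sh` to actually run the installs.
for rock in nn cunn cudnn rnn optim audio xlua gnuplot; do
  echo "luarocks install $rock"
done
```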
To retrieve and prepare the piano dataset, as used in the reference implementation, run:

`cd datasets/piano/`
`./create_piano_dataset.sh`
The violin dataset preparation scripts are located in `datasets/violin/`.
Custom datasets may be created by using `scripts/generate_dataset.lua` to slice multiple audio files into segments for training. The source audio must be placed in `datasets/[dataset]/data/`.
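As an illustration, preparing a hypothetical dataset named `guitar` (the dataset name and file name below are placeholders) just means creating the expected directory and copying your source audio into it before running the slicing script:

```shell
# Create the directory layout the training scripts expect.
mkdir -p datasets/guitar/data

# Copy your own recordings here; an empty placeholder file stands in
# for a real WAV file in this sketch.
touch datasets/guitar/data/take1.wav

ls datasets/guitar/data   # → take1.wav
```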
To start a training session, run `th train.lua -dataset piano`. To view a description of all accepted arguments, run `th train.lua -help`.
To view the progress of training, run `th generate_plots`; the loss and gradient norm curves will be saved in `sessions/[session]/plots/`.
By default, samples are generated at the end of every training epoch, but they can also be generated separately using `th train.lua -generate_samples` with the `session` parameter to specify the model.
Multiple samples are generated in batch mode for efficiency; however, generating a single audio sample is faster with `th fast_sample.lua`. See `-help` for a description of the arguments.
A pretrained model of the piano dataset is available here. Download it, copy it into your `sessions/` directory, and extract it in place.
More models will be uploaded soon.