Time spent on repeated kernel construction

Julius, fast PyTorch based DSP for audio and 1D signals

Julius contains different Digital Signal Processing algorithms implemented with PyTorch, so that they are differentiable and available on CUDA. Note that all the modules implemented here can be used with TorchScript.

For now, I have implemented:

julius.resample: fast sinc resampling.
julius.fftconv: FFT based convolutions.
julius.lowpass: FIR low pass filter banks.
julius.filters: FIR high pass and band pass filters.
julius.bands: Decomposition of a waveform signal over mel-scale frequency bands.

Alongside these, you might find useful utilities in:

julius.core: DSP related functions.
julius.utils: Generic utilities.

News

19/09/2022: julius 0.2.7 released: fixed ONNX compat (thanks @iver56). I know I missed the 0.2.6 one...
28/07/2021: julius 0.2.5 released: support for setting a custom output length when resampling.
22/06/2021: julius 0.2.4 released: adding high pass and band pass filters. Extra linting and type checking of the code. New unfold implementation, up to 6x faster FFT convolutions and more efficient memory usage.
26/01/2021: julius 0.2.2 released: fixing normalization of filters in lowpass and resample to avoid leaking very low frequencies. Switch from zero padding to replicate padding (uses first/last value instead of 0) to avoid discontinuities with strong artifacts.
20/01/2021: julius implementation of resampling is now officially part of Torchaudio.

Installation

julius requires Python 3.6 or later. To install:

pip3 install -U julius

Usage

See the Julius documentation for the full usage of Julius. Below are a few examples to get you started quickly:

import julius
import torch

signal = torch.randn(6, 4, 1024)
# Resample from a sample rate of 100 to 70. The old and new sample rates must be integers,
# and resampling will be fast if they form an irreducible fraction with small numerator
# and denominator (here 10 and 7). Any shape is supported, last dim is time.
resampled_signal = julius.resample_frac(signal, 100, 70)

# Low pass filter with a `0.1 * sample_rate` cutoff frequency.
low_freqs = julius.lowpass_filter(signal, 0.1)

# Fast convolutions with FFT, useful for large kernels
conv = julius.FFTConv1d(4, 10, 512)
convolved = conv(signal)

# Decomposition over frequency bands in the waveform domain, with `n_bands`
# frequency bands evenly spaced in mel space.
# Input shape can be `[*, T]`, output will be `[n_bands, *, T]`.
bands = julius.split_bands(signal, n_bands=10, sample_rate=100)
# Apply a random gain to each band and sum the bands back: a random equalizer.
random_eq = (torch.rand(10, 1, 1, 1) * bands).sum(0)

Algorithms

Resample

This is an implementation of the sinc resample algorithm by Julius O. Smith. It is the same algorithm as the one used in resampy, but to run efficiently on GPU it is limited to fractional changes of the sample rate. It will be fast if the old and new sample rates are small after dividing them by their GCD. For instance, going from a sample rate of 2000 to 3000 (2 and 3 after removing the GCD) will be extremely fast, while going from 20001 to 30001 will not. Julius resampling is faster than resampy even on CPU, and when running on GPU it makes resampling a completely negligible part of your pipeline (except of course for weird cases like going from a sample rate of 20001 to 30001).
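To get a feel for which rate pairs are cheap, here is a minimal sketch that reduces a pair of sample rates by their GCD. The helper name `reduced_rates` is hypothetical and not part of Julius; it only illustrates the rule of thumb above.

```python
import math


def reduced_rates(old_sr: int, new_sr: int) -> tuple:
    """Divide both sample rates by their GCD (hypothetical helper, not a Julius API)."""
    g = math.gcd(old_sr, new_sr)
    return old_sr // g, new_sr // g


# Small reduced pairs are the fast cases.
fast_pair = reduced_rates(2000, 3000)
usage_pair = reduced_rates(100, 70)
# Coprime rates cannot be reduced, so this pair stays large and resampling is slow.
slow_pair = reduced_rates(20001, 30001)

print(fast_pair, usage_pair, slow_pair)
```

If the reduced pair is still large for your use case, rounding the target rate to a nearby value that shares a large common factor with the source rate is one way to fall back onto the fast path.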

FFTConv1d

Computing convolutions with very large kernels (>= 128) and a stride of 1 can be much faster using FFT. This implements the same API as torch.nn.Conv1d and torch.nn.functional.conv1d but with an FFT backend. Dilation and groups are not supported. FFTConv1d will be faster on CPU even for relatively small tensors (a few dozen channels, kernel size of 128). On CUDA, due to the higher parallelism, regular convolution can be faster in many cases, but for kernel sizes above 128, or for a large number of channels or a large batch size, FFTConv1d will eventually be faster (basically when you no longer have idle cores that can hide the true complexity of the operation).
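The idea behind an FFT backend can be sketched in a few lines. The example below uses NumPy rather than Julius or PyTorch to stay self-contained, and handles a single channel only; the function name `fft_cross_correlate` is hypothetical. It computes the same "valid" cross-correlation that a stride-1 convolution layer without padding would compute, and checks it against the direct computation.

```python
import numpy as np


def fft_cross_correlate(x: np.ndarray, w: np.ndarray) -> np.ndarray:
    """Valid cross-correlation of x with kernel w via FFT (illustrative sketch)."""
    n = x.shape[-1]
    k = w.shape[-1]
    # Multiplying by the conjugate spectrum gives a circular cross-correlation.
    X = np.fft.rfft(x, n=n)
    W = np.fft.rfft(w, n=n)
    full = np.fft.irfft(X * np.conj(W), n=n)
    # The first n - k + 1 samples never wrap around, so they match the direct result.
    return full[..., : n - k + 1]


rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
w = rng.standard_normal(256)

direct = np.correlate(x, w, mode="valid")
via_fft = fft_cross_correlate(x, w)
max_error = np.max(np.abs(direct - via_fft))
print(via_fft.shape, max_error)
```

The direct version costs on the order of `T * K` multiply-adds for a signal of length `T` and kernel of size `K`, while the FFT version costs on the order of `T log T` regardless of `K`, which is why the gap widens as the kernel grows.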

LowPass

Classical Finite Impulse Response windowed sinc lowpass filter. It will use FFT convolutions automatically if the filter size is large enough. This is the basic block from which you can build high pass and band pass filters (see julius.filters).
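As a rough illustration of what a windowed sinc lowpass looks like, here is a NumPy sketch. The Hann window, the `zeros` parameter and the helper names are assumptions made for this example, not a description of the exact Julius code. It also shows one common way to derive a high pass filter from a lowpass one: subtract the lowpass kernel from a unit impulse.

```python
import numpy as np


def lowpass_kernel(cutoff: float, zeros: int = 8) -> np.ndarray:
    """Windowed sinc lowpass FIR kernel; cutoff is a fraction of the sample rate."""
    # Keep roughly `zeros` zero crossings of the sinc on each side (assumed sizing rule).
    half_size = int(zeros / cutoff / 2)
    t = np.arange(-half_size, half_size + 1)
    ideal = 2 * cutoff * np.sinc(2 * cutoff * t)  # ideal lowpass impulse response
    kernel = ideal * np.hanning(2 * half_size + 1)  # taper to limit ripples
    return kernel / kernel.sum()  # normalize so that the gain at 0 Hz is exactly 1


def gain(kernel: np.ndarray, freq: float) -> float:
    """Magnitude of the frequency response at `freq` (fraction of the sample rate)."""
    half_size = len(kernel) // 2
    t = np.arange(-half_size, half_size + 1)
    return float(abs(np.sum(kernel * np.exp(-2j * np.pi * freq * t))))


lp = lowpass_kernel(0.1)

# High pass as "identity minus lowpass": a unit impulse minus the lowpass kernel.
hp = -lp
hp[len(hp) // 2] += 1.0

print(gain(lp, 0.0), gain(lp, 0.4), gain(hp, 0.0), gain(hp, 0.4))
```

A band pass filter can be sketched the same way, as the difference between two lowpass kernels with different cutoffs, provided both are built with the same length.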

Bands

Decomposition of a signal over frequency bands in the waveform domain. This can be useful for instance to perform parametric EQ (see Usage above).
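To make "evenly spaced in mel space" concrete, the sketch below computes band edges using the common HTK-style mel formula. The formula choice, the `fmin` and `fmax` values and the function names are assumptions for illustration; Julius may pick its cutoffs differently. One plausible way to obtain the bands from such edges is to apply a lowpass filter at each edge and take differences between successive outputs.

```python
import math


def hz_to_mel(freq: float) -> float:
    """HTK-style mel scale (assumed here for illustration)."""
    return 2595.0 * math.log10(1.0 + freq / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_edges(n_bands: int, fmin: float, fmax: float) -> list:
    """Return n_bands + 1 edge frequencies in Hz, evenly spaced on the mel scale."""
    low, high = hz_to_mel(fmin), hz_to_mel(fmax)
    step = (high - low) / n_bands
    return [mel_to_hz(low + i * step) for i in range(n_bands + 1)]


edges = mel_band_edges(n_bands=10, fmin=40.0, fmax=8000.0)
widths = [b - a for a, b in zip(edges[:-1], edges[1:])]
print([round(e, 1) for e in edges])
```

The bands are narrow at low frequencies and widen toward high frequencies, which roughly follows how pitch differences are perceived.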

Benchmarks

You can find speed tests (and comparisons to reference implementations) on the benchmark page. The CPU benchmarks are run on a 2020 MacBook Pro with a 2.4 GHz 8-core Intel Core i9 CPU. The GPU benchmarks are run on an Nvidia V100 with 16GB of memory. We also check the validity of our implementations against reference ones like resampy or torch.nn.Conv1d.

Running tests

Clone this repository, then

pip3 install '.[dev]'
python3 tests.py

To run the benchmarks:

pip3 install '.[dev]'
python3 -m bench.gen

License

julius is released under the MIT license.

Thanks

This package is named in honor of Julius O. Smith, whose books and website were a gold mine of information for me to learn about DSP. Go check out his website if you want to learn more about DSP.

	window = torch.cos(t/self.zeros/2)**2
	kernel = sinc(t) * window
