cjhanks / cmm

0.0 0.0 0.0 72 KB

CUDA Audio.

License: MIT License

CMake 3.59% C++ 88.67% Cuda 7.74%

cmm's Issues

Research: Research what it would take to integrate cuTENSOR

I have never used cuTENSOR, but it seems to be all the rage. https://docs.nvidia.com/cuda/cutensor/api/cutensor.html#

Research what paradigm is being used here and determine if it is appropriate to integrate into this library.

Feature: Integrate CPack Deb packaging

https://cmake.org/cmake/help/v3.12/module/CPack.html

Feature: 2D FFT Needed

Similar requirements to #3.

There are two primary use cases under consideration:

Remove latent signals from bad instrument cables.
Generate downsampled spectrograms that allow you to segment the song based on repeated patterns in the slow time dimension.

Feature: Capability to set high water mark on memory allocator

Presently, the cmm::MemoryManager has no memory cap (i.e., it will grow until it dies of exhaustion).

In real-time applications this is not an issue. In post-processing applications it can become very problematic, very fast, either when there is a great disparity between a producer and a consumer or when a very tight CPU loop issues non-synchronized GPU operations with temporary allocations.

There should be two modes for the high water mark:

If the high water mark is hit, the program should die.
If the high water mark is hit, the allocation request should block. It is worth noting that this could cause deadlock in a multi-threaded application.

Research: cuGRAPH

It seems like a relatively simple task to create an abstraction over the C interface: https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__GRAPH.html#group__CUDART__GRAPH

It would be interesting to investigate how useful this would be, or whether other developers have already solved this problem.

Research: Review this guide and extract a list of necessary algorithms

http://www.dspguide.com/ch22.htm

Feature: `cmm::IPCMemory`

I am not yet sure of everything that is required here. The IPC memory documentation starts here: https://docs.nvidia.com/cuda/cuda-runtime-api/group__CUDART__DEVICE.html#group__CUDART__DEVICE_1g02bb3632b5d223db6acae5f8744e2c91

I know this is going to require some sort of IPC mechanism. Unfortunately, C++ does not have a native system semaphore... or a native unnamed pipe.

So, the code needs to be written in a way that abstracts out the OS details from the inter-process socket AND the inter-process semaphore. I do not feel like implementing them all, and I definitely don't want to drag BOOST into this. So, I will only implement the Linux version with a domain socket and a system semaphore.

Admin: Configure APT Repository

The absolute best resource I have ever found is this: https://pmateusz.github.io/linux/2017/06/30/linux-secure-apt-repository.html

That needs to be hosted somewhere. Perhaps buy cmm.dev and host it there.

Research: nvGRAPH

It is not clear what differentiates nvGRAPH from the CUDA graph API outlined in #13.

https://docs.nvidia.com/cuda/nvgraph/index.html

Feature: Symbol Support

One of the more useful CUDA tools is the ability to create a Symbol, i.e., __constant__ memory.

It can be tricky to handle arrays. Ideally, it should support all interface requirements of the Matrix class (https://github.com/cjhanks/cmm/blob/0.x/lib/cmm/matrix/matrix.hh).

Feature: 1D FFT Needed

We will end up targeting 10 Hz to 80 kHz. When discretized to 1 Hz increments, that's 80000 * sizeof(double) for the input audio samples, about 0.6 MB. For the complex type it is about 1.2 MB. That is doable.
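
The arithmetic above can be checked at compile time. This is a small sketch that assumes an 8-byte `double` (true on common platforms) and uses `std::complex<double>` as a stand-in for whatever complex type the library ends up using; the constant names are hypothetical.

```cpp
#include <complex>
#include <cstddef>

// 1 Hz bins up to 80 kHz, as in the estimate above.
constexpr std::size_t kBins = 80000;

// Real input: 80000 * 8 bytes = 640000 bytes (about 0.61 MiB).
constexpr std::size_t kRealBytes = kBins * sizeof(double);

// Complex output: two doubles per bin = 1280000 bytes (about 1.22 MiB).
constexpr std::size_t kComplexBytes = kBins * sizeof(std::complex<double>);

static_assert(kRealBytes == 640000, "assumes an 8-byte double");
static_assert(kComplexBytes == 2 * kRealBytes, "complex is two doubles");
```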

There should be an extensible type_traits class which maps the cuFFT functions to the templated type.

The class should be capable of either taking a 1D signal, or a batch of 1D signals.

For now, it only needs to support:

float
double

though ideally...

Human perception of sound is logarithmic. As a consequence, most of the solutions (especially in the higher frequencies) in the DFT are completely irrelevant. So it would be nice to find a sparse DFT.

Feature: Inter-thread synchronized communication pipe needed

A "pipe" in this case is simply a (bidirectional or unidirectional) blocking queue with a high-water mark.

The basis for these can be found here:

When you have two threads with independent CUDA streams, sometimes it is desirable to wait for stream synchronization on the RECV side of the pipe, and sometimes it is desirable to synchronize on the SEND side of the pipe. And sometimes neither is necessary.

The queue needs to be capable of supporting an interface similar to:

enum SyncMode {
  SYNC_NONE,
  SYNC_RECV,
  SYNC_SEND
};

bool
SynchronizedQueue::Push(Type&& data, SyncMode sync_mode);
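
One way such an interface could behave is sketched below as a bounded blocking queue. Everything beyond `SyncMode` and the `Push` signature is an assumption made for illustration: the `SyncFn` callback stands in for a stream synchronization call so the sketch compiles without CUDA, `Pop` and `Close` are hypothetical additions, and `Push` returning `false` is assumed to mean "the queue was closed".

```cpp
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <functional>
#include <mutex>
#include <utility>

enum SyncMode { SYNC_NONE, SYNC_RECV, SYNC_SEND };

template <typename Type>
class SynchronizedQueue {
 public:
  // Stand-in for synchronizing the sender's CUDA stream (assumption).
  using SyncFn = std::function<void()>;

  SynchronizedQueue(std::size_t high_water, SyncFn sync)
      : high_water_(high_water), sync_(std::move(sync)) {}

  // Blocks while the queue is at its high-water mark.
  bool Push(Type&& data, SyncMode sync_mode) {
    if (sync_mode == SYNC_SEND) sync_();  // sender pays for the wait
    std::unique_lock<std::mutex> lock(mutex_);
    not_full_.wait(lock, [&] { return closed_ || items_.size() < high_water_; });
    if (closed_) return false;
    items_.emplace_back(std::move(data), sync_mode);
    lock.unlock();
    not_empty_.notify_one();
    return true;
  }

  // Blocks while the queue is empty; returns false once closed and drained.
  bool Pop(Type& out) {
    std::unique_lock<std::mutex> lock(mutex_);
    not_empty_.wait(lock, [&] { return closed_ || !items_.empty(); });
    if (items_.empty()) return false;
    std::pair<Type, SyncMode> entry = std::move(items_.front());
    items_.pop_front();
    lock.unlock();
    not_full_.notify_one();
    if (entry.second == SYNC_RECV) sync_();  // receiver pays for the wait
    out = std::move(entry.first);
    return true;
  }

  void Close() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      closed_ = true;
    }
    not_full_.notify_all();
    not_empty_.notify_all();
  }

 private:
  const std::size_t high_water_;
  SyncFn sync_;
  bool closed_ = false;
  std::deque<std::pair<Type, SyncMode>> items_;
  std::mutex mutex_;
  std::condition_variable not_full_;
  std::condition_variable not_empty_;
};
```

Storing the mode alongside each item lets the sender decide per message who waits, which matches the three cases described above.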

Feature: `make install` should work.

The CMAKE_INSTALL_PREFIX needs to be obeyed.

Feature: CUBLAS Abstraction Needed

It is not clear what (if any) BLAS functions would benefit audio processing. But for the sake of completeness in the cmm library, I will try to support up to Level 3.

The implementation should be a type_traits class which specializes on the CUDA functions. This should be wrapped in a class capable of deriving most of the argument parameters which would be needed for CUDA by inspecting the matrix types.
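
The traits-plus-wrapper pattern described above might look roughly like the following. To keep the sketch compilable without CUDA, `StandInSaxpy` and `StandInDaxpy` are hypothetical CPU functions that take the place of the per-type library routines (cuBLAS provides `cublasSaxpy` and `cublasDaxpy` for this Level 1 operation), and `std::vector` stands in for the cmm matrix types. `BlasTraits` and `Axpy` are illustrative names, not existing cmm API.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical CPU stand-ins for the per-type library routines: y += alpha * x.
inline void StandInSaxpy(int n, float alpha, const float* x, float* y) {
  for (int i = 0; i < n; ++i) y[i] += alpha * x[i];
}
inline void StandInDaxpy(int n, double alpha, const double* x, double* y) {
  for (int i = 0; i < n; ++i) y[i] += alpha * x[i];
}

// Primary template is left undefined so unsupported types fail to compile.
template <typename T>
struct BlasTraits;

template <>
struct BlasTraits<float> {
  static void Axpy(int n, float a, const float* x, float* y) { StandInSaxpy(n, a, x, y); }
};

template <>
struct BlasTraits<double> {
  static void Axpy(int n, double a, const double* x, double* y) { StandInDaxpy(n, a, x, y); }
};

// Wrapper that derives the length and pointer arguments from the containers.
template <typename T>
void Axpy(T alpha, const std::vector<T>& x, std::vector<T>& y) {
  if (x.size() != y.size()) throw std::invalid_argument("size mismatch");
  BlasTraits<T>::Axpy(static_cast<int>(x.size()), alpha, x.data(), y.data());
}
```

The caller never names the precision-specific routine; adding a new element type means adding one more specialization.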

Feature: Signal Tone Synthesizer Needed

In order to test anything without dealing with audio parsing, we need the ability to generate a constant tone.

The tone should be parameterized by the following arguments:

Frequency
Amplitude
Time delay

It may also be nice to support both:

Tone series (a series of constant tones changing in slow time)... like a song (discrete tones).
Frequency sweep, which is necessary for glissando. It would be nice if the sweep could be parameterized with zeroth, first, and second order parameters.
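
A host-side sketch of the constant tone and the sweep is shown below. The names `ToneParams`, `SynthesizeTone`, and `SynthesizeSweep` are hypothetical, and the sweep assumes one particular reading of the zeroth, first, and second order parameters: coefficients of the instantaneous frequency f(t) = f0 + f1*t + f2*t^2, whose integral gives the phase.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical parameter bundle matching the list above.
struct ToneParams {
  double frequency_hz;
  double amplitude;
  double delay_s;  // silence before the tone starts
};

constexpr double kTwoPi = 6.283185307179586;

// Constant tone: zero until `delay_s`, then a sine at `frequency_hz`.
std::vector<double> SynthesizeTone(const ToneParams& p, double sample_rate_hz,
                                   std::size_t num_samples) {
  std::vector<double> out(num_samples, 0.0);
  for (std::size_t i = 0; i < num_samples; ++i) {
    const double t = static_cast<double>(i) / sample_rate_hz - p.delay_s;
    if (t < 0.0) continue;
    out[i] = p.amplitude * std::sin(kTwoPi * p.frequency_hz * t);
  }
  return out;
}

// Sweep with instantaneous frequency f(t) = f0 + f1*t + f2*t^2 (assumption).
// The phase is the integral of f: 2*pi*(f0*t + f1*t^2/2 + f2*t^3/3).
std::vector<double> SynthesizeSweep(double f0, double f1, double f2,
                                    double amplitude, double sample_rate_hz,
                                    std::size_t num_samples) {
  std::vector<double> out(num_samples, 0.0);
  for (std::size_t i = 0; i < num_samples; ++i) {
    const double t = static_cast<double>(i) / sample_rate_hz;
    const double cycles = f0 * t + f1 * t * t / 2.0 + f2 * t * t * t / 3.0;
    out[i] = amplitude * std::sin(kTwoPi * cycles);
  }
  return out;
}
```

A tone series could then be built by concatenating several `SynthesizeTone` outputs with different frequencies.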

Research: NPP

The NVIDIA Performance Primitives library seems to be primarily geared towards images, but it does have a few statistical functions appropriate for audio processing.

https://docs.nvidia.com/cuda/npp/group__signal__statistical__functions.html

This may obviate #6 for my use case, at which point #6 would only be needed for completeness.

Feature: Complex Type Needed

The complex type is needed for a few reasons:

Audio phase coherence is necessary to avoid dead spots in the frequency when mixing two signals. Sometimes it is desirable to shift the phase to fix this.
Many "Wah" based audio effects require you to modify the imaginary/real component independently. This is obviously needed for phaser sweeps and envelope filters.

This is low priority.

But it would be nice to fully mimic the std::complex type interface.
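
The phase-coherence case can be illustrated on the host with `std::complex` itself, which is the interface the device-side type would mimic. This is a sketch only: `ShiftPhase` is a hypothetical helper, and each complex value represents one frequency bin of a signal.

```cpp
#include <complex>

// Rotate a frequency bin by `radians` without changing its magnitude.
inline std::complex<double> ShiftPhase(std::complex<double> bin, double radians) {
  return bin * std::polar(1.0, radians);
}

// Magnitude of the mix of two bins at the same frequency.
inline double MixedMagnitude(std::complex<double> a, std::complex<double> b) {
  return std::abs(a + b);
}
```

Two unit-magnitude bins that are pi radians apart mix to roughly zero (a dead spot); shifting one of them by -pi before mixing restores a magnitude of roughly 2.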

Admin: Create Build Server

I can theoretically build all my packages on an AWS server. Unfortunately, all of the GPU instances are too expensive and they run weird GPU models. Theoretically my code should run on them; we'll see. Maybe I will try them out.

In either case... perhaps I could use CODE Deploy triggered from GitHub (https://docs.aws.amazon.com/codebuild/latest/userguide/sample-access-tokens.html)

Then have that trigger an EC2 instance to run unit tests... or something.
