
audio-dataset's People

Contributors

christophschuhmann, dmarx, isaac0804, kjhenner, knoriy, krishnakalyan3, lukewys, marianna13, retrocirce, rvencu, tianyu-z, tj-solergibert, turian, yuchenhui22314


audio-dataset's Issues

files for preprocessing

Where do we get the following files from?
Any help would be appreciated.
Thanks in advance.

metadata_file = r'/home/yuchen/raw/freesound/parquet/freesound_parquet.parquet'

ignore_file = r'/home/yuchen/raw/freesound/filename_dic.txt'

duration_file = r"/home/yuchen/raw/freesound/all_duration.txt"

How to download Freesound?

Hi, can you share some ways to download Freesound? For example, how can Linux scripts be used to download these audio files?

decoding speed / benchmark

This repo is great. I always wanted to benchmark webdataset for audio. A couple of questions:

  1. Did you find FLAC to be a good trade-off between decoding performance and file size? Have you tried MP3 instead?
  2. Did you benchmark the pipeline against plain torch.data with torchaudio or the new torch data pipes? Maybe add the benchmark to https://github.com/faroit/python_audio_loading_benchmark/ to give this a go?
  3. How is partial decoding (seeking) typically done with webdataset when long audio is stored but only random chunks are read at the decoding stage? Is seeking supported? If yes, does it slow down the I/O pipeline?
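
To make the third question concrete, here is a minimal sketch of what "read only a random chunk" means for a file on disk. It uses an uncompressed WAV file and only the Python standard library, so it is not webdataset or FLAC code and says nothing about whether seeking works inside tar shards. The helper names `write_test_wav` and `read_random_chunk` are hypothetical.

```python
import os
import random
import struct
import tempfile
import wave


def write_test_wav(path, n_frames, sample_rate=16000):
    """Write a mono 16-bit WAV whose frame i holds the value i % 1000."""
    samples = (i % 1000 for i in range(n_frames))
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(struct.pack(f"<{n_frames}h", *samples))


def read_random_chunk(path, chunk_frames, rng):
    """Seek to a random frame offset and decode only chunk_frames frames."""
    with wave.open(path, "rb") as w:
        # Raises ValueError if the file is shorter than chunk_frames.
        start = rng.randrange(0, w.getnframes() - chunk_frames + 1)
        w.setpos(start)  # jump to the offset without reading earlier frames
        raw = w.readframes(chunk_frames)
    return start, struct.unpack(f"<{chunk_frames}h", raw)


with tempfile.TemporaryDirectory() as tmp:
    wav_path = os.path.join(tmp, "long.wav")
    write_test_wav(wav_path, n_frames=160000)  # 10 s at 16 kHz
    start, chunk = read_random_chunk(wav_path, 1600, random.Random(0))

print(start, len(chunk))
```

For compressed formats the cost of such a seek depends on the codec and the decoder, which is exactly what a benchmark would have to measure.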

Missing 'tag' key in FSD50k preprocessor

Hi,
Thanks for sharing the wonderful code.

According to the data preprocessing README,
the output JSON file should contain a 'tag' key (holding the labels) after preprocessing.
This tag extraction/creation is missing from the preprocess_FSD50K.py file.

Am I misunderstanding something, or is the 'tag' creation missing from the file?

Thanks,
Saksham

Dataset Plan

@rvencu @rom1504
We need more data for the next step. The data we need, in order of priority, is:

  1. Audio data with natural text description(s).
  2. Audio data with other labels, for which we "make up" a text description.

For audio data with natural text description, we further need:

For audio data with other labels, we need to collect new large datasets while converting our current dataset with tag labels.

The top-priority datasets are those that are large and whose labels are easy to turn into a text description:

(All of the following datasets have tag labels for the audio.)

The datasets we currently have that need converting labels to text are:

We should come up with a unified way of converting tags to text. We could follow how CLIP did this when converting classification labels to natural text.
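
One possible shape for such a conversion is a small set of prompt templates filled with the tags, loosely in the spirit of CLIP's "a photo of a {label}" prompts. The sketch below is only an illustration: the templates and the function names `join_tags` and `tags_to_text` are hypothetical and are not taken from this repo.

```python
import random

# Hypothetical templates; none of these come from the repo.
TEMPLATES = [
    "The sound of {tags}.",
    "This is a sound of {tags}.",
    "An audio recording of {tags}.",
]


def join_tags(tags):
    """Join tags into a readable phrase, e.g. ['a', 'b', 'c'] -> 'a, b and c'."""
    cleaned = [t.replace("_", " ").strip().lower() for t in tags if t.strip()]
    if not cleaned:
        raise ValueError("at least one non-empty tag is required")
    if len(cleaned) == 1:
        return cleaned[0]
    return ", ".join(cleaned[:-1]) + " and " + cleaned[-1]


def tags_to_text(tags, rng=None):
    """Fill a template with the tags: random if rng is given, else the first."""
    template = rng.choice(TEMPLATES) if rng is not None else TEMPLATES[0]
    return template.format(tags=join_tags(tags))


caption = tags_to_text(["Dog", "Bark"])
print(caption)
```

Passing a seeded `random.Random` as `rng` would vary the template per sample while keeping the conversion reproducible.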

When possible prefer saving parquet with url inside

As with image datasets, it's better to first save a URL + metadata file as parquet.
That can be distributed without copyright issues.

Then a tool like img2dataset can handle the download

Let's add that to the README here.
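
A minimal sketch of what such a URL + metadata file could look like, assuming pandas with a parquet engine (pyarrow or fastparquet) is installed. The column names (`url`, `text`, `tags`, `license`), the function names, and the example.com URL are all placeholders chosen for illustration, not a format this repo defines.

```python
def build_rows(samples):
    """Keep only redistributable fields: the URL plus metadata, no audio bytes."""
    rows = []
    for s in samples:
        if not s.get("url"):
            continue  # a row without a URL cannot be downloaded later
        rows.append({
            "url": s["url"],
            "text": s.get("text", ""),
            "tags": list(s.get("tags", [])),
            "license": s.get("license", ""),
        })
    return rows


def write_url_parquet(rows, path):
    """Write the rows to a parquet file (needs pandas and a parquet engine)."""
    import pandas as pd  # imported lazily so build_rows works without pandas

    pd.DataFrame(rows).to_parquet(path, index=False)


# Hypothetical samples; example.com is a placeholder, not a real audio host.
samples = [
    {"url": "https://example.com/audio/1.flac", "text": "a dog barking", "tags": ["dog"]},
    {"text": "no url, so this row is dropped"},
]
rows = build_rows(samples)
print(rows)
```

Calling `write_url_parquet(rows, "urls.parquet")` would then produce the file that a downloader reads, with the audio itself fetched from the URLs at download time.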

AWS S3 Access

Congratulations on the herculean effort of putting together this dataset!
Where can one find the access information for the data in s3://s-laion-audio/?
