ToyADMOS2: Another dataset of miniature-machine operating sounds for anomalous sound detection under domain shift conditions

This repository provides a data mixer tool for ToyADMOS2 🚗 🚃, a large-scale dataset for anomaly detection in machine operating sounds (ADMOS). The dataset consists of a large number of operating sounds of miniature machines (toys) recorded under normal and anomalous conditions, where the anomalies were produced by deliberately damaging the machines. You can find the details of the dataset on the ToyADMOS2 dataset website.

If you find ToyADMOS2 useful in your work, please consider citing our paper.

@inproceedings{harada2021toyadmos2,
    author = "Harada, Noboru and Niizumi, Daisuke and Takeuchi, Daiki and Ohishi, Yasunori and Yasuda, Masahiro and Saito, Shoichiro",
    title = "{ToyADMOS2}: Another Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection under Domain Shift Conditions",
    booktitle = "Proceedings of the 6th Detection and Classification of Acoustic Scenes and Events 2021 Workshop (DCASE2021)",
    address = "Barcelona, Spain",
    month = "November",
    year = "2021",
    pages = "1--5",
    isbn = "978-84-09-36072-7",
    doi = "10.5281/zenodo.5770113",
    _pdf = {https://dcase.community/documents/workshop2021/proceedings/DCASE2021Workshop_Harada_6.pdf}
}

What is the ToyADMOS2 dataset?

ToyADMOS2 is a unique dataset in that we don't use it as it is; it is a set of source recording samples. We use the tool provided in this repository to generate new datasets from these samples according to our own recipes.

The samples consist of recordings of normal and anomalous sounds under various conditions and configurations. You can edit or program your own selection in a recipe to compile new datasets for your research purposes.

A document describing how we made the dataset, ToyADMOS2_details.pdf, is also available. It also shows the details of the anomaly conditions with photos.

Please try making your own!

Download dataset

Visit the ToyADMOS2 dataset website, hosted on http://zenodo.org/, and download the dataset.

Wanna See the Miniature Machines?

Here are the videos of the toy car and the toy train:

Getting Started

Install the required packages listed in requirements.txt.

This will install essential modules for running tools in this repository.

Example: Making Example Dataset

Running the following commands creates a benchmark dataset equivalent to the one evaluated in Table 3 of the paper, with a file and folder structure compatible with DCASE2021 Challenge Task 2. The dataset is created in the folder your_new_dataset. This takes about an hour.

# This creates `clean` dataset.
python mixer.py /path/to/ToyADMOS2 your_new_dataset recipe_benchmark.xlsx clean
# This creates SNR=6dB dataset.
python mixer.py /path/to/ToyADMOS2 your_new_dataset recipe_benchmark.xlsx 6
  • recipe_example_car_shift.xlsx is another example.
  • recipe_template.xlsx is a template that also serves as one more example.
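
To make the SNR=6dB condition concrete, here is a minimal sketch of mixing a machine sound with background noise at a target signal-to-noise ratio. It is an illustration under stated assumptions, not the code in mixer.py: the helper names (`power`, `mix_at_snr`, `measure_snr_db`) are hypothetical, and the sketch assumes the SNR is defined from the mean power over the whole clip.

```python
import math


def power(samples):
    """Mean squared amplitude of a non-empty sequence of samples."""
    return sum(x * x for x in samples) / len(samples)


def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so that signal power / noise power matches `snr_db`, then add it."""
    if len(signal) != len(noise):
        raise ValueError("signal and noise must have the same length")
    p_sig, p_noise = power(signal), power(noise)
    if p_noise == 0:
        raise ValueError("noise is silent; the SNR is undefined")
    # Target noise power is p_sig / 10^(snr_db / 10); the gain acts on amplitude,
    # so it is the square root of the power ratio.
    gain = math.sqrt(p_sig / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(signal, noise)]


def measure_snr_db(signal, noise):
    """SNR in dB between two sequences, computed from their mean powers."""
    return 10 * math.log10(power(signal) / power(noise))


# Toy input (assumption): a sine wave as the "machine sound" and a fixed
# pseudo-noise pattern as the background.
signal = [math.sin(2 * math.pi * 5 * i / 100) for i in range(100)]
noise = [((i * 37) % 11 - 5) / 5 for i in range(100)]

mixed = mix_at_snr(signal, noise, snr_db=6)
added = [m - s for m, s in zip(mixed, signal)]  # the scaled noise component
print(f"measured SNR: {measure_snr_db(signal, added):.2f} dB")
```

A higher SNR value means quieter background noise relative to the machine sound; the `clean` condition in the commands above presumably corresponds to adding no noise at all.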

Example: Running Baseline

  1. Clone dcase2020_task2_baseline and apply the patch to make the evaluation baseline.

    git clone https://github.com/y-kawagu/dcase2020_task2_baseline
    cd dcase2020_task2_baseline && patch --binary < ../dcase2020_task2_baseline.patch
  2. Make a symbolic link so that the baseline finds its data source at dcase2020_task2_baseline/dev_data.

    cd dcase2020_task2_baseline && ln -s ../your_new_dataset dev_data
  3. Run the baseline. The evaluation results are stored in result/result.csv.

    cd dcase2020_task2_baseline
    python 00_train.py -d
    python 01_test.py -d

If you find anything missing when running dcase2020_task2_baseline, please follow the instructions in that repository to install the basic modules.
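
If you script the evaluation, step 2 can also be done from Python. The sketch below mirrors `ln -s ../your_new_dataset dev_data` using `pathlib`. The function name `link_dev_data` is hypothetical, and unlike the shell command it links to the resolved absolute path of the dataset folder and replaces a stale link from an earlier run.

```python
from pathlib import Path


def link_dev_data(baseline_dir, dataset_dir):
    """Create <baseline_dir>/dev_data as a symbolic link to dataset_dir."""
    baseline = Path(baseline_dir)
    dataset = Path(dataset_dir).resolve()
    if not dataset.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {dataset}")

    link = baseline / "dev_data"
    if link.is_symlink():
        link.unlink()  # replace a link left over from an earlier dataset
    elif link.exists():
        # Refuse to touch a real file or folder named dev_data.
        raise FileExistsError(f"{link} exists and is not a symbolic link")

    link.symlink_to(dataset, target_is_directory=True)
    return link


# Example call, using the folder names from the steps above:
# link_dev_data("dcase2020_task2_baseline", "your_new_dataset")
```

Because a stale link is replaced, the same function can be called again to point the baseline at a different generated dataset between runs.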

Making Your Dataset

(Example of a recipe file, yes it's an Excel spreadsheet.)

example recipe excel

Simply make a copy of the template (recipe_template.xlsx), edit your copy, then run the tool.

You can find more information in the UsersManual.md.
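
The copy-then-run workflow can be sketched in Python as follows. This is a hedged illustration: the helper names `start_recipe` and `mixer_command` and the file name `my_recipe.xlsx` are hypothetical, and the argument order is assumed to match the example commands earlier in this README (source folder, output folder, recipe, then `clean` or an SNR value).

```python
import shutil
import sys
from pathlib import Path


def start_recipe(template, new_recipe):
    """Copy the template to a new recipe file, without overwriting existing work."""
    template, new_recipe = Path(template), Path(new_recipe)
    if new_recipe.exists():
        raise FileExistsError(f"{new_recipe} already exists")
    shutil.copyfile(template, new_recipe)
    return new_recipe


def mixer_command(source_dir, output_dir, recipe, condition):
    """Build the mixer.py argument list in the order used by the README examples."""
    return [
        sys.executable, "mixer.py",
        str(source_dir), str(output_dir), str(recipe), str(condition),
    ]


# Typical use (paths are placeholders):
# recipe = start_recipe("recipe_template.xlsx", "my_recipe.xlsx")
# ... edit my_recipe.xlsx in a spreadsheet editor ...
# import subprocess
# subprocess.run(mixer_command("/path/to/ToyADMOS2", "my_dataset", recipe, "clean"), check=True)
```

Refusing to overwrite an existing recipe keeps an edited spreadsheet from being replaced by a fresh copy of the template.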

License

Please check the LICENSE for details.

Acknowledgements

The evaluation of this dataset uses y-kawagu/dcase2020_task2_baseline. We thank @y-kawagu for their dedication to the DCASE challenges!

This repository is the 2021 version; kudos to @YumaKoizumi for the 2020 efforts on the ToyADMOS-dataset.

toyadmos2-dataset's Issues

Reading mp4 vs wav

I am attempting to read the new dataset with the mp4 files. With this code snippet from mixer.py

sig, sr_sig = __audioread_load(filename, offset=0.0, duration=None, dtype=np.float32)

I get an array of length 242550 for the original ToyADMOS wav files. For the mp4 files, it returns only the sample rate of 48,000: the length of sig is 0, and there is a warning:

/var/folders/mv/qbxkzz3d5zj4dh3wmt30cpfh000r_w/T/ipykernel_55465/1690306295.py:1: FutureWarning: librosa.core.audio.__audioread_load
Deprecated as of librosa version 0.10.0.
It will be removed in librosa version 1.0.
