timbral_models's Introduction

AudioCommons Timbral Models

NOTE: this repository is no longer maintained. The timbral models can, however, still be installed by cloning the repository and running pip install (see below).

The timbral models were developed by the Institute of Sound Recording (IoSR) at the University of Surrey, and were completed as part of the AudioCommons project.

The current distribution contains python scripts for predicting eight timbral characteristics: hardness, depth, brightness, roughness, warmth, sharpness, booming, and reverberation.

More detailed explanations of how the models function can be found in Deliverable D5.8: Release of timbral characterisation tools for semantically annotating non-musical content, available: http://www.audiocommons.org/materials/

Installing the package

The timbral_models package can be installed using the pip command, which handles installation of all dependencies. In the update to version 0.4, the dependency on essentia was removed, so only pip-installable packages are required.

pip install timbral_models

Please note that during testing, pip was unable to install some of the dependencies and produced an error. In these cases, either rerun the pip install timbral_models command or install the offending dependency directly, e.g. pip install numpy.

The package can also be installed locally in editable mode. To do this, clone the repository, navigate to the folder, and run pip install -e . (note the trailing dot, which refers to the current folder). With this method, dependencies will not be installed.

Dependencies

The scripts can also be downloaded manually from the GitHub repository (https://github.com/AudioCommons/timbral_models). In that case, the dependencies must be installed manually. The timbral models rely on several readily available Python packages: numpy, soundfile, librosa, scipy, scikit-learn, six, and pyloudnorm. Each can be installed with the pip install command, e.g.

$ pip install numpy
$ pip install soundfile
$ pip install librosa
$ pip install scipy
$ pip install scikit-learn
$ pip install six
$ pip install pyloudnorm

Using the models

The models are provided as a Python package that can be imported into a Python script. The timbral extractor can be used to extract all timbral attributes with a single function call.

To calculate the timbral attributes, pass the timbral extractor function a string of the filename. The method will then read in the audio file internally and return all timbral characteristics.

import timbral_models
fname = '/Documents/Music/TestAudio.wav'
timbre = timbral_models.timbral_extractor(fname)

In this example, timbre will be a python dictionary containing the predicted hardness, depth, brightness, roughness, warmth, sharpness, booming, and reverberation of the specified audio file.

Single attribute calculation

Alternatively, each timbral attribute can be calculated individually by calling the specific timbral function, e.g. timbral_hardness(fname). These functions are named timbral_xxx(fname), where xxx is the name of the timbral model, and each also requires the filename of the audio to be analysed, passed as a string.

import timbral_models
fname = '/Documents/Music/TestAudio.wav'
timbre = timbral_models.timbral_hardness(fname)

Model output

The hardness, depth, brightness, roughness, warmth, sharpness, and booming models are regression based, trained on subjective ratings ranging from 0 to 100. However, the output may fall outside this range. The optional clip_output parameter can be used to constrain the outputs to between 0 and 100.

timbre = timbral_models.timbral_extractor(fname, clip_output=True)
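As a rough illustration of what constraining the output means, the sketch below clamps raw regression values to the 0 to 100 rating scale. The helper clip_to_rating_scale is hypothetical and is not part of timbral_models; the package's own clip_output handling may differ in detail.

```python
def clip_to_rating_scale(value, low=0.0, high=100.0):
    """Clamp a raw model output to the subjective rating scale.

    Hypothetical helper for illustration only; not part of timbral_models.
    """
    return max(low, min(high, value))


# Raw regression outputs can fall outside the range of the training ratings.
raw_outputs = [-4.2, 37.5, 112.9]
clipped = [clip_to_rating_scale(v) for v in raw_outputs]
print(clipped)
```

Values already inside the range pass through unchanged; only out-of-range predictions are moved to the nearest bound.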

For additional optional parameters, please see Deliverable D5.8.

The reverb attribute is a classification model, returning 1 or 0, indicating if the file "sounds reverberant" or "does not sound reverberant" respectively.
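The binary output can be turned into a readable label with a small wrapper. This is a minimal sketch: describe_reverb is a hypothetical helper name and does not exist in timbral_models.

```python
def describe_reverb(reverb_flag):
    """Map the binary reverb output to the wording used in this README.

    Hypothetical helper for illustration only; not part of timbral_models.
    """
    if reverb_flag not in (0, 1):
        raise ValueError("expected 0 or 1, got %r" % (reverb_flag,))
    if reverb_flag == 1:
        return "sounds reverberant"
    return "does not sound reverberant"


print(describe_reverb(1))
print(describe_reverb(0))
```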

MATLAB Reverb model

Also contained in this repository is a full version of the timbral reverb model. For instructions on installing and using it, please see Deliverable D5.8.

Version History

This section documents the version history of the timbral models. To download the version of the models that relates to a specific deliverable, please check this section and download the most recent version from that date.

2024/05/24 - Version 0.4.1 of timbral_models fixes the scikit-learn dependency and introduces timbral_models.__version__

2019/01/24 - Version 0.4 of timbral_models, relates to Audio Commons Deliverable D5.8. This version of the repository relates to the software version 0.4 on PyPI.

2018/12/14 - Version 0.3 of timbral models, relates to Audio Commons Deliverable D5.7. This version of the repository relates to the software version 0.3 on PyPI.

2018/07/26 - Version 0.2 of timbral models, relates to Audio Commons Deliverable D5.6. This version of the repository relates to the software version 0.2 on PyPI.

2017/09/05 - Version 0.1 of timbral models, relates to Audio Commons Deliverable D5.3. This version of the repository relates to the software version 0.1 on PyPI.

2017/04/27 - Version 0.0 of the timbral models, relates to Audio Commons Deliverable D5.2.

Citation

To reference these models, please cite Deliverable D5.8, available: http://www.audiocommons.org/materials/

timbral_models's People

Contributors

andyp103, f-brinkmann, ffont, gionstegmann


timbral_models's Issues

Add info about requirements

Some of the scripts require external software to be installed and won't work out of the box (e.g. numpy, soundfile, librosa). These should at least be listed in the README. In future iterations it would be great to provide a docker image (we can do it as part of the main audio commons extractor) to facilitate running these algorithms.

[line 253:] segment returns an empty list when a sound file has too short an attack time.

Got the following error when I tried to compute timbral_hardness for the following sound. I suspect it's because the sound has too short an attack time.

http://freesound.org/people/ShawnyBoy/sounds/165394/

ERROR :
line 253, in timbral_hardness
segment /= float(max(segment))
ValueError: max() arg is an empty sequence

To confirm, I printed the values of segment and found that one of the segments is an empty list for this sound. I tried adding a condition to skip the block when the sequence is empty, but couldn't solve it, and I'm not sure how to adapt the code for this case.
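One possible guard, sketched under assumptions: check for an empty (or all-zero) segment before dividing by its maximum, and skip it. The helper normalise_segment is hypothetical, and plain lists stand in for the arrays used in Timbral_Hardness.py. Whether skipping such segments leaves the hardness prediction meaningful has not been verified here.

```python
def normalise_segment(segment):
    """Scale a segment so that its maximum is 1.0.

    Returns None when the segment is empty or its peak is zero, so the
    caller can skip it instead of hitting max() on an empty sequence.
    Hypothetical helper; not taken from timbral_models.
    """
    if len(segment) == 0:
        return None
    peak = float(max(segment))
    if peak == 0.0:
        return None
    return [sample / peak for sample in segment]


segments = [[0.5, 1.0, 2.0], [], [0.0, 0.0]]

# Keep only the segments that could be normalised.
usable = []
for seg in segments:
    normalised = normalise_segment(seg)
    if normalised is not None:
        usable.append(normalised)

print(usable)
```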

onset_detect() takes 0 positional arguments but 2 were given.

I'm running into an error as soon as I try to run it, using the sample code given in your readme.

code:
import timbral_models

frame = r"Downloads\Ingentia60.mp3"
timbre = timbral_models.timbral_extractor(frame)

print(timbre)

error:
Calculating hardness...
Traceback (most recent call last):
File "C:\Users\andre\OneDrive\VSCode\transcribeapp\timbre.py", line 4, in <module>
timbre = timbral_models.timbral_extractor(frame)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\timbral_models\Timbral_Extractor.py", line 70, in timbral_extractor
hardness = timbral_hardness(audio_samples, fs=fs,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\timbral_models\Timbral_Hardness.py", line 87, in timbral_hardness
original_onsets = timbral_util.calculate_onsets(audio_samples, envelope, fs, nperseg=nperseg)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Python311\Lib\site-packages\timbral_models\timbral_util.py", line 642, in calculate_onsets
onsets = librosa.onset.onset_detect(audio_samples, fs, backtrack=True, units='samples')
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: onset_detect() takes 0 positional arguments but 2 positional arguments (and 2 keyword-only arguments) were given
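The message says that the installed onset_detect accepts keyword arguments only, while timbral_util.py passes audio_samples and fs positionally. The sketch below reproduces that behaviour with a stand-in function so that it runs without librosa. The stand-in is hypothetical and only mimics a keyword-only signature; it does no onset detection.

```python
def onset_detect(*, y=None, sr=22050, backtrack=False, units="frames"):
    """Stand-in with a keyword-only signature; NOT librosa's implementation."""
    return {"n_samples": len(y), "sr": sr, "backtrack": backtrack, "units": units}


audio_samples = [0.0, 0.1, -0.2, 0.3]
fs = 44100

# Positional call, in the style of the failing line: rejected with TypeError.
try:
    onset_detect(audio_samples, fs, backtrack=True, units="samples")
    positional_ok = True
except TypeError as err:
    positional_ok = False
    print("positional call failed:", err)

# The same call with the first two arguments passed by name: accepted.
result = onset_detect(y=audio_samples, sr=fs, backtrack=True, units="samples")
print(result)
```

With librosa itself, the equivalent change would be to call librosa.onset.onset_detect(y=audio_samples, sr=fs, backtrack=True, units='samples'). Other librosa calls in the package that pass the signal and sample rate positionally may need the same treatment.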

slice indices must be integers or None or have an __index__ method [in timbral_brightness and timbral_hardness]

When processing this sound: https://freesound.org/people/bone666138/sounds/198841/ both timbral_brightness and timbral_hardness models fail (the others work fine). Both return the same error, although it happens in different parts of the code and might be completely unrelated. This is the Python stack trace for the errors:

timbral_brightness

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-85-852004113145> in <module>()
----> 1 timbral_brightness('/mtgdb/incoming/freesound/sounds/198/198841_285997.wav')

~/freesound-audio-analyzer/timbral_models/Timbral_Brightness.py in timbral_brightness(fname)
    105         eval_audio = audio_samples[i:i + blockSize]
    106         complex_spectrum = np.fft.fft(eval_audio * window)
--> 107         magnitude_spectrum = np.absolute(complex_spectrum[0:1 + len(complex_spectrum) / 2])
    108 
    109         if sum(magnitude_spectrum) > 0:

TypeError: slice indices must be integers or None or have an __index__ method

timbral_hardness

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-86-d8f62a7fc6e0> in <module>()
----> 1 timbral_hardness('/mtgdb/incoming/freesound/sounds/198/198841_285997.wav')

~/freesound-audio-analyzer/timbral_models/Timbral_Hardness.py in timbral_hardness(fname, dev_output, max_attack_time, bandwidth_thresh_db, phase_correction)
    636 
    637     # calculate the onsets
--> 638     original_onsets = calculate_onsets(audio_samples, envelope, fs, nperseg=nperseg)
    639     onset_strength = librosa.onset.onset_strength(audio_samples, fs)
    640 

~/freesound-audio-analyzer/timbral_models/Timbral_Hardness.py in calculate_onsets(audio_samples, envelope_samples, fs, look_back_time, hysteresis_time, hysteresis_percent, onset_in_noise_threshold, threshold_correction, minimum_onset_time_separation, nperseg)
    507             current_strength_onset = strength_onset_times[onset_idx]
    508             if current_strength_onset == strength_onset_times[-1]:
--> 509                 onset_strength_seg = onset_strength[current_strength_onset:]
    510             else:
    511                 onset_strength_seg = onset_strength[current_strength_onset:strength_onset_times[onset_idx + 1]]

TypeError: slice indices must be integers or None or have an __index__ method
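In the timbral_brightness trace, the slice bound is computed with /, which returns a float under Python 3. In the timbral_hardness trace, the likely cause is that the onset positions are stored as floats, although that is an assumption, since the trace does not show their type. A minimal sketch of both fixes, using plain lists as stand-ins for the arrays in the package:

```python
complex_spectrum = list(range(8))  # stand-in for an FFT result

# Under Python 3, "/" yields a float, which is not a valid slice index.
try:
    complex_spectrum[0:1 + len(complex_spectrum) / 2]
    float_slice_ok = True
except TypeError:
    float_slice_ok = False

# Floor division keeps the slice bound an integer.
half_spectrum = complex_spectrum[0:1 + len(complex_spectrum) // 2]

# Assumed cause of the second trace: onset positions held as floats.
strength_onset_times = [2.0, 5.0]
onset_strength = [0.1, 0.4, 0.9, 0.3, 0.2, 0.8, 0.1]

# Casting to int before slicing avoids the TypeError.
start = int(strength_onset_times[0])
stop = int(strength_onset_times[1])
onset_strength_seg = onset_strength[start:stop]

print(float_slice_ok, half_spectrum, onset_strength_seg)
```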

Missing dependency in the documentation: pyfilterbank

Maybe I'm missing something, but I had to install pyfilterbank before being able to import timbral_models. Just wanted to point it out in case the documentation needs to be updated.

By the way, in order to install pyfilterbank, I just did:

  1. git clone https://github.com/SiggiGue/pyfilterbank.git
  2. cd pyfilterbank
  3. python setup.py install

features meaning

Are there any documents that explain how these features are calculated?

Deprecation of sklearn

Your https://github.com/AudioCommons/timbral_models/blob/master/setup.py depends on sklearn, which is now deprecated. The suggested fix is to use scikit-learn instead.

Does someone have the time to try this? For some reason I get an error when trying to clone the repository. The easiest way to try might be

all_hp_centroid instead of all_hp_centroid_tpower in Timbral_Brightness.py

Hi,
if I read the code in Timbral_Brightness.py correctly, line 125 should say:
all_hp_centroid_tpower.append(hp_centroid_tpower)
and line 132 should say:
weighted_mean_hp_centroid = np.average(all_hp_centroid, weights=all_hp_centroid_tpower)

Otherwise the hp_centroid array would contain a mixture of centroid frequencies and powers, and the average in line 129 and 132 would not make too much sense.

Is this correct? Can you fix it?
Thank you!
Giampiero and Jérôme
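The calculation this issue proposes is a power-weighted mean of the centroids, with the centroids and the powers kept in two separate lists. A minimal sketch with made-up numbers follows. The weighted_mean helper is a hypothetical pure-Python stand-in for np.average(values, weights=weights), and the sketch is not the code from Timbral_Brightness.py.

```python
def weighted_mean(values, weights):
    """Return sum(v * w) / sum(w), the quantity np.average computes with weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Illustrative numbers only: centroid frequencies and their block powers,
# kept in two separate lists as the issue suggests.
all_hp_centroid = [1000.0, 3000.0]
all_hp_centroid_tpower = [1.0, 3.0]

weighted_mean_hp_centroid = weighted_mean(all_hp_centroid, all_hp_centroid_tpower)
print(weighted_mean_hp_centroid)
```

If the powers were appended to the centroid list instead, the values and weights would differ in length and the average would mix frequencies with powers, which is the problem the issue describes.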
