Comments (4)
Yes. This is all true. The solution is probably to make an OOP version of these metrics. If you look at the MSBG code you will see that there is an Ear class which has a Cochlea object. A set_audiogram method builds a new Cochlea for the Ear. This allows all the precomputation to be done in the object constructors, so that the Ear's process() method doesn't repeat the parts that are signal independent. It would be great if Kates' models were all rewritten in this style.
Note also that the MSBG code has its own implementation of a gammatone filterbank: see gammatone_filterbank in cochlea.py. It looks like the cascaded filter implementation. It is just a loop of lfilter commands, so it uses no njit code. It would probably be much faster, provided we can show that it gives similar results, but I haven't benchmarked it.
from clarity.
I checked the periodicity and found that it is not always "perfect". You can find the period (no more than 300 samples), but while for some bands the values repeat in each period with variations only in the 15th decimal place, for other bands the variations are in the 3rd or 4th decimal place. Repeating the first period reduced HAAQI by 0.02, from 0.12 to 0.10, in the example I was running.
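A rough way to reproduce this kind of check is sketched below. It assumes the periodic quantity is a per-band cosine carrier; `tiling_error`, its parameters and the example frequencies are hypothetical and are not taken from clarity. The function searches for the period (up to 300 samples) whose first cycle, when repeated, deviates least from the directly computed carrier.

```python
import numpy as np


def tiling_error(center_freq, sample_rate, n_samples, max_period=300):
    """Return (best_period, max_abs_error) when tiling the first period.

    Hypothetical helper: the carrier is assumed to be cos(2*pi*f*n/fs).
    """
    phase = 2.0 * np.pi * center_freq * np.arange(n_samples) / sample_rate
    carrier = np.cos(phase)

    best_period, best_error = 1, np.inf
    for period in range(1, max_period + 1):
        # Repeat the first `period` samples and trim to the signal length.
        reps = -(-n_samples // period)  # ceiling division
        tiled = np.tile(carrier[:period], reps)[:n_samples]
        error = float(np.max(np.abs(tiled - carrier)))
        if error < best_error:
            best_period, best_error = period, error
    return best_period, best_error
```

When the centre frequency and the sample rate are commensurate, the error stays at floating-point rounding level; otherwise the tiled carrier drifts away from the true one over the length of the signal, which would be consistent with the larger deviations seen in some bands.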
Regarding refactoring HAAQI, there are many computations that can be done in the constructor and others that could be saved:
- There are many constants, plus the numerators and denominators of 3 or 4 filters, that depend only on the sample rate. These can be precomputed and saved as constants in the class.
- The `sin` and `cos` for the gammatone depend only on the length of the signal, so they could be computed only the first time the metric is called. Pre-saving this computation for a specific maximum length is another option, but it may increase the clarity package size. We could add this as part of the constructor and compute it only if precomputations are not given.
- Profiling the original HAAQI, one third of the computation time goes to the large number of correlations.
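The "compute only the first time" idea for the `sin` and `cos` terms could be sketched with a small cache. This is only an illustration: `carrier_tables` and its arguments are hypothetical names, not part of clarity, and it assumes the tables depend on the signal length, the band centre frequency and the sample rate.

```python
from functools import lru_cache

import numpy as np


@lru_cache(maxsize=64)
def carrier_tables(n_samples: int, center_freq: float, sample_rate: float):
    """Return (cos, sin) tables for one band, cached per argument tuple.

    Hypothetical helper, not the clarity implementation.
    """
    phase = 2.0 * np.pi * center_freq * np.arange(n_samples) / sample_rate
    cos_table = np.cos(phase)
    sin_table = np.sin(phase)
    # Mark the arrays read-only so a caller cannot corrupt the cached copy.
    cos_table.flags.writeable = False
    sin_table.flags.writeable = False
    return cos_table, sin_table
```

A second call with the same length, frequency and sample rate returns the stored arrays instead of recomputing them. The cache lives in memory only, so it does not add anything to the package size.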
HAAQI's ear model is hard to separate the way MSBG does.
The main issue is an alignment step that runs for all the bands (32 times):
clarity/clarity/evaluator/haspi/eb.py
Lines 240 to 248 in 6b98a25
In this part of the ear model, the `processed signal` is aligned with the `reference signal`. This makes the `reference signal`'s ear model computation a requirement of the `processed signal`'s ear model computation.
As we can't really make the ear model independent of the signals, we could have the following structure.
Note: from now on, I will call the `processed signal` the `enhanced signal` to avoid confusion.
EarModel class:
- `__init__()`: In the constructor we can compute all the parameters that depend on the sampling frequency, to avoid repetition.
- `set_audiogram()`: Here we can compute the parameters that depend on the audiogram and are common to `reference` and `enhanced`.
- `process_reference()`: Here we compute the `reference signal` ear model.
- `process_enhanced()`: Here we compute the `enhanced signal` ear model. We can make this method conditional on `process_reference()` having been executed.
To me, it is not an ideal structure, but I can't think of another one right now. @jonbarker68, any suggestions?
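As a starting point for discussion, the structure could look roughly like the skeleton below. The method names follow the proposal; everything else (the `n_bands` argument, the private attributes, and the `_band_outputs` placeholder that simply copies the signal into every band) is an assumption for illustration and does not reproduce the real ear model.

```python
import numpy as np


class EarModel:
    """Skeleton only: the signal processing bodies are placeholders."""

    def __init__(self, sample_rate: float, n_bands: int = 32) -> None:
        # Parameters that depend only on the sample rate would be built here.
        self.sample_rate = sample_rate
        self.n_bands = n_bands
        self._audiogram = None
        self._reference = None

    def set_audiogram(self, audiogram) -> None:
        # Parameters shared by reference and enhanced would be built here.
        self._audiogram = np.asarray(audiogram, dtype=float)
        # A new audiogram invalidates any previously processed reference.
        self._reference = None

    def process_reference(self, reference) -> np.ndarray:
        if self._audiogram is None:
            raise RuntimeError("call set_audiogram() first")
        self._reference = self._band_outputs(reference)
        return self._reference

    def process_enhanced(self, enhanced) -> np.ndarray:
        if self._reference is None:
            raise RuntimeError("call process_reference() first")
        output = self._band_outputs(enhanced)
        # The per-band alignment against self._reference would happen here.
        return output

    def _band_outputs(self, signal) -> np.ndarray:
        # Placeholder: one copy of the signal per band, shape (n_bands, n).
        return np.tile(np.asarray(signal, dtype=float), (self.n_bands, 1))
```

Raising an error when `process_enhanced()` is called too early makes the dependency on the reference explicit, so it cannot silently reuse stale state.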
Analysing the ear model, two functions account for almost all of the execution time: `envelope_align` and `gammatone_basilar_membrane`.
The next figure shows the execution time for 15-second signals. The total execution takes 8.3 seconds.
- `envelope_align` takes 2.6 seconds. However, it may not be necessary to use the whole signal for the alignment: maybe we can select one second to find the delay, which is not expected to be more than 200 samples.
- `gammatone_basilar_membrane` is the green box next to `envelope_align`, and it takes 1.7 seconds to execute.
This affects all three metrics: HAAQI, HASQI and HASPI.
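The idea of estimating the delay from a short segment with a bounded lag search could be sketched as follows. This is a plain cross-correlation on the raw samples, not the existing `envelope_align` code; `estimate_delay` and its parameters are hypothetical, and whether one second is representative enough for real signals would still need to be checked.

```python
import numpy as np


def estimate_delay(reference, processed, sample_rate,
                   segment_seconds=1.0, max_lag=200):
    """Estimate the delay of `processed` relative to `reference` in samples.

    Only the first `segment_seconds` of each signal are used, and only lags
    in [-max_lag, max_lag] are searched. Hypothetical helper.
    """
    n = min(len(reference), len(processed),
            int(segment_seconds * sample_rate))
    if n <= max_lag:
        raise ValueError("segment must be longer than max_lag")
    ref = np.asarray(reference[:n], dtype=float)
    proc = np.asarray(processed[:n], dtype=float)

    lags = np.arange(-max_lag, max_lag + 1)
    scores = np.empty(len(lags))
    for i, lag in enumerate(lags):
        if lag >= 0:
            # processed is late: compare ref[k] with proc[k + lag]
            scores[i] = np.dot(ref[: n - lag], proc[lag:])
        else:
            # processed is early: compare ref[k - lag] with proc[k]
            scores[i] = np.dot(ref[-lag:], proc[: n + lag])
    return int(lags[np.argmax(scores)])
```

The cost is proportional to the segment length times the number of lags, so it no longer grows with the duration of the full signal.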