cannam / expressive-means
Expressive Means Vamp Plugins
License: GNU General Public License v2.0
Dear Chris,
besides the discussion in #10, glide direction, range, and dynamics detection already work great, thank you very much! It is really only the "duration" and the "link" aspects that may need some tuning, both of which may be solved with the suggestions made in #10 (comment).
(Therefore, this issue is only for tracking the remaining aspects besides the ones in #10 .)
Dear Chris,
would it be reasonable to add a semantic adapter for our marvelous onset detector as well? Options and parameter specifications would be the same as for Articulation (but respecting only the onset parameters there, of course) – I could do the specifications myself if you like!
Dear Chris, thanks for the new volume development logic! It seems to work fine in principle; the issue here relates to the offset detection (see screenshot): right now, subsequent onset volumes are considered as well (so volumes that are actually decreasing get labelled as "de- and increasing"). Most of that will be solved as soon as we have a better-working offset logic, I think.
Anyhow, it seems reasonable to me to add one rule, particularly for the case of "constant" notes (where there is no offset, screenshot 2): Volume development is considered until note offset, but it stops 50 ms before next onset in any case. What do you think?
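The suggested rule could be sketched roughly like this (a minimal Python sketch; the function and parameter names are mine, not the plugin's):

```python
def volume_window_end(offset_s, next_onset_s):
    """End of the volume-development window for a note: the detected
    offset if there is one, but never later than 50 ms before the
    next onset (covering the "constant" case with no offset)."""
    hard_limit = next_onset_s - 0.050  # stop 50 ms before next onset
    if offset_s is None:               # "constant" note: no offset found
        return hard_limit
    return min(offset_s, hard_limit)
```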
Dear Chris,
currently, time instants are set only if a glide has been detected near an onset. Would it be possible instead to return instants at every onset (comparable to Articulation) and attribute an "N" in case no glide is allocated to the following IOI?
– This would make it much easier to align the portamento data to other data later on, e.g. to onset and articulation layer data exports. Thank you very much!
This is an issue for tracking any thoughts about the various "Debug" outputs that we have.
We probably ought to make a choice, for each of them: we could put each behind a WITH_DEBUG_OUTPUTS flag in the code and just hide it along with the other development outputs (or even remove it altogether).
No urgency to this; we can rearrange things at the end, and perhaps as we go through the remaining plugins we'll find other things that they have in common (as with Onsets). But if you have any thoughts as we go, jot them down here.
Dear Chris, please consider 2870807#commitcomment-112044956 – thanks! (And sorry.....!)
Dear Chris,
since the number of parameters has reached an extent which might be a little daunting for the user, I thought of offering preset combinations in a more comprehensible way. The idea is that the parameters stay as they are, but we facilitate preselections of convenient settings apt for various qualities of the audio rather than of the plugin's logic (so the user doesn't necessarily have to learn about a "noise time window" first in order to make use of the plugin). In a scheme (cf. the "Parameter window" model below):
“Plugin Parameters“ window layout (prompted when starting the plugin)
[start of "Plugin parameters" area: parameter preselection bundles]
[To enhance clarity, would it be possible to create a break and/or partition signs to clearly separate the "Custom…" area from the upper part? And also breaks between the respective parameter settings?]
Custom…
[= original parameters; see alterations in order of appearance. Each parameter gets a tick box, deactivated by default. If activated, the choice here supersedes the respective parameter's preselected setting within the bundles above.]
[end of "Plugin parameters" area]
(future work)
Goal:
"if sustain phase begins in the middle of a glide, reset it towards starting not before glide ends"
Originally posted by @FrithjofVollmer in #29 (comment)
Dear Chris,
...another 'tidying-up' question: I guess we overlooked the normalisation step discussed in #4 (comment) (it was part of that big new articulation logic) –
Within the "Processing" area of the plugin settings, we add a "Normalise audio" tick box on top (above the "audio frames" and "increment" settings), activated by default. It causes a standard alignment of the audio's maximum level towards 0 dBFS based on raw power, preceding analysis. (--> This will help with a number of other issues as well, such as false-negative on- and offsets caused by too-low levels.)
...since the outputs differ significantly without that normalisation (see below: the second recording is identical to the first but features a -31 dB level drop), could we still add this option, activated by default, for all plugins?
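For illustration, peak normalisation to 0 dBFS could look like this (a minimal sketch assuming plain peak alignment; the plugin's actual "raw power" measure may well differ):

```python
def normalise_peak(samples):
    """Scale the signal so its absolute peak sits at 0 dBFS (i.e. 1.0).
    A silent input is returned unchanged to avoid division by zero."""
    peak = max(abs(s) for s in samples)
    if peak == 0.0:
        return list(samples)
    return [s / peak for s in samples]
```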
Thanks, Chris!
Dear Chris, I'm afraid we need a better-working logic for the link aspect: right now, it almost always returns "2" (interconnecting glide), since it builds on pYin data gaps surrounding the glide (which almost never occur). We might instead be better off relating it to the pitch data accompanying the surrounding onsets. That is:
Pre-work: change the "Link threshold" parameter's [l] unit from ms to cents (preset: 50 cents).
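For reference, comparing the surrounding onsets' pitches in cents would amount to something like this (a sketch; `within_link_threshold` is a hypothetical name, not a plugin function):

```python
import math

def cents_between(f1_hz, f2_hz):
    """Interval between two frequencies in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f2_hz / f1_hz)

def within_link_threshold(f1_hz, f2_hz, threshold_cents=50.0):
    """Link rule sketch: the pitches count as linked if they differ
    by no more than the threshold (preset: 50 cents)."""
    return abs(cents_between(f1_hz, f2_hz)) <= threshold_cents
```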
This should be more yielding, I guess – thanks, Chris!
From email:
Finally, one last suggestion regarding the offset detector: to compensate for quiet passages with little difference between attack maximum and offset, could we try combining the raw power drop logic with a spectral drop detector for upper overtone frequencies?
That is, an offset is defined either if the raw power level falls [level drop threshold] below the sustain-begin level, or if all frequencies between 2 and 5 kHz apparent at sustain begin fall below [new parameter, preset: -70 dB].
To improve transparency for the user, these parameters could be renamed "Offset sensitivity: Raw power drop threshold" and "Offset sensitivity: Spectral drop threshold".
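The either/or rule above could be sketched like this (my reading of the suggestion; the function and argument names are assumptions):

```python
def is_offset(power_db, sustain_begin_db, level_drop_db,
              band_levels_db, spectral_drop_db=-70.0):
    """Either/or offset rule: raw power has fallen [level_drop_db] below
    the sustain-begin level, OR every 2-5 kHz component present at
    sustain begin has dropped below [spectral_drop_db]."""
    power_dropped = power_db <= sustain_begin_db - level_drop_db
    spectral_dropped = all(lvl < spectral_drop_db for lvl in band_levels_db)
    return power_dropped or spectral_dropped
```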
Dear Chris,
I guess we don't need the "Sustain phase" or the two "Offset sensitivity" parameters within Portamento and Vibrato (apart from one vibrato debug option), so they could be cleared from the dialogue box, right?
Thanks!
Dear Chris,
one interesting thing I found in Matthias' pYin: the plugin seems to provide its data 'right on the way', that is, I can see (and work with) the pitch curve even while the plugin is still processing. Is there a mode in SV, at least for low-level features, that allows plugins to provide data in 'real time' (i.e., returning results as soon as the analysis has passed the respective hop)?
If yes (and if it's not too time-consuming to implement), this would probably be a great thing for our plugins: for an audio file of 4 1/2 minutes in 44.1 kHz / 16 bit, they currently take 54 seconds of processing on my computer (which runs on an M1 chip, so colleagues with older processors may wait whole minutes). This probably doesn't sound particularly dramatic, but it becomes a real burden if various parameter settings have to be tested (i.e., 6–7 rounds of processing).
Thanks Chris! (On everything else, you'll receive an email soon!)
Hi Chris,
found a short time window plus sufficient WiFi on a German train to have a first look at the vibrato summary output. The layout looks great so far, thanks a lot! Regarding the analysis itself: do you think there may be some more options for debugging again (or would it take too much time)?
Take the first few bars of the Huberman example, for instance: based on manual measurement, they signify as
4Fn> / N / 4Fm> / N / N ... – instead, it currently returns:
...so at first glance it seems to me that the detectors for
Anyways, will have a deeper look into this on Tuesday! Thanks, so far!
From email:
We discussed at an early stage that if there are multiple vibrato elements within an IOI that are separated from each other (leaving a gap in between), only the succession closest to the onset should be considered. We tried to compensate for onset-crossing instances this way.
Now it turns out that, while we addressed the problem sufficiently by means of the "segmented" and "without glide" modes, this rule returns significantly incomplete results when it comes to singing in particular. Would it therefore be conceivable to tell the plugin to count all vibrato elements detected within an IOI (but to leave the gaps out of the rate calculation, of course)?
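Rate calculation over several separated elements, leaving out the gaps, could look like this (a sketch; the segment representation is my assumption):

```python
def vibrato_rate_hz(segments):
    """Mean vibrato rate across all elements within an IOI.  Each segment
    is (duration_s, n_cycles); the gaps between segments contribute
    neither duration nor cycles, so they don't dilute the rate."""
    total_dur = sum(d for d, _ in segments)
    total_cycles = sum(n for _, n in segments)
    return total_cycles / total_dur if total_dur > 0.0 else 0.0
```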
I think an example might be useful - this looks like a fiddly one to implement without a test case.
Dear Chris, last but not least, about the noisy onset detection: thanks for implementing the logic here! You hard-coded the noise ratio parameters now, right? Would it still be conceivable to keep them as flexible parameters, at least for "affricative", "plosive", and "fricative"? Also, "affricatives" are not recognised at all right now, which seems to be due to my over-sensitive preset suggestions.
Anyhow, the far bigger problem now is that as soon as an offset precedes, onsets are regularly classified as "noisy" (so even almost noiseless onsets get classified as "plosive", see screenshot). So the spectral rise logic may not be the best solution after all. I have to think about that again. If you don't have a better idea, I would suggest postponing this issue until next week (I'll have more time to reconsider then)... OK?
Type and index layer work fine, of course! Thank you!
Dear Chris, thanks for your idea to solve the early-onset problem via the function's derivative; it sounded plausible! However, even though the overall results seem a bit more promising, the problem is still present (see screenshot) – so what do you think about keeping it and complementing it with the hierarchy logic suggested earlier?
(--> spectral rise beats power rise: if a spectral rise onset follows within [minimum onset interval, i.e. 100 ms] of a raw power onset, return the spectral rise onset only; all other onset rules apply)
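The precedence rule in parentheses could be sketched as a merge over the two onset lists (a sketch under my reading of the rule; names are hypothetical):

```python
def merge_onsets(power_onsets_s, spectral_onsets_s, min_interval_s=0.100):
    """"Spectral rise beats power rise": if a spectral-rise onset follows
    within min_interval_s of a raw-power onset, drop the raw-power onset
    and keep the spectral-rise one; all other onsets are kept."""
    suppressed = {
        p for p in power_onsets_s
        if any(0.0 <= s - p <= min_interval_s for s in spectral_onsets_s)
    }
    kept = [p for p in power_onsets_s if p not in suppressed]
    return sorted(kept + list(spectral_onsets_s))
```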
(future work)
We might at some point evaluate whether our current glide acceptance rule works out: currently, each onset accepts one glide only, namely the one closest to that onset. Meanwhile, I have spotted a number of instances where musicians employ multiple glides per IOI (particularly in Jazz & Pop vocals).
Solutions could be (1) to accept the longest glide only, or (2) to set one Time Instant per glide. (I personally would prefer the first one, in order not to abandon the IOI reference.)
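Option (1) could be sketched like this (a minimal illustration; the function name and glide representation are my assumptions):

```python
def accept_glide(glides, ioi_start_s, ioi_end_s):
    """Option (1): of all glides whose start falls within the IOI,
    accept only the longest one.  Each glide is a (start_s, end_s) pair."""
    inside = [(s, e) for s, e in glides if ioi_start_s <= s < ioi_end_s]
    if not inside:
        return None
    return max(inside, key=lambda g: g[1] - g[0])
```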
(future work)
It would be worthwhile at some point to further refine the Articulation "Plosive" and "Fricative" thresholds to be more apt for instrument-specific characteristics: at the moment, analysis is based on (historical) violin thresholds only.
To do so, we might either:
(1) introduce a new parameter which factorises the "impulseNoiseRatioPlosive" and "impulseNoiseRatioFricative" parameters according to a per-instrument preset, or
(2, the other way around) integrate these parameters into the semantic "Signal type" settings and then redefine the "Sound quality" parameter as consisting of one factor each for "plosive" and "fricative" (comparable to the "Reverb" parameter).
As a first lead (just so I don't lose the numbers): optimal thresholds for "Vocal (Jazz & Pop)" seem to be 33% (Plosive) / 14% (Fricative).
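Option (1) might eventually boil down to a per-instrument preset table along these lines (a hypothetical sketch; only the vocal figures come from the measurement above, and the key names follow the existing `impulseNoiseRatio…` identifiers):

```python
# Hypothetical per-instrument noise-ratio presets (values in percent).
# Only the "Vocal (Jazz & Pop)" row is backed by measurement so far.
NOISE_RATIO_PRESETS = {
    "Vocal (Jazz & Pop)": {
        "impulseNoiseRatioPlosive": 33,
        "impulseNoiseRatioFricative": 14,
    },
}

def noise_ratios_for(instrument):
    """Return the preset pair for an instrument, or None if we have no
    measured thresholds for it yet (fall back to the current defaults)."""
    return NOISE_RATIO_PRESETS.get(instrument)
```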
Also see #24 .
Dear Chris,
one thing which may be aimed at the SV update rather than at the plugin: when exporting the summary output, SV writes the data with rather inconvenient separators (see screenshot: CSV file on the left, TXT file on the right), which will make further processing hard since the values appear mixed up.
Is it possible to set one column for each break within the string instead? (1 – time stamp, 2 – duration in ms, ...) – Also, the time stamps appear doubled (as part of both the time instant and the string) and merged into each other, so if there is a way to omit (or hide) the second column / the first value within the string (= the time stamp), this would certainly avoid confusion... What do you think?
[Mail from Chris, Feb. 16th:]
There is a significant problem with the glide detection as it stands, which is that whenever a new note occurs, a glide is usually detected, even if the note onset was quite distinct. This is because the new note's pitch values start to feed in to the right edge of the filter's moving window, so the old note's pitch values become increasingly different from the average within the window. The glide detection duration is (quite reasonably) shorter than half the filter window length, so a glide is reported.
Using a median filter rather than a mean filter (as in the Median+ output) helps a little, but only a little.
My instinctive feeling (looking at the candidate hops output plot) is that we might get decent results by looking at these values and saying: as soon as the value goes over the threshold, start to track it - but don't consider a glide to have begun until the value drops again (which suggests that the pitch is converging toward a target). After that, consider the glide as continuing regardless of whether the values rise or fall, until they finally fall below the threshold and the glide ends.
However, I'm just imagining this could work based on a quick review of one piece (the Huberman). Perhaps not all glides behave in that way. What do you think? And let me know if there is a better interpretation that I have simply overlooked.
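The proposal could be sketched as a small state machine over the candidate-hop values (my reading of the rule above, not a tested implementation; names are mine):

```python
def detect_glides(values, threshold):
    """Track from the first hop above the threshold, but only confirm a
    glide once the value has dropped again (pitch converging on a target);
    the glide then runs until the value falls below the threshold."""
    glides = []
    start = None        # index where values first crossed the threshold
    confirmed = False   # has the value dropped since crossing?
    prev = None
    for i, v in enumerate(values):
        if start is None:
            if v > threshold:
                start, confirmed = i, False
        elif v < threshold:
            if confirmed:
                glides.append((start, i - 1))  # glide ends here
            start = None
        elif prev is not None and v < prev:
            confirmed = True                   # value dropped: a real glide
        prev = v
    return glides
```

Under this sketch, a window that merely rises over the threshold and then crashes straight below it (the new-note case described above) never yields a glide, because no intermediate drop confirms convergence.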
There still seems to be some minor problem with the pitch detection logic. Consider the pitch detection curve (green; screenshot: 1988 Brendel_Schubert recording):
It shows a rise to 6 cents, then falls to 0.4 cents. Even though both onset layers have considerably higher "Onset sensitivity: Pitch" thresholds (15 and 9999999 cents, respectively), an onset is detected. This particularly seems to happen in instances where the pitch track is interrupted. It looks to me as if something isn't properly connected here..?
(--> According to the logic, the function has to exceed the "Onset sensitivity: Pitch" threshold first for at least the duration of the "Minimum onset interval" and then fall below this threshold again to define a new pitch onset)
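As I understand it, the rule in parentheses amounts to the following (a sketch over analysis hops; the function name and hop-based units are my assumptions):

```python
def pitch_onsets(diff_cents, threshold_cents, min_interval_hops):
    """Report an onset only when the pitch difference function has stayed
    above the threshold for at least min_interval_hops consecutive hops
    and has then fallen below the threshold again."""
    onsets = []
    above_since = None
    for i, d in enumerate(diff_cents):
        if d > threshold_cents:
            if above_since is None:
                above_since = i
        else:
            if above_since is not None and i - above_since >= min_interval_hops:
                onsets.append(i)  # onset defined at the fall below threshold
            above_since = None
    return onsets
```

A 6-cent excursion against a 15-cent threshold would never trigger under this reading, which is why the detected onset looks like a wiring problem rather than a parameter problem.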
(future work:)
Pitch detection doesn't make sense when using the onset detectors for piano music (the spectral rise detector performs far better there) or, in particular, for percussion. I have bridged the problem temporarily by setting absurdly high pitch sensitivity parameter values as defaults. However, (especially) when working with the "advanced" outputs, a simple switch to bypass pitch detection would presumably be a clearly preferable and more elegant option...
Dear Chris,
it looks to me that for this plugin, we may thin out the 'semantic' version even a bit further – since aspects like "sound quality", "overlap" and (eventually) "reverb" do not matter for the pYin function. If you agree with my suggestions in #10 (comment), the semantic presets could then look somewhat like this:
(...as with the semantic adapter for Articulation, the parameters not mentioned in the overview above are left at their presets as given in the parameter listing.)
Besides that, the "Sustain phase" and "Offset" parameters could be removed from the "advanced" plugin version as well, I guess (as could the "Onset proximity" parameter, if we decide to drop it)!
Dear Chris,
…the SV update, and the summary output in particular, works great! Thank you very much, this looks gorgeous.
The only issue here is that the volume indication (esp. the max volume) sometimes seems to be inaccurate: e.g., in the case of the screenshot (Rose recording) it should show a rise (of about +1 dB) but indicates a max of -0.21 dB instead. Maybe this is due to the raw power curve maximum prompting some false arithmetic?
(future work:)
Unlike in instrumental sounds, plosive sounds in singing usually do not merge directly into / overlap with the associated tones – that is, they usually come with some gap before the pitch (essentially being two consecutive sounds). To address that, we would need to
(1) decide whether or not a consonant and its consecutive pitch should be regarded as one or two onsets;
(2) modify the Noise time window parameter accordingly (i.e., it likely would have to be lengthened).
Dear Chris,
while putting together the preset bundles yesterday, I found a small bug in my pitch detector logic: in the conception (p. 3, step 2.1), I wrote that "subsequent onsets require [pitch difference threshold] to be exceeded for at least the duration of [minimum onset interval]". The idea behind this was to prevent vibratos from causing onsets; however, I found that linking it to the minimum onset interval causes the detector to find nothing at all beyond a certain value (>150 ms), as pitch difference curve peaks for "real" tone steps pass by faster than that.
Since there are hardly any vibratos in musical performance that drop below 4.5 Hz (that is, 120 ms for half a period plus some margin), would it be possible to do something like "subsequent onsets require [pitch difference threshold] to be exceeded for at least the duration of [minimum onset interval], but not exceeding 120 ms"?
This way, we prevent "vibrato onsets" and at the same time the minimum onset interval can be used the way it is actually intended. What do you think?
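The capped rule would reduce to a one-liner (a sketch; the function name is mine):

```python
def required_hold_s(minimum_onset_interval_s, vibrato_cap_s=0.120):
    """How long the pitch difference threshold must stay exceeded before a
    subsequent onset may be defined: the minimum onset interval, capped at
    120 ms so that peaks of real tone steps are never filtered out."""
    return min(minimum_onset_interval_s, vibrato_cap_s)
```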
All the best,
Frithjof
Dear Chris,
while searching for portamento and vibrato adapter presets apt for singing, something curious related to pitch-change onset detection occurred to me again: please consider this recording; it's a new one in the test material folder https://www.icloud.com/iclouddrive/0b1ZehCHibEWhr1c4R7MhsHlQ#Expressive_Means_Plugins_(sharing_folder) named "1902 Caruso". The instance below is at 14.74 but may be found at multiple other places:
...for some reason, the "Onsets" output finds a pitch change at the end of notes, even though the pitch difference function (red) does not fall below the threshold preset of 15 cents (at this specific point, it is at 34 cents). Based on the logic, it shouldn't identify the onset before falling below that threshold. Moreover, changing the threshold (even to very low numbers) doesn't have a significant effect on the results.
Do you have an idea what I am missing here? (Maybe two conflicting rules I am not aware of – or is there something we missed so far?)