yikai-liao / symusic

A cross-platform, note-level MIDI decoding library with lightning speed, based on minimidi.

Home Page: https://yikai-liao.github.io/symusic/

License: MIT License

symusic's Issues

merging tracks

Hi! Thanks for the great library!
I have a feature request that might already be there, but I didn't find it: is it possible to merge all tracks in a score object into a single track? I'm aware this could create issues with the same pitch being played at the same time in different tracks, but for several of my files it should work.
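
A minimal sketch of what such a merge could look like, assuming notes are plain records. SimpleNote and merge_tracks are hypothetical names used for illustration only and are not part of symusic's API. Notes with the same pitch and onset coming from different tracks are simply kept side by side.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SimpleNote:
    # Hypothetical stand-in for a symusic note; not the library's class.
    time: int
    duration: int
    pitch: int
    velocity: int


def merge_tracks(tracks: List[List[SimpleNote]]) -> List[SimpleNote]:
    """Concatenate the notes of every track and sort them by onset, then pitch."""
    merged = [note for notes in tracks for note in notes]
    merged.sort(key=lambda n: (n.time, n.pitch))
    return merged


piano = [SimpleNote(0, 240, 60, 80), SimpleNote(480, 240, 64, 80)]
violin = [SimpleNote(240, 240, 67, 70), SimpleNote(480, 240, 64, 70)]
merged = merge_tracks([piano, violin])
print([(n.time, n.pitch) for n in merged])
```

Since Python's sort is stable, duplicates keep their track order, so a later pass could still decide which of two colliding notes to drop.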

Writing back pedal and control change

Pedals are actually stored twice: in both the controls and the pedals of a track.

If someone changes controls or pedals, they might become inconsistent with each other.

Currently, symusic just writes all the controls back and ignores the pedal events in pedals.

One solution is to remove all the corresponding control events from controls, but I'm not sure whether that is a good design.

What's your opinion @Natooz ?
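
To make the "remove the corresponding control events" idea concrete, here is a rough sketch. It assumes controls are plain (time, number, value) tuples and follows the common MIDI convention that sustain pedal (CC 64) values of 64 and above mean "pedal down". split_pedals is a hypothetical helper, not a symusic function.

```python
from typing import List, Tuple

SUSTAIN_CC = 64  # MIDI controller number of the sustain pedal

# Hypothetical plain-tuple events: (time, controller_number, value)
Control = Tuple[int, int, int]


def split_pedals(controls: List[Control]) -> Tuple[List[Control], List[Tuple[int, int]]]:
    """Return (controls without CC64, pedal spans as (time, duration))."""
    others: List[Control] = []
    pedals: List[Tuple[int, int]] = []
    pedal_on = None  # onset time of the currently held pedal, if any
    for time, number, value in sorted(controls):
        if number != SUSTAIN_CC:
            others.append((time, number, value))
        elif value >= 64 and pedal_on is None:
            pedal_on = time
        elif value < 64 and pedal_on is not None:
            pedals.append((pedal_on, time - pedal_on))
            pedal_on = None
    return others, pedals


controls = [(0, 64, 127), (0, 7, 100), (480, 64, 0), (960, 64, 127), (1200, 64, 0)]
others, pedals = split_pedals(controls)
print(others, pedals)
```

With this split, pedals would be the single source of truth and CC 64 events would be regenerated from it at write time. Note that the sketch discards intermediate pedal values (half-pedaling), which may or may not be acceptable.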

SoA (Struct of Array) Interface Selection

SoA support would be an important feature for symusic, since it is more suitable for AI applications than the current AoS (Array of Struct) interface.

An SoA interface could enable many flexible conversion, resampling, quantization and other operations (#10).
These functions could take full advantage of numpy's ecosystem (such as numba) and be very fast.

It's also important that we don't need to introduce more time_unit types (like beat) in symusic, which would make the general purpose code in c++ part more and more complex.

However, the interface for SoA is still to be determined. I'd like to hear your advice @Natooz @ilya16 . And of course, other design options are welcome!

Here, I will list some possible options I could think of.

Option 1: Dict of Numpy Array

In this case, we won't introduce new classes in symusic, but only use dict and numpy.ndarray.

from symusic import Score, Note, ControlChange
s = Score(...)

# get the soa of controls
# because "controls" is not a python list, but a c++ vector
# we could bind a .numpy() method for it
controls_arr: Dict[str, np.ndarray] = s.controls.numpy()
notes_arr: Dict[str, np.ndarray] = s.tracks[0].notes.numpy()

# create traditional AoS from numpy array
# Here, we could utilize the existing factory class for events like notes 
s.controls = ControlChange.from_numpy(controls_arr['time'], controls_arr['number'], controls_arr['value'])
s.tracks[0].notes = Note.from_numpy(notes_arr['time'], notes_arr['duration'], notes_arr['pitch'], notes_arr['velocity'])
# or we could use ** to shorten this
s.controls = ControlChange.from_numpy(**controls_arr)
s.tracks[0].notes = Note.from_numpy(**notes_arr)
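
The round trip that Option 1 describes can be sketched without any binding code. In the sketch below, plain Python lists stand in for numpy arrays, and SimpleNote, to_soa and from_soa are hypothetical names rather than symusic's API. The point is only that matching the dict keys to the factory's parameter names makes the ** shortcut work.

```python
from typing import Dict, List, NamedTuple


class SimpleNote(NamedTuple):
    # Hypothetical stand-in for symusic's Note
    time: int
    duration: int
    pitch: int
    velocity: int


def to_soa(notes: List[SimpleNote]) -> Dict[str, List[int]]:
    """AoS -> SoA: one column per field."""
    return {field: [getattr(n, field) for n in notes] for field in SimpleNote._fields}


def from_soa(time, duration, pitch, velocity) -> List[SimpleNote]:
    """SoA -> AoS; parameter names match the dict keys so ** unpacking works."""
    return [SimpleNote(*row) for row in zip(time, duration, pitch, velocity)]


notes = [SimpleNote(0, 118, 69, 117), SimpleNote(240, 238, 76, 62)]
columns = to_soa(notes)
columns["pitch"] = [p + 12 for p in columns["pitch"]]  # column-wise edit
restored = from_soa(**columns)
print(restored)
```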

Option 2: New SoA Classes in C++

In this case, we will define new classes for SoA, and use them in symusic. It seems more object-oriented.

The problem is, they will be defined 3 times, because of time unit. (The same reason for NoteTick, NoteQuarter and NoteSecond)
We will define a Union for them in symusic.types

from symusic import Score
import symusic.types as smt

# get the soa of controls and notes
controls_arr: smt.ControlChangeArr = s.controls.numpy()
notes_arr: smt.NoteArr = s.tracks[0].notes.numpy()

# convert them back to AoS
s.controls = controls_arr.list()
s.tracks[0].notes = notes_arr.list()

Also, although we have switched to nanobind, which has a much smaller overhead when accessing class attributes, the overhead is still there. Note that this overhead is almost constant, so it's not a problem for "ms level" functions.

So unless it is necessary, I would not recommend creating new classes in C++. (Well, this overhead matters more for the AoS part than for the SoA part.)

Here is a benchmark for those tiny operations.

lib	Create a Note	Access Note.pitch	Note.pitch += & -=
python dict	66 ns	17 ns	69.9 ns
miditoolkit	162 ns	15.2 ns	48.1 ns
NamedTuple	175 ns	17.4 ns	tuple is const
symusic[nanobind]	251 ns	27.8 ns	110 ns
symusic[pybind11]	791 ns	238 ns	1070 ns
nb.jitclass in py	5.6 µs	37.8 ns	656 ns

Option 3: New SoA Classes in Python

In this case, we define the new classes in Python. This is more flexible and Python-native (no overhead).

But these classes can't be called from C++ (at least I don't know how to achieve this now; maybe it's possible).
So we won't get the .numpy() function here.

from symusic import Score, NoteArr, ControlChangeArr

controls_arr = ControlChangeArr(s.controls)
notes_arr = NoteArr(s.tracks[0].notes)

# convert them back to AoS
s.controls = controls_arr.list()
s.tracks[0].notes = notes_arr.list()

Suppression of Nanobind leak messages

In the past week, I've been getting a huge increase in the number of nanobind leak messages. When trying to train a model (which is utilizing symusic via Miditok) it will finish with >1,000 leak messages from nanobind.

It makes troubleshooting rather difficult, as it occupies so much of the terminal's recent history that I need to go through logs just to see what are the actual errors in my python code.

From this documentation I see there is an option to suppress the messages on the C++ side. Any chance this could be done for the public releases?

MIDI with negative beginning tick

Hey, 👋

I encountered a case where a specific MIDI file from the Lakh dataset has its first tick before 0.
10e903c3aa7a6b6c5ce5a74d5cfb8702.mid.zip

When inspecting tracks, the 14th (idx 13) note of the 19th (idx 18) track has a time at -4465856

for ti, track in enumerate(score.tracks):
    for ni, note in enumerate(track.notes):
        if note.time < 0:
            test = 0

I didn't have time to dig further yet, so I'll just leave this open.

[Question] Performance improvement compared to mido

Hello,

What is the real performance improvement factor compared to mido? I see 100x in the first sentence of the README, but 1021x in the last table of the README. Do those values refer to different things?

Thank you for your work, this library seems very promising!

nanobind v2.0.0 breaks building compatibility

Describe the Bug
As stated in the title, since nanobind released v2.0.0 last week, symusic no longer builds successfully under default settings.

To Reproduce
Open pyproject.toml, pin the nanobind version specifier to "==2.0.0", and run pip wheel ./symusic to build a wheel for symusic. The build is expected to fail at symusic/py_src/bind_vector_copy.h:67:66 with error: ‘iterable_type_id’ is not a member of ‘nanobind::detail’; did you mean ‘iterable_check’? along with similar errors about missing members and missing function overrides.

No error is reported and everything is fine with nanobind releases v1.8.x and v1.9.x.

Expected Behavior
A successful build.

Possible Root of Cause and Solution
Since v2.0.0, iterable_type_id no longer resides in the nanobind::detail namespace. As a result, the build fails whenever pip picks the latest nanobind v2.0.0 as a build dependency, which it does because v2.0.0 satisfies the version specifier (>=1.8.0) in symusic's pyproject.toml.

This might not be a problem for systems with a GNU libc modern enough to install wheels directly from PyPI. However, those who have to build symusic from source should be aware of this incompatibility, which pip does not surface.

System Info

OS: Linux x86-64
Kernel version: 4.9.151
gcc version: 13.2.0
GNU libc version: 2.24

repr for event and list

For higher information density, I'm considering changing the repr of event lists (like NoteTickList), as shown below.

When showing a list, I tend to hide the argument names, while the names are kept when showing a single event.

Note(time=0, duration=118, pitch=69, velocity=117)
[Note(0, 118, 69, 117), Note(0, 118, 77, 62), Note(240, 238, 76, 62), Note(480, 238, 72, 62), Note(720, 238, 67, 62)]

And for track and score, I will turn to show the summary, following miditoolkit, like this:

Score(ttype=Tick, tpq=480, begin=780, end=1431755, tracks=51, notes=60411, time_sig=97, key_sig=97, markers=97, lyrics=0)
Track(ttype=Tick, program=72, is_drum=false, name=PICCOLO, notes=1053)

And, when showing TrackList, the arguments' names won't be removed

@Natooz Do you have any suggestions?

Bug when multithreading with macOS since 0.3.4

Hi there 👋,

I spotted a bug causing Python to crash (Fatal Python error: Segmentation fault) when running multithreading operations (pytest-xdist) with the latest version of symusic on macOS (intel).

You can find logs of the issue here Natooz/MidiTok#142
I don't know what can be causing the issue exactly, I didn't check what changes have been made to the code.
It seems to only happen on "What a Fool Believes.mid".
I was able to reproduce locally, even though only one worker out of 8 crashed.

Unable to build on linux

pip install symusic worked fine on my local system, but isn't building on a linux google VM.

Python:
Python 3.9.18

PyPy:
Python 3.7.10 (7.3.5+dfsg-2+deb11u2, Nov 01 2022, 20:16:36) [PyPy 7.3.5 with GCC 10.2.1 20210110

It seems to fail during the CMake build process. Pasting just the end of the error log:

tor<unsigned char, std::allocator<unsigned char> >]’
      /var/tmp/pip-install-bk3m82wb/symusic_80be9b57712b46c19076677986767473/src/io/zpp.cpp:130:1:   required from here
      /var/tmp/pip-install-bk3m82wb/symusic_80be9b57712b46c19076677986767473/3rdparty/zpp_bits/zpp_bits.h:80:30: 
error: ‘__builtin_bit_cast’ was not declared in this scope; did you mean ‘__builtin_strcat’?
         80 |     return __builtin_bit_cast(ToType, from);
            |            ~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~
            |            __builtin_strcat
      [6/63] Building CXX object CMakeFiles/symusic.dir/src/utils.cpp.o
      [7/63] Building CXX object CMakeFiles/symusic.dir/src/score.cpp.o
      [8/63] Building CXX object CMakeFiles/symusic.dir/src/io/midi.cpp.o
      [9/63] Building CXX object CMakeFiles/symusic.dir/src/conversion.cpp.o
      [10/63] Building CXX object CMakeFiles/symusic.dir/src/track.cpp.o
      ninja: build stopped: subcommand failed.
      
      *** CMake build failed

Any help would be greatly appreciated! This is also preventing me from updating my Miditok installation.

Segmentation Fault when reading Score from MIDI files

I am getting segmentation faults when attempting to read Scores from this database:
https://zenodo.org/records/5120004#.Yhxr0-jMJBA

I am simply trying to read in all the MIDI files as scores when it occurs:

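from pathlib import Path

from symusic import Score
from tqdm import tqdm

score_dict = {}
midi_files = list(Path(input_dir).glob("**/*.mid"))

fail_count = 0
for mf in tqdm(midi_files):
    try:
        name = str(Path(mf).stem)
        score = Score(mf)
    except Exception as e:
        fail_count += 1  # count every failure instead of overwriting the counter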

Python simply returns:
Segmentation fault

After re-running multiple times, it seems to happen on different files each time.
Following this post I tried increasing my stack memory to 12,000 but that did not fix the issue

Interface design for pianoroll representation

We are designing an interface for the pianoroll representation. Existing libraries take different approaches:

miditoolkit:
- Only accepts List[Note] as input. (I think that's weird, because miditoolkit itself has containers such as Instrument.)
- resample_factor, resample_method, time_portion, keep_note_with_zero_duration are provided as arguments, because its division is tpq. (I think accepting Score<Tick>, Track<Tick>, Note<Tick> should be enough.)
- velocity_threshold is provided as an argument. (I think we can ignore it and implement filter_notes instead.)
pypianoroll:
- Accepts encode_velocity as an argument to decide whether to binarize the pianoroll.
- Not many arguments.
muspy:
- Basically the same as pypianoroll.

I think the interface must satisfy:

- Provide encode_velocity as an argument to decide whether to binarize the pianoroll.
- Let the user rely on the global resample interface instead of a resolution-related argument.
- Provide pitch_range and pitch_offset to clip the pitch range.
- Modes of the pianoroll (onset, offset, frame): accept them as a List[str]?

Here are my considerations, do you have ideas? @Natooz @Yikai-Liao
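
To make the discussion more concrete, here is one possible shape for such a function, written as a pure-Python sketch. The signature, the nested-list return type and the (time, duration, pitch, velocity) tuples are all assumptions for illustration, not an existing symusic interface. Times are assumed to be already resampled to frames, and pitch_offset is left out for brevity.

```python
from typing import List, Sequence, Tuple

# Hypothetical note tuples: (time, duration, pitch, velocity), times in frames
Note = Tuple[int, int, int, int]


def pianoroll(
    notes: Sequence[Note],
    modes: Sequence[str] = ("onset", "frame"),
    pitch_range: Tuple[int, int] = (0, 128),
    encode_velocity: bool = True,
) -> List[List[List[int]]]:
    """Return a nested list indexed as [mode][pitch - low][frame]."""
    low, high = pitch_range
    # One extra frame so that the offset of the last note has a slot.
    length = max((t + d for t, d, _, _ in notes), default=0) + 1
    roll = [[[0] * length for _ in range(high - low)] for _ in modes]
    for time, duration, pitch, velocity in notes:
        if not low <= pitch < high:
            continue  # clip notes outside the requested range
        value = velocity if encode_velocity else 1
        for m, mode in enumerate(modes):
            if mode == "onset":
                frames = [time]
            elif mode == "offset":
                frames = [time + duration]
            elif mode == "frame":
                frames = range(time, time + duration)
            else:
                raise ValueError(f"unknown mode: {mode}")
            for f in frames:
                roll[m][pitch - low][f] = value
    return roll


notes = [(0, 2, 60, 100), (1, 2, 61, 50)]
roll = pianoroll(notes, modes=("onset", "frame"), pitch_range=(60, 62), encode_velocity=False)
print(roll)
```

A real implementation would return a numpy array of shape (modes, pitches, frames); the nested list only mirrors that layout.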

MIDI files with simultaneous note on/off ticks result in incorrect note durations

Describe the bug
Notes with simultaneous note on and note off events (zero duration) are parsed incorrectly. The note off message is ignored and the duration is extended until the next note off event.

To Reproduce
An example MIDI file that has such corrupted messages: https://github.com/fosfrancesco/asap-dataset/blob/master/Bach/Fugue/bwv_883/KaiRuiR03.mid

For example, there is a zero-length note with pitch 72 at tick 114351. Printing all events with pitch 72:

mido 1.2.10

note_on channel=0 note=72 velocity=88 time=67
note_off channel=0 note=72 velocity=64 time=25
note_on channel=0 note=72 velocity=88 time=16
note_off channel=0 note=72 velocity=56 time=28
note_on channel=0 note=72 velocity=81 time=11
note_off channel=0 note=72 velocity=65 time=64
note_on channel=0 note=72 velocity=89 time=14
note_off channel=0 note=72 velocity=63 time=2
note_on channel=0 note=72 velocity=90 time=80
note_off channel=0 note=72 velocity=64 time=3
note_on channel=0 note=72 velocity=77 time=16
note_off channel=0 note=72 velocity=63 time=9
note_on channel=0 note=72 velocity=85 time=7
note_off channel=0 note=72 velocity=64 time=10
note_on channel=0 note=72 velocity=90 time=5
note_off channel=0 note=72 velocity=59 time=39
note_on channel=0 note=72 velocity=35 time=17
note_off channel=0 note=72 velocity=127 time=0
note_on channel=0 note=72 velocity=89 time=2
note_off channel=0 note=72 velocity=13 time=14

miditoolkit 1.0.1

Note(velocity=88, pitch=72, start=28440, end=28739), duration=299
Note(velocity=88, pitch=72, start=75032, end=75159), duration=127
Note(velocity=81, pitch=72, start=75966, end=76030), duration=64
Note(velocity=89, pitch=72, start=77486, end=77568), duration=82
Note(velocity=90, pitch=72, start=78786, end=78941), duration=155
Note(velocity=77, pitch=72, start=87932, end=87994), duration=62
Note(velocity=85, pitch=72, start=90025, end=90085), duration=60
Note(velocity=90, pitch=72, start=113164, end=113263), duration=99
Note(velocity=35, pitch=72, start=114351, end=114351), duration=0
Note(velocity=89, pitch=72, start=126793, end=127634), duration=841

symusic 0.4.5

Note(time=28440, duration=299, pitch=72, velocity=88)
Note(time=75032, duration=127, pitch=72, velocity=88)
Note(time=75966, duration=64, pitch=72, velocity=81)
Note(time=77486, duration=82, pitch=72, velocity=89)
Note(time=78786, duration=155, pitch=72, velocity=90)
Note(time=87932, duration=62, pitch=72, velocity=77)
Note(time=90025, duration=60, pitch=72, velocity=85)
Note(time=113164, duration=99, pitch=72, velocity=90)
Note(time=114351, duration=13283, pitch=72, velocity=35)

Expected behavior
I think events for notes with same note on/off should at least be parsed as notes with 0 duration (as done in miditoolkit) or ignored as they have no meaning. Giving them a duration of 1 tick is probably not the right choice.
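
For reference, a minimal sketch of a pairing routine that yields 0-duration notes in this situation. It assumes events are already flattened to (absolute_tick, kind, pitch, velocity) tuples in file order and pairs each note off with the oldest pending note on of the same pitch. pair_notes is a hypothetical helper and does not reflect how symusic's parser is written.

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Tuple

# Hypothetical flat events in file order: (absolute_tick, kind, pitch, velocity)
Event = Tuple[int, str, int, int]


def pair_notes(events: List[Event]) -> List[Tuple[int, int, int, int]]:
    """Pair note_on/note_off per pitch (FIFO); returns (time, duration, pitch, velocity)."""
    pending: Dict[int, Deque[Tuple[int, int]]] = defaultdict(deque)
    notes = []
    for tick, kind, pitch, velocity in events:
        if kind == "note_on" and velocity > 0:
            pending[pitch].append((tick, velocity))
        elif pending[pitch]:  # note_off, or note_on with velocity 0
            start, on_velocity = pending[pitch].popleft()
            notes.append((start, tick - start, pitch, on_velocity))
    notes.sort()
    return notes


events = [
    (114351, "note_on", 72, 35),
    (114351, "note_off", 72, 127),  # same tick as its note_on
    (114353, "note_on", 72, 89),
    (114367, "note_off", 72, 13),
]
print(pair_notes(events))
```

Because events are consumed in file order, the note off at tick 114351 closes the note opened on the same tick instead of being skipped.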

Conversion between ticks and seconds

Hi! Many thanks for developing Symusic, a really great addition to the community.

I've been using miditoolkit for MIDI related processing but plan to switch to Symusic as it's really much faster. The only thing I miss now is the conversion between ticks and seconds. I need this for music transcription, alignment and synchronization tasks.

What are your plans for implementing the conversion of event times between ticks/quarters and seconds? Natooz/MidiTok#112 (comment)

In miditoolkit there are get_tick_to_time_mapping and _get_tick_to_second_mapping methods that create an array with indices providing a map between tick positions in MIDI to their time positions in seconds. It's very memory inefficient, but allows you to convert any tick to seconds.

I propose to have in Symusic a function that accepts a time in ticks/quarters and returns the time in seconds. And the reverse function (second -> tick/quarter). Converting the whole score to/from seconds is nice, but being able to map any point in time between time domains is also a needed feature.

Possible implementation: having precomputed times in ticks and seconds for all Score tempos, for any arbitrary tick/second we can find the closest tempo and compute the delta shift using the tempo.
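
A small sketch of that idea, assuming tempos are given as (tick, microseconds per quarter note) pairs as in MIDI set-tempo events. TempoMap and its method names are hypothetical, not an existing symusic interface. The cumulative seconds at each tempo change are precomputed once, and each query is a binary search plus one multiplication.

```python
from bisect import bisect_right
from typing import List, Tuple


class TempoMap:
    """Maps ticks <-> seconds from (tick, microseconds-per-quarter) tempo changes."""

    def __init__(self, tempos: List[Tuple[int, int]], tpq: int):
        tempos = sorted(tempos)
        if not tempos or tempos[0][0] > 0:
            tempos.insert(0, (0, 500_000))  # MIDI default tempo: 120 qpm
        self.tpq = tpq
        self.ticks: List[int] = []
        self.us_per_quarter: List[int] = []
        self.seconds: List[float] = []
        elapsed, prev_tick, prev_us = 0.0, 0, tempos[0][1]
        for tick, us in tempos:
            elapsed += (tick - prev_tick) * prev_us / (tpq * 1_000_000)
            self.ticks.append(tick)
            self.us_per_quarter.append(us)
            self.seconds.append(elapsed)
            prev_tick, prev_us = tick, us

    def tick_to_second(self, tick: float) -> float:
        i = max(bisect_right(self.ticks, tick) - 1, 0)
        delta = tick - self.ticks[i]
        return self.seconds[i] + delta * self.us_per_quarter[i] / (self.tpq * 1_000_000)

    def second_to_tick(self, second: float) -> float:
        i = max(bisect_right(self.seconds, second) - 1, 0)
        delta = second - self.seconds[i]
        return self.ticks[i] + delta * self.tpq * 1_000_000 / self.us_per_quarter[i]


tempo_map = TempoMap([(0, 500_000), (960, 250_000)], tpq=480)
print(tempo_map.tick_to_second(1440), tempo_map.second_to_tick(1.25))
```

Memory use grows with the number of tempo changes rather than with the length of the file, unlike a per-tick lookup array.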

Inconsistency with Score when writing to Audio

I want to do the following (in the context of a gradio interface):

Take in a MIDI file, and write it out immediately to audio
Process the MIDI file, (via Miditok/neural net) and save this as a new MIDI file
Convert these (new) MIDI files back into symusic Score
Convert these processed scores into audio as well

The new MIDI scores are not being 'allowed' by the symusic synthesizer.render function.

Original MIDI (this works fine):

s = Score(midi_file)
audio = synth.render(s, stereo=True)

print(type(s)) returns: 
<class 'symusic.core.ScoreTick'>

New MIDI (this breaks):

gen_midi.dump_midi(outpath)
s = Score(outpath)
gen_audio = synth.render(s, stero=True)

print(type(s)) returns: 
Score(ttype=Tick, tpq=480, begin=0, end=7200, tracks=1, notes=27, time_sig=1, key_sig=0, markers=0, lyrics=0)

I've verified that the new MIDI file is in fact valid.
Sorry if I'm missing something obvious. Why does the second 'Score' return a different type?

Incorrect/inconsistent treatment of program change in dump_midi

Hi there!

I have realized a strange behaviour when writing a Score containing multiple tracks to a MIDI file. General MIDI instruments (programs) don't seem to be interpreted correctly by different sequencers.

Here is a minimal example that should reproduce the issue:

from symusic import Note, Score, Tempo, TimeSignature, Track

score = Score(16)
score.tempos.append(Tempo(time=0, qpm=120))
score.time_signatures.append(TimeSignature(0, 4, 4))
score.tracks.append(Track(name='piano', program=0, notes=[Note(0, 16, 60, 64), Note(16, 16, 64, 64), Note(32, 16, 67, 64), Note(48, 32, 72, 64)]))
score.tracks.append(Track(name='violin', program=40, notes=[Note(16, 16, 60, 64), Note(32, 16, 64, 64), Note(48, 32, 67, 64)]))
score.dump_midi("test.mid")

This should be synthesized by two different general MIDI sounds. However, in Reaper and Windows Media Player, only the first note is rendered with a piano sound and all subsequent notes (of both tracks) are rendered as violin. In Audacity on the other hand, only piano is used as a sound for all notes.

I assume this has to do with the way symusic writes the program change events. Unfortunately I don't have enough understanding of the code base to debug further.

Do you have any suggestion/idea what goes wrong here?

Adding a function to easily modify the event times at a score level (like `midi.adjust_times` in PrettyMIDI)

Is your feature request related to a problem? Please describe.
I would like to be able to speed-up or slow down a MIDI file easily, by modifying note timings (not only the tempo events).

Describe the solution you'd like
PrettyMIDI proposes the method .adjust_times:

import pretty_midi as pm

midi = pm.PrettyMIDI("file.mid")
end_time = midi.get_end_time()
midi.adjust_times([0, end_time], [0, 0.7 * end_time])

This linearly interpolates between the original timing and the new timing by stretching or shrinking time, impacting all the elements of the MIDI file (note start time, note duration, tempo value, tempo position, time signature position, and more generally all possible events). I haven't dug into what it does under the hood regarding the relation between tempo, ticks and seconds, but this function can be quite useful.

It even allows passing n-sized lists, for more complex mappings than a simple interpolation between the start and the end time.
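
The core of such a mapping can be sketched as a piecewise-linear interpolation between anchor points. This is a simplified illustration and not PrettyMIDI's actual implementation: warp_time is a hypothetical helper, the anchor list is assumed to be strictly increasing, and times outside the anchors are simply clamped.

```python
from bisect import bisect_right
from typing import Sequence


def warp_time(t: float, original: Sequence[float], new: Sequence[float]) -> float:
    """Piecewise-linear map sending original[i] to new[i]; clamps outside the range."""
    if t <= original[0]:
        return new[0]
    if t >= original[-1]:
        return new[-1]
    i = bisect_right(original, t) - 1
    span = original[i + 1] - original[i]
    frac = (t - original[i]) / span
    return new[i] + frac * (new[i + 1] - new[i])


end_time = 10.0
# Shrink the whole file to 70% of its length, as in the example above.
start, end = 2.0, 4.0
new_start = warp_time(start, [0, end_time], [0, 0.7 * end_time])
new_end = warp_time(end, [0, end_time], [0, 0.7 * end_time])
print(new_start, new_end - new_start)
```

Warping a note's start and end separately, then taking the difference, rescales its duration consistently with its position.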

Describe alternatives you've considered
I've considered doing it manually, but it would be handier to have a dedicated function. I don't know if this kind of function has its place in symusic; I would be happy to have your opinion on that.

Inconsistency between the interface of vector bindings and python list & potential wild pointer problems.

Following #28 , I'm considering adding inplace argument for other functions. But I find that inplace operation does bring some inconsistency between the interface of vector bindings and python list. Also, this might lead to some wild pointer.

An Experiment

Here is an experiment I conducted:

Theoretical Analysis

Essentially, a python list holds an array of pointers (pointing to the real data), while the std::vector holds an array of the real data. Consider the following code:

notes = [Note(i, 0, 0, 0) for i in range(10)]
note = notes[-1]
notes.insert(0, Note(0, 0, 0, 0))
assert note == Note(9, 0, 0, 0)

For python list, we always expect the note (a reference to the last element of that list) to be Note(9, 0, 0, 0) no matter how we insert or remove elements in the list. Since in such operations, we only move some pointers, and we don't change the position of any real data.

notes = NoteTickList([Note(i, 0, 0, 0) for i in range(10)])
note = notes[-1]
notes.insert(0, Note(0, 0, 0, 0))
assert note == Note(8, 0, 0, 0)

For the vector bindings, e.g. NoteTickList, we do move the real data when inserting the new note, while the pointer held by note stays the same. Normally (if we don't reach the vector's capacity), note will still point to the 10th slot in notes, which now holds Note(8, 0, 0, 0). So this is an inconsistency between the bindings and a Python list.

And if we reach the capacity, the vector will automatically copy all the data into a new, larger block of memory and free the original one. Then note becomes a "wild pointer", pointing to an invalid block of memory. If you are unlucky enough, this will trigger a segmentation fault.

In conclusion, even if such inconsistency is acceptable, any operation that changes the position of the real data might lead to a "wild pointer".
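
The difference between the two behaviours can be imitated in pure Python. SlotRef below is a hypothetical analogy of a raw pointer into contiguous storage, not something symusic or nanobind provides. It reproduces the "reference follows the slot" effect, but it cannot reproduce the reallocation case, since Python never leaves a dangling reference.

```python
class SlotRef:
    """Pure-Python analogy of a raw pointer into contiguous storage:
    it remembers a slot position, not the object that lived there."""

    def __init__(self, storage: list, index: int):
        self.storage = storage
        self.index = index

    def get(self):
        return self.storage[self.index]


# Python list semantics: the name follows the object.
objects = [("note", i) for i in range(10)]
last_object = objects[-1]
objects.insert(0, ("note", 0))
print(last_object)      # still the element created with i == 9

# Vector-like semantics: the reference follows the slot.
storage = [("note", i) for i in range(10)]
last_slot = SlotRef(storage, len(storage) - 1)
storage.insert(0, ("note", 0))
print(last_slot.get())  # the slot now holds the element created with i == 8
```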

Possible Solution

@Natooz I'm quite concerned about this issue and no perfect solution has been found.

One possible solution is to disable all operations that change the location of the data and replace them with a copy, like numpy's append. But as we all know, numpy.append is slow and should be avoided.
Another is to replace vector<Note> with vector<shared_ptr<Note>>. This makes C++ vectors behave like Python lists, but introduces a lot of memory fragmentation and overhead.

Both of these schemes introduce a significant overhead, and neither looks elegant or pythonic.

Of course, we could write in the documentation and the README to tell the user not to hold a reference to an event for a long time, but to copy it manually.

But I think there are still a lot of people who don't read these instructions, and then run into the problems I've analyzed.

Improving performance always comes at a price, which can take various forms.

Tempo value mismatch

Following #6, there is a float conversion somewhere that can slightly alter the Tempo.tempo values, causing the tests (__eq__) to fail.

nanobind problem

When I run pip install symusic, the following error occurs:

     In file included from /tmp/pip-install-j269we9w/symusic_fde26dadb687464ba7ae39e41005e3da/py_src/core.cpp:10:
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:431:5: note: candidate: ‘nanobind::ndarray<Args>::ndarray(nanobind::ndarray<Args>&&) [with Args = {nanobind::numpy, unsigned char}]’
        431 |     ndarray(ndarray &&t) noexcept : m_handle(t.m_handle), m_dltensor(t.m_dltensor) {
            |     ^~~~~~~
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:431:5: note:   candidate expects 1 argument, 2 provided
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:427:5: note: candidate: ‘nanobind::ndarray<Args>::ndarray(const nanobind::ndarray<Args>&) [with Args = {nanobind::numpy, unsigned char}]’
        427 |     ndarray(const ndarray &t) : m_handle(t.m_handle), m_dltensor(t.m_dltensor) {
            |     ^~~~~~~
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:427:5: note:   candidate expects 1 argument, 2 provided
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:404:5: note: candidate: ‘nanobind::ndarray<Args>::ndarray(std::conditional_t<is_const_v<typename nanobind::detail::ndarray_info<Ts ...>::scalar_type>, const void*, void*>, std::initializer_list<long unsigned int>, nanobind::handle, std::initializer_list<long int>, nanobind::dlpack::dtype, int32_t, int32_t) [with Args = {nanobind::numpy, unsigned char}; std::conditional_t<is_const_v<typename nanobind::detail::ndarray_info<Ts ...>::scalar_type>, const void*, void*> = void*; typename nanobind::detail::ndarray_info<Ts ...>::scalar_type = unsigned char; int32_t = int]’
        404 |     ndarray(std::conditional_t<std::is_const_v<Scalar>, const void *, void *> data,
            |     ^~~~~~~
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:404:5: note:   candidate expects 7 arguments, 2 provided
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:390:5: note: candidate: ‘nanobind::ndarray<Args>::ndarray(std::conditional_t<is_const_v<typename nanobind::detail::ndarray_info<Ts ...>::scalar_type>, const void*, void*>, size_t, const size_t*, nanobind::handle, const int64_t*, nanobind::dlpack::dtype, int32_t, int32_t) [with Args = {nanobind::numpy, unsigned char}; std::conditional_t<is_const_v<typename nanobind::detail::ndarray_info<Ts ...>::scalar_type>, const void*, void*> = void*; typename nanobind::detail::ndarray_info<Ts ...>::scalar_type = unsigned char; size_t = long unsigned int; int64_t = long int; int32_t = int]’
        390 |     ndarray(std::conditional_t<std::is_const_v<Scalar>, const void *, void *> data,
            |     ^~~~~~~
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:390:5: note:   candidate expects 8 arguments, 2 provided
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:388:14: note: candidate: ‘template<class ... Args2> nanobind::ndarray<Args>::ndarray(const nanobind::ndarray<Args2 ...>&) [with Args2 = {Args2 ...}; Args = {nanobind::numpy, unsigned char}]’
        388 |     explicit ndarray(const ndarray<Args2...> &other) : ndarray(other.m_handle) { }
            |              ^~~~~~~
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:388:14: note:   template argument deduction/substitution failed:
      /tmp/pip-install-j269we9w/symusic_fde26dadb687464ba7ae39e41005e3da/py_src/core.cpp:801:24: note:   mismatched types ‘const nanobind::ndarray<Args ...>’ and ‘uint8_t*’ {aka ‘unsigned char*’}
        801 |             return py::ndarray<py::numpy, pianoroll_t>{const_cast<uint8_t*>(pianoroll.release()),
            |                        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        802 |                 { std::get<0>(pianoroll.dims()),
            |                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        803 |                     std::get<1>(pianoroll.dims()),
            |                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        804 |                     std::get<2>(pianoroll.dims()),
            |                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        805 |                     std::get<3>(pianoroll.dims()) }
            |                     ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
        806 |             };
            |             ~
      In file included from /tmp/pip-install-j269we9w/symusic_fde26dadb687464ba7ae39e41005e3da/py_src/core.cpp:10:
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:382:14: note: candidate: ‘nanobind::ndarray<Args>::ndarray(nanobind::detail::ndarray_handle*) [with Args = {nanobind::numpy, unsigned char}]’
        382 |     explicit ndarray(detail::ndarray_handle *handle) : m_handle(handle) {
            |              ^~~~~~~
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:382:14: note:   candidate expects 1 argument, 2 provided
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:380:5: note: candidate: ‘constexpr nanobind::ndarray<Args>::ndarray() [with Args = {nanobind::numpy, unsigned char}]’
        380 |     ndarray() = default;
            |     ^~~~~~~
      /mnt/ai_workspace/ComfyUI/venv_comfyui310/lib/python3.10/site-packages/nanobind/include/nanobind/ndarray.h:380:5: note:   candidate expects 0 arguments, 2 provided
      ninja: build stopped: subcommand failed.

      *** CMake build failed
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for symusic
Failed to build symusic
ERROR: Could not build wheels for symusic, which is required to install pyproject.toml-based projects

I checked the nanobind version: it is 2.0.0.

MIDI message level interface design

I'm preparing to add a MIDI message level interface to symusic, but it's not very clear yet how the interface should be designed.

So I thought I'd ask your opinion here @wrongbad

Feature for `ScoreTick`: resample the time division `ticks_per_quarter`

Each MIDI file (and in turn each ScoreTick) has a time division expressed in ticks per beat, which often ranges between 120 and 480 (multiples of 2, 3 and 4).

It could be useful to provide a feature for resampling a ScoreTick by changing its time division. This would imply resampling the times of all events in a MIDI file, as well as the durations of the events that have one.
For durations especially, a min_duration expressed in ticks could be very useful for cases where the resampling reduces the time division and turns some durations into 0.
You can take a look at how it is done in miditok (quantize_* methods), though there are probably better ways to do it; I was thinking of using numpy to batch the computations.

That's not an important feature, this can wait or be refused without major justification.
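
For illustration, the arithmetic could look like the sketch below, written for non-negative tick values with notes as plain (time, duration, pitch, velocity) tuples. rescale and resample_notes are hypothetical names, and the rounding rule (round half up, in integer math) is just one possible choice.

```python
from typing import List, Tuple

# Hypothetical note tuples: (time, duration, pitch, velocity), in ticks
Note = Tuple[int, int, int, int]


def rescale(value: int, old_tpq: int, new_tpq: int) -> int:
    """Scale a non-negative tick value to the new division, rounding half up."""
    return (value * new_tpq + old_tpq // 2) // old_tpq


def resample_notes(
    notes: List[Note], old_tpq: int, new_tpq: int, min_duration: int = 0
) -> List[Note]:
    """Rescale times and durations; durations never drop below min_duration."""
    return [
        (rescale(t, old_tpq, new_tpq), max(rescale(d, old_tpq, new_tpq), min_duration), p, v)
        for t, d, p, v in notes
    ]


notes = [(240, 1, 60, 64), (480, 480, 62, 64)]
print(resample_notes(notes, old_tpq=480, new_tpq=120, min_duration=1))
```

The first note would collapse to a duration of 0 at 120 ticks per quarter; min_duration=1 keeps it audible.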

Question and Request: Sorting after conversion to second

Is your feature request related to a problem? Please describe.
What is the reason for adding sorting after converting MIDI from ticks to seconds in symusic 0.4.0 f93d6ae? I think it confuses the expectation of the conversion method as you expect the MIDI events to change their time domain but not the order.

In my use case, I need to keep track of the order of notes in MIDI at various stages of preprocessing. Sorting them under the hood during conversion leads to unexpected results. I'd rather sort the MIDI events afterwards if I need to.

Describe the solution you'd like
I would like to have sorting after the time conversion optional by a boolean flag if there is a need for sorting.

Describe alternatives you've considered
Sort the notes again after the conversion, but this adds overhead to the computation. Also, if the notes weren't sorted beforehand, it's harder to get them back in their original order.
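
If sorting stays in the conversion, one workaround on the user side is to record the permutation while sorting so the original order can be restored later. A minimal sketch over a plain list of onset times follows; both helper names are hypothetical.

```python
from typing import List, Sequence, Tuple


def sort_with_permutation(times: Sequence[float]) -> Tuple[List[float], List[int]]:
    """Stable sort that also returns, for each sorted position, the original index."""
    order = sorted(range(len(times)), key=times.__getitem__)
    return [times[i] for i in order], order


def restore_order(sorted_values: Sequence[float], order: Sequence[int]) -> List[float]:
    """Undo sort_with_permutation using the recorded permutation."""
    restored = [0.0] * len(order)
    for position, original_index in enumerate(order):
        restored[original_index] = sorted_values[position]
    return restored


times = [0.5, 0.0, 0.5, 0.25]
sorted_times, order = sort_with_permutation(times)
print(sorted_times, order, restore_order(sorted_times, order))
```

This costs an extra index array per list, which is part of why an opt-out flag would be more convenient.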

Failed to dump empty midi

There must be something wrong ...

Make `Score.clip` include the ongoing tempo/time signature/key signature at `start`

Hello,

I have a small feature request for the clip method: make it include the current tempo, time signature and key signature at the tick start. The current implementation keeps the events of these types occurring within the clip section (i.e. changes), but we might not know what are the values of these features at the beginning of the clip.

Describe the solution you'd like

Make clip determine the latest tempo/time signature/key signature occurring before the start tick, and add Tempo/TimeSignature/KeySignature elements in the returned Score chunk with time values at start_tick.
This could be offered as an option controlled by a method argument.

Describe alternatives you've considered

Here is how I implemented it in Python:

from symusic import KeySignature, Score


def extract_chunk_from_midi(
    midi: Score, tick_start: int, tick_end: int, clip_end: bool = False
) -> Score:
    """
    Extract a chunk of a ``Score``.

    The returned chunk will have a starting time at tick 0, i.e. the times of its
    events will be shifted by ``-tick_start``.

    :param midi: object to extract a chunk from.
    :param tick_start: starting tick of the chunk to extract.
    :param tick_end: ending tick of the chunk to extract.
    :param clip_end: if given ``True``, the chunk will end at ``tick_end + 1`` and
        thus include the events occurring at ``tick_end``. (default: ``False``)
    :return: chunk of the ``midi`` starting at ``tick_start`` and ending at ``tick_end``.
    """
    # Get the tempo, time sig and key sig at the beginning of the chunk to extract
    # There might not be default key signatures in the Score
    tempo, time_signature, key_signature = None, None, KeySignature(0, 0, 0)
    for tempo_ in midi.tempos:
        if tempo_.time > tick_start:
            break
        tempo = tempo_
    for time_signature_ in midi.time_signatures:
        if time_signature_.time > tick_start:
            break
        time_signature = time_signature_
    for key_signature_ in midi.key_signatures:
        if key_signature_.time > tick_start:
            break
        key_signature = key_signature_
    tempo.time = time_signature.time = key_signature.time = 0

    # Clip the MIDI and append the global attributes
    midi_split = midi.clip(tick_start, tick_end, clip_end=clip_end).shift_time(
        -tick_start
    )
    midi_split.tempos.append(tempo)
    midi_split.time_signatures.append(time_signature)
    midi_split.key_signatures.append(key_signature)

    return midi_split

I finally got back doing some c++ and got a glimpse of nanobind. If you want, I can try to implement this.

Additional context

That's something I'm implementing for this MidiTok PR: Natooz/MidiTok#148

Set-Tempo events not applied correctly to timestamps in multi-track files

https://stackoverflow.com/questions/1080297/how-does-midi-tempo-message-apply-to-other-tracks

Here's an example midi file where the tempo is modulated in track-0 only: https://www.bachcentral.com/ORGAN/catech7.mid

If the tempo updates are not applied to other tracks when computing timestamps in seconds, then they become out of sync.

Order mismatch between Velocity and NoteOff message when writing

Following #6, we found an issue with the order of the velocity and NoteOff messages when parsing / writing notes that have the same onset time and pitch values.

This is likely a FIFO / LIFO issue: one of these principles should be applied consistently to both parsing and writing in order to preserve data integrity.