Regarding the charge state array: the reference implementation in our pwiz fork currently lacks a specific mapping for it, so it is written as an array of floats by default. That should be a fairly simple and generic fix: we should check that the declaration `xref: binary-data-type:MS:1000519 "32-bit integer"` is honoured.
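To illustrate what "honouring" the xref could look like, here is a minimal Python sketch. It is not the pwiz code. The function names `type_code_for` and `encode`, the fallback to 32-bit float, and the use of `struct` for packing are all assumptions made for illustration; only the PSI-MS accessions come from the controlled vocabulary.

```python
import struct

# PSI-MS accessions for binary data types, mapped to struct format codes.
# The accessions are from the controlled vocabulary; the mapping to struct
# codes is an assumption of this sketch, not part of any pwiz API.
BINARY_DATA_TYPES = {
    "MS:1000519": "i",  # 32-bit integer
    "MS:1000521": "f",  # 32-bit float
    "MS:1000522": "q",  # 64-bit integer
    "MS:1000523": "d",  # 64-bit float
}

DEFAULT_CODE = "f"  # assumed fallback when no data type is declared
PREFIX = "binary-data-type:"


def type_code_for(xrefs):
    """Return the struct code for the first recognised binary-data-type xref."""
    for xref in xrefs:
        xref = xref.strip()
        if not xref.startswith(PREFIX):
            continue
        # Take the accession token and drop OBO-style escaping ("MS\:1000519").
        accession = xref[len(PREFIX):].split()[0].replace("\\", "")
        if accession in BINARY_DATA_TYPES:
            return BINARY_DATA_TYPES[accession]
    return DEFAULT_CODE


def encode(values, xrefs):
    """Pack values little-endian using the type declared in the xrefs."""
    code = type_code_for(xrefs)
    if code in ("i", "q"):
        # Integer types need integer inputs (e.g. charge 2.0 becomes 2).
        values = [int(v) for v in values]
    return code, struct.pack(f"<{len(values)}{code}", *values)


# Hypothetical xref list for a charge array term.
charge_xrefs = ['binary-data-type:MS:1000519 "32-bit integer"']
code, payload = encode([1.0, 2.0, 3.0], charge_xrefs)
```

With the xref present the charges are packed as 32-bit integers; with no recognised xref the sketch falls back to floats, which mirrors the default behaviour described above.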
I really appreciate your efforts in creating the mzMLb data format. I wonder how difficult it would be to integrate this format into an existing Python program made for large-scale MS data processing. The program runs on Linux and Windows and currently uses its own high-performance data format. If integration is feasible, I could adopt the file format right away. We are currently using the Apache Arrow based .feather format.