uellue / opixtem
Open Pixelated STEM framework
Some thoughts in semi-random order, while reading https://github.com/uellue/opixtem/wiki/Requirements
HDF5 is the obvious solution here, since it is a mature, widely used and well-supported file format. I suggest using the NCEM EMD style "internal" HDF5 structure, since it is the most widely used at the moment.
HyperSpy has support for reading and writing EMD, and it is very easy to extend or tweak. The good thing about HDF5 is that it can be read lazily, for example using dask (https://dask.pydata.org/), so the data can be processed without loading the full file into memory. This is especially important for pixelated STEM data, which can be very large. HDF5 also has native compression support, which helps keep file sizes manageable.
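To make the lazy-reading and compression points concrete, here is a minimal sketch using h5py directly (not dask, and not HyperSpy's EMD reader). The dataset name "data", the array shape and the file name are assumptions for illustration only, and this is not the NCEM EMD layout.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical small 4D dataset: (scan_y, scan_x, det_y, det_x).
shape = (8, 8, 32, 32)
data = np.zeros(shape, dtype=np.uint16)
data[:, :, 12:20, 12:20] = 100  # crude bright disk in every detector frame

path = os.path.join(tempfile.mkdtemp(), "example_4d.h5")

# Write with HDF5's built-in gzip filter. Compression requires chunked
# storage; here one chunk holds exactly one detector frame.
with h5py.File(path, "w") as f:
    f.create_dataset(
        "data", data=data, chunks=(1, 1, 32, 32), compression="gzip"
    )

# Opening the file only reads metadata. Slicing the dataset reads just the
# chunks needed for that slice, not the whole array.
with h5py.File(path, "r") as f:
    dset = f["data"]
    stored_shape = dset.shape
    frame = dset[3, 4]  # one diffraction pattern, returned as a NumPy array
    frame_sum = int(frame.sum())

print(stored_shape, frame.shape, frame_sum)
print("bytes on disk:", os.path.getsize(path), "raw bytes:", data.nbytes)
```

The same open dataset object can be handed to dask.array.from_array to get a lazy dask array, which is roughly how the lazy processing described above is built up.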
Another important factor is the chunk shape used in the HDF5 files: https://support.hdfgroup.org/HDF5/doc/Advanced/Chunking/index.html. (This is potentially a bit technical, but anyone implementing this needs to think about it.)
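A rough way to see why the chunk shape matters: HDF5 always reads (and decompresses) whole chunks, so the number of chunks touched depends on how the access pattern lines up with the chunk shape. The sketch below only does the arithmetic; the dataset size, the two chunk shapes and the helper name chunks_touched are assumptions picked for illustration.

```python
from math import ceil, prod


def chunks_touched(chunk_shape, slice_shape):
    """Number of chunks read for a block of slice_shape.

    Assumes the block starts on a chunk boundary in every dimension.
    """
    return prod(ceil(s / c) for s, c in zip(slice_shape, chunk_shape))


# Hypothetical dataset: 256 x 256 scan positions, 256 x 256 detector pixels,
# stored as (scan_y, scan_x, det_y, det_x).
one_frame = (1, 1, 256, 256)        # a single diffraction pattern
one_pixel_image = (256, 256, 1, 1)  # virtual image from one detector pixel

for chunk_shape in [(1, 1, 256, 256), (16, 16, 16, 16)]:
    n_frame = chunks_touched(chunk_shape, one_frame)
    n_image = chunks_touched(chunk_shape, one_pixel_image)
    print(chunk_shape, "frame:", n_frame, "image:", n_image)
```

With one frame per chunk, reading a single diffraction pattern touches 1 chunk, but a one-pixel virtual image touches all 65536 chunks, i.e. the whole file. With the more balanced (16, 16, 16, 16) chunks, both access patterns touch 256 chunks. Which trade-off is right depends on the processing that is planned.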
Lastly, there are potentially some IO bottlenecks when using HDF5: loading the same data from a flat binary file can be quicker. This might be solved by using parallel HDF5, since I suspect the slower loading is partly CPU-bound. More info on this: https://support.hdfgroup.org/HDF5/PHDF5/
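For comparison, this is roughly what the flat binary route looks like with NumPy's memmap. It is only a sketch: the shape, dtype and file name are assumptions, and a headerless file carries no metadata, so both shape and dtype have to be known from somewhere else.

```python
import os
import tempfile

import numpy as np

shape = (4, 4, 16, 16)  # hypothetical (scan_y, scan_x, det_y, det_x)
dtype = np.uint16

path = os.path.join(tempfile.mkdtemp(), "example_4d.raw")
frames = np.arange(np.prod(shape), dtype=dtype).reshape(shape)
frames.tofile(path)  # headerless, C-ordered dump of the array

# np.memmap maps the file into memory; data is read only when accessed.
raw = np.memmap(path, dtype=dtype, mode="r", shape=shape)
frame = np.array(raw[2, 3])  # copy a single frame out of the mapping
print(frame.shape, os.path.getsize(path))
```

There is no decompression or chunk lookup on this path, which fits the suspicion that part of the HDF5 overhead is CPU work, but the file is also uncompressed and not self-describing.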
HyperSpy can convert DM3/DM4 to HDF5/EMD. This also extends to any file HyperSpy can load. So if we can load a file in HyperSpy, we can also save it as HDF5/EMD.
We in Glasgow have code for converting Medipix3 binary data to HDF5/EMD. We'll release this soonish.
The Glasgow group has software for a variety of data processing on this type of data:
Most of this is based on HyperSpy, both for the data processing and the visualization. The lazy loading/processing mentioned above is especially vital here, since at some point the data will be too large for any standard computer.
I made a Python library for interfacing with a Medipix3 detector (through the Merlin readout system): https://fast_pixelated_detectors.gitlab.io/merlin_interface/. This is done over TCP/IP.
I also made a library for getting live data from a Medipix3 detector (via the Merlin readout system): https://fast_pixelated_detectors.gitlab.io/fpd_live_imaging/. It currently only works for the Medipix3, but it could work for any type of detector, as long as it is possible to get the image data somehow. For the Merlin readout system, this is all done using TCP/IP.
I think being able to interface with more equipment (control it and get data from it) through TCP/IP would make everything so much easier, since currently everything has to go through vendor-specific software, which reduces the possibility to innovate.
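To show how little is needed on the client side once a device speaks TCP/IP, here is a minimal sketch using only the Python standard library. It is not the Merlin protocol: the 4-byte length header, the fake 4 x 4 frame and all function names are assumptions made up for this example, and the "detector" is just a thread on the loopback interface.

```python
import socket
import struct
import threading

HEADER = struct.Struct("!I")  # hypothetical framing: 4-byte big-endian length


def recv_exact(sock, n):
    """Read exactly n bytes from a socket, or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        part = sock.recv(n - len(buf))
        if not part:
            raise ConnectionError("socket closed before full message arrived")
        buf.extend(part)
    return bytes(buf)


def serve_one_frame(server_sock, payload):
    """Stand-in for a detector: accept one client and send one frame."""
    conn, _ = server_sock.accept()
    with conn:
        conn.sendall(HEADER.pack(len(payload)) + payload)


def receive_frame(host, port):
    """Client side: connect, read the length header, then the frame bytes."""
    with socket.create_connection((host, port), timeout=5) as sock:
        (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
        return recv_exact(sock, length)


# Fake 4 x 4 frame of 16-bit counts, sent over the loopback interface.
fake_frame = struct.pack("!16H", *range(16))

server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
port = server.getsockname()[1]

thread = threading.Thread(target=serve_one_frame, args=(server, fake_frame))
thread.start()
received = receive_frame("127.0.0.1", port)
thread.join()
server.close()

pixels = struct.unpack("!16H", received)
print(pixels)
```

A real readout system defines its own message format, so only the general pattern (connect, read a header, read the payload) carries over.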
I think it is a good idea to implement as much as possible in interpreted languages, like Python, since this greatly lowers the barrier to entry for researchers who want to participate. While pure Python can be relatively slow, optimized libraries such as NumPy are essentially highly optimized C code under the hood.
For acquiring data, my fpd_live_imaging works fine at 1000 fps, and I'll test it at 12000 fps soon. This also includes a PyQt UI for visualization.
So I don't think Python will necessarily be a problem, as long as the correct libraries are used. And if something is too slow in Python (and no relevant library exists), it can be written in Cython.
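The pure Python versus NumPy point can be checked with a small sketch like the one below, which sums every detector frame of a 4D dataset in both ways. The random data, the array shape and the function name sum_frames_python are assumptions for illustration; the timings depend on the machine, so none are quoted here.

```python
import time

import numpy as np

rng = np.random.default_rng(0)
# Hypothetical dataset: 16 x 16 scan positions, 64 x 64 detector pixels.
data = rng.integers(0, 100, size=(16, 16, 64, 64), dtype=np.int64)


def sum_frames_python(nested):
    """Sum every detector frame with plain Python loops over nested lists."""
    return [
        [sum(sum(row) for row in frame) for frame in scan_row]
        for scan_row in nested
    ]


nested = data.tolist()  # convert once so the loop timing is pure Python

t0 = time.perf_counter()
slow = sum_frames_python(nested)
t1 = time.perf_counter()
fast = data.sum(axis=(2, 3))  # same reduction, executed in compiled code
t2 = time.perf_counter()

print(f"pure Python: {t1 - t0:.4f} s, NumPy: {t2 - t1:.4f} s")
```

Both versions give the same 16 x 16 image of summed intensities; only the time taken differs.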
For GPU calculations, I'd aim for using things like OpenCL (instead of something vendor-specific, like CUDA), especially since cross-platform solutions are more future-proof.
A possible user interface for post-processing could be http://hyperspy.org/hyperspyUI/