OpenChrom® Analytics Edition

OpenChrom® is an Open Source tool for the analysis and visualization of mass spectrometric and chromatographic data developed by Lablicate GmbH.

It is based on ChemClipse but offers additional features that cannot be part of the ChemClipse project due to licensing constraints. This is the free base version provided to the community. It contains neither vendor-specific file format converters nor commercial extensions.

Please see our Code of Conduct for rules on how to interact on this site.

If you would like to contribute to this project, please have a look at the guideline.

For user and developer documentation, have a look at our wiki.

Builds for Windows, macOS and Linux can be downloaded including proprietary converter bundles.

Performance measurements provided by the YourKit Java Profiler

opentyper's Issues

Heatmap / Gel View of Mass Spectra

Already exists for chromatograms. Might be easy to port for mass spectra comparison.

Verify brukerconverterflex

https://github.com/sgibb/readBrukerFlexData/wiki has reverse engineered specifications.

Spectra Comparison

Compare https://www.eclipsecon.org/europe2017/sites/default/files/slides/Eclipse-Charting.pdf page 2 top right and page 3 bottom left or OpenChrom "Comparison Scan"

Method of identifying microorganisms

Sadly covered by patents. I found

in the http://www.ssi.shimadzu.com/products/literature/biotech/mo347_v1.pdf flyer.

SPECLUST

http://co.bmc.lu.se/speclust/info.pl is an online service which uses the same file format as #29

Mass labels on top of the peak

FoodBIMS

http://bioinformatica.isa.cnr.it/Bact_Dbase.htm looks like it was discontinued early on and the library remains small. Data is only provided in HTML tables under standard Copyright terms.

How can I add/install OpenTyper to Openchrom ???

I'm new to this software and I don't understand how to install / launch / use OpenTyper in Openchrom so could you help me ???

Installation guidelines could be great

PCA plot for Mass Spectra

http://sing.ei.uvigo.es/mass-up/ has one which I can't get to work on @openSUSE Linux (white display, no error message), and the 3D nature makes it very hard to grasp in static presentations where you can't rotate. This is a huge feature and requires lots of processing steps like #3 and #11, as well as data reduction, to be implemented first.

Revert to the simple Mass Spectrum List

c2b5b9a The new mass spectrum edit list contains irrelevant fields such as parent m/z, parent resolution, daughter m/z, daughter resolution and collision energy. TOF-TOF functionality isn't widely used. Also no one edits MALDI spectra manually.

Mass Spectrum Metadata View

To debug the current Bruker flex series .fid file format importer. I am not sure whether all fields are set to what developers and users might expect, or whether anything is missing. Also, due to the closed source nature of the plugin, people can't easily inspect it.

Multiple Spectra comparison

https://wiki.openchrom.net/index.php/Overlay_multiple_chromatograms just for mass spectra

Improve usability of the Tree File Viewer

The current tree file viewer is a bit inconvenient. https://bugs.eclipse.org/bugs/show_bug.cgi?id=496774 It requires users to navigate manually to the folder where the data is located and, for MALDI-TOF MS, even to hand-pick samples and sub-samples to open them. While the Bruker Biotyper Offline Classification client has its own usability flaws, its file open browser is much more convenient. A specialized tree view structure for the MALDI workflow that adheres to a similar concept:

etc. makes it less painful to mass import data.

Automatic check for mass shifts

According to the Bruker Biotyper database creation procedures mass spectra of the same technical replicate have to adhere to the following specifications:

mass range [Da]	max. allowed difference of masses [Da]	max. allowed deviation (500 ppm)
3000-4000	1.47	339.79
4000-5000	2.17	230.73
5000-6000	2.74	182.55
6000-7000	3.13	319.28
7000-8000	3.58	418.59
8000-9000	4.05	247.16
9000-10000	4.69	426.80
10000-11000	5.21	286.66

Measurement is done from the highest part of the peak in flexAnalysis. Doing this manually with Microsoft Excel is tedious and screams for automation or at least a nice visualization, which highlights differences of masses in a table or even better the selected peak with color.
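
As a starting point for the automation, here is a minimal sketch of the check in Java. It only uses the "max. allowed difference of masses [Da]" column from the table above, because the issue does not state how the ppm column is derived. The class and method names are hypothetical, the half-open handling of range boundaries (4000 belongs to 4000-5000) is an assumption, and so is the ppm helper, which takes the deviation relative to the mean of both masses.

```java
/** Sketch: checks whether the same peak in two technical replicates stays within the table limits. */
final class MassShiftCheck {

    // Start of each 1000 Da mass range and the matching limit in Da, copied from the table above.
    private static final double[] RANGE_START = {3000, 4000, 5000, 6000, 7000, 8000, 9000, 10000};
    private static final double[] MAX_DIFF_DA = {1.47, 2.17, 2.74, 3.13, 3.58, 4.05, 4.69, 5.21};
    private static final double RANGE_WIDTH = 1000;

    /** Returns the allowed difference in Da, or NaN if the mass is outside 3000-11000 Da. */
    static double maxAllowedDifference(double mass) {
        for (int i = 0; i < RANGE_START.length; i++) {
            // Assumption: ranges are half-open, so a boundary mass belongs to the higher range.
            if (mass >= RANGE_START[i] && mass < RANGE_START[i] + RANGE_WIDTH) {
                return MAX_DIFF_DA[i];
            }
        }
        return Double.NaN;
    }

    /** Deviation of two masses in ppm, relative to their mean (an assumption, not from the protocol). */
    static double deviationPpm(double massA, double massB) {
        double mean = (massA + massB) / 2.0;
        return Math.abs(massA - massB) / mean * 1_000_000.0;
    }

    /** True if the difference is within the limit of the range that contains the lower mass. */
    static boolean withinLimit(double massA, double massB) {
        double limit = maxAllowedDifference(Math.min(massA, massB));
        return !Double.isNaN(limit) && Math.abs(massA - massB) <= limit;
    }
}
```

A table view could call `MassShiftCheck.withinLimit(massA, massB)` per peak pair and color the rows for which it returns `false`.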

Matlab Peak List Files Converter

The format described at http://wiki.microbe-ms.com/Data_Format_of_Peak_List_Files contains mass spectra including metadata on instrument calibration, operator and the microorganism.

Own .product file and standalone release

Configuring the Bruker flex plugin to not filter masses and to display profile mass spectra is a huge technical / usability barrier for most microbiologists who will try the pre-release builds.

Trim empty lines from Mass Spectrum CSV export

as it disturbs other tools like Mass-Up.
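
A minimal sketch of the filtering step in Java, assuming the export has its lines available as a list before writing them out. The class and method names are hypothetical and not part of the existing exporter.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Sketch: removes blank lines from the lines of a CSV export. */
final class CsvLineTrimmer {

    /** Returns the lines in their original order without empty or whitespace-only entries. */
    static List<String> dropBlankLines(List<String> lines) {
        return lines.stream()
                .filter(line -> !line.trim().isEmpty())
                .collect(Collectors.toList());
    }
}
```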

Don't force microFlex users to click through large amounts of sub-directories

The Bruker file format consists of several sub-folders, which makes navigation hard. With the introduction of randomly generated GUIDs like 4ec601cc-355a-4346-9253-3a266905d4df from the Compass 4 series this becomes even more of a problem as data is hard to find in the file explorer:

The folder structure looks like this:

project hash (e.g. 4ec601cc-355a-4346-9253-3a266905d4df)
- statusInfo.json (contains the status of the measurement for online identification)
- sample hash (e.g. 8aa8ba7a-e2eb-4e5b-81cc-9bf86533624e)
  - info (file contains the analyte ID)
  - 0_C1 (folder, probably position on the target)
    - 1 (always "1")
      - 1SLin (always "1SLin")
        - acqu (textfile with instrument metadata)
        - acqus (textfile with instrument metadata)
        - fid (contains the binary data)
        - sptype (textfile containing "tof")
        - pdata (another folder)
          - 1 (folder, always "1")
            - 1r (binary data)
            - proc (textfile with instrument metadata, looks redundant)
            - procs (textfile with instrument metadata, looks redundant)

Bruker's own Explorer offline identification software automatically skips folders until it sees the data, so it needs fewer clicks, but it can't read the automation server metadata, which is very inconvenient when working with data generated by Bruker Biotyper Online systems or Bruker Biotyper Satellite systems:

with the only workaround being to filter by time and date of the measurement:

So we have a chance to make it better than the original. I suggest making the folders above clickable rather than the fid file buried inside the complex folder structure, and maybe even displaying the analyte ID right away. Some kind of generic system would be a good idea so we don't end up with vendor-specific file browsers.
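
To illustrate how the sub-directories could be skipped, here is a minimal sketch in Java that walks a project folder and collects every folder containing a file named `fid`. It relies only on the layout listed above; the class and method names are hypothetical, and the assumption that the sample folder always sits three levels above the spectrum folder (sample/0_C1/1/1SLin) comes from that single listing. Reading the analyte ID is left out because the content of the info file is not described here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Sketch: locates spectrum folders below a project folder so users don't have to click through them. */
final class FlexFolderScanner {

    /** Collects all folders below root that directly contain a regular file named "fid". */
    static List<Path> findSpectrumFolders(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equals("fid"))
                    .map(Path::getParent)
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    /** Returns the sample folder, assuming the layout sample/0_C1/1/1SLin (three levels up). */
    static Path sampleFolder(Path spectrumFolder) {
        return spectrumFolder.getParent().getParent().getParent();
    }
}
```

A file browser could then show one clickable entry per sample folder and open the matching spectrum folder behind the scenes.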

MALDI-TOF MS User Platform Export Report

To ease listings of large in-house databases to http://maldi-tof-ms-user-platform.ua-bw.de a Microsoft Excel report of MSP metadata can be implemented. Depends on #2. Should be easy as Apache POI is already available via https://github.com/OpenChrom/openchrom3rdpl

mzXML data format for MALDI

According to http://wiki.microbe-ms.com/Import_Mass_Spectra_in_a_mzXML_Data_Format OpenChrom already supports Shimadzu MALDI-TOF MS (marketed under the brand name bioMérieux VITEK MS) but I currently don't have access to a machine to verify. It could turn the currently very Bruker centric development of this project into a true multi-vendor solution.

auto-detect profile mass spectra

At least the Bruker flex file format #18 contains the information and we also parse and save it:

[...]

// MASS SPECTRUM (CENTROID OR PROFILE)
value = extractValue(ACQU_DATATYPE, line);
if(value != null) {
	if(value.contains("CONTINUOUS")) {
		massSpectrumType = 1; // profile
	} else {
		massSpectrumType = 0; // centroid
	}
}

[...]

It is exposed in the interface IRegularMassSpectrum. So in my eyes the preference can be obsoleted and there is also no need for writing specialized MALDI-TOF MS spectrum views like ProfileMassSpectrumViewTOF.java

Continuous Integration

to avoid problems like c6d1381. I have some experience with @Jenkins from my @vogellacompany times and still have a @cloudbees account I can reactivate.

Update repository

@bintray has an Open Source Plan

Support for the Applied Biosystems Voyager DE STR MALDI-TOF MS

http://photos.labwrench.com/equipmentManuals/3117-1078.pdf looks like I missed another big player.

Batch processing

Current Freeware tools provided by Bruker Daltonics such as CompassXport are command line based and designed for the Microsoft Windows operating system only. I also haven't managed to get them up and running on my office/home PCs as they might need a connection to flexAnalysis (Setup.exe seems to install weird DLLs and COM connections) or the command and control server of the MALDI operator machine.

An easily configurable GUI that walks you through the data processing and export could replace this functionality and can be implemented as an Eclipse wizard.

Smoothing for Mass Spectra

Savitzky-Golay seems to be a commonly used algorithm for MALDI-TOF MS processing. As with #3, either port a function which already exists for chromatograms or wait for R scripting support to arrive. Depends on ~~https://bugs.eclipse.org/bugs/show_bug.cgi?id=496773~~ ☑️
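
For reference, a minimal sketch of Savitzky-Golay smoothing in Java. It is not the existing chromatogram filter. It hard-codes the well-known 5-point coefficients for a quadratic/cubic fit (-3, 12, 17, 12, -3 divided by 35); the class name, the fixed window size and the choice to leave the two points at each edge untouched are assumptions made for brevity.

```java
/** Sketch: 5-point Savitzky-Golay smoothing of an intensity array. */
final class SavitzkyGolay {

    // Convolution coefficients for a 5-point window and a quadratic/cubic polynomial.
    private static final double[] COEFFICIENTS = {-3, 12, 17, 12, -3};
    private static final double NORMALIZATION = 35;

    /** Returns a smoothed copy; the two points at each edge are copied unchanged. */
    static double[] smooth(double[] intensities) {
        double[] result = intensities.clone();
        int half = COEFFICIENTS.length / 2;
        for (int i = half; i < intensities.length - half; i++) {
            double sum = 0;
            for (int k = -half; k <= half; k++) {
                sum += COEFFICIENTS[k + half] * intensities[i + k];
            }
            result[i] = sum / NORMALIZATION;
        }
        return result;
    }
}
```

A real implementation would make the window width and polynomial order configurable, since suitable values depend on the peak width of the instrument.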

Baseline correction for MALDI-TOF MS

Currently various algorithms are available for chromatograms. For MALDI mass spectra without chromatography, none are available yet. As the application is slightly different, re-using existing algorithms may not be enough.

@sgibb proposed using

an estimation by computing the median of the intensities in a moving window
a computation of the convex hull of the spectrum via monotonic regression
the PROcess algorithm
the SNIP algorithm Bug 529410

in a 2011 conference publication.

Depends on https://bugs.eclipse.org/bugs/show_bug.cgi?id=496773 which has been resolved already.
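
The first proposal in the list, the moving window median, is simple enough for a minimal sketch in Java. The class name, the `halfWindow` parameter, the shrinking window at the edges and the clipping of negative values to zero are assumptions for illustration and are not taken from the cited publication.

```java
import java.util.Arrays;

/** Sketch: baseline estimation by the median of the intensities in a moving window. */
final class MovingMedianBaseline {

    /** Estimates the baseline with a window of up to 2 * halfWindow + 1 points (smaller at the edges). */
    static double[] estimate(double[] intensities, int halfWindow) {
        double[] baseline = new double[intensities.length];
        for (int i = 0; i < intensities.length; i++) {
            int from = Math.max(0, i - halfWindow);
            int to = Math.min(intensities.length, i + halfWindow + 1);
            double[] window = Arrays.copyOfRange(intensities, from, to);
            Arrays.sort(window);
            int mid = window.length / 2;
            // Even-sized windows occur at the edges; use the mean of the two middle values.
            baseline[i] = (window.length % 2 == 1)
                    ? window[mid]
                    : (window[mid - 1] + window[mid]) / 2.0;
        }
        return baseline;
    }

    /** Subtracts the estimated baseline and clips negative values to zero. */
    static double[] correct(double[] intensities, int halfWindow) {
        double[] baseline = estimate(intensities, halfWindow);
        double[] corrected = new double[intensities.length];
        for (int i = 0; i < intensities.length; i++) {
            corrected[i] = Math.max(0, intensities[i] - baseline[i]);
        }
        return corrected;
    }
}
```

The window has to be clearly wider than the peaks, otherwise the median follows the peak and removes signal along with the baseline.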

Static website

Small https://pages.github.com stub which advertises the project.

MSP file format support

The MALDI Biotyper system stores Main Spectra Projection (MSP) files which are used as references by proprietary automatic biomarker matching algorithms. According to the Bruker database creation protocol, multiple measurements (8 technical replicates) of a single defined strain (1 biological replicate) are needed to capture the whole biological variability of an organism. MSPs are then added to hierarchical projects/libraries to cover a whole taxonomic species.

MSPs are the sum of at least 21 spectra (8 spots, measured 3x) which undergo manual quality control to detect mass shifts that stem from erroneous sample preparation and are calibrated against 1 BTS (bacterial test standard) spot. The MSP creation itself is unsupervised and automated. It employs de-noising and patented mass corrections to the peak data.

Due to the proprietary nature of the file format and the unknown specifications which are very much built around the original Biotyper algorithm, write support can probably never achieve true compatibility. Read support is probably possible, but may just reveal a very reduced peak list with some metadata (correction factors, matching hints, strain information) that is used by database developers and partly exposed to the Bruker Biotyper Compass UI to end users.

Support for SpectraBank .peaks lists

https://www.usc.es/gl/investigacion/grupos/lhica/spectrabank/Database.html

mzIdentML support

In contrast to the proprietary MSP file format #2, mzIdentML (previously known as analysisXML) may be suitable as an open standard to exchange library entries. It might close the gap between overly simplistic CSV peak lists and raw data formats that lack vital metadata on sample origin and search instructions, and thereby help standardize matching across applications.
