Giter Club home page Giter Club logo

Comments (35)

AlanRace avatar AlanRace commented on September 15, 2024 4

Okay I think I have it now. It looks like the first part is quadratic and the rest is cubic. There is probably a much more efficient way to calculate this, but for testing here is the code:

library("readBrukerFlexData")
library("readMzXmlData")

f <- readBrukerFlexFile("0_P10/1/1Ref/fid") 
m <- readMzXmlFile("0_P10.mzXML")

mz <- m$spectrum$mass
tof <- f$spectrum$tof

A <- 6.1682474689786355
B <- sqrt(1e12/1164156.9998801197)
C <- 3884.868239229128
D <- -0.063083671657452864
E <- -34.362308485323794

a = D
b = A
c = B
d = C

# Calculate sqrt(mz) as if it were quadratic, using formula in https://github.com/sgibb/readBrukerFlexData/blob/master/R/tof2mass-functions.R
# But keep track of the sign (whether tof is bigger than C)
tofMinusC = (tof-C)
sign = ((tofMinusC > 0) * 2)-1
to_sqrt = ((B * B) - (4 * A * (tofMinusC)))
root_mz = (-B + sqrt(abs(to_sqrt)))/(2 * A)

predicted_mz <- c()

for(i in 1:length(tof)) {
  # Check whether tof was larger than C
  if(sign[i] < 0) {
    # Use precalculated quadratic calibration
    predicted_mz <- append(predicted_mz, (root_mz[i])^2 * sign[i] - E)
  } else {
    # Assume cubic with the form a*x^3 + b*x^2 + c*x + (d - tof) 
    # where x = sqrt(mz)
    cur_root_mz = Re(polyroot(c(d-tof[i], c, b, a))[1])
    
    predicted_mz <- append(predicted_mz, (cur_root_mz)^2 * sign[i] - E)
  }
}

plot(mz, mz-predicted_mz)

This results in errors ~= 10^-6 at m/z ~= 2000, but I think that is okay for most purposes.

calibration

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024 1

I am curious about how you trial and error to figure out the format of Bruker?

I modified the acqu(s) files compared the generated mzXML files with my code etc.. Also the article Titulaer et al 2006 and the FlexVariableTable.xml file from CompassXport was helpful.


I think I found the important difference. In all my own files the conversion from tof to m/z follows a quadratic equation: 0 = A * (sqrt(mz))^2 + B * sqrt(mz) + C(tof) and is coded in the acqu file (see also: https://github.com/sgibb/readBrukerFlexData/wiki/acqu-file#tof-calculation and https://github.com/sgibb/readBrukerFlexData/wiki/acqu-file#mass-calculation):

##$DELAY= 41161 
##$DW= 0.5 
##$ML1= 416186.134282719 
##$ML2= 180.735984610665 
##$ML3= -0.0485676649403545 
##$TD= 114222 

where

TOF = DELAY + (0:(TD-1)) * DW
A   = ML3
B   = sqrt(1e12/ML1)
C   = ML2 - TOF
MZ  = (-B + sqrt(B^2 - (4 * A * C))/(2 * A))^2

The newer files contain an additional variable NTBCal that includes among others the HPC calibration constants but also the tof to m/z constants (V3.0CTOFCalibrationConstants):

##$NTBCal= <V3.0CCalibrator 1 1 0 0 -2147483648 2147483647 V1.0CHPCData Order 10 vCoeff V1.0VectorDouble 11 -0.30899977070432588 0.00012359234583847467 1.493619815861809e-006 -2.7525708852997263e-009 2.2224914282887503e-012 -9.267322574910882e-016 1.4960981457711781e-019 3.0715708656370546e-023 -1.8530337885161092e-026 3.203442186133685e-030 -1.9959759254400677e-034 c2 -0.048940611526270571 c0 225.86975630715625 minMass 736.50266799999997 maxMass 3814.7214599999998 bUse 1 endCHPCData V3.0CTOFCalibrationConstants 41160.999999999993 0.49999999999999994 180.73598461066527 416186.13428271882 -0.048567664940354464 2 V3.0CTOFCalibrationConstants 41160.999999999993 0.49999999999999994 181.7119550333 416153.63347654999 -0.048420739890736003 2 >

It seems to have the following pattern:

##$NTBCal= < ... V3.0CTOFCalibrationConstants DELAY DW C B A 2 >

Your file has the following string:

##$NTBCal= <V3.0CCalibrator 12 1 0 0 -2147483648 2147483647 V1.0CHPCData Order 0 vCoeff V1.0VectorDouble 0 c2 0 c0 0 minMass 0 maxMass 0 bUse 0 endCHPCData V1.0CTOF2CalibrationConstants 48 0.25 3884.868239229128 1164156.9998801197 6.1682474689786355 -0.063083671657452864 -34.362308485323794 2 V1.0CTOF2CalibrationConstants 0 0 0 0 0 0 0 -1 >

Which could be:

##$NTBCal= < ... V3.0CTOFCalibrationConstants DELAY DW C B A D E 2 >

and maybe should be used in a 4-order-polynomial:

0 = E * (sqrt(mz))^4 + D * (sqrt(mz))^3 + A * (sqrt(mz))^2 + B * sqrt(mz) + C(tof)

I won't have time to test it this year (have to work each night for 12 hours the next days). If you like you could try it. To read the tof data you could use:

library("readBrukerFlexData")
f <- readBrukerFlexFile("0_P10/1/1Ref/fid", filterZeroIntensities=TRUE)

f$spectrum$tof
f$spectrum$mass # wrong mass
f$spectrum$intensity

It would be great to support your file too. If the above is not working I am going to throw an not supported error if more than 3 constants are present in NTBCal.

from readbrukerflexdata.

elsagorre avatar elsagorre commented on September 15, 2024 1

Hi,
I was previously experiencing the same problems where when I imported my data via the importBrukerFlex, there was a mass shift. We figured out that the reason was because after the spectra were collected, they were calibrated using the Cubic enhanced, and the importBrukerFlex needs the spectra to be calibrated using the Quadratic function. Try to calibrate your spectra in FlexAnalysis using the quadratic function, save them and then import them back to R. That should remove the mass shift you see.

Hope this helps!

from readbrukerflexdata.

SamGG avatar SamGG commented on September 15, 2024 1
library("readBrukerFlexData")
library("readMzXmlData")

f <- readBrukerFlexFile("spectra_P10/0_P10/1/1Ref/fid")
m <- readMzXmlFile("0_P10.mzXML")

tof <- f$spectrum$tof
mz <- m$spectrum$mass

range(tof)
range(mz)

length(tof)
length(mz)

(idx = as.integer(seq(1, length(tof), length.out = 100)))

tof = tof[idx]
mz  = mz[idx]

plot(tof, mz)
plot(tof, sqrt(mz))


##$NTBCal= <V3.0CCalibrator 12 1 0 0 -2147483648 2147483647  V1.0CHPCData  Order 0 vCoeff V1.0VectorDouble 0  c2 0 c0 0 minMass 0 maxMass 0 bUse 0 endCHPCData V1.0CTOF2CalibrationConstants 48 0.25 3884.868239229128 1164156.9998801197 6.1682474689786355 -0.063083671657452864 -34.362308485323794 2 V1.0CTOF2CalibrationConstants 0 0 0 0 0 0 0 -1  > 
##$NTBCalr= <V3.0CCalibrator 12 1 0 0 -2147483648 2147483647  V1.0CHPCData  Order 0 vCoeff V1.0VectorDouble 0  c2 0 c0 0 minMass 0 maxMass 0 bUse 0 endCHPCData V1.0CTOF2CalibrationConstants 48 0.25 3884.868239229128 1164156.9998801197 6.1682474689786355 -0.063083671657452864 -34.362308485323794 2 V1.0CTOF2CalibrationConstants 48 0.25 3884.868239229128 1164156.9998801197 6.1682474689786355 -0.063083671657452864 -34.362308485323794 2> 

Calib = c(3884.868239229128, sqrt(1e12/1164156.9998801197), 6.1682474689786355, -0.063083671657452864, -34.362308485323794, 2)

df <- data.frame(tof=tof,x=sqrt(mz))
plot(df$tof, df$x)

I didn't succeed in fitting it with the equation, but the equation is OK if the fifth coefficient is zero. This means the equation is cubic instead of 4th order. In fact, @sgibb has nearly get the correct equation except he switched the last two coefficients (D<=>E). Here is my code and a test file (matrix control).

library("readBrukerFlexData")
library("readMzXmlData")
#> Warning: package 'readMzXmlData' was built under R version 4.0.2

# define file location and read
setwd("C:/data/active/tests/test-bruker/")
f <- readBrukerFlexFile("spectra_D23/0_D23/1/1SLin/fid")
#> Warning in .readAcquFile(fidFile = fidFile, verbose = verbose): File 'C:
#> \data\active\tests\test-bruker\spectra_D23\0_D23\1\1SLin\fid' seems to be empty
#> because no laser shots applied to this sample.
#> Warning in readBrukerFlexFile("spectra_D23/0_D23/1/1SLin/fid"): The spectrum file 'C:\data\active\tests\test-bruker\spectra_D23\0_D23\1\1SLin\fid' uses V1.0CTOF2CalibrationConstants.
#> V1.0CTOF2CalibrationConstants aren't supported by readBrukerFlexFile. See https://github.com/sgibb/MALDIquantForeign/issues/19 for details.
#> Could not convert time-of-flight values into mass!
m <- readMzXmlFile("spectra_D23.mzXML")  # converted using ProteoWizard

# extract information
tof <- f$spectrum$tof
mz <- m$spectrum$mass

range(tof)
#> [1] 13987.0 94812.6
range(mz)
#> [1]   399.9186 19933.5835

length(tof)
#> [1] 50517
length(mz)
#> [1] 50517

# sample 100 points
(idx = as.integer(seq(1, length(tof), length.out = 100)))
#>   [1]     1   511  1021  1531  2042  2552  3062  3572  4083  4593  5103  5613
#>  [13]  6124  6634  7144  7654  8165  8675  9185  9695 10206 10716 11226 11737
#>  [25] 12247 12757 13267 13778 14288 14798 15308 15819 16329 16839 17349 17860
#>  [37] 18370 18880 19390 19901 20411 20921 21432 21942 22452 22962 23473 23983
#>  [49] 24493 25003 25514 26024 26534 27044 27555 28065 28575 29085 29596 30106
#>  [61] 30616 31127 31637 32147 32657 33168 33678 34188 34698 35209 35719 36229
#>  [73] 36739 37250 37760 38270 38780 39291 39801 40311 40822 41332 41842 42352
#>  [85] 42863 43373 43883 44393 44904 45414 45924 46434 46945 47455 47965 48475
#>  [97] 48986 49496 50006 50517

tof = tof[idx]
mz  = mz[idx]

# overview
plot(tof, mz)

plot(tof, sqrt(mz))

# extract from the acqu file
##$NTBCal= <V3.0CCalibrator 12 1 0 0 -2147483648 2147483647  V1.0CHPCData  Order 0 vCoeff V1.0VectorDouble 0  c2 0 c0 0 minMass 0 maxMass 0 bUse 0 endCHPCData V1.0CTOF2CalibrationConstants 13987.200000000001 1.5999999999999999 581.41079117084337 2220702.4352394971 -0.036090927547664166 7.3810763208576876e-005 0 2 V1.0CTOF2CalibrationConstants 13987.200000000001 1.5999999999999999 581.41079117084337 2220702.4352394971 -0.036090927547664166 7.3810763208576876e-005 0 2  > 
##$NTBCalr= <V3.0CCalibrator 12 1 0 0 -2147483648 2147483647  V1.0CHPCData  Order 0 vCoeff V1.0VectorDouble 0  c2 0 c0 0 minMass 0 maxMass 0 bUse 0 endCHPCData V1.0CTOF2CalibrationConstants 13987.200000000001 1.5999999999999999 581.41079117084337 2220702.4352394971 -0.036090927547664166 7.3810763208576876e-005 0 2 V1.0CTOF2CalibrationConstants 13987.200000000001 1.5999999999999999 581.41079117084337 2220702.4352394971 -0.036090927547664166 7.3810763208576876e-005 0 2  > 

# define the calibration coefficients
Calib = c(581.41079117084337, sqrt(1e12/2220702.4352394971), -0.036090927547664166, 7.3810763208576876e-005, 0, 2)


# model of the data
df <- data.frame(tof=tof,x=sqrt(mz))

(mod2 = lm(tof ~ 1 + I(x) + I(x^2), data=df))
#> 
#> Call:
#> lm(formula = tof ~ 1 + I(x) + I(x^2), data = df)
#> 
#> Coefficients:
#> (Intercept)         I(x)       I(x^2)  
#>   606.45893    669.77815     -0.01825
Calib[1:3]
#> [1] 581.41079117 671.04989940  -0.03609093

# the coefficients perfectly match the coefficients found by a linear fit
(mod4 = lm(tof ~ 1 + I(x) + I(x^2) + I(x^3) + I(x^4), data=df))
#> 
#> Call:
#> lm(formula = tof ~ 1 + I(x) + I(x^2) + I(x^3) + I(x^4), data = df)
#> 
#> Coefficients:
#> (Intercept)         I(x)       I(x^2)       I(x^3)       I(x^4)  
#>   5.812e+02    6.710e+02   -3.609e-02    7.381e-05   -7.016e-18
Calib[1:5]
#> [1]  5.814108e+02  6.710499e+02 -3.609093e-02  7.381076e-05  0.000000e+00

# graphical view
plot(df$x, df$tof)
lines(df$x, predict(mod2), col = 2, lw = 2)
lines(df$x, predict(mod4), col = 4, lw = 2)

# error on the estimated tof is very small, < 8e-11
plot(df$x, predict(mod4) - df$tof)

Created on 2020-10-07 by the reprex package (v0.3.0)

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

@hsiaoyi0504 thanks for your bug report and the attached files. readBrukerFlexData (the package used for fid file import) fails to calculate the m/z values here.
To be honest I don't know why. Bruker didn't comment their file format and everything I know about it was found by trial and error.
I guess you measured the data in reflector mode. I compared it to some files I have and can't find any important difference yet. So I don't even know how to figure out when this problem arises nor how to fix it.
Please find a diff here

from readbrukerflexdata.

hsiaoyi0504 avatar hsiaoyi0504 commented on September 15, 2024

OK, I got it, but I am curious about how you trial and error to figure out the format of Bruker?

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

If you are interested you can find all I know about the format in this wiki: https://github.com/sgibb/readBrukerFlexData/wiki/acqu-file

The important part is the time-of-flight to m/z conversion described in Titulaer et al 2006 and implemented in https://github.com/sgibb/readBrukerFlexData/blob/master/R/tof2mass-functions.R

Do you use an autoflexTOF/TOF device (my data are from an ultraflexTOF/TOF)? It seems the deflector was not activated (##$DEFLON= no). Is that correct?

from readbrukerflexdata.

hsiaoyi0504 avatar hsiaoyi0504 commented on September 15, 2024

My colleague uses autoflex speed TOF/TOF, and I am not sure if he uses reflector mode. I will check that.

from readbrukerflexdata.

hsiaoyi0504 avatar hsiaoyi0504 commented on September 15, 2024

I confirmed that we used reflector mode.

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

Unfortunately this idea doesn't work:

library("readBrukerFlexData")
library("readMzXmlData")

f <- readBrukerFlexFile("0_P10/1/1Ref/fid", filterZeroIntensities=TRUE)
m <- readMzXmlFile("P10_raw.mzXML")

A <- 6.1682474689786355
B <- sqrt(1e12/1164156.9998801197)
C <- 3884.868239229128
D <- -0.063083671657452864
E <- -34.362308485323794

mz <- m$spectrum$mass
tof <- f$spectrum$tof

y <- D * (sqrt(mz))^4 + E * (sqrt(mz))^3 + A * (sqrt(mz))^2 + B * sqrt(mz) + C - tof

png("quad.png")
plot(mz, y, xlab="mz")
dev.off()

quad

The difference between "the old" quadratic equation and the new unknown formula seems to be quadratic itself:

d <- m$spectrum$mass - f$spectrum$mass
png("diff.png")
plot(m$spectrum$mass, d, type="b", xlab="m/z", ylab="difference")
dev.off()

diff

Any ideas?

from readbrukerflexdata.

Armadilloa16 avatar Armadilloa16 commented on September 15, 2024

Hi There,

I was looking into reverse engineering the Bruker file formats awhile back and although I ended up putting that in the "too much effort" basket for the time being I kept an eye on this repo to come back too at some point. Anyway I'd like to help if I can but I'm not quite sure how.

I took a quick look at the data uploaded and you did already @sgibb, and taking what @hsiaoyi0504 said as an assumption (that the mzXML data is being read in correctly?), we can see that it is a quadratic relation to TOF, mz = A * tof^2 + B * tof + C with coefficients, A = 7.915023e-07, B = -2.619041e-03, C = 2.599336e+01

library("readBrukerFlexData")
library("readMzXmlData")

fid <- readBrukerFlexFile("0_P10/1/1Ref/fid")
xml <- readMzXmlFile("0_P10.mzXML")

df = data.frame(tof = fid$spectrum$tof,
                mz  = xml$spectrum$mass)
m = lm(mz ~ I(tof^2) + I(tof), data = df)
summary(m)

I'm not sure how the coefficients match up to the parameters in the acqu file though, looking at the values that are apparently normally used,

##$DELAY= 48 
##$DW= 0.25 
##$ML1= 1164156.99988012 
##$ML2= 3884.86823922913 
##$ML3= 6.16824746897864 
##$TD= 206848 

the transformed sqrt(1e12 / ML1) which evaluates to 926.8175 and the two other values in the
##$NTB= < ...> parameter, -0.063083671657452864 and -34.362308485323794, nothing
jumps out at me...

from readbrukerflexdata.

hsiaoyi0504 avatar hsiaoyi0504 commented on September 15, 2024

So it seems that we need to add this into the user docs?

from readbrukerflexdata.

elsagorre avatar elsagorre commented on September 15, 2024

Well, either that or hopefully we can figure out a way to add another variable to the TOF equation, which would change the quadratic equation into the cubic one.

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

@hsiaoyi0504 So it seems that we need to add this into the user docs?

I am going to try to figure out how to import this type of files correctly. If we don't get it, I am going to add an error if the NTBCal string contains more than 3 calibration constants.

@Armadilloa16 thanks for testing. The relation is tof ~ sqrt(mz) not tof ~ mz. But even than I can't find any useful answer.

Using the attached files (0_A20.zip, generated by an autoflex device) assuming a quadratic relationship works (as we already know):

library("readBrukerFlexData")
library("readMzXmlData")

f <- readBrukerFlexFile("0_A20/1/1SLin/fid")
m <- readMzXmlFile("0_A20/1/1SLin/0_A20.mzXML")

mz <- m$spectrum$mass
tof <- f$spectrum$tof

# the NTBCal string:
# 19886 0.9999999999999999 268.44302617844 2597289.7995303 -0.004433520310037
cc <- f$metaData$calibrationConstants
cc[1] <- sqrt(1e12/cc[1])

df <- data.frame(tof=tof,
                 x=sqrt(mz))
m <- lm(tof ~ I(x^2) + I(x), data=df)

cbind(acqu=cc[c(2, 3, 1)], model=coef(m))
#            acqu        model
# c2 268.44302618 268.44302618
# c3  -0.00443352  -0.00443352
# c1 620.49715567 620.49715567

The same for the @hsiaoyi0504's files is not working, neither for the cubic nor the 4th-order-polynomial model:

library("readBrukerFlexData")
library("readMzXmlData")

f <- readBrukerFlexFile("0_P10/1/1Ref/fid", filterZeroIntensities=TRUE)
m <- readMzXmlFile("P10_raw.mzXML")

mz <- m$spectrum$mass
tof <- f$spectrum$tof

# the NTBCal string:
# 48 0.25 3884.868239229128 1164156.9998801197 6.1682474689786355 
# -0.063083671657452864 -34.362308485323794
df <- data.frame(tof=tof,
                 x=sqrt(mz))

lm(tof ~ I(x^2) + I(x), data=df)
#
# Call:
# lm(formula = tof ~ I(x^2) + I(x), data = df)
#
# Coefficients:
# (Intercept)       I(x^2)         I(x)
#    -282.260       -1.054     1205.826

lm(tof ~ I(x^3) + I(x^2) + I(x), data=df)
#
# Call:
# lm(formula = tof ~ I(x^3) + I(x^2) + I(x), data = df)
#
# Coefficients:
# (Intercept)       I(x^3)       I(x^2)         I(x)
#  -1.313e+03    8.308e-02   -7.498e+00    1.357e+03

lm(tof ~ I(x^4) + I(x^3) + I(x^2) + I(x), data=df)
#
# Call:
# lm(formula = tof ~ I(x^4) + I(x^3) + I(x^2) + I(x), data = df)
#
# Coefficients:
# (Intercept)       I(x^4)       I(x^3)       I(x^2)         I(x)
#  -3.632e+03   -9.276e-03    1.034e+00   -4.131e+01    1.842e+03

Could it be that it depends on the device? It is a TOF/TOF. So maybe there are different constants/formulas for each flight tube/mass analyzer? (resulting in an additive model?)

from readbrukerflexdata.

recumbentbirder avatar recumbentbirder commented on September 15, 2024

Sorry I didn't have the time yet to look into it. And didn't in depth still. But let my just comment on one thing that seems very suspicious to me, which is the delay value: the first two examples (graphs) in this thread seem correct to me except for the mass shift. The shape is ok. That means, the problem lies in the 'delay' alone, which is the time between shot and recording data.

So basically, when you want to record a spectrum from lower masses of 2000 Da, you shoot, and then just let pass by the ions with masses smaller than that. Just wait, until they passed. This is the delay. And when the delay is in fact 48 as in the 'problematic' cases, the shift in the result seems 'correct' to me.

Perhaps we have to figure out how to find the correct delay? Because obviously other tools can do it.

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

@recumbentbirder thanks for looking into this. In fact the delay seems very short but wouldn't a different delay just result in a linear mass shift? But the spectrum is stretched and shifted as well.

from readbrukerflexdata.

recumbentbirder avatar recumbentbirder commented on September 15, 2024

@sgibb oh, yes! Sorry. See, I didn't look that deep into it. I don't know yet, when I have a little more time :-/

from readbrukerflexdata.

recumbentbirder avatar recumbentbirder commented on September 15, 2024

Let me contribute with a screenshot of said spectrum as displayed by Bruker's flexAnalysis:

to be honest, of all the plots now, this seems the most credible to me. Considering the low delay value, translating directly in a low minimum mass. But this is just a matter of displaying it: I guess you set 'xlim'?

But one interesting observation: in the plotted range, the maximum arbitrary unit intensity is shown as around 2000 by flexAnalysis for the maximum peak at 700 Da. Though the units are arbitrary (dependent on how many shots you sum up), the values in a spectrum file are 'absolute': there's normally no normalization done by default for plotting ...

spec_flanal

@hsiaoyi0504, how did you come up with the mzXML file? A Bruker software, or third party?

from readbrukerflexdata.

hsiaoyi0504 avatar hsiaoyi0504 commented on September 15, 2024

My partner used the mzconvert in the proteowizard to convert the files.

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

Sorry for the long delay. I just uploaded readBrukerFlexData 1.8.5 to CRAN. In this version readBrukerFlexData throws a warning if it founds the V1.0CTOF2CalibrationConstants string in NTBCal and doesn't convert the TOF data to m/z values anymore.

from readbrukerflexdata.

SamGG avatar SamGG commented on September 15, 2024

Same problem. Does anybody ask Bruker directly?

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

Back in 2010 and in early 2011 when I started the development of readBrukerFlexData I had some conversations with two developers and an official speaker from Bruker (about HPC). They weren't allowed to give me any details on the file format.
I assume that this attitude didn't changed but I don't know for sure.

from readbrukerflexdata.

SamGG avatar SamGG commented on September 15, 2024

Bruker's policy didn't change. The polynomial equation is confirmed. HPC is patented without any communication.

I spent hours at understanding the question. I think I reached some conclusions (see 2nd next comment, thanks @sgibb for the equation coefficients). I need some help to finish...

I would like a spectra with a fifth calibration coefficient not zero. The one of @hsiaoyi0504 seems problematic to me (next comment).

I would like some feedback from whoever is listening here.

Best,
Samuel

from readbrukerflexdata.

SamGG avatar SamGG commented on September 15, 2024

The file from @hsiaoyi0504 but still don't understand it. What is surprising is the shape of the transformation. I plotted the sqrt(m/z) from the mzXML versus tof and the curve at low tof is somehow odd for me.

image

from readbrukerflexdata.

SamGG avatar SamGG commented on September 15, 2024

Final thought: using a cubic or 4th order (aka "enhanced cubic" as called by Bruker IIUC) is interesting. I showed that we interpret the equation to some extent (up to 3th order). I did a verification that the m/z equation equals the tof. I did not solve the equation in order to compute the m/z from the equation and the tof values. When the equation was of 2nd order, there is a symbolic solution, the one stated by @sgibb . Higher orders imply the use of polyroot function to solve the equation for sqrt(mz). I think we could solve the equation for 1000 points and interpolate the values between them. But there are maybe better solutions.

# only 2nd order coefficients
# 0 = A * (sqrt(mz))^2 + B * sqrt(mz) + C(tof)
Re(polyroot(c(Calib[1]-tof[1], Calib[2], Calib[3])))^2
Re(polyroot(c(Calib[1]-tof[length(tof)], Calib[2], Calib[3])))^2

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

@SamGG thanks a lot for looking into it! Can you share your test files? Your work is already an improvement. Instead of not converting any tof to mz if V1.0CTOF2CalibrationConstants is present in the acqu file we could at least convert up to 3th order polynomials.

from readbrukerflexdata.

czanetti avatar czanetti commented on September 15, 2024

Same problem here, I need to use the cubic enhanced calibration to get decent mass accuracies but also want to keep the metadata from readBrukerFlexData since it is very useful.
If I am not misunderstood, there was some progress but we are not there yet for the exact conversion, I also gave it a try but soon found out it is not an easy task.

I was thinking of a workaround, do you people think this approach is reliable?

  1. use the spectra calibrated with the cubic enhanced equation (or any other unsupported calibration)
  2. export the valid m/z spectra in ASCII or other open formats (e.g flexAnalysis supports batch exporting in ASCII, retaining sample name and fid path)
  3. read the acquired data (fid) with readBrukerFlexData to get all the metadata into R (but wrong m/z values)
  4. substitute the mass and intensity values stored by readBrukerFlexData with the correct ASCII values

Another more elegant/reliable idea could be to store the metadata from the acqu(s) file into mzML or mzXML files, do you think this is feasible? (compassXport documentation is not so user friendly, but it seems to me that this FlexVariableTable.xml may be useful)

or if anybody has suggestions for other workarounds to keep the metadata I would greatly appreciat that.

Thanks for the great contributions!
Best,
Cristian

from readbrukerflexdata.

FrederikHolck avatar FrederikHolck commented on September 15, 2024

Hi,
We've recently run into the same issue with bruker flex data and V1.0CTOF2CalibrationConstants not being supported and read by readBrukerFlexFile. Are there any news on the development?
I'm afraid I'm not qualified to help in any way. We simply wish to read bruker flex data and export to MSD files.
Best,
Frederik

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

Unfortunately not. You just could use the Bruker software to export fid to mzML and use MALDIquantForeign to import this mzML and export it to MSD.

from readbrukerflexdata.

AlanRace avatar AlanRace commented on September 15, 2024

I think the final variable may be some form of constant correction to the resulting m/z. That would explain why when this value is 0, the cubic equation fits (as @SamGG showed).

library("readBrukerFlexData")
library("readMzXmlData")

f <- readBrukerFlexFile("0_P10/1/1Ref/fid") #, filterZeroIntensities=TRUE)
m <- readMzXmlFile("0_P10.mzXML")

A <- 6.1682474689786355
B <- sqrt(1e12/1164156.9998801197)
C <- 3884.868239229128
D <- -0.063083671657452864
E <- -34.362308485323794

mz <- m$spectrum$mass
tof <- f$spectrum$tof

mzPlusE = mz + E
sign = ((mzPlusE > 0) * 2)-1

df <- data.frame(tof=tof,x=sqrt(abs(mzPlusE))*sign) 

(mod4 = lm(tof ~ 1 + I(x) + I(x^2) + I(x^3), data=df))

plot(df$x, df$tof)
lines(df$x, predict(mod4), col = 2, lw = 2)

x_vs_tof

I don't get the exact coefficients, but fairly close:

Call:
lm(formula = tof ~ 1 + I(x) + I(x^2) + I(x^3), data = df)

Coefficients:
(Intercept)         I(x)       I(x^2)       I(x^3)  
 3845.78915    934.56039      5.79341     -0.05795  
3884.8682    926.8175    6.16824    -0.0630

from readbrukerflexdata.

AlanRace avatar AlanRace commented on September 15, 2024

A little more detail. The following (cubic transform) seems to produce the correct m/zs for the 'upper' mass range (i.e. above the m/z dictated by the final coefficient).

library('AlgebraicHaploPackage')
library("readBrukerFlexData")
library("readMzXmlData")

f <- readBrukerFlexFile("0_P10/1/1Ref/fid") 
m <- readMzXmlFile("0_P10.mzXML")

mz <- m$spectrum$mass
tof <- f$spectrum$tof

A <- 6.1682474689786355
B <- sqrt(1e12/1164156.9998801197)
C <- 3884.868239229128
D <- -0.063083671657452864
E <- -34.362308485323794

a = D
b = A
c = B
d = C

root_mz <- c()

for(t in tof) {
  root_mz <- append(root_mz, cubic(a, b, c, d-t)[3])
}

root_mz = Re(root_mz)
predicted_mz = root_mz*root_mz*sign(root_mz)-E 

plot(mz, predicted_mz)

index_vs_difference

It seems like the range up to this point is determined by a different function(?).

mz_vs_difference

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

@AlanRace that looks promising! Great stuff! I will give it a more detailed look in the next days!

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

Dear @SamGG and @AlanRace thank you for your great work! Based on it I integrated the cubic calibration. I add you both as contributors to the DESCRIPTION file. Do you want to add some additional information (ORCID, email, ...?) there? Please feel free to ask for any changes.

Please find the cubic calibration in the issue3-ctof2calibration branch.
It would be great if you could review and test it. If you are both satisfied with it I am going to publish a new version on CRAN.
To test use:

remotes::install_github("sgibb/readBrukerFlexData@issue3-ctof2calibration")

Something I currently not understand completely. It seems to be important to change C - tof into tof - C for the quadratic part (if you compare the equation in tof2mass-functions.R and ntbcal-functions.R (I renamed C into D in the cubic code.)). I would assume that the quadratic part is of this should be identical to the original tof to mz equation.

from readbrukerflexdata.

AlanRace avatar AlanRace commented on September 15, 2024

Looks great to me! I'll try and test it properly over the next few days.

Happy for you to add my ORCID (https://orcid.org/0000-0001-8996-2641) and email ([email protected]).

I guess if you wanted it to be consistent between quadratic and cubic part you could just do abs(tof - C)? I guess this is just a measure of distance (in time) away from this pivot point where the function swaps over.

from readbrukerflexdata.

sgibb avatar sgibb commented on September 15, 2024

Finally we could close this issue. Thank you all for your support and patience!

I uploaded a new version (1.9.0) to CRAN.

from readbrukerflexdata.

Related Issues (5)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.