Alan Smith reported: We ran into a problem with the newer versions o

Example from Alan: <div class="snippet-clipboard-content notranslate positi

So the raw RT EIC has : <div class="snippet-clipboard-content notranslate position

Alan Smith made some excellent progress: Hi Steffen, <p dir="aut

Merged and pushed as 1.147.3 with Jo's fixes <a class="commit-link" data-hovercard-typ

Problem with getEICs in xcms about xcms HOT 8 CLOSED

sneumann commented on June 16, 2024

Problem with getEICs in xcms

from xcms.

Comments (8)

sneumann commented on June 16, 2024

Example script from Alan:

library(xcms)
library(faahKO)

cdfpath <- system.file("cdf", package = "faahKO")
list.files(cdfpath, recursive = TRUE)
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)

xset<-xcmsSet(cdffiles,profmethod = "bin", method="centWave", ppm=30, peakwidth=c(5,50), snthresh=10, prefilter=c(7,1000), integrate=1, mzdiff= -0.001, fitgauss=FALSE, scanrange= c(0,500),mzCenterFun="wMean",noise=300,sleep=0, verbose.columns=T,nSlaves=3);
xset2<-retcor(xset, method = "obiwarp",distFunc="cor_opt",profStep=1,plot="deviation",gapInit=0.8, gapExtend=3.3)
gxset2<-group(xset2,method="density", bw=8, minfrac = .5, minsamp = 2, mzwid = .01, max = 50, sleep = 0)
xset3<-fillPeaks(gxset2);

peaktab <-data.frame(ID=groupnames(xset3),peakTable(xset3))

a<-getEIC(xset3,groupidx=c("M205T2790","M392T2927"))
b<-getEIC(xset3,groupidx=c("M205T2790","M392T2927"),rt="raw")
pdf("plot.example.pdf")
plot(a)
plot(b)
dev.off()
sessionInfo()

from xcms.

sneumann commented on June 16, 2024

So the raw RT EIC has :

> str(b)
Formal class 'xcmsEIC' [package "xcms"] with 5 slots
  ..@ eic       :List of 12
  .. ..$ ko15:List of 2
  .. .. ..$ : num [1:128, 1:2] 2689 2691 2692 2694 2695 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:2] "rt" "intensity"
  .. .. ..$ : num [1:128, 1:2] 2828 2830 2832 2833 2835 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:2] "rt" "intensity"
[...]
  .. ..$ wt22:List of 2
  .. .. ..$ : num [1:128, 1:2] 2689 2691 2692 2694 2695 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:2] "rt" "intensity"
  .. .. ..$ : num [1:128, 1:2] 2828 2830 2832 2833 2835 ...
  .. .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. .. ..$ : NULL
  .. .. .. .. ..$ : chr [1:2] "rt" "intensity"
  ..@ mzrange   : num [1:2, 1:2] 205 392 205 392
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "mzmin" "mzmax"
  ..@ rtrange   : num [1:2, 1:2] 2689 2827 2889 3027
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "rtmin" "rtmax"
  ..@ rt        : chr "raw"
  ..@ groupnames: chr [1:2] "M205T2790" "M392T2927"

i.e. the raw have 128 datapoints per EIC.

from xcms.

sneumann commented on June 16, 2024

Hi Alan,
it would be a fantastic help if you could create a self-contained and standalone test here.
The we could try to isolate the relevant commit between 1.26.1 and 1.44.x with "git bisect"
(see https://git-scm.com/book/en/v2/Git-Tools-Debugging-with-Git or https://robots.thoughtbot.com/git-bisect )
Yours,
Steffen

from xcms.

sneumann commented on June 16, 2024

Alan Smith made some excellent progress:

Hi Steffen,

I picked this up again and found that the problems in EIC extraction are caused by the length of the vector of corrected retention times. The vector of corrected RTs corresponds to the length of the all of the scans in an xcmsRaw object in xcms version 1.26.1 and only the scan range used for peak detection in xcms version 1.46.0.

When we perform peak detection using findPeaks.centWave we only evaluate a portion of the gradient that contains peaks of interest. This subset represents a subset of the total number of scans present in the mzData data file. We use the argument scanrange to reduce scans to regions of interest using centwave. This reduces computation time and focus downstream analysis on the regions of the gradient we are interested in.
a) A typical LC-MS analysis contains 7007 scans.
b) We perform peak detection over 5325 scans.
We typically analyze EICs to assess performance of peak detection, grouping and specifically retention correction.
a) To evaluate retention time alignment we plot two EICs for each feature. getEICs with "rt" argument set to "raw" and then set to "corrected"
b) We found that accurate EICs could not be generated when the "rt" argument was set to "corrected" with newer versions of xcms, but that accurate EICs
could be generated with "rt" set to raw.
c) The line of code in getEICs which leads to this observations is below. The code came from getMethod("getEIC", "xcmsSet") for xcms version 1.46.0. lcraw@scantime <- object@rt$corrected[[sampidx[i]]]
The root of the EIC extraction problem seems to come from how the xcms class object is initially created. The older libraries created an rt slot that had values of the entire scan range [7007 entries per file] while the newer versions of xcms create a rt slot with the truncated scan ranges [5325 entries per file ]. When getEICs is called the scantime vector for the raw lcms file is overwritten with the truncated range of the corrected retention time range. This overwriting of the scan range causes some problems with when the eics are created. I was not able to follow the code through getMethod("getEICNew", "xcmsRaw") or getMethod("getEICOld", "xcmsRaw") to find the exact line of code where an index is messed up.
I was able to hack the centwave (xset) object by taking the following steps.

# hack to plot rt corrected EICs on samples analyzed using a subset of the retention time.
lcraw <- xcmsRaw(filepaths(xset)[1], profmethod = profinfo(xset)$method, profstep = 0)
xset@scanrange <- c(0,length(lcraw@scantime)) 
raw.dat <- sapply(filepaths(xset),xcmsRaw,profmethod = profinfo(xset)$method, profstep = 0)
names(raw.dat ) <- 1:length(raw.dat)
xset@rt$raw <- lapply(raw.dat,function(x) x@scantime)
xset@rt$corrected <- lapply(raw.dat,function(x) x@scantime)

Attach is a pdf with an EIC generated using data after filling peaks [no hack] and an EIC generated using the hack on a centwave prior to retention time correction and filling peaks.

Let me know if there is anything else I can do to help.
Alan

from xcms.

jorainer commented on June 16, 2024

Concerning Alan Smith's point 3) above:
for the getEICOld and getEICNew, I introduced that split as I had some problems with the "original" getEIC method (now referred to as getEICOld) on xcmsRaw objects. Just out of curiosity, do you see the same problem you're describing when using the getEICNew method? You can switch to getEICNew using:

library(xcms)
opts <- options()$BioC
opts$xcms$getEIC.method <- "getEICNew"
options(BioC=opts)

thanks, jo

from xcms.

jorainer commented on June 16, 2024

Actually, I repeated the code chunk above after changing xcms to use the "new" getEIC method:

library(xcms)
library(faahKO)
## change getEIC to use getEICNew
opts <- options()$BioC
opts$xcms$getEIC.method <- "getEICNew"
options(BioC=opts)

cdfpath <- system.file("cdf", package = "faahKO")
list.files(cdfpath, recursive = TRUE)
cdffiles <- list.files(cdfpath, recursive = TRUE, full.names = TRUE)

xset<-xcmsSet(cdffiles,profmethod = "bin", method="centWave", ppm=30, peakwidth=c(5,50), snthresh=10, prefilter=c(7,1000), integrate=1, mzdiff= -0.001, fitgauss=FALSE, scanrange= c(0,500),mzCenterFun="wMean",noise=300,sleep=0, verbose.columns=T,nSlaves=3);
xset2<-retcor(xset, method = "obiwarp",distFunc="cor_opt",profStep=1,plot="deviation",gapInit=0.8, gapExtend=3.3)
gxset2<-group(xset2,method="density", bw=8, minfrac = .5, minsamp = 2, mzwid = .01, max = 50, sleep = 0)
xset3<-fillPeaks(gxset2);

peaktab <-data.frame(ID=groupnames(xset3),peakTable(xset3))

a<-getEIC(xset3,groupidx=c("M205T2790","M392T2927"))
b<-getEIC(xset3,groupidx=c("M205T2790","M392T2927"),rt="raw")
pdf("plot.example-new.pdf")
plot(a)
plot(b)
dev.off()

The results look actually pretty good to me, but please reconfirm.
Eventually, we should switch by default to getEICNew (i.e. change getEIC.method="getEICOld" in init.R to getEIC.method="getEICNew")? What I didn't like about the getEICOld was that an EIC "buffer" was created, apparently to speed things up, but I guess this caused ultimately the problem. In the getEICNew I kept the code as simple as possible, loosing eventually some milliseconds in the calculation but be more stable.

from xcms.

sneumann commented on June 16, 2024

Merged and pushed as 1.147.3 with Jo's fixes 2f5e291 and 35be059 . If Alain confirms, this can be closed.

from xcms.

jorainer commented on June 16, 2024

closing this issue now - should have been fixed.

from xcms.

Problem with getEICs in xcms about xcms HOT 8 CLOSED

Comments (8)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent