rawdiag's People

Contributors

cpanse, jwokaty

rawdiag's Issues

dark theme

library(ggplot2)

gp <- PlotMassHeatmap(PXD006932, bins = 40)

# strip all decorations and draw the panels on a black background
gp2 <- gp +
  theme(axis.line = element_blank(),
        axis.text.x = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks = element_blank(),
        axis.title.x = element_blank(),
        axis.title.y = element_blank(),
        legend.position = "none",
        panel.background = element_blank(),
        panel.border = element_blank(),
        panel.grid.major = element_blank(),
        panel.grid.minor = element_blank(),
        plot.title = element_blank(),
        plot.subtitle = element_blank(),
        strip.background = element_blank(),
        strip.text = element_blank(),
        plot.background = element_rect(fill = "black"),
        panel.spacing = unit(-1, "lines"))

ggsave(filename = "graphics/Thumb.png", gp2,
  device = 'png',
  dpi = 300,
  height = 9, width =16)

read.tdf - Bruker timsTOF reader

read.tdf <- function(filename){
  # requires the DBI and RSQLite packages
  con <- DBI::dbConnect(RSQLite::SQLite(), filename)
  # close the connection even if the query fails
  on.exit(DBI::dbDisconnect(con))
  rv <- DBI::dbGetQuery(con, "SELECT * FROM Precursors a INNER JOIN Frames b on a.id == b.id;")

  # keep the columns of interest and rename them to the rawDiag column names
  rv <- rv[, c('Id','Time','ScanNumber','Intensity','SummedIntensities',
               'MonoisotopicMz', 'Charge', 'MsMsType')]
  colnames(rv) <- c('scanNumber','StartTime','BasePeakMass','BasePeakIntensity',
                    'totIonCurrent', 'PrecursorMass','ChargeState','MSOrder')
  rv$filename <- basename(filename)
  # translate the MsMsType codes into MS order labels
  rv$MSOrder[rv$MSOrder == 0] <- "Ms"
  rv$MSOrder[rv$MSOrder == 8] <- "Ms2"
  as.rawDiag(rv)
}

data analysis task: MaxQuant evidence file for targeted XIC extraction

MQ bfabric workunit with combined course data: 175310

Download the zip, then:

library(dplyr)

xx <- read.csv("paolo_20180716_o4526_MQ_txtFiles/evidence.txt", sep = "\t")
xx %>% head()
# possibly useful columns
relevant <- xx %>% select(Raw.file,
  Sequence, Modified.sequence, Proteins, Charge, MS.MS.m.z, m.z, Mass,
  Retention.time, Retention.length, Score, Delta.score, MS.MS.scan.number,
  Intensity, Type) %>% head()

unit test data

I created some raw files that could be used for unit testing.

a) Calibration mix recording on a FUSION (profile & centroid mode) using direct infusion (no LC separation!)

Pierce LTQ Velos ESI Positive Calibration Solution, Product number: 88323
product homepage
Certificate of analysis

Could be used to test basic functions like:

  • file header access
  • scan data retrieval (profile or centroid mode)
  • XIC generation
  • m/z peak detection (profile data)

FUSION1_calMix.zip

MSScan_Orbi_centroid.raw contains 50 scans of type
FTMS + c ESI Full ms [150.0000-2000.2000]

scan #50 looks like this in FreeStyle 1.4 (uses RawFileReader)
image

The file header contains
FileHeader_MSScan_Orbi_centroid.txt

The profile mode file is structured accordingly and displays like this for scan 2
image

ggplots for QCs

    gp <- ggplot(data = df, aes(x = log(abundance,10), y = log(intensity,10), fill=filename)) + 
      geom_point(stat='identity', size = 2, aes_string(group = "filename", colour = "filename")) +
      geom_smooth(method = "lm", se = FALSE, aes_string(group = "filename", colour = "filename")) +
      #geom_text(x = -2, y = 7, label = lm_eqn_promega(df), parse = TRUE, aes_string(group = "filename", colour = "filename")) +
      facet_wrap(~ sequence * filename,  scales = "free", nrow = 6)

and

    gp <- ggplot(data = df, aes(x = rt, y = t, fill=filename)) + 
      xlab("iRT score") + 
      ylab("retention time [minutes]") +
      geom_point(stat='identity', size = 2) +
      geom_smooth(method = "lm", se = FALSE, aes_string(group = "filename", colour = "filename")) 
      
    
    if (input$plottype == "trellis") {
      gp <- gp + 
        #geom_text(x = 0, y = median(df$t), label = lm_eqn(df), parse = TRUE) +
        #geom_text(x = -2, y = 7, label = lm_eqn(df), parse = TRUE, aes_string(group = "filename", colour = "filename")) +
        facet_wrap(~filename, ncol = 1,  scales = "free")
    }
  
    gp <- gp + scale_fill_manual(values = cbbPalette) 

Def. origin scan type for instrument cycle

It would be nice to define which scan type should be used as the marker for the start of an instrument cycle. Here is an example:

We execute cycles of
MS1 -> msxSIM -> MS2 -> ... -> MS1 -> ...

Selecting MS1 as the origin scan would define an instrument cycle. This would allow plotting cycle-specific statistics.

  • How many cycles did the instrument do?
  • How long is a cycle?
    ...
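
A minimal sketch of how such cycle statistics could be derived from scan metadata. The toy table below is made up for illustration; it only borrows the column names scanNumber, StartTime and MSOrder used elsewhere in these issues, and the variable `origin` is a hypothetical stand-in for the requested option.

```r
# Toy scan table (made-up values); StartTime is in minutes.
scans <- data.frame(
  scanNumber = 1:9,
  StartTime  = c(0.00, 0.01, 0.02, 0.03, 0.05, 0.06, 0.07, 0.09, 0.10),
  MSOrder    = c("Ms", "Ms2", "Ms2", "Ms", "Ms2", "Ms2", "Ms2", "Ms", "Ms2"),
  stringsAsFactors = FALSE
)

# Scan type that marks the start of a cycle (hypothetical option).
origin <- "Ms"

# Every origin scan opens a new cycle; scans before the first origin
# scan would end up in cycle 0.
scans$cycle <- cumsum(scans$MSOrder == origin)

# How many cycles did the instrument do?
n_cycles <- max(scans$cycle)

# How long is a cycle? Time between consecutive origin scans.
cycle_start  <- scans$StartTime[scans$MSOrder == origin]
cycle_length <- diff(cycle_start)

# Number of scans per cycle.
scans_per_cycle <- as.vector(table(scans$cycle))
```

The last cycle has no closing origin scan, so `cycle_length` has one element fewer than `n_cycles`.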

`plot.XIC` and `plot.XICs`

having implementations for three method options.

plot.xic <- function(x, method = 'trellis'){
    #x$fmass <- as.factor(x$mass)
    figure <- ggplot(x, aes_string(x = "time", y = "intensity")) +
      #geom_segment() +
      geom_line(stat='identity', size = 1, aes_string(group = "filename", colour = "filename")) +
      
      #scale_x_continuous(breaks = scales::pretty_breaks(8)) +
      #scale_y_continuous(breaks = scales::pretty_breaks(8)) +
      labs(title = "XIC plot") +
      labs(subtitle = "Plotting XIC intensity against retention time") +
      labs(x = "Retention Time [min]", y = "Intensity Counts [arb. unit]") +
      theme_light()
    
    
    if(input$XICmainPeak){
      figure <- figure + facet_wrap(~  x$mass  , scales = "free", ncol = 1) 
    }else{
      figure <- figure + facet_wrap(~  x$mass  , ncol = 1) 
    }
    return(figure)
  }

add scan filters to readXICs()

Our current readXICs() does not support any scan filters. The compiled C# code uses the hard-coded filter

Filter = "ms"
which returns all scans;
see line 1046 of fgcz_raw.cs.

I think it would be cool to have an additional parameter for readXICs() that passes filters to the c# function.

XIC mass range option for shiny application

Hi Christian,

I was thinking: it would be really cool if in the shiny version of rawDiag you added an option for a custom mass XIC. For example if I wanted to see where a particular trypsin peptide was eluting I could type in the mass range (e.g. 421.74-421.76) and it would display those XICs.

Thanks for all your help; I’m really loving rawDiag.

Cheers,

Richard Hagan | PhD Student
Max Planck Institute for the Science of Human History
Kahlaische Straße 10 07745 Jena, Germany
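
One way the requested input could be handled is to parse the typed range into a window and a centre mass. The sketch below is an assumption about how a shiny text field might be processed, not existing rawDiag code; `parse_mass_range` is a hypothetical helper name.

```r
# Parse a mass range such as "421.74-421.76" into window limits,
# the centre mass and the half-width expressed in ppm.
parse_mass_range <- function(txt) {
  parts <- as.numeric(strsplit(trimws(txt), "-", fixed = TRUE)[[1]])
  # exactly two numeric limits in ascending order are expected
  stopifnot(length(parts) == 2L, !anyNA(parts), parts[1] < parts[2])
  center <- mean(parts)
  list(lower   = parts[1],
       upper   = parts[2],
       center  = center,
       tol_ppm = (parts[2] - parts[1]) / 2 / center * 1e6)
}

r <- parse_mass_range("421.74-421.76")
```

The centre mass could then be handed to the XIC extraction, with the half-width as tolerance.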

supported std. peptides

So far rawDiag supports:

  • iRT (Biognosys)
  • 6 x 5 LC-MS/MS Peptide Ref. Mix (Promega)
  • MSQC1 (Sigma)

Are there other peptide sets that could make sense?

PROCAL
Zolg, D. P., Wilhelm, M., Yu, P., Knaute, T., Zerweck, J., Wenschuh, H., et al. (2017). PROCAL: A Set of 40 Peptide Standards for Retention Time Indexing, Column Performance Monitoring, and Collision Energy Calibration. Proteomics, 17(21), 1700263. http://doi.org/10.1002/pmic.201700263

JPT product

Centroided ITMS (ion trap) scans not recognized

Don't know if it ever was intended to work on ITMS data, however trying to get the peaklist of a Fusion Lumos ion trap scan always results in the following error:

Example scan:
Scan Mode: ITMS + c NSI r d Full ms2 [email protected] [100.00-825.00]

extract <- readScans(rawfile = rawfile, scans = c(10638))
# No centroid stream available

Cheers
Daniel

"About rawDiag" tab for shiny application

We should have a tab in the GUI that displays some important information about the software:

  • License/copyright information (see About RStudio)
  • links to repositories (GitHub)
  • contact points for bug reporting/help
  • literature references (our publication)

more columns than column names Fusion Lumos

...

Bigger bug, I guess you didn’t have a Fusion Lumos file to test with:
Extracting scans/XICs from the raw file works, but the read.raw() function throws an error. I guess the column naming is different; perhaps one can build a fallback flag to ignore the columns if none are found. Otherwise this might have to be adapted to different machines. I could provide raw files for Orbitrap XL, Velos, Elite, QE Plus, QE HF, QE HFX, Fusion Lumos.

metadata <- read.raw(rawfile)

system2 is writting to tempfile C:\Users\danielz\AppData\Local\Temp\RtmpqM2GA5\file139c2bc35ba1tsv ...

Error in read.table(file = file, header = header, sep = sep, quote = quote, :

more columns than column names

If you want to reproduce that, I uploaded a BSA raw file from said Fusion Lumos mass spectrometer ...
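
To narrow down such a "more columns than column names" error, the field count per line of the temporary TSV can be inspected with base R. This is a sketch on a made-up two-line file, not the actual rawDiag output; the generic V-names for surplus columns are an assumption about what a fallback could look like.

```r
# Made-up TSV: two header fields, but three fields in the data row.
tf <- tempfile(fileext = ".tsv")
writeLines(c("scanNumber\tStartTime", "1\t0.01\textra"), tf)

# count.fields() reports the number of fields on every line.
n <- count.fields(tf, sep = "\t", quote = "")
bad_lines <- which(n != n[1])   # lines that disagree with the header

# Possible fallback: read the body without header, give every column a
# generic name, then restore the header names that are known.
body <- read.table(tf, sep = "\t", header = FALSE, skip = 1, fill = TRUE,
                   quote = "", stringsAsFactors = FALSE,
                   col.names = paste0("V", seq_len(max(n))))
hdr <- strsplit(readLines(tf, n = 1), "\t", fixed = TRUE)[[1]]
colnames(body)[seq_along(hdr)] <- hdr

unlink(tf)
```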

single install.R for windows/linux

#R
  
# Christian Panse <[email protected]>
# Functional Genomics Center Zurich 2018

# System Requirements
pkgs <- c( 'devtools',
  'dplyr',
  'ggplot2',
  'hexbin',
  'magrittr',
  'parallel',
  'protViz',
  'rmarkdown',
  'RSQLite',
  'scales',
  'shiny',
  'tidyr',
  'tidyverse',
  'DT')

(pkgs <- pkgs[(!pkgs %in% unique(installed.packages()[,'Package']))])
if(length(pkgs) > 0){install.packages(pkgs)}

# Installation
install.packages('http://fgcz-ms.uzh.ch/~cpanse/rawDiag_0.0.28.tar.gz', repos=NULL)


# Testing
library(rawDiag)
(rawfile <- file.path(path.package(package = 'rawDiag'), 'extdata', 'sample.raw'))
system.time(RAW <- read.raw(file = rawfile))
dim(RAW)
summary.rawDiag(RAW)
PlotScanFrequency(RAW)

# read all dimensions
dim(RAW)
RAW <- read.raw(file = rawfile, rawDiag = FALSE)
dim(RAW)

R.version.string; Sys.info()[c('sysname', 'version')]

run the rawDiag shiny application

library(rawDiag)

# root defines where your raw files are
rawDiagShiny(root="D:/Data2San/")

run as BAT script on the windows box

"c:\Program Files\R\R-3.5.1\bin\R.exe" -e "library(rawDiag); rawDiagShiny(root='D:/Data2San', launch.browser=TRUE)"

or from the Linux/Apple command line

R -e "library(rawDiag); rawDiagShiny(root='$HOME/Downloads', launch.browser=TRUE)"

and you can add it to your $HOME/.bashrc

alias rawDiag="R -e \"library(rawDiag); rawDiagShiny(root='$HOME/Downloads', launch.browser=TRUE)\""

ASMS 2018 poster

INFORMATICS: ALGORITHMS AND STATISTICAL ADVANCES II 374-392
ThP 375
Optimize your Method: rawDiagnostic An R Package to Support Method Development for Bottom-up Proteomics on Orbitrap Instruments

Number of precursors scheduled for fragmentation for each MS1 scan

First of all, rawDiag is AWESOME!
I guess this is more a feature request, but I am not completely sure this is even possible.
It would be nice to be able to extract from a raw file the list of monoisotopic m/z the mass spectrometer has 'calculated' from the MS1 survey scan. I believe Thermo is using a proprietary (?) algorithm for the MIPS, but probably the output of that step (probably m/z, charge and intensity) is stored in the final raw file for each MS1 scan.
Knowing which precursors were actually fragmented and which were skipped because of the cycle time, one can decide whether it is worth adjusting the LC and MS parameters to dig deeper into the sample.

LFQ demo

code for tweet https://twitter.com/hb9feb/status/1014602529034915840

#R

library(protViz)
library(rawDiag)

f <- function(rawfile, pepSeq, dt = 0.1){
  mass2Hplus <- (parentIonMass(pepSeq) + 1.008) / 2
  X <- readXICs(rawfile = rawfile, masses = mass2Hplus)
  S <- read.raw(rawfile)
  
  idx <- lapply(mass2Hplus, function(m){
    which(abs(S$PrecursorMass - m) < 0.1)
  })
  
  scanNumbers <- lapply(idx, function(x){S$scanNumber[x]})
  
  bestMatchingMS2Scan <- sapply(1:length(pepSeq), function(i){
    peakList <- readScans(rawfile, scans = scanNumbers[[i]])
    
    peptideSpecMatch <- lapply(peakList,
                               function(x){
                                 psm(pepSeq[i], x, FUN = function (b, y){cbind(b, y)}, plot = FALSE)})
    score <- sapply(1:length(peptideSpecMatch), 
                    function(j){
                      sum(peakList[[j]]$intensity[abs(peptideSpecMatch[[j]]$mZ.Da.error) < 0.1])})
    bestFirstMatch <- which(max(score, na.rm = TRUE) == score)[1]
    scanNumbers[[i]][bestFirstMatch]
  })
  
  peakList <- readScans(rawfile, scans = bestMatchingMS2Scan)
  
  pp <- lapply(1:length(pepSeq), function(j){
    jpeg(filename = paste("~/Desktop/rawDiag_", pepSeq[j],".jpeg", sep=''), quality = 100, height = 640)
    op<-par(mfrow = c(2,1), mar = c(5,4,4,3))
    peakplot(pepSeq[j], peakList[[j]], FUN = function (b, y){cbind(b, y)})
    
    t <- S$StartTime[bestMatchingMS2Scan[j]];
    
    peak.idx  <- which((t - dt) < X[[j]]$times & X[[j]]$times < (t + dt))
    
    plot(X[[j]], xlim = c(t - 0.2, t + 0.2), main = paste("RT =", round(t * 60), 'seconds', "[m+2H]2+ =", mass2Hplus[j] ),
         xlab = 'RT [min]', ylab = 'intensity');
    abline(v = t, col = rgb(0.8, 0.1, 0.1, alpha = 0.5), lwd = 3)
    
    # peak fitting
    xx <- X[[j]]$times[peak.idx]
    yy <- X[[j]]$intensities[peak.idx]
    points(xx, yy, pch = 16, col = rgb(0.0, 0.1, 0.8, alpha = 0.5))
    # text(xx, yy, peak.idx, pos = 1)
    peak <- data.frame(logy = log(yy), x = xx)
    x.mean <- mean(peak$x)
    peak$xc <- peak$x - x.mean
    (fit <- lm(logy ~ xc + I(xc^2), data = peak))
    xx <- with(peak, seq(min(xc) - 0.2, max(xc) + 0.2, length = 100))
    lines(xx + x.mean, exp(predict(fit, data.frame(xc = xx))), col=rgb(0.25, 0.25, 0.25, alpha = 0.3), lwd = 5)
    dev.off()
  })
  
}
# https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3918884/

f(rawfile = "/Users/cp/Downloads/20180220_14_autoQC01.raw", 
  c('GAGSSEPVTGLDAK', 'VEATFGVDESNAK', 
    'TPVISGGPYEYR', 'TPVITGAPYEYR', 'DGLDAASYYAPVR',
    'ADVTPADFSEWSK', 'GTFIIDPGGVIR')
    )
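
The peak fitting step in the demo regresses the log intensity on a quadratic in retention time, which is exact for a Gaussian peak: log y = a + b x + c x^2 with c = -1/(2 sigma^2). A self-contained sketch on simulated, noise-free data (all values are made up) shows how apex position, width and area follow from the coefficients.

```r
# Simulated Gaussian elution peak (made-up parameters, no noise).
rt        <- seq(19.8, 20.2, by = 0.01)   # retention time [min]
mu        <- 20.0
sigma     <- 0.05
apex      <- 1e6
intensity <- apex * exp(-(rt - mu)^2 / (2 * sigma^2))

# Centre the time axis as in the demo, then fit log(y) ~ xc + xc^2.
peak    <- data.frame(logy = log(intensity), x = rt)
x.mean  <- mean(peak$x)
peak$xc <- peak$x - x.mean
fit <- lm(logy ~ xc + I(xc^2), data = peak)
co  <- unname(coef(fit))                  # a, b, c

# Recover the Gaussian parameters from the quadratic coefficients.
sigma_hat <- sqrt(-1 / (2 * co[3]))
mu_hat    <- x.mean - co[2] / (2 * co[3])
apex_hat  <- exp(co[1] - co[2]^2 / (4 * co[3]))
area_hat  <- apex_hat * sigma_hat * sqrt(2 * pi)
```

With real XIC data the fit is only meaningful for points with positive intensity, and a quadratic coefficient that is not negative indicates that the window does not contain a peak.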

installation of NewRawFileReader on MacOS

I tried to follow

"Register the .Net assembly in your system similar to a Linux installation", but it is unclear to me what this implies. The following things are done:

  • downloaded the NewRawFileReader archive from Thermo
  • installed mono

What are the next steps? How do I install the NuGet package? Do I need Visual Studio? If not, what are the alternatives? We need to document this in a way that people without any prior experience in this area will be able to complete the installation.

Error: Negative length vectors are not allowed

Dear RawDiag Team,

I am trying to extract scans from a RAW file. MS2 scans work, MS1 scan extraction works in general, e.g. if I subselect the first 100 scans to extract. Whenever I submit a large amount of scans (like all MS1 scans of a file), readScans returns:

Error in source(tfo) : negative length vectors are not allowed

I suspect that one of the scans might be empty (have seen that before, but rarely). The behavior is file dependent, some run through, some don't. Are there verbose messages to find at which scan it goes wrong? If it is indeed an empty scan, can one try to catch this error?

This seems to be a memory issue; there are quite a lot of hits for the error. When I chunk the scans (5 x 1000 scans) it runs fine. So I guess the function does not scale well to ~5k MS1 scans (profile) or > 80k MS2 scans (these were the test files that failed).

RawDiag 0.0.29, R 3.5.2 under 64bit Windows:

file <- "02401_Ecoli_QC_R3.raw"
metaDat <- read.raw(file, rawDiag = FALSE)
idx <- metaDat[ which(metaDat$MSOrder == "Ms"),]$scanNumber
scanDat <- readScans(file, scans = idx)

File that I am using: https://drive.google.com/open?id=1VN4U21jtg5bY10Bb9bnFEZ-mTfRMFKEY

Thanks for the support.
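
The chunking workaround mentioned above could be wrapped in a small helper. The sketch is self-contained: `read_in_chunks` is a hypothetical name, and `fake_reader` is a stand-in for a call such as function(s) readScans(file, scans = s), under the assumption that the reader returns one list element per scan.

```r
# Split the scan numbers into chunks, read each chunk separately and
# concatenate the per-chunk results into one list.
read_in_chunks <- function(scans, reader, chunk_size = 1000L) {
  chunks <- split(scans, ceiling(seq_along(scans) / chunk_size))
  do.call(c, unname(lapply(chunks, reader)))
}

# Stand-in reader returning one list element per scan.
fake_reader <- function(scans) lapply(scans, function(s) list(scan = s))

res <- read_in_chunks(seq_len(5000), fake_reader)
length(res)
```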

class raw

rawfile <- structure(list(path = "Downloads/Resource_642890/20180717_006_tSIM_demo.raw", header = ... ), class = "raw")

Accordingly, XIC() could be defined as

XIC(rawfile, mz, tol, ...)
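
A sketch of how the proposed class and signature could fit together with S3 dispatch. The constructor name `new_rawfile`, the default tolerance and the method body are assumptions; the method is only a placeholder that records the request instead of reading any data.

```r
# Hypothetical constructor for the proposed class "raw".
new_rawfile <- function(path, header = list()) {
  stopifnot(is.character(path), length(path) == 1L)
  structure(list(path = path, header = header), class = "raw")
}

# Generic with the signature proposed above.
XIC <- function(x, mz, tol, ...) UseMethod("XIC")

# Placeholder method: validates the arguments and returns the request.
# A real implementation would extract the chromatograms here.
XIC.raw <- function(x, mz, tol = 10, ...) {
  stopifnot(is.numeric(mz), is.numeric(tol), length(tol) == 1L)
  list(path = x$path, mz = mz, tol = tol)
}

rf  <- new_rawfile("Downloads/Resource_642890/20180717_006_tSIM_demo.raw")
req <- XIC(rf, mz = c(421.75, 487.26), tol = 10)
```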

Add Signal to Noise data to ReadScan

Hello! Thank you for developing this useful package. Is there any way to add the signal to noise info to the object returned by ReadScans? This is a value that Thermo stores in the .raw file for every peak, besides the m/z and intensity. I believe that it can be obtained from one of the ThermoFisher.CommonCore DLLs already utilized by rawDiag. Thanks!

implement a read.raw.info method

.read.raw.info <- function(file,
     mono = if(Sys.info()['sysname'] %in% c("Darwin", "Linux")) TRUE else FALSE,
     exe = file.path(path.package(package = "rawDiag"), "exec", "fgcz_raw.exe"),
     mono_path = "",
     argv = "info",
     system2_call = TRUE,
     method = "thermo"){

  if(system2_call && method == 'thermo'){

    tf <- tempfile(fileext = '.tsv')
    tf.err <- tempfile(fileext = '.tsv')

    message(paste("system2 is writing to tempfile ", tf, "..."))

    if (mono){
      rvs <- system2("mono", args = c(exe, shQuote(file), argv),
                     stdout = tf)
    }else{
      rvs <- system2(exe, args = c(shQuote(file), argv),
                     stderr = tf.err,
                     stdout = tf)
    }

    if (rvs == 0){
      rv <- read.csv(tf,  sep = ":",   stringsAsFactors = TRUE, header = FALSE,
                     col.names = c('attribute', 'value'))

      message(paste("unlinking", tf, "..."))

      unlink(tf)
      # unlink(tfstdout)
      return(rv)
    }
  }
  NULL
}
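
One thing to watch in the draft above: read.csv() with sep = ":" splits on every colon, so values that themselves contain a colon (time stamps, Windows paths) end up with more fields than the two column names. A sketch that splits on the first colon only; the example lines are made up and `parse_info_lines` is a hypothetical helper.

```r
# Split "attribute: value" lines at the first colon only.
parse_info_lines <- function(lines) {
  lines <- lines[grepl(":", lines, fixed = TRUE)]
  pos <- regexpr(":", lines, fixed = TRUE)   # position of the first colon
  data.frame(attribute = trimws(substr(lines, 1L, pos - 1L)),
             value     = trimws(substr(lines, pos + 1L, nchar(lines))),
             stringsAsFactors = FALSE)
}

# Made-up example output.
info <- parse_info_lines(c("Number of scans: 50",
                           "Creation date: 2018-07-17 10:15:00",
                           "RAW file path: C:\\data\\x.raw"))
```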

add testthat case for .calc.transient

This function has to be refactored (using mutate_at) to eliminate the R CMD check NOTE 'no visible binding for global variable'.
Before refactoring, we should have a unit test.
