Comments (7)

wkumler commented on September 26, 2024

Great! Glad it looks useful. Version 1.1.0 also introduced a prefilter argument that sounds like it may be what you're interested in: data points with intensity values below the value you provide to prefilter are removed when grabbing the data. It's an interesting idea to do this during the minification step instead, though, and I'll have to think more about the relative advantages and disadvantages of each approach. In the meantime, you can also use ProteoWizard's msconvert to perform a similar function on the files with a command like:

msconvert [files] --filter "threshold absolute 1000 most-intense"

which should remove the data points with absolute intensities below 1000.
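For data that is already loaded in R, the effect of such a threshold can be sketched in base R. This is only an illustration: the table below is invented and merely assumed to mimic the rt/mz/int layout of an MS1 table, and `noise_level` is a hypothetical name.

```r
# Toy stand-in for an MS1 table; all values are invented for illustration
ms1 <- data.frame(
  rt  = c(1.0, 1.0, 1.0, 2.0, 2.0),
  mz  = c(100.01, 118.09, 150.20, 100.01, 118.09),
  int = c(250, 15000, 980, 1000, 42000)
)

# Hypothetical threshold, matching the 1000 used in the msconvert command
noise_level <- 1000

# Drop data points with intensities below the threshold
ms1_filtered <- ms1[ms1$int >= noise_level, ]

# Compare the number of data points before and after filtering
nrow(ms1)
nrow(ms1_filtered)
```

With RaMS 1.1.0 or later, the same filtering should be possible at read time through the prefilter argument, with a call along the lines of `RaMS::grabMSdata(files, grab_what = "MS1", prefilter = 1000)`.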

from rams.

wkumler commented on September 26, 2024

I appreciate you citing the package, and I've actually got a manuscript pending with the R Journal right now that discusses the package and other MS data considerations! Until that's accepted and published, however, feel free to use the output from citation("RaMS").

wkumler commented on September 26, 2024

Hi Dong,

That's a good idea but I'm not sure how best to implement it while keeping object sizes small. Each scan has its own ion injection time, and with the current format there's no easy way to store individual scan metadata without duplicating it for every single data point (like we do with RT). I do like your idea of exposing a few more functions within the package - the xml2 code is pretty robust and I can imagine providing a general xml2 parser that would allow the extraction of arbitrary metadata like ion injection time. The other option would be to include the extra scan metadata in the BPC or TIC slots, since those only have a single entry for each scan.
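The trade-off between the two layouts can be sketched in base R. This is a rough illustration under stated assumptions, not how the package stores anything: both tables are invented, and the `iit` column is a hypothetical addition to a per-scan table.

```r
# Toy long-format MS1 table: one row per data point, so rt repeats within a scan
ms1 <- data.frame(
  rt  = c(1.0, 1.0, 1.0, 2.0, 2.0),
  mz  = c(100.01, 118.09, 150.20, 100.01, 118.09),
  int = c(250, 15000, 980, 1000, 42000)
)

# Toy per-scan table (one row per scan, like a TIC or BPC)
# carrying a hypothetical ion injection time column
scan_meta <- data.frame(
  rt  = c(1.0, 2.0),
  iit = c(12.5, 8.3)
)

# The scan-level value is stored once per scan and only duplicated
# across data points when the two tables are joined on rt
ms1_with_iit <- merge(ms1, scan_meta, by = "rt")
```

Keeping the metadata in the per-scan table stores two values here instead of five, and the gap grows with the number of data points per scan.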

I'm headed out into the field next week and will be gone until mid-August, so I probably won't be able to work on this for a while. In the meantime, here's a small bit of code that should do the trick for you.

# Find the file you're interested in
# Can only handle one file at a time, so you'll have to loop if you want ion injection times (IIT) for several
# Include the full path or make sure the file exists in your working directory
filename <- "170706_Blk_Blk0p2_1.mzML"

# Read in the mzML document with xml2
xml_data <- xml2::read_xml(filename)

# Extract the scan nodes that have the ion injection time values in them
iit_nodes <- xml2::xml_find_all(xml_data, '//d1:cvParam[@name="ion injection time"]')

# Extract the actual values from the nodes
iit_vals <- as.numeric(xml2::xml_attr(iit_nodes, "value"))

This snippet worked nicely on a random mzML file I've got around, but won't work for mzXML files or those without the ion injection time cvParam.
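For several files, the same steps can be wrapped in a function and applied in a loop. The sketch below is one possible way to do it: `get_iit` and `make_toy_file` are hypothetical helpers, and the two tiny mzML-like files are generated on the fly only so the example runs without real data.

```r
library(xml2)

# Hypothetical helper wrapping the steps above for a single mzML file
get_iit <- function(filename) {
  xml_data <- read_xml(filename)
  iit_nodes <- xml_find_all(xml_data, '//d1:cvParam[@name="ion injection time"]')
  as.numeric(xml_attr(iit_nodes, "value"))
}

# Write a minimal, made-up mzML-like file with one scan per supplied value
make_toy_file <- function(iit_values) {
  scans <- sprintf(
    '<spectrum><scanList><scan><cvParam name="ion injection time" value="%s"/></scan></scanList></spectrum>',
    iit_values
  )
  path <- tempfile(fileext = ".mzML")
  writeLines(c('<mzML xmlns="http://psi.hupo.org/ms/mzml">', scans, "</mzML>"), path)
  path
}

# One file with two scans and one file without any ion injection time entries
filenames <- c(make_toy_file(c(12.5, 8.3)), make_toy_file(numeric(0)))

# One numeric vector per file; files lacking the cvParam give numeric(0)
iit_list <- lapply(filenames, get_iit)
names(iit_list) <- basename(filenames)
```

Calling `lengths(iit_list)` afterwards gives a quick check of which files actually contained ion injection time values.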

YonghuiDong commented on September 26, 2024

Hi William,

Thanks a lot for your help and code. It is very helpful.

It is a nice idea to include a general xml2 parser that lets users extract arbitrary metadata of interest.

Thanks again for your help.

Dong

wkumler commented on September 26, 2024

Hi @YonghuiDong, I've just released version 1.1.0 on the GitHub main branch, which includes a function (grabAccessionData) to extract arbitrary metadata by accession number. Thanks for the idea!

This will probably stay on GitHub for a couple weeks to check stability before I push it to CRAN.

YonghuiDong commented on September 26, 2024

@wkumler Hi William, thanks a lot.

I have been following your updates. Version 1.1.0 seems very interesting, and I will be very happy to test it. I saw that you have added a minification function to shrink the data size. I am wondering whether it is possible to reduce the overall data size by ignoring noise when reading the files, i.e., by adding a noise level parameter to the grabMSdata function for MS1: if the intensity value is smaller than the user-defined noise level, that MS1 data point will not be grabbed from the raw data. This could greatly reduce the data size (and maybe also memory usage?).

Thanks very much again for this excellent package.

Dong

YonghuiDong commented on September 26, 2024

@wkumler

Thanks for your prompt reply! Will you consider publishing your package in a scientific journal? I wrote an R Shiny app based on your package for raw data quality evaluation, and I will be happy to cite your package.

Dong
