Comments (7)
Great! Glad it looks useful. Version 1.1.0 also introduced a prefilter argument that sounds like it may be what you're interested in: data points with intensity values below the value you provide to prefilter are removed when grabbing the data. It's an interesting idea to do this during the minification step instead, though, and I'll have to think more about the relative advantages and disadvantages. In the meantime, you can also use ProteoWizard's msconvert to perform a similar function on the files with a command like:
msconvert [files] --filter "threshold absolute 1000 most-intense"
which should remove the data points with absolute intensities below 1000.
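To make the effect of an absolute intensity threshold concrete, here is a minimal base R sketch on a toy table. The column names (rt, mz, int) and all values are made up for illustration, and this is not the package's internal code, just the same idea applied by hand.

```r
# Toy MS1 table with hypothetical values (not real instrument data)
ms1 <- data.frame(
  rt  = c(1.0, 1.0, 1.0, 2.0, 2.0),
  mz  = c(118.0865, 119.0899, 120.0810, 118.0866, 119.0900),
  int = c(25000, 800, 1500, 31000, 400)
)

# Drop data points whose intensity falls below the chosen noise level
noise_level <- 1000
ms1_filtered <- ms1[ms1$int >= noise_level, ]

# Compare row counts before and after filtering
nrow(ms1)
nrow(ms1_filtered)
```

Filtering while the data is being read, as opposed to afterwards like this, has the added benefit that the low-intensity points never occupy memory in the first place.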
from rams.
I appreciate you citing the package, and I've actually got a manuscript pending with the R Journal right now that discusses the package and other MS data considerations! Until that's accepted and published, however, feel free to use the output from citation("RaMS").
Hi Dong,
That's a good idea but I'm not sure how best to implement it while keeping object sizes small. Each scan has its own ion injection time, and with the current format there's no easy way to store individual scan metadata without duplicating it for every single data point (like we do with RT). I do like your idea of exposing a few more functions within the package - the xml2 code is pretty robust and I can imagine providing a general xml2 parser that would allow the extraction of arbitrary metadata like ion injection time. The other option would be to include the extra scan metadata in the BPC or TIC slots, since those only have a single entry for each scan.
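As a rough illustration of the trade-off described here, the sketch below keeps per-scan metadata in its own one-row-per-scan table (similar in spirit to a TIC or BPC) and joins it onto the per-point data only when needed. The table names, column names, and ion injection time values are all hypothetical and this is not how the package currently stores anything.

```r
# Hypothetical per-point MS1 table: 3 scans with 4 data points each
ms1 <- data.frame(
  rt  = rep(c(0.5, 1.0, 1.5), each = 4),
  mz  = rep(c(100.1, 150.2, 200.3, 250.4), times = 3),
  int = c(10, 20, 30, 40, 11, 21, 31, 41, 12, 22, 32, 42)
)

# Hypothetical per-scan table: one row per scan, so nothing is duplicated
scan_meta <- data.frame(
  rt  = c(0.5, 1.0, 1.5),
  iit = c(12.3, 45.6, 7.8)  # made-up ion injection times
)

# Join on retention time only when the per-point view is actually needed
ms1_with_iit <- merge(ms1, scan_meta, by = "rt")
```

The per-scan table grows with the number of scans rather than the number of data points, which is why it stays small.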
I'm headed out into the field next week and will be gone until mid-August, so I probably won't be able to work on this for a while. In the meantime, here's a small bit of code that should do the trick for you.
# Find the file you're interested in
# Handles one file at a time, so you'll have to loop if you want ion injection times (iit) for multiple files
# Include full path or make sure the file exists in your working directory
filename <- "170706_Blk_Blk0p2_1.mzML"
# Read in the mzML document with xml2
xml_data <- xml2::read_xml(filename)
# Extract the scan nodes that have the ion injection time values in them
iit_nodes <- xml2::xml_find_all(xml_data, '//d1:cvParam[@name="ion injection time"]')
# Extract the actual values from the nodes
iit_vals <- as.numeric(xml2::xml_attr(iit_nodes, "value"))
This snippet worked nicely on a random mzML file I've got around, but won't work for mzXML files or those without the ion injection time cvParam.
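If you do have several files, one way to loop is to wrap the steps above in a small helper and apply it with lapply. This is only a sketch: read_iit is a hypothetical name, not a package function, and the tiny XML file written below is a stand-in so the example runs without real data. It assumes the file declares a default namespace, which is what the d1 prefix relies on.

```r
library(xml2)

# Hypothetical helper wrapping the same steps for a single mzML file
read_iit <- function(filename) {
  xml_data <- read_xml(filename)
  iit_nodes <- xml_find_all(xml_data, '//d1:cvParam[@name="ion injection time"]')
  as.numeric(xml_attr(iit_nodes, "value"))
}

# Minimal stand-in for an mzML file (made-up content, default namespace declared)
demo_file <- tempfile(fileext = ".mzML")
writeLines(c(
  '<mzML xmlns="http://psi.hupo.org/ms/mzml">',
  '  <scan><cvParam name="ion injection time" value="12.5"/></scan>',
  '  <scan><cvParam name="ion injection time" value="40"/></scan>',
  '</mzML>'
), demo_file)

# Replace with your own vector of file paths
filenames <- c(demo_file)

# One numeric vector of ion injection times per file, named by file
iit_by_file <- lapply(filenames, read_iit)
names(iit_by_file) <- basename(filenames)
```

Files without the cvParam simply return an empty numeric vector for that entry.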
Hi William,
Thanks a lot for your help and code. It is very helpful.
It is a nice idea to include a general xml2 parser to allow the user to extract the arbitrary metadata of interest.
Thanks again for your help.
Dong
Hi @YonghuiDong, I've just released version 1.1.0 to GitHub main, which includes a function (grabAccessionData) to extract arbitrary metadata by accession number. Thanks for the idea!
This will probably stay on GitHub for a couple weeks to check stability before I push it to CRAN.
@wkumler Hi William, thanks a lot.
I have been following your updates. Version 1.1.0 seems very interesting, and I will be very happy to test it. I saw that you have added a minification function to shrink the data size. I am wondering whether the overall data size could also be reduced by ignoring noise when reading the files, i.e., by adding a noise level parameter to the grabMSdata function for MS1: if the intensity value is smaller than the user-defined noise level, that MS1 data point is not grabbed from the raw data. This could greatly reduce the data size (and maybe memory usage too?).
Thanks very much again for this excellent package.
Dong
Thanks for your prompt reply! Will you consider publishing your package in a scientific journal? I wrote an R shiny app based on your package for raw data quality evaluation. I will be happy to cite your package.
Dong