workflow-testing's People

Contributors

hechth, ljocha, martenson, smartx-usman, trachtok, xtrojak, zargham-ahmad

workflow-testing's Issues

Generalize high-res filtering workflow and documentation for GC-EI-MS

TODO after the workflow is created and the initial documentation is done.

Questions

  • How should missing metadata be handled?
  • Which metadata (adducts, charge, etc.) should be set by the workflow, and which should be provided by the user?
  • How should precursor_mz be calculated?

Since this text should be more general, each step should state clearly why it is done. For users who bring their own data, it should also explain how to check that the data is valid and which step they can skip to if their data already fulfills these conditions.

Depends on

  • #29
  • Documentation of this workflow
  • Example dataset

Update workflow for recetox-aplcms to v0.10.3

The recetox-aplcms tool should also come with a workflow that connects the individual steps and pipes the data between them correctly, so that the tool is easier to use.

  • Create Galaxy workflow
  • Finalize workflow documentation
  • Upload workflow to GitHub
  • Publish workflow in History
  • Create workflow test

Make galaxy workflow on usegalaxy.eu which runs until msfinder for library filtering for internal libraries

This workflow should be built for the "large" in-house libraries, so it should include the steps that add Charge, Ionmode and Adduct as hard-coded values and then derive precursor_mz using matchms. We can start developing this using a single chunk from our dataset.
For the last step, the peak overwriting tool in RECETOX/galaxytools#485 is needed.

  • MSMetaEnhancer: collect InChI using PubChem
  • MSMetaEnhancer: collect Isomeric_smiles from IDSM
  • matchms filtering: filter invalid SMILES and InChI
  • rem_complex + matchms subsetting: remove complexes
  • matchms remove key: remove existing adduct, charge and ionmode keys
  • matchms add key: add charge, ionmode, adduct
  • matchms filtering: derive precursor
  • matchms convert: convert to riken
  • recetox-msfinder: run msfinder with 0.5 Da tolerance for MS1 and MS2, including all element checks as well as extended range
  • handle errors -> remove SMILES which are not accepted -> rerun msfinder (optional, not sure if possible)
  • RECETOX/galaxytools#485: run high-res annotation

After all the steps are included in the workflow, the workflow should be downloaded and deposited here on this repo.
The workflow file can then also be deposited on Zenodo.
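
The metadata steps in the list above (remove existing keys, add hard-coded keys, derive the precursor) can be sketched in plain Python. This is only an illustration of the intended data flow and does not use the matchms API. The hard-coded values and the parent_mass key are assumptions chosen for the example, since the issue does not say which values apply to the in-house libraries.

```python
PROTON_MASS = 1.007276  # Da

# Assumed example values; the real workflow may use different ones.
HARD_CODED = {"charge": 1, "ionmode": "positive", "adduct": "[M+H]+"}


def normalize_metadata(metadata, hard_coded=HARD_CODED):
    """Drop any existing adduct/charge/ionmode keys, then set the hard-coded ones."""
    cleaned = {
        key: value
        for key, value in metadata.items()
        if key.lower() not in hard_coded
    }
    cleaned.update(hard_coded)
    return cleaned


def derive_precursor_mz(metadata):
    """Derive precursor_mz from parent_mass for a singly protonated ion.

    Only the assumed [M+H]+ case is handled; other adducts need their own
    mass shift and charge.
    """
    if metadata.get("adduct") != "[M+H]+":
        raise ValueError("this sketch only handles [M+H]+")
    return metadata["parent_mass"] + PROTON_MASS


# Made-up record with conflicting keys that should be overwritten.
record = {"compound_name": "glucose", "parent_mass": 180.0634,
          "Adduct": "[M-H]-", "charge": -1}
normalized = normalize_metadata(record)
normalized["precursor_mz"] = derive_precursor_mz(normalized)
```

Removing the old keys before adding the new ones avoids ending up with both "Adduct" and "adduct" in the same record.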

gc-ms xcms make parameters choosable

It seems that, unless explicitly configured otherwise, the workflow parameters are fixed. I think we should also let users change settings such as the filter choice for XCMS.

Update workflows to work with new CI

Older workflows present in this repo often use tools available only in testtoolshed. These need to be re-done with versions available in the main toolshed.
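
To find the affected steps, a small script could scan each exported workflow (.ga file, which is JSON) for tools that come from the test toolshed. The sketch below assumes the usual export layout, where every step carries a tool_id and, for toolshed tools, a tool_shed_repository entry. The owner and tool names in the example are made up.

```python
import json

TEST_TOOLSHED = "testtoolshed.g2.bx.psu.edu"


def find_testtoolshed_steps(workflow):
    """Return (step_id, tool_id) pairs for steps installed from the test toolshed."""
    hits = []
    for step_id, step in workflow.get("steps", {}).items():
        repo = step.get("tool_shed_repository") or {}
        tool_id = step.get("tool_id") or ""  # input steps have no tool_id
        if repo.get("tool_shed") == TEST_TOOLSHED or tool_id.startswith(TEST_TOOLSHED + "/"):
            hits.append((step_id, tool_id))
    return hits


# Hypothetical workflow excerpt; a real one would be read with json.load(open(path)).
example = json.loads("""
{
  "steps": {
    "0": {"tool_id": null, "type": "data_input"},
    "1": {"tool_id": "testtoolshed.g2.bx.psu.edu/repos/some_owner/some_tool/some_tool/0.1.0",
          "tool_shed_repository": {"tool_shed": "testtoolshed.g2.bx.psu.edu"}},
    "2": {"tool_id": "toolshed.g2.bx.psu.edu/repos/some_owner/other_tool/other_tool/1.0.0",
          "tool_shed_repository": {"tool_shed": "toolshed.g2.bx.psu.edu"}}
  }
}
""")
outdated = find_testtoolshed_steps(example)
```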

Check end-to-end workflow for spectra prediction

Steps

  • Input as SDF
  • Compound conversion or generate conformers -> XYZ
  • xTB optimize geometry -> optimized XYZ
  • qcxms neutral run -> collections of start, xyz and in
  • qcxms production run -> .res file
  • qcxms getres -> MSP
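
One way to check the chain end to end is to write down what each step consumes and produces and verify that neighbouring steps fit together. The sketch below only encodes the steps listed above; the format labels are informal names chosen for this example, not Galaxy datatypes.

```python
from typing import NamedTuple


class Step(NamedTuple):
    name: str
    consumes: str
    produces: str


PIPELINE = [
    Step("compound conversion / generate conformers", "SDF", "XYZ"),
    Step("xTB optimize geometry", "XYZ", "XYZ"),
    Step("qcxms neutral run", "XYZ", "start/xyz/in collections"),
    Step("qcxms production run", "start/xyz/in collections", "res"),
    Step("qcxms getres", "res", "MSP"),
]


def check_chain(steps, first_input):
    """Return a list of messages for steps whose input does not match the previous output."""
    problems = []
    current = first_input
    for step in steps:
        if step.consumes != current:
            problems.append(f"{step.name}: expects {step.consumes}, got {current}")
        current = step.produces
    return problems


problems = check_chain(PIPELINE, "SDF")
```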

Let's also include a markdown document describing the individual steps, which can be used as the baseline for a Galaxy training on spectra prediction. It should also briefly cover the theoretical background, explain why we are doing each step, and give references to the papers.

Add GC MS pipeline using xcms

Create a GC MS workflow that starts with xcms. Add it as a subdirectory in the GCMS directory, alongside the workflow using aplcms.

It uses the following steps:

  • msconvert to convert raw to mzML (check Apply peak picking)
    • peak picking in m/z dimension - actually creates centroids
  • preprocessing step using xcms
    • extract information from raw data to obtain a peak table
    • the result is an intensity table, a metadata table, and an Rdata object
  • ramclustr
    • extract mass spectra in msp format
  • riassigner
    • assign retention indices based on given reference compound list
  • matchms
    • find matches with given reference spectra database
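
To illustrate the retention index step, here is a minimal sketch of one common approach: linear interpolation between the two reference compounds that bracket a peak (in the style of van den Dool and Kratz). The method the riassigner tool actually applies depends on its settings and may differ. The reference retention times below are made up.

```python
from bisect import bisect_right


def linear_retention_index(rt, reference):
    """Interpolate a retention index for retention time rt.

    reference is a list of (retention_time, retention_index) pairs,
    sorted by retention time, with at least two entries.
    """
    times = [t for t, _ in reference]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time outside the reference range")
    # Position of the reference at or just below rt, clamped so a pair exists.
    lo = min(bisect_right(times, rt) - 1, len(reference) - 2)
    (t0, ri0), (t1, ri1) = reference[lo], reference[lo + 1]
    return ri0 + (ri1 - ri0) * (rt - t0) / (t1 - t0)


# Hypothetical n-alkane ladder: (retention time in seconds, retention index).
alkanes = [(300.0, 1000), (360.0, 1100), (430.0, 1200)]
ri = linear_retention_index(330.0, alkanes)
```

Peaks that elute outside the reference range raise an error here instead of being extrapolated, which keeps unreliable indices out of the later matching step.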
