untargeted_metabolomics_ath_sagl1's Introduction

Data analysis for untargeted metabolomics.

Workflow for untargeted metabolomics analysis.

Citation:

Yu, K., Yang, W., Zhao, B., Wang, L., Zhang, P., Ouyang, Y., Chang, Y., Chen, G., Zhang, J., Wang, S., Wang, X., Wang, P., Wang, W., Roberts, J. A., Jiang, K., Mur, L. A. J., & Zhang, X. (n.d.). The Kelch-F-box protein SMALL AND GLOSSY LEAVES 1 (SAGL1) negatively influences salicylic acid biosynthesis in Arabidopsis thaliana by promoting the turn-over of transcription factor SYSTEMIC ACQUIRED RESISTANCE DEFICIENT 1 (SARD1). New Phytologist. https://doi.org/10.1111/nph.18197

Workflow:

graph LR

A(CD Compound Name)

A --> |webchem|B[PubChem CID]

B --> |Yes| C(Compound retained)
B --> |No| D(Drop the compound)

C --> |CID|E(InChIkey)

E --> |Yes| F(ClassyFire)
E --> |No| D

C --> |CID| G(formula and mw)

H(CD mf and mw) --> |match| G

G --> |Yes| I[Highly identical]
G --> |No| D
I --> |CTS| J(KEGG annotation)
I --> |MetaboAnalyst| J
I --> |t-test and OPLS-DA| K(Differentially accumulated metabolites)

L(MS2 information with RT and M/Z) --> X(Key pathway)
J --> X
K --> X

Candidate metabolites were identified with Compound Discoverer (CD), and the resulting raw compound qualification data contain about 9,000 compounds. Most of these identifications are unreliable. However, under stricter filter conditions only a small number of candidates remain (fewer than 1,000), which makes it hard to address the biological questions that biologists care about. This is a major problem for large-scale untargeted metabolomics. To address it, we first apply flexible conditions that tolerate false-positive compounds, which helps us dig out the key metabolic pathways that differ between mutant and wild type. Once the key pathways are found, strict conditions (MS2 information or targeted metabolite estimation) are used to confirm the discovery.

Step 1. Remove false positives using the PubChem database.

The compound name was used as the query to retrieve the PubChem CID with webchem. Falsely matched compounds were dropped.

The Molecular Formula, Molecular Weight, InChIKey, IUPACName and ExactMass of each corresponding compound were obtained with webchem. Compounds whose molecular formula was the same as the CD formula and whose molecular weight bias was smaller than 5 were labeled as highly identical compounds.
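
The matching rule above can be sketched in a few lines of Python. This is only an illustration, not the repository's R code: the `Compound` record, its field names, and `is_highly_identical` are hypothetical, and because the source does not state the unit of the threshold of 5, the sketch assumes an absolute difference in the same unit as the molecular weights.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    cd_formula: str       # formula reported by Compound Discoverer
    cd_mw: float          # molecular weight reported by Compound Discoverer
    pubchem_formula: str  # formula retrieved from PubChem
    pubchem_mw: float     # molecular weight retrieved from PubChem


def is_highly_identical(c: Compound, max_mw_diff: float = 5.0) -> bool:
    """Same formula and a molecular weight difference below the threshold.

    Assumption: the threshold is an absolute difference; the source does
    not state its unit.
    """
    same_formula = c.cd_formula.strip() == c.pubchem_formula.strip()
    return same_formula and abs(c.cd_mw - c.pubchem_mw) < max_mw_diff


# Illustrative records only, not data from the study
compounds = [
    Compound("salicylic acid", "C7H6O3", 138.03, "C7H6O3", 138.12),
    Compound("mismatched example", "C6H12O6", 180.06, "C7H6O3", 138.12),
]
retained = [c.name for c in compounds if is_highly_identical(c)]
```

Only the first record is retained, since the second fails the formula comparison.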

Step 2. Compound classification.

Highly identical compounds were classified with classyfireR via the InChIKey of each compound.

Step 3. Pairwise analysis of differentially accumulated metabolites (DAMs)

A two-sided Welch t-test was used to estimate the accumulation bias between the two groups of each comparison. The p-values were corrected for multiple comparisons with the BH method. OPLS-DA (ropls) was performed to reduce within-group differences and to generate VIP values.
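
As a rough illustration of the two statistical steps named above, the following Python sketch computes the Welch t statistic with its Welch-Satterthwaite degrees of freedom, and applies a Benjamini-Hochberg adjustment to a list of p-values. The function names `welch_t` and `bh_adjust` are hypothetical and the workflow itself runs in R; turning the statistic into a p-value needs a t-distribution, which the standard library does not provide, so that step is left out here.

```python
import math
from statistics import mean, variance


def welch_t(x, y):
    """Return Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    vx = variance(x) / len(x)  # squared standard error of group x
    vy = variance(y) / len(y)  # squared standard error of group y
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, carrying a running minimum so
    # that adjusted values stay monotone in the rank.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for steps_from_top, i in enumerate(order):
        rank = m - steps_from_top  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


t_stat, dof = welch_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
adj = bh_adjust([0.01, 0.04, 0.03, 0.005])
```

In Python, `scipy.stats.ttest_ind(x, y, equal_var=False)` returns the Welch p-value directly if SciPy is available.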

The DAM cut-off: p-value <= 0.05 & VIP > 1
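
Applied to a table of per-metabolite statistics, the cut-off is a simple filter. The sketch below is a minimal Python illustration; the `select_dams` helper, the dictionary keys, and the example rows are all hypothetical, and it does not decide whether the raw or the BH-corrected p-value is the one compared against 0.05.

```python
def select_dams(results, p_cutoff=0.05, vip_cutoff=1.0):
    """Keep metabolites with p-value <= p_cutoff and VIP > vip_cutoff."""
    return [
        r["compound"]
        for r in results
        if r["p_value"] <= p_cutoff and r["vip"] > vip_cutoff
    ]


# Hypothetical per-metabolite statistics, for illustration only
results = [
    {"compound": "M1", "p_value": 0.01, "vip": 1.8},
    {"compound": "M2", "p_value": 0.05, "vip": 1.0},  # VIP is not > 1
    {"compound": "M3", "p_value": 0.20, "vip": 2.3},  # p-value too high
    {"compound": "M4", "p_value": 0.05, "vip": 1.2},  # boundary p-value passes
]
dams = select_dams(results)
```

Note the asymmetry of the two comparisons: a p-value of exactly 0.05 passes, while a VIP of exactly 1 does not.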

Step 4. Pathway analysis

KEGG annotations were obtained from CTS and MetaboAnalyst.

For CTS, the InChIKey was used as input.
For MetaboAnalyst, the CD compound name was used as input.

Step 5. Validation of compounds in the focused pathway.

Rough candidate compounds in the focused pathway were selected from the DAM datasets. They were then screened against an Arabidopsis thaliana KEGG compound database built with an in-house script, and only compounds that belong to Arabidopsis thaliana (according to the KEGG database) were kept. These fine candidates were double-checked in Compound Discoverer against the MS2 databases (mzCloud and mzVault), which yielded the final high-confidence compounds. Furthermore, key compounds were evaluated by a targeted metabolomics method. The results were highly consistent with the untargeted metabolomics results, which supports the accuracy of this untargeted metabolomics workflow.
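
The organism screen described above amounts to a set-membership check on KEGG compound identifiers. A minimal Python sketch follows, assuming each DAM already carries a KEGG ID (or none) from Step 4. The `filter_by_organism` helper and all identifiers are placeholders, not real KEGG entries, and the in-house database itself is not reproduced here.

```python
def filter_by_organism(dam_kegg_ids, organism_compounds):
    """Keep DAMs whose KEGG compound ID is in the organism-specific set."""
    return {
        name: kegg_id
        for name, kegg_id in dam_kegg_ids.items()
        if kegg_id is not None and kegg_id in organism_compounds
    }


# Placeholder identifiers, not real KEGG entries
ath_compounds = {"C_A", "C_B", "C_C"}
dam_ids = {"M1": "C_A", "M2": "C_X", "M3": None, "M4": "C_C"}
candidates = filter_by_organism(dam_ids, ath_compounds)
```

Compounds without a KEGG annotation are dropped along with those outside the organism set, so the MS2 check only sees annotated candidates.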
