causal_microbiome_tutorial

A randomization-based causal inference framework for analyzing 16S rRNA gut microbiome data.

Causal inference framework for environment-microbiome data applied to American Gut Project (AGP) data.

Framework

Image of Graphical abstract

Data access

American Gut Data subset [paper in preparation, Mishra and Müller 2021].

Stage 2: Design

The R code for our pair-matching implementation and for generating the diagnostic plots can be found in the design_AG file. The matrix of 10,000 possible randomizations of the intervention assignment is also generated there, directly after matching.

Note 1: the matching functions in Stephane_matching.R were written with Rcpp by Stéphane Shao.

Note 2: other matching strategies are also valid. Researchers should take the conceptual hypothetical experiment into account when choosing a strategy.
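To make the idea of a randomization matrix concrete, here is a minimal sketch in Python (the repository's own code is in R and may differ). It assumes a pair-matched design in which exactly one unit per pair receives the intervention, and it uses a hypothetical layout where units 2k and 2k + 1 form pair k. The function name and layout are illustrative assumptions, not taken from design_AG.

```python
import random

def paired_randomizations(n_pairs, n_draws, seed=0):
    """Draw n_draws assignment vectors for a pair-matched design.

    Each vector has length 2 * n_pairs. Units 2k and 2k + 1 form pair k
    (a hypothetical layout), and exactly one unit per pair is assigned
    the intervention (coded 1); the other is the control (coded 0).
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        w = []
        for _ in range(n_pairs):
            # Fair coin flip decides which member of the pair is treated.
            first_treated = rng.random() < 0.5
            w.extend([1, 0] if first_treated else [0, 1])
        draws.append(w)
    return draws

# 10,000 possible assignments for a toy study with 5 matched pairs.
W = paired_randomizations(n_pairs=5, n_draws=10_000)
print(len(W), len(W[0]))
```

Each row of `W` is one hypothetical assignment that respects the matched-pair structure, which is what lets the same matrix be reused for every test statistic in the analysis stage.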

Stage 3: Analysis

The ASV (or OTU) data table and the matched dataset are combined in a phyloseq object before the statistical analyses are run. The code in the folders below can therefore be reused for any other data stored in a phyloseq object.

Diversity

Richness and alpha-diversity

R code in 1_alpha_diversity_AG folder.

We used Amy Willis’ R packages breakaway for richness estimation [Willis and Bunge, 2015] and DivNet for Shannon index estimation [Willis and Bryan, 2020].
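For orientation, the two quantities being estimated can be written down directly: richness is the number of taxa present, and the Shannon index is H = -sum(p_i * ln(p_i)) over the relative abundances p_i. The Python sketch below computes the naive plug-in versions from a vector of counts. It is only an illustration of the definitions: breakaway and DivNet use model-based estimators, which this sketch does not attempt to reproduce.

```python
import math

def observed_richness(counts):
    """Number of taxa with at least one observed read (naive plug-in)."""
    return sum(1 for c in counts if c > 0)

def shannon_index(counts):
    """Plug-in Shannon index H = -sum(p * ln(p)) over observed taxa."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Taxa with zero counts contribute nothing to the sum.
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

sample = [10, 10, 10, 10]  # made-up counts for four equally abundant taxa
print(observed_richness(sample))
print(shannon_index(sample))  # close to ln(4) for a perfectly even sample
```

The plug-in richness ignores taxa that were present but never sequenced, which is one reason a dedicated estimator is used instead.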

Richness result:

estimate: 108.3931; p-value: 0.133

Shannon index result:

estimate: -0.008072164; p-value: 0.659
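The README does not spell out how the p-values are computed. Assuming they come from a randomization test, in which the observed statistic is compared with its values under many alternative assignments of the intervention, the calculation can be sketched in Python as follows. The data, the mean-difference statistic, and the function names are made up for illustration; the repository's R code may use different statistics and conventions.

```python
import random

def mean_difference(outcomes, w):
    """Mean outcome of treated units (w == 1) minus mean of controls (w == 0)."""
    treated = [y for y, t in zip(outcomes, w) if t == 1]
    control = [y for y, t in zip(outcomes, w) if t == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

def randomization_p_value(observed, null_stats):
    """Two-sided p-value: share of randomized statistics at least as
    extreme as the observed one (the +1 counts the observed assignment)."""
    extreme = sum(1 for s in null_stats if abs(s) >= abs(observed))
    return (1 + extreme) / (1 + len(null_stats))

# Made-up outcomes for 4 matched pairs; units 2k and 2k + 1 form pair k.
outcomes = [5.1, 3.9, 6.0, 4.2, 5.5, 4.0, 6.3, 4.8]
observed_w = [1, 0, 1, 0, 1, 0, 1, 0]

rng = random.Random(1)
null_stats = []
for _ in range(2000):
    w = []
    for _ in range(len(outcomes) // 2):
        w.extend(rng.choice([[1, 0], [0, 1]]))  # flip treatment within the pair
    null_stats.append(mean_difference(outcomes, w))

obs = mean_difference(outcomes, observed_w)
p = randomization_p_value(obs, null_stats)
print(obs, p)
```

Because the reference distribution is built from assignments that were possible under the design, no distributional assumption on the outcome is needed.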

Beta-diversity

R code in 2_beta_diversity_AG folder.

The distance calculations were done with the phyloseq package, and we used Anna Plantinga’s R package MiRKAT to calculate the test statistics [Zhao et al., 2015].

Results:
  • Aitchison: estimate: 822866.9; p-value(adj.): 0.002
  • Jaccard: estimate: 132.9856; p-value(adj.): 0.002
  • Gower: estimate: 0.3761873; p-value(adj.): 0.501
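As a reminder of what two of these distances measure between a pair of samples, here is a minimal Python sketch (the repository computes them in R with phyloseq). Two choices in it are assumptions made for illustration: a pseudocount of 0.5 is added before taking logs so that zero counts do not break the Aitchison distance, and Jaccard is computed on presence/absence. The repository may handle zeros differently or use a quantitative Jaccard variant.

```python
import math

def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform; the pseudocount is an assumed zero fix."""
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(x, y, pseudocount=0.5):
    """Euclidean distance between the CLR-transformed samples."""
    cx, cy = clr(x, pseudocount), clr(y, pseudocount)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(cx, cy)))

def jaccard_distance(x, y):
    """Presence/absence Jaccard: 1 - |shared taxa| / |taxa in either sample|."""
    present_x = {i for i, c in enumerate(x) if c > 0}
    present_y = {i for i, c in enumerate(y) if c > 0}
    union = present_x | present_y
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1 - len(present_x & present_y) / len(union)

a = [12, 0, 30, 5]  # made-up counts for four taxa
b = [0, 0, 45, 9]
print(aitchison_distance(a, b))
print(jaccard_distance(a, b))
```

The Aitchison distance depends only on ratios between taxa, so it is suited to compositional data, while the presence/absence Jaccard ignores abundances entirely.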

Composition

Compositional equivalence

R code in 3_mean_diff_test_AG folder.

The test comes from Cao, Lin, and Li’s GitHub repository composition-two-sampe-test [Cao, Lin, and Li, 2018].

Result:

estimate: 50.0806; p-value: 0.001

Differential abundance

R code in 4_differential_abundance_AG folder.

We used the function dacomp.test() from Barak Brill’s R package dacomp to calculate the test statistic for all taxa at once [Brill, Amir, and Heller, 2020].

Reference set:

k_Bacteria;p_Firmicutes;c_Clostridia;o_Clostridiales;f_Lachnospiraceae;g_Dorea
k_Bacteria;p_Firmicutes;c_Clostridia;o_Clostridiales;f_Lachnospiraceae;g_NA

Results:

Genera with p-value <= 0.02.
k_Bacteria;p_Proteobacteria;c_Gammaproteobacteria;o_Enterobacteriales;f_Enterobacteriaceae;g_Raoultella
k_Bacteria;p_Firmicutes;c_Clostridia;o_Clostridiales;f_Lachnospiraceae;g_Anaerostipes
k_Bacteria;p_Proteobacteria;c_Alphaproteobacteria;o_Rickettsiales;f_mitochondria;g_Sarcandra

Correlation structure

R code in 5_networks_AG folder.

Peschel et al.’s (2020) R package NetCoMi enables the estimation and comparison of networks for compositional data.

References

[Holle et al., 2005] Holle R, Happich M, Löwel H, Wichmann HE; MONICA/KORA Study Group (2005); KORA - a research platform for population based health research. Gesundheitswesen, 67.

[Willis and Bunge, 2015] Willis A and Bunge J (2015); Estimating diversity via frequency ratios. Biometrics, 71:1042-1049.

[Willis and Bryan, 2020] Willis A and Bryan DM (2020); Estimating diversity in networked ecological communities. Biostatistics, kxaa015.

[Zhao et al., 2015] Zhao N, Chen J, Carroll IM et al. (2015); Testing in Microbiome-Profiling Studies with MiRKAT, the Microbiome Regression-Based Kernel Association Test. Am J Hum Genet., 96(5):797-807.

[Cao, Lin, and Li, 2018] Cao Y, Lin W, and Li H (2018); Two-sample tests of high-dimensional means for compositional data. Biometrika, 105:115-132.

[Brill, Amir, and Heller, 2020] Brill B, Amir A, and Heller R (2020); Testing for differential abundance in compositional counts data, with application to microbiome studies. arXiv.

[Peschel et al., 2020] Peschel et al. (2020); NetCoMi: network construction and comparison for microbiome data in R. Briefings in Bioinformatics, bbaa290.
