Bioinformatic scripts for Boreal moss-microbe interactions are revealed through metagenome assembly of novel bacterial species
Preprint available here (currently in review).
The moss_pipeline.sh script contains the main steps for metagenome processing, assembly, and profiling. The pipeline is not seamless, and I highly recommend running it step by step. These steps rely on modules of a pipeline implemented by our team (available here) and are scripted to work with the SLURM workload manager. Several steps use programs run in Singularity containers to process data in parallel, but the underlying scripts (in the ILL_pipelines repository) can also be run one sample at a time.
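As an illustration of the sample-wise approach, here is a minimal dry-run sketch that prints one SLURM submission command per sample instead of submitting anything. It is not taken from moss_pipeline.sh or ILL_pipelines: the function names, the sample list file, the step script name, and the resource values are all hypothetical placeholders. Only the sbatch options (`--job-name`, `--cpus-per-task`, `--mem`, `--time`) are standard SLURM.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print one sbatch command per sample instead of submitting.
# ASSUMPTIONS (not from this repository): the sample list, the step script
# and the resource values below are hypothetical placeholders.
set -euo pipefail

# Build the sbatch command line for one sample and one step script.
build_sbatch_cmd() {
    local sample="$1" step_script="$2"
    printf 'sbatch --job-name=%s --cpus-per-task=4 --mem=16G --time=02:00:00 %s %s\n' \
        "$sample" "$step_script" "$sample"
}

# Read sample names (one per line), skipping blank lines and '#' comments,
# and print the corresponding submission command for each.
submit_samples() {
    local sample_list="$1" step_script="$2" sample
    while IFS= read -r sample || [ -n "$sample" ]; do
        case "$sample" in
            ''|'#'*) continue ;;
        esac
        build_sbatch_cmd "$sample" "$step_script"
    done < "$sample_list"
}
```

After sourcing these functions, `submit_samples samples.txt step_script.sh` would print the commands for inspection. Once they look right for a given step, the output can be piped to `bash` to submit the jobs, which keeps each step under manual control.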
The moss_pipeline.sh script was run on the Mammouth-mp2 clusters, whereas the R scripts were run locally on a MacBook Pro M2. Raw data (metagenomes and assemblies) required for the bash script will be available on ENA (BioProject PRJEB76464). Intermediate data, i.e. the Sourmash gather, BLASTn, PhyloPhlAn, GTDB-Tk, CheckM and MicrobeAnnotator outputs, as well as community composition tables (as phyloseq objects) required for the R script analyses, are published within this repository.