Microbial Ecology Group (MEG) - AMR++ bioinformatics workshop

Course syllabus

Start Date: October 10, 2022

Dropbox link

  • This Dropbox folder contains all of the videos from our Zoom course sessions, along with recordings from a previous MEG bioinformatics workshop.

Course content

Summary

These lessons are designed to introduce researchers to the R programming language for statistical analysis of metagenomic sequencing data. While we are primarily developing these training resources for the Microbial Ecology Group (MEG), we would love to get your input on improvements to any component so that we can one day provide this as a useful public resource. Because the lessons are meant to be an informal collection of resources and tutorials, we have liberally used parts and pieces of other online lessons and tailored them for our purposes. We attempt to give credit when possible by linking to the original source, and we are happy to hear recommendations for other resources to include.

We wholeheartedly encourage students to independently troubleshoot the majority of problems they might encounter by:

  • googling it (or using another search engine)
  • getting help from other students in our Slack channel #2021-AMR++workshop
  • searching bioinformatic forums such as stackoverflow.com, biostars.org, and seqanswers.com

Workshop details

Learning objectives:

Upon completion of these lessons, students will:

  • have their computer set up with the R and RStudio software
  • know how to read in count matrices from bioinformatic analysis of sequence data
  • be able to explore and summarize bioinformatic results using
    • diversity indices and box plots
    • ordination with non-metric multidimensional scaling (NMDS)
    • heatmaps
  • be familiar with common statistical techniques such as:
    • Wilcoxon test
    • Generalized linear models
    • Analysis of similarities (ANOSIM)
    • Differential abundance testing using a zero-inflated Gaussian (ZIG) model
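As a rough illustration of two of these objectives, the sketch below computes a Shannon diversity index per sample and compares two groups with a Wilcoxon rank-sum test. It uses only base R. The count matrix, the feature and sample names, the group labels, and the helper `shannon_index()` are all made up for this example and are not taken from the workshop scripts.

```r
# Toy count matrix: rows are features (e.g., AMR gene groups), columns are samples.
# All values and names are hypothetical and exist only for illustration.
counts <- matrix(
  c(10,  0,  5, 20,  1,  0,
     3,  8,  0,  2, 15,  9,
     0,  4,  7,  0,  6, 12,
    25,  1,  2, 30,  0,  3),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("feature_", 1:4), paste0("sample_", 1:6))
)

# Hypothetical grouping variable: first three samples vs. last three.
group <- factor(rep(c("control", "treated"), each = 3))

# Shannon index H = -sum(p * log(p)), computed over features with nonzero counts.
shannon_index <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# One diversity value per sample (apply over columns).
shannon <- apply(counts, 2, shannon_index)

# Richness: number of features observed at least once in each sample.
richness <- colSums(counts > 0)

# Wilcoxon rank-sum test comparing Shannon diversity between the two groups.
result <- wilcox.test(shannon ~ group)
print(result$p.value)

# Draw the box plot only in an interactive session such as RStudio.
if (interactive()) {
  boxplot(shannon ~ group, ylab = "Shannon diversity")
}
```

With only three samples per group, the test has very little power, so the toy result says nothing about real data. The point is only to show the shape of the calculation.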

Instructors

Group email: [email protected]

Dr. Paul Morley -- [email protected]

Dr. Noelle Noyes -- [email protected]

Peter Ferm -- [email protected]

Dr. Lee Pinnell -- [email protected]

Dr. Enrique Doster -- [email protected]

Dr. Lisa Perez -- [email protected]

Bioinformatic overview

The metagenomic sequencing approach you choose determines the type of analysis you can perform:

  • Shotgun metagenomic sequencing
    • can analyze both the microbiome and the resistome, in addition to other sequences such as plasmid-associated sequences or virulence factors
  • Target-enriched resistome sequencing (MEGARes baits)
    • can only analyze the resistome
  • 16S rRNA amplicon sequencing
    • can only analyze the microbiome

In this repository, we'll show you examples of running variants of the AMR++ pipeline to achieve your bioinformatic analysis goals. We'll be using code from the following bioinformatic pipelines:

  • AMR++ pipeline
  • Qiime2 pipeline
    • We use the Qiime2 pipeline to analyze 16S rRNA reads and export the results in a file format that we can analyze with R.

Statistics overview

Remember, the analysis must always be based on your study design and performed with the goal of testing your a priori hypotheses. The scripts in this repository are merely meant to provide an outline for you to begin your analysis and branch off as needed.

Using RStudio, download everything in this repository and change your working directory to the newly downloaded AMRplusplus_bioinformatic_workshop directory. Start by opening the script on the main page, Stats_overview_script.R, and follow along for a brief explanation of how each of the scripts below fits into your analysis.
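A minimal sketch of the working-directory step, assuming the repository was downloaded into a folder named AMRplusplus_bioinformatic_workshop inside your current location. The checks are guarded so the snippet runs without error even if the folder is somewhere else. In that case, adjust the hypothetical `workshop_dir` path to match your machine.

```r
# Folder name taken from the text above; change the path if you saved it elsewhere.
workshop_dir <- "AMRplusplus_bioinformatic_workshop"

# Only change the working directory if the folder actually exists here.
if (dir.exists(workshop_dir)) {
  setwd(workshop_dir)
}

# Confirm where R is currently working.
print(getwd())

# Check that the overview script named above is visible from this location.
overview_script <- "Stats_overview_script.R"
if (file.exists(overview_script)) {
  message("Found ", overview_script, "; open it in RStudio to follow along.")
} else {
  message("Could not find ", overview_script, " in ", getwd())
}
```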

If you don't have RStudio installed, click on the link below to explore our test dataset using Binder and RStudio:

Binder

Otherwise, follow the instructions in this tutorial for installing R and RStudio on your personal computer.

The data exploration and statistical analysis we will cover is divided into four main steps, each with an associated script:

  1. Loading count matrix results from bioinformatic analyses into R
  2. Calculating summary statistics
  3. Normalizing counts and creating exploratory figures
  4. Running some common statistical tests
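The four steps above can be sketched end to end in base R. This is only an illustrative outline under stated assumptions, not the workshop's own scripts. The count table, the sample and gene names, and the group labels are invented. Total-sum scaling stands in for whatever normalization the workshop scripts apply. A Poisson generalized linear model with a sequencing-depth offset stands in for step 4.

```r
# Step 1: load a count matrix. A real analysis would point read.csv() at a
# pipeline output file; here a small made-up table is read from text instead.
csv_text <- "feature,S1,S2,S3,S4,S5,S6
geneA,12,30,8,40,55,61
geneB,100,80,120,90,70,85
geneC,0,5,2,20,25,18
geneD,7,0,3,1,0,2"
counts <- as.matrix(read.csv(text = csv_text, row.names = 1, check.names = FALSE))

# Hypothetical sample metadata: three control and three treated samples.
metadata <- data.frame(
  sample = colnames(counts),
  group  = factor(rep(c("control", "treated"), each = 3))
)

# Step 2: summary statistics per sample.
depth    <- colSums(counts)      # total counts per sample
observed <- colSums(counts > 0)  # number of features detected per sample
print(summary(depth))

# Step 3: normalize by total-sum scaling (each column sums to 1).
# This is a simple stand-in, not necessarily the workshop's normalization.
rel_abund <- sweep(counts, 2, depth, "/")

# Exploratory figure, drawn only in an interactive session such as RStudio.
if (interactive()) {
  barplot(rel_abund, legend.text = TRUE, ylab = "Relative abundance")
}

# Step 4: test one feature with a Poisson GLM, using log(depth) as an offset
# so that the group effect is estimated relative to sequencing depth.
model_data <- data.frame(
  count = counts["geneC", ],
  depth = depth,
  group = metadata$group
)
fit <- glm(count ~ group + offset(log(depth)), family = poisson, data = model_data)
print(summary(fit)$coefficients)
```

Real count data are often overdispersed and zero-inflated, which a plain Poisson model does not handle well. The model here only shows where a statistical test slots into the workflow.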

Resources:

MEG resources

R programming

  • RStudio cheatsheets
    • This website has tons of helpful cheatsheets for various R packages and analysis methods. It also includes cheatsheets translated into other languages.
  • YaRrr! The Pirate’s Guide to R
    • This is a free online book that goes over many useful topics in a quirky but fun way! Follow along with our simplified R scripts in Lesson 1 and reference this book if you have any other questions.
  • R programming Coursera course
    • This free Coursera course goes in depth on the functionality of R. It combines videos with example R scripts for you to follow along with. We recommend this course after you have been playing around with R a bit and want to learn more about how R works.
  • Introduction to R workshop
    • We haven't personally tried this workshop, but they have a combination of videos, slides, and R code for various topics.
  • ggpubr
    • Nice package for "publication-ready" figures.
  • Harvard's Data Science: R Basics

Data visualization

Command-line

  • Explain shell
    • A website that explains bash commands piece by piece.

Statistics resources

Funding Information:

The development of this tutorial was supported in part by USDA NIFA Grant No. 2018-51300-28563, University of Minnesota College of Veterinary Medicine, The VERO Program at Texas A&M University and West Texas A&M University, and the State of Minnesota Agricultural Research, Education, Extension and Technology Transfer program.
