ATACseq-Processor

This script processes raw ATAC-seq files so that they are ready for analysis. The processed files can be opened in the IGV genome browser to visualise chromatin accessibility.

Prerequisites

The script needs the following commands:

  • fastqc, bowtie2, samtools, bioawk, bedtools: install via brew by running brew install fastqc bowtie2 samtools bioawk bedtools
  • macs3: install by running pip3 install macs3
  • bedClip: download into the same directory as the script by running curl http://hgdownload.cse.ucsc.edu/admin/exe/macOSX.x86_64/bedClip --output bedClip && chmod +x bedClip
  • bedGraphToBigWig: download into the same directory as the script by running curl http://hgdownload.cse.ucsc.edu/admin/exe/macOSX.x86_64/bedGraphToBigWig --output bedGraphToBigWig && chmod +x bedGraphToBigWig
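
Before starting a long run, it can help to confirm that the tools installed via brew and pip3 are actually on your PATH. The snippet below is a minimal sketch and is not part of processor.sh; the function name missing_tools is hypothetical. bedClip and bedGraphToBigWig are not covered here because they sit next to the script rather than on the PATH.

```shell
#!/usr/bin/env bash
# missing_tools NAME...: print every command that cannot be found on PATH,
# one per line, and return 1 if at least one is missing.
# (Hypothetical helper for illustration; not taken from processor.sh.)
missing_tools() {
    local rc=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}
```

For example, running missing_tools fastqc bowtie2 samtools bioawk bedtools macs3 prints nothing when everything is installed, and otherwise lists the commands that still need installing.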

Usage

Download with:

git clone https://github.com/phenotypic/ATACseq-Processor.git

Before running, ensure that the reference genome and the executable files are located in the same directory as the script, and that the two ATAC-seq files are located in the ATAC_paired subdirectory:

ATACseq-Processor
├── ATAC_paired
│   ├── 30fish-0hpa_S1_L001_R1_00_1.fastq.gz
│   └── 30fish-0hpa_S1_L001_R1_00_2.fastq.gz
├── Danio_rerio.GRCz11.dna.toplevel.fa.gz
├── bedClip
├── bedGraphToBigWig
└── processor.sh
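
The layout above can be checked before launching the script. The following is a rough sketch under stated assumptions: the check_layout function is hypothetical, and the *.fastq.gz and *.fa.gz patterns are guesses based on the example file names, not necessarily what processor.sh itself looks for.

```shell
#!/usr/bin/env bash
# check_layout [DIR]: print one line per problem found in DIR (default: the
# current directory) and return 1 if there is any problem.
# (Hypothetical helper; file patterns are assumptions from the example tree.)
check_layout() {
    local dir="${1:-.}" rc=0 n exe

    # Two ATAC-seq files are expected in the ATAC_paired subdirectory.
    n=$(find "$dir/ATAC_paired" -maxdepth 1 -type f -name '*.fastq.gz' 2>/dev/null | wc -l | tr -d ' ')
    if [ "$n" -ne 2 ]; then
        echo "expected 2 FASTQ files in ATAC_paired, found $n"
        rc=1
    fi

    # At least one compressed reference genome next to the script.
    n=$(find "$dir" -maxdepth 1 -type f -name '*.fa.gz' 2>/dev/null | wc -l | tr -d ' ')
    if [ "$n" -lt 1 ]; then
        echo "no reference genome (*.fa.gz) found"
        rc=1
    fi

    # The two UCSC binaries must be present and executable.
    for exe in bedClip bedGraphToBigWig; do
        if [ ! -x "$dir/$exe" ]; then
            echo "$exe is missing or not executable"
            rc=1
        fi
    done

    return "$rc"
}
```

Calling check_layout from inside the ATACseq-Processor directory prints nothing if the files are in place.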

Start processor.sh from the ATACseq-Processor directory by running:

bash processor.sh

The script first generates quality control reports for the two files in the ATAC_paired subdirectory and writes them to the Quality_ATAC folder. You can view the reports in a web browser. Use this guide to interpret the results.
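
For readers who want to reproduce the quality control step on its own, something along these lines should work. This is a sketch, not the exact command used in processor.sh: the run_qc wrapper is hypothetical, and the *.fastq.gz pattern is an assumption based on the example file names. It relies only on FastQC's --outdir option.

```shell
#!/usr/bin/env bash
# run_qc [INPUT_DIR] [OUTPUT_DIR]: run FastQC on every FASTQ file in INPUT_DIR
# and write the reports to OUTPUT_DIR.
# (Hypothetical wrapper; defaults mirror the folder names in this README.)
run_qc() {
    local in_dir="${1:-ATAC_paired}" out_dir="${2:-Quality_ATAC}"

    # FastQC expects the output directory to exist already.
    mkdir -p "$out_dir"

    fastqc --outdir "$out_dir" "$in_dir"/*.fastq.gz
}
```

Running run_qc with no arguments from the ATACseq-Processor directory leaves one HTML report per input file in Quality_ATAC.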

Once the script has finished running, open the SPECIES.clipped.sorted.bw file in the IGV Genome browser and load a reference genome to view chromatin accessibility:

[Image: igv_snapshot_ATAC]

Notes

  • The pipeline used in this script is adapted from this excellent tutorial
  • The script should automatically detect the reference genome and ATAC-seq files, as long as they are located in the correct directories. The script will also automatically detect the species shorthand name and the number of CPU cores available
  • Building the genome index (step 2) is likely to take a long time as the process is computationally intensive
  • Once the script has finished and you have saved the SPECIES.clipped.sorted.bw file, you can delete all of the other generated files, as they are no longer needed
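
The automatic detection mentioned in the notes could be done roughly as follows. This is an illustrative sketch only: both function names are hypothetical, and taking the text before the first dot of the genome file name as the species shorthand is an assumption, not necessarily how processor.sh derives it.

```shell
#!/usr/bin/env bash
# species_from_genome FILE: print the part of the file name before the first
# dot, e.g. Danio_rerio.GRCz11.dna.toplevel.fa.gz -> Danio_rerio.
# (Assumed naming convention; hypothetical helper.)
species_from_genome() {
    local name
    name=$(basename "$1")
    echo "${name%%.*}"
}

# cpu_cores: print the number of available CPU cores, trying the commands
# found on macOS and Linux in turn and falling back to 1.
cpu_cores() {
    getconf _NPROCESSORS_ONLN 2>/dev/null \
        || sysctl -n hw.ncpu 2>/dev/null \
        || nproc 2>/dev/null \
        || echo 1
}
```

The core count would typically be passed to tools that accept a thread option, so that indexing and alignment use all available cores.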
