This script processes raw ATAC-seq files so that they are ready for analysis. The processed files can be opened in the IGV Genome browser to visualise chromatin accessibility.
Command | Installation |
---|---|
`fastqc`, `bowtie2`, `samtools`, `bioawk`, `bedtools` | Install via brew by running `brew install fastqc bowtie2 samtools bioawk bedtools` |
`macs3` | Install by running `pip3 install macs3` |
`bedClip` | Install from the same directory by running `curl http://hgdownload.cse.ucsc.edu/admin/exe/macOSX.x86_64/bedClip --output bedClip && chmod +x bedClip` |
`bedGraphToBigWig` | Install from the same directory by running `curl http://hgdownload.cse.ucsc.edu/admin/exe/macOSX.x86_64/bedGraphToBigWig --output bedGraphToBigWig && chmod +x bedGraphToBigWig` |
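A quick way to confirm that everything in the table is installed is to look each tool up before starting. The sketch below is not part of `processor.sh`: `missing_tools` is a hypothetical helper name, and it assumes `bedClip` and `bedGraphToBigWig` may sit in the current directory rather than on `PATH`.

```shell
# Print the name of every tool that cannot be found.
# Hypothetical helper; not taken from processor.sh.
missing_tools() {
  for tool in "$@"; do
    # command -v finds tools on PATH; the -x test covers binaries that were
    # downloaded into the current directory (bedClip, bedGraphToBigWig).
    if ! command -v "$tool" >/dev/null 2>&1 && [ ! -x "./$tool" ]; then
      printf '%s\n' "$tool"
    fi
  done
}

missing_tools fastqc bowtie2 samtools bioawk bedtools macs3 bedClip bedGraphToBigWig
```

If the command prints nothing, every tool was found.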
Download with:
git clone https://github.com/phenotypic/ATACseq-Processor.git
Before running, you must ensure that the reference genome and executable files are located in the same directory as the script, and that the two ATAC-seq files are located in the `ATAC_paired` subdirectory. See below:
ATACseq-Processor
├── ATAC_paired
│   ├── 30fish-0hpa_S1_L001_R1_00_1.fastq.gz
│   └── 30fish-0hpa_S1_L001_R1_00_2.fastq.gz
├── Danio_rerio.GRCz11.dna.toplevel.fa.gz
├── bedClip
├── bedGraphToBigWig
└── processor.sh
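The layout above can be checked before a run with a small shell function. This is a sketch, not something the script is known to do: `check_layout` is a hypothetical name, and it assumes the reference genome ends in `.fa.gz` and the reads end in `.fastq.gz`, as in the example tree.

```shell
# List anything missing from the expected layout and return non-zero if
# something is wrong. Hypothetical helper; not taken from processor.sh.
# The body runs in a subshell so its variables and "set --" do not leak.
check_layout() (
  dir=${1:-.}
  status=0

  [ -e "$dir/processor.sh" ] || { echo "missing: processor.sh"; status=1; }

  # The two UCSC binaries must be present and executable.
  for f in bedClip bedGraphToBigWig; do
    [ -x "$dir/$f" ] || { echo "missing or not executable: $f"; status=1; }
  done

  # Assumption: the reference genome is a gzipped FASTA named *.fa.gz.
  set -- "$dir"/*.fa.gz
  [ -e "$1" ] || { echo "missing: reference genome (*.fa.gz)"; status=1; }

  # Exactly two read files are expected in ATAC_paired.
  set -- "$dir"/ATAC_paired/*.fastq.gz
  if [ ! -e "$1" ] || [ "$#" -ne 2 ]; then
    echo "expected exactly two .fastq.gz files in ATAC_paired"
    status=1
  fi

  exit "$status"
)

check_layout . || echo "Fix the items listed above before running processor.sh"
```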
Start `processor.sh` from the `ATACseq-Processor` directory by running:
bash processor.sh
The script first generates quality control reports for the two files in the `ATAC_paired` subdirectory and writes them to the `Quality_ATAC` folder. You can view the reports in a web browser. Use this guide to interpret the results.
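For reference, the quality control step can be reproduced by hand with FastQC. The function below is an assumed equivalent, not the actual code in `processor.sh`: `run_qc` and the `FASTQC` override variable are hypothetical names, while `-o` is FastQC's standard output-directory option.

```shell
# Run FastQC on every .fastq.gz file in IN_DIR and write reports to OUT_DIR.
# Hypothetical sketch; the real script may call FastQC differently.
run_qc() {
  in_dir=${1:-ATAC_paired}
  out_dir=${2:-Quality_ATAC}

  # FastQC does not create the output directory itself.
  mkdir -p "$out_dir"

  # FASTQC can be set to point at a different binary; defaults to "fastqc".
  ${FASTQC:-fastqc} -o "$out_dir" "$in_dir"/*.fastq.gz
}
```

Calling `run_qc ATAC_paired Quality_ATAC` from the `ATACseq-Processor` directory should leave one HTML report per input file in `Quality_ATAC`.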
Once the script has finished running, open the `SPECIES.clipped.sorted.bw` file in the IGV Genome browser and load a reference genome to view chromatin accessibility.
- The pipeline used in this script is adapted from this excellent tutorial
- The script should automatically detect the reference genome and ATAC-seq files, as long as they are located in the correct directories. The script will also automatically detect the species shorthand name and the number of CPU cores available
- Building the genome index (step 2) is likely to take a long time as the process is computationally intensive
- Once the script has run and you have saved the `SPECIES.clipped.sorted.bw` file, you are welcome to delete all of the other files generated as they are no longer needed
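The automatic detection mentioned in the notes could be done along these lines. Both functions are illustrative guesses rather than the script's actual logic: `species_from_reference` assumes the shorthand is the reference file name up to the first dot, and `cpu_cores` tries several common core-count commands in turn.

```shell
# Assumption: the species shorthand is the reference file name up to the
# first dot, e.g. Danio_rerio.GRCz11.dna.toplevel.fa.gz -> Danio_rerio.
species_from_reference() {
  name=$(basename "$1")
  printf '%s\n' "${name%%.*}"
}

# Report the number of CPU cores, trying getconf first, then the
# macOS and Linux specific commands, and falling back to 1.
cpu_cores() {
  getconf _NPROCESSORS_ONLN 2>/dev/null ||
    sysctl -n hw.ncpu 2>/dev/null ||
    nproc 2>/dev/null ||
    echo 1
}

echo "Species: $(species_from_reference Danio_rerio.GRCz11.dna.toplevel.fa.gz), cores: $(cpu_cores)"
```

The core count is typically passed to multi-threaded tools such as `bowtie2` and `samtools` through their thread options.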