neurogenomics / epicompare Goto Github PK

Comparison, benchmarking & QC of epigenetic datasets

Home Page: https://doi.org/doi:10.18129/B9.bioc.EpiCompare

R 15.52% CSS 0.07% HTML 84.41%

epigenetics benchmark comparison genetics bioconductor benchmarking interactive-reporting html r-package bioconductor-package

epicompare's Introduction

⚖`EpiCompare`⚖
QC and Benchmarking of Epigenomic Datasets

Authors: Sera Choi, Brian Schilder, Leyla Abbasova, Alan Murphy, Nathan Skene

Updated: Mar-08-2023

Introduction

EpiCompare is an R package for comparing multiple epigenomic datasets for quality control and benchmarking purposes. The function outputs a report in HTML format consisting of three sections:

General Metrics: Metrics on peaks (percentage of blacklisted and non-standard peaks, and peak widths) and fragments (duplication rate) of samples.
Peak Overlap: Frequency, percentage, statistical significance of overlapping and non-overlapping peaks. This also includes Upset, precision-recall and correlation plots.
Functional Annotation: Functional annotation (ChromHMM, ChIPseeker and enrichment analysis) of peaks. Also includes peak enrichment around Transcription Start Site.

Note: Peaks located in blacklisted regions and non-standard chromosomes are removed from the files prior to analysis.

Installation

Standard

To install EpiCompare use:

if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager")
BiocManager::install("EpiCompare")

All dependencies

👈 Details

Installing all Imports and Suggests will allow you to use the full functionality of EpiCompare right away, without having to stop and install extra dependencies later on.

To install these packages as well, use:

BiocManager::install("EpiCompare", dependencies=TRUE)

Note that this will increase installation time, but it means that you won’t have to worry about installing any R packages when using functions with certain suggested dependencies

Development

👈 Details

To install the development version of EpiCompare, use:

if (!require("remotes")) install.packages("remotes")
remotes::install_github("neurogenomics/EpiCompare")

Citation

If you use EpiCompare, please cite:

EpiCompare: R package for the comparison and quality control of epigenomic peak files (2022) Sera Choi, Brian M. Schilder, Leyla Abbasova, Alan E. Murphy, Nathan G. Skene, bioRxiv, 2022.07.22.501149; doi: https://doi.org/10.1101/2022.07.22.501149

Documentation

The documentation in this README and the GitHub Pages website pertains to the development version of EpiCompare. Older versions of EpiCompare may have slightly different documentation (e.g. available functions, parameters). For documentation in older versions of EpiCompare, please see the Documentation section of the relevant version on Bioconductor

Usage

Load package and example datasets.

library(EpiCompare)
data("encode_H3K27ac") # example peakfile
data("CnT_H3K27ac") # example peakfile
data("CnR_H3K27ac") # example peakfile
data("CnT_H3K27ac_picard") # example Picard summary output
data("CnR_H3K27ac_picard") # example Picard summary output

Prepare input files:

# create named list of peakfiles 
peakfiles <- list("CnT"=CnT_H3K27ac, 
                  "CnR"=CnR_H3K27ac) 
# set ref file and name 
reference <- list("ENCODE_H3K27ac" = encode_H3K27ac) 
# create named list of Picard summary
picard_files <- list("CnT"=CnT_H3K27ac_picard, 
                     "CnR"=CnR_H3K27ac_picard)

👈 Tips on importing user-supplied files

EpiCompare::gather_files is helpful for identifying and importing peak or picard files.

# To import BED files as GRanges object
peakfiles <- EpiCompare::gather_files(dir = "path/to/peaks/", 
                                      type = "peaks.stringent")
# EpiCompare alternatively accepts paths (to BED files) as input 
peakfiles <- list(sample1="/path/to/peaks/file1_peaks.stringent.bed", 
                  sample2="/path/to/peaks/file2_peaks.stringent.bed")
# To import Picard summary output txt file as data frame
picard_files <- EpiCompare::gather_files(dir = "path/to/peaks", 
                                         type = "picard")

Run EpiCompare():

EpiCompare::EpiCompare(peakfiles = peakfiles,
                       genome_build = list(peakfiles="hg19",
                                           reference="hg38"),
                       genome_build_output = "hg19", 
                       picard_files = picard_files,
                       reference = reference,
                       run_all = TRUE
                       output_dir = tempdir())

Required Inputs

These input parameters must be provided:

👈 Details

peakfiles : Peakfiles you want to analyse. EpiCompare accepts peakfiles as GRanges object and/or as paths to BED files. Files must be listed and named using list(). E.g. list("name1"=peakfile1, "name2"=peakfile2).
genome_build : A named list indicating the human genome build used to generate each of the following inputs:
- peakfiles : Genome build for the peakfiles input. Assumes genome build is the same for each element in the peakfiles list.
- reference : Genome build for the reference input.
- blacklist : Genome build for the blacklist input.
  E.g. genome_build = list(peakfiles="hg38", reference="hg19", blacklist="hg19")
genome_build_output Genome build to standardise all inputs to. Liftovers will be performed automatically as needed. Default is “hg19”.
blacklist : Peakfile as GRanges object specifying genomic regions that have anomalous and/or unstructured signals independent of the cell-line or experiment. For human hg19 and hg38 genome, use built-in data data(hg19_blacklist) and data(hg38_blacklist) respectively. For mouse mm10 genome, use built-in data data(mm10_blacklist).
output_dir : Please specify the path to directory, where all EpiCompare outputs will be saved.

Optional Inputs

The following input files are optional:

👈 Details

picard_files : A list of summary metrics output from Picard. Picard MarkDuplicates can be used to identify the duplicate reads amongst the alignment. This tool generates a summary output, normally with the ending .markdup.MarkDuplicates.metrics.txt. If this input is provided, metrics on fragments (e.g. mapped fragments and duplication rate) will be included in the report. Files must be in data.frame format and listed using list() and named using names(). To import Picard duplication metrics (.txt file) into R as data frame, use picard <- read.table("/path/to/picard/output", header = TRUE, fill = TRUE).
reference : Reference peak file(s) is used in stat_plot and chromHMM_plot. File must be in GRanges object, listed and named using list("reference_name" = GRanges_obect). If more than one reference is specified, EpiCompare outputs individual reports for each reference. However, please note that this can take awhile.

Optional Plots

By default, these plots will not be included in the report unless set to TRUE. To turn on all features at once, simply use the run_all=TRUE argument:

👈 Details

upset_plot : Upset plot of overlapping peaks between samples.
stat_plot : included only if a reference dataset is provided. The plot shows statistical significance (p/q-values) of sample peaks that are overlapping/non-overlapping with the reference dataset.
chromHMM_plot : ChromHMM annotation of peaks. If a reference dataset is provided, ChromHMM annotation of overlapping and non-overlapping peaks with the reference is also included in the report.
chipseeker_plot : ChIPseeker annotation of peaks.
enrichment_plot : KEGG pathway and GO enrichment analysis of peaks.
tss_plot : Peak frequency around (+/- 3000bp) transcriptional start site. Note that it may take awhile to generate this plot for large sample sizes.
precision_recall_plot : Plot showing the precision-recall score across the peak calling stringency thresholds.
corr_plot : Plot showing the correlation between the quantiles when the genome is binned at a set size. These quantiles are based on the intensity of the peak, dependent on the peak caller used (q-value for MACS2).

Other Options

👈 Details

chromHMM_annotation : Cell-line annotation for ChromHMM. Default is K562. Options are:
- “K562” = K-562 cells
- “Gm12878” = Cellosaurus cell-line GM12878
- “H1hesc” = H1 Human Embryonic Stem Cell
- “Hepg2” = Hep G2 cell
- “Hmec” = Human Mammary Epithelial Cell
- “Hsmm” = Human Skeletal Muscle Myoblasts
- “Huvec” = Human Umbilical Vein Endothelial Cells
- “Nhek” = Normal Human Epidermal Keratinocytes
- “Nhlf” = Normal Human Lung Fibroblasts
interact : By default, all heatmaps (percentage overlap and ChromHMM heatmaps) in the report will be interactive. If set FALSE, all heatmaps will be static. N.B. If interact=TRUE, interactive heatmaps will be saved as html files, which may take time for larger sample sizes.
output_filename : By default, the report is named EpiCompare.html. You can specify the file name of the report here.
output_timestamp : By default FALSE. If TRUE, the filename of the report includes the date.

Outputs

EpiCompare outputs the following:

HTML report: A summary of all analyses saved in specified output_dir
EpiCompare_file: if save_output=TRUE, all plots generated by EpiCompare will be saved in EpiCompare_file directory also in specified output_dir

An example report comparing ATAC-seq and DNase-seq can be found here

Datasets

EpiCompare includes several built-in datasets:

👈 Details

encode_H3K27ac: Human H3K27ac peak file generated with ChIP-seq using K562 cell-line. Taken from ENCODE project. For more information, run ?encode_H3K27ac.
CnT_H3K27ac: Human H3K27ac peak file generated with CUT&Tag using K562 cell-line from Kaya-Okur et al., (2019). For more information, run ?CnT_H3K27ac.
CnR_H3K27ac: Human H3K27ac peak file generated with CUT&Run using K562 cell-line from Meers et al., (2019). For more details, run ?CnR_H3K27ac.

Session Info

👈 Details

utils::sessionInfo()

## R version 4.2.1 (2022-06-23)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Big Sur ... 10.16
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## loaded via a namespace (and not attached):
##  [1] pillar_1.8.1        compiler_4.2.1      RColorBrewer_1.1-3 
##  [4] BiocManager_1.30.20 bitops_1.0-7        yulab.utils_0.0.6  
##  [7] tools_4.2.1         digest_0.6.31       jsonlite_1.8.4     
## [10] evaluate_0.20       lifecycle_1.0.3     tibble_3.1.8       
## [13] gtable_0.3.1        pkgconfig_2.0.3     rlang_1.0.6        
## [16] graph_1.76.0        cli_3.6.0           rstudioapi_0.14    
## [19] rvcheck_0.2.1       yaml_2.3.7          xfun_0.37          
## [22] fastmap_1.1.0       dplyr_1.1.0         knitr_1.42         
## [25] generics_0.1.3      desc_1.4.2          vctrs_0.5.2        
## [28] dlstats_0.1.6       stats4_4.2.1        rprojroot_2.0.3    
## [31] grid_4.2.1          tidyselect_1.2.0    here_1.0.1         
## [34] Biobase_2.58.0      glue_1.6.2          R6_2.5.1           
## [37] fansi_1.0.4         XML_3.99-0.13       RBGL_1.74.0        
## [40] rmarkdown_2.20.1    ggplot2_3.4.1       badger_0.2.3       
## [43] magrittr_2.0.3      BiocGenerics_0.44.0 biocViews_1.66.2   
## [46] scales_1.2.1        htmltools_0.5.4     rworkflows_0.99.7  
## [49] RUnit_0.4.32        colorspace_2.1-0    renv_0.17.0        
## [52] utf8_1.2.3          RCurl_1.98-1.10     munsell_0.5.0

Contact

Neurogenomics Lab

UK Dementia Research Institute
Department of Brain Sciences
Faculty of Medicine
Imperial College London
GitHub
DockerHub

epicompare's People

Contributors

Stargazers

Watchers

Forkers

serachoi1230 xindong-sun tomrrr1

epicompare's Issues

Add unit tests

Unit tests are a formal way to automatically test your functions and ensure they're always working (even after you make changes to them).

See here for an intro to setting these up.

Make the initial function so that it has the basic characteristics of a package

E.g. a vignette made with pkgdown; function examples; it completes devtools::check() and runs unit tests with testthat

` Each of the 2 combined objects has sequence levels not in the other:`

Seems to be some issues with seqnames during some steps within EpiCompare::EpiCompare. You may need to do some additional standardization and validation of the inputs GRanges.

The following output is from the same reprex given here: #29

Automatically detect and read bed files

It would be nice if users could directly supply stored bed/peak path names to peakfiles. You could set up some internal functions to check if the list contains paths or GRanges, and if the former then read it in.

peaklist2 <- lapply(names(peaklist), function(x){
  item <- peaklist[[x]]
  if(methods::is(item,"GRanges")){
    return(item)
  } else {
    f <- ChIPseeker::readPeakFile(x, as = "GRanges")
    return(f)
  }
}) %>% `names<-`(names(peaklist))

README example can't find data inputs

In the README example, are these additional built-in datasets ?

...
...
picard = list(encode_picard, CnT_picard, CnR_picard),
...
...

If so, make sure they're setup before this part is run so the example works all the way through.

Rotate all x-axis labels in "Functional Enrichment Analysis" plots

Some x-axis labels are rotated but others are not:

`Error in `.rowNamesDF<-`(x, value = value) : duplicate 'row.names' are not allowed`

Code

### Data stored on Imperial HPC (mounted to local laptop)
root <- "/Volumes/bms20/projects/neurogenomics-lab/live/Data/tip_seq"

### Import the called peak files produced by the [nf-core/cutanrun](https://nf-co.re/cutandrun) pipeline.
peakpaths <- list.files(path = file.path(root,"processed_data"),
                        pattern = "*.peaks.bed.stringent.bed$", 
                        recursive = TRUE, 
                        full.names = TRUE)

names <- paste(basename(dirname(dirname(dirname(peakpaths)))),
               stringr::str_split(basename(peakpaths),"[.]", simplify = TRUE)[,1], 
               sep=".")
peakfiles <- lapply(peakpaths, function(x){
  message(x)
  ChIPseeker::readPeakFile(x, as = "GRanges")
}) %>% `names<-`(names)

### Run EpiCompare
library(EpiCompare)
data("encode_H3K27ac") # example peakfile
data("CnT_H3K27ac") # example peakfile
data("CnR_H3K27ac") # example peakfile
data("hg19_blacklist") # example blacklist 

# peaklist <- list(encode_H3K27ac, peakfiles) 
# namelist <- c("encode", names)
out <- EpiCompare(peakfiles = peakfiles,
           names = names,
           blacklist = hg19_blacklist,
           reference = encode_H3K27ac,
           stat_plot = TRUE,
           chrmHMM_plot = TRUE,
           chipseeker_plot = TRUE,
           enrichment_plot = TRUE,
           save_output = TRUE,
           output_dir = "./")

Errors

Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr11_KI270721v1_random, chr14_GL000009v2_random, chr14_GL000194v1_random, chr14_GL000225v1_random, chr14_KI270726v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270706v1_random, chr1_KI270707v1_random, chr1_KI270711v1_random, chr1_KI270713v1_random, chr1_KI270714v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr3_GL000221v1_random, chr4_GL000008v2_random, chr9_KI270719v1_random, chr9_KI270720v1_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000216v2, chrUn_GL000218v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270303v1, chrUn_KI270310v1, chrUn_KI270337v1, chrUn_KI270438v1, chrUn_KI270442v1, chrUn_KI270467v1, chrUn_KI270741v1, chrUn_KI270742v1, chrUn_KI270743v1, chrUn_KI270744v1, chrUn_KI270745v1, chrUn_KI270748v1, chrUn_KI270751v1, chrUn_KI270754v1
  - in 'y': chrY
  Make sure to always combin [... truncated]
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr11_KI270721v1_random, chr14_GL000009v2_random, chr14_GL000194v1_random, chr14_GL000225v1_random, chr14_KI270726v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270706v1_random, chr1_KI270707v1_random, chr1_KI270711v1_random, chr1_KI270713v1_random, chr1_KI270714v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr3_GL000221v1_random, chr4_GL000008v2_random, chr9_KI270719v1_random, chr9_KI270720v1_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000216v2, chrUn_GL000218v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270303v1, chrUn_KI270310v1, chrUn_KI270337v1, chrUn_KI270438v1, chrUn_KI270442v1, chrUn_KI270467v1, chrUn_KI270741v1, chrUn_KI270742v1, chrUn_KI270743v1, chrUn_KI270744v1, chrUn_KI270745v1, chrUn_KI270748v1, chrUn_KI270751v1, chrUn_KI270754v1
  - in 'y': chrY
  Make sure to always combin [... truncated]
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr11_KI270721v1_random, chr14_GL000225v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270712v1_random, chr1_KI270713v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr9_KI270718v1_random, chr9_KI270719v1_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270310v1, chrUn_KI270330v1, chrUn_KI270438v1, chrUn_KI270743v1, chrUn_KI270744v1
  - in 'y': chrY
  Make sure to always combine/compare objects based on the same reference
  genome (use suppressWarnings() to suppress this warning).
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr14_GL000009v2_random, chr14_GL000194v1_random, chr14_GL000225v1_random, chr14_KI270726v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270706v1_random, chr1_KI270711v1_random, chr1_KI270713v1_random, chr1_KI270714v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr3_GL000221v1_random, chr4_GL000008v2_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000216v2, chrUn_GL000218v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270442v1, chrUn_KI270741v1, chrUn_KI270742v1, chrUn_KI270743v1, chrUn_KI270744v1, chrUn_KI270745v1, chrUn_KI270751v1, chrUn_KI270754v1
  - in 'y': chrY
  Make sure to always combine/compare objects based on the same reference
  genome (use suppressWarnings() to suppress this warning).
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr11_KI270721v1_random, chr14_GL000225v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270712v1_random, chr1_KI270713v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr9_KI270718v1_random, chr9_KI270719v1_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270310v1, chrUn_KI270330v1, chrUn_KI270438v1, chrUn_KI270743v1, chrUn_KI270744v1
  - in 'y': chrY
  Make sure to always combine/compare objects based on the same reference
  genome (use suppressWarnings() to suppress this warning).
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr14_GL000009v2_random, chr14_GL000194v1_random, chr14_GL000225v1_random, chr14_KI270726v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270706v1_random, chr1_KI270711v1_random, chr1_KI270713v1_random, chr1_KI270714v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr3_GL000221v1_random, chr4_GL000008v2_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000216v2, chrUn_GL000218v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270442v1, chrUn_KI270741v1, chrUn_KI270742v1, chrUn_KI270743v1, chrUn_KI270744v1, chrUn_KI270745v1, chrUn_KI270751v1, chrUn_KI270754v1
  - in 'y': chrY
  Make sure to always combine/compare objects based on the same reference
  genome (use suppressWarnings() to suppress this warning).
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr11_KI270721v1_random, chr14_GL000009v2_random, chr14_GL000194v1_random, chr14_GL000225v1_random, chr14_KI270726v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270706v1_random, chr1_KI270707v1_random, chr1_KI270711v1_random, chr1_KI270713v1_random, chr1_KI270714v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr3_GL000221v1_random, chr4_GL000008v2_random, chr9_KI270719v1_random, chr9_KI270720v1_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000216v2, chrUn_GL000218v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270303v1, chrUn_KI270310v1, chrUn_KI270337v1, chrUn_KI270438v1, chrUn_KI270442v1, chrUn_KI270467v1, chrUn_KI270741v1, chrUn_KI270742v1, chrUn_KI270743v1, chrUn_KI270744v1, chrUn_KI270745v1, chrUn_KI270748v1, chrUn_KI270751v1, chrUn_KI270754v1
  - in 'y': chrY
  Make sure to always combin [... truncated]
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr11_KI270721v1_random, chr14_GL000009v2_random, chr14_GL000194v1_random, chr14_GL000225v1_random, chr14_KI270726v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270706v1_random, chr1_KI270707v1_random, chr1_KI270711v1_random, chr1_KI270713v1_random, chr1_KI270714v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr3_GL000221v1_random, chr4_GL000008v2_random, chr9_KI270719v1_random, chr9_KI270720v1_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000216v2, chrUn_GL000218v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270303v1, chrUn_KI270310v1, chrUn_KI270337v1, chrUn_KI270438v1, chrUn_KI270442v1, chrUn_KI270467v1, chrUn_KI270741v1, chrUn_KI270742v1, chrUn_KI270743v1, chrUn_KI270744v1, chrUn_KI270745v1, chrUn_KI270748v1, chrUn_KI270751v1, chrUn_KI270754v1
  - in 'y': chrY
  Make sure to always combin [... truncated]
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr11_KI270721v1_random, chr14_GL000225v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270712v1_random, chr1_KI270713v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr9_KI270718v1_random, chr9_KI270719v1_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270310v1, chrUn_KI270330v1, chrUn_KI270438v1, chrUn_KI270743v1, chrUn_KI270744v1
  - in 'y': chrY
  Make sure to always combine/compare objects based on the same reference
  genome (use suppressWarnings() to suppress this warning).
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr14_GL000009v2_random, chr14_GL000194v1_random, chr14_GL000225v1_random, chr14_KI270726v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270706v1_random, chr1_KI270711v1_random, chr1_KI270713v1_random, chr1_KI270714v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr3_GL000221v1_random, chr4_GL000008v2_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000216v2, chrUn_GL000218v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270442v1, chrUn_KI270741v1, chrUn_KI270742v1, chrUn_KI270743v1, chrUn_KI270744v1, chrUn_KI270745v1, chrUn_KI270751v1, chrUn_KI270754v1
  - in 'y': chrY
  Make sure to always combine/compare objects based on the same reference
  genome (use suppressWarnings() to suppress this warning).
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr11_KI270721v1_random, chr14_GL000225v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270712v1_random, chr1_KI270713v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr9_KI270718v1_random, chr9_KI270719v1_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270310v1, chrUn_KI270330v1, chrUn_KI270438v1, chrUn_KI270743v1, chrUn_KI270744v1
  - in 'y': chrY
  Make sure to always combine/compare objects based on the same reference
  genome (use suppressWarnings() to suppress this warning).
Warning in .Seqinfo.mergexy(x, y) :
  Each of the 2 combined objects has sequence levels not in the other:
  - in 'x': chr14_GL000009v2_random, chr14_GL000194v1_random, chr14_GL000225v1_random, chr14_KI270726v1_random, chr15_KI270727v1_random, chr16_KI270728v1_random, chr17_GL000205v2_random, chr1_KI270706v1_random, chr1_KI270711v1_random, chr1_KI270713v1_random, chr1_KI270714v1_random, chr22_KI270731v1_random, chr22_KI270733v1_random, chr22_KI270734v1_random, chr3_GL000221v1_random, chr4_GL000008v2_random, chrM, chrUn_GL000195v1, chrUn_GL000214v1, chrUn_GL000216v2, chrUn_GL000218v1, chrUn_GL000219v1, chrUn_GL000220v1, chrUn_GL000224v1, chrUn_KI270442v1, chrUn_KI270741v1, chrUn_KI270742v1, chrUn_KI270743v1, chrUn_KI270744v1, chrUn_KI270745v1, chrUn_KI270751v1, chrUn_KI270754v1
  - in 'y': chrY
  Make sure to always combine/compare objects based on the same reference
  genome (use suppressWarnings() to suppress this warning).
Saving 7 x 5 in image
Warning: non-unique values when setting 'row.names': 'phase_1_05_jan_2022.S_1_R1', 'phase_1_05_jan_2022.S_1_R2', 'phase_2_03_feb_2022.S_4_R1', 'phase_2_28_jan_2022.S_2_R1', 'phase_2_28_jan_2022.S_3_R1', 'phase_2_28_jan_2022.S_4_R1', 'phase_2_28_jan_2022.S_5_R1', 'phase_2_28_jan_2022.S_6_R1'
Quitting from lines 167-179 (EpiCompare.Rmd) 
Error in `.rowNamesDF<-`(x, value = value) : 
  duplicate 'row.names' are not allowed

Session info

R version 4.1.0 (2021-05-18)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Big Sur 11.4

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods  
[8] base     

other attached packages:
 [1] org.Hs.eg.db_3.14.0  AnnotationDbi_1.56.2 Biobase_2.54.0      
 [4] GenomicRanges_1.46.1 GenomeInfoDb_1.30.1  IRanges_2.28.0      
 [7] S4Vectors_0.32.3     BiocGenerics_0.40.0  dplyr_1.0.8         
[10] EpiCompare_0.99.0   

loaded via a namespace (and not attached):
  [1] utf8_1.2.2                             
  [2] tidyselect_1.1.1                       
  [3] heatmaply_1.3.0                        
  [4] RSQLite_2.2.9                          
  [5] htmlwidgets_1.5.4                      
  [6] grid_4.1.0                             
  [7] TSP_1.1-11                             
  [8] BiocParallel_1.28.3                    
  [9] scatterpie_0.1.7                       
 [10] munsell_0.5.0                          
 [11] codetools_0.2-18                       
 [12] ragg_1.2.1                             
 [13] colorspace_2.0-2                       
 [14] GOSemSim_2.20.0                        
 [15] filelock_1.0.2                         
 [16] highr_0.9                              
 [17] knitr_1.37                             
 [18] rstudioapi_0.13                        
 [19] ggsignif_0.6.3                         
 [20] DOSE_3.20.1                            
 [21] MatrixGenerics_1.6.0                   
 [22] labeling_0.4.2                         
 [23] GenomeInfoDbData_1.2.7                 
 [24] polyclip_1.10-0                        
 [25] seqPattern_1.26.0                      
 [26] bit64_4.0.5                            
 [27] farver_2.1.0                           
 [28] downloader_0.4                         
 [29] rprojroot_2.0.2                        
 [30] treeio_1.18.1                          
 [31] vctrs_0.3.8                            
 [32] generics_0.1.2                         
 [33] xfun_0.29                              
 [34] BiocFileCache_2.2.1                    
 [35] R6_2.5.1                               
 [36] graphlayouts_0.8.0                     
 [37] seriation_1.3.2                        
 [38] locfit_1.5-9.4                         
 [39] gridGraphics_0.5-1                     
 [40] fgsea_1.20.0                           
 [41] bitops_1.0-7                           
 [42] BRGenomics_1.6.0                       
 [43] cachem_1.0.6                           
 [44] DelayedArray_0.20.0                    
 [45] assertthat_0.2.1                       
 [46] BiocIO_1.4.0                           
 [47] scales_1.1.1                           
 [48] vroom_1.5.7                            
 [49] ggraph_2.0.5                           
 [50] enrichplot_1.14.1                      
 [51] gtable_0.3.0                           
 [52] tidygraph_1.2.0                        
 [53] rlang_1.0.1                            
 [54] genefilter_1.76.0                      
 [55] systemfonts_1.0.3                      
 [56] splines_4.1.0                          
 [57] rtracklayer_1.54.0                     
 [58] rstatix_0.7.0                          
 [59] lazyeval_0.2.2                         
 [60] impute_1.68.0                          
 [61] broom_0.7.12                           
 [62] yaml_2.2.2                             
 [63] reshape2_1.4.4                         
 [64] abind_1.4-5                            
 [65] GenomicFeatures_1.46.4                 
 [66] crosstalk_1.2.0                        
 [67] backports_1.4.1                        
 [68] qvalue_2.26.0                          
 [69] clusterProfiler_4.2.2                  
 [70] tools_4.1.0                            
 [71] ggplotify_0.1.0                        
 [72] gridBase_0.4-7                         
 [73] ggplot2_3.3.5                          
 [74] gplots_3.1.1                           
 [75] ellipsis_0.3.2                         
 [76] jquerylib_0.1.4                        
 [77] RColorBrewer_1.1-2                     
 [78] Rcpp_1.0.8                             
 [79] plyr_1.8.6                             
 [80] progress_1.2.2                         
 [81] zlibbioc_1.40.0                        
 [82] purrr_0.3.4                            
 [83] RCurl_1.98-1.6                         
 [84] prettyunits_1.1.1                      
 [85] ggpubr_0.4.0                           
 [86] viridis_0.6.2                          
 [87] SummarizedExperiment_1.24.0            
 [88] ggrepel_0.9.1                          
 [89] here_1.0.1                             
 [90] magrittr_2.0.2                         
 [91] data.table_1.14.2                      
 [92] DO.db_2.9                              
 [93] matrixStats_0.61.0                     
 [94] patchwork_1.1.1                        
 [95] hms_1.1.1                              
 [96] evaluate_0.14                          
 [97] xtable_1.8-4                           
 [98] XML_3.99-0.8                           
 [99] gridExtra_2.3                          
[100] compiler_4.1.0                         
[101] biomaRt_2.50.3                         
[102] tibble_3.1.6                           
[103] shadowtext_0.1.1                       
[104] KernSmooth_2.23-20                     
[105] crayon_1.5.0                           
[106] htmltools_0.5.2                        
[107] ggfun_0.0.5                            
[108] tzdb_0.2.0                             
[109] aplot_0.1.2                            
[110] tidyr_1.2.0                            
[111] geneplotter_1.72.0                     
[112] DBI_1.1.2                              
[113] tweenr_1.0.2                           
[114] genomation_1.26.0                      
[115] ChIPseeker_1.30.3                      
[116] dbplyr_2.1.1                           
[117] MASS_7.3-55                            
[118] rappdirs_0.3.3                         
[119] boot_1.3-28                            
[120] Matrix_1.4-0                           
[121] car_3.0-12                             
[122] readr_2.1.2                            
[123] cli_3.2.0                              
[124] parallel_4.1.0                         
[125] igraph_1.2.11                          
[126] pkgconfig_2.0.3                        
[127] TxDb.Hsapiens.UCSC.hg19.knownGene_3.2.2
[128] GenomicAlignments_1.30.0               
[129] registry_0.5-1                         
[130] plotly_4.10.0                          
[131] xml2_1.3.3                             
[132] foreach_1.5.2                          
[133] ggtree_3.2.1                           
[134] annotate_1.72.0                        
[135] bslib_0.3.1                            
[136] webshot_0.5.2                          
[137] XVector_0.34.0                         
[138] yulab.utils_0.0.4                      
[139] stringr_1.4.0                          
[140] digest_0.6.29                          
[141] Biostrings_2.62.0                      
[142] fastmatch_1.1-3                        
[143] rmarkdown_2.11                         
[144] tidytree_0.3.7                         
[145] dendextend_1.15.2                      
[146] restfulr_0.0.13                        
[147] curl_4.3.2                             
[148] gtools_3.9.2                           
[149] Rsamtools_2.10.0                       
[150] rjson_0.2.21                           
[151] nlme_3.1-155                           
[152] lifecycle_1.0.1                        
[153] jsonlite_1.7.3                         
[154] carData_3.0-5                          
[155] viridisLite_0.4.0                      
[156] BSgenome_1.62.0                        
[157] fansi_1.0.2                            
[158] pillar_1.7.0                           
[159] lattice_0.20-45                        
[160] KEGGREST_1.34.0                        
[161] fastmap_1.1.0                          
[162] httr_1.4.2                             
[163] plotrix_3.8-2                          
[164] survival_3.2-13                        
[165] GO.db_3.14.0                           
[166] glue_1.6.1                             
[167] remotes_2.4.2                          
[168] png_0.1-7                              
[169] iterators_1.0.14                       
[170] bit_4.0.4                              
[171] ggforce_0.3.3                          
[172] stringi_1.7.6                          
[173] sass_0.4.0                             
[174] blob_1.2.2                             
[175] textshaping_0.3.6                      
[176] DESeq2_1.34.0                          
[177] caTools_1.18.2                         
[178] memoise_2.0.1                          
[179] ape_5.6-1

Allow for figure size adjustments

It would be helpful to pass up figure dimensions as arguments the user can specify. Otherwise, can result in very scrunched plots when more samples are added.

Alternatively (or in addition), you could come up with dynamic defaults. A nice default would be to figure out some scaling between the number of samples in the peakfiles list and the size of the plots.

You'd need to play around this the formula a bit but it could be something like:

n <- length(peakfiles)
width <- 7 * (n/2)
height <- 5 * (n/2)

Add topics

In the upper righthand of your repo (by the description of the package) you'll notice a little gear icon. Click that and add some topics that your repo is associated with.

This is mostly gratuitous but it makes your repo look more fancy and official 😄

Include code in report file

It would be a nice feature to include the code that was used to generate the report HTML file within the report file itself. This would help with reproducibility.

You could include a collapsed code chunk at the beginning containing the exact command/args that was used:

out <- EpiCompare(peakfiles = peakfiles,
           names = names,
           blacklist = hg19_blacklist,
           reference = encode_H3K27ac,
           stat_plot = TRUE,
           chrmHMM_plot = TRUE,
           chipseeker_plot = TRUE,
           enrichment_plot = TRUE,
           save_output = TRUE,
           output_dir = here::here("reports"))

Set version to 0.99.0

In your DESCRIPTION you have the version set as Version: 0.0.0.9000.
I recc switching this to Version: 0.99.0 to align with Bioconductor standards. See our wiki for more info.

Even if you don't plan on putting your package on Bioc atm, following this standard will make your life easier if you ever decide to down the road

Would be good to include an example report that can be linked to from within the README, so people can see what the output of the package is

Is there a way to make it so the file gets regenerated when the package is built? Can this be done without throwing package errors? @bschilder @Al-Murphy

Add README.Rmd

Adding a README is essential for any GH repo, package or otherwise. It gives an intro to what a package does and how to install it, and provides links to any documentation sites.

A powerful way to generate README.md (markdown) files is to write a README.Rmd (rmarkdown), which you can then knit into a markdown file. This means you can run real code in the README and add dynamic variables that automatically update when your package changes.

Once added to the root directory of your repo, the README.md will automatically appear on the homepage of your repo.

See orthogene for an example.

Further explanations of Rmd files.

Enable package to automatically install Bioc packages

Some of your Imports are from Bioconductor. R doesn't know where to find these packages by default, so you need to add the line biocViews: somewhere before your Imports section in the DESCRIPTION file.

More info here.

Include API to ENCODE data

It would be great to include an API to import data from resources like ENCODE and/or Roadmap.

I've recently built an R package to help with this:
https://github.com/RajLabMSSM/echoannot

Create initial function for comparing dataset X to dataset Y

Assumption: dataset X is a sequencing file, dataset Y is from ENCODE

Want to do three initial analyses:
(1) Which percentage of peaks from X are found in Y?
(2) Which percentage of peaks from Y are found in X?
(3) What ChromHMM annotations are both associated with?

The output of the function would be a knitted markdown report explaining the results

Add an option to include information from Picard to plot duplicated read rates as well

Would enable us to plot these figures as well...

This would be important if optimising e.g. the rate of duplicates

Initially look into how hard this would be to implement, then lets discuss

Name columns in GRanges data

The built-in GRanges objects are missing column names for the metadata.

Could you add the column names back in?

colnames(GenomicRanges::mcols(CnT_H3K27ac)) <- new_col_names

Make documentation website

Easier than it sounds. Once you have all the other things I mentioned in the Issues set up, you can automatically generate a documentation website with devtools::build_site().

Better yet, once you get these GHA workflows set up, they'll rebuild and launch the documentation website for you every time.

Allow `EpiCompare` func to return outputs directly

I see in the EpiCompare function you automatically create an Rmd and render it to html. Very cool implementation!

One thing i'd suggest is to allow an arg to the func that gives users the option to instead directly return the relevant objects generated by EpiCompare, rather than render the html. This way they can pipe the output directly to other functions.

EpiCompare::EpiCompare - Provide more precise description of expected Picard input

The args relevant to Picard in the EpiCompare::EpiCompare function are a bit vague. Picard has a number of functionalities and outputs. Could benefit from a more detailed description of what is expected, in what format (file path? data.frame? GRanges?), and perhaps the first N lines of an example file.

Also, the picard and picard_names args can be merged in the same way suggested here: #27

Remove numbers from x-axis of ChromHMM plots

Looks like the ChromHMM plot x-axis labels get renamed at some point so that they don't start with numbers (by adding X#_)

This looks a bit messy and you could remove the "X#_" part from the data, or while plotting with gsub.

Include session info in the report

Always nice to include Session info at the end of Rmarkdown reports to enhance reproducibility.

# Session Info 

<details> 

```{r Session Info}
utils::sessionInfo()
```

</details>  

<br>

`EpiCompare`: lazy-load database is corrupt

Example

From the EpiCompare::EpiCompare documentation.

data("encode_H3K27ac")
data("CnT_H3K27ac")
data("CnR_H3K27ac")
data("hg19_blacklist")


out <- EpiCompare(peakfiles = list(CnR_H3K27ac, CnT_H3K27ac),
           names = c("CnR", "CnT"),
           blacklist = hg19_blacklist,
           reference = encode_H3K27ac,
           stat_plot = TRUE,
           chrmHMM_plot = TRUE,
           save_output = FALSE,
           output_dir = "./")

Produces the following error

Warning: Removed 16 rows containing non-finite values (stat_boxplot).
Warning: replacing previous import 'Biostrings::pattern' by 'grid::pattern' when loading 'genomation'
Warning in base::nchar(wide_chars$test, type = "width") :
  internal error -3 in R_decompress1
Quitting from lines 198-215 (EpiCompare.Rmd) 
Error in base::nchar(wide_chars$test, type = "width") : 
  lazy-load database '/Library/Frameworks/R.framework/Versions/4.1/Resources/library/cli/R/sysdata.rdb' is corrupt

I think this has something to do with a recent formatting change to RDA objects. Super annoying that they're not back-compatible, but if I'm right it probably means you need to regenerate any built-in datasets using an updated version of R.

Include links to EpiCompare in report

Embed links to EpiCompare in the file itself, so people know exactly where to find the tool even if the file get disassociated from the original code used to produce it.

# [EpiCompare](https://github.com/neurogenomics/EpiCompare) Report
## Sera Choi
[EpiCompare](https://github.com/neurogenomics/EpiCompare) is an R package that compares different epigenetic datasets for quality control and benchmarking purposes. The report is divided into three sections:

Update GHA

I recently updated my GHA so that they only use a single yml file, which also takes care of pushing to DockerHub. this is more efficient than the split version bc it doesn't duplicate the install/check steps.

Replace your GHA workflows with this one:

https://github.com/neurogenomics/templateR/tree/master/.github/workflows

Also, be sure to replace your DockerFile with the updated version:

https://github.com/neurogenomics/templateR/blob/master/Dockerfile

Get github actions so it completes successfully and there are build complete buttons etc

Add installation instructions to README

if(!require("remotes")) install.packages("remotes")
remotes::install_github("neurogenomics/EpiCompare")

Create function-specific unit tests

The unit tests that you have are great in that they cover much of the functionality of EpiCompare, but it can also be helpful to have more specific unit tests, at least for all exported functions. This makes it much easier to pinpoint issues if your full pipeline breaks later on.

Make the example datasets so they can be stored within the repo and used directly

The tutorial currently says to do this

data("encode_H3K27ac")
data("CnT_H3K27ac")

@bschilder can advise on how to store them in the package

Pairwise plots showing the probability of peaks for overlapping vs non-overlapping files

Because we don't have a pre-defined reference datasets as we did with ENCODE, this will probably need tabs once we stat generating them for multiple datasets

Add vignettes

Vignettes are runnable examples of how to use your package. You can make multiple vignettes if you have lots of different functionalities you want to demonstrate.

Some examples:
https://github.com/neurogenomics/orthogene/tree/main/vignettes

Add codecov badge

Add a codecov badge so you can display how awesomely thorough the code coverage of your unit tests is. You can generate this badge with:

badger::badge_codecov()

Use a named list

Rather than having separate arguments (peakfiles and names), simply ask users to provide a named list.

peakfiles = list(encode=<path1>, CnT=<path2>, CnR=<path3>)

This is much less likely to result in misaligned names.

If no names are provided all(names(peaks)==NULL) then you can create default sample names based on the file names (or simply "sample2", "sample2", etc) along with a warning that these are being named automatically.

Interactive plots

Love the interactive plots. A couple points of improvement.

ChromHMM

Would like the possibility to make these interactive.

Interactive on/off

Would be helpful to provide the option of making plots interactive or not to the user. Could simply be one arg applied to all relevant plots interact=TRUE.

Setting the default to TRUE is probably fine, but some users may want to turn this off with larger sample sizes since interactive plots can really slow down html files in those cases.

Remove globals if possible

You define global variables here:
https://github.com/neurogenomics/EpiCompare/blob/master/R/globals.R

I'm not totally familiar with this implementation, but in general globals defined like this are not preferable bc it makes it unclear to the user how they're defined, and opens up the possibility that user variables won't mean what they think they mean.

Could you explain what these are used for?

Add link to documentation site

e.g. https://github.com/neurogenomics/templateR#documentation

ChromHMM: add table of fraction of peaks in each category

Add name of reference file in report

Change default `output_dir`

A good convention to follow is to set the default directory for any saved files to tempdir(). This is a base R function that gives you the path to a temporary directory that gets deleted when you restart the R session.

This has the advantage of not clogging up people's disk storage with file as they test out the package. But it also has the disadvantage of not saving any files across R sessions unless you change the save directory. For this reason, you may want to keep it as is; output_dir has no default, meaning users must specify it. Just make sure that in all your examples/tests, output_dir is set to tempdir().

stat_plot: rescale y-axis and write what unique and overlapping peak are

Add examples for subfunctions

I see you have a Roxygen documentation for your main function complete with an example; this is fantastic!

I also recommend doing the same for some of your subfunctions. While it's not necessary to do for every single little function, it can still be helpful to have some level of documentation so you always remember what each func does and what it needs as input.

fix README example

there's a missing comma in the README example (2nd to last line).

Also, change output_dir="./" to output_dir=tempdir()

Set up GitHub Actions

GHA are an awesome new feature. They automatically check all your code an make sure it installs on different kinds of computers.

You can set these up in your repo by following these instructions:
https://github.com/neurogenomics/r_workflows

EpiCompare::EpiCompare docs

Something weird is going on with the docs for the EpiCompare function. This page should be documentation for the function, not just an output file from it.

https://neurogenomics.github.io/EpiCompare/reference/EpiCompare.html#3_Functional_Annotation

Add lab contact info

Could add this to the bottom of the README (and the html reports) so people can attribute the work to our lab.

## Contact
 
### [Neurogenomics Lab](https://www.neurogenomics.co.uk/)

UK Dementia Research Institute  
Department of Brain Sciences  
Faculty of Medicine  
Imperial College London   
[GitHub](https://github.com/neurogenomics)  
[DockerHub](https://hub.docker.com/orgs/neurogenomicslab)  

<br>

Include plot for showing GO enrichments based on genes that the peaks are near

E,g, this plot from Leyla's report

Separate each function into its own file

Currently all functions are wrapped within one file, within one function. It makes code a bit easier to read and find if they're separate into diff functions.

https://github.com/neurogenomics/EpiCompare/blob/master/R/EpiCompare.R

Then, within Rstudio, when you want to open the file where a given function is, simply hold CMD (on a Mac) and click on the function name in your code.

stuff <- myfunc(arg1, arg2)
```

This will automatically find where the function is defined and open the file for you.

Add different cell-type annotation files for chromHMM

Adapt so that it can handle 3+ samples

E.g. for the lines of text below we would need to include a matrix

neurogenomics / epicompare Goto Github PK

epicompare's Introduction

⚖EpiCompare⚖QC and Benchmarking of Epigenomic Datasets

Authors: Sera Choi, Brian Schilder, Leyla Abbasova, Alan Murphy, Nathan Skene

Updated: Mar-08-2023

Introduction

Installation

Standard

All dependencies

Development

Citation

Documentation

⚠️ Note on documentation versioning

Usage

Required Inputs

Optional Inputs

Optional Plots

Other Options

Outputs

Datasets

Session Info

Contact

epicompare's People

Contributors

Stargazers

Watchers

Forkers

epicompare's Issues

Code

Errors

Session info

Example

Produces the following error

ChromHMM

Interactive on/off

Recommend Projects

Recommend Topics

Recommend Org

⚖`EpiCompare`⚖
QC and Benchmarking of Epigenomic Datasets