Comments (4)
Hi Sunta3louxos,
CSAW
runs the CSAW.R and DB.R R scripts. Briefly, the genome is segmented into windows of a defined size, only windows overlapping the union of all peaks are retained, read counts are collected per BAM file, and a window-level differential binding analysis is run. A further step then combines adjacent windows into differentially bound (DB) regions using the associated window-level values, and emits final "aggregated" p-values and log-fold changes. An FDR threshold is applied. The list of differentially bound regions is then split into "UP", "DOWN" and "MIXED" based on the aggregated log2FC.
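To make the windowing idea concrete, here is a minimal Python sketch of the three geometric steps described above: tiling, keeping windows that overlap the peak union, and merging adjacent windows into regions. It is only an illustration and not the code in CSAW.R. The function names, the non-overlapping tiling, the `max_gap` parameter and the toy coordinates are all assumptions, and the statistics (read counting, testing, combining p-values) are left out entirely.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome


def make_windows(chrom_len: int, width: int) -> List[Interval]:
    """Tile a chromosome with non-overlapping windows of a fixed width."""
    return [(s, min(s + width, chrom_len)) for s in range(0, chrom_len, width)]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def keep_windows_in_peaks(windows: List[Interval], peaks: List[Interval]) -> List[Interval]:
    """Retain only windows that overlap at least one peak from the union."""
    return [w for w in windows if any(overlaps(w, p) for p in peaks)]


def merge_adjacent(windows: List[Interval], max_gap: int = 0) -> List[Interval]:
    """Merge windows whose gap to the previous region is at most max_gap."""
    merged: List[Interval] = []
    for start, end in sorted(windows):
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Toy example: a 1 kb "chromosome", 100 bp windows, two peaks.
windows = make_windows(chrom_len=1000, width=100)
peaks = [(120, 260), (700, 750)]
kept = keep_windows_in_peaks(windows, peaks)
regions = merge_adjacent(kept)
print(kept)     # windows that touch a peak
print(regions)  # neighbouring windows collapsed into regions
```

In this toy case the two windows under the first peak collapse into one region, while the window under the second peak stays on its own.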
get_nearest_transcript
takes those "UP" and "DOWN" DB regions and uses bedtools to annotate each region with the nearest transcript. Both ends of a transcript, i.e. TSS and TES, are treated equally.
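As a rough illustration of what "both ends treated equally" means, the Python sketch below picks the nearest transcript by the gap between intervals, ignoring strand. It does not call bedtools, and its distance convention may differ from the one bedtools reports. The transcript IDs, coordinates and function names are made up for the example.

```python
from typing import List, Tuple

Transcript = Tuple[str, int, int]  # (transcript_id, start, end), half-open


def interval_distance(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Return 0 for overlapping intervals, otherwise the gap between them in bases."""
    if a_start < b_end and b_start < a_end:
        return 0
    if b_start >= a_end:
        return b_start - a_end  # b lies downstream of a
    return a_start - b_end      # b lies upstream of a


def nearest_transcript(region: Tuple[int, int], transcripts: List[Transcript]) -> Transcript:
    """Pick the transcript with the smallest gap to the region, ignoring strand."""
    r_start, r_end = region
    return min(
        transcripts,
        key=lambda t: interval_distance(r_start, r_end, t[1], t[2]),
    )


# Hypothetical transcripts on one chromosome.
transcripts = [("tx_A", 1000, 5000), ("tx_B", 9000, 12000)]
region = (5200, 5400)
print(nearest_transcript(region, transcripts))
```

Here the region sits 200 bp past the right-hand end of tx_A and 3600 bp before the left-hand end of tx_B, so tx_A is chosen whether that end is its TSS or its TES.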
get_nearest_gene
annotates the above result with gene ID and gene symbol.
calc_matrix_cov_CSAW
computes deepTools matrix on coverage bigwigs overlapping the "UP" and "DOWN" regions. This is unrelated to annotation with nearest transcript and gene.
plot_heatmap_cov_CSAW
plots the heatmap of coverage matrix calculated in the previous step, for "UP" and "DOWN" regions, respectively.
Rule "all" is just the usual "rule all" in snakemake: once all the required targets have been generated by the preceding rules, it is marked as completed.
The peaks in the report come from the FDR-filtered and direction-split (UP/DOWN) BED files produced by CSAW, again irrespective of the annotation with nearest transcript and gene.
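For illustration, the FDR filtering and direction split can be sketched in Python as below. This is not the code in DB.R. The `Region` fields, the 0.05 threshold and the example values are assumptions, and the "MIXED" category is left out because the criterion for it is not spelled out here.

```python
from typing import Dict, List, NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int
    log2fc: float  # aggregated log2 fold change of the region
    fdr: float     # adjusted p-value of the region


def split_by_direction(regions: List[Region], fdr_threshold: float = 0.05) -> Dict[str, List[Region]]:
    """Keep regions passing the FDR threshold and split them by the sign of log2FC."""
    out: Dict[str, List[Region]] = {"UP": [], "DOWN": []}
    for r in regions:
        if r.fdr > fdr_threshold:
            continue  # not significant, dropped from both lists
        if r.log2fc > 0:
            out["UP"].append(r)
        elif r.log2fc < 0:
            out["DOWN"].append(r)
    return out


# Made-up example regions.
regions = [
    Region("chr1", 100, 300, 1.8, 0.01),
    Region("chr1", 700, 800, -2.1, 0.03),
    Region("chr2", 50, 150, 0.9, 0.20),
]
split = split_by_direction(regions)
print(len(split["UP"]), len(split["DOWN"]))
```

Each of the two lists would then be written to its own BED file, which is what the heatmaps and the report use.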
Hope this helps,
Best wishes,
Katarzyna
from snakepipes.
I noticed that in CSAW.R the total set is subsampled using the MACS2 peaks.
I could not tell whether this happens prior to differential peak calling. In my opinion it would be nice to also output the total set of identified differentially bound sites. We could then use other peak callers to further filter the differentially bound sites.
I hope this makes sense.
If I understand correctly, this is about whitelisting arbitrary genomic intervals, keeping only those that intersect the union of MACS2 peaks.
This helps reduce the number of multiple tests in the end, so that the statistics are run only on windows that had evidence of binding in at least one sample.
CSAW is executed after peak calling and doesn't influence MACS2 results.
For every peak caller that you choose, there will be an instance of CSAW run, taking the respective peaks as input.
You can still do any intersections you wish manually.
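To show why fewer tests help, here is a small Python sketch of the Benjamini-Hochberg adjustment applied to the same three "true" windows, once among 1000 tests and once among 50. It is a toy illustration and not what CSAW.R does internally. The p-values, the evenly spaced "null" windows and the function names are assumptions.

```python
from typing import List


def bh_adjust(pvalues: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def null_pvalues(n: int) -> List[float]:
    """Evenly spaced stand-ins for windows without differential binding."""
    return [0.1 + 0.9 * (k + 1) / n for k in range(n)]


signal = [0.001, 0.004, 0.01]  # hypothetical truly changed windows

# Testing every window genome-wide versus only windows under peaks.
adj_genome = bh_adjust(signal + null_pvalues(997))
adj_whitelist = bh_adjust(signal + null_pvalues(47))

print(adj_genome[:3])
print(adj_whitelist[:3])
```

The raw p-values of the three windows are identical in both runs, but their adjusted values are smaller when only the whitelisted windows are tested.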
Thank you,
So you subsample the differential peaks found by CSAW using MACS, or any other peak caller. Is this correct?