
Comments (4)

katsikora commented on July 29, 2024

Hi Sunta3louxos,

CSAW runs the CSAW.R and DB.R R scripts. Briefly, the genome is segmented into windows of a defined size, only the windows overlapping the union of all peaks are retained, read counts are collected per BAM file, and a window-level differential binding analysis is run. A subsequent step combines adjacent windows into differentially bound regions based on their associated values and emits the final "aggregated" p-values and log-fold changes. An FDR threshold is then applied. Finally, the list of differentially bound regions is split into "UP", "DOWN" and "MIXED" based on the aggregated log2FC.
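
To make the merge, FDR and direction-split steps above more concrete, here is a minimal Python sketch. It is not a transcript of CSAW.R or DB.R. It assumes Simes' method for combining window p-values and Benjamini-Hochberg for the FDR, and it uses a simplified, hypothetical direction rule (all windows up, all down, otherwise mixed). The names `Window`, `merge_adjacent`, `call_regions` and the `max_gap` parameter are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Window:
    chrom: str
    start: int
    end: int
    pvalue: float   # window-level p-value from the differential binding test
    log2fc: float   # window-level log2 fold change


def merge_adjacent(windows, max_gap=0):
    """Group windows on the same chromosome whose gap is at most max_gap bp."""
    regions = []
    region_end = None
    for w in sorted(windows, key=lambda w: (w.chrom, w.start)):
        same_region = (
            regions
            and regions[-1][-1].chrom == w.chrom
            and w.start - region_end <= max_gap
        )
        if same_region:
            regions[-1].append(w)
            region_end = max(region_end, w.end)
        else:
            regions.append([w])
            region_end = w.end
    return regions


def simes(pvalues):
    """Combine the window p-values of one region (Simes' method, assumed here)."""
    n = len(pvalues)
    return min(p * n / rank for rank, p in enumerate(sorted(pvalues), start=1))


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def direction(region):
    """Hypothetical rule: UP if every window has log2fc > 0, DOWN if none does."""
    signs = {w.log2fc > 0 for w in region}
    if signs == {True}:
        return "UP"
    if signs == {False}:
        return "DOWN"
    return "MIXED"


def call_regions(windows, fdr=0.05, max_gap=0):
    """Merge windows, combine p-values, apply the FDR cut-off, split by direction."""
    regions = merge_adjacent(windows, max_gap)
    padj = benjamini_hochberg([simes([w.pvalue for w in r]) for r in regions])
    result = {"UP": [], "DOWN": [], "MIXED": []}
    for region, q in zip(regions, padj):
        if q <= fdr:
            span = (region[0].chrom, region[0].start, max(w.end for w in region))
            result[direction(region)].append(span)
    return result
```

Calling `call_regions(list_of_windows)` returns a dictionary with one list of `(chrom, start, end)` tuples per direction, which mirrors the "UP", "DOWN" and "MIXED" output files only in spirit.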

get_nearest_transcript takes those "UP" and "DOWN" DB regions and uses bedtools to annotate each region with the nearest transcript. Both ends of a transcript, i.e. TSS and TES, are treated equally.
get_nearest_gene annotates the above result with gene ID and gene symbol.
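
To illustrate what "both ends are treated equally" means in practice, here is a small pure-Python stand-in for the nearest-feature lookup. It is only a sketch and not the bedtools command the pipeline runs. The function names, the tuple layouts and the `tx_to_gene` mapping are assumptions made for this example, and the distance is a plain gap in bp, which may not match bedtools' own distance convention.

```python
def interval_gap(a_start, a_end, b_start, b_end):
    """Gap in bp between two half-open intervals; 0 if they overlap."""
    if a_end <= b_start:
        return b_start - a_end
    if b_end <= a_start:
        return a_start - b_end
    return 0


def nearest_transcript(region, transcripts):
    """Return (transcript_id, gap) for the closest transcript on the same chromosome.

    Strand is ignored on purpose: the gap is measured to whichever transcript
    edge is closer, so a nearby TES counts just as much as a nearby TSS.
    """
    chrom, start, end = region
    best = None
    for t_chrom, t_start, t_end, _strand, t_id in transcripts:
        if t_chrom != chrom:
            continue
        gap = interval_gap(start, end, t_start, t_end)
        if best is None or gap < best[1]:
            best = (t_id, gap)
    return best


def annotate_with_gene(hit, tx_to_gene):
    """Add gene ID and gene symbol to a (transcript_id, gap) hit."""
    t_id, gap = hit
    gene_id, symbol = tx_to_gene[t_id]
    return {"transcript": t_id, "gap": gap, "gene_id": gene_id, "symbol": symbol}


# Made-up example data: tx3 is on the minus strand, so its left edge is its TES.
transcripts = [
    ("chr1", 100, 900, "+", "tx1"),
    ("chr1", 1500, 3000, "+", "tx2"),
    ("chr1", 1250, 2000, "-", "tx3"),
]
tx_to_gene = {"tx1": ("g1", "GeneA"), "tx2": ("g2", "GeneB"), "tx3": ("g3", "GeneC")}
hit = nearest_transcript(("chr1", 1000, 1200), transcripts)
annotated = annotate_with_gene(hit, tx_to_gene)
```

In this toy example the region is assigned to `tx3` because its TES is the closest transcript edge, even though the TSS of `tx3` lies much further away.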

calc_matrix_cov_CSAW computes a deepTools matrix from the coverage bigWigs over the "UP" and "DOWN" regions. This is unrelated to the annotation with nearest transcript and gene.
plot_heatmap_cov_CSAW plots a heatmap of the coverage matrix calculated in the previous step, for the "UP" and "DOWN" regions respectively.

Rule "all" is just "rule all" in snakemake: once all the required targets have been generated by the preceding rules, it is marked as completed.

The peaks in the report come from the FDR-filtered and direction-split (UP/DOWN) bed files produced by CSAW, again irrespective of the annotation with nearest transcript and gene.

Hope this helps,
Best wishes,

Katarzyna

from snakepipes.

sunta3iouxos commented on July 29, 2024

I noticed that CSAW.R sub-samples the total using the MACS2 peaks.
I could not tell whether this happens before differential peak calling. In my opinion it would be nice to also output all identified differentially bound sites. We could then use other peak callers to further filter the differentially bound sites.
I hope this makes sense.


katsikora commented on July 29, 2024

If I understand correctly, this is about whitelisting: out of all genomic windows, only those intersecting the union of MACS2 peaks are kept.
This helps reduce the number of multiple tests in the end, so that the statistics are run only on windows that had evidence of binding in at least one sample.
CSAW is executed after peak calling and doesn't influence MACS2 results.
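
A toy sketch of this whitelisting idea, in case it helps: windows are tiled across the genome, the peaks from all samples are merged into one union, and only windows overlapping that union go on to testing. This is an illustration under simplifying assumptions, not the code in CSAW.R. The function names, the window width and the coordinates are all made up.

```python
def tile_genome(chrom_sizes, width):
    """Yield fixed-width, non-overlapping windows across each chromosome."""
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, width):
            yield (chrom, start, min(start + width, size))


def union_of_peaks(peak_sets):
    """Merge the peaks of all samples into one sorted, non-overlapping set."""
    all_peaks = sorted(p for peaks in peak_sets for p in peaks)
    merged = []
    for chrom, start, end in all_peaks:
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def whitelist_windows(windows, merged_peaks):
    """Keep only windows that overlap at least one merged peak."""
    kept = []
    for chrom, start, end in windows:
        if any(c == chrom and start < p_end and p_start < end
               for c, p_start, p_end in merged_peaks):
            kept.append((chrom, start, end))
    return kept


# Made-up example: one small chromosome and peaks from two samples.
windows = list(tile_genome({"chr1": 1000}, width=100))
sample_a = [("chr1", 150, 260)]
sample_b = [("chr1", 240, 310), ("chr1", 900, 950)]
merged = union_of_peaks([sample_a, sample_b])
tested = whitelist_windows(windows, merged)
```

Only the windows in `tested` would be passed to the statistics, so the multiple-testing correction is applied to far fewer hypotheses than the full tiling would give. The filtering happens before the differential test, not on its results.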

For every peak caller that you choose, there will be an instance of CSAW run, taking the respective peaks as input.
You can still do any intersections you wish manually.


sunta3iouxos commented on July 29, 2024

> For every peak caller that you choose, there will be an instance of CSAW run, taking the respective peaks as input.
> You can still do any intersections you wish manually.

Thank you,
So you sub-sample the differential peaks found by CSAW using MACS or any other peak caller. Is this correct?

