
cancer-hic-norm's People

Contributors

nservant

cancer-hic-norm's Issues

cnv source

Hi, I am interested in applying LOIC and CAIC balancing to my Hi-C data. Can I use copy number information from a long-read CNV analysis instead of extracting it from the Hi-C data?
Thank you

changes for droso

Hi Nicolas,

Thank you for the response.

I was able to get LOIC working using the existing codebase in the GitHub repository.

My reference organism is Drosophila, so I had to modify the code a bit to get it working.

I have some notes on the modifications, in case you would like to incorporate them into the package to make it more general for other users.

cghseg, a CRAN R package used by the run_seg function in lib_cnv_hic, is no longer available on CRAN, so users need to download it from the CRAN archive.

ChrNumeric, a function in the GLAD package that converts non-numeric chromosome names to numeric ones, only accepts human or mouse chromosome names. I had to replace the function within the GLAD namespace to make it work with Drosophila.

When running cnv_ice.py, the script expected symmetric matrices, with i and j given as numeric names.
> But the examples provided with HiC-Pro (https://github.com/nservant/HiC-Pro/blob/master/doc/MANUAL.md) suggest that the names of the genome intervals should be in character format. So, after running cnv_ice.py for the first time, I had to reconvert and reimport my Hi-C data.
> The importC function in annotate_hicdata.R forces symmetry on the HiTC object. It would be better if the symmetric matrix were written to disk after importing, so that users can pass the same matrix to cnv_ice.py.

Within the Python package, iced/normalization/__init__.py, line 74, assumes that the CNV bias vector provided contains missing values (0). If it does not, the package raises an error at X.sum() because subsequent calls are made on an empty array.
> It would be better to add an if any(rows_to_remove): check at line 75 to test for this case before doing the operation.
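The guard suggested in the last note can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the code in the iced package: the function name zero_out_missing_bins and the list-of-lists matrix are hypothetical and chosen only for the example. The point is that the matrix is only modified when at least one bin has a CNV bias of 0.

```python
def zero_out_missing_bins(counts, cnv_bias):
    """Return a copy of `counts` with rows/columns of zero-bias bins set to 0.

    Hypothetical helper for illustration only (not part of iced).
    `counts` is a square list of lists; `cnv_bias` has one value per bin.
    """
    if len(counts) != len(cnv_bias):
        raise ValueError("cnv_bias must have one entry per matrix row")

    # A bias of 0 marks a bin as missing.
    rows_to_remove = [b == 0 for b in cnv_bias]
    result = [list(row) for row in counts]

    # Guard: skip the filtering step entirely when no bin is flagged,
    # so nothing downstream ever operates on an empty selection.
    if not any(rows_to_remove):
        return result

    for i, remove_i in enumerate(rows_to_remove):
        for j, remove_j in enumerate(rows_to_remove):
            if remove_i or remove_j:
                result[i][j] = 0
    return result


counts = [[4, 2, 1], [2, 6, 3], [1, 3, 5]]
unchanged = zero_out_missing_bins(counts, [1.0, 2.0, 1.5])  # no missing bins
filtered = zero_out_missing_bins(counts, [1.0, 0, 1.5])     # bin 1 is missing
```

With a bias vector that has no zeros, the function returns the counts untouched instead of failing.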

rs.seg.gr$cnv from segment_hic_data.R and interpreting final output bed file

Hello,

I am trying to use cancer-hic-norm to infer copy number information from the Hi-C data. For the time being, I am just trying to infer the copy number data and not normalize my Hi-C data according to it.

In the script cancer-hic-norm-master/cnv_from_hic/segment_hic_data.R, the part that plots p2 builds the following data frame.

dat <- data.frame(chr=as.vector(seqnames(rs.seg.gr)), pos = xpos, counts.cor = rs.seg.gr$counts.cor, smt = rs.seg.gr$smt, cn=rs.seg.gr$cnv)

However, this results in an error for me because rs.seg.gr does not have a column called cnv. If I remove that part of the command (cn=rs.seg.gr$cnv), the script runs fine. The output file seems to use only the smt column, so I think this should be okay, but I wanted to make sure I'm not discarding critical information. At which step is the cnv column supposed to be added to the rs.seg.gr object?

Also, I wasn't sure how to interpret the final output bed file, which has copy number values in the 4th column. Is this a log2 ratio relative to the average number of reads per bin across the genome? For example, if the 4th column is 2 for a bin of interest, does that mean the bin shows twice as many reads as the genome average after appropriate normalization and smoothing of the signal?

Thank you!
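For reference, the two possible readings of a value of 2 in the 4th column differ by a factor of two. The sketch below only works through that arithmetic; it does not establish which scale segment_hic_data.R actually writes, and the function names are made up for the example.

```python
import math


def fold_change_if_log2(value):
    """Fold change over the genome average if `value` is a log2 ratio."""
    return 2.0 ** value


def log2_if_fold_change(value):
    """Equivalent log2 ratio if `value` is already a linear ratio."""
    return math.log2(value)


value = 2.0
as_log2 = fold_change_if_log2(value)    # value read as log2 ratio: 4x the average
as_linear = log2_if_fold_change(value)  # value read as linear ratio: log2 ratio of 1
```

So a 2 would mean four times the average signal on a log2 scale, and twice the average on a linear scale.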

CAIC normalisation very slow for high resolution

Do you have any recommendations for speeding up the CAIC normalisation for high resolution matrices?

I have called the CNVs and am supplying the seg.bed file along with the abs.bed and .matrix files to the ice_cnv.py script.

It works well and quite quickly at 1 Mb and 500 kb, but at 100 kb it is almost prohibitively slow. I would like to run the normalisation at 40 kb, but that would be impossible at the current speed.

LOIC runs pretty fast; the rate-limiting step is estimating the CNV bias.
