mourisl commented on August 24, 2024

Chromap's Hi-C preset does not remove PCR duplicates. This is documented at the bottom of the manual page: https://zhanghaowen.com/chromap/chromap.html

If you want to remove the duplicates, you can add the option "--remove-pcr-duplicates".
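
For anyone scripting this, here is a minimal sketch of how the command line could be assembled with the extra flag. The helper name `chromap_hic_command` and all file names (`index`, `ref.fa`, `read1.fq`, `read2.fq`, `aln.pairs`) are placeholders chosen for illustration; only the flags themselves are chromap options.

```python
import shlex


def chromap_hic_command(index, reference, reads1, reads2, output, dedup=True):
    """Build a chromap Hi-C command as an argument list (paths are placeholders)."""
    cmd = ["chromap", "--preset", "hic"]
    if dedup:
        # The Hi-C preset does not deduplicate, so the flag is added explicitly.
        cmd.append("--remove-pcr-duplicates")
    cmd += ["-x", index, "-r", reference, "-1", reads1, "-2", reads2, "-o", output]
    return cmd


cmd = chromap_hic_command("index", "ref.fa", "read1.fq", "read2.fq", "aln.pairs")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` if chromap is on the PATH.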

jimwry commented on August 24, 2024

Great, thank you so much!

mourisl commented on August 24, 2024

I just want to mention one of the main reasons we did not make PCR deduplication the default behavior. Deduplication is based on alignment coordinates, and we do not keep track of the internal alignment breakpoint (the ligation site) in Hi-C data. So if two read pairs have the same endpoints but different ligation sites, one of them will still be removed in the deduplication step.
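
To make the caveat concrete, here is a small sketch of endpoint-based deduplication. It is an illustration of the idea only, not chromap's actual implementation: the `ReadPair` type, the `ligation_site` field, and the coordinates are all made up, and strand is ignored for simplicity.

```python
from typing import NamedTuple


class ReadPair(NamedTuple):
    chrom1: str
    pos1: int
    chrom2: str
    pos2: int
    ligation_site: int  # hypothetical field; not tracked during deduplication


def dedup_by_endpoints(pairs):
    """Keep the first pair seen for each (chrom1, pos1, chrom2, pos2) key."""
    seen = set()
    kept = []
    for p in pairs:
        key = (p.chrom1, p.pos1, p.chrom2, p.pos2)  # ligation site is not in the key
        if key not in seen:
            seen.add(key)
            kept.append(p)
    return kept


pairs = [
    ReadPair("chr1", 1000, "chr1", 50000, ligation_site=1080),
    # Different molecule (different ligation site), but identical endpoints:
    ReadPair("chr1", 1000, "chr1", 50000, ligation_site=1120),
    ReadPair("chr1", 2000, "chr2", 7000, ligation_site=2050),
]
kept = dedup_by_endpoints(pairs)
print(len(kept))  # 2: the second pair is dropped although it is not a PCR duplicate
```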

jimwry commented on August 24, 2024

Thank you for your detailed explanation. That makes sense. I just want to clarify: does chromap treat a read pair as a duplicate only when both of its ends match those of another pair? Thanks!

conchoecia commented on August 24, 2024

Just following up to see if we can get an answer to @jimwry's previous question. I'm working on integrating chromap with https://github.com/c-zhou/yahs. Thanks!

mourisl commented on August 24, 2024

Sorry, I missed that question. Yes, for read pairs in the pairs output format we only consider the endpoints in deduplication, so both ends must match. For single-end reads, only the start site on the reference genome is considered.
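
As a rough sketch of what that answer implies, the comparison keys could look like the following. The function names and the tuple layouts are assumptions made for this example and do not reflect chromap's internals.

```python
def pair_key(pair):
    """Key for a read pair given as ((chrom1, pos1), (chrom2, pos2)): both ends count."""
    (chrom1, pos1), (chrom2, pos2) = pair
    return (chrom1, pos1, chrom2, pos2)


def single_end_key(read):
    """Key for a single-end read given as (chrom, start, end): only the start counts."""
    chrom, start, _end = read
    return (chrom, start)


a = (("chr1", 1000), ("chr1", 50000))
b = (("chr1", 1000), ("chr1", 60000))  # shares only the first end with a
c = (("chr1", 1000), ("chr1", 50000))  # shares both ends with a

print(pair_key(a) == pair_key(b))  # False: one matching end is not enough
print(pair_key(a) == pair_key(c))  # True: both ends match, treated as duplicates

r1 = ("chr3", 500, 650)
r2 = ("chr3", 500, 600)  # same start, different end
print(single_end_key(r1) == single_end_key(r2))  # True: same start site
```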
