
Comments (9)

chhylp123 commented on May 26, 2024

Here is an example of A lines:

A	utg000001l	2093	-	SRR11606870.1250244	0	16611	id:i:1250243	HG:A:a
A	utg000001l	3572	-	SRR11606870.2648803	0	17169	id:i:2648802	HG:A:a

For SRR11606870.1250244, 'utg000001l' is the unitig name, 2093 and (3572 - 1) are its start and end positions in utg000001l, '-' is its direction in utg000001l, and 'HG:A:a' is its haplotype label, which is only useful for haplotype-resolved assembly such as trio binning.

As for CIGAR, I personally think most reads map exactly to the unitig, since all reads have been corrected. I'm not sure whether this assumption is sufficient for your project...
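The start/end rule described above (a read's end on the unitig is the next read's start minus 1) can be sketched roughly as follows. This is only an illustration, not hifiasm code: the function name `unitig_spans` is made up here, the input is assumed to be tab-separated A lines, and the last read on each unitig gets `None` as its end because the A lines shown here do not give it.

```python
def unitig_spans(a_lines):
    """Return (unitig, read, start, end) tuples from tab-separated A lines.

    Assumption: the end of a read on the unitig is the start of the next
    read on the same unitig minus 1, as described in the comment above.
    The last read on a unitig has end=None (unknown from A lines alone).
    """
    by_utg = {}
    for line in a_lines:
        fields = line.rstrip("\n").split("\t")
        if fields[0] != "A":
            continue  # skip S, L and other GFA record types
        by_utg.setdefault(fields[1], []).append((int(fields[2]), fields[4]))

    spans = []
    for utg, entries in by_utg.items():
        entries.sort()  # order reads by start position on the unitig
        for (start, read), nxt in zip(entries, entries[1:] + [None]):
            end = nxt[0] - 1 if nxt is not None else None
            spans.append((utg, read, start, end))
    return spans


example = [
    "A\tutg000001l\t2093\t-\tSRR11606870.1250244\t0\t16611\tid:i:1250243\tHG:A:a",
    "A\tutg000001l\t3572\t-\tSRR11606870.2648803\t0\t17169\tid:i:2648802\tHG:A:a",
]
for span in unitig_spans(example):
    print(span)
```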

from hifiasm.

apregier commented on May 26, 2024

Thanks for the quick response - just clarifying, there are 2 A lines above. It looks like they refer to different reads. I assumed that 2093 and 3572 are the start coordinates of alignment on the unitig and 0 is the start coordinate on the read in each line, and 16611 and 17169 are either the end coordinates or the alignment lengths?

I agree the reads should match the contigs almost exactly. For now it is useful enough for me just to pull the coordinates, although if it would be easy to add a CIGAR or, even better, a PAF-style cs tag, that would be really helpful, since I assume some bubbles will be popped, correct? Or would the popped reads just get discarded?


lh3 commented on May 26, 2024

It is technically difficult to output CIGAR because a small fraction of reads are not mapped exactly. We have no plan to deal with those. In addition, everything on the A-line is derived from corrected, not raw reads. CIGARs from corrected reads won't be useful to you anyway.

Or would the popped reads just get discarded?

Popped reads and contained reads are discarded. If you want to recruit reads in a region, you have to redo read alignment against the contigs.


apregier commented on May 26, 2024

Ok, thank you anyway!


chhylp123 commented on May 26, 2024

I agree. Another possible solution is to jointly check p_ctg, a_ctg and r_utg. It will be more accurate in extreme cases, but maybe not as easy as direct alignment.



chhylp123 commented on May 26, 2024

I assumed that 2093 and 3572 are the start coordinates of alignment on the unitig

Yes

and 0 is the start coordinate on the read in each line and 16611 and 17169 are either the end coordinates or the alignment lengths?

No, 16611 and 17169 are just the read lengths, not alignment lengths.
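Putting the column meanings from this thread together, a minimal parser for one A line might look like the sketch below. The class and field names (`ALine`, `utg_start`, `read_len`, and so on) are my own labels, not names used by hifiasm, and the interpretation of each column follows only what is stated in this thread.

```python
from dataclasses import dataclass, field


@dataclass
class ALine:
    unitig: str       # unitig name, e.g. utg000001l
    utg_start: int    # start position of the read on the unitig
    strand: str       # '+' or '-', direction of the read on the unitig
    read_name: str
    read_start: int   # start coordinate on the read (0 in the examples)
    read_len: int     # read length, per the answer above (not alignment length)
    tags: dict = field(default_factory=dict)  # e.g. {'id': 1250243, 'HG': 'a'}


def parse_a_line(line: str) -> ALine:
    """Parse one tab-separated A line into an ALine record."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] != "A":
        raise ValueError("not an A line: " + line)
    tags = {}
    for tag in fields[7:]:
        key, tag_type, value = tag.split(":", 2)
        # 'i' marks an integer tag; everything else is kept as a string
        tags[key] = int(value) if tag_type == "i" else value
    return ALine(fields[1], int(fields[2]), fields[3], fields[4],
                 int(fields[5]), int(fields[6]), tags)


rec = parse_a_line(
    "A\tutg000001l\t2093\t-\tSRR11606870.1250244\t0\t16611\tid:i:1250243\tHG:A:a"
)
print(rec.read_name, rec.utg_start, rec.read_len, rec.tags)
```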


zhenzhenyang-psu commented on May 26, 2024

Hi,
I wonder what the number 1250243 after "id:i:" in "id:i:1250243" means. Thanks!


chhylp123 commented on May 26, 2024

It is the read ID, which is only useful for hifiasm itself.


zhenzhenyang-psu commented on May 26, 2024

I see. Thanks!

