
Comments (4)

lh3 commented on May 28, 2024

The recommended practice is to stream FASTQ into the aligner and carry the annotations over to the output SAM with -C.

from bwa-mem2.

eboyden commented on May 28, 2024

Illumina's bcl2fastq puts UMIs in the main read header, not in the comment. Unfortunately, most software, including Picard and fgbio, can only parse UMIs from SAM tags, and I have not found a common tool that moves them from read headers into tags (umi-tools claims to have this functionality, but it doesn't work with PE data).

I contacted the Samtools and Picard teams to suggest adding support for parsing UMIs in read headers and moving them into SAM tags. Both teams said that instead of using bcl2fastq I should convert BCL to uBAM so that the UMIs go into SAM tags directly; from there I can stream the uBAM to FASTQ, align it, and then combine the mapped and unmapped BAMs to reconstitute the tag metadata.

It would be simpler to align uBAMs directly, without having to stream FASTQs or combine BAMs. Like I said, I realize this isn't where the focus currently is, but it would be nice to add when time allows.

lh3 commented on May 28, 2024

You can copy tags to fastq with "samtools fastq -T" and then bwa mem can copy these tags to output SAM with "-C". That is one pass. You don't need to "reconstitute the tag metadata".
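As a rough sketch of the one-pass approach described above, the two commands can be assembled and piped together from Python. The file names `unmapped.bam` and `ref.fa`, the choice of the `RX` tag, the thread count, and the helper names are all illustrative assumptions, as is the use of `-p` (interleaved paired-end input) and `-` for standard input; adjust them to your data.

```python
import shlex
import subprocess


def one_pass_commands(ubam, ref, tags=("RX",), threads=4):
    """Build argv lists for: samtools fastq -T ... | bwa-mem2 mem -C ...

    Hypothetical helper; only the -T and -C options come from the thread.
    """
    # -T copies the listed SAM tags into the FASTQ header comment.
    fastq_cmd = ["samtools", "fastq", "-T", ",".join(tags), ubam]
    # -C appends the FASTQ comment to each SAM record.
    # -p (assumed here) treats stdin as interleaved paired-end reads.
    align_cmd = ["bwa-mem2", "mem", "-p", "-C", "-t", str(threads), ref, "-"]
    return fastq_cmd, align_cmd


def as_shell(fastq_cmd, align_cmd):
    """Render the pipeline as a single shell command line for inspection."""
    return shlex.join(fastq_cmd) + " | " + shlex.join(align_cmd)


def run(fastq_cmd, align_cmd, out_sam):
    """Run both commands connected by a pipe and write SAM to out_sam."""
    with open(out_sam, "wb") as out:
        producer = subprocess.Popen(fastq_cmd, stdout=subprocess.PIPE)
        consumer = subprocess.Popen(align_cmd, stdin=producer.stdout, stdout=out)
        # Close our copy so the producer sees a broken pipe if the aligner exits.
        producer.stdout.close()
        consumer.wait()
        producer.wait()
    return producer.returncode, consumer.returncode
```

Calling `as_shell(*one_pass_commands("unmapped.bam", "ref.fa"))` shows the command line that would be run, which is a cheap way to check the flags before launching a long alignment.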

You can alternatively copy UMIs to individual fastq records while streaming.
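One possible way to do this while streaming is a small filter that rewrites each FASTQ header, sketched below. It assumes four-line FASTQ records and the bcl2fastq layout mentioned earlier in the thread, where the UMI is the last colon-separated field of the read name; the function names and the choice of the `RX` tag are illustrative. The original Illumina comment is replaced, because the aligner's -C option copies the comment verbatim and it would not be a valid SAM tag.

```python
def umi_to_comment(header, tag="RX"):
    """Rewrite '@name:...:UMI 1:N:0:IDX' as '@name:...:UMI<TAB>RX:Z:UMI'.

    Assumes the UMI is the last colon-separated field of the read name.
    """
    # The read name is everything before the first whitespace.
    name = header.rstrip("\n").split(None, 1)[0]
    umi = name.rsplit(":", 1)[-1]
    return f"{name}\t{tag}:Z:{umi}"


def stream(fin, fout, tag="RX"):
    """Copy FASTQ from fin to fout, rewriting every header line."""
    for i, line in enumerate(fin):
        if i % 4 == 0:  # header line of a four-line record
            fout.write(umi_to_comment(line, tag) + "\n")
        else:           # sequence, '+' separator, and qualities pass through
            fout.write(line)
```

Calling `stream(sys.stdin, sys.stdout)` from a script (after `import sys`) lets this sit between the FASTQ source and an aligner run with -C, so the tag ends up on every output record.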

matthdsm commented on May 28, 2024

Hi @lh3

Do you mean that using samtools fastq -T followed by bwa-mem2 mem -C is equivalent to mapping and then merging the uBAM with the aligned BAM?
If so, that seems preferable to a merge step at the end.

Thanks
M
