
aligner-testbed's Introduction

About me

I'm a bioinformatician with a computational biology PhD from ETH Zürich. I have worked at PacBio since 2014 as a full-time remote Principal Software Engineer and Associate Director Bioinformatics, and I am responsible for most of PacBio's analysis tools. My day-to-day business is working on sequence analysis algorithms in close-to-the-hardware C++, designing software architectures, tech-leading, and Agile processes. You can get my attention if you work on specialized bioinformatics algorithms, software architectures, or bare metal.

You can find more info on ArminToepfer.com, LinkedIn, or Twitter.

Get my public key: curl https://keybase.io/armintoepfer/pgp_keys.asc | gpg --import

PacBio Tool Highlights

PhD Tool Highlights

aligner-testbed's People

Contributors

armintoepfer


aligner-testbed's Issues

Use of kalloc in miniwfa, etc

I saw someone forked miniwfa, so I followed that and came here :-) It is recommended to initialize km = km_init() once per thread and then free it with km_destroy() close to the end of the thread (I should document this). In your code, call km_init() before for(i) at this line and call km_destroy() after the loop. Miniwfa allocates from the heap frequently and relies on kalloc to reduce the overhead. Without kalloc, it can be ~30% slower. The slowdown may be more noticeable when you align different pairs in multiple threads.
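The per-thread pattern described above could look roughly like the fragment below. It is only a sketch: km_init() and km_destroy() are the kalloc calls named in this comment, while align_pair(), worker(), and their parameters are hypothetical placeholders for whatever wraps the miniwfa call in the testbed. It needs kalloc.h from miniwfa (or klib) to build.

```c
#include "kalloc.h" /* kalloc header; assumed to be on the include path */

/* Hypothetical wrapper around the actual miniwfa alignment call.
 * It is NOT part of miniwfa; it only stands in for the loop body. */
void align_pair(void *km, const char *target, const char *query);

/* Hypothetical per-thread worker: one kalloc pool for all pairs. */
void worker(const char **targets, const char **queries, int n_pairs)
{
    void *km = km_init();           /* once per thread, before the loop */
    for (int i = 0; i < n_pairs; ++i)
        align_pair(km, targets[i], queries[i]);
    km_destroy(km);                 /* once, after the loop */
}
```

The point of the design is that the pool outlives individual alignments, so repeated small allocations inside the aligner are served from memory the thread already owns.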

WFA2-lib uses a heuristic by default (see smarco/WFA2-lib#7). You need to add attributes.heuristic.strategy = wf_heuristic_none if you want to use the exact algorithm. Miniwfa doesn't support any heuristics, as I don't need them for now. They are not hard to implement.
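In context, the setting could be applied as in the fragment below. Only the attributes.heuristic.strategy assignment comes from this comment; the header path, the default-attributes object, and the constructor/destructor names are assumptions based on the WFA2-lib README and should be checked against the version you build with. make_exact_aligner() is a hypothetical helper name.

```c
#include "wavefront/wavefront_align.h" /* assumed WFA2-lib header path */

/* Hypothetical helper: build an aligner with the heuristic disabled. */
wavefront_aligner_t *make_exact_aligner(void)
{
    /* Start from the library defaults (assumed name from the README). */
    wavefront_aligner_attr_t attributes = wavefront_aligner_attr_default;
    /* Switch off the default heuristic to get the exact algorithm. */
    attributes.heuristic.strategy = wf_heuristic_none;
    return wavefront_aligner_new(&attributes);
}
```

The caller would release the aligner with wavefront_aligner_delete() when done, again assuming the README's naming.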

BTW:

1) I saw you mentioned banding on Twitter. I actually like fixed bandwidth. Fixed banding is predictable. We can often get a good sense of whether the bandwidth is enough by looking at the CIGAR, or even before getting the alignment. We can then double the bandwidth when we worry it is too small. Such a band doubling heuristic would be at most twice as slow in comparison to optimal banding. Bwa-mem has a version of this. With wavefront reduction, it is hard to tell when the heuristic fails. And when the heuristic fails, we may need more time to do the alignment, as the score would be higher.

2) I am not sure if WFA works for spliced alignment in its current form. WFA assumes that anything other than an exact match incurs a positive penalty. In spliced alignment, however, intron extension doesn't increase the score. (EDIT: we can make intron extension incur a cost, though I am not sure about the effect of that.) I guess there should be a way to make WFA work, but even in that case, WFA may need to fill most of the DP matrix and lose its key advantage over NW.
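To make the band doubling idea from point 1 concrete, here is a minimal sketch using unit-cost edit distance rather than the affine-gap scoring a real aligner would use. It is not Bwa-mem's implementation; the function names and the stopping rule are assumptions chosen for illustration. The stopping rule relies on a property of unit costs: if the banded distance d satisfies d <= w, no optimal path can leave the band |i - j| <= w, so d is exact.

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Unit-cost edit distance restricted to cells with |i - j| <= w.
 * Returns INT_MAX if the end cell (n, m) lies outside the band. */
int banded_edit_distance(const char *a, int n, const char *b, int m, int w)
{
    if (abs(n - m) > w) return INT_MAX;
    const int INF = INT_MAX / 2;          /* safe to add 1 without overflow */
    int *prev = malloc((size_t)(m + 1) * sizeof(int));
    int *curr = malloc((size_t)(m + 1) * sizeof(int));
    for (int j = 0; j <= m; ++j) prev[j] = (j <= w) ? j : INF;
    for (int i = 1; i <= n; ++i) {
        int lo = i - w > 0 ? i - w : 0;   /* first column inside the band */
        int hi = i + w < m ? i + w : m;   /* last column inside the band */
        if (lo > 0) curr[lo - 1] = INF;   /* left sentinel, hides stale data */
        for (int j = lo; j <= hi; ++j) {
            if (j == 0) { curr[0] = i; continue; }
            int best = prev[j - 1] + (a[i - 1] != b[j - 1]); /* (mis)match */
            if (prev[j] + 1 < best) best = prev[j] + 1;      /* gap in b */
            if (curr[j - 1] + 1 < best) best = curr[j - 1] + 1; /* gap in a */
            curr[j] = best;
        }
        if (hi < m) curr[hi + 1] = INF;   /* right sentinel for the next row */
        int *tmp = prev; prev = curr; curr = tmp;
    }
    int d = prev[m];
    free(prev);
    free(curr);
    return d;
}

/* Start with a narrow band and double it until the result is provably
 * exact (d <= w) or the band covers the whole matrix. The band width
 * that was finally used is written to *final_w. */
int edit_distance_band_doubling(const char *a, const char *b, int *final_w)
{
    int n = (int)strlen(a), m = (int)strlen(b);
    int longest = n > m ? n : m;
    int w = abs(n - m) > 1 ? abs(n - m) : 1;
    for (;;) {
        int d = banded_edit_distance(a, n, b, m, w);
        if (d <= w || w >= longest) {
            *final_w = w;
            return d;
        }
        w *= 2;
    }
}
```

Calling edit_distance_band_doubling("kitten", "sitting", &w) tries bands of width 1 and 2, finds the distance larger than the band each time, and stops at width 4. With affine-gap scores the "is the band wide enough" test has to be derived from the score and gap penalties instead of a plain d <= w comparison.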
