Comments (21)

arun-sub commented on May 28, 2024

I faced the same issue with AVX512 and have included some test datasets and fixes in #51. Could you please take a look and let me know if they work for you?

from bwa-mem2.

keiranmraine commented on May 28, 2024

@arun-sub thanks, we started looking into this early last week and didn't see your new PR when I raised it. Will look into testing this.

keiranmraine commented on May 28, 2024

@arun-sub unfortunately your patch alone did not solve the instance we were investigating. @mp15 will be raising a PR with an extension to your work which is successful.

keiranmraine commented on May 28, 2024

I have managed to create a dataset that triggers the same issue using reference sequence only:

ref-mini.fq.gz

keiranmraine commented on May 28, 2024

#51 will solve this issue.

yuk12 commented on May 28, 2024

I have added the reallocs (similar to the one implemented by arun in pull request #51). Let me know if you face any issues.

keiranmraine commented on May 28, 2024

Can I confirm, this is all on master? I'm not missing visibility of a development branch or something?

keiranmraine commented on May 28, 2024

@yuk12 I can confirm that this now runs to completion on 8 data sets that previously failed. It is significantly slower than the version @arun-sub provided from this branch:

https://github.com/arun-sub/bwa-mem2/commits/mem_corruption_fix

Specifically this commit:

https://github.com/arun-sub/bwa-mem2/tree/831d11c00127a36f83579e94a869c3b27c9db640

avx512bw_20200227/1:    CPU time :                                   50802.23 sec.
avx512bw_20200317/1:    CPU time :                                   114679.00 sec.

(equivalent hardware, 8 cpus)

yuk12 commented on May 28, 2024

@keiranmraine Will it be possible for you to share a dataset to reproduce this? Thanks.

keiranmraine commented on May 28, 2024

@yuk12 I am unable to share this data. We initially had this slowdown with the first fix provided by @arun-sub; aligning the allocations solved it:

831d11c?w=1

yuk12 commented on May 28, 2024

Cool. Thanks :)

keiranmraine commented on May 28, 2024

@yuk12 do you want another round of testing?

yuk12 commented on May 28, 2024

Yes, please. Thanks in advance.

keiranmraine commented on May 28, 2024

@yuk12 results are in, much faster:

cgppipe@casm3-head1: grep 'CPU time' logs/avx512-20200317/*
logs/avx512-20200317/1:    CPU time :                                   114679.00 sec.
logs/avx512-20200317/2:    CPU time :                                   121971.41 sec.
logs/avx512-20200317/3:    CPU time :                                   134016.03 sec.
logs/avx512-20200317/4:    CPU time :                                   139938.64 sec.
logs/avx512-20200317/5:    CPU time :                                   133400.00 sec.
logs/avx512-20200317/6:    CPU time :                                   116532.14 sec.
logs/avx512-20200317/7:    CPU time :                                   139155.31 sec.

cgppipe@casm3-head1: grep 'CPU time' logs/avx512-20200330/*
logs/avx512-20200330/1:    CPU time :                                   101301.38 sec.
logs/avx512-20200330/2:    CPU time :                                   91917.00 sec.
logs/avx512-20200330/3:    CPU time :                                   119815.00 sec.
logs/avx512-20200330/4:    CPU time :                                   108887.12 sec.
logs/avx512-20200330/5:    CPU time :                                   107275.70 sec.
logs/avx512-20200330/6:    CPU time :                                   85909.55 sec.
logs/avx512-20200330/7:    CPU time :                                   121100.32 sec.

Dated folder based on date of commit used.
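For readers who want to quantify the difference between the two listings above, here is a minimal sketch that parses the pasted `grep` output and reports the per-run reduction in CPU time. The regular expression, the function names and the `LOG_TEXT` variable are illustrative assumptions, not part of bwa-mem2 or of the original poster's tooling; the numbers are copied from the listings.

```python
import re

# Lines copied from the two `grep 'CPU time'` listings above.
LOG_TEXT = """\
logs/avx512-20200317/1:    CPU time :    114679.00 sec.
logs/avx512-20200317/2:    CPU time :    121971.41 sec.
logs/avx512-20200317/3:    CPU time :    134016.03 sec.
logs/avx512-20200317/4:    CPU time :    139938.64 sec.
logs/avx512-20200317/5:    CPU time :    133400.00 sec.
logs/avx512-20200317/6:    CPU time :    116532.14 sec.
logs/avx512-20200317/7:    CPU time :    139155.31 sec.
logs/avx512-20200330/1:    CPU time :    101301.38 sec.
logs/avx512-20200330/2:    CPU time :    91917.00 sec.
logs/avx512-20200330/3:    CPU time :    119815.00 sec.
logs/avx512-20200330/4:    CPU time :    108887.12 sec.
logs/avx512-20200330/5:    CPU time :    107275.70 sec.
logs/avx512-20200330/6:    CPU time :    85909.55 sec.
logs/avx512-20200330/7:    CPU time :    121100.32 sec.
"""

# Hypothetical pattern matching the log format shown in this thread:
# logs/<folder>/<run>:    CPU time :    <seconds> sec.
LINE_RE = re.compile(
    r"logs/(?P<folder>[^/]+)/(?P<run>\d+):\s*CPU time\s*:\s*(?P<secs>[\d.]+)\s*sec\."
)


def parse_cpu_times(text):
    """Return {folder: {run_number: cpu_seconds}} for every matching line."""
    times = {}
    for m in LINE_RE.finditer(text):
        times.setdefault(m["folder"], {})[int(m["run"])] = float(m["secs"])
    return times


def percent_reduction(before, after):
    """Percentage of CPU time saved going from `before` to `after`."""
    return 100.0 * (before - after) / before


times = parse_cpu_times(LOG_TEXT)
old, new = times["avx512-20200317"], times["avx512-20200330"]
reductions = {run: percent_reduction(old[run], new[run]) for run in sorted(old)}

for run, pct in reductions.items():
    print(f"run {run}: {old[run]:.2f} s -> {new[run]:.2f} s ({pct:.1f}% less CPU time)")
```

CPU time is summed across threads, so these percentages describe total compute rather than wall-clock time.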

yuk12 commented on May 28, 2024

@keiranmraine Thanks a ton.
Do you see any slowdown as compared to previous commits?

Before this run you mentioned the following slowdown:
avx512bw_20200227/1: CPU time : 50802.23 sec.
avx512bw_20200317/1: CPU time : 114679.00 sec.
Is this ~2x slower? And is the new number now 101301.38 sec?

keiranmraine commented on May 28, 2024

I forgot to compare back to that version. But yes, it does appear that this version is still significantly slower than the version provided by @arun-sub on Feb 27th.

#53 (comment)

I don't know what else may be involved.

arun-sub commented on May 28, 2024

@keiranmraine, there was a bug in my realloc implementation in #51 which @yuk12 pointed out. This could lead to different results if your dataset performed reallocations. It would be better to use runtimes of an older commit of BWA-MEM2 as your reference.

keiranmraine commented on May 28, 2024

Okay, so the fastest run I have on avx512bw prior to that one is 118164 s. I'll run a comparison of each implementation (sse41, avx2, avx512bw) of the current build to confirm.

keiranmraine commented on May 28, 2024

I've just found out why that is so slow: I missed a replace, and the recent stats are actually avx2. Repeating all now. Sorry.

keiranmraine commented on May 28, 2024

Corrected run times for the 20200330 commit; this makes far more sense:

$ grep 'CPU time' logs/{sse41,avx2,avx512bw}-20200330/1
   logs/sse41-20200330/1:    CPU time :                     130462.00 sec.
    logs/avx2-20200330/1:    CPU time :                      94966.13 sec.
logs/avx512bw-20200330/1:    CPU time :                      74183.03 sec.
$ grep 'CPU time' logs/{sse41,avx2,avx512bw}-20200330/2
   logs/sse41-20200330/2:    CPU time :                     114574.68 sec.
    logs/avx2-20200330/2:    CPU time :                      86097.48 sec.
logs/avx512bw-20200330/2:    CPU time :                      81591.84 sec.
$ grep 'CPU time' logs/{sse41,avx2,avx512bw}-20200330/3
   logs/sse41-20200330/3:    CPU time :                     140110.00 sec.
    logs/avx2-20200330/3:    CPU time :                     134825.52 sec.
logs/avx512bw-20200330/3:    CPU time :                      91147.00 sec.
$ grep 'CPU time' logs/{sse41,avx2,avx512bw}-20200330/4
   logs/sse41-20200330/4:    CPU time :                     162803.00 sec.
    logs/avx2-20200330/4:    CPU time :                     118373.68 sec.
logs/avx512bw-20200330/4:    CPU time :                      91178.53 sec.
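To make the comparison between instruction sets easier to read, the following sketch computes each build's speedup relative to sse41 for the four runs listed above. The dictionary layout and the `speedup_vs_baseline` helper are illustrative assumptions; the CPU times are transcribed from the listing.

```python
# CPU times in seconds, transcribed from the listing above: run -> {build: seconds}.
CPU_TIMES = {
    1: {"sse41": 130462.00, "avx2": 94966.13, "avx512bw": 74183.03},
    2: {"sse41": 114574.68, "avx2": 86097.48, "avx512bw": 81591.84},
    3: {"sse41": 140110.00, "avx2": 134825.52, "avx512bw": 91147.00},
    4: {"sse41": 162803.00, "avx2": 118373.68, "avx512bw": 91178.53},
}


def speedup_vs_baseline(run_times, baseline="sse41"):
    """Return {build: baseline_time / build_time}; values above 1 mean faster."""
    base = run_times[baseline]
    return {build: base / secs for build, secs in run_times.items()}


speedups = {run: speedup_vs_baseline(t) for run, t in CPU_TIMES.items()}

for run, per_build in speedups.items():
    summary = ", ".join(f"{build} {ratio:.2f}x" for build, ratio in per_build.items())
    print(f"run {run}: {summary}")
```

Using sse41 as the baseline is a choice made for this sketch; any of the three builds could serve as the reference.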

yuk12 commented on May 28, 2024

Yes, these numbers make more sense. Thanks for the hard work @keiranmraine
