Comments (21)
I faced the same issue with AVX512 and included some test datasets and fixes in #51. Could you please take a look and let me know if they work for you?
from bwa-mem2.
@arun-sub thanks, we started looking into this early last week and didn't see your new PR when I raised it. Will look into testing this.
from bwa-mem2.
@arun-sub unfortunately your patch alone did not solve the instance we were investigating. @mp15 will be raising a PR with an extension to your work which is successful.
from bwa-mem2.
I have managed to create a dataset that triggers the same issue using reference sequence only:
from bwa-mem2.
#51 will solve this issue.
from bwa-mem2.
I have added the reallocs (similar to the one implemented by arun in pull request #51). Let me know if you face any issues.
from bwa-mem2.
Can I confirm, this is all on master? I'm not missing visibility of a development branch or something?
from bwa-mem2.
@yuk12 I can confirm that this now runs to completion on 8 data sets that previously failed. It is significantly slower than the version @arun-sub provided from this branch:
https://github.com/arun-sub/bwa-mem2/commits/mem_corruption_fix
Specifically this commit:
https://github.com/arun-sub/bwa-mem2/tree/831d11c00127a36f83579e94a869c3b27c9db640
avx512bw_20200227/1: CPU time : 50802.23 sec.
avx512bw_20200317/1: CPU time : 114679.00 sec.
(equivalent hardware, 8 cpus)
from bwa-mem2.
@keiranmraine Will it be possible for you to share a dataset to reproduce this? Thanks.
from bwa-mem2.
@yuk12 I am unable to share this data. We initially saw this slowdown with the first fix provided by @arun-sub; aligning the allocations solved it:
from bwa-mem2.
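For readers unfamiliar with why alignment matters here: plain `realloc` only guarantees the platform's default alignment, which is typically smaller than the 64 bytes that AVX-512 aligned accesses expect. The following is a minimal sketch of growing a buffer while keeping a chosen alignment. It is not the code from #51 or from any bwa-mem2 commit; the helper name `aligned_grow` and its parameters are hypothetical and exist only for illustration.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a bwa-mem2 function): allocate a new buffer of
 * new_size bytes with the requested alignment, copy the old contents over,
 * and release the old buffer. `alignment` must be a power of two and a
 * multiple of sizeof(void *), as posix_memalign requires.
 * Returns NULL on failure, in which case `old` is left untouched. */
void *aligned_grow(void *old, size_t old_size, size_t new_size, size_t alignment)
{
    void *fresh = NULL;

    if (posix_memalign(&fresh, alignment, new_size) != 0)
        return NULL;

    if (old != NULL) {
        /* Copy only what fits, in case the caller is shrinking. */
        size_t to_copy = old_size < new_size ? old_size : new_size;
        memcpy(fresh, old, to_copy);
        free(old);
    }
    return fresh;
}
```

A caller would use it the way it uses `realloc`, for example `buf = aligned_grow(buf, old_bytes, new_bytes, 64);`, checking for `NULL` before overwriting the old pointer. The old buffer must itself have come from an allocator whose memory can be released with `free`.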
Cool. Thanks :)
from bwa-mem2.
@yuk12 do you want another round of testing?
from bwa-mem2.
Yes, please. Thanks in advance.
from bwa-mem2.
@yuk12 results are in, much faster:
cgppipe@casm3-head1: grep 'CPU time' logs/avx512-20200317/*
logs/avx512-20200317/1: CPU time : 114679.00 sec.
logs/avx512-20200317/2: CPU time : 121971.41 sec.
logs/avx512-20200317/3: CPU time : 134016.03 sec.
logs/avx512-20200317/4: CPU time : 139938.64 sec.
logs/avx512-20200317/5: CPU time : 133400.00 sec.
logs/avx512-20200317/6: CPU time : 116532.14 sec.
logs/avx512-20200317/7: CPU time : 139155.31 sec.
cgppipe@casm3-head1: grep 'CPU time' logs/avx512-20200330/*
logs/avx512-20200330/1: CPU time : 101301.38 sec.
logs/avx512-20200330/2: CPU time : 91917.00 sec.
logs/avx512-20200330/3: CPU time : 119815.00 sec.
logs/avx512-20200330/4: CPU time : 108887.12 sec.
logs/avx512-20200330/5: CPU time : 107275.70 sec.
logs/avx512-20200330/6: CPU time : 85909.55 sec.
logs/avx512-20200330/7: CPU time : 121100.32 sec.
Folders are named after the date of the commit used.
from bwa-mem2.
@keiranmraine Thanks a ton.
Do you see any slowdown as compared to previous commits?
Before this run you mentioned the following slowdown:
avx512bw_20200227/1: CPU time : 50802.23 sec.
avx512bw_20200317/1: CPU time : 114679.00 sec.
Is this ~2x slower? And is the new number now 101301.38 sec?
from bwa-mem2.
I forgot to compare back to that version. But yes, it does appear that this version is still significantly slower than the version provided by @arun-sub on Feb 27th.
I don't know what else may be involved.
from bwa-mem2.
@keiranmraine, there was a bug in my realloc implementation in #51 which @yuk12 pointed out. This could lead to different results if your dataset performed reallocations. It would be better to use runtimes of an older commit of BWA-MEM2 as your reference.
from bwa-mem2.
Okay, so the fastest run I have on avx512bw prior to that one is 118164 s.
I'll run a comparison of each implementation (sse41, avx2, avx512bw) of the current build to confirm.
from bwa-mem2.
I've just found out why that is so slow: I missed a replace, so the recent stats are actually avx2. Repeating all now. Sorry.
from bwa-mem2.
Corrected run times for the 20200330 commit, these make far more sense:
$ grep 'CPU time' logs/{sse41,avx2,avx512bw}-20200330/1
logs/sse41-20200330/1: CPU time : 130462.00 sec.
logs/avx2-20200330/1: CPU time : 94966.13 sec.
logs/avx512bw-20200330/1: CPU time : 74183.03 sec.
$ grep 'CPU time' logs/{sse41,avx2,avx512bw}-20200330/2
logs/sse41-20200330/2: CPU time : 114574.68 sec.
logs/avx2-20200330/2: CPU time : 86097.48 sec.
logs/avx512bw-20200330/2: CPU time : 81591.84 sec.
$ grep 'CPU time' logs/{sse41,avx2,avx512bw}-20200330/3
logs/sse41-20200330/3: CPU time : 140110.00 sec.
logs/avx2-20200330/3: CPU time : 134825.52 sec.
logs/avx512bw-20200330/3: CPU time : 91147.00 sec.
$ grep 'CPU time' logs/{sse41,avx2,avx512bw}-20200330/4
logs/sse41-20200330/4: CPU time : 162803.00 sec.
logs/avx2-20200330/4: CPU time : 118373.68 sec.
logs/avx512bw-20200330/4: CPU time : 91178.53 sec.
from bwa-mem2.
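To put the corrected figures above on a common scale, a relative speedup can be computed as baseline CPU time divided by candidate CPU time. The sketch below does only that arithmetic; the function name `speedup` is an illustrative assumption, and the numbers in the comment are copied from the data set 1 log lines in the comment above.

```c
#include <assert.h>

/* Hypothetical helper: how many times faster `candidate_sec` is than
 * `baseline_sec`. Values above 1.0 mean the candidate used less CPU time. */
double speedup(double baseline_sec, double candidate_sec)
{
    assert(candidate_sec > 0.0);
    return baseline_sec / candidate_sec;
}

/* Data set 1, CPU seconds as reported in the logs above:
 *   sse41    130462.00
 *   avx2      94966.13
 *   avx512bw  74183.03
 * e.g. speedup(130462.00, 74183.03) compares avx512bw against sse41. */
```

For data set 1 this gives roughly 1.76 for avx512bw over sse41 and roughly 1.28 for avx512bw over avx2. These are CPU-time ratios from single runs, so they indicate a trend rather than a precise benchmark.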
Yes, these numbers make more sense. Thanks for the hard work, @keiranmraine.
from bwa-mem2.