Comments (3)

yuk12 commented on May 24, 2024

I tried the latest commit on the given input, and it runs to completion.
Could you try again with the latest commit?

Run info:
Compilers used: icc-18 and gcc.
The command line is the same as the one mentioned above.
make CXX=icpc multi
./bwa-mem2 mem -t 8 /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa /scratch/omics/genomes-indexes-reads/human/reads/ERR3242459_1.fastq.gz /scratch/omics/genomes-indexes-reads/human/reads/ERR3242459_2.fastq.gz -K 100000000 -Y -R '@RG\tID:ERR3242459_1\tPL:ILLUMINA\tPU:ERR3242459_1\tLB:ERR3242459\tSM:ERR3242459' > /scratch/mvasimud/sam_out/tmpv.sa

Here is the log (icc-18):

-----------------------------
Executing in AVX512 mode!!
-----------------------------
* Ref file: /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa
* Entering FMI_search
* Reference seq len for bi-index = 6418915857
* Count:
0, 1
1, 1877984092
2, 3209457929
3, 4540931766
4, 6418915857

* Reading other elements of the index from files /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa
* Index prefix: /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa
* Read 0 ALT contigs
* Done reading Index!!
* Reading reference genome..
* Binary seq file = /scratch/omics/genomes-indexes-reads/human/ref/hs38DH/hs38DH.fa.0123
* Reference genome size: 6418915856 bp
* Done reading reference genome !!


  1. Memory pre-allocation for Chaining: 1393.3971 MB
  2. Memory pre-allocation for BSW: 1916.9362 MB
  3. Memory pre-allocation for BWT: 618.5134 MB

  • Threads used (compute): 8
  • No. of pipeline threads: 2

[0000] read_chunk: 100000000, work_chunk_size: 100000200, nseq: 666668
[0000][ M::kt_pipeline] read 666668 sequences (100000200 bp)...
[0000] Reallocating initial memory allocations!!
[0000] Calling mem_process_seqs.., task: 0
[0000] 1. Calling kt_for - worker_bwt
[0000] read_chunk: 100000000, work_chunk_size: 100000200, nseq: 666668
[0000][ M::kt_pipeline] read 666668 sequences (100000200 bp)...
[0000] 2. Calling kt_for - worker_aln
[0000] Inferring insert size distribution of PE reads from data, l_pac: 3209457928, n: 666668
[0000][PE] # candidate unique pairs for (FF, FR, RF, RR): (0, 0, 0, 0)
[0000][PE] skip orientation FF as there are not enough pairs
[0000][PE] skip orientation FR as there are not enough pairs
[0000][PE] skip orientation RF as there are not enough pairs
[0000][PE] skip orientation RR as there are not enough pairs
[0000] 3. Calling kt_for - worker_sam
[0000][ M::mem_process_seqs] Processed 666668 reads in 79.952 CPU sec, 10.282 real sec

[[Skipped intermediate logs]]

[0000] Calling mem_process_seqs.., task: 1097
[0000] 1. Calling kt_for - worker_bwt
[0000] read_chunk: 100000000, work_chunk_size: 0, nseq: 0
[0000] 2. Calling kt_for - worker_aln
[0000] Inferring insert size distribution of PE reads from data, l_pac: 3209457928, n: 346624
[0000][PE] # candidate unique pairs for (FF, FR, RF, RR): (0, 0, 0, 0)
[0000][PE] skip orientation FF as there are not enough pairs
[0000][PE] skip orientation FR as there are not enough pairs
[0000][PE] skip orientation RF as there are not enough pairs
[0000][PE] skip orientation RR as there are not enough pairs
[0000] 3. Calling kt_for - worker_sam
[0000][ M::mem_process_seqs] Processed 346624 reads in 23.609 CPU sec, 2.976 real sec
[0000] read_chunk: 100000000, work_chunk_size: 0, nseq: 0
[0000] Computation ends..
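
As a quick sanity check on logs like the one above, the ratio of CPU seconds to real seconds in each `Processed ... reads` line should sit close to the `-t` value when the worker threads are kept busy. The following is a minimal sketch, not part of bwa-mem2; the function name `cpu_to_real_ratio` is hypothetical, and the regular expression assumes the line format shown in this log.

```python
import re

# Matches lines such as:
# [0000][ M::mem_process_seqs] Processed 666668 reads in 79.952 CPU sec, 10.282 real sec
PROCESSED_RE = re.compile(
    r"Processed (\d+) reads in ([\d.]+) CPU sec, ([\d.]+) real sec"
)


def cpu_to_real_ratio(line):
    """Return CPU seconds divided by real seconds, or None if the line does not match."""
    match = PROCESSED_RE.search(line)
    if match is None:
        return None
    cpu_sec = float(match.group(2))
    real_sec = float(match.group(3))
    return cpu_sec / real_sec


if __name__ == "__main__":
    sample = (
        "[0000][ M::mem_process_seqs] Processed 666668 reads "
        "in 79.952 CPU sec, 10.282 real sec"
    )
    # With -t 8, a ratio a little under 8 suggests the threads were well used.
    print(f"CPU/real ratio: {cpu_to_real_ratio(sample):.2f}")
```

For the first batch in this run the ratio comes out a little under 8, which is consistent with the 8 compute threads reported in the log.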

from bwa-mem2.

yihchii commented on May 24, 2024

Hi @yuk12, with the latest binary we have (it should be about 20 days old by now), bwa-mem2 failed with a segmentation fault.

  • Ran on AWS c5d.18xlarge (144GiB memory) with Linux 4.4.0-1099-aws and Ubuntu 16.04.6 LTS
  • Executing in AVX512 mode

See the log and an excerpt of the error message below:

⟫ bwa-mem2 mem -t 36 genome/hs38DH.fa /home/dnanexus/in/reads_fastqgzs/0/ERR3242459_1.fastq.gz /home/dnanexus/in/reads2_fastqgzs/0/ERR3242459_2.fastq.gz -K 100000000 -Y -R '@RG\tID:ERR3242459_1\tPL:ILLUMINA\tPU:ERR3242459_1\tLB:ERR3242459\tSM:ERR3242459' > test.sam
-----------------------------
Executing in AVX512 mode!!
-----------------------------
* Ref file: genome/hs38DH.fa
* Entering FMI_search
* Index file found. Loading index from genome/hs38DH.fa.bwt.8bit.32
* Reference seq len for bi-index = 6434693835
* Count:
0,      1
1,      1882204624
2,      3217346918
3,      4552489212
4,      6434693835

* Reading other elements of the index from files genome/hs38DH.fa
* Index prefix: genome/hs38DH.fa
* Read 3171 ALT contigs
* Done reading Index!!
* Reading reference genome..
* Binary seq file = genome/hs38DH.fa.0123

...
[0000] 3. Calling kt_for - worker_sam
        [0000][ M::mem_process_seqs] Processed 666668 reads in 106.872 CPU sec, 3.558 real sec
[0000] Calling mem_process_seqs.., task: 366
[0000] 1. Calling kt_for - worker_bwt
[0000] 2. Calling kt_for - worker_aln
[0000] Inferring insert size distribution of PE reads from data, l_pac: 3217346917, n: 666668
[0000][PE] # candidate unique pairs for (FF, FR, RF, RR): (13, 317237, 2, 0)
[0000][PE] analyzing insert size distribution for orientation FF...
[0000][PE] (25, 50, 75) percentile: (242, 405, 1463)
[0000][PE] low and high boundaries for computing mean and std.dev: (1, 3905)
[0000][PE] mean and std.dev: (703.23, 720.10)
[0000][PE] low and high boundaries for proper pairs: (1, 5126)
[0000][PE] analyzing insert size distribution for orientation FR...
[0000][PE] (25, 50, 75) percentile: (359, 420, 495)
[0000][PE] low and high boundaries for computing mean and std.dev: (87, 767)
[0000][PE] mean and std.dev: (430.15, 100.14)
[0000][PE] low and high boundaries for proper pairs: (1, 903)
[0000][PE] skip orientation RF as there are not enough pairs
[0000][PE] skip orientation RR as there are not enough pairs
[0000][PE] skip orientation FF
[0000] 3. Calling kt_for - worker_sam
[0000] read_chunk: 100000000, work_chunk_size: 100000200, nseq: 666668
        [0000][ M::kt_pipeline] read 666668 sequences (100000200 bp)...
Segmentation fault
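
The `[PE]` boundary lines in this log can be reproduced from the reported percentiles, which helps confirm that the insert-size estimation itself looks normal right before the crash. The sketch below is an assumption-laden reconstruction, not bwa-mem2 code: it assumes the same rule and constants as the original BWA-MEM insert-size estimation as I recall it (outlier bound 2.0, mapping bound 3.0, maximum of 4 standard deviations, rounding by adding 0.499 and truncating), and the function name `insert_size_bounds` is hypothetical.

```python
def insert_size_bounds(p25, p75, mean, std,
                       outlier_bound=2.0, mapping_bound=3.0, max_stddev=4.0):
    """Reproduce the two boundary pairs printed in the [PE] log lines.

    Constants are assumed to match BWA-MEM's defaults; treat them as a guess.
    Returns ((stat_low, stat_high), (proper_low, proper_high)).
    """
    iqr = p75 - p25

    # Boundaries used for computing the mean and standard deviation.
    stat_low = max(int(p25 - outlier_bound * iqr + 0.499), 1)
    stat_high = int(p75 + outlier_bound * iqr + 0.499)

    # Boundaries for proper pairs, widened to mean +/- max_stddev * std if needed.
    low = int(p25 - mapping_bound * iqr + 0.499)
    high = int(p75 + mapping_bound * iqr + 0.499)
    if low > mean - max_stddev * std:
        low = int(mean - max_stddev * std + 0.499)
    if high < mean + max_stddev * std:
        high = int(mean + max_stddev * std + 0.499)
    low = max(low, 1)

    return (stat_low, stat_high), (low, high)


if __name__ == "__main__":
    # Values taken from the FR and FF orientation lines in the log above.
    print("FR:", insert_size_bounds(359, 495, 430.15, 100.14))
    print("FF:", insert_size_bounds(242, 1463, 703.23, 720.10))
```

Under these assumptions the FR values give (87, 767) and (1, 903), and the FF values give (1, 3905) and (1, 5126), matching the log.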


yihchii commented on May 24, 2024

When I reduced the thread count to 36, it also ended with a segmentation fault:

⟫ bwa-mem2 mem -t 36 genome/hs38DH.fa /home/dnanexus/in/reads_fastqgzs/0/ERR3242459_1.fastq.gz /home/dnanexus/in/reads2_fastqgzs/0/ERR3242459_2.fastq.gz -K 100000000 -Y -R '@RG\tID:ERR3242459_1\tPL:ILLUMINA\tPU:ERR3242459_1\tLB:ERR3242459\tSM:ERR3242459' > test.sam
-----------------------------
Executing in AVX512 mode!!
-----------------------------
* Ref file: genome/hs38DH.fa
* Entering FMI_search
* Index file found. Loading index from genome/hs38DH.fa.bwt.8bit.32
* Reference seq len for bi-index = 6434693835
* Count:
0,      1
1,      1882204624
2,      3217346918
3,      4552489212
4,      6434693835

[0000] 3. Calling kt_for - worker_sam
        [0000][ M::mem_process_seqs] Processed 666668 reads in 106.872 CPU sec, 3.558 real sec
[0000] Calling mem_process_seqs.., task: 366
[0000] 1. Calling kt_for - worker_bwt
[0000] 2. Calling kt_for - worker_aln
[0000] Inferring insert size distribution of PE reads from data, l_pac: 3217346917, n: 666668
[0000][PE] # candidate unique pairs for (FF, FR, RF, RR): (13, 317237, 2, 0)
[0000][PE] analyzing insert size distribution for orientation FF...
[0000][PE] (25, 50, 75) percentile: (242, 405, 1463)
[0000][PE] low and high boundaries for computing mean and std.dev: (1, 3905)
[0000][PE] mean and std.dev: (703.23, 720.10)
[0000][PE] low and high boundaries for proper pairs: (1, 5126)
[0000][PE] analyzing insert size distribution for orientation FR...
[0000][PE] (25, 50, 75) percentile: (359, 420, 495)
[0000][PE] low and high boundaries for computing mean and std.dev: (87, 767)
[0000][PE] mean and std.dev: (430.15, 100.14)
[0000][PE] low and high boundaries for proper pairs: (1, 903)
[0000][PE] skip orientation RF as there are not enough pairs
[0000][PE] skip orientation RR as there are not enough pairs
[0000][PE] skip orientation FF
[0000] 3. Calling kt_for - worker_sam
[0000] read_chunk: 100000000, work_chunk_size: 100000200, nseq: 666668
        [0000][ M::kt_pipeline] read 666668 sequences (100000200 bp)...
Segmentation fault

When I further lowered the thread count to 18 on the same instance, the mapping finished successfully:

⟫ bwa-mem2 mem -t 18 genome/hs38DH.fa /home/dnanexus/in/reads_fastqgzs/0/ERR3242459_1.fastq.gz /home/dnanexus/in/reads2_fastqgzs/0/ERR3242459_2.fastq.gz -K 100000000 -Y -R '@RG\tID:ERR3242459_1\tPL:ILLUMINA\tPU:ERR3242459_1\tLB:ERR3242459\tSM:ERR3242459' > test.sam
-----------------------------
Executing in AVX512 mode!!
-----------------------------
* Ref file: genome/hs38DH.fa
* Entering FMI_search
* Index file found. Loading index from genome/hs38DH.fa.bwt.8bit.32
* Reference seq len for bi-index = 6434693835
* Count:
0,      1
1,      1882204624
2,      3217346918
3,      4552489212
4,      6434693835
...
[0000] 3. Calling kt_for - worker_sam
        [0000][ M::mem_process_seqs] Processed 346624 reads in 38.236 CPU sec, 2.619 real sec
[0000] read_chunk: 100000000, work_chunk_size: 0, nseq: 0
[0000] Computation ends..
No. of OMP threads: 18
Processor is runnig @3000.175212 MHz
Runtime profile:

        Time taken for main_mem function: 8803.08 sec

        IO times (sec) :
        Reading IO time (reads) avg: 1085.79, (1085.79, 1085.79)
        Writing IO time (SAM) avg: 906.82, (906.82, 906.82)
        Reading IO time (Reference Genome) avg: 3.97, (3.97, 3.97)
        Index read time avg: 33.06, (33.06, 33.06)

        Overall time (sec) (Excluding Index reading time):
        PROCESS() (Total compute time + (read + SAM) IO time) : 8760.86
        MEM_PROCESS_SEQ() (Total compute time (Kernel + SAM)), avg: 8757.87, (8757.87, 8757.87)

         SAM Processing time (sec):
        --WORKER_SAM avg: 2894.57, (2894.57, 2894.57)

        Kernels' compute time (sec):
        Total kernel (smem+sal+bsw) time avg: 5724.08, (5724.08, 5724.08)
                SMEM compute avg: 2122.51, (2139.92, 2114.91)
                SAL compute avg: 776.54, (782.75, 744.07)
                BSW time, avg: 2179.92, (2185.53, 2176.70)

        Total re-allocs: 1766946271 out of total requests: -721685421, Rate: -2.45

Important parameter settings: 
        BATCH_SIZE: 512
        MAX_SEQ_LEN_REF: 256
        MAX_SEQ_LEN_QER: 128
        MAX_SEQ_LEN8: 128
        SEEDS_PER_READ: 500
        SIMD_WIDTH8 X: 64
        SIMD_WIDTH16 X: 32
        AVG_SEEDS_PER_READ: 64
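
A side note on the `Total re-allocs` line in the profile above: a negative request count and a negative rate look like a counter that overflowed. The sketch below is only an illustration under the assumption that the counter is a signed 32-bit integer that wrapped exactly once; that is a guess from the printed value, not something confirmed from the bwa-mem2 source, and the re-alloc count itself may have wrapped too. The helper name `as_unsigned32` is hypothetical.

```python
def as_unsigned32(value):
    """Reinterpret a signed 32-bit integer as unsigned (assumes a single wrap)."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    reallocs = 1766946271          # from the log
    requests_printed = -721685421  # from the log, presumably wrapped

    # The rate as printed in the log (negative because of the wrapped counter).
    print(f"rate as logged:  {reallocs / requests_printed:.2f}")

    # The rate if the request counter had wrapped once as a 32-bit value.
    requests_unwrapped = as_unsigned32(requests_printed)
    print(f"requests (unwrapped, assumed): {requests_unwrapped}")
    print(f"rate (assumed):  {reallocs / requests_unwrapped:.2f}")
```

If the single-wrap assumption holds, the re-allocation rate would be a little under 0.5 rather than -2.45. Either way, the figure in the log should not be read at face value.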

Feel free to close this thread.

