AndreaGuarracino commented on August 11, 2024

Hi @emilio-r, it seems there was a compilation problem related to the CPU architecture. It should now be fixed (#67).

Could you please pull the latest Docker image and try again?

from pggb.

emilio-r commented on August 11, 2024

Hello @AndreaGuarracino, thank you for responding to my issue!
I pulled the new version just a few hours ago (singularity pull pggb.sif docker://ghcr.io/pangenome/pggb:latest) and tried running it with edyeet, with wfmash, and with different settings for --poa-params. Unfortunately, this version still results in the same error, and pggb terminates at the same point as before.
The wfmash run looks much the same as before, so here is the log from the edyeet run I tried just now:

Starting pggb on Fri Feb 26 07:39:24 CET 2021

Command: /usr/local/bin/pggb -i /proj/uppstore2017270/Emilio/pggb/indata/Erik_test_data/complete_reference_genomes.fa -s 500 -w 25000 -p 85 -a 85 -n 14 -t 10 --poa-params 1,4,6,2,26,1 -v -o /proj/uppstore2017270/Emilio/pggb/new_version_out/Erik_suggested/asm20_edyeet

PARAMETERS

general:
input-fasta: /proj/uppstore2017270/Emilio/pggb/indata/Erik_test_data/complete_reference_genomes.fa
output-dir: /proj/uppstore2017270/Emilio/pggb/new_version_out/Erik_suggested/asm20_edyeet
resume: false
threads: 10
alignment:
mapping-tool: edyeet
no-splits: false
segment-length: 500
block-length: 1500
no-merge-segments: false
map-pct-id: 85
align-pct-id: 85
n-secondary: 14
mash-kmer: 16
wfmash: false
exclude-delim: false
seqwish:
min-match-len: 19
transclose-batch: 1000000
smoothxg:
block-weight-max: 25000
path-jump-max: 5000
edge-jump-max: 5000
poa-length-max: 10000
poa-params: 1,4,6,2,26,1
consensus-spec: 10,100,1000,10000
block-id-min: 0
ratio-contain: 0
odgi:
viz: true
layout: false
stats: false
reporting:
multiqc: false

Running pggb

[edyeet::map] Reference = [/proj/uppstore2017270/Emilio/pggb/indata/Erik_test_data/complete_reference_genomes.fa]
[edyeet::map] Query = [/proj/uppstore2017270/Emilio/pggb/indata/Erik_test_data/complete_reference_genomes.fa]
[edyeet::map] Kmer size = 16
[edyeet::map] Window size = 20
[edyeet::map] Segment length = 500 (read split allowed)
[edyeet::map] Block length min = 1500
[edyeet::map] Alphabet = DNA
[edyeet::map] Percentage identity threshold = 85%
[edyeet::map] Mapping output file = /crex/proj/uppstore2017270/Emilio/pggb/edyeet-GU1mob
[edyeet::map] Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
[edyeet::map] Execution threads = 10
[edyeet::skch::Sketch::build] minimizers picked from reference = 2330611
[edyeet::skch::Sketch::index] unique minimizers = 761163
[edyeet::skch::Sketch::computeFreqHist] Frequency histogram of minimizers = (1, 371539) ... (282, 1)
[edyeet::skch::Sketch::computeFreqHist] With threshold 0.001%, ignore minimizers occurring >= 67 times during lookup.
[edyeet::map] time spent computing the reference index: 0.880732 sec
[edyeet::skch::Map::mapQuery] WARNING, no .fai index found for /proj/uppstore2017270/Emilio/pggb/indata/Erik_test_data/complete_reference_genomes.fa, reading file to sum sequence length (slow)
[edyeet::skch::Map::mapQuery] mapped 100.00% @ 3.62e+06 bp/s elapsed: 00:00:00:06 remain: 00:00:00:00
[edyeet::skch::Map::mapQuery] count of mapped reads = 14, reads qualified for mapping = 15, total input reads = 15, total input bp = 24465645
[edyeet::map] time spent mapping the query: 6.79e+00 sec
[edyeet::map] mapping results saved in: /crex/proj/uppstore2017270/Emilio/pggb/edyeet-GU1mob
[edyeet::align] Reference = [/proj/uppstore2017270/Emilio/pggb/indata/Erik_test_data/complete_reference_genomes.fa]
[edyeet::align] Query = [/proj/uppstore2017270/Emilio/pggb/indata/Erik_test_data/complete_reference_genomes.fa]
[edyeet::align] Mapping file = /crex/proj/uppstore2017270/Emilio/pggb/edyeet-GU1mob
[edyeet::align] Edlib identity cut-off = 8.50e+01%
[edyeet::align] Alignment output file = /dev/stdout
[edyeet::align] time spent read the reference sequences: 5.84e-02 sec
Command terminated by signal 4
edyeet -X -s 500 -l 1500 -p 85 -n 14 -a 85 -k 16 -t 10 /proj/uppstore2017270/Emilio/pggb/indata/Erik_test_data/complete_reference_genomes.fa /proj/uppstore2017270/Emilio/pggb/indata/Erik_test_data/complete_reference_genomes.fa
52.02s user 0.70s system 642% cpu 8.20s total 255600Kb max memory

AndreaGuarracino commented on August 11, 2024

@emilio-r confirmed that PR #72 seems to have solved the problem.

Thank you for your patience and testing. Feel free to reopen the issue if problems come up again.

emilio-r commented on August 11, 2024

@AndreaGuarracino, I used Singularity to pull a fresh version of pggb this morning and, similar to what #95 reports, I am again seeing pggb terminate at this point:

Starting pggb on Wed May  5 07:53:56 CEST 2021

Command: /usr/local/bin/pggb -i ./New_indata/messy_genomes_concatenated.fa -s 500 -w 25000 -p 85 -n 9 -P 1,9,16,2,41,1 -Y _ -t 5 -v -I 0.6 -o ./runtest

PARAMETERS

general:
  input-fasta:        ./New_indata/messy_genomes_concatenated.fa
  output-dir:         ./runtest
  resume:             false
  pigz-compress:      false
  threads:            5
alignment:
  mapping-tool:       wfmash
  no-splits:          false
  segment-length:     500
  block-length:       1500
  no-merge-segments:  false
  map-pct-id:         85
  n-secondary:        9
  mash-kmer:          16
  exclude-delim:      _
seqwish:
  min-match-len:      19
  transclose-batch:   10000000
smoothxg:
  block-weight-max:   25000
  path-jump-max:      100
  edge-jump-max:      0
  poa-length-target:  5000
  poa-params:         1,9,16,2,41,1
  write-maf:          false
  consensus-prefix:   Consensus_
  consensus-spec:     false
  split-min-depth:    2000
  block-id-min:       0.6
  block-ratio-min:    0
odgi:
  viz:                true
  layout:             false
  stats:              false
reporting:
  multiqc:            false

Running pggb

[wfmash::map] Reference = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::map] Query = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::map] Kmer size = 16
[wfmash::map] Window size = 1
[wfmash::map] Segment length = 500 (read split allowed)
[wfmash::map] Block length min = 1500
[wfmash::map] Alphabet = DNA
[wfmash::map] Percentage identity threshold = 0.85%
[wfmash::map] Mapping output file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-DSxI1T
[wfmash::map] Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
[wfmash::map] Execution threads  = 5
[wfmash::skch::Sketch::build] minimizers picked from reference = 16424393
[wfmash::skch::Sketch::index] unique minimizers = 5451643
[wfmash::skch::Sketch::computeFreqHist] Frequency histogram of minimizers = (1, 2430033) ... (364, 1)
[wfmash::skch::Sketch::computeFreqHist] With threshold 0.001%, ignore minimizers occurring >= 65 times during lookup.
[wfmash::map] time spent computing the reference index: 5.23249 sec
[wfmash::skch::Map::mapQuery] WARNING, no .fai index found for ./New_indata/messy_genomes_concatenated.fa, reading file to sum sequence length (slow)
[wfmash::skch::Map::mapQuery] mapped 100.00% @ 2.47e+05 bp/s elapsed: 00:00:01:06 remain: 00:00:00:00
[wfmash::skch::Map::mapQuery] count of mapped reads = 172, reads qualified for mapping = 174, total input reads = 174, total input bp = 16427933
[wfmash::map] time spent mapping the query: 6.66e+01 sec
[wfmash::map] mapping results saved in: /crex/proj/uppstore2017270/Emilio/pggb/wfmash-DSxI1T
[wfmash::align] Reference = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::align] Query = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::align] Mapping file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-DSxI1T
[wfmash::align] Alignment identity cutoff = 8.50e-01%
[wfmash::align] Alignment output file = /dev/stdout
[wfmash::align] time spent read the reference sequences: 8.35e-02 sec
Command terminated by signal 4
wfmash -Y _ -s 500 -l 1500 -p 85 -n 9 -k 16 -t 5 ./New_indata/messy_genomes_concatenated.fa ./New_indata/messy_genomes_concatenated.fa
256.22s user 7.87s system 361% cpu 73.04s total 1222332Kb max memory

subwaystation commented on August 11, 2024

Hi @emilio-r and @brettChapman (from #95),
I was able to reproduce the issue when running the Docker image produced by our CI.
However, when I built the Docker image locally, everything ran smoothly.
Could you please try this out?
I suspect that the CI machine compiles with CPU instructions that are not available on our machines, which breaks the Docker image there. I am exploring some possible solutions.
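
For readers following along: signal 4 on Linux is SIGILL (illegal instruction), which fits the suspicion that the binaries in the image were compiled for CPU instructions the host does not support. Below is a minimal Python sketch for checking this on a node, assuming a Linux x86 host whose /proc/cpuinfo contains a "flags" line. The helper names are hypothetical, and the flag names in WANTED are only illustrative guesses, not the set the CI build actually used.

```python
import signal

# Illustrative guesses only; the instruction sets the CI build relied on
# are not stated in this thread.
WANTED = ["sse4_2", "avx", "avx2"]


def describe_signal(signum):
    """Return the symbolic name of a signal number, e.g. 4 -> 'SIGILL'."""
    return signal.Signals(signum).name


def cpu_flags(cpuinfo_text):
    """Collect the feature flags from /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
    return flags


def missing_flags(cpuinfo_text, wanted):
    """Return the wanted flags that the CPU does not advertise, sorted."""
    return sorted(set(wanted) - cpu_flags(cpuinfo_text))


if __name__ == "__main__":
    print(describe_signal(4))
    try:
        with open("/proc/cpuinfo") as handle:
            text = handle.read()
    except OSError:
        text = ""  # not a Linux host, nothing to inspect
    print(missing_flags(text, WANTED))
```

If the second line printed is non-empty on the cluster node but empty on the build machine, that would support the theory. It would not show which instruction actually triggered the crash.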

emilio-r commented on August 11, 2024

Hello @subwaystation, thank you for replying!

I just built PGGB locally and can confirm that this works for me too, at least when running with your DRB1 example. Although it isn't mentioned in this issue thread, I also saw this behavior (where only a locally built PGGB worked) when I first reported this issue in February.

AndreaGuarracino commented on August 11, 2024

Hi @emilio-r and @brettChapman (#95),
sorry that this issue came up again (and again and again, in a loop). Would you please try the new Docker image (docker pull ghcr.io/pangenome/pggb:202105050807012785f9) to check whether the latest fix in #96 killed the error?

subwaystation commented on August 11, 2024

I just pulled the latest image and it worked for me. Curious about your experiences :)

emilio-r commented on August 11, 2024

I just pulled the version that you suggested and tried running it with two samples that I was able to run without fault in a previous version of PGGB.
On another note: would it be possible to add a -version flag to PGGB builds? I think it would be very useful for helping users identify and report which build of PGGB they are running.
The two runs do not fail at the same spot, but both still terminate:

Starting pggb on Thu May  6 08:23:45 CEST 2021

Command: /usr/local/bin/pggb -i ./New_indata/New_complete_reference_genomes.fa -s 500 -w 25000 -p 85 -n 14 -P 1,9,16,2,41,1 -t 10 -v -I 0.6 -o ./runtest_2

PARAMETERS

general:
  input-fasta:        ./New_indata/New_complete_reference_genomes.fa
  output-dir:         ./runtest_2
  resume:             false
  pigz-compress:      false
  threads:            10
alignment:
  mapping-tool:       wfmash
  no-splits:          false
  segment-length:     500
  block-length:       1500
  no-merge-segments:  false
  map-pct-id:         85
  n-secondary:        14
  mash-kmer:          16
  exclude-delim:      false
seqwish:
  min-match-len:      19
  transclose-batch:   10000000
smoothxg:
  block-weight-max:   25000
  path-jump-max:      100
  edge-jump-max:      0
  poa-length-target:  5000
  poa-params:         1,9,16,2,41,1
  write-maf:          false
  consensus-prefix:   Consensus_
  consensus-spec:     false
  split-min-depth:    2000
  block-id-min:       0.6
  block-ratio-min:    0
odgi:
  viz:                true
  layout:             false
  stats:              false
reporting:
  multiqc:            false

Running pggb

[wfmash::map] Reference = [./New_indata/New_complete_reference_genomes.fa]
[wfmash::map] Query = [./New_indata/New_complete_reference_genomes.fa]
[wfmash::map] Kmer size = 16
[wfmash::map] Window size = 1
[wfmash::map] Segment length = 500 (read split allowed)
[wfmash::map] Block length min = 1500
[wfmash::map] Alphabet = DNA
[wfmash::map] Percentage identity threshold = 0.85%
[wfmash::map] Mapping output file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-q0cE7H
[wfmash::map] Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
[wfmash::map] Execution threads  = 10
[wfmash::skch::Sketch::build] minimizers picked from reference = 24464660
[wfmash::skch::Sketch::index] unique minimizers = 7140269
[wfmash::skch::Sketch::computeFreqHist] Frequency histogram of minimizers = (1, 3259590) ... (294, 1)
[wfmash::skch::Sketch::computeFreqHist] With threshold 0.001%, ignore minimizers occurring >= 73 times during lookup.
[wfmash::map] time spent computing the reference index: 7.00694 sec
[wfmash::skch::Map::mapQuery] WARNING, no .fai index found for ./New_indata/New_complete_reference_genomes.fa, reading file to sum sequence length (slow)
[wfmash::skch::Map::mapQuery] mapped 100.00% @ 2.85e+05 bp/s elapsed: 00:00:01:25 remain: 00:00:00:00
[wfmash::skch::Map::mapQuery] count of mapped reads = 14, reads qualified for mapping = 15, total input reads = 15, total input bp = 24465645
[wfmash::map] time spent mapping the query: 8.58e+01 sec
[wfmash::map] mapping results saved in: /crex/proj/uppstore2017270/Emilio/pggb/wfmash-q0cE7H
[wfmash::align] Reference = [./New_indata/New_complete_reference_genomes.fa]
[wfmash::align] Query = [./New_indata/New_complete_reference_genomes.fa]
[wfmash::align] Mapping file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-q0cE7H
[wfmash::align] Alignment identity cutoff = 8.50e-01%
[wfmash::align] Alignment output file = /dev/stdout
[wfmash::align] time spent read the reference sequences: 7.81e-02 sec
[wfmash::align::computeAlignments] aligned 100.00% @ 1.39e+06 bp/s elapsed: 00:00:01:46 remain: 00:00:00:00
[wfmash::align::computeAlignments] count of mapped reads = 15, total aligned bp = 147824402
[wfmash::align] time spent computing the alignment: 1.06e+02 sec
[wfmash::align] alignment results saved in: /dev/stdout
wfmash -X -s 500 -l 1500 -p 85 -n 14 -k 16 -t 10 ./New_indata/New_complete_reference_genomes.fa ./New_indata/New_complete_reference_genomes.fa
1668.35s user 36.71s system 852% cpu 199.95s total 1381272Kb max memory
[seqwish::seqidx] 0.004 indexing sequences
[seqwish::seqidx] 0.210 index built
[seqwish::alignments] 0.210 processing alignments
[seqwish::alignments] 1.841 indexing
[seqwish::alignments] 5.609 index built
[seqwish::transclosure] 5.617 computing transitive closures
[seqwish::transclosure] 5.652 0.00% 0-10000000 overlap_collect
[seqwish::transclosure] 8.716 0.00% 0-10000000 rank_build
[seqwish::transclosure] 9.206 0.00% 0-10000000 parallel_union_find
[seqwish::transclosure] 10.023 0.00% 0-10000000 dset_write
[seqwish::transclosure] 10.332 0.00% 0-10000000 dset_compression
[seqwish::transclosure] 10.698 0.00% 0-10000000 dset_sort
[seqwish::transclosure] 10.814 0.00% 0-10000000 dset_invert
[seqwish::transclosure] 11.105 0.00% 0-10000000 graph_emission
[seqwish::transclosure] 11.265 92.99% 10000008-24465645 overlap_collect
[seqwish::transclosure] 11.472 92.99% 10000008-24465645 rank_build
[seqwish::transclosure] 11.634 92.99% 10000008-24465645 parallel_union_find
[seqwish::transclosure] 11.675 92.99% 10000008-24465645 dset_write
[seqwish::transclosure] 11.725 92.99% 10000008-24465645 dset_compression
[seqwish::transclosure] 11.750 92.99% 10000008-24465645 dset_sort
[seqwish::transclosure] 11.782 92.99% 10000008-24465645 dset_invert
[seqwish::transclosure] 11.818 92.99% 10000008-24465645 graph_emission
[seqwish::transclosure] 15.702 100.00% building node_iitree and path_iitree indexes
[seqwish::transclosure] 18.417 100.00% done
[seqwish::transclosure] 18.417 done with transitive closures
[seqwish::compact] 18.417 compacting nodes
[seqwish::compact] 19.310 done compacting
[seqwish::compact] 19.343 built node index
[seqwish::links] 19.343 finding graph links
[seqwish::links] 20.961 links derived
[seqwish::gfa] 20.961 writing graph
[seqwish::gfa] 24.974 done
seqwish -t 10 -s ./New_indata/New_complete_reference_genomes.fa -p ./runtest_2/New_complete_reference_genomes.fa.230fa74.wfmash.paf -k 19 -g ./runtest_2/New_complete_reference_genomes.fa.230fa74.34ee7b1.seqwish.gfa -B 10000000 -P
76.32s user 14.48s system 359% cpu 25.26s total 1416160Kb max memory
[smoothxg::main] loading graph
[smoothxg::main] prepping graph for smoothing
[odgi::gfa_to_handle] building nodes: 100.00% @ 1.31e+06/s elapsed: 00:00:00:00 remain: 00:00:00:00
[odgi::gfa_to_handle] building edges: 100.00% @ 1.11e+06/s elapsed: 00:00:00:01 remain: 00:00:00:00
[odgi::gfa_to_handle] building paths: 100.00% @ 1.03e+01/s elapsed: 00:00:00:01 remain: 00:00:00:00
[smoothxg::prep] building path index
[smoothxg::prep] sorting graph
[odgi::path_linear_sgd] calculating linear SGD schedule (7.05e-12 1.00e+00 100 0 1.00e-02)
[odgi::path_linear_sgd] calculating zetas for 10037 zipf distributions
[odgi::path_linear_sgd] 1D path-guided SGD: 100.00% @ 5.22e+06/s elapsed: 00:00:01:43 remain: 00:00:00:00
[odgi::groom] grooming: 100.00% @ 1.36e+06/s elapsed: 00:00:00:00 remain: 00:00:00:00
[odgi::groom] organizing handles: 100.00% @ 2.73e+06/s elapsed: 00:00:00:00 remain: 00:00:00:00
[odgi::groom] flipped 343839 handles
[odgi::topological_order] sorting nodes: 100.00% @ 9.41e+04/s elapsed: 00:00:00:07 remain: 00:00:00:00
[smoothxg::prep] chopping graph to 100
[odgi::chop] 1302 node(s) to chop.
[smoothxg::prep] writing graph ./runtest_2/New_complete_reference_genomes.fa.230fa74.34ee7b1.seqwish.gfa.prep.gfa
[smoothxg::main] building xg index
[smoothxg::smoothable_blocks] computing blocks
[smoothxg::smoothable_blocks] computing blocks for 687423 handles: 100.00% @ 2.45e+04/s elapsed: 00:00:00:28 remain: 00:00:00:00
[smoothxg::break_and_split_blocks] cutting blocks that contain sequences longer than max-poa-length (10000) and depth >= 2000
[smoothxg::break_and_split_blocks] splitting 1704 blocks at identity 0.600 (WFA-based clustering) and at estimated-identity 0.600 (mash-based clustering)
[smoothxg::break_and_split_blocks] cutting and splitting 1704 blocks:  0.12% @ 2.64e+02/s elapsed: 00:00:00:00 remain: 00:00:00:06
Command terminated by signal 4
smoothxg -t 10 -g ./runtest_2/New_complete_reference_genomes.fa.230fa74.34ee7b1.seqwish.gfa -w 25000 -K -d 2000 -I 0.6 -R 0 -j 100 -e 0 -l 5000 -p 1,9,16,2,41,1 -Q Consensus_ -V -o ./runtest_2/New_complete_reference_genomes.fa.230fa74.34ee7b1.545adf9.smooth.gfa
1109.70s user 20.77s system 662% cpu 170.64s total 637324Kb max memory

Starting pggb on Thu May  6 07:39:46 CEST 2021

Command: /usr/local/bin/pggb -i ./New_indata/messy_genomes_concatenated.fa -s 500 -w 25000 -p 85 -n 9 -P 1,9,16,2,41,1 -Y _ -t 10 -v -I 0.6 -o ./runtest

PARAMETERS

general:
  input-fasta:        ./New_indata/messy_genomes_concatenated.fa
  output-dir:         ./runtest
  resume:             false
  pigz-compress:      false
  threads:            10
alignment:
  mapping-tool:       wfmash
  no-splits:          false
  segment-length:     500
  block-length:       1500
  no-merge-segments:  false
  map-pct-id:         85
  n-secondary:        9
  mash-kmer:          16
  exclude-delim:      _
seqwish:
  min-match-len:      19
  transclose-batch:   10000000
smoothxg:
  block-weight-max:   25000
  path-jump-max:      100
  edge-jump-max:      0
  poa-length-target:  5000
  poa-params:         1,9,16,2,41,1
  write-maf:          false
  consensus-prefix:   Consensus_
  consensus-spec:     false
  split-min-depth:    2000
  block-id-min:       0.6
  block-ratio-min:    0
odgi:
  viz:                true
  layout:             false
  stats:              false
reporting:
  multiqc:            false

Running pggb

[wfmash::map] Reference = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::map] Query = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::map] Kmer size = 16
[wfmash::map] Window size = 1
[wfmash::map] Segment length = 500 (read split allowed)
[wfmash::map] Block length min = 1500
[wfmash::map] Alphabet = DNA
[wfmash::map] Percentage identity threshold = 0.85%
[wfmash::map] Mapping output file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-RIYXgl
[wfmash::map] Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
[wfmash::map] Execution threads  = 10
[wfmash::skch::Sketch::build] minimizers picked from reference = 16424393
[wfmash::skch::Sketch::index] unique minimizers = 5451643
[wfmash::skch::Sketch::computeFreqHist] Frequency histogram of minimizers = (1, 2430033) ... (364, 1)
[wfmash::skch::Sketch::computeFreqHist] With threshold 0.001%, ignore minimizers occurring >= 65 times during lookup.
[wfmash::map] time spent computing the reference index: 5.86467 sec
[wfmash::skch::Map::mapQuery] WARNING, no .fai index found for ./New_indata/messy_genomes_concatenated.fa, reading file to sum sequence length (slow)
[wfmash::skch::Map::mapQuery] mapped 100.00% @ 3.93e+05 bp/s elapsed: 00:00:00:41 remain: 00:00:00:00
[wfmash::skch::Map::mapQuery] count of mapped reads = 172, reads qualified for mapping = 174, total input reads = 174, total input bp = 16427933
[wfmash::map] time spent mapping the query: 4.18e+01 sec
[wfmash::map] mapping results saved in: /crex/proj/uppstore2017270/Emilio/pggb/wfmash-RIYXgl
[wfmash::align] Reference = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::align] Query = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::align] Mapping file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-RIYXgl
[wfmash::align] Alignment identity cutoff = 8.50e-01%
[wfmash::align] Alignment output file = /dev/stdout
[wfmash::align] time spent read the reference sequences: 6.11e-02 sec
[wfmash::align::computeAlignments] aligned 72.26% @ 1.61e+06 bp/s elapsed: 00:00:00:27 remain: 00:00:00:10
[wflign::wflign_affine_wavefront] corrupted traceback (out of bounds) for KH0001_contig44 0 Bra-LPB32_contig1 50740
Command exited with non-zero status 1
wfmash -Y _ -s 500 -l 1500 -p 85 -n 9 -k 16 -t 10 ./New_indata/messy_genomes_concatenated.fa ./New_indata/messy_genomes_concatenated.fa
547.67s user 12.65s system 731% cpu 76.55s total 1222544Kb max memory

ekg commented on August 11, 2024

emilio-r commented on August 11, 2024

I can confirm that by removing the -I flag both runs mentioned above now completed with their expected output.
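
The failing stage in logs like the ones in this thread can be picked out mechanically. Here is a minimal Python sketch, assuming the log format shown here: the timing wrapper prints "Command terminated by signal N" or "Command exited with non-zero status N" and then echoes the command on the next line. find_failures is a hypothetical helper, not part of pggb, and the sample lines are abridged from the logs above with the input path shortened to in.fa.

```python
import re

# The failure message can directly follow a progress line without a newline,
# so the pattern is searched anywhere in the line rather than at its start.
FAILURE = re.compile(
    r"Command (terminated by signal|exited with non-zero status) (\d+)"
)


def find_failures(log_text):
    """Return a list of (tool, kind, code) tuples, one per failed command."""
    failures = []
    lines = log_text.splitlines()
    for index, line in enumerate(lines):
        match = FAILURE.search(line)
        if match is None:
            continue
        kind = "signal" if match.group(1).startswith("terminated") else "exit status"
        # The line after the failure message echoes the command that was run.
        command = lines[index + 1].split() if index + 1 < len(lines) else []
        tool = command[0] if command else "unknown"
        failures.append((tool, kind, int(match.group(2))))
    return failures


if __name__ == "__main__":
    sample = (
        "[wfmash::align] Alignment output file = /dev/stdout\n"
        "Command exited with non-zero status 1\n"
        "wfmash -Y _ -s 500 -l 1500 -p 85 -n 9 -k 16 -t 10 in.fa in.fa\n"
    )
    for tool, kind, code in find_failures(sample):
        print(tool, kind, code)
```

Applied to the logs in this thread, this would report edyeet and wfmash for the early signal 4 crashes, smoothxg for the signal 4 crash during block splitting, and wfmash for the exit status 1 that follows the corrupted traceback message.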

AndreaGuarracino commented on August 11, 2024

Hi @emilio-r, block splitting relies on one of smoothxg's dependencies. I've applied a fix to that dependency to avoid the compiler issue. Hopefully, with the latest Docker image (docker pull ghcr.io/pangenome/pggb:20210506084242ebe987), the example also works with block splitting enabled.

emilio-r commented on August 11, 2024

Hello again @AndreaGuarracino!
Running PGGB without the -I flag lets both samples run to completion, and the most recent update made the first sample (New_complete_reference_genomes.fa) complete properly when running with the -I flag.
However, the second sample still fails when -I is applied (first log below).

I have now also seen that running without the -Y flag (previously set to -Y "_") resolves this error in the second sample, as seen in the last log below.

Starting pggb on Thu May  6 12:03:53 CEST 2021

Command: /usr/local/bin/pggb -i ./New_indata/messy_genomes_concatenated.fa -s 500 -w 25000 -p 85 -n 9 -P 1,9,16,2,41,1 -t 10 -Y _ -v -I 0.6 -o ./runtest_6

PARAMETERS

general:
  input-fasta:        ./New_indata/messy_genomes_concatenated.fa
  output-dir:         ./runtest_6
  resume:             false
  pigz-compress:      false
  threads:            10
alignment:
  mapping-tool:       wfmash
  no-splits:          false
  segment-length:     500
  block-length:       1500
  no-merge-segments:  false
  map-pct-id:         85
  n-secondary:        9
  mash-kmer:          16
  exclude-delim:      _
seqwish:
  min-match-len:      19
  transclose-batch:   10000000
smoothxg:
  block-weight-max:   25000
  path-jump-max:      100
  edge-jump-max:      0
  poa-length-target:  5000
  poa-params:         1,9,16,2,41,1
  write-maf:          false
  consensus-prefix:   Consensus_
  consensus-spec:     false
  split-min-depth:    2000
  block-id-min:       0.6
  block-ratio-min:    0
odgi:
  viz:                true
  layout:             false
  stats:              false
reporting:
  multiqc:            false

Running pggb

[wfmash::map] Reference = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::map] Query = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::map] Kmer size = 16
[wfmash::map] Window size = 1
[wfmash::map] Segment length = 500 (read split allowed)
[wfmash::map] Block length min = 1500
[wfmash::map] Alphabet = DNA
[wfmash::map] Percentage identity threshold = 0.85%
[wfmash::map] Mapping output file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-NgEON1
[wfmash::map] Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
[wfmash::map] Execution threads  = 10
[wfmash::skch::Sketch::build] minimizers picked from reference = 16424393
[wfmash::skch::Sketch::index] unique minimizers = 5451643
[wfmash::skch::Sketch::computeFreqHist] Frequency histogram of minimizers = (1, 2430033) ... (364, 1)
[wfmash::skch::Sketch::computeFreqHist] With threshold 0.001%, ignore minimizers occurring >= 65 times during lookup.
[wfmash::map] time spent computing the reference index: 3.96236 sec
[wfmash::skch::Map::mapQuery] WARNING, no .fai index found for ./New_indata/messy_genomes_concatenated.fa, reading file to sum sequence length (slow)
[wfmash::skch::Map::mapQuery] mapped 100.00% @ 3.77e+05 bp/s elapsed: 00:00:00:43 remain: 00:00:00:00
[wfmash::skch::Map::mapQuery] count of mapped reads = 172, reads qualified for mapping = 174, total input reads = 174, total input bp = 16427933
[wfmash::map] time spent mapping the query: 4.35e+01 sec
[wfmash::map] mapping results saved in: /crex/proj/uppstore2017270/Emilio/pggb/wfmash-NgEON1
[wfmash::align] Reference = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::align] Query = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::align] Mapping file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-NgEON1
[wfmash::align] Alignment identity cutoff = 8.50e-01%
[wfmash::align] Alignment output file = /dev/stdout
[wfmash::align] time spent read the reference sequences: 6.13e-02 sec
[wfmash::align::computeAlignments] aligned 72.50% @ 1.56e+06 bp/s elapsed: 00:00:00:28 remain: 00:00:00:10
[wflign::wflign_affine_wavefront] corrupted traceback (out of bounds) for KH0001_contig44 0 Bra-LPB32_contig1 50740
Command exited with non-zero status 1
wfmash -Y _ -s 500 -l 1500 -p 85 -n 9 -k 16 -t 10 ./New_indata/messy_genomes_concatenated.fa ./New_indata/messy_genomes_concatenated.fa
561.94s user 11.76s system 744% cpu 77.01s total 1224768Kb max memory

Starting pggb on Thu May  6 12:14:57 CEST 2021

Command: /usr/local/bin/pggb -i ./New_indata/messy_genomes_concatenated.fa -s 500 -w 25000 -p 85 -n 9 -P 1,9,16,2,41,1 -t 10 -v -I 0.6 -o ./runtest_7

PARAMETERS

general:
  input-fasta:        ./New_indata/messy_genomes_concatenated.fa
  output-dir:         ./runtest_7
  resume:             false
  pigz-compress:      false
  threads:            10
alignment:
  mapping-tool:       wfmash
  no-splits:          false
  segment-length:     500
  block-length:       1500
  no-merge-segments:  false
  map-pct-id:         85
  n-secondary:        9
  mash-kmer:          16
  exclude-delim:      false
seqwish:
  min-match-len:      19
  transclose-batch:   10000000
smoothxg:
  block-weight-max:   25000
  path-jump-max:      100
  edge-jump-max:      0
  poa-length-target:  5000
  poa-params:         1,9,16,2,41,1
  write-maf:          false
  consensus-prefix:   Consensus_
  consensus-spec:     false
  split-min-depth:    2000
  block-id-min:       0.6
  block-ratio-min:    0
odgi:
  viz:                true
  layout:             false
  stats:              false
reporting:
  multiqc:            false

Running pggb

[wfmash::map] Reference = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::map] Query = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::map] Kmer size = 16
[wfmash::map] Window size = 1
[wfmash::map] Segment length = 500 (read split allowed)
[wfmash::map] Block length min = 1500
[wfmash::map] Alphabet = DNA
[wfmash::map] Percentage identity threshold = 0.85%
[wfmash::map] Mapping output file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-e4f40m
[wfmash::map] Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
[wfmash::map] Execution threads  = 10
[wfmash::skch::Sketch::build] minimizers picked from reference = 16424393
[wfmash::skch::Sketch::index] unique minimizers = 5451643
[wfmash::skch::Sketch::computeFreqHist] Frequency histogram of minimizers = (1, 2430033) ... (364, 1)
[wfmash::skch::Sketch::computeFreqHist] With threshold 0.001%, ignore minimizers occurring >= 65 times during lookup.
[wfmash::map] time spent computing the reference index: 4.22797 sec
[wfmash::skch::Map::mapQuery] WARNING, no .fai index found for ./New_indata/messy_genomes_concatenated.fa, reading file to sum sequence length (slow)
[wfmash::skch::Map::mapQuery] mapped 100.00% @ 3.75e+05 bp/s elapsed: 00:00:00:43 remain: 00:00:00:00
[wfmash::skch::Map::mapQuery] count of mapped reads = 172, reads qualified for mapping = 174, total input reads = 174, total input bp = 16427933
[wfmash::map] time spent mapping the query: 4.38e+01 sec
[wfmash::map] mapping results saved in: /crex/proj/uppstore2017270/Emilio/pggb/wfmash-e4f40m
[wfmash::align] Reference = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::align] Query = [./New_indata/messy_genomes_concatenated.fa]
[wfmash::align] Mapping file = /crex/proj/uppstore2017270/Emilio/pggb/wfmash-e4f40m
[wfmash::align] Alignment identity cutoff = 8.50e-01%
[wfmash::align] Alignment output file = /dev/stdout
[wfmash::align] time spent read the reference sequences: 6.04e-02 sec
[wfmash::align::computeAlignments] aligned 100.00% @ 1.47e+06 bp/s elapsed: 00:00:00:42 remain: 00:00:00:00
[wfmash::align::computeAlignments] count of mapped reads = 174, total aligned bp = 62037750
[wfmash::align] time spent computing the alignment: 4.24e+01 sec
[wfmash::align] alignment results saved in: /dev/stdout
wfmash -X -s 500 -l 1500 -p 85 -n 9 -k 16 -t 10 ./New_indata/messy_genomes_concatenated.fa ./New_indata/messy_genomes_concatenated.fa
672.81s user 15.24s system 756% cpu 90.91s total 1222708Kb max memory
[seqwish::seqidx] 0.005 indexing sequences
[seqwish::seqidx] 0.154 index built
[seqwish::alignments] 0.154 processing alignments
[seqwish::alignments] 0.637 indexing
[seqwish::alignments] 2.259 index built
[seqwish::transclosure] 2.267 computing transitive closures
[seqwish::transclosure] 2.301 0.00% 0-10000000 overlap_collect
[seqwish::transclosure] 3.541 0.00% 0-10000000 rank_build
[seqwish::transclosure] 3.871 0.00% 0-10000000 parallel_union_find
[seqwish::transclosure] 4.242 0.00% 0-10000000 dset_write
[seqwish::transclosure] 4.470 0.00% 0-10000000 dset_compression
[seqwish::transclosure] 4.723 0.00% 0-10000000 dset_sort
[seqwish::transclosure] 4.854 0.00% 0-10000000 dset_invert
[seqwish::transclosure] 5.078 0.00% 0-10000000 graph_emission
[seqwish::transclosure] 5.185 95.60% 10000021-16427933 overlap_collect
[seqwish::transclosure] 5.277 95.60% 10000021-16427933 rank_build
[seqwish::transclosure] 5.354 95.60% 10000021-16427933 parallel_union_find
[seqwish::transclosure] 5.361 95.60% 10000021-16427933 dset_write
[seqwish::transclosure] 5.381 95.60% 10000021-16427933 dset_compression
[seqwish::transclosure] 5.396 95.60% 10000021-16427933 dset_sort
[seqwish::transclosure] 5.417 95.60% 10000021-16427933 dset_invert
[seqwish::transclosure] 5.437 95.60% 10000021-16427933 graph_emission
[seqwish::transclosure] 7.858 100.00% building node_iitree and path_iitree indexes
[seqwish::transclosure] 9.649 100.00% done
[seqwish::transclosure] 9.649 done with transitive closures
[seqwish::compact] 9.649 compacting nodes
[seqwish::compact] 10.163 done compacting
[seqwish::compact] 10.188 built node index
[seqwish::links] 10.188 finding graph links
[seqwish::links] 11.347 links derived
[seqwish::gfa] 11.347 writing graph
[seqwish::gfa] 13.570 done
seqwish -t 10 -s ./New_indata/messy_genomes_concatenated.fa -p ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.wfmash.paf -k 19 -g ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.34ee7b1.seqwish.gfa -B 10000000 -P
34.11s user 5.68s system 290% cpu 13.68s total 868564Kb max memory
[smoothxg::main] loading graph
[smoothxg::main] prepping graph for smoothing
[odgi::gfa_to_handle] building nodes: 100.00% @ 9.23e+05/s elapsed: 00:00:00:00 remain: 00:00:00:00
[odgi::gfa_to_handle] building edges: 100.00% @ 9.38e+05/s elapsed: 00:00:00:01 remain: 00:00:00:00
[odgi::gfa_to_handle] building paths: 100.00% @ 1.86e+02/s elapsed: 00:00:00:00 remain: 00:00:00:00
[smoothxg::prep] building path index
[smoothxg::prep] sorting graph
[odgi::path_linear_sgd] calculating linear SGD schedule (1.31e-11 1.00e+00 100 0 1.00e-02)
[odgi::path_linear_sgd] calculating zetas for 10027 zipf distributions
[odgi::path_linear_sgd] 1D path-guided SGD: 100.00% @ 5.07e+06/s elapsed: 00:00:00:51 remain: 00:00:00:00
[odgi::groom] grooming: 100.00% @ 1.85e+06/s elapsed: 00:00:00:00 remain: 00:00:00:00
[odgi::groom] organizing handles: 100.00% @ 1.85e+06/s elapsed: 00:00:00:00 remain: 00:00:00:00
[odgi::groom] flipped 231632 handles
[odgi::topological_order] sorting nodes: 100.00% @ 1.09e+05/s elapsed: 00:00:00:04 remain: 00:00:00:00
[smoothxg::prep] chopping graph to 100
[odgi::chop] 1243 node(s) to chop.
[smoothxg::prep] writing graph ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.34ee7b1.seqwish.gfa.prep.gfa
[smoothxg::main] building xg index
[smoothxg::smoothable_blocks] computing blocks
[smoothxg::smoothable_blocks] computing blocks for 468703 handles: 100.00% @ 3.60e+04/s elapsed: 00:00:00:13 remain: 00:00:00:00
[smoothxg::break_and_split_blocks] cutting blocks that contain sequences longer than max-poa-length (10000) and depth >= 2000
[smoothxg::break_and_split_blocks] splitting 1217 blocks at identity 0.600 (WFA-based clustering) and at estimated-identity 0.600 (mash-based clustering)
[smoothxg::break_and_split_blocks] cutting and splitting 1217 blocks: 100.00% @ 3.72e+02/s elapsed: 00:00:00:03 remain: 00:00:00:00
[smoothxg::break_and_split_blocks] cut 0 blocks of which 0 had repeats
[smoothxg::break_and_split_blocks] split 33 blocks
[smoothxg::smooth_and_lace] applying global abPOA to 1265 blocks: 100.00% @ 5.32e+01/s elapsed: 00:00:00:23 remain: 00:00:00:00
[smoothxg::smooth_and_lace] flipping 0 block graphs: 100.00% @ 1.62e+05/s elapsed: 00:00:00:00 remain: 00:00:00:00
[smoothxg::smooth_and_lace] sorting path_mappings
[smoothxg::smooth_and_lace] adding edges from 1265 graphs: 100.00% @ 2.53e+03/s elapsed: 00:00:00:00 remain: 00:00:00:00
[smoothxg::smooth_and_lace] embedding 10630 path fragments: 100.00% @ 2.02e+03/s elapsed: 00:00:00:05 remain: 00:00:00:00
[smoothxg::smooth_and_lace] validating 174 path sequences: 100.00% @ 2.24e+02/s elapsed: 00:00:00:00 remain: 00:00:00:00
[smoothxg::smooth_and_lace] adding nodes from 1265 graphs: 100.00% @ 1.89e+02/s elapsed: 00:00:00:06 remain: 00:00:00:00
[smoothxg::smooth_and_lace] walking edges in 174 paths: 100.00% @ 1.57e+02/s elapsed: 00:00:00:01 remain: 00:00:00:00
[smoothxg::main] unchopping smoothed graph
[odgi::unchop] unchopped 226 nodes into 110 new nodes.
[smoothxg::main] smoothed graph length 2573868bp in 703525 nodes
[smoothxg::main] writing smoothed graph to ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.34ee7b1.545adf9.smooth.gfa
smoothxg -t 10 -g ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.34ee7b1.seqwish.gfa -w 25000 -K -d 2000 -I 0.6 -R 0 -j 100 -e 0 -l 5000 -p 1,9,16,2,41,1 -Q Consensus_ -V -o ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.34ee7b1.545adf9.smooth.gfa
736.65s user 73.30s system 631% cpu 128.18s total 1544656Kb max memory
[odgi::gfa_to_handle] building nodes: 100.00% @ 7.02e+05/s elapsed: 00:00:00:01 remain: 00:00:00:00
[odgi::gfa_to_handle] building edges: 100.00% @ 1.29e+06/s elapsed: 00:00:00:00 remain: 00:00:00:00
[odgi::gfa_to_handle] building paths: 100.00% @ 1.39e+02/s elapsed: 00:00:00:01 remain: 00:00:00:00
odgi build -t 10 -P -g ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.34ee7b1.545adf9.smooth.gfa -o ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.34ee7b1.545adf9.smooth.og
9.41s user 1.02s system 214% cpu 4.87s total 360604Kb max memory
odgi viz -i ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.34ee7b1.545adf9.smooth.og -o ./runtest_7/messy_genomes_concatenated.fa.adfd3d2.34ee7b1.545adf9.smooth.og.viz_mqc.png -x 1500 -y 500 -P 10 -I Consensus_
1.66s user 0.19s system 99% cpu 1.86s total 265944Kb max memory

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

Thank you again @emilio-r for reporting that. It seems the signal-4 error is fixed then. I am working on catching the new one!

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

I confirm that there was a problem caused by the short sequences in your input. It has been fixed in wfmash, and the pggb docker image has been updated accordingly (docker pull ghcr.io/pangenome/pggb:20210506141530af4e13). Will we be able to offer you a complete error-free run?

emilio-r avatar emilio-r commented on August 11, 2024

Hello again, @AndreaGuarracino!
Yes, I just tested the latest build you suggested and that seems to have fixed running this sample with both -I and -Y. Thank you for the assistance!

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

You're welcome, thank you for your patience!

rtcz avatar rtcz commented on August 11, 2024

Hi! I have installed pggb in conda on two servers and on both of them I get the error.
Any help would be appreciated.

I have the newest version:
pggb 0.3.0 hdfd78af_1 bioconda

log file:

Starting pggb on Wed Apr 20 13:15:40 CEST 2022

Command: /data/tools/miniconda3/envs/pangenomics/bin/pggb --input-fasta run-78_analysis.fa --output output --map-pct-id 95 --segment-length 5000 --n-mappings 96 --threads 8

PARAMETERS

general:
  input-fasta:        run-78_analysis.fa
  output-dir:         output
  resume:             false
  pigz-compress:      false
  threads:            8
wfmash:
  mapping-tool:       wfmash
  no-splits:          false
  segment-length:     5000
  block-length:       false
  no-merge-segments:  false
  map-pct-id:         95
  n-mappings:         96
  mash-kmer:          false
  exclude-delim:      false
seqwish:
  min-match-len:      47
  transclose-batch:   10000000
smoothxg:
  n-haps:             96
  block_id_min:       .9500
  path-jump-max:      0
  edge-jump-max:      0
  poa-length-target:  4001,4507
  poa-params:         -P 1,19,39,3,81,1
  write-maf:          false
  consensus-prefix:   Consensus_
  consensus-spec:     false
  pad-max-depth:      100
  block-id-min:       .9500
  block-ratio-min:    0
  poa_threads:        8
  poa_padding:        0.03
odgi:
  viz:                true
  layout:             true
  stats:              false
gfaffix:
  normalize:          true
vg:
  deconstruct:        false  
reporting:
  multiqc:            false

Running pggb

Command terminated by signal 4
wfmash -X -s 5000 -p 95 -n 96 -t 8 run-78_analysis.fa run-78_analysis.fa
0.00s user 0.00s system 2% cpu 0.16s total 8720Kb max memory
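
For readers unfamiliar with the message: on Linux, signal 4 is SIGILL (illegal instruction), and the shell can decode the number itself. The lines below are a small sketch, not part of pggb; they rely only on the shell's standard `kill -l` and on shell arithmetic.

```shell
# Translate a signal number into its name with the shell built-in.
kill -l 4          # prints: ILL  (SIGILL, "illegal instruction")

# A command killed by signal N is reported by the shell with exit
# status 128 + N, so a SIGILL crash shows up as 128 + 4 = 132.
echo $((128 + 4))  # prints: 132
```

An exit status of 132 from any of the pipeline steps therefore points at the same kind of crash as the "Command terminated by signal 4" line in the log.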

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

@rtcz, PGGB on bioconda still had problems. Could you please try again with the latest version released today, v0.3.1?

rtcz avatar rtcz commented on August 11, 2024

It works now, thank you!

michael-olufemi avatar michael-olufemi commented on August 11, 2024

Hello, I am getting this error:

Command terminated by signal 4
smoothxg -t 16 -T 16 -g ../pggb_output/in.fa.cff9e6a.417fcdf.seqwish.gfa -w 2800 -b ../pggb_output -X 100 -I .9000 -R 0 -j 0 -e 0 -l 700 -P 1,19,39,3,81,1 -O 0.001 -Y 400 -d 0 -D 0 -S -V -o ../pggb_output/in.fa.cff9e6a.417fcdf.53439a3.smooth.1.gfa

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

@michael-olufemi, we would need much more information to understand what is going on.

Would you please specify the versions for all tools? And describe your installation and system?

michael-olufemi avatar michael-olufemi commented on August 11, 2024

@AndreaGuarracino
The version is pggb/0.4.1. It was installed through singularity on a high performance cluster by the HPC admin, so I don't know many details. This is the command line I am using:

singularity exec $PGGBIMG pggb -i ../data/in.fa -n 4 -o ../pggb_output4 -r -t 16 -p 90 -s 5000 -V 'ref:#:1000'

I have also attached my log file
pggb.log

Thanks

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

Interesting, I was expecting the error in a slightly different place in the log. Could you please also specify the CPU and operating system of your high performance cluster?
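
The details requested here can be gathered with a few standard commands. The sketch below is an illustration only: `has_cpu_flag` is a hypothetical helper name, it reads the x86 `flags` line of `/proc/cpuinfo` (so it is Linux-only), and the list of instruction-set extensions it probes is an assumption about what might matter, not something stated by the pggb developers.

```shell
# has_cpu_flag FLAG [FLAGS_LINE]
# Succeeds if FLAG appears as a whole word in FLAGS_LINE.
# FLAGS_LINE defaults to the first "flags" line of /proc/cpuinfo.
has_cpu_flag() {
    flag=$1
    flags=${2:-$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null || true)}
    case " $flags " in
        *" $flag "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Kernel name, release and machine architecture.
uname -srm

# Distribution name, when the file exists.
if [ -r /etc/os-release ]; then
    grep '^PRETTY_NAME=' /etc/os-release || true
fi

# CPU model as reported by the kernel (first core only).
if [ -r /proc/cpuinfo ]; then
    grep -m1 '^model name' /proc/cpuinfo || true
fi

# Report a few SIMD extensions (assumed to be the relevant ones).
for f in sse4_2 avx avx2 avx512f; do
    if has_cpu_flag "$f"; then
        echo "$f: yes"
    else
        echo "$f: no"
    fi
done
```

A SIGILL crash is typically raised when a binary executes an instruction the CPU does not implement, so comparing these flags between the machine that built the binary and the node that crashes is a reasonable first check on a cluster with mixed hardware.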

michael-olufemi avatar michael-olufemi commented on August 11, 2024

The UMass Shared Green High Performance Computing Cluster is located in Holyoke MA and provides computing to the five University of Massachusetts Campuses and includes equipment (cores and storage) contributed by each UMass campus as well as jointly funded components (Network, Scheduler Software, and support staff). It is composed of 374 processing nodes with a total of 13968 cores, Dell/EMC high performance Isilon storage with ~1.4PB Isilon H500 and ~1.5PB Isilon A200. The cluster hardware consists of an Infiniband (IB) network composed of Mellanox FDR and EDR switching islands, a bridged 10GE network for storage and some IP traffic offloading, qty two (2) GPU nodes (Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz with 384GB RAM) with four Tesla V100 devices per node configured with NVLINK, qty three (3) GPU nodes (Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz with 192GB RAM) with three Tesla V100 devices per node, qty six (6) GPU nodes (Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz with 256GB RAM) with four Tesla K80 GPU per node, qty three (3) GPU nodes (Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz with 256GB RAM) with two Tesla C2075 GPU per node, qty eighty-eight (88) nodes with AMD Opteron(tm) Processor 6380 with 64 cores / 512GB RAM each, qty forty (40) nodes with AMD Opteron 6278 with 64 cores / 512GB RAM each, qty twenty four (24) nodes with AMD Opteron 6378 with 64 cores / 512GB RAM each, qty eighty (80) Intel(R) Xeon(R) CPU E5-2650 v3 with 20 cores / 128GB RAM each, qty thirty two (32) Intel Xeon E5-2650 with 16 cores / 192GB RAM each, qty sixteen (16) Intel Xeon E5-2650 v3 with 20 cores / 128GB RAM each, qty thirty two (32) Intel Xeon Gold-6140 with 36 cores / 192GB RAM each, qty forty-eight (48) Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz with 36 cores / 192GB RAM each. The HPC environment runs the IBM LSF scheduling software for job management.
The Massachusetts Green High Performance Computer Center (MGHPCC) facility has space, power, and cooling capacity to support 680 racks of computing and storage equipment, drawing up to 15MW of power. High speed network connections to the facility are available via dark fiber, the Northern Crossroads, and Internet 2. The MGHPCC facility has been awarded LEED Platinum status.

https://www.umassrc.org/wiki/index.php/Main_Page
It is rather long, but I hope it helps.

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

@michael-olufemi, I think I understand the problem. It is related to the portability of a tool used in pggb (SPOA in particular). I tackled this issue in #223, after the release of pggb 0.4.1.

Could you ask your admin to update pggb to version 0.5.0, try again, and let me know, please?

michael-olufemi avatar michael-olufemi commented on August 11, 2024

@AndreaGuarracino, I just tried it with version 0.5.0 and I am still getting this error; this time it came even earlier:

Command terminated by signal 4
seqwish -s ../data/in.fa -p ../pggb_output4/in.fa.6a0427f.alignments.wfmash.paf -k 19 -f 0 -g ../pggb_output4/in.fa.6a0427f.417fcdf.seqwish.gfa -B 10000000 -t 16 --temp-dir ../pggb_output4 -P
0.00s user 0.09s system 13% cpu 0.72s total 11264Kb max memory

ekg avatar ekg commented on August 11, 2024

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

@michael-olufemi, thank you for your second trial. Now the error pops up with seqwish, but there is still hope. I have just released pggb 0.5.1 which incorporates a very fresh and potentially important update that involves seqwish and its portability.

Would you mind bothering your HPC administrator again to update pggb to version 0.5.1 and try again, please? If he can work only with bioconda, and not with Docker/Singularity, I still need to make a few changes there. I will ping you when the bioconda release is complete.

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

@michael-olufemi, pggb 0.5.1 is also on bioconda now: https://anaconda.org/bioconda/pggb

michael-olufemi avatar michael-olufemi commented on August 11, 2024

@AndreaGuarracino

The bioconda version didn't work at all; it terminated at wfmash:

Command terminated by signal 4
wfmash -X -s 5000 -p 90 -n 4 -t 16 ../data/in.fa ../data/in.fa
0.12s user 0.67s system 60% cpu 1.31s total 16192Kb max memory

I will try to take @ekg's advice and see if it works on my local machine.

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

Argh, the problem keeps moving around. I have also pushed a tweak for wfmash on bioconda: the same version number, but a different build. This is my last bullet/hope. I apologize for my multiple requests. @michael-olufemi, if you still have the energy to bother your admin, make sure that he installs the latest wfmash build listed here: https://anaconda.org/bioconda/wfmash/files. Otherwise, please let us know if building pggb locally worked (it should).

alexzaccaron avatar alexzaccaron commented on August 11, 2024

I am also having the same Command terminated by signal 4 problem with the latest release of pggb on Bioconda. But a fresh local installation worked fine for me. Here is the code I used to install locally.

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

It hurts. @alexzaccaron, could you please report which version of pggb on bioconda you used and which step failed? The version of the tool that failed would be useful too.

alexzaccaron avatar alexzaccaron commented on August 11, 2024

@AndreaGuarracino, I am having the issue while using pggb v0.5.1 in a snakemake pipeline. Or at least that's the version I ask to install in the yml file I give to snakemake. I don't know why, but when I run /path/to/pggb --version I get fatal: not a git repository (or any of the parent directories): .git. In any case, the command I run is:

pggb -i data/in.fasta.gz -o output/01pggb/p95_s20k_k29 -t 8 -n 5 --run-abpoa

The output to terminal is:

[wfmash::map] Reference = [data/in.fasta.gz]
[wfmash::map] Query = [data/in.fasta.gz]
[wfmash::map] Kmer size = 19
[wfmash::map] Window size = 136
[wfmash::map] Segment length = 5000 (read split allowed)
[wfmash::map] Block length min = 25000
[wfmash::map] Chaining gap max = 100000
[wfmash::map] Percentage identity threshold = 90%
[wfmash::map] Skip self mappings
[wfmash::map] Mapping output file = /dev/stdout
[wfmash::map] Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
[wfmash::map] Execution threads  = 8
[wfmash::skch::Sketch::build] minimizers picked from reference = 155355
[wfmash::skch::Sketch::index] unique minimizers = 26997
[wfmash::skch::Sketch::computeFreqHist] Frequency histogram of minimizers = (1, 783) ... (92, 1)
[wfmash::skch::Sketch::computeFreqHist] With threshold 0.001%, consider all minimizers during lookup.
[wfmash::map] time spent computing the reference index: 0.780624 sec
[wfmash::skch::Map::mapQuery] mapped 100.00% @ 1.01e+06 bp/s elapsed: 00:00:00:10 remain: 00:00:00:00
[wfmash::skch::Map::mapQuery] count of mapped reads = 4, reads qualified for mapping = 5, total input reads = 5, total input bp = 10565283
[wfmash::map] time spent mapping the query: 1.05e+01 sec
[wfmash::map] mapping results saved in: /dev/stdout
wfmash -s 5000 -l 25000 -p 90 -n 4 -k 19 -H 0.001 -X -t 8 --tmp-base output/01pggb/p95_s20k_k29 data/in.fasta.gz --approx-map
53.13s user 0.08s system 470% cpu 11.31s total 44784Kb max memory
[wfmash::align] Reference = [data/in.fasta.gz]
[wfmash::align] Query = [data/in.fasta.gz]
[wfmash::align] Mapping file = output/01pggb/p95_s20k_k29/wfmash-gKTpcl
[wfmash::align] Alignment identity cutoff = 0%
[wfmash::align] Alignment output file = /dev/stdout
[wfmash::align] time spent loading the reference index: 0.000532299 sec
[wfmash::align::computeAlignments] aligned  0.00% @ 0.00e+00 bp/s elapsed: 00:00:00:00 remain: 00:00:00:00
Command terminated by signal 4
wfmash -s 5000 -l 25000 -p 90 -n 4 -k 19 -H 0.001 -X -t 8 --tmp-base output/01pggb/p95_s20k_k29 data/in.fasta.gz -i output/01pggb/p95_s20k_k29/in.fasta.gz.fd8c760.mappings.wfmash.paf
0.14s user 0.04s system 59% cpu 0.31s total 24420Kb max memory

Here are the versions of the dependencies that conda installed:

wfmash v0.10.0-4-gcf9bfb0
seqwish v0.7.7
smoothxg v0.6.7
odgi v0.8.1-1-ge91b1cd "Piccino"
gfaffix 0.1.4
vg version v1.40.0 "Suardi"

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

Thank you, @alexzaccaron. I am able to reproduce the pggb --version problem. Could you please show me what you get with conda list? That would be very helpful, as I would see the exact packages' versions in your environment.
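
Since a full `conda list` is long, the rows that matter for this issue can be filtered out first. The sketch below is an illustration under assumptions: `pggb_tool_versions` is a hypothetical helper name, and the set of package names is taken from the dependency versions quoted earlier in this thread (wfmash, seqwish, smoothxg, odgi, gfaffix, vg) plus pggb itself.

```shell
# pggb_tool_versions: read `conda list` output on stdin and print the
# name, version and build string of the pggb-related packages only.
pggb_tool_versions() {
    awk '$1 ~ /^(pggb|wfmash|seqwish|smoothxg|odgi|gfaffix|vg)$/ { print $1, $2, $3 }'
}

# Usage, inside the activated environment:
#   conda list | pggb_tool_versions
```

The build string (third column) is kept on purpose, because two bioconda packages can share a version number and still differ in build.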

alexzaccaron avatar alexzaccaron commented on August 11, 2024

Thank you @AndreaGuarracino for looking into this. Seems like I do have pggb v0.5.1 build hdfd78af_0. Here is the output of conda list:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
bc                        1.07.1               h7f98852_0    conda-forge
bcftools                  1.16                 hfe4b78e_1    bioconda
brotli                    1.0.9                h166bdaf_8    conda-forge
brotli-bin                1.0.9                h166bdaf_8    conda-forge
brotlipy                  0.7.0           py310h5764c6d_1005    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2022.9.24            ha878542_0    conda-forge
certifi                   2022.9.24          pyhd8ed1ab_0    conda-forge
cffi                      1.15.1          py310h255011f_2    conda-forge
charset-normalizer        2.1.1              pyhd8ed1ab_0    conda-forge
click                     8.1.3           py310hff52083_1    conda-forge
coloredlogs               15.0.1          py310hff52083_2    conda-forge
colormath                 3.0.0                      py_2    conda-forge
commonmark                0.9.1                      py_0    conda-forge
contourpy                 1.0.6           py310hbf28c38_0    conda-forge
cryptography              38.0.3          py310h597c629_0    conda-forge
cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
dataclasses               0.8                pyhc8e2a94_3    conda-forge
fonttools                 4.38.0          py310h5764c6d_1    conda-forge
freetype                  2.12.1               hca18f0e_0    conda-forge
future                    0.18.2          py310hff52083_5    conda-forge
gfaffix                   0.1.4                hec16e2b_0    bioconda
gsl                       2.7                  he838d99_0    conda-forge
htslib                    1.16                 h6bc39ce_0    bioconda
humanfriendly             10.0            py310hff52083_4    conda-forge
idna                      2.10               pyh9f0ad1d_0    conda-forge
importlib-metadata        5.0.0              pyha770c72_1    conda-forge
jinja2                    3.1.2              pyhd8ed1ab_1    conda-forge
jpeg                      9e                   h166bdaf_2    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.4           py310hbf28c38_1    conda-forge
krb5                      1.19.3               h3790be6_0    conda-forge
lcms2                     2.14                 h6ed2654_0    conda-forge
ld_impl_linux-64          2.39                 hc81fddc_0    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libblas                   3.9.0           16_linux64_openblas    conda-forge
libbrotlicommon           1.0.9                h166bdaf_8    conda-forge
libbrotlidec              1.0.9                h166bdaf_8    conda-forge
libbrotlienc              1.0.9                h166bdaf_8    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libcurl                   7.86.0               h7bff187_1    conda-forge
libdeflate                1.13                 h166bdaf_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
libjemalloc               5.2.1                h9c3ff4c_6    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libnghttp2                1.47.0               hdcd2b5c_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libpng                    1.6.38               h753d276_0    conda-forge
libsqlite                 3.39.4               h753d276_0    conda-forge
libssh2                   1.10.0               haa6b8db_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libtiff                   4.4.0                h0e0dad5_3    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libwebp-base              1.2.4                h166bdaf_0    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
lzstring                  1.0.4                   py_1001    conda-forge
markdown                  3.4.1              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.1           py310h5764c6d_2    conda-forge
matplotlib-base           3.6.2           py310h8d5ebf3_0    conda-forge
multiqc                   1.13               pyhdfd78af_0    bioconda
munkres                   1.0.7                      py_1    bioconda
ncurses                   6.3                  h27087fc_1    conda-forge
networkx                  2.8.8              pyhd8ed1ab_0    conda-forge
numpy                     1.23.4          py310h53a5b5f_1    conda-forge
odgi                      0.8.1           py310hc8f18ef_0    bioconda
openjpeg                  2.5.0                h7d73246_1    conda-forge
openssl                   1.1.1s               h166bdaf_0    conda-forge
packaging                 21.3               pyhd8ed1ab_0    conda-forge
perl                      5.32.1          2_h7f98852_perl5    conda-forge
pggb                      0.5.1                hdfd78af_0    bioconda
pigz                      2.6                  h27826a3_0    conda-forge
pillow                    9.2.0           py310h454ad03_3    conda-forge
pip                       22.3.1             pyhd8ed1ab_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pygments                  2.13.0             pyhd8ed1ab_0    conda-forge
pyopenssl                 22.1.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.0.9              pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1           py310hff52083_5    conda-forge
python                    3.10.6          h582c2e5_0_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    2_cp310    conda-forge
pyyaml                    6.0             py310h5764c6d_5    conda-forge
readline                  8.1.2                h0f457ee_0    conda-forge
requests                  2.28.1             pyhd8ed1ab_1    conda-forge
rich                      12.6.0             pyhd8ed1ab_0    conda-forge
rich-click                1.5.2              pyhd8ed1ab_0    conda-forge
seqwish                   0.7.7                h5b5514e_0    bioconda
setuptools                65.5.1             pyhd8ed1ab_0    conda-forge
simplejson                3.17.6          py310h5764c6d_2    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
smoothxg                  0.6.7                hfb1f815_0    bioconda
spectra                   0.0.11                     py_1    conda-forge
tabix                     1.11                 hdfd78af_0    bioconda
time                      1.8                  h516909a_0    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
typing_extensions         4.4.0              pyha770c72_0    conda-forge
tzdata                    2022f                h191b570_0    conda-forge
unicodedata2              15.0.0          py310h5764c6d_0    conda-forge
urllib3                   1.26.11            pyhd8ed1ab_0    conda-forge
vg                        1.40.0               h9ee0642_0    bioconda
wfmash                    0.10.0               hfdddef0_1    bioconda
wheel                     0.38.4             pyhd8ed1ab_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zipp                      3.10.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h6239696_4    conda-forge

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

@alexzaccaron, that is very useful! You have the latest wfmash version, which I was hoping would solve the problem. I can't reproduce wfmash's signal 4 right now. Which operating system and CPU do you have?

alexzaccaron avatar alexzaccaron commented on August 11, 2024

@AndreaGuarracino, I am using Ubuntu 20.04.5 LTS, and the processor:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   46 bits physical, 48 bits virtual
CPU(s):                          24
On-line CPU(s) list:             0-23
Thread(s) per core:              2
Core(s) per socket:              6
Socket(s):                       2
NUMA node(s):                    2
Vendor ID:                       GenuineIntel
CPU family:                      6
Model:                           45
Model name:                      Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

Hi @alexzaccaron, I have just updated wfmash on bioconda. Hopefully it solves your Signal 4 problem.

wfmash 0.10.0 hfdddef0_2 bioconda

(note the '_2' in the build number). If you can, please try again with this wfmash version and let me know.

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

(@alexzaccaron, if you can update pggb too, I have also fixed the pggb --version problem on bioconda: pggb 0.5.1 hdfd78af_1 bioconda; note the '_1' in the build number.)
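
Because the fixes here differ only in the build string, it can help to confirm that the environment really contains the intended build before re-running. The sketch below is a hypothetical helper (`conda_build_is` is not a conda command); it only parses the text that `conda list` prints.

```shell
# conda_build_is NAME BUILD
# Reads `conda list` output on stdin; succeeds only if package NAME
# is present and its build string (third column) equals BUILD.
conda_build_is() {
    awk -v name="$1" -v build="$2" '
        $1 == name { ok = ($3 == build) }
        END { exit(ok ? 0 : 1) }
    '
}

# Usage, inside the activated environment:
#   conda list | conda_build_is wfmash hfdddef0_2 && echo "wfmash build OK"
#   conda list | conda_build_is pggb   hdfd78af_1 && echo "pggb build OK"
```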

alexzaccaron avatar alexzaccaron commented on August 11, 2024

@AndreaGuarracino, unfortunately I am still getting the signal 4 problem when using the Bioconda version, even with your latest update.

conda create -y -n pggb pggb=0.5.1=hdfd78af_1 wfmash=0.10.0=hfdddef0_2
conda activate pggb
pggb -i data/HLA/DRB1-3123.fa.gz -p 70 -s 500 -n 10 -t 16 -V 'gi|568815561:#' -o out -M

Output to terminal:

[wfmash::map] Reference = [data/HLA/DRB1-3123.fa.gz]
[wfmash::map] Query = [data/HLA/DRB1-3123.fa.gz]
[wfmash::map] Kmer size = 19
[wfmash::map] Window size = 19
[wfmash::map] Segment length = 500 (read split allowed)
[wfmash::map] Block length min = 25000
[wfmash::map] Chaining gap max = 100000
[wfmash::map] Percentage identity threshold = 70%
[wfmash::map] Skip self mappings
[wfmash::map] Mapping output file = /dev/stdout
[wfmash::map] Filter mode = 1 (1 = map, 2 = one-to-one, 3 = none)
[wfmash::map] Execution threads  = 16
[wfmash::skch::Sketch::build] minimizers picked from reference = 17592
[wfmash::skch::Sketch::index] unique minimizers = 4432
[wfmash::skch::Sketch::computeFreqHist] Frequency histogram of minimizers = (1, 34) ... (22, 2)
[wfmash::skch::Sketch::computeFreqHist] With threshold 0.001%, consider all minimizers during lookup.
[wfmash::map] time spent computing the reference index: 0.0830125 sec
[wfmash::skch::Map::mapQuery] mapped 100.00% @ 3.24e+05 bp/s elapsed: 00:00:00:00 remain: 00:00:00:00
[wfmash::skch::Map::mapQuery] count of mapped reads = 11, reads qualified for mapping = 12, total input reads = 12, total input bp = 163416
[wfmash::map] time spent mapping the query: 5.07e-01 sec
[wfmash::map] mapping results saved in: /dev/stdout
wfmash -s 500 -l 25000 -p 70 -n 9 -k 19 -H 0.001 -X -t 16 --tmp-base out data/HLA/DRB1-3123.fa.gz --approx-map
2.92s user 1.54s system 732% cpu 0.61s total 21384Kb max memory
[wfmash::align] Reference = [data/HLA/DRB1-3123.fa.gz]
[wfmash::align] Query = [data/HLA/DRB1-3123.fa.gz]
[wfmash::align] Mapping file = out/wfmash-65xxwE
[wfmash::align] Alignment identity cutoff = 0%
[wfmash::align] Alignment output file = /dev/stdout
[wfmash::align] time spent loading the reference index: 0.00216376 sec
[wfmash::align::computeAlignments] aligned 100.00% @ 7.06e+05 bp/s elapsed: 00:00:00:01 remain: 00:00:00:00
[wfmash::align::computeAlignments] count of mapped reads = 12, total aligned bp = 708095
[wfmash::align] time spent computing the alignment: 1.01e+00 sec
[wfmash::align] alignment results saved in: /dev/stdout
wfmash -s 500 -l 25000 -p 70 -n 9 -k 19 -H 0.001 -X -t 16 --tmp-base out data/HLA/DRB1-3123.fa.gz -i out/DRB1-3123.fa.gz.f38ac34.mappings.wfmash.paf
6.63s user 1.29s system 770% cpu 1.02s total 141776Kb max memory
Command terminated by signal 4
seqwish -s data/HLA/DRB1-3123.fa.gz -p out/DRB1-3123.fa.gz.f38ac34.alignments.wfmash.paf -k 19 -f 0 -g out/DRB1-3123.fa.gz.f38ac34.417fcdf.seqwish.gfa -B 10000000 -t 16 --temp-dir out -P
0.00s user 0.00s system 1% cpu 0.22s total 5108Kb max memory

Output of conda list:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
bc                        1.07.1               h7f98852_0    conda-forge
bcftools                  1.16                 hfe4b78e_1    bioconda
brotli                    1.0.9                h166bdaf_8    conda-forge
brotli-bin                1.0.9                h166bdaf_8    conda-forge
brotlipy                  0.7.0           py310h5764c6d_1005    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2022.9.24            ha878542_0    conda-forge
certifi                   2022.9.24          pyhd8ed1ab_0    conda-forge
cffi                      1.15.1          py310h255011f_2    conda-forge
charset-normalizer        2.1.1              pyhd8ed1ab_0    conda-forge
click                     8.1.3           unix_pyhd8ed1ab_2    conda-forge
coloredlogs               15.0.1             pyhd8ed1ab_3    conda-forge
colormath                 3.0.0                      py_2    conda-forge
commonmark                0.9.1                      py_0    conda-forge
contourpy                 1.0.6           py310hbf28c38_0    conda-forge
cryptography              38.0.3          py310h597c629_0    conda-forge
cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
dataclasses               0.8                pyhc8e2a94_3    conda-forge
fonttools                 4.38.0          py310h5764c6d_1    conda-forge
freetype                  2.12.1               hca18f0e_0    conda-forge
future                    0.18.2             pyhd8ed1ab_6    conda-forge
gfaffix                   0.1.4                hec16e2b_0    bioconda
gsl                       2.7                  he838d99_0    conda-forge
htslib                    1.16                 h6bc39ce_0    bioconda
humanfriendly             10.0            py310hff52083_4    conda-forge
idna                      2.10               pyh9f0ad1d_0    conda-forge
importlib-metadata        5.0.0              pyha770c72_1    conda-forge
jinja2                    3.1.2              pyhd8ed1ab_1    conda-forge
jpeg                      9e                   h166bdaf_2    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.4           py310hbf28c38_1    conda-forge
krb5                      1.19.3               h3790be6_0    conda-forge
lcms2                     2.14                 h6ed2654_0    conda-forge
ld_impl_linux-64          2.39                 hc81fddc_0    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libblas                   3.9.0           16_linux64_openblas    conda-forge
libbrotlicommon           1.0.9                h166bdaf_8    conda-forge
libbrotlidec              1.0.9                h166bdaf_8    conda-forge
libbrotlienc              1.0.9                h166bdaf_8    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libcurl                   7.86.0               h7bff187_1    conda-forge
libdeflate                1.13                 h166bdaf_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
libjemalloc               5.2.1                h9c3ff4c_6    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libnghttp2                1.47.0               hdcd2b5c_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libpng                    1.6.38               h753d276_0    conda-forge
libsqlite                 3.39.4               h753d276_0    conda-forge
libssh2                   1.10.0               haa6b8db_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libtiff                   4.4.0                h0e0dad5_3    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libwebp-base              1.2.4                h166bdaf_0    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
lzstring                  1.0.4                   py_1001    conda-forge
markdown                  3.4.1              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.1           py310h5764c6d_2    conda-forge
matplotlib-base           3.6.2           py310h8d5ebf3_0    conda-forge
multiqc                   1.13               pyhdfd78af_0    bioconda
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
ncurses                   6.3                  h27087fc_1    conda-forge
networkx                  2.8.8              pyhd8ed1ab_0    conda-forge
numpy                     1.23.4          py310h53a5b5f_1    conda-forge
odgi                      0.8.1           py310hc8f18ef_0    bioconda
openjpeg                  2.5.0                h7d73246_1    conda-forge
openssl                   1.1.1s               h166bdaf_0    conda-forge
packaging                 21.3               pyhd8ed1ab_0    conda-forge
perl                      5.32.1          2_h7f98852_perl5    conda-forge
pggb                      0.5.1                hdfd78af_1    bioconda
pigz                      2.6                  h27826a3_0    conda-forge
pillow                    9.2.0           py310h454ad03_3    conda-forge
pip                       22.3.1             pyhd8ed1ab_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pygments                  2.13.0             pyhd8ed1ab_0    conda-forge
pyopenssl                 22.1.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.0.9              pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
python                    3.10.6          h582c2e5_0_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    2_cp310    conda-forge
pyyaml                    6.0             py310h5764c6d_5    conda-forge
readline                  8.1.2                h0f457ee_0    conda-forge
requests                  2.28.1             pyhd8ed1ab_1    conda-forge
rich                      12.6.0             pyhd8ed1ab_0    conda-forge
rich-click                1.5.2              pyhd8ed1ab_0    conda-forge
seqwish                   0.7.7                h5b5514e_0    bioconda
setuptools                65.5.1             pyhd8ed1ab_0    conda-forge
simplejson                3.18.0          py310h5764c6d_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
smoothxg                  0.6.7                hfb1f815_0    bioconda
spectra                   0.0.11                     py_1    conda-forge
tabix                     1.11                 hdfd78af_0    bioconda
time                      1.8                  h516909a_0    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
typing_extensions         4.4.0              pyha770c72_0    conda-forge
tzdata                    2022f                h191b570_0    conda-forge
unicodedata2              15.0.0          py310h5764c6d_0    conda-forge
urllib3                   1.26.11            pyhd8ed1ab_0    conda-forge
vg                        1.40.0               h9ee0642_0    bioconda
wfmash                    0.10.0               hfdddef0_2    bioconda
wheel                     0.38.4             pyhd8ed1ab_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zipp                      3.10.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h6239696_4    conda-forge


AndreaGuarracino commented on August 11, 2024

Thank you, @alexzaccaron. Your debugging, and your reporting of all the commands and outputs here, is helping me a lot.

So, wfmash worked, and now pggb breaks at seqwish. I have just updated both seqwish (seqwish-0.7.7-h5b5514e_1) and smoothxg (smoothxg-0.6.7-hfb1f815_1), the step after seqwish, on bioconda, applying the same fix that worked for wfmash.

If you can update both of them and try again, please let me know whether pggb now works on your system.
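
For anyone reading the log above: on Linux, signal 4 is SIGILL (illegal instruction), which would be consistent with a binary built for CPU instructions the host does not support. The snippet below is only a sketch for checking this on your own machine. The flag names it greps for are examples I picked for illustration, not a list of what these packages were actually built with.

```shell
# Translate the number from "Command terminated by signal 4" into a signal name.
sig=4
sig_name=$(kill -l "$sig")
echo "signal $sig = SIG$sig_name"

# List a few SIMD flags the CPU advertises (Linux only).
# The flag names are illustrative assumptions, not taken from the build recipes.
if [ -r /proc/cpuinfo ]; then
  grep -o -w -E 'sse4_2|avx|avx2|avx512f' /proc/cpuinfo | sort -u || true
fi
```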


alexzaccaron commented on August 11, 2024

@AndreaGuarracino, no problem. I should be thanking you for maintaining pggb. Your latest updates fixed the signal 4 problem! The following commands finished successfully.

conda create -y -n pggb pggb=0.5.1=hdfd78af_1 wfmash=0.10.0=hfdddef0_2 seqwish=0.7.7=h5b5514e_1 smoothxg=0.6.7=hfb1f815_1
conda activate pggb
pggb -i data/HLA/DRB1-3123.fa.gz -p 70 -s 500 -n 10 -t 16 -V 'gi|568815561:#' -o out -M
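
To confirm that an environment really picked up the rebuilt packages, one option is to check the Build column of `conda list`. This is a minimal sketch. `check_build` is a hypothetical helper written for this thread, not part of conda or pggb, and it assumes the usual four-column `conda list` layout (name, version, build, channel).

```shell
# check_build NAME BUILD: succeed only if the `conda list`-style text on stdin
# has a row whose first column is NAME and whose third column is BUILD.
check_build() {
  awk -v name="$1" -v build="$2" '
    $1 == name && $3 == build { ok = 1 }
    END { exit ok ? 0 : 1 }
  '
}

# Usage against the active environment (requires conda on PATH):
#   conda list | check_build seqwish  h5b5514e_1 && echo "seqwish build OK"
#   conda list | check_build smoothxg hfb1f815_1 && echo "smoothxg build OK"
```

A non-zero exit status means the package is missing or has a different build string, for example the earlier `h5b5514e_0` build of seqwish.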

Just for the record, here is the output of conda list:

# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       2_gnu    conda-forge
bc                        1.07.1               h7f98852_0    conda-forge
bcftools                  1.16                 hfe4b78e_1    bioconda
brotli                    1.0.9                h166bdaf_8    conda-forge
brotli-bin                1.0.9                h166bdaf_8    conda-forge
brotlipy                  0.7.0           py310h5764c6d_1005    conda-forge
bzip2                     1.0.8                h7f98852_4    conda-forge
c-ares                    1.18.1               h7f98852_0    conda-forge
ca-certificates           2022.9.24            ha878542_0    conda-forge
certifi                   2022.9.24          pyhd8ed1ab_0    conda-forge
cffi                      1.15.1          py310h255011f_2    conda-forge
charset-normalizer        2.1.1              pyhd8ed1ab_0    conda-forge
click                     8.1.3           unix_pyhd8ed1ab_2    conda-forge
coloredlogs               15.0.1             pyhd8ed1ab_3    conda-forge
colormath                 3.0.0                      py_2    conda-forge
commonmark                0.9.1                      py_0    conda-forge
contourpy                 1.0.6           py310hbf28c38_0    conda-forge
cryptography              38.0.3          py310h597c629_0    conda-forge
cycler                    0.11.0             pyhd8ed1ab_0    conda-forge
dataclasses               0.8                pyhc8e2a94_3    conda-forge
fonttools                 4.38.0          py310h5764c6d_1    conda-forge
freetype                  2.12.1               hca18f0e_0    conda-forge
future                    0.18.2             pyhd8ed1ab_6    conda-forge
gfaffix                   0.1.4                hec16e2b_0    bioconda
gsl                       2.7                  he838d99_0    conda-forge
htslib                    1.16                 h6bc39ce_0    bioconda
humanfriendly             10.0            py310hff52083_4    conda-forge
idna                      2.10               pyh9f0ad1d_0    conda-forge
importlib-metadata        5.0.0              pyha770c72_1    conda-forge
jinja2                    3.1.2              pyhd8ed1ab_1    conda-forge
jpeg                      9e                   h166bdaf_2    conda-forge
keyutils                  1.6.1                h166bdaf_0    conda-forge
kiwisolver                1.4.4           py310hbf28c38_1    conda-forge
krb5                      1.19.3               h3790be6_0    conda-forge
lcms2                     2.14                 h6ed2654_0    conda-forge
ld_impl_linux-64          2.39                 hc81fddc_0    conda-forge
lerc                      4.0.0                h27087fc_0    conda-forge
libblas                   3.9.0           16_linux64_openblas    conda-forge
libbrotlicommon           1.0.9                h166bdaf_8    conda-forge
libbrotlidec              1.0.9                h166bdaf_8    conda-forge
libbrotlienc              1.0.9                h166bdaf_8    conda-forge
libcblas                  3.9.0           16_linux64_openblas    conda-forge
libcurl                   7.86.0               h7bff187_1    conda-forge
libdeflate                1.13                 h166bdaf_0    conda-forge
libedit                   3.1.20191231         he28a2e2_2    conda-forge
libev                     4.33                 h516909a_1    conda-forge
libffi                    3.4.2                h7f98852_5    conda-forge
libgcc-ng                 12.2.0              h65d4601_19    conda-forge
libgfortran-ng            12.2.0              h69a702a_19    conda-forge
libgfortran5              12.2.0              h337968e_19    conda-forge
libgomp                   12.2.0              h65d4601_19    conda-forge
libjemalloc               5.2.1                h9c3ff4c_6    conda-forge
liblapack                 3.9.0           16_linux64_openblas    conda-forge
libnghttp2                1.47.0               hdcd2b5c_1    conda-forge
libnsl                    2.0.0                h7f98852_0    conda-forge
libopenblas               0.3.21          pthreads_h78a6416_3    conda-forge
libpng                    1.6.38               h753d276_0    conda-forge
libsqlite                 3.39.4               h753d276_0    conda-forge
libssh2                   1.10.0               haa6b8db_3    conda-forge
libstdcxx-ng              12.2.0              h46fd767_19    conda-forge
libtiff                   4.4.0                h0e0dad5_3    conda-forge
libuuid                   2.32.1            h7f98852_1000    conda-forge
libwebp-base              1.2.4                h166bdaf_0    conda-forge
libxcb                    1.13              h7f98852_1004    conda-forge
libzlib                   1.2.13               h166bdaf_4    conda-forge
lzstring                  1.0.4                   py_1001    conda-forge
markdown                  3.4.1              pyhd8ed1ab_0    conda-forge
markupsafe                2.1.1           py310h5764c6d_2    conda-forge
matplotlib-base           3.6.2           py310h8d5ebf3_0    conda-forge
multiqc                   1.13               pyhdfd78af_0    bioconda
munkres                   1.1.4              pyh9f0ad1d_0    conda-forge
ncurses                   6.3                  h27087fc_1    conda-forge
networkx                  2.8.8              pyhd8ed1ab_0    conda-forge
numpy                     1.23.4          py310h53a5b5f_1    conda-forge
odgi                      0.8.1           py310hc8f18ef_0    bioconda
openjpeg                  2.5.0                h7d73246_1    conda-forge
openssl                   1.1.1s               h166bdaf_0    conda-forge
packaging                 21.3               pyhd8ed1ab_0    conda-forge
perl                      5.32.1          2_h7f98852_perl5    conda-forge
pggb                      0.5.1                hdfd78af_1    bioconda
pigz                      2.6                  h27826a3_0    conda-forge
pillow                    9.2.0           py310h454ad03_3    conda-forge
pip                       22.3.1             pyhd8ed1ab_0    conda-forge
pthread-stubs             0.4               h36c2ea0_1001    conda-forge
pycparser                 2.21               pyhd8ed1ab_0    conda-forge
pygments                  2.13.0             pyhd8ed1ab_0    conda-forge
pyopenssl                 22.1.0             pyhd8ed1ab_0    conda-forge
pyparsing                 3.0.9              pyhd8ed1ab_0    conda-forge
pysocks                   1.7.1              pyha2e5f31_6    conda-forge
python                    3.10.6          h582c2e5_0_cpython    conda-forge
python-dateutil           2.8.2              pyhd8ed1ab_0    conda-forge
python_abi                3.10                    2_cp310    conda-forge
pyyaml                    6.0             py310h5764c6d_5    conda-forge
readline                  8.1.2                h0f457ee_0    conda-forge
requests                  2.28.1             pyhd8ed1ab_1    conda-forge
rich                      12.6.0             pyhd8ed1ab_0    conda-forge
rich-click                1.5.2              pyhd8ed1ab_0    conda-forge
seqwish                   0.7.7                h5b5514e_1    bioconda
setuptools                65.5.1             pyhd8ed1ab_0    conda-forge
simplejson                3.18.0          py310h5764c6d_0    conda-forge
six                       1.16.0             pyh6c4a22f_0    conda-forge
smoothxg                  0.6.7                hfb1f815_1    bioconda
spectra                   0.0.11                     py_1    conda-forge
tabix                     1.11                 hdfd78af_0    bioconda
time                      1.8                  h516909a_0    conda-forge
tk                        8.6.12               h27826a3_0    conda-forge
typing_extensions         4.4.0              pyha770c72_0    conda-forge
tzdata                    2022f                h191b570_0    conda-forge
unicodedata2              15.0.0          py310h5764c6d_0    conda-forge
urllib3                   1.26.11            pyhd8ed1ab_0    conda-forge
vg                        1.40.0               h9ee0642_0    bioconda
wfmash                    0.10.0               hfdddef0_2    bioconda
wheel                     0.38.4             pyhd8ed1ab_0    conda-forge
xorg-libxau               1.0.9                h7f98852_0    conda-forge
xorg-libxdmcp             1.1.3                h7f98852_0    conda-forge
xz                        5.2.6                h166bdaf_0    conda-forge
yaml                      0.2.5                h7f98852_2    conda-forge
zipp                      3.10.0             pyhd8ed1ab_0    conda-forge
zlib                      1.2.13               h166bdaf_4    conda-forge
zstd                      1.5.2                h6239696_4    conda-forge

Cheers.
