Comments (11)

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

You can't use the wfmash-xxxx file as -a/--input-paf: it is a temporary wfmash file that contains only the mappings (that is, the regions to align), so it has no CIGAR strings. seqwish warns you about this ([seqwish] WARNING: input alignment file wfmash-3TaQ4Q does not have CIGAR strings). Moreover, the file seems to contain invalid information, which is triggering the error. Try running pggb with the final output of wfmash instead (in your case, it should be called output/wfmash-3TaQ4Q.paf).
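A quick way to tell a mappings-only file from a full alignment file is to count how many records carry a CIGAR string. The sketch below is only an illustration: it assumes a tab-separated PAF file whose CIGARs, when present, are stored in the conventional `cg:Z:` tag, and the function name `count_cigar_records` is made up for this example.

```python
def count_cigar_records(paf_path):
    """Return (total_records, records_with_cigar) for a PAF file.

    Assumption: CIGAR strings, when present, live in a `cg:Z:` optional tag.
    """
    total = 0
    with_cigar = 0
    with open(paf_path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            total += 1
            # The 12 mandatory PAF columns are followed by optional
            # TAG:TYPE:VALUE fields.
            tags = line.split("\t")[12:]
            if any(tag.startswith("cg:Z:") for tag in tags):
                with_cigar += 1
    return total, with_cigar
```

If the second number is 0, the file holds only mappings and is probably not what -a/--input-paf expects.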

from pggb.

Boer223 avatar Boer223 commented on August 11, 2024

@AndreaGuarracino
Thank you for your quick reply! But when I use pggb -i 19-genomes.merge.fa -n 19 -o output -p 90 -s 100000 -t 5 -T 5 -M -Z to create the pan-genome graph, it does not generate the PAF file. There is only a wfmash-3TaQ4Q temp file.

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

Weird, or maybe you haven't waited long enough. What does the estimated mapping and alignment time say in the log? I suggest reducing -s to 50000 and waiting a bit longer. If the problem persists, please share the output/...log file.

Boer223 avatar Boer223 commented on August 11, 2024

The log ends with the following:

[E::fai_load3_core] Failed to open FASTA file 19-genomes.merge.fa
wfmash -X -s 100000 -p 90 -n 18 -t 16 19-genomes.merge.fa 19-genomes.merge.fa
15440.41s user 792.33s system 1172% cpu 1384.73s total 7245936Kb max memory

Boer223 avatar Boer223 commented on August 11, 2024

log.txt

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

[E::fai_load3_core] Failed to open FASTA file 19-genomes.merge.fa

It cannot see the input FASTA file, which is very strange. Can I see your 19-genomes.merge.fa.fai file too? And also the output of head /home/cuixb/data/analysis_data/graph-pan-genome/pggb-result/wfmash-3TaQ4Q?

Boer223 avatar Boer223 commented on August 11, 2024

19-genomes.merge.fa.fai file:
19-genomes.merge.fa.zip

head of wfmash-3TaQ4Q file:

Darmor_v10#1#A01	32958928	27800000	28300000	+	Darmor_v10#1#C01	48239358	47247687	47879060	5741	631373	10	id:f:90.9308
Darmor_v10#1#A01	32958928	0	3800000	+	Darmor_v5#1#chrC01	38829317	850	4733913	44055	4733063	12	id:f:93.0793
Darmor_v10#1#A01	32958928	27000000	29700000	+	Darmor_v5#1#chrC01	38829317	35738139	38333401	25321	2700000	12	id:f:93.7809
Darmor_v10#1#A01	32958928	30500000	31200000	+	Darmor_v5#1#chrC01	38829317	38267435	38823342	6814	700000	16	id:f:97.3442
Darmor_v10#1#A01	32958928	29900000	30500000	-	Darmor_v5#1#chrAnn_random	48658326	1918964	2515790	5847	600000	16	id:f:97.4553
Darmor_v10#1#A01	32958928	15700000	16300000	-	Darmor_v5#1#chrAnn_random	48658326	3155785	3717399	5876	600000	17	id:f:97.9259
Darmor_v10#1#A01	32958928	27800000	28300000	+	Express617#1#chrC01	44118044	38888171	39510831	5664	622660	10	id:f:90.972
Darmor_v10#1#A01	32958928	28700000	29900000	+	Express617#1#chrC01	44118044	40944888	42168781	11515	1223893	12	id:f:94.0823
Darmor_v10#1#A01	32958928	27800000	28300000	+	FAFU_ZS11#1#chrC01	54641295	49487432	50101595	5581	614163	10	id:f:90.8653
Darmor_v10#1#A01	32958928	31200000	31900000	+	FAFU_ZS11#1#chrC01	54641295	50286548	50945152	6412	700000	11	id:f:91.6069

the whole wfmash-3TaQ4Q file:
wfmash-3TaQ4Q.zip
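For anyone inspecting such a mapping file by hand, the records above follow the usual 12-column PAF layout plus optional tags. Below is a minimal parsing sketch, assuming tab-separated fields as in the excerpt; the `PafRecord` class and `parse_paf_line` function are hypothetical names for this example, not part of wfmash or pggb.

```python
from dataclasses import dataclass, field


@dataclass
class PafRecord:
    query_name: str
    query_length: int
    query_start: int
    query_end: int
    strand: str
    target_name: str
    target_length: int
    target_start: int
    target_end: int
    matches: int
    block_length: int
    mapping_quality: int
    tags: dict = field(default_factory=dict)


def parse_paf_line(line):
    """Parse one tab-separated PAF line into a PafRecord."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 12:
        raise ValueError(f"expected at least 12 columns, got {len(fields)}")
    tags = {}
    for raw in fields[12:]:
        # Optional fields look like TAG:TYPE:VALUE, e.g. id:f:90.9308
        parts = raw.split(":", 2)
        if len(parts) == 3:
            tags[parts[0]] = parts[2]
    return PafRecord(
        fields[0], int(fields[1]), int(fields[2]), int(fields[3]), fields[4],
        fields[5], int(fields[6]), int(fields[7]), int(fields[8]),
        int(fields[9]), int(fields[10]), int(fields[11]), tags,
    )
```

With the first record of the excerpt, `query_end - query_start` gives the length of the mapped query segment and `tags["id"]` gives the identity estimate as a string.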

AndreaGuarracino avatar AndreaGuarracino commented on August 11, 2024

The FASTA index seems healthy. The input contains a lot of sequences, but I don't think (hope) that's the problem. Can you try with other, smaller FASTA files? Both with FASTA files in the same folder as your current input and with FASTA files in other folders? I am wondering if there is an issue that is specific to your system. In each test, please also delete and regenerate the FASTA index, to be safe.
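One way to check that an index still matches its FASTA before each test is to compare the sequence names in both files. This is a rough sketch that assumes an uncompressed, plain-text FASTA and the usual tab-separated .fai layout with the sequence name in the first column; the function names are invented for this example.

```python
def fasta_names(fasta_path):
    """Collect sequence names (header text up to the first whitespace)."""
    names = []
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                parts = line[1:].split()
                names.append(parts[0] if parts else "")
    return names


def fai_names(fai_path):
    """Collect sequence names from the first column of a .fai index."""
    with open(fai_path) as handle:
        return [line.rstrip("\n").split("\t")[0] for line in handle if line.strip()]


def index_matches_fasta(fasta_path, fai_path=None):
    """Return True when the index lists the same names, in the same order."""
    if fai_path is None:
        fai_path = str(fasta_path) + ".fai"  # assumed default index location
    return fasta_names(fasta_path) == fai_names(fai_path)
```

A False result suggests a stale or truncated index, in which case deleting and regenerating it is the safer option. It only compares names, so it will not catch wrong offsets or lengths.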

ekg avatar ekg commented on August 11, 2024

Boer223 avatar Boer223 commented on August 11, 2024

@ekg As you suggested, I have checked the number of sequences in the reference genome, and both files return the same value.

image

Boer223 avatar Boer223 commented on August 11, 2024

After I reinstalled the whole pggb environment using conda, it ran successfully without errors.
