Duplicated sequence "h1tg000001l" when `--n-hap 4` (hifiasm issue, open)

zhangrengang commented on September 25, 2024

Duplicated sequence "h1tg000001l" when `--n-hap 4`

from hifiasm.

Comments (6)

zhangrengang commented on September 25, 2024

I have another issue with --n-hap 4: judging by size, it in fact outputs 8 haplotypes (2.8 Gb in total). With the default --n-hap 2, it outputs 1.5 Gb, which is what we expect for our autotetraploid genome. However, the 1.5 Gb assembly is missing some large regions (homoeologous collapse), as confirmed by aligning to the reference and analyzing the coverage depth.


baozg commented on September 25, 2024

Having the same h1tg names in all haplotypes is a known bug that we also see when using hifiasm on tetraploid potato. But I have never seen hifiasm output 8 haplotypes with --n-hap 4. Do you have the full logs for this run? As far as I know, Hi-C-based phasing for polyploids is still very unstable; it depends on how the heterozygous variants are distributed in the autotetraploid.
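To see how widespread the name collision is before merging haplotype files, a small check like the following may help. It is only a sketch: the function names and the toy records are made up for illustration, and it assumes the name is the first whitespace-delimited token of each FASTA header.

```python
from collections import defaultdict


def fasta_names(lines):
    """Yield the sequence name (first token) from each FASTA header line."""
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                yield fields[0]


def shared_names(named_fastas):
    """Return {name: [labels]} for names that occur more than once overall."""
    seen = defaultdict(list)
    for label, lines in named_fastas.items():
        for name in fasta_names(lines):
            seen[name].append(label)
    return {name: labels for name, labels in seen.items() if len(labels) > 1}


# Toy records standing in for two haplotype FASTA files (hypothetical data).
hap1 = [">h1tg000001l", "ACGT", ">h1tg000002l", "GGCC"]
hap2 = [">h1tg000001l", "TTAA"]

print(shared_names({"hap1": hap1, "hap2": hap2}))
```

For real files, an open file handle can be passed in place of each list, since the functions only iterate over lines.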


chhylp123 commented on September 25, 2024

Yes, I agree with @baozg. Do you have the log file for hifiasm?


monian1113 commented on September 25, 2024

Hi, I ran into the same problem. My genome is a triploid; k-mer analysis predicts about 700M for a single haplotype, so the whole genome should be 2~2.1G. With version 0.19.5-r587 and the parameters "--n-hap 3 --h1 hic_R1.fastq --h2 hic_R2.fastq", the results are:

hifi.hic.hap1.p_ctg.gfa.fa, 1.5G
hifi.hic.hap2.p_ctg.gfa.fa, 1008M
hifi.hic.hap3.p_ctg.gfa.fa, 825M
hifi.hic.p_ctg.gfa.fa, 1.5G
hifi.hic.p_utg.gfa.fa, 2.3G
homozygous read coverage threshold: 33

Then, when I add "--hom-cov 17", the results are:

hifi.hic.hap1.p_ctg.gfa.fa, 2.0G
hifi.hic.hap2.p_ctg.gfa.fa, 2.0G
hifi.hic.hap3.p_ctg.gfa.fa, 2.0G
hifi.hic.p_ctg.gfa.fa, 2.1G
hifi.hic.p_utg.gfa.fa, 2.3G

Judging by the size of each hap, it looks like each one contains all 3 sets of sequences. Is it possible that I am using the parameters incorrectly?
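The size reasoning in this comment can be made explicit by dividing each reported file size by the k-mer estimate of one haplotype. The sketch below uses only the numbers quoted above; the function name is hypothetical, and it assumes that "M" and "G" are decimal units and that 700M is a fair haploid size.

```python
HAPLOID_SIZE_GB = 0.7  # k-mer estimate for a single haplotype, from the comment

# Sizes as reported in the comment, converted to Gb (assuming decimal units).
reported_gb = {
    "hap1 (default hom-cov)": 1.5,
    "hap2 (default hom-cov)": 1.008,
    "hap3 (default hom-cov)": 0.825,
    "hap1/2/3 (--hom-cov 17)": 2.0,
}


def haplotype_copies(size_gb, haploid_gb=HAPLOID_SIZE_GB):
    """Estimate how many haplotype-sized sequence sets a file holds."""
    return size_gb / haploid_gb


for label, size_gb in reported_gb.items():
    print(f"{label}: about {haplotype_copies(size_gb):.1f} haplotype equivalents")
```

A ratio close to 3 for every hap file in the "--hom-cov 17" run matches the suspicion that each file holds all three sequence sets rather than one.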

Also, with version 0.16.1-r375 and the parameters "--n-hap 3 --h1 hic_R1.fastq --h2 hic_R2.fastq", the results are:

hifi_hic.hic.hap1.p_ctg.fa, 657M
hifi_hic.hic.hap2.p_ctg.fa, 1.5G
hifi_hic.hic.p_ctg.gfa.fa, 1.5G
hifi_hic.hic.p_utg.fa, 2.2G
hifi_hic.hic.r_utg.gfa.fa, 2.2G

The hap1 and hap2 sizes are consistent with my AAB triploid genome. When I use p_utg as input for 3D-DNA, the sequence is too fragmented and there are collapsed regions. So I combined hap1 and hap2 and then ran 3D-DNA. Judging from the results, this seems to work well. Is combining hap1 and hap2 for scaffolding an appropriate approach?
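When haplotype files are combined for scaffolding, contig names need to be unique in the merged FASTA, which the bug in the issue title can break. One possible workaround, sketched here with a hypothetical helper and toy records, is to prefix every header with its haplotype label during the merge. This is an illustration, not something hifiasm or 3D-DNA provides.

```python
def merge_with_prefix(named_fastas):
    """Concatenate FASTA records, prefixing each header with its file label."""
    merged = []
    for label, lines in named_fastas.items():
        for line in lines:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # ">h1tg000001l" from hap2 becomes ">hap2_h1tg000001l"
                merged.append(f">{label}_{line[1:]}")
            elif line:
                merged.append(line)
    return merged


# Toy records with colliding names (hypothetical data).
hap1 = [">h1tg000001l", "ACGT"]
hap2 = [">h1tg000001l", "TTAA"]

for line in merge_with_prefix({"hap1": hap1, "hap2": hap2}):
    print(line)
```

The prefix keeps the original name visible, so contigs in the scaffolded output can still be traced back to the haplotype file they came from.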


chhylp123 commented on September 25, 2024

Hi-C phased triploid assembly is still tricky. If --n-hap 3 doesn't work well, could you please try the normal diploid assembly, and then use 3D-DNA to manually fix the duplications?


monian1113 commented on September 25, 2024

Many thanks. I think I may also have misunderstood "--hom-cov": when I change the parameters to "--n-hap 3 --hom-cov 51", the total size is as expected, but there are indeed duplications. This also occasionally happens when I use the diploid mode of 0.16.1-r375 and scaffold "hap1+hap2". What could be the reasons for this?

Overall, I think there are four options now. Which one do you recommend?

"0.16.1-r375's p-utg", which is very fragmented, with a large number of collapsed regions;
"0.16.1-r375's hap1+hap2", with localized duplications;
"p-utg of 0.19.5", which is also very fragmented, and much larger in size than "p-utg of 0.16.1-r375";
"hap1+hap2+hap3 of 0.19.5", with localized duplications.
