Comments (6)
I have another issue with `--n-hap 4`: it in fact output eight haplotypes' worth of sequence (2.8 Gb in total). With the default `--n-hap 2`, it output 1.5 Gb, which is the expected size for our autotetraploid genome. However, the 1.5 Gb assembly is missing some large regions (homoeologous collapse), as confirmed by aligning it with the reference and analyzing the coverage depth.
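The exact procedure behind the coverage check is not described above. One common way to look for collapsed regions is to compute mean read depth in fixed windows along the assembly and flag windows whose depth sits well above the typical value. The sketch below assumes the depths have already been computed by some other tool; the function name, the 1.5x ratio, and the example depths are all hypothetical.

```python
from statistics import median


def flag_collapsed_windows(depths, ratio=1.5):
    """Return indices of windows whose depth is at least `ratio` times the median.

    `depths` is a list of mean read depths, one per fixed-size window.
    The 1.5x ratio is an illustrative assumption, not a hifiasm setting.
    """
    baseline = median(depths)
    if baseline <= 0:
        return []
    return [i for i, depth in enumerate(depths) if depth >= ratio * baseline]


# Hypothetical window depths along one contig (not data from this thread).
example_depths = [30, 31, 29, 62, 60, 30, 28, 91, 30]
print(flag_collapsed_windows(example_depths))
```

Windows near twice the baseline would be consistent with two copies collapsed into one, and windows near three times with three copies, but the threshold needs tuning per dataset.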
from hifiasm.
The same `h1tg` naming for all haplotypes is a known bug when we use hifiasm for tetraploid potato. But I have never seen hifiasm output 8 haplotypes with `--n-hap 4`. Do you have all the logs for this run? As far as I know, Hi-C-based phasing for polyploids is still very unstable; it depends on the distribution of heterozygous variants in the autotetraploid.
Yes, I agree with @baozg. Do you have the log file for hifiasm?
Hi, I ran into the same problem. My genome is a triploid: k-mer analysis predicts a size of around 700M for a single haplotype, so the whole genome should be 2~2.1G. With version 0.19.5-r587 and the parameters "--n-hap 3 --h1 hic_R1.fastq --h2 hic_R2.fastq", the results are:
- hifi.hic.hap1.p_ctg.gfa.fa, 1.5G
- hifi.hic.hap2.p_ctg.gfa.fa, 1008M
- hifi.hic.hap3.p_ctg.gfa.fa, 825M
- hifi.hic.p_ctg.gfa.fa, 1.5G
- hifi.hic.p_utg.gfa.fa, 2.3G
The homozygous read coverage threshold was 33. When I then add "--hom-cov 17", the results are:
- hifi.hic.hap1.p_ctg.gfa.fa, 2.0G
- hifi.hic.hap2.p_ctg.gfa.fa, 2.0G
- hifi.hic.hap3.p_ctg.gfa.fa, 2.0G
- hifi.hic.p_ctg.gfa.fa, 2.1G
- hifi.hic.p_utg.gfa.fa, 2.3G
Judging by the size of each hap, it looks like each hap contains all 3 sets of sequences. Is it possible that I am using the parameters incorrectly?
Also, with version 0.16.1-r375 and the parameters "--n-hap 3 --h1 hic_R1.fastq --h2 hic_R2.fastq", the results are:
- hifi_hic.hic.hap1.p_ctg.fa, 657M
- hifi_hic.hic.hap2.p_ctg.fa, 1.5G
- hifi_hic.hic.p_ctg.gfa.fa, 1.5G
- hifi_hic.hic.p_utg.fa, 2.2G
- hifi_hic.hic.r_utg.gfa.fa, 2.2G
The hap1 and hap2 sizes are consistent with my AAB triploid genome. When I use p_utg for 3d-dna, the sequence is too fragmented and there are collapsed regions, so I combined hap1 and hap2 and then ran 3d-dna. Judging from the results, it seems to work well. Is combining hap1 and hap2 before scaffolding an appropriate approach?
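The size reasoning in this comment can be made explicit by dividing each reported file size by the k-mer haploid estimate. The sketch below uses only the numbers quoted in the comment (700M haploid estimate, 0.19.5-r587 outputs) and assumes "M" and "G" are decimal units; the helper name `haploid_equivalents` is hypothetical.

```python
HAPLOID_GB = 0.7  # k-mer estimate for a single haplotype, as quoted in the comment

# Reported hap sizes in Gb for the two 0.19.5-r587 runs (1008M taken as 1.008 Gb).
runs = {
    "--n-hap 3": {"hap1": 1.5, "hap2": 1.008, "hap3": 0.825},
    "--n-hap 3 --hom-cov 17": {"hap1": 2.0, "hap2": 2.0, "hap3": 2.0},
}


def haploid_equivalents(sizes_gb, haploid_gb=HAPLOID_GB):
    """Express each file size as a multiple of the estimated haploid size."""
    return {name: round(size / haploid_gb, 2) for name, size in sizes_gb.items()}


for label, sizes in runs.items():
    total_gb = round(sum(sizes.values()), 3)
    print(label, haploid_equivalents(sizes), "total:", total_gb, "Gb")
```

With "--hom-cov 17", each hap file comes out at about 2.86 haploid equivalents, which matches the impression that every hap contains nearly all three sets of sequences.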
Hi-C phased triploid assembly is still tricky. If `--n-hap 3` doesn't work well, could you please try the normal diploid assembly and then use 3d-dna to manually fix the duplications?
Many thanks. I think I may also be misunderstanding "--hom-cov". When I change the parameters to "--n-hap 3 --hom-cov 51", the total size is as expected, but there are indeed duplications. These also occasionally occur when I use the diploid mode of 0.16.1-r375 and scaffold the combined "hap1+hap2". What might cause this?
Overall, I think there are four options now. Which one do you recommend?
- "p_utg of 0.16.1-r375", which is very fragmented, with a large number of collapsed regions;
- "hap1+hap2 of 0.16.1-r375", with localized duplications;
- "p_utg of 0.19.5", which is very fragmented too, and much larger in size than "p_utg of 0.16.1-r375";
- "hap1+hap2+hap3 of 0.19.5", with localized duplications.