Dear @hannespetur
Thank you and your colleagues for the very nice svimmer and graphtyper software.
I would like to use svimmer and graphtyper for forced genotyping, in many WGS samples, of the UNION of SVs discovered by Manta (many WGS samples) and SVIM-ASM (a few assemblies).
SVIM-ASM GitHub: https://github.com/eldariont/svim-asm
The versions that I am using are svimmer/20211209 and graphtyper/2.7.3.
When I try to get the (merged) UNION of SVs via svimmer, I get this error:
Traceback (most recent call last):
  File "/tools/eb/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/tools/eb/software/Python/3.8.6-GCCcore-10.2.0/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar
    return list(itertools.starmap(args[0], args[1]))
  File "/tools/eb/software/svimmer/20211209-GCC-10.2.0/svimmer", line 82, in append_svs_from_vcf
    svs.append(SV(record, check_type=not args.ignore_types, join_mode=args.join_mode, output_ids=args.ids))
  File "/tools/eb/software/svimmer/20211209-GCC-10.2.0/sv.py", line 75, in __init__
    assert False
AssertionError
This is caused by svimmer not recognizing the DUP:TANDEM and DUP:INT types that SVIM-ASM outputs.
I can use the svimmer argument --ignore-types to get svimmer to work.
But then graphtyper complains about "Unknown SV type", and I guess it also drops the SVs of unknown type?
<warning> constructor.cpp:106 Unknown SV type DUP:TANDEM
<warning> constructor.cpp:106 Unknown SV type DUP:TANDEM
Would it be possible to add a mapping for DUP:TANDEM and DUP:INT in the main branch of the svimmer code here?
Then the combination of SVIM-ASM and svimmer/graphtyper would work for me and for others with the same use case and combination of tools.
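In the meantime, one workaround I could try is to rewrite the SVTYPE values in the SVIM-ASM VCF before running svimmer. The following is only a minimal, untested-in-production sketch under assumptions: the names SVTYPE_MAP, normalize_svtype and normalize_vcf are hypothetical and not part of svimmer or graphtyper, and I am assuming that plain DUP is accepted by both tools (since both output SVs of type DUP). It only touches the SVTYPE key in the INFO column, not the symbolic ALT allele, and collapsing DUP:INT to DUP loses the information that the duplication is interspersed.

```python
# Hypothetical mapping (an assumption, not taken from the svimmer source):
# collapse the SVIM-ASM duplication subtypes to plain DUP.
SVTYPE_MAP = {"DUP:TANDEM": "DUP", "DUP:INT": "DUP"}


def normalize_svtype(line: str) -> str:
    """Rewrite the SVTYPE value in the INFO column of a single VCF line."""
    if line.startswith("#"):
        return line  # header lines are passed through unchanged
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return line  # not a complete VCF record, leave it alone
    info = []
    for entry in fields[7].split(";"):
        key, sep, value = entry.partition("=")
        if key == "SVTYPE" and value in SVTYPE_MAP:
            entry = key + sep + SVTYPE_MAP[value]
        info.append(entry)
    fields[7] = ";".join(info)
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline


def normalize_vcf(lines):
    """Apply normalize_svtype lazily to an iterable of VCF lines."""
    return (normalize_svtype(line) for line in lines)
```

This could be applied to an uncompressed VCF by iterating over the open input file with normalize_vcf and writing each returned line to a new file. A mapping inside svimmer itself would of course be the cleaner solution.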
I also don't understand why SVs of type DUP, CNV and INV are mapped to type INS here:
elif info_dict["SVTYPE"] == "ALU" or info_dict["SVTYPE"] == "LINE1" or info_dict["SVTYPE"] == "SVA" or \
That does not make sense to me. INS is a novel sequence, whereas DUP, CNV and INV involve sequences already found on the reference genome and would therefore also need to be genotyped differently in graphtyper?
What I also find strange is that both svimmer and graphtyper do output SVs of type DUP.
I can't square that with the mapping of DUP, CNV and INV to INS. Or maybe the SV type is recalculated somewhere else in svimmer/graphtyper?
Thank you for your thoughts and help on this.