Comments (4)
Hello. One possibility is to use truvari bench --dup-to-ins, which will consider variants with SVTYPE==DUP as SVTYPE==INS. However, as duplications are not typically sequence resolved (i.e. ALT==<DUP> instead of a sequence), you'll need to turn off sequence similarity with truvari bench --pctseq 0. Alternatively, for some projects I've had success 'filling in' DUP sequences. For example, a DUP from chr1:1234-2234 presumably could be represented as an INS at chr1:1234-1235 with ALT equal to the reference sequence from chr1:1234-2234.
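The 'filling in' idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not code from Truvari: the function name dup_to_ins, the dictionary-based reference, and the returned field layout are all hypothetical, and it assumes the DUP's POS and END are 1-based with POS being the base just before the duplicated copy, as is conventional for symbolic alleles in VCF.

```python
def dup_to_ins(chrom, pos, end, reference):
    """Represent a symbolic DUP as a sequence-resolved INS (hypothetical helper).

    chrom, pos, end: the DUP's CHROM, POS and INFO/END (1-based, inclusive).
    reference: dict mapping chromosome name to its sequence; in practice
               this would be read from an indexed FASTA instead.
    """
    seq = reference[chrom]
    if not 1 <= pos < end <= len(seq):
        raise ValueError("DUP coordinates fall outside the reference")
    # REF is the single anchor base at POS.
    ref_allele = seq[pos - 1]
    # ALT is the reference sequence from POS through END (1-based, inclusive),
    # i.e. the anchor base followed by the duplicated copy.
    alt_allele = seq[pos - 1:end]
    return {
        "CHROM": chrom,
        "POS": pos,
        "REF": ref_allele,
        "ALT": alt_allele,
        "SVTYPE": "INS",
        "SVLEN": len(alt_allele) - len(ref_allele),
    }


# Toy reference so the example is self-contained.
toy_reference = {"chr1": "ACGTACGTAC"}
record = dup_to_ins("chr1", 3, 6, toy_reference)
print(record)
```

With sequence-resolved ALT alleles like this, the converted calls could in principle be compared with sequence similarity left on, rather than with --pctseq 0.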
from truvari.
Thank you for your previous response. I did try the suggested parameters in my pilot runs. However, I have a hypothetical question. If I extract only the duplications (ALT==<DUP>) from an SV caller's .vcf file and benchmark them against the insertions (INS) in the GIABv0.6_HG002 truth set, or alternatively merge the INS and DUP .vcf files together, would either approach be considered correct?
"Correct" in this context is subjective. I typically don't separate or subset variants by type, for many reasons. But some researchers may be interested in only DELs, for example.
I'm relatively new to variant analysis studies and will be working on a project soon. In preparation, I'm considering the approach to benchmarking. I've noticed in some papers that researchers benchmark variants separately (e.g., deletions and insertions only) and others benchmark all variants together.
Considering your experience, would you suggest separating variants to achieve a more detailed Precision-Recall Curve (PRC) or testing all variants together for a comprehensive analysis? Your insights would greatly assist me in planning the benchmark strategy for my upcoming project. Many thanks.