
Comments (2)

ACEnglish commented on July 3, 2024
  • __filtered is the number of variants that weren't compared to other variants because they fall outside the parameter thresholds (e.g. --sizemin).
  • Wrote 172241 variants is how many variants are in your output VCF.
  • 2103726 variants collapsed is how many variants were merged into calls that were kept. They can be found in the --collapsed-output.
  • collapsed into 111970 variants is how many variants in your output VCF had >= 1 variant merged into them.

So for this example, 65% of your output variants (111970 / 172241) are the representative variant from some number of merged variants. And 2103726 / 111970, or roughly 19:1, is the ratio of variants collapsed to the merged representations they were collapsed into.

I don't know if it applies to your dataset, but for the variants I'm used to dealing with, 19:1 is pretty high. I suspect you've used looser thresholds (e.g. --pctsim 0). I would just caution you to look out for over-merging. See our paper for details on what over-merging might look like.
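
As a quick check, the percentage and ratio quoted above can be reproduced from the three counts in the log. This is only a sketch of the arithmetic; the variable names are made up for illustration and are not part of truvari.

```python
# Counts copied from the collapse summary discussed above.
# Variable names are illustrative, not truvari identifiers.
wrote = 172_241             # variants written to the output VCF
collapsed = 2_103_726       # variants merged into a call that was kept
kept_with_merges = 111_970  # output variants that absorbed >= 1 variant

# Share of output variants that represent at least one merged variant.
representative_fraction = kept_with_merges / wrote

# Average number of collapsed variants per representative variant.
collapse_ratio = collapsed / kept_with_merges

print(f"{representative_fraction:.1%} of output variants are representatives")
print(f"about {collapse_ratio:.1f} variants collapsed per representative")
```

If the second number is much larger than you expect for your data, that is the cue to inspect the --collapsed-output file for over-merging.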


Meltpinkg commented on July 3, 2024

Thanks very much for the clear explanation. I've got it now!
I'll check the merging as you advised.
