
Comments (2)

lh3 commented on September 27, 2024

On the representation of alt contigs, I think we should develop a best practice before modifying the spec. What is the intended output from variant callers? Is it practical for callers to generate such output? How are downstream tools supposed to use the VCF?

Specifically, you proposed to add HT, but in my experience, alt contigs frequently recombine with each other, which makes the tag inapplicable most of the time. In addition, how are we supposed to use ALTLOCS? If we know that a locus overlaps an alt contig, what can we do with it?

The answers will become clearer as more researchers gain experience with h38, and we can develop the right spec then. Tools determine the adoption of alt contigs. It is not urgent to change the spec.

from hts-specs.

deannachurch commented on September 27, 2024

I think this is a bit of a chicken-and-egg problem. If we want variant callers to be able to use the alt loci, we need to be able to express the variants in VCF. This doesn't work well with the current spec (see how dbSNP distributes data).

I think the issue is that there are multiple ways to use VCF: it is just a reporting tool. dbSNP uses it to dump data, so there you want to report all genomic contexts for a given SNV. An argument could be made that, in the context of an individual genome, you may only want to report one context for a SNP, but how do you handle that when you have multiple samples in the VCF? I fear that decision will be hard to make.
I agree that trying to define some best practices is useful.
To attempt to address some specific issues:

  • knowing an alt-locus is allelic with a region on the chromosome lets you put the alt-locus in chromosome context, which is useful for reporting. Granted, this could be done as some sort of post-processing step, but then you'd have to convert the data to some other format (which may be OK, but likely inconvenient for everyone who wants VCF).
  • I agree there are few loci with named haplotypes, but for the ones that exist, this is useful information.
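
As a rough illustration of the first point above, here is a minimal sketch of how a downstream tool might put an alt-locus position into chromosome context, assuming it has a table of alt-locus placements. Everything here is hypothetical: the contig names, the coordinates, the `PLACEMENTS` table and the `chromosome_context` helper are made up for illustration, and the offset arithmetic assumes a gapless placement, which real alt-locus alignments generally are not.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Placement(NamedTuple):
    """Where an alt locus sits on the primary assembly (1-based, inclusive)."""
    chrom: str
    start: int
    end: int


# Hypothetical placement table; a real tool would load this from the
# assembly's alt-locus placement data rather than hard-coding it.
PLACEMENTS: Dict[str, Placement] = {
    "chrA_alt1": Placement(chrom="chrA", start=1000, end=1999),
}


def chromosome_context(contig: str, pos: int) -> Optional[Tuple[str, int]]:
    """Map a 1-based position on an alt locus to a chromosome position.

    Returns None if the contig has no known placement or the position
    falls outside the placed region. Assumes a gapless placement, so the
    mapping is a simple offset (a simplification).
    """
    placement = PLACEMENTS.get(contig)
    if placement is None or pos < 1:
        return None
    lifted = placement.start + pos - 1
    if lifted > placement.end:
        return None
    return (placement.chrom, lifted)


if __name__ == "__main__":
    print(chromosome_context("chrA_alt1", 250))
    print(chromosome_context("chrB", 5))
```

If the VCF record itself carried the placement, a tool could do this lookup without an external table, which is the convenience argued for above. Without it, the same step has to happen in post-processing.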

This is really meant to start the discussion about how we want to represent variation on GRCh38. It will be good to have some concrete examples.

