Comments (10)

mp15 commented on June 18, 2024

I’ve just dug up an old discussion on this:
http://sourceforge.net/p/vcftools/mailman/message/27623814/

Perhaps we should incorporate into the spec Richard’s suggestion that these be converted to Ns, so that the handling is clear across all implementations?

from hts-specs.

auton1 commented on June 18, 2024

I think the concerns from the previous discussion remain valid. I agree the spec should highlight this situation.

eitanbanks commented on June 18, 2024

I really don't like adding IUPAC ambiguity codes to the REF.
I'm fine with pretty much any other solution.

pd3 commented on June 18, 2024

For those who are not subscribed to the vcftools-spec mailing list, here are some of the relevant emails:
http://sourceforge.net/p/vcftools/mailman/message/34096206/
http://sourceforge.net/p/vcftools/mailman/message/34102529/
http://sourceforge.net/p/vcftools/mailman/message/34101323/

The options discussed offline and on the mailing list were these:

How to treat ambiguity codes in the reference sequence

  1. allow IUPAC codes in REF
  2. replace with N
  3. replace with one of the bases, for example, R=A/G becomes A
  4. allow either 1 or 2

How to treat ambiguity codes in ALT

  1. allow IUPAC codes in ALT
  2. use symbolic alleles, such as <R>,<S>, etc
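
To make two of these options concrete, here is a minimal Python sketch of REF option 2 (replace with N) and ALT option 2 (symbolic alleles). The function names `ref_to_n` and `alt_to_symbolic` are hypothetical, and the sketch assumes that only single-character ambiguity codes in ALT are rewritten; it is an illustration of the proposals, not spec text.

```python
# The ten IUPAC ambiguity codes other than N.
AMBIGUITY_CODES = set("RYSWKMBDHV")


def ref_to_n(ref: str) -> str:
    """REF option 2: replace every ambiguity code with N (output is upper case)."""
    return "".join("N" if base in AMBIGUITY_CODES else base for base in ref.upper())


def alt_to_symbolic(alt: str) -> str:
    """ALT option 2: write a lone ambiguity code as a symbolic allele, e.g. R -> <R>."""
    if len(alt) == 1 and alt.upper() in AMBIGUITY_CODES:
        return f"<{alt.upper()}>"
    # Ordinary alleles (A, C, G, T, N, longer strings) are returned unchanged.
    return alt


print(ref_to_n("ARG"))       # ambiguity code R becomes N
print(alt_to_symbolic("R"))  # R becomes the symbolic allele <R>
```

Under option 2 for REF the information about which bases were possible is lost, which is the trade-off against option 3.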

Can people +1/-1 their preferred options?

eitanbanks commented on June 18, 2024

I like options 3 (for REF) and 2 (for ALT).

auton1 commented on June 18, 2024

I think allowing ambiguity in the REF is asking for trouble.

I vote 3 for REF.

For ALT, I have a slight preference for 2, but don't feel very strongly
about it.

On Thu, May 21, 2015 at 9:04 AM, Eric Banks wrote:

> I like options 3 (for REF) and 2 (for ALT).

Adam Auton
Assistant Professor,
Department of Genetics,
Albert Einstein College of Medicine,
1301 Morris Park Avenue,
Price Center, Room 353B,
Bronx, New York 10461

Tel: +1 (718) 678 1150

vadimzalunin commented on June 18, 2024

+1 to option 2: replace with N

cyenyxe commented on June 18, 2024

I find point 1.4.1.4 a bit confusing from a tool developer's perspective. When it states that IUPAC codes in the REF could be "reduced", does that mean that tools should still accept them and run the transformation themselves? Otherwise, if it is the file author's responsibility, it could be worded like this:

IUPAC ambiguity codes in the reference sequence must be reduced to a concrete base by using
the one that is first alphabetically (thus R=A/G as a reference base is converted to A in VCF.)
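
As a rough illustration of the rule as worded above, the following Python sketch reduces each ambiguity code in a REF string to its alphabetically first base. The function name `reduce_ref` is hypothetical, and two behaviours are assumptions rather than anything stated in this thread: N is passed through unchanged, and the output is upper-cased.

```python
# IUPAC ambiguity codes and the concrete bases each one can stand for.
IUPAC_BASES = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
}


def reduce_ref(ref: str) -> str:
    """Replace each ambiguity code in REF with its alphabetically first base."""
    reduced = []
    for base in ref.upper():
        if base in "ACGTN":
            # Assumption: concrete bases and N are kept as they are.
            reduced.append(base)
        elif base in IUPAC_BASES:
            reduced.append(min(IUPAC_BASES[base]))
        else:
            raise ValueError(f"not a nucleotide code: {base!r}")
    return "".join(reduced)


print(reduce_ref("R"))      # R = A/G is reduced to A
print(reduce_ref("ACRYN"))  # R -> A, Y -> C, N left unchanged
```

A tool that accepts ambiguous REF bases could apply the same mapping on input, while a strict validator could instead reject any character outside A, C, G, T and N.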

pd3 commented on June 18, 2024

@cyenyxe You are right, "must be reduced" is what I meant. Thank you.

sjackman commented on June 18, 2024

I've just run into this issue myself. My genome, assembled from short reads, has 9 sites with ambiguity codes. When I align the reads to the reference, it's clear what the true base should be. When I call variants using bcftools call -c --ploidy=1 -Oz, these variant sites are not output by bcftools.

I'm okay with either converting to N or converting to the first alphabetical nucleotide of the ambiguity code. My preference is to convert to N.
