Comments (5)
Hello @GSYongWu,
Thanks for your query and sorry for the late reply.
From the HGVS notation you provided, I infer that you are using the GRCh37 assembly with the RefSeq cache. RefSeq transcripts do not always match the reference assembly. For that reason, VEP needs alignment information to annotate them, and RefSeq (a source external to Ensembl) provides it. See here:
https://www.ensembl.org/info/docs/tools/vep/script/vep_other.html#refseq_bam
In e112 we updated our cache with the new alignment file from NCBI, which is why you are seeing this change.
Best regards,
Nakib
from ensembl-vep.
However, the coordinates provided by VEP e112 do not match those in the UCSC Genome Browser, nor can they be aligned with literature and other databases. Is this appropriate?
Hi @GSYongWu,
I am just replying to say that the HGVS output (mainly the HGVSp) seems dodgy to me too. I am looking at the alignment between Ensembl and RefSeq and will hopefully get back to you soon.
Hi @GSYongWu,
Sorry for the late reply. I have looked into the issue.
First of all, RefSeq transcripts can differ in sequence from the reference genome to which they map; this is because the transcript models are built from primary sequence data and not from the reference genome.
Here, when the NM_015326.5 transcript is mapped to the GRCh37 assembly, its 5' UTR and part of its coding region are truncated:
https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg19&lastVirtModeType=default&lastVirtModeExtraState=&virtModeType=default&virtMode=0&nonVirtPosition=&position=chr1%3A206516178%2D206516278&hgsid=2313969114_3KKuutXr6hD7XyWLPdDzR1IZDUJK
The difference between e111 and e112 comes from the new realignment file from NCBI, which now has 909S in the CIGAR string; this soft clip adds back the truncated sequence (83 + 909 = 982).
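To make the soft-clip point concrete, here is a minimal Python sketch, not part of VEP, that reads the clip length from the start of a CIGAR string. The function names and the example CIGAR string are invented for illustration; only the leading 909S comes from the alignment discussed here, and the operation codes are those defined in the SAM specification.

```python
import re

# One CIGAR operation: a length followed by an operation code
# (codes as defined in the SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into a list of (length, operation) pairs."""
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    # If rebuilding the string does not reproduce the input, the input
    # contained something the pattern could not consume.
    if "".join(f"{length}{op}" for length, op in ops) != cigar:
        raise ValueError(f"malformed CIGAR string: {cigar!r}")
    return ops


def leading_soft_clip(cigar):
    """Return the number of soft-clipped bases at the start of the CIGAR."""
    ops = parse_cigar(cigar)
    if ops and ops[0][1] == "S":
        return ops[0][0]
    return 0


# Hypothetical CIGAR string: everything after 909S is made up.
example = "909S120M5000N300M"
print(leading_soft_clip(example))
print(parse_cigar(example))
```

Soft-clipped bases are present in the transcript sequence but are not aligned to the genome, which is why a large S operation signals that part of the transcript has no counterpart at that position in the assembly.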
In any case, where a RefSeq transcript does not match the reference sequence, consequence calling with VEP is not reliable. If you use the GRCh38 assembly, you should get better results.
Hope that answers the question.
Best regards,
Nakib