Comments (11)
Hey @dennishendriksen,
Just to update you: I opened PR Ensembl/ensembl-variation#1095 to fix allele numbers for breakends. This will be available in the next version of VEP.
Thanks again for reporting this issue!
Cheers,
Nuno
from ensembl-vep.
The results you are obtaining for that breakpoint variant seem incorrect.
In VEP 111, we represented the alternative allele of the breakpoint (in your case, `[1:109650635[GG`) to indicate all potential consequences. However, this is confusing if a breakpoint is composed of two or more chromosomal breakends.
As such, in VEP 112, we now separate the consequences of a breakpoint variant for each breakend:
- `[1:109650635[GG`: consequences for the breakend located at chr1:109650635
- `.G`: consequences for the original breakend at position chr22:29767384 (represented as detailed in the VCF 4.4 standard, chapter 5.4.9: Single breakends)
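To make the mated-breakend notation concrete, here is a minimal Python sketch that splits an ALT string such as `[1:109650635[GG` into its mate chromosome, mate position, bracket type, and accompanying bases. The names `MatedBreakend` and `parse_mated_breakend` are hypothetical helpers for illustration and are not part of VEP; the regular expression assumes the four bracket forms described in the VCF specification and does no further validation.

```python
import re
from typing import NamedTuple, Optional


class MatedBreakend(NamedTuple):
    mate_chrom: str   # chromosome named inside the brackets
    mate_pos: int     # position named inside the brackets
    bracket: str      # "[" or "]"
    bases: str        # bases written next to the brackets
    bases_first: bool # True for t[p[ / t]p], False for [p[t / ]p]t


# Hypothetical pattern: optional bases, bracket, chrom:pos, same bracket, optional bases.
_BND = re.compile(r"^([A-Za-z.]*)([\[\]])(.+):(\d+)([\[\]])([A-Za-z.]*)$")


def parse_mated_breakend(alt: str) -> Optional[MatedBreakend]:
    """Return the parts of a mated breakend ALT, or None if it is not one."""
    m = _BND.match(alt)
    if m is None:
        return None
    pre, b1, chrom, pos, b2, post = m.groups()
    # Both brackets must match, and bases must sit on exactly one side.
    if b1 != b2 or bool(pre) == bool(post):
        return None
    return MatedBreakend(chrom, int(pos), b1, pre or post, bool(pre))


print(parse_mated_breakend("[1:109650635[GG"))
```

A single breakend such as `.G` contains no brackets, so this helper returns `None` for it.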
To answer your questions:
Q1: Is this intended? I would expect this field to always contain a ALT allele index.
Unfortunately, it seems that VEP 112 is returning nothing for the allele number for breakpoint variants. I am going to check how to fix it.
Q2: (...) Could you explain what the dot in the new output means?
The representation depicts a single breakend and its orientation:
- `2 321681 bndW G G.`: breakend occurring at position 321682 with at least position 321681 (and maybe 321680, 321679, etc.) attached
- `13 123457 bndX A .A`: breakend occurring at position 123456 with at least position 123457 (and maybe 123458, 123459, etc.) attached
More information at VCF 4.4 standard, chapter 5.4.9: Single breakends.
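As a rough illustration of the two single-breakend forms above, the following Python sketch reports on which side of the dot the known bases sit. The function name `classify_single_breakend` and the labels `"left"` and `"right"` are assumptions made for this example, not VEP or VCF terminology: `"left"` means the attached positions are at and before POS (as in `G.`), and `"right"` means they are at and after POS (as in `.A`).

```python
from typing import Optional, Tuple


def classify_single_breakend(alt: str) -> Optional[Tuple[str, str]]:
    """Classify a single-breakend ALT such as 'G.' or '.A'.

    Returns (side, bases), where side is the hypothetical label "left"
    (bases precede the dot) or "right" (bases follow the dot), or None
    if the ALT is not a single breakend.
    """
    if len(alt) < 2:
        return None
    if alt.endswith(".") and not alt.startswith("."):
        return ("left", alt[:-1])
    if alt.startswith(".") and not alt.endswith("."):
        return ("right", alt[1:])
    return None


for allele in ("G.", ".A", "[1:109650635[GG"):
    print(allele, classify_single_breakend(allele))
```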
Q3: A last observation is that the number of consequences went down from 10 to 7. Could you explain this difference?
I'll also check if the changes are expected or not.
Thanks for reporting this issue! I'll report back as soon as possible.
Best regards,
Nuno
Hey @dennishendriksen,
The bug fix to the allele number in breakpoint variants has now been merged to the code in the next version of VEP (VEP 113).
I will close this issue but feel free to open a new one if you find further issues or have any suggestions.
Cheers,
Nuno
Hi @nuno-agostinho,
Thank you for this fix!
Q3: A last observation is that the number of consequences went down from 10 to 7. Could you explain this difference?
I'll also check if the changes are expected or not.
Did you get around to checking this?
Greetings,
@dennishendriksen
Sorry for closing the issue prematurely.
I was not able to replicate your results. Could you please send me the VEP command that you ran to get those results?
Thanks,
Nuno
Hi @nuno-agostinho,
From the previously attached vcf:
vep --allele_number --allow_non_variant --assembly GRCh38 --buffer_size 1000 --cache --compress_output bgzip --custom [PATH]/hg38.phyloP100way.bw,phyloP,bigwig,exact,0 --database 0 --dir_cache [PATH]/cache --dir_plugins [PATH]/plugins --dont_skip --exclude_predicted --fasta [PATH]/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna.gz --flag_pick_allele --fork 4 --format vcf --hgvs --input_file GRCh37_normalized.vcf.gz --no_stats --numbers --offline --output_file GRCh37_annotated.vcf.gz --plugin Grantham --plugin SpliceAI,snv=[PATH]/spliceai_scores.masked.snv.hg38.vcf.gz,indel=[PATH]/spliceai_scores.masked.indel.hg38.vcf.gz --plugin Capice,GRCh37_capice_output.tsv.gz --plugin UTRannotator,[PATH]/uORF_5UTR_PUBLIC.txt --plugin Inheritance,[PATH]/inheritance_20240115.tsv --plugin VKGL,[PATH]/vkgl_consensus_20240401.tsv,1 --plugin gnomAD,[PATH]/gnomad.total.v4.1.sites.stripped.tsv.gz --plugin ClinVar,[PATH]/clinvar_20240603_stripped.tsv.gz --plugin AnnotSV,GRCh37_normalized.vcf.gz.tsv,AnnotSV_ranking_score;AnnotSV_ranking_criteria;ACMG_class --plugin AlphScore,[PATH]/AlphScore_final_20230825_stripped_GRCh38.tsv.gz --plugin ncER,[PATH]/GRCh38_ncER_perc.bed.gz --plugin FATHMM_MKL_NC,[PATH]/GRCh38_FATHMM-MKL_NC.tsv.gz --plugin ReMM,[PATH]/GRCh38_ReMM.tsv.gz --polyphen s --pubmed --refseq --safe --shift_3prime --sift s --symbol --total_length --use_given_ref --vcf
Greetings,
@dennishendriksen
Hey @dennishendriksen,
I am confused by your command, as you are mixing GRCh37 and GRCh38 data.
For GRCh38, the alternative breakend `[1:109650635[G` should only return an intergenic variant [1], whereas there are transcript consequences if you use `--assembly GRCh37`.
Could you check if the results make sense for you when using GRCh37 throughout the VEP command?
Thanks,
Nuno
Footnotes
[1] However, the results only show results for the reference breakend (`.G`). This is a bug: it should also show intergenic variants if there are no other consequences. I will try to fix this.
Hi @nuno-agostinho,
Apologies for the confusing filename, this is an artifact after liftover from GRCh37 to GRCh38. Both file content and command should be GRCh38. I'm not an expert on breakend notations, could it be that you missed the final G in `G>[1:109650635[GG`?
Greetings,
@dennishendriksen
could it be that you missed the final G in G>[1:109650635[GG?
Currently, the alternative sequence of a breakend is ignored by VEP. We intend to improve this in the future.
Upon further inspection, the difference may be related to updates to the Ensembl database. For instance, one of the consequences for the breakend `[1:109650635[GG` in GRCh38 is associated with regulatory feature ENSR00001170488, which is not available in the current version of Ensembl.
If you want the same results as in VEP 111, you can download the previous VEP cache from http://ftp.ensembl.org/pub/release-111/variation/vep and then run VEP with option `--db_version 111`. However, I would suggest simply using the most up-to-date version of the VEP cache when possible.
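One way to see which annotations disappeared between two runs is to compare the CSQ entries for the same variant. The Python sketch below is only an illustration: `csq_records` and `missing_entries` are hypothetical helpers, the field list must be taken from the `Format:` part of the CSQ header in your own output, and the example strings are made-up toy data (only the regulatory feature ID comes from this thread).

```python
from typing import Dict, List, Tuple


def csq_records(csq_value: str, fields: List[str]) -> List[Dict[str, str]]:
    """Split a CSQ INFO value into one dict per comma-separated entry."""
    return [
        dict(zip(fields, entry.split("|")))
        for entry in csq_value.split(",")
        if entry
    ]


def missing_entries(old_csq: str, new_csq: str, fields: List[str]) -> List[Tuple[str, str]]:
    """Return (Feature, Consequence) pairs present in old_csq but not in new_csq."""
    def key(record: Dict[str, str]) -> Tuple[str, str]:
        return (record.get("Feature", ""), record.get("Consequence", ""))

    old = {key(r) for r in csq_records(old_csq, fields)}
    new = {key(r) for r in csq_records(new_csq, fields)}
    return sorted(old - new)


# Toy data: a reduced field list and invented CSQ values.
fields = ["Allele", "Consequence", "Feature"]
old_run = "A|intergenic_variant|,A|regulatory_region_variant|ENSR00001170488"
new_run = "A|intergenic_variant|"
print(missing_entries(old_run, new_run, fields))
```

Features that appear only in the older run are candidates for having been removed or renamed between Ensembl releases.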
Hope this makes it clearer, but tell me if you want to discuss this further. Thanks!
Cheers,
Nuno
Hi @nuno-agostinho,
Good to know that it is a change in database content (I had not thought of running VEP v112 with the 111 database). Case closed, thank you for your effort and time, greatly appreciated.
Cheers,
@dennishendriksen
We are always here to help! Glad you reported the issue so that we could improve VEP.
Have a great day! 😄
Cheers,
Nuno