Comments (14)

davmlaw commented on August 15, 2024

In the meantime, you can download the gnomAD4 VCFs and use --custom

gnomAD v4.0 has separate exome and genome datasets, and AFAIK there is no combined frequency available, so if you want them merged you need to sum the allele counts (AC) and allele numbers (AN) and recalculate the AF.

I have some scripts to do this (and cut down the INFO fields to the dozen or so I want) if you want to take a look:

https://github.com/SACGF/variantgrid/blob/master/annotation/annotation_data/generate_annotation/gnomad4.0_download.sh
https://github.com/SACGF/variantgrid/blob/master/annotation/annotation_data/generate_annotation/gnomad_data.py
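
The merge described above can be sketched in a few lines. This is only an illustration, not the code from the linked scripts: the function name `combined_af` and the counts are made up, and it assumes you have already pulled AC and AN for the same variant from the exome and genome VCFs.

```python
def combined_af(exome_ac, exome_an, genome_ac, genome_an):
    """Pool exome and genome counts and recompute the allele frequency.

    Returns None when no alleles were called in either dataset,
    so the caller can decide how to represent a missing value.
    """
    total_an = exome_an + genome_an
    if total_an == 0:
        return None
    return (exome_ac + genome_ac) / total_an


# Hypothetical counts, for illustration only.
print(combined_af(exome_ac=3, exome_an=1000, genome_ac=1, genome_an=1000))
```

Summing the counts rather than averaging the two AF values matters when the exome and genome AN differ, because the larger dataset should carry more weight.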

from ensembl-vep.

trust-odia commented on August 15, 2024

Hi @davmlaw
Thank you for this help, I appreciate it.
I tried to run the gnomad_data.py script after downloading the gnomAD4 data with the gnomad4.0_download.sh script, but I get a syntax error on line 131 of gnomad_data.py:

with (open(chrom_script, "w") as cs):

SyntaxError: invalid syntax

The arrow is under the w.

I'd appreciate your help.

jamie-m-a commented on August 15, 2024

Hi @ntm (and others interested).

We are currently not certain in which version of Ensembl we will update to the latest gnomAD, but it's likely to be 113. We tend not to rush to incorporate the initial major release version (4.0 in this case) because gnomAD typically provide an updated minor version not that long after each major release.

If you want to start using the data before we incorporate it into the cache, you can, as described in this thread, use the --custom option to use the latest data (with a bit of pre-parsing to get it in the right shape).

ntm commented on August 15, 2024

Thanks for the info @jamie-m-a , very helpful for planning ahead and deciding our course of action.
While we're talking variant frequencies, any progress on the integration of ALFA, maybe via a plugin as mentioned by @helensch ?
Thanks!

trust-odia commented on August 15, 2024

Hi @ntm, were you able to run @davmlaw's scripts? I am having issues running them and would appreciate your help. Please see my comment above. Thank you.

davmlaw commented on August 15, 2024

Hi @trust-odia - I've removed the outer brackets, which may fix the script (hard to know as it works on my machine)
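
For anyone hitting the same error: a likely cause (an assumption, since the Python version was not posted) is that wrapping a context manager in parentheses is only officially supported from Python 3.10, and older interpreters reject it with a SyntaxError. The plain form below parses on any Python 3 version. Only the `chrom_script` name comes from the thread; the helper name, file path and contents are placeholders.

```python
import os
import tempfile


def write_script(path, text):
    # No parentheses around the context manager, so this parses on
    # older Python 3 versions as well as on 3.10 and later.
    with open(path, "w") as cs:
        cs.write(text)


# Placeholder path and contents, for illustration only.
chrom_script = os.path.join(tempfile.mkdtemp(), "chrom_script.sh")
write_script(chrom_script, "echo chr20\n")
```

Running `python3 --version` shows which interpreter is in use if you want to confirm this is the cause.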

You may also want to just download them from here:

https://variantgrid.com/download/annotation/VEP/annotation_data/GRCh38/gnomad4.0_GRCh38_combined_af.vcf.bgz
https://variantgrid.com/download/annotation/VEP/annotation_data/GRCh38/gnomad4.0_GRCh38_combined_af.vcf.bgz.tbi

davmlaw commented on August 15, 2024

You don't need to process the data if you just download the already processed/combined VCF in the comment above yours.

VEP should randomly seek inside the VCF so I don't think it should matter much how big the VCF is

If you want to make separate combined VCF files for exomes and genomes, download the individual per-chromosome files from the gnomAD site and concatenate them. That shouldn't take much processing power either (e.g. leaving an old laptop on overnight).

trust-odia commented on August 15, 2024

@davmlaw

Hi Dave,

I used the combined gnomAD4 data that you sent me (thank you for this) to run VEP, just for the AF, but the gnomADg_AF column does not have new values.

Please see my command:

vep -i /project/dshared/projects/VEP/chr20_NF.vcf.split.gz --tab -everything --buffer_size 1000 --offline --fork 4 --dir_plugins /project/shared/projects/PMBB_VEP_Annotations --dir_cache /home/todia --custom file=/project/shared/projects/gnomad4/gnomad4.0_GRCh38_combined_af.vcf.bgz,short_name=gnomad4,format=vcf -o /project/shared/projects/VEP/vep_out/testAnnot.chr20.txt

I will appreciate your response.

Thank you

davmlaw commented on August 15, 2024

@trust-odia I think you need to add fields=AF as an argument to --custom - then it will add a column called gnomad4_AF (ie custom short_name + field)

See VEP Custom documentation

By default it just uses the VCF ID field, and calls the column the custom short name (ie gnomad4 in your case)
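
To make the naming rule concrete, here is a small sketch that assembles the --custom value and lists the extra columns to expect. The helper names are made up for illustration. Joining several fields with `%` is my reading of the VEP custom annotation documentation, so check it against the VEP version you run. The file path is the one from the command above.

```python
def custom_arg(file, short_name, fields, fmt="vcf"):
    """Build the comma-separated key=value string passed to --custom."""
    parts = [f"file={file}", f"short_name={short_name}", f"format={fmt}"]
    if fields:
        # Assumption: multiple INFO fields are separated with '%'.
        parts.append("fields=" + "%".join(fields))
    return ",".join(parts)


def expected_field_columns(short_name, fields):
    """Column names follow the pattern short_name + '_' + field."""
    return [f"{short_name}_{field}" for field in fields]


arg = custom_arg(
    "/project/shared/projects/gnomad4/gnomad4.0_GRCh38_combined_af.vcf.bgz",
    "gnomad4",
    ["AF"],
)
print("--custom", arg)
print(expected_field_columns("gnomad4", ["AF"]))
```

The new values therefore land in gnomad4_AF, not in the gnomADg_AF column, which still comes from the cache.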

trust-odia commented on August 15, 2024

@davmlaw Thank you.

davmlaw commented on August 15, 2024

Hi, it's hard to know without looking at your files, but it could be due to chromosome name mappings. I use the full contig names in the VCF, so you may need to use the VEP --synonyms option.
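
A synonyms file can be generated with a few lines of Python. This sketch assumes the mismatch is between bare names such as 20 and prefixed names such as chr20, which the thread does not confirm, and that --synonyms expects a tab-delimited file with one identifier and its synonym per line (my understanding of the VEP options documentation). The function and output file names are placeholders.

```python
def synonym_lines(chroms, prefix="chr"):
    """Return one tab-delimited 'name<TAB>synonym' line per chromosome."""
    return [f"{chrom}\t{prefix}{chrom}" for chrom in chroms]


def write_synonyms(path, chroms, prefix="chr"):
    with open(path, "w") as out:
        out.write("\n".join(synonym_lines(chroms, prefix)) + "\n")


# Autosomes 1-22 plus X and Y; extend the list if your VCF has other contigs.
human_chroms = [str(n) for n in range(1, 23)] + ["X", "Y"]
# write_synonyms("chr_synonyms.txt", human_chroms)  # then: vep ... --synonyms chr_synonyms.txt
```

Before building the file, it is worth comparing the contig names in the first column of your input VCF with those in the gnomAD VCF to see whether they actually differ.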
