Comments (5)
Hi Eva,
Thanks for the question. This is the convention for gVCF compression. If every depth change created a new line, the gVCF files would get incredibly large, to the point that they would be larger than the BAM file itself. At that point, it is easier to calculate the read depth from the BAM file directly.
In case it is helpful, there is a boolean flag --include_med_dp you can pass to make_examples (as make_examples_extra_args when calling run_deepvariant) that will include the median DP observed within the gVCF block, rounded to the nearest integer.
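As a rough illustration of passing that flag through run_deepvariant, here is a minimal Python sketch that only assembles the command line. The file names are placeholders, the helper name build_run_deepvariant_cmd is hypothetical, and the name=value form of the extra argument is my understanding of how make_examples_extra_args is written, so please check it against the help text of your DeepVariant version.

```python
import shlex


def build_run_deepvariant_cmd(ref, reads, output_vcf, output_gvcf, model_type="WGS"):
    """Assemble (but do not run) a run_deepvariant command line.

    Hypothetical helper for illustration; all paths are supplied by the caller.
    """
    return [
        "run_deepvariant",
        f"--model_type={model_type}",
        f"--ref={ref}",
        f"--reads={reads}",
        f"--output_vcf={output_vcf}",
        f"--output_gvcf={output_gvcf}",
        # Assumption: extra make_examples flags are given as name=value pairs.
        "--make_examples_extra_args=include_med_dp=true",
    ]


# Placeholder file names, not taken from the thread.
cmd = build_run_deepvariant_cmd("ref.fa", "sample.bam", "out.vcf.gz", "out.g.vcf.gz")
print(shlex.join(cmd))
```

The list form can be handed to subprocess.run as-is, or prefixed with whatever container invocation you normally use.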
Hopefully that helps!
from deepvariant.
Thank you Lucas for your answer. I understand it is for compression. I think I was looking for a function similar to GATK's BP resolution mode. When you use targets, the output would be reduced by a lot. As it is, the gVCF becomes rather useless for my purpose, since you cannot look at or filter individual sites on depth. Some sites might have a depth well over my threshold but still be filtered out, because the minimum depth of their block is well under my threshold. I am thinking the median DP will have similar issues.
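To make the concern concrete, here is a small sketch with made-up per-base depths for a single block. The numbers, the threshold, and the helper summarize_block are hypothetical; it only shows how a block-level minimum or median can disagree with per-site filtering, and does not reproduce how DeepVariant forms its blocks.

```python
from statistics import median


def summarize_block(depths):
    """Collapse per-base depths into block-level minimum and median DP."""
    return {"min_dp": min(depths), "med_dp": round(median(depths))}


# Hypothetical per-base depths for one block; one site has low coverage.
depths = [35, 34, 36, 8, 33]
threshold = 20

block = summarize_block(depths)

# Per-site filtering keeps every position that meets the threshold.
passing_sites = [i for i, d in enumerate(depths) if d >= threshold]

# Block-level filtering is all-or-nothing for the whole block.
block_passes_on_min = block["min_dp"] >= threshold
block_passes_on_med = block["med_dp"] >= threshold

print(block, passing_sites, block_passes_on_min, block_passes_on_med)
```

With these numbers, four of the five sites pass individually, the minimum rejects the whole block, and the median accepts the whole block including the low-depth site.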
If the purpose of this is filtering, you can run something like mosdepth (https://github.com/brentp/mosdepth) to get bp-resolution coverage. Then create a BED file of the regions that meet your depth requirement. Finally, intersect the VCF file with the BED file, and it should give you the variants you are looking for.
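The three steps above can be sketched in plain Python as follows. This assumes per-base coverage rows of (chrom, start, end, depth) in 0-based, half-open BED coordinates, sorted by position, which is how I understand mosdepth's per-base output. The function names and the data are hypothetical, and in practice a tool such as bedtools intersect would likely do the last step.

```python
def depth_to_bed(per_base, min_depth):
    """Merge adjacent coverage rows with depth >= min_depth into BED intervals.

    per_base: sorted iterable of (chrom, start, end, depth), 0-based half-open.
    """
    intervals = []
    for chrom, start, end, depth in per_base:
        if depth < min_depth:
            continue
        last = intervals[-1] if intervals else None
        if last and last[0] == chrom and last[2] == start:
            # Extend the previous interval when this row directly abuts it.
            intervals[-1] = (chrom, last[1], end)
        else:
            intervals.append((chrom, start, end))
    return intervals


def variants_in_bed(variants, intervals):
    """Keep (chrom, pos) variants, with 1-based VCF positions, inside any interval."""
    return [
        (chrom, pos)
        for chrom, pos in variants
        # 1-based pos p is 0-based p - 1, so start <= p - 1 < end.
        if any(c == chrom and s < pos <= e for c, s, e in intervals)
    ]


# Made-up coverage rows and variant positions for illustration.
per_base = [
    ("chr1", 0, 100, 35),
    ("chr1", 100, 150, 8),
    ("chr1", 150, 300, 40),
    ("chr1", 300, 400, 25),
]
bed = depth_to_bed(per_base, min_depth=20)
kept = variants_in_bed([("chr1", 50), ("chr1", 120), ("chr1", 150), ("chr1", 151)], bed)
print(bed, kept)
```

The linear scan over intervals is fine for a small target panel; for whole genomes an interval index or an external tool would be the better choice.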
Dear @kishwarshafin, thank you for the information.
I opted for samtools depth. It also offers options to filter on base quality and mapping quality, in addition to target filtering.
@ESDeutekom, great, I will close the issue. Please feel free to reopen it if you have further questions.