Comments (5)
from truvari.
Hello @ACEnglish ,
Thanks for your reply!
I have three files in that folder, i.e., HG002_SVs_Tier1_v0.6.vcf, HG002_SVs_Tier1_v0.6.vcf.gz, HG002_SVs_Tier1_v0.6.vcf.gz.tbi.
1. My first question is how to generate a vcf.tbi file.
I only know that tabix can generate a vcf.gz.tbi file, which requires the VCF file to be compressed with bgzip first.
2. My second question is which file to use as the base input: HG002_SVs_Tier1_v0.6.vcf or HG002_SVs_Tier1_v0.6.vcf.gz?
In the older version of truvari, I used base.vcf.gz and comp.vcf.gz as input.
Now, when I use the same inputs as before, I get this error:
NotImplementedError: seek not implemented in files compressed by method 1
However, when I use base.vcf and comp.vcf.gz as input, a different error is raised:
ValueError: 'pysam.libcbcf.VariantFile.fetch' requires an index
Hope you can give me some advice!
Thanks.
tjiang
Sorry, I mistyped. You want to generate a tabix index (.tbi) for the bgzip-compressed vcf.gz file, then use the vcf.gz file as input.
e.g.:
bgzip HG002_SVs_Tier1_v0.6.vcf
tabix HG002_SVs_Tier1_v0.6.vcf.gz
Also make sure the index for your comparison file, hg2_ont_10x_indel.vcf.gz.tbi, has been generated.
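For anyone who wants to script the two steps above, here is a minimal Python sketch. It assumes bgzip and tabix are installed and on PATH. The function names `index_commands` and `compress_and_index` are made up for illustration and are not part of truvari or pysam.

```python
import shutil
import subprocess


def index_commands(vcf_path):
    """Build the bgzip and tabix command lines for an uncompressed VCF."""
    return [
        # bgzip replaces vcf_path with vcf_path + ".gz"
        ["bgzip", vcf_path],
        # tabix with the vcf preset writes vcf_path + ".gz.tbi"
        ["tabix", "-p", "vcf", vcf_path + ".gz"],
    ]


def compress_and_index(vcf_path):
    """Run both commands in order, stopping at the first failure."""
    for command in index_commands(vcf_path):
        if shutil.which(command[0]) is None:
            raise RuntimeError(f"{command[0]} not found on PATH")
        subprocess.run(command, check=True)
    return vcf_path + ".gz"
```

Calling `compress_and_index("HG002_SVs_Tier1_v0.6.vcf")` should leave the .vcf.gz and .vcf.gz.tbi files next to each other, and the returned .vcf.gz path is the one to pass with -b.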
Hi @ACEnglish ,
I don't think I described my problem clearly.
With version 1.2 or earlier, I ran the truvari benchmark successfully. With version 1.3 or later, the same command line fails.
I think it was mainly caused by pysam:
truvari -f /data/tjiang/SV_benchmark/ref/human_hs37d5.fasta -b /data/tjiang/SV_benchmark/giab/HG002_SVs_Tier1_v0.6.vcf.gz --includebed /data/tjiang/SV_benchmark/giab/HG002_SVs_Tier1_v0.6.bed -o bench-5x --passonly --giabreport -r 1000 -p 0 -c hg2_ont_sniffles_5x_indel.vcf.gz
2019-11-11 07:13:32,814 [INFO] Params:
{
"base": "/data/tjiang/SV_benchmark/giab/HG002_SVs_Tier1_v0.6.vcf.gz",
"comp": "hg2_ont_sniffles_5x_indel.vcf.gz",
"output": "bench-5x",
"reference": "/data/tjiang/SV_benchmark/ref/human_hs37d5.fasta",
"giabreport": true,
"debug": false,
"prog": false,
"refdist": 1000,
"pctsim": 0.0,
"pctsize": 0.7,
"pctovl": 0.0,
"typeignore": false,
"gtcomp": false,
"bSample": null,
"cSample": null,
"sizemin": 50,
"sizefilt": 30,
"sizemax": 50000,
"passonly": true,
"no_ref": false,
"includebed": "/data/tjiang/SV_benchmark/giab/HG002_SVs_Tier1_v0.6.bed",
"multimatch": false
}
Traceback (most recent call last):
File "/home/tjiang/miniconda3/envs/eva/bin/truvari", line 1044, in <module>
main(sys.argv[1:])
File "/home/tjiang/miniconda3/envs/eva/bin/truvari", line 706, in main
vcf_base = pysam.VariantFile(args.base)
File "pysam/libcbcf.pyx", line 4015, in pysam.libcbcf.VariantFile.__init__
File "pysam/libcbcf.pyx", line 4274, in pysam.libcbcf.VariantFile.open
File "pysam/libchtslib.pyx", line 518, in pysam.libchtslib.HTSFile.tell
NotImplementedError: seek not implemented in files compressed by method 1
I hope my description is clearer this time.
What can I do to solve this?
Thanks,
tjiang
I believe the problem is directly related to pysam-developers/pysam#536.
Specifically, the last comment there says:
"Cheers! I realised my issue was actually due to the way the VCF file was compressed (needed to be bgzip)"
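To check whether a .vcf.gz file was written by bgzip or by plain gzip, you can look at its header. The sketch below relies on the BGZF layout described in the SAM/BAM specification: a gzip header with the FEXTRA flag set and a "BC" extra subfield with a 2-byte payload. The helper name `is_bgzf` is hypothetical and is not a pysam or truvari function.

```python
import struct


def is_bgzf(path):
    """Return True if the file starts with a BGZF block (bgzip output)."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        if len(header) < 12:
            return False
        # gzip magic bytes, DEFLATE method, and the FEXTRA flag (bit 2)
        if header[:3] != b"\x1f\x8b\x08" or not header[3] & 0x04:
            return False
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = handle.read(xlen)
    # Walk the extra subfields looking for "BC" with a 2-byte payload
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        (slen,) = struct.unpack("<H", extra[pos + 2:pos + 4])
        if (si1, si2, slen) == (ord("B"), ord("C"), 2):
            return True
        pos += 4 + slen
    return False
```

If `is_bgzf("HG002_SVs_Tier1_v0.6.vcf.gz")` returns False, the file was probably compressed with plain gzip. In that case, decompressing it and recompressing with bgzip before running tabix should produce a file that pysam can seek in.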