Comments (5)

ACEnglish commented on June 7, 2024

from truvari.

tjiangHIT commented on June 7, 2024

Hello @ACEnglish ,
Thanks for your reply!
I have three files in that folder, i.e., HG002_SVs_Tier1_v0.6.vcf, HG002_SVs_Tier1_v0.6.vcf.gz, HG002_SVs_Tier1_v0.6.vcf.gz.tbi.

1. My first question is how to generate a vcf.tbi file.
I only know that tabix can generate a vcf.gz.tbi file, which requires the VCF file to be compressed with bgzip first.

2. My second question is which file to use as the base input: HG002_SVs_Tier1_v0.6.vcf or HG002_SVs_Tier1_v0.6.vcf.gz?
In the older version of truvari, I used base.vcf.gz and comp.vcf.gz as input.
Now, when I use the same inputs as before, I get this error:

NotImplementedError: seek not implemented in files compressed by method 1

However, when I use base.vcf and comp.vcf.gz as input, a different error is raised:

ValueError: 'pysam.libcbcf.VariantFile.fetch' requires an index

Hope you can give me some advice!
Thanks.
tjiang


ACEnglish commented on June 7, 2024

Sorry, I mistyped. You want to generate a tabix index for the bgzip-compressed vcf.gz file, then use the vcf.gz file as input.

e.g.:

bgzip HG002_SVs_Tier1_v0.6.vcf
tabix HG002_SVs_Tier1_v0.6.vcf.gz

Also make sure you have generated hg2_ont_10x_indel.vcf.gz.tbi for the comparison file.
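
For readers who want to verify this before launching a run, here is a minimal sketch that checks whether a `.tbi` file sits next to a given `.vcf.gz` file and whether it is older than the data file. The function name `tabix_index_status` and its three return values are hypothetical and not part of truvari or pysam. The sketch only inspects file presence and modification times, and assumes the default `<file>.tbi` naming that tabix uses.

```python
from pathlib import Path


def tabix_index_status(vcf_gz: str) -> str:
    """Return 'ok', 'missing', or 'stale' for the .tbi next to a .vcf.gz file.

    Hypothetical helper: it only looks at file presence and timestamps,
    it does not parse the index itself.
    """
    data = Path(vcf_gz)
    index = Path(vcf_gz + ".tbi")  # assumed default naming: <file>.tbi
    if not data.is_file():
        raise FileNotFoundError(vcf_gz)
    if not index.is_file():
        return "missing"
    # An index older than the data file may no longer describe it.
    if index.stat().st_mtime < data.stat().st_mtime:
        return "stale"
    return "ok"
```

A result of "missing" or "stale" for either the base or the comparison file would suggest re-running `tabix` on that file.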


tjiangHIT commented on June 7, 2024

Hi @ACEnglish ,
I don't think I described my problem clearly.
With version 1.2 and earlier, I ran the truvari benchmark successfully. With version 1.3 and later, the same command line makes truvari crash.
I think it is mainly caused by pysam:

truvari -f /data/tjiang/SV_benchmark/ref/human_hs37d5.fasta -b /data/tjiang/SV_benchmark/giab/HG002_SVs_Tier1_v0.6.vcf.gz --includebed /data/tjiang/SV_benchmark/giab/HG002_SVs_Tier1_v0.6.bed -o bench-5x --passonly --giabreport -r 1000 -p 0 -c hg2_ont_sniffles_5x_indel.vcf.gz
2019-11-11 07:13:32,814 [INFO] Params:
{
"base": "/data/tjiang/SV_benchmark/giab/HG002_SVs_Tier1_v0.6.vcf.gz",
"comp": "hg2_ont_sniffles_5x_indel.vcf.gz",
"output": "bench-5x",
"reference": "/data/tjiang/SV_benchmark/ref/human_hs37d5.fasta",
"giabreport": true,
"debug": false,
"prog": false,
"refdist": 1000,
"pctsim": 0.0,
"pctsize": 0.7,
"pctovl": 0.0,
"typeignore": false,
"gtcomp": false,
"bSample": null,
"cSample": null,
"sizemin": 50,
"sizefilt": 30,
"sizemax": 50000,
"passonly": true,
"no_ref": false,
"includebed": "/data/tjiang/SV_benchmark/giab/HG002_SVs_Tier1_v0.6.bed",
"multimatch": false
}
Traceback (most recent call last):
  File "/home/tjiang/miniconda3/envs/eva/bin/truvari", line 1044, in <module>
    main(sys.argv[1:])
  File "/home/tjiang/miniconda3/envs/eva/bin/truvari", line 706, in main
    vcf_base = pysam.VariantFile(args.base)
  File "pysam/libcbcf.pyx", line 4015, in pysam.libcbcf.VariantFile.__init__
  File "pysam/libcbcf.pyx", line 4274, in pysam.libcbcf.VariantFile.open
  File "pysam/libchtslib.pyx", line 518, in pysam.libchtslib.HTSFile.tell
NotImplementedError: seek not implemented in files compressed by method 1

I hope my description is clearer this time.
What can I do to solve this?
Thanks,
tjiang


ACEnglish commented on June 7, 2024

I believe the problem is directly related to the following: pysam-developers/pysam#536
Specifically, the last comment, which says:
"Cheers! I realised my issue was actually due to the way the VCF file was compressed (needed to be bgzip)"
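
One way to tell whether an existing `.vcf.gz` file was written by bgzip or by plain gzip is to look at its header: a BGZF block is a gzip member whose extra field carries a `BC` subfield of length 2. The sketch below does that check with the standard library only. The name `is_bgzf` is hypothetical, and the code inspects only the first block, so treat it as a quick diagnostic rather than a full validation.

```python
import struct


def is_bgzf(path: str) -> bool:
    """Return True if the file starts with a BGZF block header (bgzip output)."""
    with open(path, "rb") as handle:
        header = handle.read(12)
        if len(header) < 12:
            return False
        # Bytes 0-2: gzip magic (1f 8b) and deflate method (8).
        if header[0] != 0x1F or header[1] != 0x8B or header[2] != 8:
            return False
        # Byte 3: flags; bit 0x04 (FEXTRA) must be set for BGZF.
        if not header[3] & 0x04:
            return False
        # Bytes 10-11: little-endian length of the extra field.
        (xlen,) = struct.unpack("<H", header[10:12])
        extra = handle.read(xlen)

    # Walk the extra subfields looking for 'B', 'C' with payload length 2.
    pos = 0
    while pos + 4 <= len(extra):
        si1, si2 = extra[pos], extra[pos + 1]
        (slen,) = struct.unpack("<H", extra[pos + 2:pos + 4])
        if si1 == ord("B") and si2 == ord("C") and slen == 2:
            return True
        pos += 4 + slen
    return False
```

If this returns False for a file ending in `.vcf.gz`, decompressing it and recompressing with `bgzip` before running `tabix` would be the natural next step.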

