rajlabmssm / echotabix
echoverse module: Tabix indexing and querying.
It would be useful to automatically handle situations where query_granges spans multiple chromosomes. This could be done by adding an extra loop at the level of query or query_vcf/query_table.
Otherwise, errors like this can occur:
query_dat <- rbind(echodata::BST1[1:50,],
echodata::LRRK2[1:50,], fill=TRUE)
annot_dt <- echoannot::IMPACT_query(query_dat=query_dat,
populations="EUR")
testthat::expect_equal(dim(annot_dt),c(13,1419))
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Inferred format: 'table'
Querying tabular tabix file using: Rsamtools.
Checking query chromosome style is correct.
Chromosome format: 1
Retrieving data.
Error: scanTabix: '4' not present in tabix index
path: https://zenodo.org/record/7062238/files/IMPACT707_EUR_chr12.annot.bgz?download=1
index: https://zenodo.org/record/7062238/files/IMPACT707_EUR_chr12.annot.bgz.tbi?download=1
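One way to picture the extra loop suggested above is to split the query by chromosome and run one single-chromosome query per group. This is only a minimal base-R sketch, not echotabix code: `query_by_chrom`, the `query_fun` callback, and the `CHR`/`POS` column names are assumptions made for illustration.

```r
# Hypothetical helper (not part of echotabix): run one query per chromosome.
query_by_chrom <- function(query_dat, query_fun,
                           chrom_col = "CHR", pos_col = "POS") {
  # Split the rows so that each chunk contains a single chromosome.
  chunks <- split(query_dat, query_dat[[chrom_col]])
  res <- lapply(names(chunks), function(chrom) {
    chunk <- chunks[[chrom]]
    # One min/max block per chromosome, mirroring the as_blocks=TRUE behaviour.
    query_fun(chrom = chrom,
              start = min(chunk[[pos_col]]),
              end = max(chunk[[pos_col]]))
  })
  names(res) <- names(chunks)
  res
}

# Toy data and a stand-in query function that simply echoes the requested range.
toy <- data.frame(CHR = c(4, 4, 12, 12),
                  POS = c(100, 200, 5000, 6000))
echo_range <- function(chrom, start, end) {
  data.frame(chrom = chrom, start = start, end = end)
}
blocks <- query_by_chrom(toy, echo_range)
```

In a real setting, `query_fun` would wrap the actual tabix query. For per-chromosome files such as the chr12 annotation file in the error above, it would also need to pick the file that matches each chromosome.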
I checked, and the file being written to does indeed already exist:
x <- "IMPACT707/Annotations/IMPACT707_EAS_chr1.annot.gz"
file.exists(echotabix::construct_tabix_path(target_path = x))
out <- echotabix::convert(target_path = x,
chrom_col = "CHR",
start_col = "BP",
comment_char = "CHR",
force_new = FALSE)
and yet the file is still being reprocessed:
========= echotabix::convert =========
Converting full summary stats file to tabix format for fast querying.
Inferred format: 'table'
Determining chrom type from file header.
Chromosome format: 1
Detecting column delimiter.
Identified column separator: \t
Sorting rows by coordinates via bash.
Searching for header row with zgrep.
( zgrep ^'CHR' .../IMPACT707_EAS_chr1.annot.gz; zgrep -v ^'CHR' .../IMPACT707_EAS_chr1.annot.gz | sort -k1,1n -k2,2n ) > .../file3ef1cb2ba03_sorted.tsv
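The behaviour expected from force_new = FALSE can be sketched as a simple guard: if the output file already exists, skip the sorting and conversion steps entirely. The following is a minimal base-R illustration under that assumption. `convert_if_needed` and `fake_convert` are hypothetical names, not the echotabix implementation.

```r
# Hypothetical wrapper (not the echotabix implementation):
# only run the conversion when the output is missing or force_new = TRUE.
convert_if_needed <- function(tabix_path, convert_fun, force_new = FALSE) {
  if (file.exists(tabix_path) && !force_new) {
    message("Tabix file already exists. Skipping conversion: ", tabix_path)
    return(invisible(tabix_path))
  }
  convert_fun(tabix_path)
  invisible(tabix_path)
}

# Toy demonstration: a stand-in converter that counts how often it is called.
n_conversions <- 0
fake_convert <- function(path) {
  n_conversions <<- n_conversions + 1
  file.create(path)
}
tmp <- tempfile(fileext = ".bgz")
convert_if_needed(tmp, fake_convert)                    # file missing: converts
convert_if_needed(tmp, fake_convert)                    # file exists: skipped
convert_if_needed(tmp, fake_convert, force_new = TRUE)  # forced: converts again
```

A fuller check would probably also confirm that the matching .tbi index exists before skipping.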
# Solution
A very small but important fix was editing this line in .Rbuildignore. The pattern was too broad: it excluded the rm_tbi.R file from the build, not just .tbi files (which I do want to ignore). As a result, the built echotabix package did not contain the rm_tbi function.
.*.tbi
--> .*\.tbi$
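The difference between the two patterns can be checked directly in R. .Rbuildignore entries are Perl-like regular expressions matched case-insensitively against file paths relative to the package root, which `grepl()` can approximate. In this sketch only rm_tbi.R is taken from the text above; the other two paths are made-up examples.

```r
# Example paths relative to the package root.
# Only rm_tbi.R is from the issue; the other paths are illustrative.
paths <- c("R/rm_tbi.R", "inst/extdata/example.tsv.bgz.tbi", "R/query.R")

old_pattern <- ".*.tbi"      # unescaped dot, no end anchor
new_pattern <- ".*\\.tbi$"   # literal dot, anchored at the end of the path

# Approximate how R CMD build applies .Rbuildignore patterns.
old_hits <- grepl(old_pattern, paths, perl = TRUE, ignore.case = TRUE)
new_hits <- grepl(new_pattern, paths, perl = TRUE, ignore.case = TRUE)

print(data.frame(path = paths, old = old_hits, new = new_hits))
```

The old pattern matches R/rm_tbi.R because the unescaped dot accepts the underscore before "tbi" and nothing requires the match to end the path. The new pattern only matches paths that end in a literal ".tbi".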
query_dat <- echodata::BST1[seq(1, 50), ]
locus_dir <- file.path(tempdir(), echodata::locus_dir)
LD_list <- echoLD::get_LD(
locus_dir = locus_dir,
query_dat = query_dat,
LD_reference = "1KGphase1")
Here is what I got:
LD_reference identified as: 1kg.
Using 1000Genomes as LD reference panel.
Constructing GRanges query using min/max ranges across one or more chromosomes.
+ as_blocks=TRUE: Will query a single range per chromosome that covers all regions requested (plus anything in between).
LD Reference Panel = 1KGphase1
Querying 1KG remote server.
========= echotabix::query =========
query_dat is already a GRanges object. Returning directly.
Explicit format: 'vcf'
Querying VCF tabix file.
Importing existing VCF file: /tmp/RtmpdpQfkr/VCF/RtmpdpQfkr.chr4-14884541-16649679.ALL.chr4.phase1_release_v3.20101123.snps_indels_svs.genotypes.vcf.bgz
> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.1
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.11.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: Europe/London
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] echotabix_0.99.10
loaded via a namespace (and not attached):
[1] tidyselect_1.2.0 dplyr_1.1.3 blob_1.2.4
[4] filelock_1.0.2 R.utils_2.12.2 Biostrings_2.70.1
[7] bitops_1.0-7 fastmap_1.1.1 RCurl_1.98-1.13
[10] BiocFileCache_2.10.1 VariantAnnotation_1.48.0 GenomicAlignments_1.38.0
[13] XML_3.99-0.15 digest_0.6.33 lifecycle_1.0.4
[16] KEGGREST_1.42.0 RSQLite_2.3.3 magrittr_2.0.3
[19] compiler_4.3.1 rlang_1.1.2 progress_1.2.2
[22] tools_4.3.1 utf8_1.2.4 yaml_2.3.7
[25] data.table_1.14.8 rtracklayer_1.62.0 htmlwidgets_1.6.2
[28] prettyunits_1.2.0 S4Arrays_1.2.0 bit_4.0.5
[31] curl_5.1.0 reticulate_1.34.0 DelayedArray_0.28.0
[34] xml2_1.3.5 pkgload_1.3.3 abind_1.4-5
[37] BiocParallel_1.36.0 purrr_1.0.2 BiocGenerics_0.48.1
[40] R.oo_1.25.0 grid_4.3.1 stats4_4.3.1
[43] echoconda_0.99.9 fansi_1.0.5 biomaRt_2.58.0
[46] SummarizedExperiment_1.32.0 cli_3.6.1 crayon_1.5.2
[49] generics_0.1.3 rstudioapi_0.15.0 tzdb_0.4.0
[52] httr_1.4.7 rjson_0.2.21 piggyback_0.1.5
[55] DBI_1.1.3 cachem_1.0.8 stringr_1.5.1
[58] zlibbioc_1.48.0 parallel_4.3.1 AnnotationDbi_1.64.1
[61] BiocManager_1.30.22 XVector_0.42.0 restfulr_0.0.15
[64] matrixStats_1.1.0 basilisk_1.14.0 vctrs_0.6.4
[67] Matrix_1.6-1.1 jsonlite_1.8.7 dir.expiry_1.10.0
[70] IRanges_2.36.0 hms_1.1.3 S4Vectors_0.40.1
[73] bit64_4.0.5 GenomicFeatures_1.54.1 tidyr_1.3.0
[76] glue_1.6.2 codetools_0.2-19 DT_0.30
[79] stringi_1.8.1 GenomeInfoDb_1.38.1 GenomicRanges_1.54.1
[82] BiocIO_1.12.0 tibble_3.2.1 pillar_1.9.0
[85] htmltools_0.5.7 basilisk.utils_1.14.0 rappdirs_0.3.3
[88] GenomeInfoDbData_1.2.11 BSgenome_1.70.1 R6_2.5.1
Missing a system dependency? It might be possible to circumvent this with one of the echotabix alternatives:
https://github.com/RajLabMSSM/echolocatoR/actions/runs/3357812148/jobs/5563951744#step:21:1
Run options(crayon.enabled = TRUE)
Loading required package: sessioninfo
── R CMD build ─────────────────────────────────────────────────────────────────
* checking for file ‘.../DESCRIPTION’ ... OK
* preparing ‘echolocatoR’:
* checking DESCRIPTION meta-information ... OK
* installing the package to build vignettes
* creating vignettes ... ERROR
Error: --- re-building ‘BD_GWAS.Rmd’ using rmarkdown
--- finished re-building ‘BD_GWAS.Rmd’
--- re-building ‘echolocatoR.Rmd’ using rmarkdown
Quitting from lines 85-95 (echolocatoR.Rmd)
Error: Error: processing vignette 'echolocatoR.Rmd' failed with diagnostics:
bgzip executable could be identified.
--- failed re-building ‘echolocatoR.Rmd’
--- re-building ‘finemapping_portal.Rmd’ using rmarkdown
Downloading: https://github.com/RajLabMSSM/Fine_Mapping_Shiny/raw/master/www/data/GWAS/Nalls23andMe_2019/ASXL3/multi_finemap/ASXL3.UKB.multi_finemap.csv.gz
trying URL 'https://github.com/RajLabMSSM/Fine_Mapping_Shiny/raw/master/www/data/GWAS/Nalls23andMe_2019/ASXL3/multi_finemap/ASXL3.UKB.multi_finemap.csv.gz'
Content type 'application/octet-stream' length 157348 bytes (153 KB)
==================================================
downloaded 153 KB
Downloading: https://github.com/RajLabMSSM/Fine_Mapping_Shiny/raw/master/www/data/GWAS/Nalls23andMe_2019/BIN3/multi_finemap/BIN3.UKB.multi_finemap.csv.gz
trying URL 'https://github.com/RajLabMSSM/Fine_Mapping_Shiny/raw/master/www/data/GWAS/Nalls23andMe_2019/BIN3/multi_finemap/BIN3.UKB.multi_finemap.csv.gz'
Content type 'application/octet-stream' length 230370 bytes (224 KB)
==================================================
downloaded 224 KB
Downloading: https://github.com/RajLabMSSM/Fine_Mapping_Shiny/raw/master/www/data/GWAS/Nalls23andMe_2019/ASXL3/LD/ASXL3.UKB.LD.csv.gz
trying URL 'https://github.com/RajLabMSSM/Fine_Mapping_Shiny/raw/master/www/data/GWAS/Nalls23andMe_2019/ASXL3/LD/ASXL3.UKB.LD.csv.gz'
Content type 'application/octet-stream' length 66098 bytes (64 KB)
==================================================
downloaded 64 KB
Downloading: https://github.com/RajLabMSSM/Fine_Mapping_Shiny/raw/master/www/data/GWAS/Nalls23andMe_2019/BIN3/LD/BIN3.UKB.LD.csv.gz
trying URL 'https://github.com/RajLabMSSM/Fine_Mapping_Shiny/raw/master/www/data/GWAS/Nalls23andMe_2019/BIN3/LD/BIN3.UKB.LD.csv.gz'
Content type 'application/octet-stream' length 100237 bytes (97 KB)
==================================================
downloaded 97 KB
--- finished re-building ‘finemapping_portal.Rmd’
--- re-building ‘plot_locus.Rmd’ using rmarkdown
Failed with error: 'there is no package called 'pals''
Failed with error: 'there is no package called 'pals''
The magick package is required to crop "/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/RtmpBvAYS5/Rbuildabab4d92368a/echolocatoR/vignettes/plot_locus_files/figure-html/trk_plot-1.png" but not available.
Failed with error: 'there is no package called 'pals''
Failed with error: 'there is no package called 'pals''
The magick package is required to crop "/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/RtmpBvAYS5/Rbuildabab4d92368a/echolocatoR/vignettes/plot_locus_files/figure-html/modify track-1.png" but not available.
Failed with error: 'there is no package called 'pals''
Failed with error: 'there is no package called 'pals''
The magick package is required to crop "/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/RtmpBvAYS5/Rbuildabab4d92368a/echolocatoR/vignettes/plot_locus_files/figure-html/trk_plot.xgr-1.png" but not available.
Failed with error: 'there is no package called 'pals''
Failed with error: 'there is no package called 'pals''
The magick package is required to crop "/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/RtmpBvAYS5/Rbuildabab4d92368a/echolocatoR/vignettes/plot_locus_files/figure-html/trk_plot.QTL-1.png" but not available.
--- finished re-building ‘plot_locus.Rmd’
--- re-building ‘QTLs.Rmd’ using rmarkdown
[tabix] the index file exists. Please use '-f' to overwrite.
Failed with error: 'there is no package called 'seqminer''
Quitting from lines 74-83 (QTLs.Rmd)
Error: Error: processing vignette 'QTLs.Rmd' failed with diagnostics:
there is no package called 'seqminer'
--- failed re-building ‘QTLs.Rmd’
--- re-building ‘summarise.Rmd’ using rmarkdown
The magick package is required to crop "/private/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/RtmpBvAYS5/Rbuildabab4d92368a/echolocatoR/vignettes/summarise_files/figure-html/super_summary_plot()-1.png" but not available.
--- finished re-building ‘summarise.Rmd’
SUMMARY: processing the following files failed:
‘echolocatoR.Rmd’ ‘QTLs.Rmd’
Error: Error: Vignette re-building failed.
Execution halted
Error: Error in proc$get_built_file() : Build process failed
Calls: <Anonymous> ... build_package -> with_envvar -> force -> <Anonymous>
Execution halted
Error: Process completed with exit code 1.
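The failures in this log come from missing executables (bgzip) and missing R packages (seqminer, pals, magick). A pre-flight check along the following lines could report such gaps before the vignettes are built. This is a hedged base-R sketch: `check_deps` is a hypothetical helper, not a function from echotabix or echolocatoR.

```r
# Hypothetical pre-flight check (not an echotabix or echolocatoR function).
check_deps <- function(executables = c("bgzip", "tabix"),
                       packages = c("seqminer", "pals", "magick")) {
  # Sys.which() returns "" for executables that are not on the PATH.
  exe_paths <- unname(Sys.which(executables))
  # requireNamespace() returns FALSE when a package is not installed.
  pkg_ok <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  list(missing_executables = executables[exe_paths == ""],
       missing_packages = packages[!pkg_ok])
}

status <- check_deps()
if (length(status$missing_executables) > 0) {
  message("Missing executables: ",
          paste(status$missing_executables, collapse = ", "))
}
if (length(status$missing_packages) > 0) {
  message("Missing packages: ",
          paste(status$missing_packages, collapse = ", "))
}
```

Running something like this at the start of a CI job would surface all missing dependencies at once, rather than one vignette failure at a time.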