mani2012 / batchqc
Provides Quality Control of sequencing samples by deducing if there is batch effect and adjusts for it.
When I analyze the bladderdata sample code (below), the limma data are the same in the uncorrected (batchqc_report.html) and the ComBat-corrected (combat_batchqc_report.html) reports, and both match the ComBat-corrected limma data displayed in the Shiny app, as can be seen in these screenshots (https://drive.google.com/file/d/1kjTFfem_pXqLneKxldAEsfylEtR_v2_b/view?usp=sharing).
Is there a way to fix this?
Thanks
Josh
R code used:
library(BatchQC)
library(bladderbatch)  # provides the bladderdata example set (bladderEset)
data(bladderdata)
pheno <- pData(bladderEset)
edata <- exprs(bladderEset)
batch <- pheno$batch
condition <- pheno$cancer
batchQC(
  edata,
  batch = batch,
  condition = condition,
  report_file = "batchqc_report.html",
  report_dir = ".",
  report_option_binary = "111111111",
  view_report = FALSE,
  interactive = TRUE
)
Hello,
I tried running this a few times, but it doesn't appear to provide any results in M2.
I don't get a Shiny object, and there are no results for SVA, ComBat, or the PCA table.
Hi,
Thanks for sharing this wonderful package. I was wondering if I could download the ComBat-corrected expression matrix from the Shiny page.
Thanks!
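The thread does not say whether the Shiny app offers a download option. As a workaround outside the app, a minimal sketch that calls ComBat from the sva package directly and writes the corrected matrix to a CSV file. The toy data, the output file name, and the choice to include condition in the model are all assumptions made for illustration.

```r
set.seed(1)
# Toy data (hypothetical): 100 features x 12 samples, 2 batches, 2 conditions.
edata <- matrix(rnorm(100 * 12, mean = 8), nrow = 100,
                dimnames = list(paste0("g", 1:100), paste0("s", 1:12)))
batch <- rep(c(1, 2), each = 6)
condition <- rep(rep(c("ctrl", "case"), each = 3), times = 2)

# Include condition in the model so ComBat adjusts for batch
# while treating condition as a covariate of interest.
mod <- model.matrix(~ condition)

# Only run the correction if the sva package is installed.
if (requireNamespace("sva", quietly = TRUE)) {
  combat_edata <- sva::ComBat(dat = edata, batch = batch, mod = mod)
  out_file <- file.path(tempdir(), "combat_corrected.csv")  # hypothetical file name
  write.csv(combat_edata, out_file)
}
```

With real data, edata would be the same features-by-samples matrix passed to batchQC(), and batch and condition the same vectors.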
Hi,
I tried to run the first example from the BatchQC Examples vignette:
library(BatchQC)
nbatch <- 3
ncond <- 2
npercond <- 10
data.matrix <- rnaseq_sim(ngenes=50, nbatch=nbatch, ncond=ncond,
    npercond=npercond, basemean=10000, ggstep=50, bbstep=2000, ccstep=800,
    basedisp=100, bdispstep=-10, swvar=1000, seed=1234)
batch <- rep(1:nbatch, each=ncond*npercond)
condition <- rep(rep(1:ncond, each=npercond), nbatch)
batchQC(data.matrix, batch=batch, condition=condition,
    report_file="batchqc_report.html", report_dir=".",
    report_option_binary="111111111",
    view_report=FALSE, interactive=TRUE, batchqc_output=TRUE)
I got an error like this:
Error in file(con, "w") : cannot open the connection
In addition: Warning message:
In file(con, "w") :
cannot open file 'batchqc_report.knit.md': Permission denied
My running environment is:
R version 3.6.1 (2019-07-05)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Fedora 31 (Thirty One)
Matrix products: default
BLAS/LAPACK: /usr/lib64/R/lib/libRblas.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets
[6] methods base
other attached packages:
[1] BatchQC_1.13.1
loaded via a namespace (and not attached):
[1] Biobase_2.44.0 splines_3.6.1
[3] bit64_0.9-7 gtools_3.8.1
[5] shiny_1.4.0 moments_0.14
[7] assertthat_0.2.1 stats4_3.6.1
[9] pander_0.6.3 blob_1.2.0
[11] yaml_2.2.0 backports_1.1.5
[13] pillar_1.4.2 RSQLite_2.1.2
[15] lattice_0.20-38 quantreg_5.51
[17] glue_1.3.1 limma_3.40.6
[19] digest_0.6.22 promises_1.1.0
[21] htmltools_0.4.0 httpuv_1.5.2
[23] Matrix_1.2-17 plyr_1.8.4
[25] XML_3.98-1.20 pkgconfig_2.0.3
[27] SparseM_1.77 genefilter_1.66.0
[29] purrr_0.3.3 xtable_1.8-4
[31] corpcor_1.6.9 gdata_2.18.0
[33] later_1.0.0 BiocParallel_1.18.1
[35] MatrixModels_0.4-1 tibble_2.1.3
[37] annotate_1.62.0 mgcv_1.8-30
[39] IRanges_2.18.3 ggvis_0.4.5
[41] BiocGenerics_0.30.0 survival_3.1-6
[43] magrittr_1.5 crayon_1.3.4
[45] mime_0.7 memoise_1.1.0
[47] mcmc_0.9-6 evaluate_0.14
[49] nlme_3.1-141 MASS_7.3-51.4
[51] gplots_3.0.1.1 tools_3.6.1
[53] matrixStats_0.55.0 stringr_1.4.0
[55] MCMCpack_1.4-4 S4Vectors_0.22.1
[57] AnnotationDbi_1.46.1 compiler_3.6.1
[59] caTools_1.17.1.2 rlang_0.4.1
[61] grid_3.6.1 RCurl_1.95-4.12
[63] rstudioapi_0.10 htmlwidgets_1.5.1
[65] bitops_1.0-6 base64enc_0.1-3
[67] rmarkdown_1.16 d3heatmap_0.6.1.2
[69] DBI_1.0.0 reshape2_1.4.3
[71] R6_2.4.0 knitr_1.25
[73] dplyr_0.8.3 fastmap_1.0.1
[75] bit_1.1-14 zeallot_0.1.0
[77] KernSmooth_2.23-16 stringi_1.4.3
[79] parallel_3.6.1 sva_3.32.1
[81] Rcpp_1.0.2 vctrs_0.2.0
[83] png_0.1-7 tidyselect_0.2.5
[85] xfun_0.10 coda_0.19-3
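The "Permission denied" message above means R could not create batchqc_report.knit.md in the directory it was writing to. The thread does not establish which directory that is, so the following is only a sketch of one thing to check: whether the directory intended for the report is writable, with a fallback to the session's temporary directory. The helper pick_writable_dir is hypothetical and not part of BatchQC.

```r
# Hypothetical helper (not part of BatchQC): return the first directory
# in `candidates` that exists and that the current user can write to.
pick_writable_dir <- function(candidates = c(".", tempdir())) {
  for (d in candidates) {
    # file.access() returns 0 when the requested access is allowed (mode 2 = write)
    if (dir.exists(d) && file.access(d, mode = 2) == 0) {
      return(normalizePath(d))
    }
  }
  stop("no writable directory found among: ", paste(candidates, collapse = ", "))
}

report_dir <- pick_writable_dir()
print(report_dir)
```

The result could then be passed as report_dir to batchQC(). If the intermediate file is actually written somewhere else, such as the package's installed directory, changing report_dir would not help and the permissions of that location would need checking instead.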
Hi,
I have been trying to make BatchQC work for the past two days to no avail. I keep getting the below error:
Quitting from lines 256-274 (batchqc_report.Rmd)
Error in if (spvaltext2 == 0) { : missing value where TRUE/FALSE needed
On closer inspection, the problem seems to arise in lines 263-264:
pval <- batchQC_shapeVariation(lcounts_adj, batch, plot = TRUE,
    groupCol = rainbow(nlevels(bf))[bf])
Inside the batchQC_shapeVariation function, I tried to narrow down the problem to see where it occurs. My findings were that in line 34 (batch_ps <- batchEffectPvalue(gnormdata, sortgroups, robust=robustGene)) the function batchEffectPvalue returns the below:
batch_ps Named num [1:4] 0 0 NaN NaN
These two NaNs cause the problem, since a NaN in the condition if (spvaltext2 == 0) cannot evaluate to TRUE or FALSE.
Inside the function batchEffectPvalue everything seems to run smoothly until we reach the following lines:
skewbatch <- unlist(lapply(1:length(batch2), function(x) apply(data[,batch2[[x]]], 1, skewness)))
kurtbatch <- unlist(lapply(1:length(batch2), function(x) apply(data[,batch2[[x]]], 1, kurtosis)))
By having a look at the skewbatch and kurtbatch objects I saw that there are some NaNs present. I believe that this is causing the problem downstream.
Now, I don't know whether this is a problem with the skewness and kurtosis functions or with my data. I tried both raw counts and quantile-normalized read counts (as suggested by you). I also filtered the quantile-normalized counts for low standard deviation and made sure that none of my batches contain genes with only zeros (as suggested on your website). I don't know what else I can do. I even thought of setting a 0 in the report_option_binary option to skip the creation of this graph, but I am hesitant because I might be leaving out an important part of the BatchQC analysis.
Can you please help me?
Best regards,
Lefteris
PhD candidate, Newcastle University, UK
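One way NaNs can appear in per-batch skewness values: a moment-based skewness divides by a power of the variance, so any gene that is constant within a batch gives 0/0, even when the constant is not zero. The sketch below illustrates this with a hand-written skewness function (assumed here to follow the usual m3 / m2^(3/2) definition) and toy data. It is not BatchQC's code, and whether this is the cause in the report above is not confirmed.

```r
# Moment-based skewness, written out so the sketch needs no extra packages.
# Assumption: this follows the usual definition m3 / m2^(3/2).
skew <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}

# Toy matrix: 3 genes x 6 samples, two batches of three samples each.
data <- rbind(
  g1 = c(1, 2, 4, 3, 5, 9),
  g2 = c(7, 7, 7, 2, 3, 8),   # constant (but non-zero) within batch 1
  g3 = c(0, 1, 3, 4, 4, 6)
)
batch <- c(1, 1, 1, 2, 2, 2)

# Per-batch, per-gene skewness: NaN where a gene has zero variance in a batch.
skew_by_batch <- sapply(split(seq_along(batch), batch),
                        function(idx) apply(data[, idx, drop = FALSE], 1, skew))
print(skew_by_batch)

# Genes that are constant within at least one batch.
constant_in_some_batch <- rownames(data)[apply(is.nan(skew_by_batch), 1, any)]
print(constant_in_some_batch)
```

Filtering on overall standard deviation or on all-zero rows would not catch g2 here, because it varies across the full data set and is non-zero. A per-batch variance check is what flags it.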
I just upgraded R, and I'm reinstalling a bunch of packages. The installed version is R 3.3.1.
I installed BatchQC via devtools, which should have installed all dependencies, but a bunch were missing that I had to install by hand:
I'm not totally up on R package specification, but it seems like either the configuration is missing an option to install dependencies of dependencies, or else these dependencies should be included explicitly.
@wevanjohnson It might be a good idea to set up a virtual machine with a build server that tries to install BatchQC (and other CBM packages?) every time a commit is pushed to Github. That way we can ensure packages are easy to install for anyone, regardless of environment.
Hi,
I noticed the tooltips on PCA plot show incorrect values for chosen PCs when ComBat or SVA adjusted data is selected:
(EDIT: I just realized that the tooltip shows the PCs from None adjusted data even when ComBat or SVA is selected.)
This was generated with example data, but I get the same issue with my own data.
library(BatchQC)
data(example_batchqc_data)
batch <- batch_indicator$V1
condition <- batch_indicator$V2
batchQC(signature_data, batch=batch, condition=condition)
I'm using BatchQC_1.8.0 and can send a full session info if you can't replicate it.
(Just a personal preference: a tooltip that shows "condition" instead of "PCs" might be more helpful when there are numerous conditions.)
Also, thanks a lot for this tool.
Are there any plans to include combat_seq in BatchQC?
I have this error while running
batchQC(data, batch=batch, condition=condition, view_report=FALSE, interactive=FALSE)
Error in if (var(data.matrix[i, cond2[[j]]]) == 0) { :
missing value where TRUE/FALSE needed
My data don't contain any NA values:
sum(is.na(data))
[1] 0
Any suggestions?
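A count of zero from sum(is.na(data)) rules out NA and NaN but not Inf or -Inf, which is.na() does not flag. The variance of a vector containing an infinite value is not a usable number, and comparing it with 0 yields NA, which is what `if` rejects with this message. The sketch below shows a broader check on toy data. It is only one possible cause and is not confirmed for the report above.

```r
# Toy matrix with no NA, but one infinite value (e.g. from log(0)).
data <- rbind(
  g1 = c(1.0, 2.0, 3.0, 4.0),
  g2 = c(0.5, -Inf, 1.5, 2.5)
)

n_na <- sum(is.na(data))              # is.na() does not flag Inf / -Inf
n_nonfinite <- sum(!is.finite(data))  # is.finite() does
print(c(na = n_na, nonfinite = n_nonfinite))

v <- var(data["g2", ])   # not a usable number, because the mean involves -Inf
print(v)
print(v == 0)            # NA, which `if ()` cannot evaluate

# Rows containing any non-finite value.
bad_rows <- rownames(data)[rowSums(!is.finite(data)) > 0]
print(bad_rows)
```

If the data are held in a data frame, converting with as.matrix() first would be needed, since is.finite() works on atomic vectors and matrices.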
Hi,
I installed BatchQC from Bioconductor (today) and ran two of the examples; both worked very well.
I'm trying with my own data, and got this error (before GUI activated):
Quitting from lines 178-179 (batchqc_report.Rmd)
Error in quantile.default(med_cor, p = 0.25) :
missing values and NaN's not allowed if 'na.rm' is FALSE
No NAs or NaNs in the expression data, batch or condition objects...
condition & batch vectors are type character.
Expression data has 189 samples & ~ 16K features (features z-scored)
table(condition,batch,useNA='always')
           batch
condition   2014 2016 2017 2018 2020 <NA>
  25minus      5    1    7    9    6    0
  26_31        0    4    8    4    1    0
  32_37        6    1   17   54   22    0
  38plus       5    1   10   19    9    0
  <NA>         0    0    0    0    0    0
Any thoughts on getting past this error?
Thanks much!
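The quantile() error means med_cor contained NA or NaN even though the inputs did not. One way that can happen with z-scored features is if a log transform is applied along the way: z-scores include negative values, and the log of a negative number is NaN in R. Whether BatchQC applies such a transform at this step is an assumption here (another report in this thread suggests the correlation code may). The sketch uses toy data and base R only.

```r
set.seed(1)
# Toy z-scored matrix: 20 features x 4 samples, so it contains negatives.
z <- matrix(rnorm(80), nrow = 20)

input_has_na <- any(is.na(z))          # FALSE: the input itself is clean

# Assumption for illustration: some downstream step applies a log transform.
logged <- suppressWarnings(log2(z))
logged_has_nan <- any(is.nan(logged))  # TRUE: negatives became NaN

# Correlations computed from it inherit the missing values ...
cors <- cor(logged)
med_cor <- apply(cors, 1, median)

# ... and quantile() refuses them unless na.rm = TRUE.
res <- tryCatch(quantile(med_cor, probs = 0.25),
                error = function(e) conditionMessage(e))
print(res)
```

If this is the mechanism, supplying data on a non-negative scale, or skipping the transform where the package allows it, would avoid the NaNs.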
From looking at the source code, I am under the impression that the data are always transformed to log2 CPM when performing the correlation analyses in correlation.R, even when log2cpm_transform = FALSE.
I was using BatchQC on microarray data, and my data are already log-transformed. Though subtle, the additional log transformation does affect the median pairwise correlations.
using batchqc_corscatter - matches output from batchQC(..., log2transfrom = FALSE)
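Pearson correlation is not invariant under a nonlinear transform, so logging data that are already on a log scale shifts the pairwise correlations. A small base-R sketch on simulated data shows the size of the effect on the median pairwise correlation. The log2(x + 1) form of the second transform is an assumption for illustration, not BatchQC's actual formula.

```r
set.seed(42)
# Toy "already log2-transformed" intensities: 200 features x 3 samples.
base <- rnorm(200, mean = 8, sd = 2)
log_expr <- cbind(s1 = base + rnorm(200, sd = 0.5),
                  s2 = base + rnorm(200, sd = 0.5),
                  s3 = base + rnorm(200, sd = 0.5))
log_expr <- pmax(log_expr, 0.1)   # keep values positive so a second log is defined

cor_once  <- cor(log_expr)              # correlations on the data as supplied
cor_twice <- cor(log2(log_expr + 1))    # after an extra, unintended log step

# Median pairwise correlation per sample (excluding self-correlation).
med <- function(m) {
  diag(m) <- NA
  apply(m, 1, median, na.rm = TRUE)
}
print(round(cbind(once = med(cor_once), twice = med(cor_twice)), 4))
```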
Hello, thanks for the great package. I successfully went through the demo but am having some trouble with my data.
batchQC(counts, batch = batch, condition = cond, interactive = TRUE)
Error in if (var(data.matrix[i, cond2[[j]]]) == 0) { :
missing value where TRUE/FALSE needed
One clue is that many of my conditions have only one replicate. Does this only work if there are multiple reps per condition (for all conditions)?
Thanks so much.
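On the single-replicate clue: in R, the variance of a single observation is NA, so a test of the form var(...) == 0 on a condition with one sample evaluates to NA and `if` stops with exactly this message. The sketch below shows that behaviour on toy data and lists the conditions with fewer than two samples. Whether BatchQC formally requires replicates in every condition is not confirmed in this thread.

```r
# Toy design: condition "C" has only one sample.
condition <- c("A", "A", "B", "B", "C")
counts <- matrix(c(5, 7, 6, 9, 4,
                   2, 2, 3, 8, 1),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("g1", "g2"), paste0("s", 1:5)))

# Variance of a single observation is NA, so `NA == 0` is NA as well.
single <- var(counts["g1", condition == "C"])
print(single)

# Conditions with fewer than two samples.
n_per_condition <- table(condition)
singletons <- names(n_per_condition)[n_per_condition < 2]
print(singletons)
```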
Hi,
This looks like a really nice package, but I'm having some trouble even running the examples in the interactive mode: https://bioconductor.org/packages/release/bioc/vignettes/BatchQC/inst/doc/BatchQCIntro.html. Running the first example (Simulate data and Apply BatchQC), the app opens, but then if I try to apply ComBat or SVA, it fails to show the output, with the error shown in the shiny window: "Error: argument "p" is missing, with no default".
The RStudio console reads as follows:
Quitting from lines 105-115 (batchqc_report.Rmd)
Warning: Error in graphics::layout: argument "p" is missing, with no default
163: graphics::layout
162: layout.matrix
142: eventReactiveHandler [/usr/local/lib/R/4.0/site-library/BatchQC/shiny/BatchQC/server.R#1013]
98: combatOutText
97: renderText [/usr/local/lib/R/4.0/site-library/BatchQC/shiny/BatchQC/server.R#1021]
96: func
83: origRenderFunc
82: output$combatOutText
2: shiny::runApp
1: batchQC
I realize this may be something about my installation/setup but I have no idea how to fix it and I'd really like to use the app. Any advice is appreciated. Thanks!
sessionInfo()
R version 4.0.2 (2020-06-22)
Platform: x86_64-apple-darwin18.7.0 (64-bit)
Running under: macOS Mojave 10.14.6
Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /usr/local/Cellar/openblas/0.3.10_1/lib/libopenblasp-r0.3.10.dylib
locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
attached base packages:
[1] parallel stats graphics grDevices utils datasets methods base
other attached packages:
[1] sva_3.36.0 BiocParallel_1.22.0 genefilter_1.70.0 mgcv_1.8-33
[5] nlme_3.1-149 Matrix_1.2-18 limma_3.44.3 reshape2_1.4.4
[9] heatmaply_1.1.1 viridis_0.5.1 viridisLite_0.3.0 plotly_4.9.2.1
[13] ggplot2_3.3.2 ggvis_0.4.6 pander_0.6.3 shiny_1.5.0
[17] BatchQC_1.16.2
loaded via a namespace (and not attached):
[1] BiocFileCache_1.12.1 plyr_1.8.6
[3] lazyeval_0.2.2 splines_4.0.2
[5] GenomeInfoDb_1.24.2 digest_0.6.25
[7] foreach_1.5.0 htmltools_0.5.0
[9] gdata_2.18.0 magrittr_1.5
[11] memoise_1.1.0 Biostrings_2.56.0
[13] annotate_1.66.0 matrixStats_0.56.0
[15] MCMCpack_1.4-9 askpass_1.1
[17] prettyunits_1.1.1 colorspace_1.4-1
[19] blob_1.2.1 rappdirs_0.3.1
[21] xfun_0.17 dplyr_1.0.2
[23] crayon_1.3.4 RCurl_1.98-1.2
[25] jsonlite_1.7.1 survival_3.2-3
[27] iterators_1.0.12 glue_1.4.2
[29] registry_0.5-1 gtable_0.3.0
[31] zlibbioc_1.34.0 XVector_0.28.0
[33] webshot_0.5.2 MatrixModels_0.4-1
[35] DelayedArray_0.14.1 BiocGenerics_0.34.0
[37] SparseM_1.78 scales_1.1.1
[39] DBI_1.1.0 edgeR_3.30.3
[41] TxDb.Mmusculus.UCSC.mm9.knownGene_3.2.2 Rcpp_1.0.5
[43] xtable_1.8-4 progress_1.2.2
[45] bit_4.0.4 deSolve_1.28
[47] stats4_4.0.2 htmlwidgets_1.5.1
[49] httr_1.4.2 gplots_3.1.0
[51] RColorBrewer_1.1-2 ellipsis_0.3.1
[53] pkgconfig_2.0.3 XML_3.99-0.5
[55] dbplyr_1.4.4 locfit_1.5-9.4
[57] tidyselect_1.1.0 rlang_0.4.7
[59] later_1.1.0.1 AnnotationDbi_1.50.3
[61] munsell_0.5.0 tools_4.0.2
[63] generics_0.0.2 moments_0.14
[65] RSQLite_2.2.0 evaluate_0.14
[67] stringr_1.4.0 fastmap_1.0.1
[69] yaml_2.2.1 mcmc_0.9-7
[71] knitr_1.29 bit64_4.0.2
[73] caTools_1.18.0 plgem_1.60.0
[75] purrr_0.3.4 dendextend_1.14.0
[77] rootSolve_1.8.2.1 mime_0.9
[79] quantreg_5.73 biomaRt_2.44.1
[81] compiler_4.0.2 rstudioapi_0.11
[83] curl_4.3 tibble_3.0.3
[85] geneplotter_1.66.0 stringi_1.4.6
[87] GenomicFeatures_1.40.1 lattice_0.20-41
[89] vctrs_0.3.3 pillar_1.4.6
[91] lifecycle_0.2.0 data.table_1.13.0
[93] bitops_1.0-6 conquer_1.0.2
[95] corpcor_1.6.9 seriation_1.2-9
[97] httpuv_1.5.4 rtracklayer_1.48.0
[99] GenomicRanges_1.40.0 R6_2.4.1
[101] promises_1.1.1 TSP_1.1-10
[103] KernSmooth_2.23-17 gridExtra_2.3
[105] IRanges_2.22.2 codetools_0.2-16
[107] MASS_7.3-52 gtools_3.8.2
[109] assertthat_0.2.1 SummarizedExperiment_1.18.2
[111] openssl_1.4.2 DESeq2_1.28.1
[113] withr_2.2.0 GenomicAlignments_1.24.0
[115] Rsamtools_2.4.0 S4Vectors_0.26.1
[117] GenomeInfoDbData_1.2.3 hms_0.5.3
[119] grid_4.0.2 tidyr_1.1.2
[121] coda_0.19-4 rmarkdown_2.3
[123] pROC_1.16.2 Biobase_2.48.0
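The thread does not identify the cause of the "argument "p" is missing" error. One thing worth checking, offered only as a possibility: graphics::layout() has no argument named p, whereas plotly, which is attached in the session above, exports its own layout() whose first argument is p. A sketch for inspecting which layout() an unqualified call would reach in the current session:

```r
# Which attached packages define a function called `layout`, in search order?
# In a fresh session this is only "package:graphics"; after library(plotly)
# an additional entry would appear ahead of it.
where <- find("layout")
print(where)

# The package whose layout() an unqualified call would currently reach.
resolved <- environmentName(environment(layout))
print(resolved)

# Argument names of the graphics version, reached via an explicit namespace.
args_graphics <- names(formals(graphics::layout))
print(args_graphics)
```

If a different package shows up first, code that calls layout() without a namespace prefix would get that version instead of the graphics one.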