kharchenkolab / cacoa

Single-cell Case Control Analysis

R 87.59% C++ 12.41%

cacoa's Issues

unable to use Cacoa on a seurat object

Hi, could you please provide a detailed guide to using Cacoa on a Seurat object?

I am getting errors whose cause I cannot work out. For example, when running "cao$estimateCellLoadings()" I get:
"Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 't': arguments imply differing number of rows: 47652, 0"

Thank you for your help

fails when sample.groups describes a subset of samples

If the sample.groups argument covers only a subset of the samples in the conos object, the constructor accepts it, but both estimateCellLoadings() and estimateExpressionShiftMagnitudes() fail with uninformative error messages. Passing a smaller conos object that covers only the samples with valid levels in the sample.groups factor fixes the issue.
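A pre-flight check along these lines can surface the mismatch before any estimation step is run. This is only a sketch in base R: checkSampleCoverage is a hypothetical helper, not part of cacoa, and it assumes sample.per.cell is named by cell and sample.groups is named by sample.

```r
# Hypothetical helper (not part of cacoa): find samples without a valid group.
# sample.per.cell: named vector, cell name -> sample name
# sample.groups:   named vector, sample name -> condition (NA = no valid level)
checkSampleCoverage <- function(sample.per.cell, sample.groups) {
  samples <- unique(as.character(sample.per.cell))
  valid <- names(sample.groups)[!is.na(sample.groups)]
  missing <- setdiff(samples, valid)
  # Cells that belong to a sample with a valid group
  keep.cells <- names(sample.per.cell)[as.character(sample.per.cell) %in% valid]
  list(missing.samples=missing, keep.cells=keep.cells)
}

# Toy data: sample s3 has cells but no entry in sample.groups
sample.per.cell <- c(c1="s1", c2="s1", c3="s2", c4="s3")
sample.groups <- c(s1="control", s2="case")
res <- checkSampleCoverage(sample.per.cell, sample.groups)
```

Here res$missing.samples lists the samples that would need to be dropped from the data object, and res$keep.cells lists the cells that remain.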

Cacoa not working with Seurat object

Hi team,
I have been using a Seurat object for the analysis. Could you please clarify what is meant by the sample.per.cell argument?
I have been passing the two-level group information as sample.groups and the annotated cell type column as cell.groups.
If I use the orig.ident column as sample.groups, I get an error saying that sample groups must be a two-level factor describing which samples are being contrasted.
Thank you,
Aditya
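Judging from the argument names and from how they are used in other issues on this page, one plausible reading is that sample.per.cell and cell.groups hold one entry per cell, while sample.groups holds one entry per sample. The sketch below builds all three from a toy metadata table; the column names (sample, condition, celltype) are hypothetical stand-ins for whatever the Seurat metadata actually contains.

```r
# Toy per-cell metadata standing in for seurat@meta.data (column names are hypothetical)
meta <- data.frame(
  sample    = c("s1", "s1", "s2", "s2", "s3"),
  condition = c("WT", "WT", "KO", "KO", "WT"),
  celltype  = c("T", "B", "T", "T", "B"),
  row.names = paste0("cell", 1:5),
  stringsAsFactors = FALSE
)

# One entry per cell: the sample each cell came from
sample.per.cell <- setNames(meta$sample, rownames(meta))
# One entry per cell: the cell type annotation
cell.groups <- setNames(meta$celltype, rownames(meta))
# One entry per sample: the condition, named by sample
per.sample <- unique(meta[, c("sample", "condition")])
sample.groups <- setNames(per.sample$condition, per.sample$sample)

# These would then be passed on, e.g. (not run here):
# cao <- Cacoa$new(seurat, sample.groups=sample.groups, cell.groups=cell.groups,
#                  sample.per.cell=sample.per.cell, ref.level="WT", target.level="KO")
```

Under this reading, passing a per-cell column such as orig.ident as sample.groups would not give a two-level, per-sample vector, which would be consistent with the error quoted above.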

estimateCellLoadings() fails: object 'mx.first' not found

Error in nrow(mx.first) : object 'mx.first' not found
> traceback()
4: nrow(mx.first)
3: referenceSet(cnts, groups, p.thresh = 0.1)
2: runCoda(cnts, groups, n.boot = n.boot, n.seed = n.seed, ref.cell.type = ref.cell.type)
1: cao$estimateCellLoadings()

> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.5 (Ootpa)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.12.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] slingshot_2.5.1             TrajectoryUtils_1.2.0       SingleCellExperiment_1.16.0 SummarizedExperiment_1.24.0
 [5] Biobase_2.54.0              GenomicRanges_1.46.1        GenomeInfoDb_1.30.1         IRanges_2.28.0             
 [9] S4Vectors_0.32.4            BiocGenerics_0.40.0         MatrixGenerics_1.6.0        matrixStats_0.62.0         
[13] princurve_2.1.6             cacoa_0.4.0                 pagoda2_1.0.10              cowplot_1.1.1              
[17] sp_1.5-0                    SeuratObject_4.1.2          Seurat_4.2.0                qs_0.25.4                  
[21] conos_1.4.9                 igraph_1.3.5                Matrix_1.5-1                sccore_1.0.2               
[25] ggplot2_3.3.6               dplyr_1.0.10                magrittr_2.0.3              CRMetrics_0.2.1            

loaded via a namespace (and not attached):
  [1] ica_1.0-3                 foreach_1.5.2             lmtest_0.9-40             crayon_1.5.2             
  [5] spatstat.core_2.4-4       MASS_7.3-58.1             rhdf5filters_1.6.0        nlme_3.1-160             
  [9] backports_1.4.1           GOSemSim_2.20.0           rlang_1.0.6               XVector_0.34.0           
 [13] ROCR_1.0-11               irlba_2.3.5.1             SparseM_1.81              extrafontdb_1.0          
 [17] stringfish_0.15.7         extrafont_0.18            BiocParallel_1.28.3       rjson_0.2.21             
 [21] bit64_4.0.5               glue_1.6.2                sctransform_0.3.5         parallel_4.1.2           
 [25] vipor_0.4.5               spatstat.sparse_2.1-1     AnnotationDbi_1.56.2      DOSE_3.20.1              
 [29] spatstat.geom_2.4-0       tidyselect_1.2.0          fitdistrplus_1.1-8        XML_3.99-0.11            
 [33] tidyr_1.2.1               zoo_1.8-11                proj4_1.0-11              ggpubr_0.4.0             
 [37] xtable_1.8-4              MatrixModels_0.5-1        evaluate_0.17             cli_3.4.1                
 [41] zlibbioc_1.40.0           rstudioapi_0.14           miniUI_0.1.1.1            EnhancedVolcano_1.12.0   
 [45] bslib_0.4.0               rpart_4.1.16              fastmatch_1.1-3           pbmcapply_1.5.1          
 [49] treeio_1.18.1             maps_3.4.0                shiny_1.7.2               xfun_0.34                
 [53] clue_0.3-61               cluster_2.1.4             tidygraph_1.2.2           KEGGREST_1.34.0          
 [57] tibble_3.1.8              quantreg_5.94             ggrepel_0.9.1             ape_5.6-2                
 [61] listenv_0.8.0             Biostrings_2.62.0         png_0.1-7                 future_1.28.0            
 [65] withr_2.5.0               bitops_1.0-7              ggforce_0.4.1             plyr_1.8.7               
 [69] pillar_1.8.1              RcppParallel_5.1.5        GlobalOptions_0.1.2       cachem_1.0.6             
 [73] GetoptLong_1.0.5          clusterProfiler_4.2.2     DelayedMatrixStats_1.16.0 vctrs_0.5.0              
 [77] ellipsis_0.3.2            generics_0.1.3            urltools_1.7.3            RApiSerialize_0.1.2      
 [81] tools_4.1.2               beeswarm_0.4.0            munsell_0.5.0             tweenr_2.0.2             
 [85] fgsea_1.20.0              DelayedArray_0.20.0       fastmap_1.1.0             compiler_4.1.2           
 [89] abind_1.4-5               httpuv_1.6.6              plotly_4.10.0             rgeos_0.5-9              
 [93] GenomeInfoDbData_1.2.7    gridExtra_2.3             ggpp_0.4.5                lattice_0.20-45          
 [97] deldir_1.0-6              utf8_1.2.2                later_1.3.0               jsonlite_1.8.3           
[101] scales_1.2.1              dendsort_0.3.4            tidytree_0.4.1            pbapply_1.5-0            
[105] carData_3.0-5             sparseMatrixStats_1.6.0   genefilter_1.76.0         lazyeval_0.2.2           
[109] promises_1.2.0.1          car_3.1-1                 doParallel_1.0.17         R.utils_2.12.0           
[113] goftest_1.2-3             spatstat.utils_2.3-1      reticulate_1.26           brew_1.0-8               
[117] rmarkdown_2.17            ash_1.0-15                Rtsne_0.16                downloader_0.4           
[121] uwot_0.1.14               coda.base_0.5.2           Rook_1.1-1                survival_3.4-0           
[125] yaml_2.3.6                htmltools_0.5.3           memoise_2.0.1             candisc_0.8-6            
[129] locfit_1.5-9.6            graphlayouts_0.8.2        viridisLite_0.4.1         digest_0.6.30            
[133] assertthat_0.2.1          mime_0.12                 Rttf2pt1_1.3.11           N2R_1.0.1                
[137] RSQLite_2.2.18            yulab.utils_0.0.5         future.apply_1.9.1        data.table_1.14.2        
[141] blob_1.2.3                R.oo_1.25.0               drat_0.2.3                splines_4.1.2            
[145] labeling_0.4.2            Rhdf5lib_1.16.0           RCurl_1.98-1.9            broom_1.0.1              
[149] rhdf5_2.38.1              colorspace_2.0-3          mnormt_2.1.1              ggbeeswarm_0.6.0         
[153] shape_1.4.6               addinexamplesWV_0.2.0     aplot_0.1.8               ggrastr_1.0.1            
[157] sass_0.4.2                Rcpp_1.0.9                RANN_2.6.1                circlize_0.4.15          
[161] enrichplot_1.14.2         fansi_1.0.3               parallelly_1.32.1         R6_2.5.1                 
[165] grid_4.1.2                ggridges_0.5.4            lifecycle_1.0.3           formatR_1.12             
[169] ggsignif_0.6.4            leiden_0.4.3              jquerylib_0.1.4           DO.db_2.9                
[173] qvalue_2.26.0             RcppAnnoy_0.0.20          RColorBrewer_1.1-3        iterators_1.0.14         
[177] stringr_1.4.1             htmlwidgets_1.5.4         polyclip_1.10-0           triebeard_0.3.0          
[181] purrr_0.3.5               RMTstat_0.3.1             shadowtext_0.1.2          gridGraphics_0.5-1       
[185] ComplexHeatmap_2.10.0     mgcv_1.8-40               globals_0.16.1            leidenAlg_1.0.5          
[189] patchwork_1.1.2           spatstat.random_2.2-0     progressr_0.11.0          codetools_0.2-18         
[193] GO.db_3.14.0              psych_2.2.9               R.methodsS3_1.8.2         gtable_0.3.1             
[197] DBI_1.1.3                 ggpmisc_0.5.0             ggfun_0.0.7               tensor_1.5               
[201] httr_1.4.4                KernSmooth_2.23-20        stringi_1.7.8             reshape2_1.4.4           
[205] farver_2.1.1              annotate_1.72.0           viridis_0.6.2             ggtree_3.2.1             
[209] ggdendro_0.1.23           heplots_1.4-2             ggalt_0.4.0               geneplotter_1.72.0       
[213] ggplotify_0.1.0           scattermore_0.8           DESeq2_1.34.0             bit_4.0.4                
[217] scatterpie_0.1.8          spatstat.data_2.2-0       ggraph_2.1.0              pkgconfig_2.0.3          
[221] rstatix_0.7.0             knitr_1.40

estimateExpressionShiftMagnitudes uses all available memory

Tested on 118k cells, 12 clusters, largest cluster 36k cells. Job killed after swallowing all available memory (500 GB). I will look more into this and test n.subsamples and n.cores.

Add some ways to invalidate cache

We should probably have two ways:

a parameter "ignore.cache=F" for each of the estimate* functions
a function clearCache() for the Cacoa object
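To make the two proposed mechanisms concrete, here is a minimal closure-based sketch in base R. It is not how the Cacoa object is implemented; makeCachedEstimator and its key argument are hypothetical names used only for illustration.

```r
# Hypothetical sketch (not cacoa code): a cache with both invalidation routes
makeCachedEstimator <- function(estimate.fun) {
  cache <- new.env()
  n.computed <- 0
  estimate <- function(key, ..., ignore.cache=FALSE) {
    # Recompute when asked to ignore the cache or when nothing is stored yet
    if (ignore.cache || !exists(key, envir=cache, inherits=FALSE)) {
      assign(key, estimate.fun(...), envir=cache)
      n.computed <<- n.computed + 1
    }
    get(key, envir=cache, inherits=FALSE)
  }
  # Drop every stored result
  clearCache <- function() rm(list=ls(cache, all.names=TRUE), envir=cache)
  list(estimate=estimate, clearCache=clearCache, nComputed=function() n.computed)
}

est <- makeCachedEstimator(function(x) x^2)
r1 <- est$estimate("sq", 3)                     # computed
r2 <- est$estimate("sq", 3)                     # served from the cache
r3 <- est$estimate("sq", 3, ignore.cache=TRUE)  # recomputed on request
est$clearCache()
r4 <- est$estimate("sq", 3)                     # recomputed after clearing
```

The per-call flag suits a one-off recomputation, while clearCache() suits the case where the underlying data or parameters have changed and every stored result is stale.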

Warning in plotClusterFreeExpressionShifts()

Warning: `invoke()` is deprecated as of rlang 0.4.0.
Please use `exec()` or `inject()` instead.

Warning in estimateClusterFreeExpressionShifts()

Warning in private$getTopGenes(n.top.genes, gene.selection = gene.selection,  :
  Please run estimateClusterFreeDE() first to use gene.selection='z' or 'lfc'. Fall back to gene.selection='expression'.
as(<dgCMatrix>, "dgTMatrix") is deprecated since Matrix 1.5-0; do as(., "TsparseMatrix") instead
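The deprecation message itself names the replacement coercion. A small sketch with the Matrix package, using a toy matrix rather than cacoa data, shows the virtual-class form that the message recommends:

```r
library(Matrix)

# Toy column-compressed sparse matrix (dgCMatrix)
m <- sparseMatrix(i=c(1, 2, 3), j=c(1, 3, 2), x=c(1.5, 2, 4))

# Coerce to triplet form via the virtual class, as the message suggests,
# instead of the deprecated as(m, "dgTMatrix")
m.t <- as(m, "TsparseMatrix")
```

The values are unchanged; only the storage format differs.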

Troubles with estimateExpressionShiftMagnitudes() and estimateDiffCellDensity() functions

Dear Cacoa team,

Thank you for creating such a promising tool for analyzing single-cell RNA-seq data.

I recently tried to apply it to our data but ran into several issues:

cao$estimateExpressionShiftMagnitudes() fails with the following message:

Filtering data... 
Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : 
  'NA' indices are not (yet?) supported for sparse Matrices

Could you please specify what exactly is wrong with the input data?

cao$estimateDiffCellDensity(type='permutation', verbose=T) function also fails with the following message:

> cao$estimateCellDensity()
> cao$estimateDiffCellDensity(type='permutation', verbose=T)

Error in smoothSignalOnGraph(., filter = g.filt, graph = graph) : 
  The provided graph is not connected. It has to be connected to estimate l.max.

It starts working when the smooth argument is set to FALSE.

This message is confusing because the graph appears to be connected:

> gg <- cao$data.object@graphs$integrated_nn %>% graph_from_adjacency_matrix()
> igraph::is_connected(gg)
[1] TRUE

I tried to fix it by changing the k.param and prune.SNN arguments to make the graph of the original Seurat object more connected in the following way...

whole.integrated <- FindNeighbors(whole.integrated,
                                  k.param = 100, # instead of default k = 20
                                  prune.SNN = 0, # instead of default prune.SNN = 1/15
                                  )

...but it didn't help.

Accordingly, plotDiffCellDensity() works only with the permutation parameter; otherwise it returns the same error.

Could you please clarify how these issues may be fixed? Thank you!

cao object can be downloaded here.

Add a vignette explaining parameters for all analyses

estimateCellLoadings fails when less than 3 cell types

I have an example with only two cell types. No matter which method I choose, I can't run estimateCellLoadings. This is due to several problems, for the standard method (lda) these problems start after running 'lm' since we then only have two coefficients, intercept and b.

cacoa/R/coda.R

Line 47 in 4b6e5aa

w <- res.lm$coefficients[-1]

A quick solution could be to add dummy cell type(s) with 0 counts. I.e., after running extractCodaData

cacoa/R/cacoa.R

Line 1496 in 4b6e5aa

 tmp <- private$extractCodaData(cells.to.remove=cells.to.remove, cells.to.remain=cells.to.remain, 

add

cnts <- tmp$d.counts
if (ncol(cnts) < 3) {
  # Pad with all-zero dummy cell types until there are three columns.
  # Binding onto cnts (not tmp$d.counts) keeps every dummy added so far.
  for (i in seq_len(3 - ncol(tmp$d.counts))) {
    cnts <- cbind(cnts, 0)
    colnames(cnts)[ncol(cnts)] <- paste0("dummy", i)
  }
}

After the calculations, we remove the dummy cell type(s) from the results. This is a quick and dirty fix, but it seems to work.
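The pad-then-strip round trip described above can be sketched on a toy count matrix. padCellTypes and dropDummies are hypothetical helper names, and the column sums merely stand in for whatever per-cell-type result the real calculation returns.

```r
# Hypothetical helpers (not cacoa code)
padCellTypes <- function(cnts, min.types=3) {
  n.dummy <- max(0, min.types - ncol(cnts))
  if (n.dummy > 0) {
    # All-zero columns named dummy1, dummy2, ...
    dummy <- matrix(0, nrow(cnts), n.dummy,
                    dimnames=list(rownames(cnts), paste0("dummy", seq_len(n.dummy))))
    cnts <- cbind(cnts, dummy)
  }
  cnts
}

# Remove entries for dummy cell types from a named result vector
dropDummies <- function(x) x[!grepl("^dummy[0-9]+$", names(x))]

# Two samples, two cell types
cnts <- matrix(c(10, 20, 5, 7), nrow=2, dimnames=list(c("s1", "s2"), c("A", "B")))
padded <- padCellTypes(cnts)
# Stand-in for a per-cell-type result such as loadings
res <- dropDummies(colSums(padded))
```

Note that a real cell type whose name happens to match the dummy pattern would also be dropped, so a less collision-prone prefix may be preferable in practice.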

@pkharchenko, @iganna, @VPetukhov - do you have a better idea that could be implemented easily at this stage? I assume it's a rare case where one would have so few cell types, so we have to consider how much time we should spend on this.

ClusterFreeExpression Not S4 object

Thank you for the amazing package. I was running the steps and everything worked until the differential gene analysis, which failed with a "Not an S4 object" error. I am not sure what the problem is, as all the previous steps ran fine with the cao environment.

cao$estimateClusterFreeExpressionShifts(gene.selection="expression", min.n.between=1, min.n.within=1)
Error in estimateClusterFreeExpressionShiftsC(t(cm), self$sample.per.cell[rownames(cm)], :
Not an S4 object.

I appreciate your help and this tool!

plotOntologyHeatmapCollapsed function got an error

Hi, I ran into an error when calling the cao$plotOntologyHeatmapCollapsed function:

Error in `chr_as_locations()`:
! Can't rename columns that don't exist.
✖ Column `qvalues` doesn't exist.

Before this step everything ran well, and the functions cao$estimateDEPerCellType and cao$estimateOntology (I used 'GSEA') also worked. Here is the output I got, including rlang::last_error():

> pdf(paste0(rep.name, '/', rep.name,"_plotOntologyHeatmapCollapsed_up_in_MUT.pdf"), height=15, width=10)
> print(cao$plotOntologyHeatmapCollapsed(
+   name="GSEA", genes="up", n=50, clust.method="ward.D", size.range=c(1, 4)) + ggtitle("Up in MUT"))
Error in (function (cond)  : 
  error in evaluating the argument 'x' in selecting a method for function 'print': Can't rename columns that don't exist.
✖ Column `qvalues` doesn't exist.
> rlang::last_error()
<error/vctrs_error_subscript_oob>
Error in `chr_as_locations()`:
! Can't rename columns that don't exist.
✖ Column `qvalues` doesn't exist.
---
Backtrace:
  1. cao$plotOntologyHeatmapCollapsed(...)
  6. dplyr:::rename.data.frame(., geneID = core_enrichment, qvalue = qvalues)
  7. tidyselect::eval_rename(expr(c(...)), .data)
  8. tidyselect:::rename_impl(...)
  9. tidyselect:::eval_select_impl(...)
 18. tidyselect:::vars_select_eval(...)
 19. tidyselect:::walk_data_tree(expr, data_mask, context_mask, error_call)
 20. tidyselect:::eval_c(expr, data_mask, context_mask)
 21. tidyselect:::reduce_sels(node, data_mask, context_mask, init = init)
 22. tidyselect:::walk_data_tree(new, data_mask, context_mask)
 23. tidyselect:::as_indices_sel_impl(...)
 24. tidyselect:::as_indices_impl(x, vars, call = call, strict = strict)
 25. tidyselect:::chr_as_locations(x, vars, call = call)
Run `rlang::last_trace()` to see the full context.

Is there anything I can do to fix this problem and generate the GSEA results? Thank you so much for the help in advance!

Originally posted by @Angel-Wei in #21 (comment)
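The backtrace shows the failure inside a rename of core_enrichment to geneID and qvalues to qvalue. The thread does not say why the qvalues column is absent, so the following is only a sketch of a defensive rename that skips columns that do not exist; renameIfPresent is a hypothetical helper and the toy data frame is not real enrichment output.

```r
# Hypothetical helper: rename columns only when the old name is present
# renames: named character vector, new name -> old name
renameIfPresent <- function(df, renames) {
  for (new in names(renames)) {
    old <- renames[[new]]
    if (old %in% colnames(df)) colnames(df)[colnames(df) == old] <- new
  }
  df
}

# Toy result table that already uses 'qvalue' and has no 'qvalues' column
df <- data.frame(core_enrichment="GeneA/GeneB", qvalue=0.01)
out <- renameIfPresent(df, c(geneID="core_enrichment", qvalue="qvalues"))
```

With this approach the existing qvalue column is left untouched and only core_enrichment is renamed.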

Troubles with estimateExpressionShiftMagnitudes()

Dear Cacoa team,

I heard about Cacoa on Twitter and found it really useful for comparative analysis of scRNA-seq data, so I gave it a try at once. Thanks for this powerful tool.

However, I ran into a problem when running estimateExpressionShiftMagnitudes():

Error in cbind(mtx, ext.mtx)[, col.names, drop = FALSE] : invalid or not-yet-implemented 'Matrix' subsetting
9. stop("invalid or not-yet-implemented 'Matrix' subsetting")
8. cbind(mtx, ext.mtx)[, col.names, drop = FALSE]
7. cbind(mtx, ext.mtx)[, col.names, drop = FALSE]
6. FUN(X[[i]], ...)
5. lapply(., sccore:::extendMatrix, genes)
4. lapply(., `[`, , genes, drop = FALSE)
3. cms.filt %<>% lapply(sccore:::extendMatrix, genes) %>% lapply(`[`, , genes, drop = FALSE)
2. filterExpressionDistanceInput(count.matrices, cell.groups = cell.groups, sample.per.cell = self$sample.per.cell, sample.groups = self$sample.groups, min.cells.per.sample = min.cells.per.sample, min.samp.per.type = min.samp.per.type, min.gene.frac = min.gene.frac, genes = genes, verbose = verbose)

cao$estimateExpressionShiftMagnitudes(min.cells = 10, n.cells = 1000, dist = "cor", n.subsamples = 50)

Here is my code and session info:

library(Seurat)
library(cacoa)
library(cowplot)
library(tidyverse)

seu = readRDS("output.Zfp541KO_vs_WT.rds")

cell.groups <- seu$annotation
sample.per.cell <- seu$orig.ident
sample.groups <- c("WT", "Zfp541.KO")
names(sample.groups) <- c("WT", "Z541")
ref.level <- "WT"
target.level <- "Zfp541.KO"
embedding <- FetchData(seu, vars = paste0("umap_pred_",1:2))

cao <- Cacoa$new(seu, sample.groups=sample.groups, cell.groups=cell.groups, sample.per.cell=sample.per.cell, 
                 ref.level=ref.level, target.level=target.level, embedding=embedding)

cao$estimateExpressionShiftMagnitudes(min.cells=10, n.cells=1e3, dist="cor", n.subsamples=50)

> R version 4.1.2 (2021-11-01)
Platform: x86_64-conda-linux-gnu (64-bit)
Running under: CentOS Linux 7 (Core)

Matrix products: default
BLAS/LAPACK: /home/conda/envs/r4.1/lib/libopenblasp-r0.3.18.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8   
 [6] LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] forcats_0.5.1      stringr_1.4.0      dplyr_1.0.8        purrr_0.3.4        readr_2.1.2        tidyr_1.2.0        tibble_3.1.6      
 [8] ggplot2_3.3.5      tidyverse_1.3.1    cowplot_1.1.1      cacoa_0.2.0        Matrix_1.4-0       SeuratObject_4.0.4 Seurat_4.1.0      

loaded via a namespace (and not attached):
  [1] utf8_1.2.2                  reticulate_1.24             R.utils_2.11.0              tidyselect_1.1.2            RSQLite_2.2.9              
  [6] AnnotationDbi_1.56.2        htmlwidgets_1.5.4           grid_4.1.2                  Rtsne_0.15                  munsell_0.5.0              
 [11] codetools_0.2-18            ica_1.0-2                   future_1.23.0               miniUI_0.1.1.1              withr_2.4.3                
 [16] colorspace_2.0-3            Biobase_2.54.0              knitr_1.37                  rstudioapi_0.13             stats4_4.1.2               
 [21] ROCR_1.0-11                 ggsignif_0.6.3              tensor_1.5                  listenv_0.8.0               MatrixGenerics_1.6.0       
 [26] labeling_0.4.2              GenomeInfoDbData_1.2.7      polyclip_1.10-0             bit64_4.0.5                 farver_2.1.0               
 [31] parallelly_1.30.0           vctrs_0.3.8                 generics_0.1.2              xfun_0.29                   R6_2.5.1                   
 [36] GenomeInfoDb_1.30.1         bitops_1.0-7                spatstat.utils_2.3-0        cachem_1.0.6                DelayedArray_0.20.0        
 [41] assertthat_0.2.1            promises_1.2.0.1            scales_1.1.1                gtable_0.3.0                globals_0.14.0             
 [46] goftest_1.2-3               rlang_1.0.2                 splines_4.1.2               rstatix_0.7.0               lazyeval_0.2.2             
 [51] spatstat.geom_2.3-1         broom_0.7.12                yaml_2.3.5                  reshape2_1.4.4              abind_1.4-5                
 [56] modelr_0.1.8                backports_1.4.1             httpuv_1.6.5                tools_4.1.2                 sccore_1.0.1               
 [61] ellipsis_0.3.2              spatstat.core_2.3-2         RColorBrewer_1.1-2          BiocGenerics_0.40.0         ggridges_0.5.3             
 [66] Rcpp_1.0.8                  plyr_1.8.6                  zlibbioc_1.40.0             RCurl_1.98-1.5              ggpubr_0.4.0               
 [71] rpart_4.1.16                deldir_1.0-6                pbapply_1.5-0               S4Vectors_0.32.3            zoo_1.8-9                  
 [76] SummarizedExperiment_1.24.0 haven_2.4.3                 ggrepel_0.9.1               cluster_2.1.2               fs_1.5.2                   
 [81] magrittr_2.0.2              data.table_1.14.2           scattermore_0.7             lmtest_0.9-39               reprex_2.0.1               
 [86] RANN_2.6.1                  fitdistrplus_1.1-6          matrixStats_0.61.0          hms_1.1.1                   patchwork_1.1.1            
 [91] mime_0.12                   evaluate_0.15               xtable_1.8-4                XML_3.99-0.8                AUCell_1.16.0              
 [96] readxl_1.3.1                IRanges_2.28.0              gridExtra_2.3               compiler_4.1.2              KernSmooth_2.23-20         
[101] crayon_1.5.0                R.oo_1.24.0                 htmltools_0.5.2             mgcv_1.8-39                 later_1.2.0                
[106] tzdb_0.2.0                  lubridate_1.8.0             DBI_1.1.2                   dbplyr_2.1.1                MASS_7.3-56                
[111] car_3.0-12                  permute_0.9-7               cli_3.2.0                   R.methodsS3_1.8.1           parallel_4.1.2             
[116] igraph_1.2.11               GenomicRanges_1.46.1        pkgconfig_2.0.3             plotly_4.10.0               spatstat.sparse_2.1-0      
[121] xml2_1.3.3                  annotate_1.72.0             XVector_0.34.0              rvest_1.0.2                 digest_0.6.29              
[126] sctransform_0.3.3           RcppAnnoy_0.0.19            graph_1.72.0                spatstat.data_2.1-2         Biostrings_2.62.0          
[131] rmarkdown_2.11              cellranger_1.1.0            leiden_0.3.9                uwot_0.1.11                 GSEABase_1.56.0            
[136] shiny_1.7.1                 lifecycle_1.0.1             nlme_3.1-155                jsonlite_1.7.3              carData_3.0-5              
[141] viridisLite_0.4.0           fansi_1.0.3                 pillar_1.7.0                ggsci_2.9                   lattice_0.20-45            
[146] KEGGREST_1.34.0             fastmap_1.1.0               httr_1.4.2                  survival_3.2-13             glue_1.6.2                 
[151] png_0.1-7                   bit_4.0.4                   stringi_1.7.6               blob_1.2.2                  memoise_2.0.1              
[156] irlba_2.3.5                 future.apply_1.8.1          ape_5.6-2

The seurat object here

Error: dimension should be positive

When I convert a Seurat object to Cacoa, an error is reported:

cao<- Cacoa$new(seurat, sample.groups=newmetadata$treat, cell.groups=seurat$celltype, sample.per.cell=seurat$sample,
ref.level="PreTreat", target.level="PostTreat")

Which step did I get wrong?

Error: dimension should be positive
9. stop("dimension should be positive", call. = FALSE)
8. check_dim(dim)
7. coda.base::ilr_basis(ncol(freqs), type = "default")
6. getLoadings(samples.init$cnts[[ib]], samples.init$groups[[ib]])
5. FUN(X[[i]], ...)
4. lapply(1:length(samples.init$cnts), function(ib) {
   getLoadings(samples.init$cnts[[ib]], samples.init$groups[[ib]])
   })
3. do.call(cbind, lapply(1:length(samples.init$cnts), function(ib) {
   getLoadings(samples.init$cnts[[ib]], samples.init$groups[[ib]])
   }))
2. runCoda(cnts, groups, n.boot = n.boot, n.seed = n.seed, ref.cell.type = ref.cell.type)
1. cao$estimateCellLoadings()

palette argument in plotSampleDistances works only for discrete covariates

The problem is in

cacoa/R/plot.R

Line 685 in 6a5efd2

gg <- gg + scale_color_manual(values=palette)
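One possible direction, offered as a sketch rather than the package's fix, is to pick the scale from the covariate type: a manual scale for discrete covariates and a gradient for continuous ones. isDiscreteCovariate and chooseColorScale are hypothetical names; the ggplot2 functions are only resolved when chooseColorScale is actually called.

```r
# Hypothetical helper: treat factors, characters and logicals as discrete
isDiscreteCovariate <- function(x) is.factor(x) || is.character(x) || is.logical(x)

# Hypothetical helper: return a ggplot2 colour scale suited to the covariate.
# Requires ggplot2 to be installed when called.
chooseColorScale <- function(covariate, palette) {
  if (isDiscreteCovariate(covariate)) {
    ggplot2::scale_color_manual(values=palette)
  } else {
    # Interpolate the palette colours across the continuous range
    ggplot2::scale_color_gradientn(colours=palette)
  }
}

# Usage (not run): gg <- gg + chooseColorScale(covariate, palette)
```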

sample.groups are not extracted automatically anymore

We should discuss whether this is desired. Originally, sample.groups were extracted automatically based on sample names and ref.level. Now, if one doesn't set sample.groups argument (#8bf3d95):

cao <- cacoa:::Cacoa$new(con, cell.groups = annotation, ref.level="wt", n.cores=30)
Error: Must request at least one colour from a hue palette.
traceback()
6: stop("Must request at least one colour from a hue palette.",
call. = FALSE)
5: (scales::hue_pal())(length(levels(sample.groups)))
4: rev((scales::hue_pal())(length(levels(sample.groups))))
3: setNames(rev((scales::hue_pal())(length(levels(sample.groups)))),
levels(sample.groups))
2: .subset2(public_bind_env, "initialize")(...)
1: cacoa:::Cacoa$new(con, cell.groups = annotation, ref.level = "wt",
n.cores = 30)
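For the discussion, automatic extraction based on sample names and ref.level could look roughly like the sketch below. This is not the original implementation, which is not shown in the thread; inferSampleGroups is a hypothetical function and it assumes that reference samples contain ref.level as a substring of their name.

```r
# Hypothetical sketch: derive a two-level factor from sample names
inferSampleGroups <- function(sample.names, ref.level, target.level) {
  is.ref <- grepl(ref.level, sample.names, fixed=TRUE)
  # Fail early with a clear message instead of a downstream palette error
  if (all(is.ref) || !any(is.ref)) {
    stop("Could not split samples into two groups using ref.level '", ref.level, "'")
  }
  groups <- factor(ifelse(is.ref, ref.level, target.level),
                   levels=c(ref.level, target.level))
  setNames(groups, sample.names)
}

sg <- inferSampleGroups(c("wt_1", "wt_2", "ko_1"), ref.level="wt", target.level="ko")
```

Whichever behaviour is chosen, an explicit check like the one above would give a more informative error than the hue palette message in the traceback.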

invalid character indexing when running estimateClusterFreeDE

Hey,

I'm running into the following error when trying to estimate cluster-free DE (with cao$estimateClusterFreeDE):

> de_cacoa = cao$estimateClusterFreeDE(min.expr.frac=0.01, robust = FALSE, adjust.pvalues=TRUE, smooth=TRUE, verbose=TRUE)
Estimating cluster-free Z-scores for 7979 most expressed genes
Warning in extractCellGraph.Seurat(self$data.object) :
  The provided adjacency matrix is not symmetric. Converting it to undirected graph.
as(<dsCMatrix>, "dgCMatrix") is deprecated since Matrix 1.5-0; do as(., "generalMatrix") instead
Error in intI(j, n = d[2L], dn[[2L]], give.dn = FALSE) : 
  invalid character indexing

All I do prior to that is create a Seurat object (I convert it manually from SingleCellExperiment, because the original data is stored as a SingleCellExperiment) and create a new Cacoa object:

# convert to seurat and calculate NN graph
seurat = CreateSeuratObject(counts = as.matrix(counts(sce)))
seurat[["RNA"]]@data = as.matrix(logcounts(sce))
seurat[["pca"]] <- CreateDimReducObject(embeddings = reducedDim(sce , "pca.corrected"), key = "pc_")
seurat[["umap_pca"]] <- CreateDimReducObject(embeddings = as.matrix(reducedDim(sce , "UMAP")), key = "umap_")
seurat = FindNeighbors(seurat , reduction = "pca" , dims = 1:30 , k.param = 50)
seurat = AddMetaData(seurat , as.character(sce$tomato) , col.name = "sample.groups")
seurat = AddMetaData(seurat , sce$celltype , col.name = "cell.groups")
seurat = AddMetaData(seurat , sce$sample , col.name = "sample.per.cell")

# actively set other params
ref.level = "Tal1+"
target.level = "Tal1-"
embedding = as.data.frame(reducedDim(sce , "UMAP"))

sample.groups = seurat$sample.groups
names(sample.groups) = sce$sample

cell.groups = seurat$cell.groups
names(cell.groups) = sce$cell

sample.per.cell = seurat$sample.per.cell
names(sample.per.cell) = sce$cell

# run cacoa
cao <- Cacoa$new(seurat, sample.groups=sample.groups, cell.groups=cell.groups, sample.per.cell=sample.per.cell, 
                 ref.level=ref.level, target.level=target.level, graph.name="RNA_nn", embedding = embedding)
cao$plot.params <- list(size=0.1, alpha=0.1, font.size=c(2, 3))
cao$plot.theme <- cao$plot.theme + theme(legend.background=element_blank())


de_cacoa = cao$estimateClusterFreeDE(min.expr.frac=0.01, robust = FALSE, adjust.pvalues=TRUE, smooth=TRUE, verbose=TRUE)

What is quite weird is that very similar code ran yesterday. I was then playing with some settings and something apparently broke; I tried to retrace the script, to the best of my recollection, back to the version that ran, but with no success. In the meantime I also reinstalled the cacoa package.

Any advice will be much appreciated! Thanks!
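One thing that may be worth ruling out (an assumption, since the thread does not confirm the cause) is a mismatch between the names on the per-cell vectors and the cell names of the data object, because indexing a matrix by names it does not contain can produce character indexing errors. The sketch below uses toy vectors; checkCellNames is a hypothetical helper.

```r
# Hypothetical helper: compare names of per-cell vectors with the object's cell names
checkCellNames <- function(object.cells, ...) {
  vecs <- list(...)
  lapply(vecs, function(v) {
    list(has.names=!is.null(names(v)),
         n.missing=length(setdiff(object.cells, names(v))),  # cells without an entry
         n.extra=length(setdiff(names(v), object.cells)))    # entries for unknown cells
  })
}

object.cells <- c("cellA", "cellB", "cellC")        # stand-in for colnames(seurat)
cell.groups <- c(cellA="T", cellB="B", cell_C="T")  # one name does not match
report <- checkCellNames(object.cells, cell.groups=cell.groups)
```

Non-zero n.missing or n.extra counts would point to the vector whose names need to be aligned with the object.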

My session info is here:

> sessionInfo()
R version 4.2.3 (2023-03-15)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Ventura 13.3.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] sp_1.4-7                    SeuratObject_4.1.0          Seurat_4.1.0                cacoa_0.4.0                
 [5] Matrix_1.5-3                dplyr_1.0.9                 ggbeeswarm_0.6.0            stringr_1.4.0              
 [9] viridis_0.6.2               viridisLite_0.4.0           ggpubr_0.4.0                ggplot2_3.3.6              
[13] SingleCellExperiment_1.18.0 SummarizedExperiment_1.26.1 Biobase_2.56.0              GenomicRanges_1.48.0       
[17] GenomeInfoDb_1.32.1         IRanges_2.30.0              S4Vectors_0.34.0            BiocGenerics_0.42.0        
[21] MatrixGenerics_1.9.1        matrixStats_0.62.0          BiocParallel_1.30.0         miloR_1.4.0                
[25] edgeR_3.38.0                limma_3.52.0                miloDE_0.0.0.9000          

loaded via a namespace (and not attached):
  [1] scattermore_0.8           coda_0.19-4               tidyr_1.2.0               clusterGeneration_1.3.7  
  [5] knitr_1.39                irlba_2.3.5               DelayedArray_0.22.0       wesanderson_0.3.6        
  [9] data.table_1.14.2         rpart_4.1.19              hardhat_1.2.0             RCurl_1.98-1.6           
 [13] generics_0.1.2            ScaledMatrix_1.4.0        cowplot_1.1.1             RANN_2.6.1               
 [17] combinat_0.0-8            future_1.25.0             lubridate_1.8.0           spatstat.data_3.0-0      
 [21] httpuv_1.6.5              assertthat_0.2.1          gower_1.0.0               xfun_0.30                
 [25] evaluate_0.15             promises_1.2.0.1          fansi_1.0.3               readxl_1.4.0             
 [29] igraph_1.3.1              DBI_1.1.2                 tmvnsim_1.0-2             htmlwidgets_1.5.4        
 [33] spatstat.geom_3.0-3       tester_0.1.7              purrr_0.3.4               ellipsis_0.3.2           
 [37] paleotree_3.4.4           RSpectra_0.16-1           backports_1.4.1           ggcorrplot_0.1.3         
 [41] deldir_1.0-6              sparseMatrixStats_1.7.0   vctrs_0.4.1               ROCR_1.0-11              
 [45] abind_1.4-5               withr_2.5.0               batchelor_1.12.0          ggforce_0.3.3            
 [49] progressr_0.10.0          sctransform_0.3.3         geneBasisR_0.0.0.9000     scran_1.24.0             
 [53] parsnip_1.0.0             goftest_1.2-3             mnormt_2.0.2              phytools_1.0-3           
 [57] cluster_2.1.4             ape_5.6-2                 lazyeval_0.2.2            crayon_1.5.1             
 [61] recipes_1.0.1             pkgconfig_2.0.3           tweenr_1.0.2              nlme_3.1-162             
 [65] vipor_0.4.5               nnet_7.3-18               pals_1.7                  rlang_1.0.4              
 [69] globals_0.15.0            lifecycle_1.0.1           miniUI_0.1.1.1            fastDummies_1.6.3        
 [73] rsvd_1.0.5                dichromat_2.0-0.1         cellranger_1.1.0          polyclip_1.10-0          
 [77] RcppHNSW_0.3.0            lmtest_0.9-40             yardstick_1.1.0           phangorn_2.8.1           
 [81] carData_3.0-5             zoo_1.8-10                beeswarm_0.4.0            ggridges_0.5.3           
 [85] png_0.1-7                 bitops_1.0-7              KernSmooth_2.23-20        DelayedMatrixStats_1.18.0
 [89] RcppGreedySetCover_0.1.0  parallelly_1.31.1         spatstat.random_3.0-1     rstatix_0.7.0            
 [93] ggsignif_0.6.3            sccore_1.0.1              beachmat_2.12.0           scales_1.2.0             
 [97] magrittr_2.0.3            plyr_1.8.7                ica_1.0-2                 gdata_2.18.0             
[101] zlibbioc_1.42.0           compiler_4.2.3            dqrng_0.3.0               RColorBrewer_1.1-3       
[105] plotrix_3.8-2             fitdistrplus_1.1-8        cli_3.3.0                 XVector_0.36.0           
[109] listenv_0.8.0             patchwork_1.1.1           pbapply_1.5-0             MASS_7.3-58.2            
[113] mgcv_1.8-42               tidyselect_1.1.2          stringi_1.7.6             yaml_2.3.5               
[117] BiocSingular_1.12.0       locfit_1.5-9.5            ggrepel_0.9.1             pbmcapply_1.5.1          
[121] grid_4.2.3                fastmatch_1.1-3           tools_4.2.3               future.apply_1.9.0       
[125] parallel_4.2.3            rstudioapi_0.13           bluster_1.6.0             metapod_1.4.0            
[129] gridExtra_2.3             prodlim_2019.11.13        scatterplot3d_0.3-41      farver_2.1.0             
[133] Rtsne_0.16                ggraph_2.0.5              RcppZiggurat_0.1.6        digest_0.6.29            
[137] rgeos_0.5-9               lava_1.6.10               shiny_1.7.1               quadprog_1.5-8           
[141] Rcpp_1.0.8.3              car_3.0-13                broom_0.8.0               scuttle_1.6.0            
[145] later_1.3.0               RcppAnnoy_0.0.19          httr_1.4.3                colorspace_2.0-3         
[149] tensor_1.5                reticulate_1.24           splines_4.2.3             uwot_0.1.11              
[153] statmod_1.4.36            expm_0.999-6              spatstat.utils_3.0-1      scater_1.24.0            
[157] graphlayouts_0.8.0        mapproj_1.2.8             plotly_4.10.0             xtable_1.8-4             
[161] jsonlite_1.8.0            tidygraph_1.2.1           timeDate_4021.104         Rfast_2.0.6              
[165] ipred_0.9-13              MetBrewer_0.2.0           R6_2.5.1                  pillar_1.7.0             
[169] htmltools_0.5.2           mime_0.12                 glue_1.6.2                fastmap_1.1.0            
[173] BiocNeighbors_1.14.0      class_7.3-21              codetools_0.2-19          maps_3.4.0               
[177] furrr_0.3.0               utf8_1.2.2                lattice_0.20-45           spatstat.sparse_3.0-0    
[181] tibble_3.1.7              ResidualMatrix_1.6.0      Augur_1.0.3               numDeriv_2016.8-1.1      
[185] leiden_0.4.2              gtools_3.9.2              survival_3.5-3            rmarkdown_2.16           
[189] munsell_0.5.0             GenomeInfoDbData_1.2.8    rsample_1.0.0             reshape2_1.4.4           
[193] gtable_0.3.0              spatstat.core_2.4-2

Improve localZScores calculation

This function has been tested with 80k cells. I just tested with ~120k cells. It fails at:

z.scores <- (local.mean.mat[[non.ref.level]] - local.mean.mat[[ref.level]]) / pmax(stds, min.std)
Error: long vectors not supported yet
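
A possible workaround is to compute the z-scores in column blocks, so that each intermediate stays small. The sketch below assumes the failure comes from a single intermediate object exceeding 2^31 - 1 elements; `chunkedZScores` and its arguments are hypothetical names, not part of cacoa, and it is shown on plain matrices only.

```r
# Hypothetical helper, not part of cacoa: compute (target - ref) / pmax(std, min.std)
# block by block, so each intermediate covers at most chunk.size columns.
chunkedZScores <- function(mean.target, mean.ref, stds, min.std = 1e-10, chunk.size = 1000) {
  stopifnot(identical(dim(mean.target), dim(mean.ref)),
            identical(dim(mean.ref), dim(stds)))
  col.ids <- seq_len(ncol(mean.ref))
  # Split column indices into consecutive blocks
  chunks <- split(col.ids, ceiling(col.ids / chunk.size))
  res <- lapply(chunks, function(ids) {
    (mean.target[, ids, drop = FALSE] - mean.ref[, ids, drop = FALSE]) /
      pmax(stds[, ids, drop = FALSE], min.std)
  })
  # Reassemble the blocks in their original column order
  do.call(cbind, unname(res))
}
```

Whether this avoids the error in practice depends on where the size limit is actually hit, so it would need testing on a dataset of the failing size.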

`estimateMetadataSeparation` with `space = "pseudo.bulk"` returns error

cao$estimateMetadataSeparation(space = "foo")
Error in match.arg(space) : 
  'arg' should be one of “expression.shifts”, “coda”, “pseudo.bulk”

but


cao$estimateMetadataSeparation(sample.meta = meta, space = "pseudo.bulk")
Error in self$getSampleDistanceMatrix(space = space, cell.type = NULL,  : 
  Not implemented space: pseudo.bulk!
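
The two messages suggest that the argument check and the downstream dispatch disagree about which spaces exist. A minimal sketch of how that can happen is given below; `getDistances` and its handlers are hypothetical stand-ins, not cacoa's code.

```r
# Hypothetical dispatcher: match.arg() accepts three values,
# but only two of them have an implementation behind them.
getDistances <- function(space = c("expression.shifts", "coda", "pseudo.bulk")) {
  space <- match.arg(space)  # rejects anything outside the declared choices
  handlers <- list(
    expression.shifts = function() "expression shift distances",
    coda = function() "compositional distances"
  )
  if (is.null(handlers[[space]])) stop("Not implemented space: ", space, "!")
  handlers[[space]]()
}
```

Keeping the declared choices and the implemented handlers in one place (for example, deriving the choices from `names(handlers)`) would prevent the two from drifting apart.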

estimateMetadataSeparation() gave an error - "object 'n.permutations' not found"

Dear Cacoa team,

Previously, I encountered an error with the plotOntologyHeatmapCollapsed function, which was resolved by installing the dev version of Cacoa (as mentioned in my earlier issue #45; thanks for the help!). Every other function in my pipeline works fine. However, when I tried to use

cao$estimateMetadataSeparation(sample.meta=cao$misc$sample_metadata),

it gave me an error like this:

I did not see this with the old Cacoa objects I created before re-installing the dev version. The failure of this function does not seem to affect other functions, but I wonder whether there is anything else I can do to fix this problem. Thank you so much for your help with this matter!

cao$plotOntology() filtering by "up" and "down" does not work

The function cao$plotOntology() always shows the same result no matter which of "down", "up" or "all" I pass to the genes parameter. I checked manually, and I think it plots all GO terms. In case it matters: I use the function on GSEA results.
Moreover, plotting the GSEA results as a barplot does not work for me.

Best,
Laura

Move checkPackageInstalled to sccore

@evanbiederstedt @pkharchenko @VPetukhov Would it make sense to move checkPackageInstalled to sccore? It's a nice function to have available. In an unrelated project, we could use it but I'm not going to add Cacoa to depends for just this function.

Notch error in plotExpressionShiftMagnitudes

I receive this error: "notch went outside hinges. Try setting notch=FALSE" for each group on the plot, i.e., many times when plotting with my internal test data.

Although the error indeed disappears with notch=F, we should consider whether something should be done about this warning, as it may be confusing.

"Cacoa" is not exported to environment

cao <- Cacoa$new(con, sample.groups=condition, n.cores=20)
Error: object 'Cacoa' not found
cao <- cacoa:::Cacoa$new(con, sample.groups=condition, n.cores=20)

Installation errors

Dear cacoa team,

I had issues installing your package: the installation simply stated that the R_HOME environment variable is ignored and then that there was no such path or directory.

The underlying cause is that from Rcpp 0.11.0 onwards Rcpp:::LdFlags() only returns an empty string, so the call has to be removed from the Makefile in order to install your package with newer versions.

https://cran.r-project.org/web/packages/RcppClassic/vignettes/RcppClassic-intro.pdf

estimateExpressionShiftMagnitudes uses all available cores

Independent of the number of n.cores, all available cores are used.

It should be this snippet causing the problem since running with dist="cor" only uses the allowed cores:
if (dist == 'JS') {
  tcm <- t(tcm / pmax(1, rowSums(tcm)))
  tcd <- pagoda2:::jsDist(tcm, ncores = 1) # Here
  dimnames(tcd) <- list(colnames(tcm), colnames(tcm))
}

As you can see, I've tried to force ncores=1 for the call (which sits inside an sccore:::plapply call with n.cores = n.cores), but this has no effect.

So the problem seems to lie within pagoda2 (P2).

plotSampleDistances with metadata gives error

cao$plotSampleDistances(space='expression.shifts', font.size=4, show.sample.size=T, method="UMAP", sample.colors=metadata$version)

If I run the following code, I get a black and white image:
cao$plotSampleDistances(space='expression.shifts', font.size=4, show.sample.size=T, method="UMAP", sample.colors=metadata$sequencing ,palette = rainbow(3))

Passing `data.object=NULL` makes passing `embedding` mandatory

This requires an explicit check: if data.object is NULL, extractEmbedding should not be called.
The data.object=NULL case also needs to be described in the README.
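
The check could look roughly like the sketch below. Here `extractEmbedding` is replaced by a hypothetical stand-in and `resolveEmbedding` is an illustrative name; the real constructor logic may differ.

```r
# Hypothetical stand-in for the internal extractor
extractEmbedding <- function(data.object) data.object$embedding

# Use an explicitly passed embedding if there is one; otherwise extract it
# from data.object, but only when data.object is not NULL.
resolveEmbedding <- function(data.object = NULL, embedding = NULL) {
  if (!is.null(embedding)) return(embedding)
  if (is.null(data.object)) return(NULL)  # nothing to extract from
  extractEmbedding(data.object)
}
```

With this order of checks, the embedding stays optional when data.object=NULL, and methods that need it can report its absence themselves.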

object 'cda' not found in getCdaLoadings

When running report.Rmd for the HC dataset, I get "error in evaluating the argument 'x' in selecting a method for function 'mean': object 'cda' not found" on this line. As far as I can see, it is possible that the cda object is never created within the loop, but I don't know what that implies.

@iganna , can you please check it?
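
An "object not found" error of this kind typically appears when a variable is assigned only in some iterations of a loop and then used unconditionally afterwards. The sketch below shows one defensive pattern; `meanLoadings` and `fitOne` are hypothetical names, and this is not the actual getCdaLoadings code.

```r
# Hypothetical illustration: collect per-iteration results explicitly and
# fail with a clear message if no iteration produced one.
meanLoadings <- function(n.iter, fitOne) {
  fits <- list()
  for (i in seq_len(n.iter)) {
    res <- tryCatch(fitOne(i), error = function(e) NULL)
    if (!is.null(res)) fits[[length(fits) + 1]] <- res
  }
  if (length(fits) == 0) stop("No iteration produced a result")
  # Element-wise average over the successful iterations
  Reduce(`+`, fits) / length(fits)
}
```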

Error calling `estimateExpressionShiftMagnitudes` function with Seurat object

Hi, I am trying to use cacoa with a Seurat object, following the vignettes.
The code below fails in the estimateExpressionShiftMagnitudes function with the following error message:

Error in FUN(X[[i]], ...) : 
  no slot of name "counts" for this object of class "Assay5"

cao <- Cacoa$new(so, sample.groups=sample.groups, cell.groups=cell.groups, sample.per.cell=sample.per.cell, 
                 ref.level="control", target.level="case")

cao$estimateCellLoadings()
cao$estimateExpressionShiftMagnitudes()

estimateMetadataSeparation gives error

cao$estimateMetadataSeparation(sample.meta = metadata)

gives the following error:

estimateClusterFreeDE returns only NaNs and 0s (seurat object)

Hi!
I'm looking into cluster-free DE detection. I have a Seurat object with a graph for the joint embedding added using FindNeighbors, as suggested in the installation tutorial.
However, as the output of estimateClusterFreeDE I get a matrix where z, z.adj and lfc are all either NaN or 0 (suggesting no DE), even for genes for which I know I should get a DE signal. The result is the same when I increase the number of neighbors k and when I use either the kNN or the sNN graph.
Do you have any idea why this is happening? I plan to try conos as well, but ideally I would like to have it working for Seurat.

Thanks!

Possible wrong contour for overlapping

Hi,

Thanks for developing cacoa.

While running estimateCellDensity and highlighting the contours of clusters 2 and 9 in the following plot, I get overlapping contours.

Code snippet:

cao.8$estimateCellDensity()
options(repr.plot.height=12, repr.plot.width=12, res=150 ) 
p0 <- cao.8$plotEmbedding(groups=conos.8$clusters$leiden$groups, alpha=alpha, size=size, title='annotation', plot.na=F, show.legend=F)
pl <- cao.8$plotCellDensity(add.points = TRUE,contours = c(2,9),show.grid=T)
p1 <- cao.8$plotEmbedding(groups=sample_type, alpha=alpha, size=size, title='diagnosis', mark.groups=F) +
    theme(legend.position=c(0.85, 0.15)) +
    guides(color=guide_legend(override.aes = list(size=3,alpha=0.8),title=''))
cowplot::plot_grid(plotlist = c(list(p0),list(p1),pl),  nrow = 2)

Upon close inspection, I see that there is indeed a large overlap between clusters 2 and 9.

Even so, shouldn't we have a larger contour for cluster 9?

estimateDiffCellDensity() drops negative values

After calculations, plotCellDensity() shows higher densities for some cells in ref.level. However,

> cao$test.results$cell.density$diff$wilcox$adj %>% min
[1] 0

So no negative values are included in plotDiffCellDensity().

Incorporate DE estimations

We have getPerCellTypeDEmat, but we don't add the results to the Cacoa object. DE results are needed for ontology calculations, so this should be incorporated.

Also, can getPerCellTypeDEmat be used for Seurat objects?

no dependency on psych package declared

$estimateCellLoadings() requires the psych package, which is not declared as a dependency. Please double-check whether it's needed.
Edit: the same goes for the 'coda.base' package.

Roxygen2 documentation

Hi there

It would be nice to get this on CRAN when the manuscript is ready. The big impediment to this is the roxygen2 documentation. It's better to do this now rather than later.

--- If a function is not exported, use the tag #' @keywords internal. It's useful to know which functions are exported and which are not.
--- If a function is exported, please document all fields. Please use the format #' @param parameterName type Explanation here (default=xxx). It's very useful to know the data type from the roxygen2 comments.
--- It will be useful for CRAN reviews if all exported functions have examples. That doesn't always happen, but let's try our best.
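
As an illustration of the conventions listed above, a documented exported function could look like the following sketch. The function `arePackagesInstalled` is a made-up example, not an existing cacoa or sccore function.

```r
#' Check whether packages are installed (illustrative example)
#'
#' @param pkgs character Names of the packages to check
#' @param stop.on.missing logical Whether to raise an error if any package is missing (default=FALSE)
#' @return logical Named vector with TRUE for each installed package
#' @examples
#' arePackagesInstalled(c("stats", "utils"))
#' @export
arePackagesInstalled <- function(pkgs, stop.on.missing = FALSE) {
  # requireNamespace() returns TRUE/FALSE without attaching the package
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  if (stop.on.missing && !all(installed)) {
    stop("Missing packages: ", paste(pkgs[!installed], collapse = ", "))
  }
  installed
}
```

An internal helper would carry the same kind of block, with `#' @keywords internal` in place of `#' @export`.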

With this, the other steps will be much easier, and we can quickly get this on CRAN. Thanks everyone

Store "groups" in the Cacoa object

The groups parameter is often the same for all analyses, so we can store it as a field in the Cacoa object and set the default value in functions to the stored one: groups=self$groups. In the future, it could also be nice to have a field with a metadata data.frame in ScanPy style.
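
A minimal sketch of the idea with R6 is shown below. The `Analysis` class and its `countCells` method are hypothetical and much smaller than the real Cacoa class; they only show that a method default can refer to a stored field.

```r
library(R6)

# Hypothetical miniature class, not the real Cacoa definition
Analysis <- R6Class("Analysis",
  public = list(
    groups = NULL,
    initialize = function(groups) {
      self$groups <- groups
    },
    # The default is evaluated lazily, so self$groups is looked up at call time
    countCells = function(groups = self$groups) {
      table(groups)
    }
  )
)

obj <- Analysis$new(c("a", "a", "b"))
obj$countCells()               # uses the stored groups
obj$countCells(c("x", "y"))    # an explicit argument still overrides them
```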

Encountered an error in if (maxP > -minP) { : missing value where TRUE/FALSE needed

Greetings, I had a problem as shown below:

cao$estimateOntology(type="GSEA", org.db=org.Mm.eg.db::org.Mm.eg.db, verbose=FALSE, n.cores=1)
Using stored GO environment. Use ignore.cache=TRUE if you want to re-estimate it. Set ignore.cache=FALSE to suppress this message.
Error in if (maxP > -minP) { : missing value where TRUE/FALSE needed

You have my thanks!
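
R raises "missing value where TRUE/FALSE needed" when an `if` condition evaluates to NA. One plausible source, stated here only as an assumption, is NA, NaN or infinite values among the gene statistics passed to the enrichment step. A pre-check along these lines could help confirm or rule that out; `dropNonFinite` is a hypothetical helper, not part of cacoa or fgsea.

```r
# Hypothetical pre-check: remove genes whose statistic is NA, NaN or +/-Inf
# and report how many were dropped.
dropNonFinite <- function(stats) {
  keep <- is.finite(stats)
  if (!all(keep)) {
    message("Dropping ", sum(!keep), " genes with NA/NaN/Inf statistics")
  }
  stats[keep]
}

# Example with a hypothetical named vector of per-gene statistics
gene.stats <- c(g1 = 1.5, g2 = NA, g3 = -Inf, g4 = -0.2)
dropNonFinite(gene.stats)
```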

estimateClusterFreeExpressionShifts() produces all NaN

I've tried to trace the error, but everything looks fine until estimateClusterFreeExpressionShiftsC() within estimateClusterFreeExpressionShifts(). I'm using v.0.4.0.

Data can be shared upon request. @evanbiederstedt, they are on Thor@KU, /data/neonatal/rasmus/con_d.rds and annotation vector is /data/neonatal/rasmus/annotation_final.rds.

I ran:

sample.groups <- con$samples %>% 
  names() %>% 
  setNames(strsplit(., "_") %>% sapply(`[[`, 1), .)

cao <- Cacoa$new(con, 
                 sample.groups = sample.groups, 
                 cell.groups = anno, 
                 ref.level = "Intact", 
                 target.level = "Injury",
                 n.cores = 100)

cao$estimateClusterFreeExpressionShifts()

I also tried first running cao$estimateClusterFreeDE(), and that also produced a lot of NaNs.
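
For reference, the sample.groups construction above can be written without pipes and checked before creating the Cacoa object. The sample names in this sketch are hypothetical and only assume the "<condition>_<id>" naming pattern implied by the strsplit call.

```r
# Hypothetical sample names following the "<condition>_<id>" pattern
sample.names <- c("Intact_1", "Intact_2", "Injury_1", "Injury_2")

# Take the part before the first underscore as the condition of each sample
conditions <- vapply(strsplit(sample.names, "_", fixed = TRUE), `[[`, character(1), 1)
sample.groups <- setNames(conditions, sample.names)

# Every value should be one of the two levels passed as ref.level / target.level
stopifnot(all(sample.groups %in% c("Intact", "Injury")))
table(sample.groups)
```

A check like this rules out a mislabeled or empty condition as the source of the NaNs before looking deeper into the estimation itself.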

sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: x86_64-redhat-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux 8.5 (Ootpa)

Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblas-r0.3.12.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] qs_0.25.4               cacoa_0.4.0             sccore_1.0.2            sparseMatrixStats_1.6.0 MatrixGenerics_1.6.0   
 [6] matrixStats_0.62.0      ggplot2_3.3.6           cowplot_1.1.1           conos_1.4.9             igraph_1.3.5           
[11] Matrix_1.5-1            magrittr_2.0.3          dplyr_1.0.10           

loaded via a namespace (and not attached):
  [1] utf8_1.2.2             R.utils_2.12.0         tidyselect_1.2.0       RSQLite_2.2.18         AnnotationDbi_1.56.2  
  [6] grid_4.1.2             BiocParallel_1.28.3    Rtsne_0.16             scatterpie_0.1.8       munsell_0.5.0         
 [11] codetools_0.2-18       miniUI_0.1.1.1         withr_2.5.0            colorspace_2.0-3       GOSemSim_2.20.0       
 [16] Biobase_2.54.0         filelock_1.0.2         knitr_1.40             rstudioapi_0.14        stats4_4.1.2          
 [21] DOSE_3.20.1            labeling_0.4.2         addinexamplesWV_0.2.0  urltools_1.7.3         GenomeInfoDbData_1.2.7
 [26] polyclip_1.10-0        bit64_4.0.5            farver_2.1.1           rprojroot_2.0.3        downloader_0.4        
 [31] vctrs_0.5.0            treeio_1.18.1          generics_0.1.3         xfun_0.34              BiocFileCache_2.2.1   
 [36] R6_2.5.1               doParallel_1.0.17      GenomeInfoDb_1.30.1    ggbeeswarm_0.6.0       clue_0.3-61           
 [41] graphlayouts_0.8.2     bitops_1.0-7           cachem_1.0.6           fgsea_1.20.0           gridGraphics_0.5-1    
 [46] assertthat_0.2.1       promises_1.2.0.1       scales_1.2.1           ggraph_2.1.0           enrichplot_1.14.2     
 [51] beeswarm_0.4.0         gtable_0.3.1           processx_3.8.0         drat_0.2.3             tidygraph_1.2.2       
 [56] rlang_1.0.6            GlobalOptions_0.1.2    splines_4.1.2          lazyeval_0.2.2         brew_1.0-8            
 [61] yaml_2.3.6             reshape2_1.4.4         httpuv_1.6.6           qvalue_2.26.0          clusterProfiler_4.2.2 
 [66] tools_4.1.2            ggplotify_0.1.0        ellipsis_0.3.2         RColorBrewer_1.1-3     BiocGenerics_0.40.0   
 [71] Rcpp_1.0.9             plyr_1.8.7             progress_1.2.2         zlibbioc_1.40.0        purrr_0.3.5           
 [76] RCurl_1.98-1.9         ps_1.7.2               prettyunits_1.1.1      dendsort_0.3.4         GetoptLong_1.0.5      
 [81] viridis_0.6.2          S4Vectors_0.32.4       ggrepel_0.9.1          cluster_2.1.4          data.table_1.14.2     
 [86] DO.db_2.9              circlize_0.4.15        triebeard_0.3.0        stringfish_0.15.7      xtable_1.8-4          
 [91] mime_0.12              hms_1.1.2              patchwork_1.1.2        evaluate_0.17          XML_3.99-0.11         
 [96] RMTstat_0.3.1          N2R_1.0.1              IRanges_2.28.0         gridExtra_2.3          shape_1.4.6           
[101] compiler_4.1.2         biomaRt_2.50.3         tibble_3.1.8           crayon_1.5.2           shadowtext_0.1.2      
[106] R.oo_1.25.0            htmltools_0.5.3        later_1.3.0            ggfun_0.0.7            mgcv_1.8-40           
[111] tidyr_1.2.1            aplot_0.1.8            RcppParallel_5.1.5     RApiSerialize_0.1.2    DBI_1.1.3             
[116] tweenr_2.0.2           formatR_1.12           dbplyr_2.2.1           pagoda2_1.0.10         ComplexHeatmap_2.10.0 
[121] MASS_7.3-58.1          rappdirs_0.3.3         cli_3.4.1              R.methodsS3_1.8.2      parallel_4.1.2        
[126] pkgconfig_2.0.3        xml2_1.3.3             foreach_1.5.2          ggtree_3.2.1           vipor_0.4.5           
[131] XVector_0.34.0         leidenAlg_1.0.5        yulab.utils_0.0.5      stringr_1.4.1          callr_3.7.2           
[136] digest_0.6.30          Biostrings_2.62.0      rmarkdown_2.17         fastmatch_1.1-3        tidytree_0.4.1        
[141] Rook_1.1-1             curl_4.3.3             shiny_1.7.2            rjson_0.2.21           lifecycle_1.0.3       
[146] nlme_3.1-160           jsonlite_1.8.3         viridisLite_0.4.1      fansi_1.0.3            pillar_1.8.1          
[151] lattice_0.20-45        KEGGREST_1.34.0        fastmap_1.1.0          httr_1.4.4             pkgbuild_1.3.1        
[156] GO.db_3.14.0           glue_1.6.2             remotes_2.4.2          png_0.1-7              iterators_1.0.14      
[161] bit_4.0.4              ggforce_0.4.1          stringi_1.7.8          blob_1.2.3             memoise_2.0.1         
[166] irlba_2.3.5.1          ape_5.6-2

Unused argument

I can't get this to work, can anybody come up with a solution?

Upon installation, I get this:
Note: possible error in 'extractSampleGroups(data.object, ': unused arguments (ref.level, target.level)

The idea is to set sample groups automatically by defining ref.level (e.g., CTRL) and target.level (e.g., epilepsy). However, since we only consider two groups, this may be redundant, and setting one argument might suffice.

Understanding resampling parameters for DEG analysis

Hi,
I want to calculate DE genes and used the following command:

cao$estimatePerCellTypeDE(max.cell.count=50, name='de',
resampling.method='bootstrap', max.resamplings=29, n.cores=50)

My cell types have quite different sizes, ranging from fewer than 100 to more than 3000 cells. Which settings do you recommend? I noticed that when I change max.cell.count to 100, I get much lower p.adj values and thus a higher number of significant DE genes. Moreover, I always get these warning messages (see image). Is it fine to ignore them? (Microglia is the smallest cluster, with only 55 cells.)
Should I rather calculate without resampling?

Thanks for your help!
