Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Here is a more detailed breakdown of the steps that I'm taking <div class="snippet

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Hi <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="

de_analysis save intermediate steps + return likelihood optimization requests,about stephenslab/fasttopics

Comments (19)

pcarbo commented on August 23, 2024

@MarcElosua I would have expected this to complete before 48 h. A couple questions: (1) Is mtx_c a sparse matrix? (2) Does your Slurm job request 64 cores?

from fasttopics.

MarcElosua commented on August 23, 2024

Hi @pcarbo such a quick response! Thanks!

mtx_c is indeed a sparse matrix

And yes, I am asking for 125 cores - not specifying nc = 125 so that there is sufficient memory room. See below the slurm job script specs used

SBATCH --error=./%x.slurm.%J.err
#SBATCH --output=./%x.slurm.%J.out
#SBATCH -J FA_tumor
#SBATCH -t 2-00:00:00
#SBATCH -N 1 # number of nodes
#SBATCH -n 1 # number of tasks
#SBATCH -c 125 # cpu per task
#SBATCH -q vlong
#SBATCH -p genD
#SBATCH --mem=900GB

Thanks a lot for any help you can provide!

from fasttopics.

pcarbo commented on August 23, 2024

How many topics do you have?

from fasttopics.

MarcElosua commented on August 23, 2024

Working with 20 topics, but have also observed >2days when working with 10 topics. If it serves as a reference it took ~8h to run fit_topic_model

from fasttopics.

pcarbo commented on August 23, 2024

Something is not right. For a point of comparison, I ran the following on a larger data set (with 94,655 cells), and it took 5 hours:

de <- de_analysis(fit,counts,pseudocount = 0.1,
                  control = list(ns = 1e4,nc = 20,nsplit = 1000))

You are using way more resources than I did (20 CPUs, 16 GB memory), and yet it is taking a lot longer.

For some reason, your MCMC simulations are running much more slowly than I would expect.

from fasttopics.

MarcElosua commented on August 23, 2024

Here is a more detailed breakdown of the steps that I'm taking

# ...Previously loaded the Seurat object

# Extract count matrix and remove rows that are all 0s
mtx_c <- sce@assays$Spatial@counts
mtx_c <- mtx_c[sparseMatrixStats::rowSums2(mtx_c) > 0, ]

# Look at the dimensions of the data 
dim(mtx_c)
# [1] 43683 23185
# 43683 genes
# 23185 spots

# Peak into the sparse matrix
mtx_c[1:5, 1:3]
# 5 x 3 sparse Matrix of class "dgCMatrix"
#                   csutdh_opk1qq_AAACAGAGCGACTCCT-1 csutdh_opk1qq_AAACCGGGTAGGTACC-1 csutdh_opk1qq_AAACCGTTCGTCCAGG-1
# GRCh38-AL627309.1                                .                                .                                .
# GRCh38-AP006222.2                                .                                .                                .
# GRCh38-LINC01409                                 .                                .                                .
# GRCh38-FAM87B                                    .                                .                                .
# GRCh38-LINC01128                                 1                                .                                .

# Run fit_topic_model to obtain the topics and weights
fit <- fit_topic_model(
    X = t(mtx_c), # It needs to be cells by genes as specified in the vignette - https://stephenslab.github.io/fastTopics/articles/single_cell_rnaseq_basic.html
    k = 20, # assess this better with Davide
    numiter.main = 100,
    numiter.refine = 200,
    verbose = "progressbar")

# Look into fit
dim(fit$Fn)
# [1] 43683    20 - genes x topics
fit$Fn[1:5, 1:5]
#                             k1    k2    k3           k4           k5
# GRCh38-AL627309.1 9.999999e-11 1e-10 1e-10 9.999999e-11 0.0000000001
# GRCh38-AP006222.2 9.999999e-11 1e-10 1e-10 9.999999e-11 0.0000000001
# GRCh38-LINC01409  9.999999e-11 1e-10 1e-10 9.999999e-11 0.0000000001
# GRCh38-FAM87B     9.999999e-11 1e-10 1e-10 9.999999e-11 0.0000000001
# GRCh38-LINC01128  9.999999e-11 1e-10 1e-10 9.999999e-11 0.0152297941

dim(fit$Fy)
# [1] 43683    20 - genes x topics
fit$Fy[1:5, 1:5]
#                             k1    k2    k3           k4           k5
# GRCh38-AL627309.1 9.999999e-11 1e-10 1e-10 9.999999e-11 0.0000000001
# GRCh38-AP006222.2 9.999999e-11 1e-10 1e-10 9.999999e-11 0.0000000001
# GRCh38-LINC01409  9.999999e-11 1e-10 1e-10 9.999999e-11 0.0000000001
# GRCh38-FAM87B     9.999999e-11 1e-10 1e-10 9.999999e-11 0.0000000001
# GRCh38-LINC01128  9.999999e-11 1e-10 1e-10 9.999999e-11 0.0152297941

dim(fit$F)
# [1] 43683    20 - genes x topics
fit$F[1:5, 1:5]
#                             k1    k2    k3           k4           k5
# GRCh38-AL627309.1 1.800379e-14 1.285928e-14 1.368928e-14 2.383761e-14 3.156191e-14
# GRCh38-AP006222.2 1.800379e-14 1.285928e-14 1.368928e-14 2.383761e-14 3.156191e-14
# GRCh38-LINC01409  1.800379e-14 1.285928e-14 1.368928e-14 2.383761e-14 3.156191e-14
# GRCh38-FAM87B     1.800379e-14 1.285928e-14 1.368928e-14 2.383761e-14 3.156191e-14
# GRCh38-LINC01128  1.800379e-14 1.285928e-14 1.368928e-14 2.383761e-14 4.806814e-06

dim(fit$Ln)
# [1] 23185    20 - spots x topics
fit$Ln[1:5, 1:5]
#                             k1    k2    k3           k4           k5
# csutdh_opk1qq_AAACAGAGCGACTCCT-1 0.0000000001 0.0110870651 0.003594051 1e-10 0.53617207
# csutdh_opk1qq_AAACCGGGTAGGTACC-1 0.0384935403 0.7008891265 0.945408278 1e-10 0.04829545
# csutdh_opk1qq_AAACCGTTCGTCCAGG-1 0.1056269710 2.7110490831 0.435122693 1e-10 0.05035761
# csutdh_opk1qq_AAACCTCATGAAGTTG-1 0.0000000001 1.3298386714 0.700745528 1e-10 0.07661366
# csutdh_opk1qq_AAACGAGACGGTTGAT-1 0.0000000001 0.0000000001 0.100134320 1e-10 0.52806386

dim(fit$Ly)
# [1] 23185    20 - spots x topics
fit$Ly[1:5, 1:5]
#                                            k1           k2          k3    k4         k5
# csutdh_opk1qq_AAACAGAGCGACTCCT-1 0.0000000001 0.0110870651 0.003594051 1e-10 0.53617207
# csutdh_opk1qq_AAACCGGGTAGGTACC-1 0.0384935403 0.7008891265 0.945408278 1e-10 0.04829545
# csutdh_opk1qq_AAACCGTTCGTCCAGG-1 0.1056269710 2.7110490831 0.435122693 1e-10 0.05035761
# csutdh_opk1qq_AAACCTCATGAAGTTG-1 0.0000000001 1.3298386714 0.700745528 1e-10 0.07661366
# csutdh_opk1qq_AAACGAGACGGTTGAT-1 0.0000000001 0.0000000001 0.100134320 1e-10 0.52806386

dim(fit$L)
# [1] 23185    20 - spots x topics
fit$L[1:5, 1:5]
#                                            k1           k2           k3          k4          k5
# csutdh_opk1qq_AAACAGAGCGACTCCT-1 7.488724e-11 1.162443e-02 0.003539774 5.655994e-11 0.229040671
# csutdh_opk1qq_AAACCGGGTAGGTACC-1 7.024839e-03 1.790792e-01 0.226908833 1.378319e-11 0.005027538
# csutdh_opk1qq_AAACCGTTCGTCCAGG-1 1.627984e-02 5.850058e-01 0.088200413 1.164063e-11 0.004427322
# csutdh_opk1qq_AAACCTCATGAAGTTG-1 1.779454e-11 3.313088e-01 0.163994959 1.343965e-11 0.007776667
# csutdh_opk1qq_AAACGAGACGGTTGAT-1 6.255644e-11 8.758291e-11 0.082383157 4.724688e-11 0.188433902

fit$loss
# [1] -116800253
fit$loss.fnly
# [1] -116800253

fit$iter
# [1] 300

fit$beta
# [1] 0.850992
fit$beta0
# [1] 0.850992
fit$betamax
# [1] 0.9419995

fit$progress %>% tail()
#      iter loglik      loglik.multinom     dev        res      delta.f      delta.l nonzeros.f nonzeros.l extrapolate      beta
# 295  295 -230460338      -230334518 242466982 0.07015410 2.842171e-14 8.881784e-16  0.3870316  0.6348286           1 0.5812390
# 296  296 -230460338      -230334518 242466982 0.03006745 6.340514e-04 8.424174e-06  0.3870316  0.6348286           1 0.6393629
# 297  297 -230460338      -230334518 242466982 0.06678204 8.516281e-04 1.256577e-05  0.3870304  0.6348286           1 0.7032992
# 298  298 -230460338      -230334518 242466982 0.12121473 1.277084e-03 1.739767e-05  0.3870304  0.6348286           1 0.7736291
# 299  299 -230460338      -230334518 242466982 0.21614196 2.247447e-03 2.617431e-05  0.3870304  0.6348264           1 0.8509920
# 300  300 -230460338      -230334518 242466982 0.03688693 1.608597e-04 2.337088e-06  0.3870304  0.6348264           0 0.8509920
#       betamax  timing
# 295 0.7749854 113.903
# 296 0.8137346 105.281
# 297 0.8544214 105.538
# 298 0.8971424 105.886
# 299 0.9419995 106.446
# 300 0.9419995 105.956

de <- de_analysis(
    fit = fit,
    X =  t(mtx_c),
    s = rowSums(t(mtx_c)),
    pseudocount = 0.1,
    control = list(ns = 1e4, nc = 64, nsplit = 64*10),
    verbose = TRUE)

packageVersion("fastTopics")
# [1] ‘0.6.142’

# I see in the description file you are on Version: 0.6-154 - I installed it from CRAN

Happy to share any additional code or subset of the data you think could help explain what is going on

from fasttopics.

MarcElosua commented on August 23, 2024

I forgot to mention in the previous post - I'm fastTopics on Visium data, not scRNAseq but that hasn't been an issue with any tool before

> mean(mtx_c > 0)
[1] 0.08682789

from fasttopics.

pcarbo commented on August 23, 2024

@MarcElosua Thanks for providing these details, unfortunately I don't see any obvious problem. As a practical suggestion, try doing a smaller test run and see how long that takes, e.g.,

de <- de_analysis(fit = fit,X =  t(mtx_c),control = list(ns = 1000, nc = 4),verbose = FALSE)

Also, I would split the analysis into two steps; fit the topic model first, then run a separate job with de_analysis.

from fasttopics.

pcarbo commented on August 23, 2024

One other suggestion is to install fastTopics from source (e.g., using remotes::install_github) on the cluster for better code optimization.

from fasttopics.

MarcElosua commented on August 23, 2024

Hi @pcarbo thank you so much for your suggestions,

I'm trying both of your approaches. For the first one - I set ns = 100 and it took ~20min on a 23k spot dataset. Moreover, I installed it from source as you suggested and it doesn't seem to do the trick, running for >24h now.

I ad encountered the computational efficiency discrepancy when compiling a package from source vs locally vs CRAN before and it makes total sense that it would play a part here as well!

Do you have any recommendations on any specific compiler module/version installed to optimize this? Anything else you can think of that would help?

Apologies for the extensive discussion in the issue and thank you so much for your help!!!

from fasttopics.

pcarbo commented on August 23, 2024

"I set ns = 100 and it took ~20min on a 23k spot dataset." What is your exact de_analysis call for this? And can you give details about the topic model—number of topics, and size of the counts matrix?

from fasttopics.

MarcElosua commented on August 23, 2024

Here is the exact code I ran

fit <- "{fa_sp}/{robj_dir}/fit_topic.rds" %>%
    glue::glue() %>%
    here::here() %>%
    readRDS(file = .)

dim(mtx_c)
# [1] 43683 23185 genes x spots
dim(fit$L)
# [1] 23185    20 spots x topics
dim(fit$F)
# [1] 43683    20 genes x topics

set.seed(1)
st <- Sys.time()
de <- de_analysis(
    fit = fit,
    X =  t(mtx_c),
    s = rowSums(t(mtx_c)),
    pseudocount = 0.1,
    control = list(ns = 1e2, nc = 10, nsplit = 100),
    verbose = TRUE)

Fitting 43683 Poisson models with k=20 using method="scd".
Using 10 RcppParallel threads.
Computing log-fold change statistics from 43683 Poisson models with k=20.
  |++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++| 100%
Stabilizing posterior log-fold change estimates using adaptive shrinkage.
Warning message:
In de_analysis(fit = fit, X = t(mtx_c), s = rowSums(t(mtx_c)), pseudocount = 0.1,  :
  One or more MCMC simulations yielded acceptance rates of zero; consider increasing the number of Monte Carlo samples (control$ns) or modifying the noise level of the random-walk proposal distribution (control$rw) to improve the acceptance rates

print(Sys.time() - st)
Time difference of 2.227252 hours

I reduced nc to 10 in this case to b2 able to run it locally and check the progress bar

Session Info

> sessionInfo()                                                                                                                                                                                            
R version 4.1.3 (2022-03-10)                                                                                                                                                                               
Platform: x86_64-conda-linux-gnu (64-bit)                                                                                                                                                                  
Running under: CentOS Linux 7 (Core)                                                                                                                                                                       
                                                                                                                                                                                                           
Matrix products: default                                                                                                                                                                                   
BLAS/LAPACK: /scratch/groups/singlecell/software/anaconda3/envs/spatial_r2/lib/libopenblasp-r0.3.20.so                                                                                                     
                                                                                                                                                                                                           
locale:                                                                                                                                                                                                    
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C                                                                                                                                                               
 [3] LC_TIME=es_ES.UTF-8        LC_COLLATE=en_US.UTF-8                                                                                                                                                     
 [5] LC_MONETARY=es_ES.UTF-8    LC_MESSAGES=en_US.UTF-8                                                                                                                                                    
 [7] LC_PAPER=es_ES.UTF-8       LC_NAME=C                                                                                                                                                                  
 [9] LC_ADDRESS=C               LC_TELEPHONE=C                                                                                                                                                             
[11] LC_MEASUREMENT=es_ES.UTF-8 LC_IDENTIFICATION=C                                                                                                                                                        
                                                                                                                                                                                                           
attached base packages:                                                                              
[1] stats     graphics  grDevices utils     datasets  methods   base                                 
                                                                                                     
other attached packages:                                                                             
 [1] cowplot_1.1.1        fastTopics_0.6-154   inflection_1.3.6                                      
 [4] sp_1.5-0             SeuratObject_4.1.0   Seurat_4.1.1                                          
 [7] Matrix_1.5-1         colorBlindness_0.1.9 scales_1.2.0                                          
[10] forcats_0.5.1        stringr_1.4.0        dplyr_1.0.9                                           
[13] purrr_0.3.4          readr_2.1.2          tidyr_1.2.0                                           
[16] tibble_3.1.8         ggplot2_3.3.6        tidyverse_1.3.2                                       
                                                                                                     
loaded via a namespace (and not attached):                                                           
  [1] readxl_1.4.0            backports_1.4.1         plyr_1.8.7                                     
  [4] igraph_1.3.4            lazyeval_0.2.2          splines_4.1.3                                  
  [7] listenv_0.8.0           scattermore_0.8         digest_0.6.29                                  
 [10] invgamma_1.1            htmltools_0.5.3         SQUAREM_2021.1                                    
 [13] fansi_1.0.3             magrittr_2.0.3          tensor_1.5             
 [16] googlesheets4_1.0.0     cluster_2.1.3           ROCR_1.0-11                                       
 [19] tzdb_0.3.0              globals_0.16.0          modelr_0.1.8                                      
 [22] RcppParallel_5.1.5      matrixStats_0.62.0      vroom_1.5.7            
 [25] MCMCpack_1.6-3          spatstat.sparse_2.1-1   prettyunits_1.1.1      
 [28] colorspace_2.0-3        rvest_1.0.2             ggrepel_0.9.1          
 [31] haven_2.5.0             crayon_1.5.1            jsonlite_1.8.0         
 [34] progressr_0.10.1        spatstat.data_2.2-0     survival_3.3-1         
 [37] zoo_1.8-10              glue_1.6.2              polyclip_1.10-0        
 [40] gtable_0.3.0            gargle_1.2.0            MatrixModels_0.5-1                                
 [43] leiden_0.4.2            future.apply_1.9.0      abind_1.4-5                                       
 [46] SparseM_1.81            DBI_1.1.3               spatstat.random_2.2-0  
 [49] miniUI_0.1.1.1          Rcpp_1.0.9              progress_1.2.2         
 [52] viridisLite_0.4.0       xtable_1.8-4            gridGraphics_0.5-1     
 [55] reticulate_1.25         spatstat.core_2.4-4     bit_4.0.4              
 [58] truncnorm_1.0-9         htmlwidgets_1.5.4       httr_1.4.3             
 [61] RColorBrewer_1.1-3      ellipsis_0.3.2          ica_1.0-3              
 [64] pkgconfig_2.0.3         uwot_0.1.11             dbplyr_2.2.1           
 [67] deldir_1.0-6            here_1.0.1              utf8_1.2.2             
 [70] tidyselect_1.1.2        rlang_1.0.4             reshape2_1.4.4         
 [73] later_1.2.0             munsell_0.5.0           cellranger_1.1.0       
 [76] tools_4.1.3             cli_3.3.0               generics_0.1.3         
 [79] broom_1.0.0             ggridges_0.5.3          fastmap_1.1.0          
 [82] goftest_1.2-3           mcmc_0.9-7              bit64_4.0.5            
 [85] fs_1.5.2                fitdistrplus_1.1-8      RANN_2.6.1             
 [88] sparseMatrixStats_1.6.0 pbapply_1.5-0           future_1.27.0          
 [91] nlme_3.1-158            mime_0.12               quantreg_5.94          
 [94] xml2_1.3.3              compiler_4.1.3          plotly_4.10.0          
 [97] png_0.1-7               spatstat.utils_2.3-1    reprex_2.0.1           
[100] stringi_1.7.8           rgeos_0.5-9             lattice_0.20-45        
[103] vctrs_0.4.1             pillar_1.8.0            lifecycle_1.0.1        
[106] spatstat.geom_2.4-0     lmtest_0.9-40           RcppAnnoy_0.0.19       
[109] data.table_1.14.2       irlba_2.3.5             httpuv_1.6.5           
[112] patchwork_1.1.1         R6_2.5.1                promises_1.2.0.1       
[115] KernSmooth_2.23-20      gridExtra_2.3           parallelly_1.32.1      
[118] codetools_0.2-18        MASS_7.3-58.1           assertthat_0.2.1       
[121] rprojroot_2.0.3         withr_2.5.0             sctransform_0.3.3      
[124] mgcv_1.8-40             parallel_4.1.3          hms_1.1.1              
[127] quadprog_1.5-8          grid_4.1.3              rpart_4.1.16           
[130] coda_0.19-4             MatrixGenerics_1.6.0    ashr_2.2-54            
[133] googledrive_2.0.0       Rtsne_0.16              mixsqp_0.3-48          
[136] shiny_1.7.2             lubridate_1.8.0

here is the OS system I am running it on

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

from fasttopics.

pcarbo commented on August 23, 2024

If you set nc = 1, does it take longer? I wonder if there is an issue with the multithreading.

from fasttopics.

MarcElosua commented on August 23, 2024

Running it as below the time increases to 4.16 hours as compared to the 2.22 hours required before.

de <- de_analysis(
    fit = fit,
    X =  t(mtx_c),
    s = rowSums(t(mtx_c)),
    pseudocount = 0.1,
    control = list(ns = 1e2, nc = 1),
    verbose = TRUE)

I'll track the slurm job and see if all the resources (CPUs and memory) are being used as a proxy of the parallelization, but still odd the time difference between your 5h and how much it takes in our cluster!

Do you think specifying a specific cpp compiler could help?
Thanks!

from fasttopics.

pcarbo commented on August 23, 2024

From a back-of-the envelope calculation, this timing is comparable to what I have gotten.
So I would expect if you set ns = 1000 and nc = 10, for example, then it would also take about 4 h.
If it takes longer, then there is something wrong with the multithreading.

from fasttopics.

MarcElosua commented on August 23, 2024

Getting back to this, I have run the previous example like this

# Read in topics
fit <- "{fa_sp}/{robj_dir}/fit_topic.rds" %>%
    glue::glue() %>%
    here::here() %>%
    readRDS(file = .)

# Run de_analysis
# Following this issue https://github.com/stephenslab/fastTopics/issues/37
set.seed(1)
st <- Sys.time()
de <- de_analysis(
    fit = fit,
    X =  t(mtx_c),
    s = rowSums(t(mtx_c)),
    pseudocount = 0.1,
    control = list(ns = 1e3, nc = 10, nsplit = 10*10),
    verbose = TRUE)

print("Total time for running de_analysis is:\n")
print(Sys.time() - st)

and it took 21.75308 hours so there must be an issue with the multithreading.

Moreover, it also seems that setting a higher number of nsplit = 10*10 also makes it last much longer. I am simultaneously running the same datasets but with control = list(ns = 1e3, nc = 64, nsplit = 64*10), expecting it to be much faster but it has been running for over a day now.

from fasttopics.

pcarbo commented on August 23, 2024

@MarcElosua That's really strange. I'm wondering if the tech staff supporting the cluster could provide some guidance here? In particular I think it would be important to monitor the CPU usage of the cluster nodes (e.g., using htop) and see whether they are being used as expected. Also, since nsplit can slow things down slightly, a smaller number (e.g., 10) is better (this is explained somewhere in the pbapply package).

from fasttopics.

pcarbo commented on August 23, 2024

@MarcElosua I think we figured out what is going on. See here.

from fasttopics.

MarcElosua commented on August 23, 2024

This is great! Thank you so much for looking into this. I had seen people setting it up as below

# https://stackoverflow.com/questions/54596694/openblas-issue-with-combat-function-of-the-r-bioconductor-sva-package-on-torqu
Sys.setenv(OPENBLAS_NUM_THREADS="1")

But I'll try the package you suggest in the documentation

RhpcBLASctl::blas_set_num_threads(1)

Thanks again for all your help!

from fasttopics.

de_analysis save intermediate steps + return likelihood optimization requests about fasttopics HOT 19 CLOSED

Comments (19)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent