
Is SCINA non-deterministic as an EM-framework? (scina issue, closed)

RegnerM2015 commented on September 16, 2024

Is SCINA non-deterministic as an EM-framework?


Comments (3)

jcao89757 commented on September 16, 2024

Hi Regner,

Thank you for letting me know! This is an excellent question.

If you take a look at lines 68-89 of EM_model.R, you will see that we initialize the model through fixed steps that depend only on your input, in particular the signature genes. No other step introduces random variables into the model, so the identical output you observed on each run is expected.

I think a quick workaround is to examine the probability matrix (results$probabilities), which is also the approach I used in my SCINA paper. The rows of the matrix are cell types, and each column records the probabilities of one cell being assigned to each of your presumed cell types. You may save results$probabilities from the last ~10 iterations (or any number you like) and summarize how the probability vectors vary across those iterations, either for each cell or for the entire cell pool.

Hope this information helps! Please let me know if anything I said is unclear. Wish you all the best!

Best regards, Ze
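
For readers who want to try the summary suggested above, here is a minimal sketch in base R. It assumes the matrices have already been collected into a list; `prob_list` and the toy values in it are hypothetical stand-ins, not SCINA output, and how the matrices get saved is left open.

```r
# Toy stand-in for probability matrices saved from 10 iterations.
# Rows are cell types and columns are cells, mirroring results$probabilities.
set.seed(1)  # only fixes the toy data
n_types <- 3
n_cells <- 5
n_saved <- 10
prob_list <- lapply(seq_len(n_saved), function(i) {
  m <- matrix(runif(n_types * n_cells), nrow = n_types,
              dimnames = list(paste0("type", seq_len(n_types)),
                              paste0("cell", seq_len(n_cells))))
  sweep(m, 2, colSums(m), "/")  # rescale so each column sums to 1
})

# Stack into a 3-D array: cell type x cell x saved matrix
prob_arr <- simplify2array(prob_list)

# Standard deviation of every (cell type, cell) probability across saved matrices
sd_by_entry <- apply(prob_arr, c(1, 2), sd)

# Per-cell summary: the largest spread over cell types
max_sd_per_cell <- apply(sd_by_entry, 2, max)

# Whole-pool summary: the largest range seen for any single entry
overall_max_range <- max(apply(prob_arr, c(1, 2), function(x) diff(range(x))))

print(max_sd_per_cell)
print(overall_max_range)
```

With a deterministic method run repeatedly on the same input, both summaries would be expected to be exactly zero.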


RegnerM2015 commented on September 16, 2024

Hi @jcao89757,

If I understand correctly, this means that SCINA will produce 100% stable results across 10 iterations? I may have confused myself thinking SCINA was similar to another tool called Cell Assign (https://github.com/Irrationone/cellassign). Could you point out the differences between SCINA and Cell Assign?

I examined the probability matrices from 10 runs of SCINA on the pbmc_small dataset. In this small example, all 80 cells receive the same probabilities across the 10 iterations (please see the script below and the attached T cell signature). Thanks for your help!

library(SCINA)
library(Seurat)  # loads the pbmc_small example object

signatures <- as.list(read.delim("./T_cell_signature.txt"))
pbmc <- pbmc_small

desired_length <- 10 # or whatever length you want
prob.fill <- vector(mode = "list", length = desired_length)

# Run SCINA repeatedly on the same input and store each probability matrix
for (i in seq_len(desired_length)) {
  results <- SCINA(as.matrix(pbmc@assays$RNA@data), signatures, max_iter = 100, convergence_n = 10,
                   convergence_rate = 0.999, sensitivity_cutoff = 0.9, rm_overlap = TRUE, allow_unknown = TRUE, log_file = 'SCINA.log')
  prob.fill[[i]] <- results$probabilities
}

# Inspect the first two cells (columns 1:2) in each of the 10 runs
for (i in 1:10) print(head(prob.fill[[i]][, 1:2]))

# Inspect the last two cells (columns 79:80) in each of the 10 runs
for (i in 1:10) print(head(prob.fill[[i]][, 79:80]))

# Probabilities for cell 1 in runs 1, 10, and 5
cell1.run1 <- prob.fill[[1]][, 1]
cell1.run10 <- prob.fill[[10]][, 1]
cell1.run5 <- prob.fill[[5]][, 1]

print(cell1.run1); print(cell1.run10); print(cell1.run5)

T_cell_signature.txt
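
Comparing the runs by eye with head() only covers a few entries. A minimal sketch of a programmatic check over every entry follows; here `prob.fill` is filled with a hypothetical toy matrix (made-up row and column names) so the snippet runs on its own, and in practice it would be the list built by the script above.

```r
# Toy stand-in for prob.fill: ten copies of one matrix, which is what
# a deterministic method would return for identical input.
base_mat <- matrix(c(0.9, 0.1, 0.2, 0.8), nrow = 2,
                   dimnames = list(c("type_A", "type_B"), c("cell1", "cell2")))
prob.fill <- replicate(10, base_mat, simplify = FALSE)

# TRUE only if every run matches run 1 exactly, entry by entry
all_identical <- all(vapply(prob.fill[-1], identical, logical(1), prob.fill[[1]]))

# Largest absolute difference between any run and run 1
max_abs_diff <- max(vapply(prob.fill[-1],
                           function(m) max(abs(m - prob.fill[[1]])),
                           numeric(1)))

print(all_identical)
print(max_abs_diff)
```

identical() is strict about floating-point values; if tiny numerical differences should be tolerated, all.equal() with a tolerance is a common alternative.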


jcao89757 commented on September 16, 2024

Hi Regner,

You are right that SCINA results are 100% stable given the same exp_data and signatures. The CellAssign paper introduced an algorithm that takes input similar to SCINA's and rests on a very similar underlying assumption. However, the two methods have a few differences in the expectation likelihood calculation. The CellAssign model is initialized from randomly drawn priors, so it produces different results when run multiple times.

Thanks for sending the code! I see that your code investigates robustness across runs, not convergence across iterations. As I explained, our algorithm is deterministic, so the probability matrices are also stable given the same exp_data and signatures. To see variation between runs, you would need to slightly modify the initialization part of EM_model.R so that it starts from different initial values. Please let me know if you have further questions about modifying the code. Thanks!

Best regards, Ze
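
To illustrate the point about initialization in general terms, here is a toy sketch of EM for a two-component, one-dimensional Gaussian mixture. It is a generic textbook-style example with made-up names (`em_two_gauss`, `mu_init`) and a fixed standard deviation of 1; it is not the model used by SCINA or CellAssign. It only shows that an EM run is repeatable when its starting values are computed from the data, and that randomness enters once the starting values are drawn at random.

```r
# Minimal EM for a two-component 1-D Gaussian mixture (sd fixed at 1).
em_two_gauss <- function(x, mu_init, n_iter = 50) {
  mu <- mu_init
  pi1 <- 0.5
  for (it in seq_len(n_iter)) {
    # E-step: responsibility of component 1 for each data point
    d1 <- pi1 * dnorm(x, mu[1], 1)
    d2 <- (1 - pi1) * dnorm(x, mu[2], 1)
    r1 <- d1 / (d1 + d2)
    # M-step: update the two means and the mixing weight
    mu <- c(sum(r1 * x) / sum(r1), sum((1 - r1) * x) / sum(1 - r1))
    pi1 <- mean(r1)
  }
  list(mu = mu, resp = r1)
}

set.seed(42)  # only fixes the toy data
x <- c(rnorm(50, mean = 0), rnorm(50, mean = 4))

# Data-driven start: the same input always gives the same starting values
init_fixed <- unname(quantile(x, c(0.25, 0.75)))
fit_a <- em_two_gauss(x, mu_init = init_fixed)
fit_b <- em_two_gauss(x, mu_init = init_fixed)
same_fixed <- identical(fit_a$resp, fit_b$resp)

# Random starts: each run begins somewhere different, so outputs may differ
fit_r1 <- em_two_gauss(x, mu_init = rnorm(2, mean = 2, sd = 3))
fit_r2 <- em_two_gauss(x, mu_init = rnorm(2, mean = 2, sd = 3))
same_random <- identical(fit_r1$resp, fit_r2$resp)

print(same_fixed)
print(same_random)
```

With random starts the component labels can also swap between runs, which is one reason run-to-run comparisons of such models usually need some form of label alignment.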

