grosssbm / sbm

A package to sample and estimate variants of the stochastic blockmodel from network data

Home Page: https://grosssbm.github.io/sbm/

License: GNU General Public License v3.0

R 97.01% C++ 0.75% TeX 2.24%

stochastic-block-model sbm network-analysis

sbm's Introduction

sbm

The goal of the sbm package is to bring together, in a single framework, tools for estimating and manipulating variants of the stochastic blockmodel.

Installation

The latest stable version is available on CRAN with:

install.packages("sbm")

You can install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("GrossSBM/sbm")

sbm's Issues

Bug: tested numbers of blocks in estimateSimpleSBM

Hello =),

I am facing two problems with the function estimateSimpleSBM when specifying the range of block numbers to test.

Normally, the minimum and maximum bounds should be supplied through nbBlocksRange (a vector with two elements) in the named list estimOptions (an argument of estimateSimpleSBM).

The two problems are:

nbBlocksRange[1] does not define the lower bound. In my tests, the lower bound is always 1, and I found no way to change it.
The upper bound is not always nbBlocksRange[2]. For instance, it is nbBlocksRange[1] when explorFactor<=nbBlocksRange[1]<nbBlocksRange[2] ... O_o

Both problems appear in the example below:

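library(sbm)
nNodes <- 50
# Draw one 0/1 entry per cell; runif(nNodes) alone would be recycled across columns
adj <- matrix(round(runif(nNodes^2, 0, 1)), nNodes, nNodes)
# Request between 10 and 20 blocks
SBM <- estimateSimpleSBM(adj, directed = TRUE, estimOptions = list(nbBlocksRange = c(10, 20), explorFactor = 1.5))
# Numbers of blocks that were actually explored
print(SBM$storedModels$nbBlocks)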

It would be great to be able to set the lower bound (first problem), because the computation time currently increases considerably beyond about 20 blocks.

PS: Sorry, I wanted to check exactly what happens, but I don't know how to access the relevant function, which I guess is the sub-method estimate of the method optimize for the class SimpleSBM_fit.

Best Regards,
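
Until the lower bound is honoured, one possible workaround is to restrict the selection after the fit. The sketch below uses base R only and a made-up data frame standing in for SBM$storedModels. Only the nbBlocks column appears in the report above; the ICL column, the values and the range c(3, 5) are assumptions for illustration.

```r
# Hypothetical stand-in for SBM$storedModels; the values are made up.
storedModels <- data.frame(
  nbBlocks = 1:6,
  ICL      = c(-520, -480, -455, -450, -462, -470)
)

# Requested range, in the spirit of nbBlocksRange (illustrative values).
nbBlocksRange <- c(3, 5)

# Keep only the models whose number of blocks lies inside the range.
inRange <- storedModels$nbBlocks >= nbBlocksRange[1] &
  storedModels$nbBlocks <= nbBlocksRange[2]
candidates <- storedModels[inRange, ]

# Among those, pick the model with the highest ICL.
best <- candidates[which.max(candidates$ICL), ]
print(best$nbBlocks)
```

This only filters the models afterwards. It does not avoid the cost of fitting the models outside the range, which is the main concern raised in the report.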

Add class for sampling from Simple and Bipartite SBM

Complete vignettes for Simple and Bipartite SBM with the fungus-tree data set

Everything is in the title. Do this in the dev branch.

Reformat details about estimOptions

For instance, write \item{\code{nb_cores} } 👍 instead of \item{"nbCores"} 👎

Add tests for SBM, SimpleSBM and BipartiteSBM

Add support for missing values in SBM

The possible solutions are:

use the implementation of SimpleSBM in MultipartiteSBM from GREMLIN (which handles missing values)
extend missSBM to various emission laws
modify the C++ code in blockmodels to handle missing values

SBM on correlation based network

Hello!
First of all, thank you for this really interesting package. I am a beginner with SBMs and would like to ask for suggestions on the following topic. I am interested in finding latent groups in networks built from a correlation matrix, where the adjacency matrix is obtained by thresholding the absolute values of the correlations; each link therefore carries a weight that can vary from -1 to 1.
I would like to ask whether it is possible to adapt the functions in the package to this type of correlation network. My main questions are:

What is the "best input" I can give to the SBM estimation functions? Should I, for example, consider only significant positive links and binarize them into 0/1?
Is it possible to keep the information carried by negative links as well? I would like vertices connected by negatively weighted links to be assigned to different groups (a sort of repulsion between them). Can I introduce this information through the covariates argument?

Thanks in advance and have a nice day.
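
For readers preparing this kind of input, here is a minimal base R sketch of the thresholding step described in the question. The data, the threshold value and all variable names are made up for illustration. It does not settle which input is "best" or whether a given emission law suits signed weights.

```r
set.seed(1)

# Made-up data: 100 observations of 6 variables.
X <- matrix(rnorm(600), nrow = 100, ncol = 6)
R <- cor(X)

threshold <- 0.1  # arbitrary cut-off chosen for illustration

# Binary adjacency: 1 when |correlation| exceeds the threshold.
adjBinary <- (abs(R) > threshold) * 1
diag(adjBinary) <- 0

# Sign of each retained link, kept separately: -1, 0 or 1.
adjSign <- sign(R) * adjBinary

# Weighted alternative: keep the correlation itself on retained links.
adjWeighted <- R * adjBinary
```

Keeping the sign matrix separate from the binary adjacency leaves both options open: fitting on the binary network alone, or supplying the signed information alongside it.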

Bug: Problem with the predict method for bipartiteSBM with covariates

The predict method for bipartiteSBM with covariates and the 'bernoulli' distribution is highly biased. I believe there is a problem with the computation, because its result differs from the one given by the following blockmodels code (with the same returned model):

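# Accumulate the covariate effect, starting from zero
B <- 0
for(k in seq_along(covlbm)) {
  B <- B + sbm_cov$model_parameters[[4]]$beta[k] * covlbm[[k]]
}
# Inverse logit of the block effect plus the covariate effect
1/(1+exp(-sbm_cov$memberships[[4]]$Z1 %*%
               sbm_cov$model_parameters[[4]]$m %*%
               t(sbm_cov$memberships[[4]]$Z2)-B))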

If I sum the above expression over a network with 129 edges, I get about 125, whereas the predictions returned by the predict method of sbm sum to 65.
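
The computation quoted above can be checked on a toy example. The sketch below is self-contained base R with made-up memberships, parameters and a single covariate. The names Z1, Z2, m and beta mirror the snippet, but every value is hypothetical and nothing here calls sbm or blockmodels.

```r
# Hypothetical one-hot memberships: 4 row nodes and 3 column nodes, 2 blocks each.
Z1 <- rbind(c(1, 0), c(1, 0), c(0, 1), c(0, 1))
Z2 <- rbind(c(1, 0), c(0, 1), c(0, 1))

# Hypothetical block-level parameters on the logit scale.
m <- matrix(c( 1.0, -2.0,
              -1.5,  0.5), nrow = 2, byrow = TRUE)

# One made-up covariate matrix (4 x 3) and its regression coefficient.
covariates <- list(matrix(seq(-1, 1, length.out = 12), nrow = 4))
beta <- 0.8

# Accumulate the covariate effect, starting from zero.
B <- matrix(0, nrow = nrow(Z1), ncol = nrow(Z2))
for (k in seq_along(covariates)) {
  B <- B + beta[k] * covariates[[k]]
}

# Linear predictor, then inverse logit; plogis(x) equals 1 / (1 + exp(-x)).
eta <- Z1 %*% m %*% t(Z2) + B
predicted <- plogis(eta)

# Expected number of edges under these parameters.
sum(predicted)
```

Comparing this sum with the observed number of edges is the same sanity check used in the report.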

Bug: incorrect number of blocks in "membership" list for "estimateMultiplexSBM" method

The estimateMultiplexSBM method fits the model with different values of K (the number of blocks) and selects the value of K that produces the highest ICL. However, I found that the number of blocks in the memberships and indMemberships lists was not equal to the optimal K.

Here is an example from when I ran the code. The estimateMultiplexSBM method fit the model for each K in {1, 2, 3, 4} and found that the highest ICL value was when K=4. However, the indMemberships list looked something like this:

Note: these are only the first 9 nodes.

As you can see, there are only 3 blocks present when there should be 4.

I assume this is because the memberships and indMemberships lists are not updated to store the memberships of the model with the highest ICL. They may be storing the memberships of the model with the second-highest ICL (which is probably the second-to-last model to be fit).

Is there a way to patch this quickly?
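
One way to make the mismatch explicit is to compare the number of blocks that actually contain nodes with the selected K. The sketch below is base R on made-up data. The indicator-matrix layout (nodes in rows, blocks in columns) is an assumption, since the report does not show the exact structure of indMemberships.

```r
# Hypothetical hard assignment of 9 nodes; block 4 is never used.
memberships <- c(1, 2, 2, 3, 1, 3, 2, 1, 3)
K <- 4  # number of blocks of the selected model (illustrative)

# Build the corresponding indicator matrix: 9 nodes by K blocks.
indMemberships <- matrix(0, nrow = length(memberships), ncol = K)
indMemberships[cbind(seq_along(memberships), memberships)] <- 1

# Number of blocks that contain at least one node.
nbNonEmpty <- sum(colSums(indMemberships) > 0)

# Hard assignment recovered from the indicator matrix.
recovered <- max.col(indMemberships, ties.method = "first")

print(c(selectedK = K, nonEmptyBlocks = nbNonEmpty))
```

If the indicator matrix has K columns but fewer non-empty ones, the selected model simply has an empty block. If it has fewer than K columns, the memberships probably come from a different model, as suspected in the report.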

Problem when defining an SBM object

The following code does not work properly:

A <- matrix(rbinom(100,1,.2),10,10)
type <- "simple"
mySBM <- SimpleSBM_fit$new(A, "poisson",directed=FALSE)
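
The report does not say what goes wrong. One thing that may be worth ruling out, purely as an assumption and not as a diagnosis, is that directed=FALSE is paired with a matrix that is not symmetric. A base R sketch of that check, with a hypothetical seed and a hypothetical symmetrized copy named Asym:

```r
set.seed(42)
A <- matrix(rbinom(100, 1, 0.2), 10, 10)

# A matrix drawn this way is generally not symmetric.
print(isSymmetric(A))

# One way to obtain a symmetric, loop-free adjacency matrix:
# copy the upper triangle onto the lower triangle, then clear the diagonal.
Asym <- A
Asym[lower.tri(Asym)] <- t(Asym)[lower.tri(Asym)]
diag(Asym) <- 0

print(isSymmetric(Asym))
```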