Thanks for the question. Although the model is obviously very simple, it turns out to have one property that makes it a bit nasty for BayesPy's built-in nodes. In short (and loosely speaking), the problem arises because you condition on the same variable q twice for R. This causes a coupling between variables within the same node, and BayesPy doesn't support that in general, so a simple implementation won't work. Anyway, I'll show you what the simple implementation would look like even though it fails, because it illustrates the idea:
%matplotlib notebook
import numpy as np
import bayespy as bp
N = 10  # number of nodes
K = 3   # number of groups

# Directed connection probabilities among groups
eta = bp.nodes.Beta((1, 1), plates=(K, K))

# Group assignment probabilities
pi = bp.nodes.Dirichlet(np.ones(K))

# Group assignments for nodes
q = bp.nodes.Categorical(pi, plates=(N,))

# Adjacency matrix
R = bp.nodes.Mixture(
    q[:, None],  # need to add an axis so we'll get an NxN matrix instead of N
    bp.nodes.Mixture,
    q[:, None],  # need to add an axis so the N-axis won't overlap the next K-axis of eta
    bp.nodes.Bernoulli,
    eta
)

# An alternative way of writing R using the new MultiMixture from the develop branch
R = bp.nodes.MultiMixture(
    (q[:, None], q),  # need to add an axis so we'll get an (N, N) matrix instead of (N,)
    bp.nodes.Bernoulli,
    eta
)
However, you'll get an error that the parents are not independent when constructing R. BayesPy doesn't support the same node appearing as a parent multiple times for another node, because in general this would break the expectation computations. If you had two separate groups (e.g., customers and shops) and you did a similar analysis on how the links between customers and shops are grouped, you wouldn't have this issue.
Ok, enough with the problem description; let's look at some possible solutions. There are basically two: either capture this coupling with some construction, or factorize explicitly with respect to those variables.
First, I'll describe the explicit factorization approach. The idea is to use a separate node for each element of q and R, and collect these nodes into lists/arrays. We need to handle the diagonal of R in a slightly special way, and the lists/arrays add some extra lines of code. Here's a quick implementation:
# Directed connection probabilities among groups
eta = bp.nodes.Beta([1, 1], plates=(K, K))
eta_diag = bp.nodes.Concatenate(*[eta[k, k, None] for k in range(K)])

# Group assignment probabilities
pi = bp.nodes.Dirichlet(np.ones(K))

# Group assignments for nodes
q = [bp.nodes.Categorical(pi, name="q_{0}".format(i))
     for i in range(N)]

# Adjacency matrix N x N
R = np.empty((N, N), dtype=object)
for i in range(N):
    for j in range(N):
        if i != j:
            R[i, j] = bp.nodes.Mixture(q[i], bp.nodes.Mixture, q[j], bp.nodes.Bernoulli, eta,
                                       name="R_{0},{1}".format(i, j))
        else:
            R[i, i] = bp.nodes.Mixture(q[i], bp.nodes.Bernoulli, eta_diag)

nodes = list(R.flatten()) + [pi, eta] + list(q)
Q = bp.inference.VB(*nodes)
# Observe R
data = np.random.rand(N, N) > 0.7
for i in range(N):
    for j in range(N):
        R[i, j].observe(data[i, j])

# Initialize group assignments randomly
for i in range(N):
    q[i].initialize_from_random()

# Run inference
Q.update(repeat=100)

# Plot results
bp.plot.pyplot.figure()
bp.plot.hinton(data)
bp.plot.pyplot.figure()
bp.plot.hinton(eta)
bp.plot.pyplot.figure()
bp.plot.hinton(bp.nodes.Concatenate(*[q[i][None] for i in range(N)]))
Note that this construction loses the dependencies among the group assignments of the nodes because of the factorization. Another issue with this implementation is that R and q are lists/arrays of separate nodes, so the computations won't be as efficient as if they were modelled with single nodes using array-based computations. That is, the elements of R and q cannot be updated simultaneously but must be updated in turn. Thus, this might not scale well to large problems, unfortunately.
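To see concretely what the factorization throws away, for a tiny network you can brute-force the exact joint posterior over all K**N assignments, with pi and eta fixed at point values. This is a hypothetical plain-numpy sanity check of my own, not part of BayesPy; self-loops use eta[k, k], as in the eta_diag construction above:

```python
import itertools
import numpy as np

def exact_posterior_z(R, pi, eta):
    """Exact posterior over joint group assignments z for a tiny network,
    with pi and eta fixed at point values (brute-force enumeration)."""
    N = R.shape[0]
    K = len(pi)
    assigns = list(itertools.product(range(K), repeat=N))
    logp = np.empty(len(assigns))
    for idx, z in enumerate(assigns):
        z = np.array(z)
        lp = np.log(pi)[z].sum()   # prior over this joint assignment
        E = eta[z][:, z]           # E[i, j] = eta[z_i, z_j]; diagonal = eta[k, k]
        lp += np.sum(np.where(R, np.log(E), np.log1p(-E)))
        logp[idx] = lp
    p = np.exp(logp - logp.max())  # normalize in a numerically safe way
    return assigns, p / p.sum()
```

For instance, with uniform pi and a label-symmetric eta, this exact posterior gives equal mass to an assignment and its label permutation; a fully factorized q(z_1)...q(z_N) cannot represent that coupling.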
Ah, and you'll also need the develop branch for this; I needed to make a few minor fixes. You can install the develop branch as:
pip install git+https://github.com/bayespy/bayespy.git@develop
Sorry for the inconvenience.
A better implementation would handle the couplings between the elements of q and the elements of R. A quite complex approach would be to write q as a single categorical variable over all joint assignments. Because q and R are discrete categorical/Bernoulli variables, they could both be included in a single node which handles the couplings and performs the computations using the junction tree algorithm. I haven't implemented it yet; there's an issue for it: https://github.com/bayespy/bayespy/issues/37. But I hope to find time to implement it soon. I've noticed that quite a few reported issues are related to discrete variables and could be solved just by using that new node. I'll try to implement the new feature in such a way that the construction you need would be easy (or at least possible) to implement.
I sketched what the syntax might look like with the upcoming CategoricalGraph node here, in #37 (comment):
# Directed connection probabilities among groups
eta = bp.nodes.Beta([1, 1], plates=(K, K))

# Group assignment probabilities
pi = bp.nodes.Dirichlet(np.ones(K))

X = CategoricalGraph({
    'q': dict(
        table=pi,
        plates=(N,)
    ),
    'R': dict(
        table=eta,
        given=['q', 'q'],
        parent_plates_mappings=[[0], [1]],
        plates=(N, N)  # this line is optional in this case
    )
})
X.observe('R', data)
If you have comments or questions on this, I'd be happy to hear them. And just to be clear: this is not yet implemented.
Thank you for such an amazing and detailed reply. The explicit factorization approach is quite interesting. In any case, I managed to implement the stochastic block model with manual Gibbs sampling; the BayesPy implementation was more of an experiment.
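For reference, a manual Gibbs sampler for this model can be quite compact in plain numpy. The sketch below is my own guess at what such an implementation might look like; the function name, the conjugate priors, and the choice to ignore self-loops are assumptions, not the implementation mentioned above:

```python
import numpy as np

def gibbs_sbm(R, K, n_iter=100, alpha=1.0, a=1.0, b=1.0, seed=0):
    """One possible Gibbs sampler for a directed SBM (sketch).

    R: (N, N) binary adjacency matrix. Self-loops are ignored for
    simplicity, unlike the eta_diag construction discussed above."""
    rng = np.random.default_rng(seed)
    R = np.asarray(R, dtype=int)
    N = R.shape[0]
    z = rng.integers(K, size=N)       # initial group assignments
    off = ~np.eye(N, dtype=bool)      # mask excluding the diagonal
    for _ in range(n_iter):
        # 1. eta | z, R: conjugate Beta update for each group pair (k, l)
        Z = np.eye(K)[z]              # (N, K) one-hot assignments
        ones = Z.T @ (R * off) @ Z    # observed edges from group k to group l
        tot = Z.T @ off @ Z           # possible edges from group k to group l
        eta = rng.beta(a + ones, b + (tot - ones))
        # 2. pi | z: conjugate Dirichlet update on group counts
        pi = rng.dirichlet(alpha + np.bincount(z, minlength=K))
        # 3. z_i | rest: categorical full conditional, one node at a time
        le, l1 = np.log(eta), np.log1p(-eta)
        for i in range(N):
            j = np.flatnonzero(off[i])    # all nodes except i
            zj = z[j]
            out, inc = R[i, j], R[j, i]   # edges i -> j and j -> i
            logp = (np.log(pi)
                    + le[:, zj] @ out + l1[:, zj] @ (1 - out)
                    + inc @ le[zj] + (1 - inc) @ l1[zj])
            p = np.exp(logp - logp.max())
            z[i] = rng.choice(K, p=p / p.sum())
    return z, eta, pi
```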
On the other hand, a mixed membership stochastic block model, in which the group assignments for nodes are separate (one for i -> j and another for j -> i), should be easier to implement with BayesPy.
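For illustration, the generative process of such a mixed membership model (in the spirit of Airoldi et al.'s MMSB) could be sketched in plain numpy as follows; the function name and prior choices are my own assumptions:

```python
import numpy as np

def sample_mmsb(N, K, alpha=0.5, a=1.0, b=1.0, seed=0):
    """Sample a network from a mixed membership SBM (generative sketch).

    Each node i has a membership distribution theta[i]; for every directed
    pair (i, j), the sender and the receiver each draw their own role."""
    rng = np.random.default_rng(seed)
    eta = rng.beta(a, b, size=(K, K))                  # group-pair link probabilities
    theta = rng.dirichlet(alpha * np.ones(K), size=N)  # per-node memberships
    # s[i, j]: role node i takes when linking to j
    # r[i, j]: role node j takes when receiving from i
    s = np.stack([rng.choice(K, size=N, p=theta[i]) for i in range(N)])
    r = np.stack([rng.choice(K, size=N, p=theta[j]) for j in range(N)], axis=1)
    R = (rng.random((N, N)) < eta[s, r]).astype(int)
    np.fill_diagonal(R, 0)                             # no self-loops
    return R, theta, eta
```

Because the sender role and the receiver role come from two separate categorical draws, no node conditions on the same parent twice, which is why this variant fits the current BayesPy nodes more naturally.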
Ok, glad to hear you got Gibbs sampling working. And yes, I think a mixed membership stochastic block model is easy to implement with the current set of nodes.