Comments (18)
Thanks for the suggestion! Indeed, this would be a helpful example. I'll try to do it on Monday and add it as an IPython notebook to the documentation. It should be quite trivial, but I don't have the time today. I just wanted to let you know that the question is noted and I'll get back to you.
from bayespy.
Thanks for the kind gesture! In fact, replicating some examples from Murphy's BNT should be useful! I am getting up to speed with the syntax and should be able to contribute many of these!
Aside: What is the roadmap for MCMC stuff?
Since there are already several MCMC packages (PyMC, OpenBUGS, Stan, ...), I don't have a very high priority for that. I'd like to get this VB stuff working well. After that, maybe expectation propagation (EP). And after that, maybe Gibbs sampling. But at least currently, I don't have plans for more general MCMC than Gibbs.
I had a little spare time, so I did this for fun. First, here is the example exactly as it is given on Wikipedia:
from bayespy.nodes import CategoricalMarkovChain, Categorical, Mixture
N = 100
# Initial state probabilities
a0 = [0.6, 0.4]
# State transition probabilities
A = [[0.7, 0.3],[0.4,0.6]]
# Markov chain
Z = CategoricalMarkovChain(a0, A, states=N)
# Emission probabilities
P = [[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]]
# Observed process
Y = Mixture(Z, Categorical, P)
# Generate dummy data
import numpy as np
data = np.random.randint(3, size=N)
# Observe data
Y.observe(data)
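As an aside, the dummy data above is drawn uniformly at random, so it does not actually follow the model. For a more realistic test, one could sample the data from the HMM itself by ancestral sampling. A minimal numpy sketch (not using BayesPy), assuming the same a0, A, and P as above:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
a0 = np.array([0.6, 0.4])                 # initial state probabilities
A = np.array([[0.7, 0.3], [0.4, 0.6]])    # state transition probabilities
P = np.array([[0.1, 0.4, 0.5],            # emission probabilities
              [0.6, 0.3, 0.1]])

# Ancestral sampling: draw the hidden chain first, then one emission per step
z = np.empty(N, dtype=int)
z[0] = rng.choice(2, p=a0)
for n in range(1, N):
    z[n] = rng.choice(2, p=A[z[n - 1]])
data = np.array([rng.choice(3, p=P[zn]) for zn in z])
```

The resulting `data` array can then be passed to `Y.observe(data)` in place of the uniform dummy data.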
Then, if you want some of the parameters (initial state probabilities, state transition probabilities and/or emission probabilities) to be unknown, just place priors on them. For instance, for the state transition probabilities:
from bayespy.nodes import Dirichlet
A = Dirichlet([[1e-3, 1e-3], [1e-3, 1e-3]])
The priors you mentioned could be given as:
from bayespy.nodes import Dirichlet
a0 = Dirichlet([20,10])
A = Dirichlet([[8,2], [2,8]])
P = Dirichlet([[3,4,3], [10,6,4]])
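The numbers in these Dirichlet priors are pseudo-counts: they encode both the expected probabilities and how confident the prior is. A quick numpy check (not using BayesPy) of what, say, `Dirichlet([20, 10])` implies:

```python
import numpy as np

rng = np.random.default_rng(0)
# Dirichlet([20, 10]): mean (2/3, 1/3), fairly concentrated
samples = rng.dirichlet([20, 10], size=10000)
mean = samples.mean(axis=0)               # close to [0.667, 0.333]
# A weaker prior with the same mean spreads out much more
weak = rng.dirichlet([2, 1], size=10000)
tighter = samples.std(axis=0) < weak.std(axis=0)  # both entries True
```

So `Dirichlet([20, 10])` behaves like having already observed 20 starts in state 0 and 10 in state 1.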
I didn't test this much, so if you run into any problems, please let me know. You may also want to give names to the nodes.
This is brilliant! I didn't know that BayesPy allowed both Bayesian and non-Bayesian approaches (the first one being non-Bayesian).
I just added the following three lines to test out VB:
from bayespy.inference import VB
Q = VB(a0, A, P, Y)
Q.update(repeat=20)
However, I ran into errors while running VB in both cases:
For the first one (copied exactly from Wikipedia):
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-33-58d0b110b4ee> in <module>()
1 from bayespy.inference import VB
----> 2 Q = VB(a0, A, P, Y)
3 Q.update(repeat=20)
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/vmp.py in __init__(self, tol, autosave_iterations, autosave_filename, callback, *nodes)
44
45 # Remove duplicate nodes
---> 46 self.model = utils.utils.unique(nodes)
47
48 self._figures = {}
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/utils/utils.py in unique(l)
128 seen = set()
129 seen_add = seen.add
--> 130 return [ x for x in l if x not in seen and not seen_add(x)]
131
132 def tempfile(prefix='', suffix=''):
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/utils/utils.py in <listcomp>(.0)
128 seen = set()
129 seen_add = seen.add
--> 130 return [ x for x in l if x not in seen and not seen_add(x)]
131
132 def tempfile(prefix='', suffix=''):
TypeError: unhashable type: 'list'
For the second one:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-36-1ce487e03fd2> in <module>()
1 Q = VB(a0, A, P, Z, Y)
----> 2 Q.update(repeat=20)
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/vmp.py in update(self, repeat, plot, *nodes)
101 X = self[node]
102 if hasattr(X, 'update') and callable(X.update):
--> 103 X.update()
104 if plot:
105 self.plot(X)
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/stochastic.py in update(self)
175 if not np.all(self.observed):
176 u_parents = self._message_from_parents()
--> 177 m_children = self._message_from_children()
178 self._update_distribution_and_lowerbound(m_children, *u_parents)
179
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/node.py in _message_from_children(self)
514 #msg = [np.array(0.0) for i in range(len(self.dims))]
515 for (child,index) in self.children:
--> 516 m = child._message_to_parent(index)
517 for i in range(len(self.dims)):
518 if m[i] is not None:
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/node.py in _message_to_parent(self, index)
434
435 # Compute the message and mask
--> 436 (m, mask) = self._get_message_and_mask_to_parent(index)
437 mask = utils.squeeze(mask)
438
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/deterministic.py in _get_message_and_mask_to_parent(self, index)
68 def _get_message_and_mask_to_parent(self, index):
69 u_parents = self._message_from_parents(exclude=index)
---> 70 m_children = self._message_from_children()
71 return self._compute_message_and_mask_to_parent(index,
72 m_children,
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/node.py in _message_from_children(self)
514 #msg = [np.array(0.0) for i in range(len(self.dims))]
515 for (child,index) in self.children:
--> 516 m = child._message_to_parent(index)
517 for i in range(len(self.dims)):
518 if m[i] is not None:
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/node.py in _message_to_parent(self, index)
434
435 # Compute the message and mask
--> 436 (m, mask) = self._get_message_and_mask_to_parent(index)
437 mask = utils.squeeze(mask)
438
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/stochastic.py in _get_message_and_mask_to_parent(self, index)
114 index,
115 self.u,
--> 116 *u_parents)
117 mask = self._distribution.compute_mask_to_parent(index, self.mask)
118 ## m = self._compute_message_to_parent(self.parents[index], index, self.u, *u_parents)
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/mixture.py in compute_message_to_parent(self, parent, index, u, *u_parents)
58 # Reshape(g):
59 # Shape(g) = [Nn,..,N0,K]
---> 60 g = utils.moveaxis(g, self.cluster_plate, -1)
61
62 # Compute phi:
/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/utils/utils.py in moveaxis(A, axis_from, axis_to)
544 % (axis_from,
545 axis_to,
--> 546 np.shape(A)))
547
548 axes = np.arange(np.ndim(A))
ValueError: Can't move axis -1 to position -1. Axis index out of bounds for array with shape ()
I seriously hope that I am not doing anything really stupid!
I don't think you're doing anything wrong. I'll check those a bit later. In any case, the package should provide better feedback. The package isn't very mature yet and there are some bugs, so it is great to get feedback from other users when they hit weird problems while trying to do something relatively simple.
To the non-Bayes vs Bayes comment: it is Bayesian to use the exact values of the parameters if you know them. That means your uncertainty is zero, that is, the distribution is a delta distribution, so you use a fixed value. :) But if you are uncertain about the parameters, then the distributions should represent your knowledge.
I'll get back to you later today or more likely tomorrow.
To the non-bayes vs bayes comment: It is bayesian to use the exact values of the parameters if you know them. That means your uncertainty is zero = the distribution is a delta distribution = you use a fixed value. :) But if you are uncertain about the parameters, then the distributions should represent your knowledge.
Right! Once this is done, I would be keen (with your help, if you're willing) to do a simple comparison of this basic HMM using VB, using forward-backward (maybe via hmmlearn or yahmm), and maybe even a Gibbs-sampling-based version.
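For reference, the forward-backward comparison mentioned here fits in a few lines of numpy. This is a standard scaled forward-backward sketch for the Wikipedia parameters, independent of any library:

```python
import numpy as np

def forward_backward(obs, a0, A, P):
    """Exact smoothed state posteriors p(z_n | data) for a discrete HMM."""
    a0, A, P = map(np.asarray, (a0, A, P))
    N, K = len(obs), len(a0)
    alpha = np.empty((N, K))          # scaled forward messages
    c = np.empty(N)                   # per-step scaling factors
    alpha[0] = a0 * P[:, obs[0]]
    c[0] = alpha[0].sum()
    alpha[0] /= c[0]
    for n in range(1, N):
        alpha[n] = (alpha[n - 1] @ A) * P[:, obs[n]]
        c[n] = alpha[n].sum()
        alpha[n] /= c[n]
    beta = np.ones((N, K))            # scaled backward messages
    for n in range(N - 2, -1, -1):
        beta[n] = (A @ (P[:, obs[n + 1]] * beta[n + 1])) / c[n + 1]
    gamma = alpha * beta              # smoothed posteriors
    return gamma / gamma.sum(axis=1, keepdims=True)

a0 = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
P = [[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]]
obs = [0, 1, 2, 0, 0, 1]
gamma = forward_backward(obs, a0, A, P)
```

With all parameters fixed, VB on this model should agree with these exact posteriors, which makes it a handy sanity check.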
I haven't had success with PyMC yet though!
Aside: I wrote a simple Gibbs sampler for Bayes nets some time back. Do you envision that a Gibbs sampler for BayesPy would be much different?
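A single-site Gibbs sampler for the HMM's hidden states is indeed close to what a generic Bayes-net Gibbs sampler would do: resample each z_n from its full conditional given its Markov blanket. A hedged numpy sketch (all parameters are the Wikipedia ones, assumed known):

```python
import numpy as np

rng = np.random.default_rng(1)
a0 = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
P = np.array([[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]])
obs = np.array([0, 1, 2, 0, 0, 1])
N = len(obs)

z = rng.integers(0, 2, size=N)        # random initial state sequence

def gibbs_sweep(z):
    """One pass of single-site Gibbs: resample each z_n from its full conditional."""
    for n in range(N):
        # p(z_n | rest) is proportional to p(z_n | z_{n-1}) p(z_{n+1} | z_n) p(y_n | z_n)
        p = a0.copy() if n == 0 else A[z[n - 1]].copy()
        if n < N - 1:
            p = p * A[:, z[n + 1]]
        p = p * P[:, obs[n]]
        z[n] = rng.choice(2, p=p / p.sum())
    return z

for _ in range(100):
    z = gibbs_sweep(z)
```

Averaging the sampled z sequences over many sweeps approximates the same smoothed posteriors that forward-backward computes exactly.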
Eventually, I intend to use BayesPy for much more complex stuff like factorial HMMs. The package looks great! Kudos!
Ok, I think I got this solved.
In the first case, you were creating VB with arguments that were not nodes: the constants were lists, not nodes. I added a check for this, and it now raises a meaningful error. Also, you didn't include Z in the list, although that is the unknown variable you're interested in. And since there is only one unknown variable, one iteration gives the exact answer, so there is no need to repeat. Thus, the best way to do it is like this:
from bayespy.inference import VB
Q = VB(Y, Z)
Q.update()
Yep, building the VB object is a bit cumbersome. It should be made more automatic (for instance, give only the node Y and have it search for all connected nodes). It's on the TODO list. At the moment, I can only apologize for the inconvenience. :)
In the second case, there were a few bugs. I fixed those and added a few unit tests. Also, make sure Z is in the list, so do it like this:
from bayespy.inference import VB
Q = VB(Y, Z, A, a0, P)
Q.update(repeat=20)
You may need to do random initializations for some variables in some cases. But I'll write a more detailed example to the docs and discuss that too.
I wrote the following example:
http://www.bayespy.org/_notebooks/hmm_discrete.html
If you have comments or questions, please send a message here.
In addition, are you interested in some other particular examples I should add? The current list is available here:
http://www.bayespy.org/examples.html
But they are not ready yet; I am working on them currently.
This is amazing!
Would you mind if I edit the notebook a bit?
I think adding a few more basic examples on the following might be really helpful:
- The sprinkler/rain example from Wikipedia, to illustrate how evidence changes beliefs. This could easily be contrasted with an MCMC-based solution (using PyMC) as well (which is out of scope for that example).
- Bayesian linear regression
- Bayesian clustering/GMM
- DBN examples from Murphy's BNT
Personally, I would be interested in HMM variants: Gaussian HMM, factorial HMM, and the more complex ones where approximate inference is much more handy!
I previously wrote an additive FHMM using sklearn's HMM toolbox here
What do you think would be needed to create such a model?
Yep, you can edit the notebook and, for instance, make a pull request if you want.
(Linear) regression is a good idea for an example. I'll add that.
The additive FHMM is almost possible with the current nodes. Not exactly as you presented it, but a quite similar model is possible. The exact model may require implementing one general-purpose gating node that I've been thinking about for some time now. But I'll take a closer look.
Ok, the additive FHMM seems to be a bit more complex than it first appeared. If I understood correctly, the problem is that the states of the different chains become dependent because of the sum of the mu variables. That is, p(x|y,mu,...) does not factorize with respect to the different chains. This is quite intuitive: if you change some of the mu's selected by one chain, you probably need to change the mu's selected by the other chains too. I think there is no standard distribution for modelling a dependent set of categorical variables, and even if there were, the computational cost would be large in this case, something like O(length * states^chains). I guess a solution would be to force the factorization in the VB approximation:
q(x^(1)) * q(x^(2)) * ... * q(x^(N))
Then one could write a specific node for the additive FHMM which takes the mu's and the N chains as parents. The chains are then estimated in turn instead of jointly.
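The factorized-VB idea described here can be sketched in numpy: each chain is updated by forward-backward against the residual left after subtracting the other chains' expected contributions. Everything below (chain parameters, the simplified Gaussian likelihood that ignores the Var[mu] terms) is a hypothetical toy setup, not BayesPy code:

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, T, sigma = 2, 2, 50, 1.0                  # chains, states, length, noise std
mu = np.array([[0.0, 100.0], [0.0, 200.0]])     # per-chain, per-state means
A = np.array([[[0.9, 0.1], [0.2, 0.8]],         # per-chain transition matrices
              [[0.7, 0.3], [0.2, 0.8]]])
a0 = np.array([[0.5, 0.5], [0.2, 0.8]])

# Simulate data from the additive model: y_t = sum over chains of mu + noise
x = np.zeros((M, T), dtype=int)
for m in range(M):
    x[m, 0] = rng.choice(K, p=a0[m])
    for t in range(1, T):
        x[m, t] = rng.choice(K, p=A[m, x[m, t - 1]])
y = mu[np.arange(M)[:, None], x].sum(axis=0) + rng.normal(0, sigma, T)

def fb(loglik, a0, A):
    """Scaled forward-backward given per-step log-likelihoods of shape (T, K)."""
    lik = np.exp(loglik - loglik.max(axis=1, keepdims=True))
    T, K = lik.shape
    alpha = np.empty((T, K)); c = np.empty(T)
    alpha[0] = a0 * lik[0]; c[0] = alpha[0].sum(); alpha[0] /= c[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * lik[t]
        c[t] = alpha[t].sum(); alpha[t] /= c[t]
    beta = np.ones((T, K))
    for t in range(T - 2, -1, -1):
        beta[t] = (A @ (lik[t + 1] * beta[t + 1])) / c[t + 1]
    g = alpha * beta
    return g / g.sum(axis=1, keepdims=True)

# Mean-field iteration over the factorization q(x^(1)) * ... * q(x^(M))
q = np.full((M, T, K), 1.0 / K)                 # factorized state posteriors
for _ in range(20):
    for m in range(M):
        # Expected contribution of the *other* chains, subtracted from the data
        others = sum(q[mm] @ mu[mm] for mm in range(M) if mm != m)
        resid = y - others
        # Simplified Gaussian log-likelihood per state (ignores Var[mu] terms)
        loglik = -0.5 * (resid[:, None] - mu[m][None, :]) ** 2 / sigma ** 2
        q[m] = fb(loglik, a0[m], A[m])
```

Each inner update is an exact forward-backward for one chain, so the cost per iteration is O(M * T * K^2) instead of the exponential cost of the joint chain.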
I'm not sure that my analysis is correct. I took a quick look at your sklearn-based toolbox but couldn't find out how you estimate the chains. Of course, the parameters of the chains can be unknown, but their estimation is straightforward once one is able to estimate the distribution of the chains/states.
Do you have any insight?
This is quite intuitive: if you change some of the mu selected by one chain, you probably need to change the mu:s selected by other chains too.
You are right.
I think there is no standard distribution for modelling a dependent set of categorical variables and even if there was, the computational cost would be large in this case, something like O(length*states^chains). I guess, a solution would be to force the factorization in the VB approximation:
q(x^(1)) * q(x^(2)) * ... * q(x^(N))
Then writing a specific node for Additive FHMM which takes mu:s and N chains as parents. The chains are then estimated in turns instead of jointly.
Zoubin Ghahramani has an old paper on this (it is about FHMMs in general, not specifically the additive FHMM).
He discusses a few approximation schemes in that paper; I feel it should serve as a good reference.
I quickly looked at your sklearn's toolbox but didn't find out how you estimate the chains.
I had in fact approximated the additive FHMM as a single HMM. Let me explain with a small example.
Assume we have two electrical appliances: a fan and a TV. For simplicity, both have two states (off and on) and emit according to a normal distribution.
The parameters are as follows:
Pi_fan = [0.5, 0.5]
A_fan = [[0.9, 0.1], [0.2, 0.8]]
B_fan = [Normal(0, 1), Normal(100, 10)]
Pi_TV = [0.2, 0.8]
A_TV = [[0.7, 0.3], [0.2, 0.8]]
B_TV = [Normal(0, 1), Normal(200, 10)]
An additive FHMM over both of these, represented as a single HMM, would have 2*2 = 4 states and parameters as follows:
Pi_combined = [0.5*0.2, 0.5*0.8, 0.5*0.2, 0.5*0.8]
B_combined = [Normal(0+0, 1+1), Normal(0+200, 1+10), Normal(100+0, 1+1), Normal(100+200, 10+10)]
... and so on.
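The combination sketched above is exactly a Kronecker-product construction over the product state space, which a short numpy sketch (hypothetical names, not BayesPy) makes explicit. One caveat: for independent Gaussians it is the variances that add, not the standard deviations, so here the second Normal parameter is treated as a variance:

```python
import numpy as np

Pi_fan, Pi_TV = np.array([0.5, 0.5]), np.array([0.2, 0.8])
A_fan = np.array([[0.9, 0.1], [0.2, 0.8]])
A_TV = np.array([[0.7, 0.3], [0.2, 0.8]])
mu_fan, var_fan = np.array([0.0, 100.0]), np.array([1.0, 10.0])
mu_TV, var_TV = np.array([0.0, 200.0]), np.array([1.0, 10.0])

# The combined chain lives on the product state space (2*2 = 4 states);
# independence of the chains makes the combined parameters Kronecker products
Pi_combined = np.kron(Pi_fan, Pi_TV)     # [0.1, 0.4, 0.1, 0.4]
A_combined = np.kron(A_fan, A_TV)        # 4x4, rows sum to 1
# Means add; for independent Gaussians the *variances* add
mu_combined = np.add.outer(mu_fan, mu_TV).ravel()
var_combined = np.add.outer(var_fan, var_TV).ravel()
```

The state count grows as states^chains, which is why this construction stops scaling beyond a handful of appliances.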
This is NOT the best way to do such things, however.
What do you think?
@oliparson, @JackKelly would you have more information to add on variational Additive FHMMs?
I looked at Ghahramani's paper. So, exact inference is possible as you presented it, and somewhat more efficient exact inference is possible by exploiting the factorial structure. However, both of these might be too inefficient in practice, so the best practical solution is probably to factorize the chains and then use VB iteration.
However, I don't see the difference between Additive FHMM and the model in (4a)+(4b) in Ghahramani's paper. What is the difference?
I would say that the additive FHMM makes it explicit that Y is the sum of the x's at a given time. I think Ghahramani's paper is a little more general.
I haven't had time to think about this thoroughly... In any case, the current collection of nodes does not support implementing this model. I'm thinking about what the most modular and efficient way to implement it would be.
One solution is to write a node which takes a Gaussian array and N independent categorical chains as input and outputs the sum of the selected mu variables. It would be a deterministic node. This should at least work, and the implementation can be quite efficient because the node does the mixing and the summing at the same time, so it is possible to avoid huge intermediate arrays. It is also quite straightforward to implement, I think. The only downside is that the node is not very general purpose.
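The forward computation of such a mix-and-sum node can be illustrated in numpy. Under the mean-field factorization, the expected output is the sum over chains of the expectation-weighted mu's, and a single einsum avoids materializing the huge (T, K^M, D) intermediate array (all shapes and names below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, T, D = 3, 4, 10, 2          # chains, states, time steps, output dim
mu = rng.normal(size=(M, K, D))   # Gaussian array of per-chain, per-state means
p = rng.dirichlet(np.ones(K), size=(M, T))   # q(x_t^(m)) for each chain and step

# Forward pass of the proposed node: mix (select mu per chain, in expectation)
# and sum over chains in one contraction, with no exponential intermediate
out = np.einsum('mtk,mkd->td', p, mu)
```

The backward messages would follow the same pattern: the message to one chain depends only on that chain's residual, which is why the combined node can stay efficient.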
Another option would be to implement two nodes: one does the mixing, that is, selects a mu for each chain, and the other computes the sum of these. It is not obvious to me what the messages in these nodes would be. Also, there might be some overhead, because the first node mixes the mu's, which means the message has Gaussian dimensionality equal to the observation dimensions times the number of chains, and length equal to the number of observations. The nice property of this implementation is that the nodes might be a bit more general purpose, maybe.
So, I think the first suggestion is probably easier to implement and more efficient. However, I'm not very familiar with the model, so I may have missed or misunderstood something.
Hi,
Sorry for getting back to you this late. I'll be busy with conferences etc. for 3-4 more days. I should get back to you then.