
Comments (18)

jluttine commented on June 12, 2024

Thanks for the suggestion! Indeed, this would be a helpful example. I'll try to do it on Monday and add it as an IPython notebook to the documentation. It should be quite trivial, but I don't have the time today, so I just wanted to let you know that the question is noted and I'll get back to you.


nipunbatra commented on June 12, 2024

Thanks for the kind gesture! In fact, replicating some examples from Murphy's BNT should be useful! I am picking up the syntax and should be able to contribute many of these!

Aside: What is the roadmap for MCMC stuff?


jluttine commented on June 12, 2024

Since there are already several MCMC packages (PyMC, OpenBUGS, Stan, ...), I don't have a very high priority for that. I'd like to get this VB stuff working well. After that, maybe expectation propagation (EP). And after that, maybe Gibbs sampling. But at least currently, I don't have plans for more general MCMC than Gibbs.


jluttine commented on June 12, 2024

I had a little spare time, so I did this for fun. First, this is the example exactly as it is given on Wikipedia:

from bayespy.nodes import CategoricalMarkovChain, Categorical, Mixture
import numpy as np

N = 100
# Initial state probabilities
a0 = [0.6, 0.4]
# State transition probabilities
A = [[0.7, 0.3], [0.4, 0.6]]
# Markov chain
Z = CategoricalMarkovChain(a0, A, states=N)
# Emission probabilities
P = [[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]]
# Observed process
Y = Mixture(Z, Categorical, P)
# Generate dummy data (uniform random, not drawn from the model)
data = np.random.randint(3, size=N)
# Observe data
Y.observe(data)
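
As an aside, if you want the dummy data to actually come from the model rather than being uniform noise, you can simulate the chain and the emissions directly with numpy. This is just an illustrative sketch, not part of BayesPy:

import numpy as np
rng = np.random.RandomState(0)
a0 = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
P = [[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]]
N = 100
# Sample the hidden chain from the initial and transition probabilities
z = np.zeros(N, dtype=int)
z[0] = rng.choice(2, p=a0)
for n in range(1, N):
    z[n] = rng.choice(2, p=A[z[n-1]])
# Sample the observations from the emission probabilities of each state
data = np.array([rng.choice(3, p=P[zn]) for zn in z])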

Then, if you want some of the parameters (initial state probabilities, state transition probabilities, and/or emission probabilities) to be unknown, just put priors on them. For instance, for the state transition probabilities:

from bayespy.nodes import Dirichlet
A = Dirichlet([[1e-3, 1e-3], [1e-3, 1e-3]])

The priors you mentioned could be given as:

from bayespy.nodes import Dirichlet
a0 = Dirichlet([20, 10])
A = Dirichlet([[8, 2], [2, 8]])
P = Dirichlet([[3, 4, 3], [10, 6, 4]])

I didn't test this much, so if you run into any problems, please let me know. You may also want to give names to the nodes.
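
For instance, putting it all together with these priors and with names given to the nodes (an untested sketch; the name keyword argument is optional but makes error messages easier to read):

from bayespy.nodes import CategoricalMarkovChain, Categorical, Mixture, Dirichlet
N = 100
# Dirichlet priors on the initial state, transition and emission probabilities
a0 = Dirichlet([20, 10], name='a0')
A = Dirichlet([[8, 2], [2, 8]], name='A')
P = Dirichlet([[3, 4, 3], [10, 6, 4]], name='P')
# The hidden chain and the observed process, as in the first snippet
Z = CategoricalMarkovChain(a0, A, states=N, name='Z')
Y = Mixture(Z, Categorical, P, name='Y')
Y.observe(data)  # data generated as above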


nipunbatra commented on June 12, 2024

This is brilliant! I didn't know that BayesPy allowed both Bayesian and non-Bayesian approaches (the first one being non-Bayesian).
I just added the following three lines to test out VB:

from bayespy.inference import VB
Q = VB(a0, A, P, Y)
Q.update(repeat=20)

However, I ran into errors while running VB on both of these. For the first one (copied exactly from the Wikipedia example):

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-33-58d0b110b4ee> in <module>()
      1 from bayespy.inference import VB
----> 2 Q = VB(a0, A, P, Y)
      3 Q.update(repeat=20)

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/vmp.py in __init__(self, tol, autosave_iterations, autosave_filename, callback, *nodes)
     44 
     45         # Remove duplicate nodes
---> 46         self.model = utils.utils.unique(nodes)
     47 
     48         self._figures = {}

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/utils/utils.py in unique(l)
    128     seen = set()
    129     seen_add = seen.add
--> 130     return [ x for x in l if x not in seen and not seen_add(x)]
    131 
    132 def tempfile(prefix='', suffix=''):

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/utils/utils.py in <listcomp>(.0)
    128     seen = set()
    129     seen_add = seen.add
--> 130     return [ x for x in l if x not in seen and not seen_add(x)]
    131 
    132 def tempfile(prefix='', suffix=''):

TypeError: unhashable type: 'list'

For the second one:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-36-1ce487e03fd2> in <module>()
      1 Q = VB(a0, A, P, Z, Y)
----> 2 Q.update(repeat=20)

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/vmp.py in update(self, repeat, plot, *nodes)
    101                 X = self[node]
    102                 if hasattr(X, 'update') and callable(X.update):
--> 103                     X.update()
    104                     if plot:
    105                         self.plot(X)

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/stochastic.py in update(self)
    175         if not np.all(self.observed):
    176             u_parents = self._message_from_parents()
--> 177             m_children = self._message_from_children()
    178             self._update_distribution_and_lowerbound(m_children, *u_parents)
    179 

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/node.py in _message_from_children(self)
    514         #msg = [np.array(0.0) for i in range(len(self.dims))]
    515         for (child,index) in self.children:
--> 516             m = child._message_to_parent(index)
    517             for i in range(len(self.dims)):
    518                 if m[i] is not None:

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/node.py in _message_to_parent(self, index)
    434 
    435         # Compute the message and mask
--> 436         (m, mask) = self._get_message_and_mask_to_parent(index)
    437         mask = utils.squeeze(mask)
    438 

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/deterministic.py in _get_message_and_mask_to_parent(self, index)
     68     def _get_message_and_mask_to_parent(self, index):
     69         u_parents = self._message_from_parents(exclude=index)
---> 70         m_children = self._message_from_children()
     71         return self._compute_message_and_mask_to_parent(index,
     72                                                         m_children,

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/node.py in _message_from_children(self)
    514         #msg = [np.array(0.0) for i in range(len(self.dims))]
    515         for (child,index) in self.children:
--> 516             m = child._message_to_parent(index)
    517             for i in range(len(self.dims)):
    518                 if m[i] is not None:

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/node.py in _message_to_parent(self, index)
    434 
    435         # Compute the message and mask
--> 436         (m, mask) = self._get_message_and_mask_to_parent(index)
    437         mask = utils.squeeze(mask)
    438 

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/stochastic.py in _get_message_and_mask_to_parent(self, index)
    114                                                          index,
    115                                                          self.u,
--> 116                                                          *u_parents)
    117         mask = self._distribution.compute_mask_to_parent(index, self.mask)
    118         ## m = self._compute_message_to_parent(self.parents[index], index, self.u, *u_parents)

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/inference/vmp/nodes/mixture.py in compute_message_to_parent(self, parent, index, u, *u_parents)
     58             # Reshape(g):
     59             # Shape(g)      = [Nn,..,N0,K]
---> 60             g = utils.moveaxis(g, self.cluster_plate, -1)
     61 
     62             # Compute phi:

/home/nipun/miniconda/envs/py3k/lib/python3.3/site-packages/bayespy/utils/utils.py in moveaxis(A, axis_from, axis_to)
    544                          % (axis_from,
    545                             axis_to,
--> 546                             np.shape(A)))
    547 
    548     axes = np.arange(np.ndim(A))

ValueError: Can't move axis -1 to position -1. Axis index out of bounds for array with shape ()

I seriously hope that I am not doing anything really stupid!


jluttine commented on June 12, 2024

I don't think you're doing anything wrong. I'll check those a bit later. In any case, the package should provide better feedback. The package isn't very mature yet and there are some bugs, so it is great to get feedback from other users when they hit weird problems while trying to do something relatively simple.

To the non-Bayes vs Bayes comment: it is Bayesian to use the exact values of the parameters if you know them. That means your uncertainty is zero = the distribution is a delta distribution = you use a fixed value. :) But if you are uncertain about the parameters, then the distributions should represent your knowledge.

I'll get back to you later today or more likely tomorrow.


nipunbatra commented on June 12, 2024

> To the non-Bayes vs Bayes comment: it is Bayesian to use the exact values of the parameters if you know them. That means your uncertainty is zero = the distribution is a delta distribution = you use a fixed value. :) But if you are uncertain about the parameters, then the distributions should represent your knowledge.

Right! Once this is done, I would be keen (with your help, if you're interested) to do a simple comparison of this basic HMM using VB, using forward-backward (maybe via hmmlearn or yahmm), and maybe even a Gibbs-sampling-based approach.

I haven't had success with PyMC yet though!

Aside: I did create a simple Gibbs sampler for Bayes nets some time back. Do you envision that a Gibbs sampler for BayesPy would be much different?

Eventually, I intend to use BayesPy for much more complex stuff like Factorial HMMs. The package looks great! Kudos!


jluttine commented on June 12, 2024

Ok, I think I got this solved.

In the first case, you were creating the VB object with arguments that were not nodes: the constants were plain lists, not nodes. I added a check for this, so it now raises a meaningful error. Also, you didn't include Z in the list, although that is the unknown variable you're interested in. And since there is only one unknown variable, one iteration gives the exact answer, so there is no need to repeat. Thus, the best way to do it is like this:

from bayespy.inference import VB
Q = VB(Y, Z)
Q.update()

Yep, building the VB object is a bit cumbersome. It should be made more automatic (for instance, give only the node Y and let it search for all connected nodes). It's on the TODO list. At the moment, I can only apologize for the inconvenience. :)

In the second case, there were a few bugs. I fixed those and added a few unit tests. Also, you were missing Z in the list, so do it like this:

from bayespy.inference import VB
Q = VB(Y, Z, A, a0, P)
Q.update(repeat=20)

You may need to use random initializations for some variables in some cases. But I'll write a more detailed example for the docs and discuss that too.
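
For instance, a random initialization could look like this (a sketch, assuming the nodes support initialize_from_random(), which BayesPy's exponential-family nodes provide):

from bayespy.inference import VB
# Break the symmetry between the hidden states before iterating
A.initialize_from_random()
Q = VB(Y, Z, A, a0, P)
Q.update(repeat=20)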


jluttine commented on June 12, 2024

I wrote the following example:
http://www.bayespy.org/_notebooks/hmm_discrete.html
If you have comments or questions, please send a message here.

In addition, were you interested in some other particular examples I should add? The current list is available here:
http://www.bayespy.org/examples.html
But they are not ready yet. I am currently working on them.


nipunbatra commented on June 12, 2024

This is amazing!
Would you mind if I edit the notebook a bit?

I think adding a few more basic examples on the following might be really helpful:

  1. Sprinkler/rain example from Wikipedia to illustrate how evidence changes beliefs. This can easily be contrasted with an MCMC-based solution (using PyMC) as well (which is out of the scope of this example).
  2. Bayesian linear regression
  3. Bayesian clustering/GMM
  4. DBN examples from Murphy's BNT

Personally, I would be interested in HMM variants: Gaussian HMM, Factorial HMM, and the more complex ones where approximate inference is much handier!

I previously wrote an Additive FHMM using sklearn's HMM toolbox here.

This model looks like this: [figure omitted]

What do you think would be needed to create such a model?


jluttine commented on June 12, 2024

Yep, you can edit the notebook and, for instance, make a pull request if you want.

(Linear) regression is a good idea for an example. I'll add that.

The Additive FHMM is almost possible with the current nodes. Not exactly as you presented it, but a quite similar model is possible. The exact model may require implementing a general-purpose gating node that I've been thinking about for some time now. But I'll take a closer look.


jluttine commented on June 12, 2024

Ok, the Additive FHMM seems to be a bit more complex than it first appeared. If I understood correctly, the problem is that the states of the different chains become dependent because of the sum of the mu variables. That is, p(x|y,mu,...) does not factorize with respect to the different chains. This is quite intuitive: if you change some of the mu's selected by one chain, you probably need to change the mu's selected by the other chains too. I think there is no standard distribution for modelling a dependent set of categorical variables, and even if there were, the computational cost would be large in this case, something like O(length * states^chains). I guess a solution would be to force the factorization in the VB approximation:
q(x^(1)) * q(x^(2)) * ... * q(x^(N))
and then write a specific node for the Additive FHMM which takes the mu's and the N chains as parents. The chains are then estimated in turns instead of jointly.
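
Written out, that forced factorization and the resulting coordinate-wise VB updates are

\begin{align}
p(x^{(1)}, \ldots, x^{(N)} \mid y, \mu, \ldots) &\approx \prod_{n=1}^{N} q(x^{(n)}), \\
q(x^{(n)}) &\propto \exp\!\Big( \mathbb{E}_{\prod_{m \neq n} q(x^{(m)})} \big[ \log p(y, x^{(1)}, \ldots, x^{(N)}, \mu, \ldots) \big] \Big),
\end{align}

i.e., each chain's factor is updated in turn while the factors of the other chains are held fixed.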

I'm not sure that my analysis is correct. I quickly looked at your sklearn-based toolbox but didn't find out how you estimate the chains. Of course, the parameters of the chains can be unknown, but their estimation is straightforward once one is able to estimate the distribution of the chains/states.

Do you have any insight?


nipunbatra commented on June 12, 2024

> This is quite intuitive: if you change some of the mu's selected by one chain, you probably need to change the mu's selected by the other chains too.

You are right.

> I think there is no standard distribution for modelling a dependent set of categorical variables, and even if there were, the computational cost would be large in this case, something like O(length * states^chains). I guess a solution would be to force the factorization in the VB approximation:
> q(x^(1)) * q(x^(2)) * ... * q(x^(N))
> and then write a specific node for the Additive FHMM which takes the mu's and the N chains as parents. The chains are then estimated in turns instead of jointly.

Zoubin Ghahramani has an old paper on this (it is on FHMMs in general, not specifically the Additive FHMM). He discusses a few approximation schemes in that paper; I feel it should serve as a good reference.

> I quickly looked at your sklearn-based toolbox but didn't find out how you estimate the chains.

I had in fact approximated the Additive FHMM as a plain HMM. Let me explain that with a small example.

Assume we have two electrical appliances: a fan and a TV. For simplicity, both have two states (off and on), and they emit according to a normal distribution.

The parameters for these are as follows:

Pi_fan = [0.5, 0.5]
A_fan = [[0.9, 0.1], [0.2, 0.8]]
B_fan = [Normal(0, 1), Normal(100, 10)]

Pi_TV = [0.2, 0.8]
A_TV = [[0.7, 0.3], [0.2, 0.8]]
B_TV = [Normal(0, 1), Normal(200, 10)]

The Additive FHMM for both of these, represented as a single HMM, would have 2*2 = 4 states and parameters as follows:

Pi_combined = [0.5*0.2, 0.5*0.8, 0.5*0.2, 0.5*0.8]
B_combined = [Normal(0+0, 1+1), Normal(0+200, 1+10), Normal(100+0, 10+1), Normal(100+200, 10+10)]
... and so on ...
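
For concreteness, here is a small numpy sketch of that combination (reading the second argument of Normal as a variance; the initial and transition probabilities of independent chains combine via Kronecker products, the means add, and the variances add):

import numpy as np

Pi_fan, A_fan = np.array([0.5, 0.5]), np.array([[0.9, 0.1], [0.2, 0.8]])
Pi_TV, A_TV = np.array([0.2, 0.8]), np.array([[0.7, 0.3], [0.2, 0.8]])
mu_fan, var_fan = np.array([0.0, 100.0]), np.array([1.0, 10.0])
mu_TV, var_TV = np.array([0.0, 200.0]), np.array([1.0, 10.0])

# Combined 4-state HMM; the state index enumerates (fan, TV) state pairs
Pi_combined = np.kron(Pi_fan, Pi_TV)     # [0.5*0.2, 0.5*0.8, 0.5*0.2, 0.5*0.8]
A_combined = np.kron(A_fan, A_TV)        # the chains transition independently
mu_combined = np.add.outer(mu_fan, mu_TV).ravel()     # [0, 200, 100, 300]
var_combined = np.add.outer(var_fan, var_TV).ravel()  # [2, 11, 11, 20]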

This is NOT the best way to do such things, however: the combined state space grows exponentially with the number of appliances.

What do you think?


nipunbatra commented on June 12, 2024

@oliparson, @JackKelly would you have more information to add on variational Additive FHMMs?


jluttine commented on June 12, 2024

I looked at Ghahramani's paper. So, exact inference is possible as you presented it, and somewhat more efficient exact inference is possible by exploiting the factorial structure. However, both of these might be too inefficient in practice, so the best practical solution is probably to factorize the chains and then use VB iteration.

However, I don't see the difference between Additive FHMM and the model in (4a)+(4b) in Ghahramani's paper. What is the difference?


nipunbatra commented on June 12, 2024

I would say that the Additive FHMM makes it explicit that Y is the sum of the x's at a given time. I think Ghahramani's paper is a little more general.


jluttine commented on June 12, 2024

I haven't had time to think about this thoroughly. In any case, the current collection of nodes does not support implementing this model. I'm thinking about what would be the most modular and efficient way to implement it.

One solution is to write a node which takes a Gaussian array and N independent categorical chains as input, and whose output is the sum of the selected mu variables. It would be a deterministic node. This should at least work, and the implementation can be quite efficient because the node does the mixing and the summing at the same time, so it is possible to avoid huge intermediate arrays. Also, it is quite straightforward to implement, I think. The only downside is that the node is not very general-purpose.
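
To illustrate the forward computation such a node would perform, here is a toy numpy sketch (the shapes and values are hypothetical; mu[c, k] is the mean of state k in chain c):

import numpy as np

# Means per chain and state (hypothetical values, cf. the appliance example)
mu = np.array([[0.0, 100.0],
               [0.0, 200.0]])
# States of each chain at each of three time steps
z = np.array([[0, 1, 1],
              [1, 1, 0]])

# Mix and sum in one go: output[t] = sum_c mu[c, z[c, t]]
chains = np.arange(mu.shape[0])[:, None]
output = np.sum(mu[chains, z], axis=0)   # -> [200., 300., 100.]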

Another option would be to implement two nodes: one does the mixing, that is, selects the mu for each chain, and the other computes the sum of these. It is not obvious to me what the messages in these nodes would be. Also, there might be some overhead, because the first node mixes the mu's, which means that the message has a Gaussian dimensionality of the observation dimensions times the number of chains, and its length is the number of observations. The nice property of this implementation is that the nodes might be a bit more general than in the other implementation, maybe.

So, I think the first suggestion is probably easier to implement and more efficient. However, I'm not very familiar with the model, so I may have missed or misunderstood something.


nipunbatra commented on June 12, 2024

Hi,
Sorry for getting back to you this late. I'll be busy with conferences etc. for 3-4 more days and should get back to you then.

