Comments (4)
Thanks for the message and for the very clearly written code example! :) Just a minor change to your code should fix the problem.
To understand the problem, you need to know that BayesPy considers Gaussian variables to have both `shape` and `plates`. `shape` defines the shape of the variable array whose elements are correlated in the posterior approximation, while `plates` defines "repetitions" of these variables that are independent in the posterior. In your case, you want to (or need to) model the correlations on the axes that are summed over, so you should define those as `shape` axes and the non-summed axes as `plates` axes. This may sound a bit complicated, especially if you're not familiar with variational Bayesian methods.
Anyway, this should fix your code:

```python
b = bp.nodes.GaussianARD(0, 1e-3, shape=(n, c))  # use shape, not plates
```
However, I suppose you don't want to sum over the `s` axis, so you need to modify the code a bit more to avoid that summation. There might be a slight difference from `einsum`, because the key definition you used would sum over the `s` axis in `SumMultiply`. So, if you don't want to sum over that axis, change the code as follows:
```python
x = np.random.normal(0, 1, (s, n, c))            # move the s axis to be first
b = bp.nodes.GaussianARD(0, 1e-3, shape=(n, c))
f = bp.nodes.SumMultiply('ij,ij', b, x)          # ignore the s axis in the keys
```
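To see what that `SumMultiply` call computes, here is a plain-NumPy sketch of the equivalent `einsum` (the sizes and the stand-in mean `b_mean` are just for illustration): the `'ij,ij'` keys sum over the trailing `n` and `c` axes while broadcasting over the leading `s` plate axis.

```python
import numpy as np

s, n, c = 5, 4, 3
x = np.random.normal(0, 1, (s, n, c))
b_mean = np.random.normal(0, 1, (n, c))  # stand-in for the posterior mean of b

# 'ij,ij' in SumMultiply corresponds to summing over both trailing shape axes
# while broadcasting over any leading plate axes:
f = np.einsum('...ij,...ij->...', b_mean, x)
print(f.shape)  # (5,) — one value per s plate, n and c summed away
```

So `f` keeps the `s` axis as a plate while the `n` and `c` axes are contracted, which is exactly the shape arrangement the fixed code above relies on.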
The total shape of a `GaussianARD` variable is `plates + shape`, thus the `shape` axes are the trailing axes. Notice that I just moved the `s` axis to be the first axis, and the summed axes `n` and `c` are the last axes. In addition, `SumMultiply` can sum only over `shape` axes, not `plates` axes. I can try to explain the difference between `plates` and `shape` in more detail if you want; it can be a bit confusing.
Then a general comment about your approach, if you're interested. :) Using a high-order polynomial to gain flexibility isn't the best way to go: it restricts the possible functions to only those that can be represented by such polynomials. I would suggest you consider Gaussian processes for the task, if you find yourself comfortable with them. They are an amazing tool for flexible non-linear regression problems using the Bayesian approach. GPs aren't yet implemented in BayesPy, but they are available for instance in GPy (https://github.com/SheffieldML/GPy), pyGPs (https://github.com/marionmari/pyGPs), or a few other toolboxes. Just a thought. But if you want to do polynomial regression, then your approach with BayesPy is good. :)
The regression code may not scale to very large datasets because of how BayesPy handles messages internally, but I hope you won't be affected by that. Just letting you know in case you hit that problem with bigger datasets that you'd expect to be well within computational limits.
Did this help? Please don't hesitate to ask further questions.
from bayespy.
Thanks so much for the prompt and most helpful response! Those additions/changes did the trick; and the high order polynomial method successfully modeled this round of test data.
I think I understand the plates/shape differentiation, at least enough to refresh my memory next time. Looking back at the linear regression example, I see that `shape` is very clearly used, and not `plates`. Oops. :/
I still haven't quite wrapped my mind around Einstein summation in general, so I will have to re-watch that Einstein summation playlist on YouTube. ;)
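For what it's worth, NumPy's `einsum` is a compact way to get the hang of the notation: a repeated index is summed over, and an index omitted from the output is reduced. A tiny sketch:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
v = np.array([1.0, 2.0, 3.0])

# j appears in both operands and not in the output -> summed over:
mv = np.einsum('ij,j->i', a, v)    # matrix-vector product
# i repeated within one operand -> diagonal, then summed -> trace:
tr = np.einsum('ii->', np.eye(3))
print(mv)  # [ 8. 26.]
print(tr)  # 3.0
```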
Your comment on the general approach is most welcome. Per your suggestion, I looked into Gaussian processes, and they look like the ticket! But I have a question: part of the reason I wanted to use high-order polynomials is that I can easily take their derivatives programmatically. Zooming out even more, I am working on a reinforcement learning model where the learner attempts to climb the gradient to maximize the dependent value, so I'll need the derivative (possibly the second, third, etc.) of the inferred model. So...how difficult is it to compute the gradient of a Gaussian process model?
Thanks again for the help!
About GP derivatives, see section 9.4 in Gaussian Processes for Machine Learning (Rasmussen & Williams), available here: http://www.gaussianprocess.org/gpml/chapters/RW9.pdf . (Great book, I recommend you take a look at it.) In short, "since differentiation is a linear operator, the derivative of a Gaussian process is another Gaussian process". You just basically use the derivative of your covariance function as the covariance function of your derivative. Simple. So you not only get a posterior distribution over functions but also a posterior distribution over the derivative (and even higher order derivatives). I'm not sure how easily the packages I mentioned support this kind of operation, but in any case it should be relatively easy and straightforward to implement derivatives of covariance functions yourself. Cheers! :)
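To make the idea concrete, here is a minimal plain-NumPy sketch, not using any GP package; the RBF kernel, lengthscale, jitter, and toy sin data are my own assumptions for illustration. The posterior mean of f' at test points is obtained by replacing the cross-covariance k(x*, X) with its derivative with respect to x*:

```python
import numpy as np

def rbf(x1, x2, ell=1.0):
    # squared-exponential covariance k(x1, x2)
    d = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * d**2 / ell**2)

def drbf_dx1(x1, x2, ell=1.0):
    # derivative of the kernel w.r.t. its first argument
    d = x1[:, None] - x2[None, :]
    return -(d / ell**2) * rbf(x1, x2, ell)

# toy data: noisy observations of sin(x); f' should then track cos(x)
rng = np.random.default_rng(0)
X = np.linspace(0, 6, 30)
y = np.sin(X) + 1e-3 * rng.standard_normal(30)
Xs = np.array([1.0, 2.0, 3.0])

alpha = np.linalg.solve(rbf(X, X) + 1e-6 * np.eye(30), y)
mean_f = rbf(Xs, X) @ alpha        # posterior mean of f at Xs
mean_df = drbf_dx1(Xs, X) @ alpha  # posterior mean of f' at Xs
print(mean_df)  # roughly cos([1, 2, 3])
```

The same pattern extends to higher-order derivatives by differentiating the kernel again, as the chapter linked above describes.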
Wow! Thanks for the above-and-beyond assistance! That is a great resource saying exactly what I wanted to hear; looks like I have my work cut out for me.
Thanks for pointing me in the right direction. : )