Comments (11)
Now that I think of it: is it possible to deterministically force-set the values of several other categorical nodes based on the value of a parent? For example, if categorical node p takes value 1, then I need children c_0 through c_10 to also take value 1, and if p takes value 0, its children should all take value 0. I checked the Deterministic node documentation, but I don't see such support yet.
If I can do that, then I can broadcast my entity assignment to each word and proceed.
from bayespy.
I quickly looked into this, but the model is a bit complex, so I'd need to spend a bit more time thinking about this. Anyway, the broadcasting error is raised because the plates of the two nodes are incompatible.
I like to denote the overall shape of a node in two parts: the plates and the dimensionality/range of the variable. For instance, entity_assignments has plates (numDocuments, numEntities) and range numPersonas, that is, each element is an integer in the range [0, numPersonas). Similarly, psi has plates (numPersonas, numRoles) and dimensionality numTopics, that is, each variable is a numTopics-dimensional vector from a Dirichlet distribution.
Now, gated_plate=-2 at least matches the range of entity_assignments with the second-to-last plate axis of psi perfectly, so that's fine. However, the plates of entity_assignments, (numDocuments, numEntities), should be broadcastable to the remaining plates of psi, (numRoles,), but they aren't, so there's a mismatch. Are the plates incorrect for either of the nodes, are you combining the wrong nodes, or would you want the resulting plates to be (numDocuments, numEntities, numRoles)? I should take a more careful look to figure out how to write the model, but at least this is an obvious error.
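BayesPy plates follow NumPy's broadcasting rules, so a mismatch like this can be checked directly in NumPy. A minimal sketch (the sizes 8, 15 and 3 are hypothetical stand-ins for numDocuments, numEntities and numRoles):

```python
import numpy as np

# Hypothetical sizes, just for illustration
numDocuments, numEntities, numRoles = 8, 15, 3

# Plates broadcast like NumPy shapes: align from the right,
# and each axis must either match or be 1.
try:
    np.broadcast_shapes((numDocuments, numEntities), (numRoles,))
except ValueError as e:
    print("incompatible plates:", e)  # 15 vs 3 on the last axis

# Compatible variants broadcast fine:
print(np.broadcast_shapes((numDocuments, 1), (numRoles,)))
# (8, 3)
print(np.broadcast_shapes((numDocuments, numEntities, 1), (numRoles,)))
# (8, 15, 3)
```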
With complex models like this, I typically like to write the plates and "dimensionality" of the node as a comment above the node definition so I can keep track of them easily. Then, usually you are doing things the right way if all the plates and dimensionalities match nicely. ;)
I can try to take a more careful look some other day but I wanted to at least write this in case it helps you to solve the issues.
About the deterministic mapping: I'm not sure why you would need to map 0->0 and 1->1 deterministically to a child node; why can't you use the parent node directly in such a case? Anyway, the Take node provides one way of defining deterministic mappings, but I don't think it's needed here.
I hope this helps at least a bit. I'll get back to you another day if you want.
Thanks for the note. I will get back to you in a day or two.
@jluttine Thanks a lot for your comments. Here is my scenario:
I have topics that can be indexed by 2 dimensions (instead of just one).
So for each word w in my corpus, there is a topic assignment determined by two indices: (a) a row index and (b) a column index.
So to specify this in my model, I need to gate on both dimensions. I was hoping to do this by first gating on rows and then nesting that to gate on columns.
Here is a small snippet
import bayespy
word_topic_dist = bayespy.nodes.Dirichlet([1,1,1,1], plates=(4,3)) # There are 12 topics indexed by a 4x3 matrix. Each topic is a distribution over 4 words
row_assignments = bayespy.nodes.Categorical([0.2, 0.1, 0.2, 0.5], plates=(1000,)) # The row assignments to index the topics for each of the 1000 words in my corpus
col_assignments = bayespy.nodes.Categorical([0.2, 0.1, 0.7], plates=(1000,)) # The column assignments to index the topics for each of the 1000 words in my corpus
# So word *i* must be drawn from topic indexed (row_assignments[i], col_assignments[i])
# How can I achieve this using BayesPy (I am thinking using Gates is the right approach) ?
Sorry for the huge delay. I've been very busy during the last few weeks. I took a look again, and it takes quite a lot of time for me to parse the exact model definition. Many words but few mathematical formulas. :) If you have time to write the exact model definition as conditional probability distributions, or as generative formulas that describe how each variable is generated from the others, and what the input data look like, that would help a lot. Implementing that in BayesPy is then hopefully rather simple. But in any case, I'll continue on this today evening or some other day. Cheers!
FYI, I'm reading section 4.1 from the paper, that probably gives enough information.
Note: This message is a Jupyter Notebook. You can download it or run it interactively.
Ok, I have now sketched an implementation of the Dirichlet Persona Model. You should double-check that this is what you wanted; I'm not absolutely sure. I think I made at least one minor change: the persona distribution is global, not document/movie-specific.
Anyway, define the configuration:
import bayespy as bp
import numpy as np
numTopics = 10 # number of topics
numPersonas = 4 # protagonist, villain, ...
numRoles = 3 # agent verb, patient verb, attribute
sizeVocabulary = 50 # size of vocabulary
#numDocuments = 8 # number of documents (not used now)
numCharacters = 15 # total number of characters
sizeCorpus = 10000 # size of the dataset
# Generate random dataset from the model
# Data are a set of tuples (word, role, character)
# So, each "datapoint" has a word-index, role-index and character-index.
data_characters = bp.nodes.Categorical(
np.ones(numCharacters) / numCharacters,
plates=(sizeCorpus,)
).random()
data_roles = bp.nodes.Categorical(
np.ones(numRoles) / numRoles,
plates=(sizeCorpus,)
).random()
data_personas = bp.nodes.Categorical(
np.ones(numPersonas) / numPersonas,
plates=(numCharacters,)
).random()
data_topic_dist = bp.nodes.Dirichlet(
np.ones(numTopics),
plates=(numPersonas, numRoles)
).random()
data_topics = bp.nodes.Categorical(
data_topic_dist[data_personas[data_characters], data_roles]
).random()
data_word_dist = bp.nodes.Dirichlet(
np.ones(sizeVocabulary) / sizeVocabulary,
plates=(numTopics,)
).random()
data_words = bp.nodes.Categorical(
data_word_dist[data_topics],
plates=(sizeCorpus,)
).random()
Below is the model:
# Word distribution for each topic
# (numTopics) x (numWords)
word_dist_in_topics = bp.nodes.Dirichlet(
np.ones(sizeVocabulary),
plates=(numTopics,)
)
# Topic distribution for each role and persona
# (numPersonas, numRoles) x (numTopics)
topic_dist_in_personas_and_roles = bp.nodes.Dirichlet(
np.ones(numTopics),
plates=(numPersonas, numRoles)
)
# Persona distribution (make this document specific?)
persona_dist = bp.nodes.Dirichlet(
np.ones(numPersonas)
)
# Persona assignments of the characters
# (numCharacters) x (numPersonas)
personas_of_characters = bp.nodes.Categorical(
persona_dist,
plates=(numCharacters,)
)
# Persona assignments for each data point (i.e., each word in the corpus)
# (sizeCorpus) x (numPersonas)
personas = bp.nodes.Gate(
data_characters,
personas_of_characters
)
# Topic assignment for each data point (i.e., each word in the corpus)
# (sizeCorpus) x (numTopics)
topics = bp.nodes.Categorical(
bp.nodes.Gate(
personas,
bp.nodes.Gate(
data_roles[:,None], # a trick to make plates match in this case
topic_dist_in_personas_and_roles
)
)
)
# Words in the corpus
# (sizeCorpus) x (sizeVocabulary)
words = bp.nodes.Categorical(
bp.nodes.Gate(
topics,
word_dist_in_topics
)
)
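The data_roles[:,None] trick in the topics node is pure NumPy-style broadcasting: adding a unit axis shifts the corpus dimension off the role plate axis. A small shapes-only NumPy illustration, using the sizes from the configuration above:

```python
import numpy as np

sizeCorpus, numRoles = 10000, 3
data_roles = np.random.randint(numRoles, size=sizeCorpus)

# Shapes align from the right: (sizeCorpus,) against (numRoles,) would
# clash on the last axis, but (sizeCorpus, 1) broadcasts against
# (numRoles,) to (sizeCorpus, numRoles).
print(data_roles.shape)           # (10000,)
print(data_roles[:, None].shape)  # (10000, 1)
print(np.broadcast_shapes(data_roles[:, None].shape, (numRoles,)))
# (10000, 3)
```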
Create VB object, initialize some nodes randomly and observe the data. Note that characters and roles data were used as "inputs" in the above model.
Q = bp.inference.VB(
words,
word_dist_in_topics,
topics,
topic_dist_in_personas_and_roles,
personas_of_characters,
persona_dist,
)
topics.initialize_from_random()
personas_of_characters.initialize_from_random()
topic_dist_in_personas_and_roles.initialize_from_random()
persona_dist.initialize_from_random()
words.observe(data_words)
Run inference:
Q.update(repeat=1000)
You can visualize the posterior of the nodes for instance as:
%matplotlib notebook
bp.plot.plt.figure(); bp.plot.hinton(personas_of_characters)
bp.plot.plt.figure(); bp.plot.hinton(word_dist_in_topics)
I hope this helps!
@jluttine: Thanks a ton! This is awesome :) It's pretty much the model I had in mind (except that in my model the persona distribution is document-specific, but I think I can get that working). I was having trouble defining the topics node (where you have a nested Gate). Nice trick to use data_roles to index the roles (I had modeled the roles as a probabilistic variable, which I then had to set to observed, and that complicated my model). Thanks a ton! I will definitely be using your wonderful package for my research!
In the paper, the document-specific persona distributions seem to share the concentration parameters of the Dirichlet distributions. I think that sharing these parameters would currently require a custom node which implements an estimation algorithm. I'll make a separate issue about that feature request and see what I can do about it. Anyway, I don't think it makes much of a difference whether you use document-specific persona distributions that share the concentration parameters or the same persona distribution for all documents. It might make a significant difference if you have a relatively large number of personas in each document AND the distributions of those personas are very different in each document. If I understood correctly, this is probably not the case.
In any case, don't use document-specific persona distributions without sharing the concentration parameters; you'll probably end up getting very little information about the persona distribution in each document.
For your information, the develop branch now supports learning the concentration parameter of a Dirichlet distribution. The node is called DirichletConcentration and it takes one mandatory argument: the dimensionality of the probability vector, that is, a single integer. This node can then be used as a parent of Dirichlet nodes in order to learn models where many Dirichlet variables share a common unknown concentration parameter.
Thanks a ton! I will give it a shot.