Comments (6)
states = total no. of tags in training data ?
Yes, that is correct.
also, you mentioned for my case the observations will be multi-dimensional vectors, I didn't understand that
Each column of a sequence (sentence) should correspond to one word. You can choose how you want to represent each word. One option is to store each word as a numeric index (e.g. "hello" corresponds to 0, "goodbye" corresponds to 1, etc.), but that will only work for emission distributions that are DiscreteDistribution. If you are using GaussianDistribution or GMM, then you will want to represent each word as, e.g., an embedding or a one-hot encoded vector, or something similar.
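To make the two representations concrete, here is a minimal sketch using only the C++ standard library. The names Vocabulary, EncodeSentence, and OneHot are hypothetical helpers for illustration and are not part of mlpack; the encoded values would still need to be copied into an arma::mat (one column per word) before training.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not part of mlpack): assigns each distinct word the
// next unused index, in order of first appearance.
class Vocabulary
{
 public:
  // Returns the index of `word`, adding it to the vocabulary if needed.
  std::size_t IndexOf(const std::string& word)
  {
    auto it = index.find(word);
    if (it != index.end())
      return it->second;
    const std::size_t next = index.size();
    index.emplace(word, next);
    return next;
  }

  // Number of distinct words seen so far.
  std::size_t Size() const { return index.size(); }

 private:
  std::unordered_map<std::string, std::size_t> index;
};

// Encodes one sentence as word indices: one entry per word, which would
// become one column of a 1-row sequence matrix.
std::vector<std::size_t> EncodeSentence(Vocabulary& vocab,
                                        const std::vector<std::string>& words)
{
  std::vector<std::size_t> encoded;
  encoded.reserve(words.size());
  for (const std::string& w : words)
    encoded.push_back(vocab.IndexOf(w));
  return encoded;
}

// One-hot alternative for continuous emission distributions: a vector of
// length `vocabSize` with a single 1.0 at position `wordIndex`.
// Throws std::out_of_range if `wordIndex >= vocabSize`.
std::vector<double> OneHot(std::size_t wordIndex, std::size_t vocabSize)
{
  std::vector<double> v(vocabSize, 0.0);
  v.at(wordIndex) = 1.0;
  return v;
}
```

With index encoding each sentence is a single row of integers, while with one-hot encoding each word becomes a column whose length equals the vocabulary size, so the second form grows quickly for large vocabularies.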
of type size_t - is it to represent states as numbers?
Yes, each state is represented by its index.
Does each row = states for each sequence?
Yes, each row vector in stateSeq should correspond to the list of hidden states for each word in the corresponding sentence.
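As an illustration of how the observations and their state labels pair up position by position, the sketch below estimates transition and emission probabilities from labeled sequences by simple counting. This is the textbook maximum-likelihood estimate written with the standard library only; it is not taken from mlpack's source, and mlpack's own matrix layout and estimation details may differ. The names LabeledEstimates and EstimateFromLabels are hypothetical, and words are assumed to be already encoded as indices.

```cpp
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical result type for this sketch.
struct LabeledEstimates
{
  Matrix transition;  // transition[p][c]: P(current state c | previous state p)
  Matrix emission;    // emission[c][o]:   P(observation o | current state c)
};

// dataSeq[s][t]  = index of the t-th word of sentence s
// stateSeq[s][t] = index of the tag (hidden state) of that same word
LabeledEstimates EstimateFromLabels(
    const std::vector<std::vector<std::size_t>>& dataSeq,
    const std::vector<std::vector<std::size_t>>& stateSeq,
    std::size_t numStates,
    std::size_t numSymbols)
{
  LabeledEstimates est;
  est.transition.assign(numStates, std::vector<double>(numStates, 0.0));
  est.emission.assign(numStates, std::vector<double>(numSymbols, 0.0));

  // Count emissions at every position and transitions between neighbours.
  // The .at() calls throw if a label or word index is out of range, or if a
  // state sequence is shorter than its sentence.
  for (std::size_t s = 0; s < dataSeq.size(); ++s)
  {
    for (std::size_t t = 0; t < dataSeq[s].size(); ++t)
    {
      const std::size_t state = stateSeq.at(s).at(t);
      est.emission.at(state).at(dataSeq[s][t]) += 1.0;
      if (t > 0)
        est.transition.at(stateSeq[s][t - 1]).at(state) += 1.0;
    }
  }

  // Turn counts into probabilities by normalizing each row to sum to 1.
  // Rows with no counts are left as all zeros.
  auto normalizeRows = [](Matrix& m)
  {
    for (auto& row : m)
    {
      double sum = 0.0;
      for (double v : row)
        sum += v;
      if (sum > 0.0)
        for (double& v : row)
          v /= sum;
    }
  };
  normalizeRows(est.transition);
  normalizeRows(est.emission);
  return est;
}
```

The point of the sketch is the indexing: stateSeq[s] must have exactly one label for every word in dataSeq[s], which is why each state row has as many entries as its sentence has columns.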
from mlpack.
So, the answer to the question depends on whether you are doing this from C++, or from a binding or command-line program. In both cases, it could be helpful to take a look at the tests to get an idea of some examples, although I do understand that looking at test code is not always the easiest:
- C++ interface tests: https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/hmm_test.cpp
- command-line/binding interface tests: https://github.com/mlpack/mlpack/blob/master/src/mlpack/tests/main_tests/hmm_train_test.cpp (note that there are also tests for the hmm_viterbi, hmm_loglik, and hmm_predict bindings too)
In short, an HMM is trained on a series of sequences (optionally, you might know the hidden states for each observation in a sequence, but that is not required). In C++, this is represented as a std::vector<arma::mat>, where each element in the outer std::vector corresponds to a sequence, and each inner arma::mat (which is a sequence) has each observation in the sequence as a column.
In really simple cases, each observation might be a single scalar (e.g. the temperature); in this case, each arma::mat sequence would have 1 row (the temperature) and however many columns were in that sequence. Each sequence can have a different length (number of columns). In more complex cases, each observation may actually be a multidimensional vector; I think that will be the case with your parts-of-speech tagging.
In C++, it will be up to you to use data::Load() to load the matrix for each sequence and pack those matrices into a std::vector<arma::mat>. Of course, if you only have one sequence, then there is only a need for one element in the std::vector.
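The "one observation per column" layout can be sketched without any library as follows. Sequence and PackSequence are hypothetical stand-ins for illustration, not mlpack or Armadillo types; the buffer is stored column-major, which is also how Armadillo stores matrices, so the rows are the observation dimensions and the columns are the time steps.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for one sequence matrix: `rows` is the observation
// dimensionality, `cols` is the number of observations in the sequence.
struct Sequence
{
  std::size_t rows = 0;
  std::size_t cols = 0;
  std::vector<double> data;  // column-major: column c starts at c * rows

  // Element in dimension r of observation c (bounds-checked).
  double At(std::size_t r, std::size_t c) const
  {
    return data.at(c * rows + r);
  }
};

// Packs a list of observations (each a vector of equal dimension) so that
// observation t becomes column t of the resulting sequence.
Sequence PackSequence(const std::vector<std::vector<double>>& observations)
{
  Sequence seq;
  if (observations.empty())
    return seq;

  seq.rows = observations.front().size();
  seq.cols = observations.size();
  seq.data.reserve(seq.rows * seq.cols);
  for (const auto& obs : observations)
  {
    if (obs.size() != seq.rows)
      throw std::invalid_argument("all observations must have equal dimension");
    // Appending one observation appends one full column.
    seq.data.insert(seq.data.end(), obs.begin(), obs.end());
  }
  return seq;
}
```

A scalar series such as temperatures gives a sequence with 1 row and as many columns as readings, while 2-dimensional observations give 2 rows. A full data set would then be a std::vector of such sequences, each free to have a different number of columns.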
If you are using the bindings (e.g. command-line mlpack_hmm_train), you can pass in a single sequence with the input_file option, where that file is just a matrix that contains a single sequence as described above. Or, if you specify the batch option, then it is expected that the file specified by the input_file option contains a list of filenames, each of which specifies one sequence.
I hope this helps! It actually is on the short-term TODO list to clarify the expectations of these methods, so hopefully that should help out. Let me know if I can clarify anything. 👍
@rcurtin Thanks for the above explanation. However, it seems I still do have some confusion regarding the inputs to Train().
For my use case, from what I understood:
- Sequence = sentence
- Words = observations
- States = POS tags
- Transition probability: probability of current state C given previous state was P
- Emission probability: probability of observation O given the current state C.
I will be using C++ for experimentation (but I would love to know how to use CLI bindings as well).
From what I have thought:
- I will call the constructor:
HMM(states) // states = total no. of tags in training data ?
- Then I will call Train():
Train(dataSeq, // vector of size = no. of sentences in data set
               // each element of vector is a matrix, where I am confused on what will be the rows and columns and what will each element of matrix hold
               // also, you mentioned for my case the observations will be multi-dimensional vectors, I didn't understand that
      stateSeq // vector of rows (of type size_t - is it to represent states as numbers?) with arbitrary number of columns
               // Does each row = states for each sequence?
     );
This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! 👍
@rcurtin I was able to implement POS Tagging with HMMs successfully.
I made a YouTube video explaining the steps from start to end.
Also, you can find the code here.
Thanks for the guidance.
Awesome! I will point people towards that in the future when there are questions about the HMM code. Also, if you had interest in adapting that to the examples repository I think it would be nice to add, but don't feel obligated (it's easy enough to link to the repository you have).