mogeng / iohmm

Input Output Hidden Markov Model (IOHMM) in Python

License: BSD 3-Clause "New" or "Revised" License

Python 100.00%

graphical-models hidden-markov-model linear-models machine-learning python scikit-learn semi-supervised-learning sequence-labeling sequence-to-sequence statsmodels supervised-learning time-series unsupervised-learning

iohmm's Issues

Tutorial?

Is there a tutorial or manual that shows a user how to set up an IOHMM and produce outputs? I'm thinking about using this in a bachelor's thesis.

Handling latent observations in IOHMM

Hi,

I need to modify your code to handle missing observations. A binary vector R marks each observation as missing or provided, and a missing observation is assigned emission probability 1 (log probability 0), as in the solution suggested in
yeh2012 (1).pdf

Can you guide me on this?
Thanks
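The masking idea described above can be sketched independently of the library. This is a minimal illustration, not IOHMM's API: the helper name `mask_missing_log_emissions` and the array layout (one row per time step, one column per hidden state) are assumptions.

```python
import numpy as np


def mask_missing_log_emissions(log_emission, observed):
    """Set the log emission probability to 0 (probability 1) where data is missing.

    log_emission: array of shape (T, K), log p(y_t | state k).
    observed: binary vector R of length T, 1 = provided, 0 = missing.
    (Hypothetical helper; names and shapes are assumptions.)
    """
    log_emission = np.asarray(log_emission, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    # Broadcast R across the state axis so a missing step is neutral for every state.
    return np.where(observed[:, None], log_emission, 0.0)


log_b = np.log(np.array([[0.2, 0.8], [0.6, 0.4], [0.5, 0.5]]))
R = np.array([1, 0, 1])
masked = mask_missing_log_emissions(log_b, R)
```

With the emission term neutralised, a forward-backward pass would move through a missing step using only the transition model.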

Regarding Transition Probabilities

Hello

I am using several covariates for the transition model, but the sample code for viewing transition probabilities throws an error.
For example: covariates_transition = ['a','b','c','d','e', 'f','g','h','i']

When using print(np.exp(SHMM.model_transition[0].predict_log_proba(np.array([[]])))),
I get the error: X has 1 features per sample; expecting 10. How can I get the transition probabilities for a two-hidden-state model with transition covariates?
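One reading of the feature counts in the error message is that the model expects an intercept plus the nine covariates listed above, so an empty input row supplies only the intercept. The sketch below shows that shape relationship for a generic multinomial logit. It is an illustration under that assumption, not the library's implementation; `transition_probs` and the coefficient layout (intercept row first) are hypothetical.

```python
import numpy as np


def transition_probs(coef, covariates):
    """Softmax transition probabilities for one row of covariates.

    coef: array of shape (n_covariates + 1, n_states), intercept row first (assumed layout).
    covariates: array of shape (n_covariates,).
    """
    x = np.concatenate(([1.0], np.asarray(covariates, dtype=float)))  # prepend intercept
    logits = x @ np.asarray(coef, dtype=float)
    logits = logits - logits.max()  # stabilise the exponentials
    p = np.exp(logits)
    return p / p.sum()


coef = np.zeros((10, 2))   # 9 covariates + intercept, 2 next states
row = np.zeros(9)          # one value per covariate 'a'..'i'
probs = transition_probs(coef, row)
```

Under the same assumption, the input to `predict_log_proba` would be a 2-D array with one row and nine columns rather than `np.array([[]])`.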

Also, how can I get the most likely state for each observation? In hmmlearn,
remodel = hmm.GaussianHMM(n_components=2).fit(X, lengths)
Z3 = remodel.predict(X)
provides the hidden states. How can I get that in this model?

Thank you for all the help.
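For the most-likely-state question, a generic Viterbi decoder that allows a different transition matrix at every step (as an input-driven transition model would produce) can be sketched as follows. The function and its argument layout are assumptions for illustration; how to extract these log-probability arrays from a trained model is not shown in the issue.

```python
import numpy as np


def viterbi(log_pi, log_A, log_B):
    """Most likely state path.

    log_pi: (K,) initial log probabilities.
    log_A: (T-1, K, K) log transition matrices, log_A[t][i, j] = log p(z_{t+1}=j | z_t=i).
    log_B: (T, K) log emission probabilities of the observed outputs.
    """
    T, K = log_B.shape
    delta = np.empty((T, K))
    backptr = np.zeros((T, K), dtype=int)
    delta[0] = log_pi + log_B[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A[t - 1]  # rows: from-state, cols: to-state
        backptr[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[t]
    path = np.empty(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):
        path[t] = backptr[t + 1, path[t + 1]]
    return path


log_pi = np.log([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
log_A = np.log(np.stack([A, A, A]))
log_B = np.log([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]])
path = viterbi(log_pi, log_A, log_B)
```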

Getting AIC/BIC to figure out the optimal states

Hi, could you please let me know how I can get the AIC/BIC for UnSupervisedIOHMM in order to choose the optimal number of hidden states?
Thank you for your help.
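The standard definitions are AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where L is the maximised likelihood, k the number of free parameters and n the number of observations. A minimal sketch follows; obtaining the log-likelihood and counting the free parameters of a fitted model are left to the caller and are assumptions here.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood
```

One would fit the model for each candidate number of states and compare the resulting values.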

ModuleNotFoundError: No module named 'sklearn.linear_model.base'

When using IOHMM with python 3.9.1 I get the following error:

Traceback (most recent call last):
  File "c:\users...\appdata\local\programs\python\python39\lib\site-packages\IOHMM\__init__.py", line 1, in <module>
    from .IOHMM import (UnSupervisedIOHMM,
  File "c:\users...\appdata\local\programs\python\python39\lib\site-packages\IOHMM\IOHMM.py", line 39, in <module>
    from .linear_models import (GLM, OLS, DiscreteMNL, CrossEntropyMNL)
  File "c:\users...\appdata\local\programs\python\python39\lib\site-packages\IOHMM\linear_models.py", line 56, in <module>
    from sklearn.linear_model.base import _rescale_data
ModuleNotFoundError: No module named 'sklearn.linear_model.base'

I managed to get it working by changing sklearn.linear_model.base to sklearn.linear_model._base in IOHMM\linear_models.py
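Instead of hard-coding one module path, the import could try several candidates in order. The helper below is a hypothetical sketch (`import_attr` is not part of IOHMM or scikit-learn); the two module paths in the usage comment are the ones named in this issue.

```python
import importlib


def import_attr(candidates, attr):
    """Return `attr` from the first module in `candidates` that provides it."""
    for name in candidates:
        try:
            return getattr(importlib.import_module(name), attr)
        except (ImportError, AttributeError):
            continue  # module path missing in this version, try the next one
    raise ImportError("none of %r provides %r" % (list(candidates), attr))


# Usage for the case in this issue (requires scikit-learn to be installed):
# _rescale_data = import_attr(
#     ["sklearn.linear_model._base", "sklearn.linear_model.base"], "_rescale_data")
sqrt = import_attr(["no_such_module_xyz", "math"], "sqrt")
```

Note that `_rescale_data` is a private scikit-learn name, so it may move again in later releases.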

Train an IOHMM with independent chains

I want to train an IOHMM on several independent sequences, and I don't know how to specify them in a single input pandas DataFrame. Can you briefly explain the fields corr and prev and how to use them, please?
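Code in other issues on this page passes a list to `set_data` (for example `set_data([train_df])`), which suggests one DataFrame per sequence. Under that assumption, a single DataFrame could be split by a sequence identifier before being passed in. The column name `seq_id` and the helper are hypothetical.

```python
import pandas as pd


def split_sequences(df, id_col):
    """Split one DataFrame into a list of per-sequence DataFrames, keeping row order."""
    return [
        group.drop(columns=id_col).reset_index(drop=True)
        for _, group in df.groupby(id_col, sort=False)
    ]


df = pd.DataFrame({
    "seq_id": ["a", "a", "b", "b", "b"],   # hypothetical sequence identifier
    "out": [1.0, 2.0, 3.0, 4.0, 5.0],
})
sequences = split_sequences(df, "seq_id")
# The resulting list could then be passed on, e.g. SHMM.set_data(sequences).
```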

Prediction

How do I make predictions from the output distributions?

How to do prediction?

How do I make predictions for new test data after training the model?
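One common way to turn per-state output models into a point prediction is to average the state-specific predictions, weighted by the state probabilities. The sketch below assumes linear (OLS-style) emission models with the intercept stored first. The function and array layout are illustrative assumptions, not the library's prediction API.

```python
import numpy as np


def expected_output(state_probs, X, coefs):
    """Probability-weighted prediction.

    state_probs: (T, K) probability of each hidden state at each step.
    X: (T, p) emission covariates, without an intercept column.
    coefs: (K, p + 1) per-state linear coefficients, intercept first (assumed layout).
    """
    X1 = np.column_stack([np.ones(len(X)), X])   # add the intercept column
    per_state = X1 @ np.asarray(coefs).T         # (T, K) prediction under each state
    return (np.asarray(state_probs) * per_state).sum(axis=1)


state_probs = np.array([[1.0, 0.0], [0.5, 0.5]])
X = np.array([[1.0], [2.0]])
coefs = np.array([[0.0, 1.0], [10.0, 0.0]])
y_hat = expected_output(state_probs, X, coefs)
```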

Train a SemiSupervisedIOHMM

I was using SemiSupervisedIOHMM from the IOHMM package and got the error "ValueError: too many values to unpack (expected 2)". To isolate the problem I ran your notebook (https://github.com/Mogeng/IOHMM/blob/master/examples/notebooks/SemiSupervisedIOHMM.ipynb) on your data (https://github.com/Mogeng/IOHMM/blob/master/examples/data/speed.csv) and got the same error. Could you please help me with this?

Multivariate gaussian emissions

Hello,

I have no input covariates and am trying to reconcile the results of IOHMM with those of hmmlearn.GaussianHMM, which I assume is the standard HMM.

USHMM = UnSupervisedIOHMM(num_states=2, max_EM_iter=200, EM_tol=1e-6)

USHMM.set_models(model_emissions = [OLS()],
                 model_transition=CrossEntropyMNL(solver='lbfgs'),
                 model_initial=CrossEntropyMNL(solver='lbfgs'))

USHMM.set_inputs(covariates_initial = [], covariates_transition = [], covariates_emissions = [[]])
USHMM.set_outputs([['feature1'],['feature2'],['feature3'],['feature4']] ) #multiple univariate emissions
USHMM.set_data([train_df])

The above gives me almost the same results as

from hmmlearn import hmm

model = hmm.GaussianHMM(n_components = 2, covariance_type = "diag")
model.fit(train_df)

This is expected, since "diag" means a diagonal covariance matrix: the features are independent given the state, which amounts to separate univariate emissions. IOHMM treats this case as 4 independent emissions, so the results are close and can be reconciled.

However, if I switch to a full multivariate emission

USHMM = UnSupervisedIOHMM(num_states=2, max_EM_iter=200, EM_tol=1e-6)

USHMM.set_models(model_emissions = [OLS()],
                 model_transition=CrossEntropyMNL(solver='lbfgs'),
                 model_initial=CrossEntropyMNL(solver='lbfgs'))

USHMM.set_inputs(covariates_initial = [], covariates_transition = [], covariates_emissions = [[]])
USHMM.set_outputs([['feature1','feature2','feature3','feature4']])  #multivariate emission
USHMM.set_data([train_df])

This gives me very different results compared to

model = hmm.GaussianHMM(n_components = 2, covariance_type = "full")
model.fit(train_df)

Here "full" means that each state has its own p x p covariance matrix, which characterises the multivariate Gaussian emission distribution. Is this difference expected, or did I set up the model incorrectly? Thanks

Best Regards,
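The distinction between the two covariance types can be checked numerically. With a diagonal covariance the joint Gaussian log-density equals the sum of the univariate log-densities, while a covariance with off-diagonal terms gives a different value. The numbers below are made up for illustration, and the snippet says nothing about how IOHMM itself treats a multi-column output.

```python
import numpy as np


def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate normal, written out with NumPy only."""
    d = np.asarray(x, dtype=float) - np.asarray(mean, dtype=float)
    p = d.shape[-1]
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)  # squared Mahalanobis distance
    return -0.5 * (p * np.log(2 * np.pi) + logdet + maha)


x = np.array([0.5, -1.0])
mean = np.zeros(2)
var = np.array([1.0, 4.0])

diag_joint = gaussian_logpdf(x, mean, np.diag(var))
sum_univariate = sum(
    gaussian_logpdf(x[i:i + 1], mean[i:i + 1], np.array([[var[i]]])) for i in range(2)
)
full_joint = gaussian_logpdf(x, mean, np.array([[1.0, 1.2], [1.2, 4.0]]))
```

Here `diag_joint` matches `sum_univariate`, while `full_joint` does not, so a model that factorises the outputs cannot reproduce a "full" covariance fit when the features are correlated.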

ImportError: cannot import name 'logsumexp'

When using IOHMM with python 3.5.2 I get the following error:

Traceback (most recent call last):
  File "", line 1, in <module>
  File "/home/tas/.local/lib/python3.5/site-packages/IOHMM/__init__.py", line 1, in <module>
    from .IOHMM import (UnSupervisedIOHMM,
  File "/home/tas/.local/lib/python3.5/site-packages/IOHMM/IOHMM.py", line 38, in <module>
    from .forward_backward import forward_backward
  File "/home/tas/.local/lib/python3.5/site-packages/IOHMM/forward_backward.py", line 21, in <module>
    from scipy.misc import logsumexp
ImportError: cannot import name 'logsumexp'
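In current SciPy releases `logsumexp` is provided by `scipy.special`. A guarded import with a small NumPy fallback is one way to make the failing line tolerant of the installed version. This is a sketch of a possible patch, not the maintainers' fix, and the fallback covers only the `axis` argument.

```python
import numpy as np

try:
    from scipy.special import logsumexp
except ImportError:
    def logsumexp(a, axis=None):
        """Minimal NumPy fallback: log(sum(exp(a))) computed stably."""
        a = np.asarray(a, dtype=float)
        a_max = np.max(a, axis=axis, keepdims=True)
        a_max = np.where(np.isfinite(a_max), a_max, 0.0)  # guard against all -inf input
        out = np.log(np.sum(np.exp(a - a_max), axis=axis, keepdims=True)) + a_max
        return np.squeeze(out, axis=axis)


total = logsumexp(np.log([0.25, 0.75]))
```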

Using CrossEntropyMNL as model_emissions parameter

(UnSupervisedIOHMM)
When setting models as follows:

SHMM.set_models(model_emissions = [CrossEntropyMNL(solver='lbfgs')], model_transition=CrossEntropyMNL(solver='lbfgs'), model_initial=CrossEntropyMNL(solver='lbfgs'))

It raises this error:

ValueError: Input contains NaN, infinity or a value too large for dtype('float64').

It does not happen when I use instead:

SHMM.set_models(model_emissions = [OLS(est_stderr=True)], model_transition=CrossEntropyMNL(solver='lbfgs'), model_initial=CrossEntropyMNL(solver='lbfgs'))

Could you tell me how to resolve this problem?

Thanks in advance!

P.S.: the entire model-fitting code is here:

SHMM = UnSupervisedIOHMM(num_states=3, max_EM_iter=200, EM_tol=1e-4)

SHMM.set_models(model_emissions = [CrossEntropyMNL(solver='lbfgs')], model_transition=CrossEntropyMNL(solver='lbfgs'), model_initial=CrossEntropyMNL(solver='lbfgs'))

SHMM.set_inputs(covariates_initial = [], covariates_transition = ['var'], covariates_emissions = [['var']])

SHMM.set_outputs([['out']])

SHMM.set_data([df])

SHMM.train()
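A first diagnostic for a "NaN, infinity or a value too large" error is to check whether the input data itself contains non-finite values. The helper below is a hypothetical sketch and the column names mirror the ones in the code above. If the data is clean, the non-finite values arise during fitting and this check will not find them.

```python
import numpy as np
import pandas as pd


def nonfinite_columns(df, columns):
    """Return {column: count} for columns containing NaN, inf or non-numeric entries."""
    bad = {}
    for col in columns:
        values = pd.to_numeric(df[col], errors="coerce").to_numpy(dtype=float)
        n_bad = int((~np.isfinite(values)).sum())
        if n_bad:
            bad[col] = n_bad
    return bad


df = pd.DataFrame({"var": [1.0, np.nan, np.inf], "out": [0, 1, 2]})
report = nonfinite_columns(df, ["var", "out"])
```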

What if some sequences have only one record?

I was using UnSupervisedIOHMM to learn sequences, but predict_log_proba in E_step keeps raising ValueError: zero-size array to reduction operation maximum which has no identity. The error arises from X = self._transform_X(X) in predict_log_proba: self.inp_transitions[seq] is [] because these sequences have length 1, so they have no transition inputs. Is there any way to get around this problem?
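One workaround, at the cost of discarding data, is to drop sequences that are too short to contain a transition before handing them to the model. The helper is a hypothetical sketch that works on any list of sized sequences, including DataFrames.

```python
def drop_short_sequences(sequences, min_length=2):
    """Keep only sequences with at least `min_length` records."""
    return [seq for seq in sequences if len(seq) >= min_length]


sequences = [[1.0], [2.0, 3.0], [4.0, 5.0, 6.0], []]
kept = drop_short_sequences(sequences)
```

Single-record sequences still carry information about the initial state and emissions, so filtering them out is a trade-off rather than a fix.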

SemiSupervisedIOHMM couldn't load model

On the line:
SHMM_from_json = SemiSupervisedIOHMM.from_json(json_dict)
I get the following error:

ValueError Traceback (most recent call last)
in
----> 1 SHMM_from_json = SemiSupervisedIOHMM.from_json(json_dict)

~/.local/lib/python3.5/site-packages/IOHMM/IOHMM.py in from_json(cls, json_dict)
539 model_initial=getattr(
540 LinearModelLoader, json_dict['properties']['model_initial']['data_type']).from_json(
--> 541 json_dict['properties']['model_initial']),
542 model_transition=[getattr(
543 LinearModelLoader, model_transition_json['data_type']

~/.local/lib/python3.5/site-packages/IOHMM/linear_models.py in from_json(cls, json_dict)
316 l1_ratio=json_dict['properties']['l1_ratio'],
317 coef=np.load(json_dict['properties']['coef']['path']),
--> 318 stderr=np.load(json_dict['properties']['stderr']['path']))
319
320

~/.local/lib/python3.5/site-packages/numpy/lib/npyio.py in load(file, mmap_mode, allow_pickle, fix_imports, encoding)
445 else:
446 return format.read_array(fid, allow_pickle=allow_pickle,
--> 447 pickle_kwargs=pickle_kwargs)
448 else:
449 # Try a pickle

~/.local/lib/python3.5/site-packages/numpy/lib/format.py in read_array(fp, allow_pickle, pickle_kwargs)
694 # The array contained Python objects. We need to unpickle the data.
695 if not allow_pickle:
--> 696 raise ValueError("Object arrays cannot be loaded when "
697 "allow_pickle=False")
698 if pickle_kwargs is None:

ValueError: Object arrays cannot be loaded when allow_pickle=False
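The final error comes from `numpy.load`, which in recent NumPy versions refuses to unpickle object arrays unless `allow_pickle=True` is passed. The snippet reproduces this with a temporary file. The file name is made up, and whether the saved `stderr` array in the issue is an object array for the same reason is an assumption.

```python
import os
import tempfile

import numpy as np

# An object-dtype array, e.g. one holding None where no standard error was estimated.
arr = np.array([None, 1.5], dtype=object)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "stderr.npy")  # hypothetical file name
    np.save(path, arr)
    try:
        np.load(path)
        default_load_failed = False
    except ValueError:
        default_load_failed = True  # "Object arrays cannot be loaded when allow_pickle=False"
    loaded = np.load(path, allow_pickle=True)
```

Unpickling can execute arbitrary code, so `allow_pickle=True` should only be used on files from a trusted source.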

Impose constraints on the transition model

I am very grateful to you and your team for the code you provided. I am very interested in IOHMM models and want to train a simple UnSupervisedIOHMM on my dataset, but I have run into a problem.

I was wondering whether it is possible to impose constraints on the transition probabilities when learning with EM. I want to train a specific type of IOHMM with no transitions from a higher-indexed state to a lower-indexed state (the Bakis model, or left-to-right HMM). That is, once the system moves from a state z_i to a state z_j with i < j, it cannot return to z_i. Does the existing IOHMM library have this option? If not, how can I add this constraint to the code?
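The left-to-right constraint described above amounts to forcing the lower triangle of each transition matrix to zero. A minimal sketch of that masking step follows. It projects an unconstrained matrix onto the Bakis structure after the fact and is not a constrained M-step; the helper name is hypothetical.

```python
import numpy as np


def left_to_right(trans):
    """Zero out transitions to lower-indexed states and renormalise each row.

    trans: (K, K) transition matrix with positive diagonal entries.
    """
    masked = np.triu(np.asarray(trans, dtype=float))  # keep j >= i only
    return masked / masked.sum(axis=1, keepdims=True)


A = np.full((3, 3), 1.0 / 3.0)
A_ltr = left_to_right(A)
```

With input-dependent transitions the same mask would have to be applied to the matrix produced at every time step.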

Training model with multiple transition matrices

Thank you all so much for making and sharing this package. Is it possible to create and train an unsupervised IOHMM that has 2 possible transition matrices and the input array determines which matrix is applied at each step in the sequence? All of the examples only show multiple emission matrices, and I haven't been able to initialize a model with multiple transition matrices. Thank you again!
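The behaviour asked for here, where a binary input picks one of two transition matrices at each step, can be written down directly with array indexing. This only illustrates the target model; it is not a way to configure the library, and the names are hypothetical.

```python
import numpy as np


def select_transitions(matrices, switch):
    """Pick a transition matrix per time step.

    matrices: (2, K, K) stack of the two candidate transition matrices.
    switch: (T,) binary input, switch[t] selects the matrix used at step t.
    Returns an array of shape (T, K, K).
    """
    return np.asarray(matrices, dtype=float)[np.asarray(switch, dtype=int)]


A0 = np.array([[0.9, 0.1], [0.2, 0.8]])
A1 = np.array([[0.5, 0.5], [0.5, 0.5]])
per_step = select_transitions([A0, A1], [0, 1, 1, 0])
```

A multinomial-logit transition model with a single binary covariate also takes only two distinct values per row, so it may already express this structure.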

Emissions Probability Distribution

Hi!

I have fitted an unsupervised IOHMM with 3 states and an emission covariate (see code below).

SHMM = UnSupervisedIOHMM(num_states=3, max_EM_iter=200, EM_tol=1e-4)

SHMM.set_models(model_emissions = [OLS(est_stderr=True)], model_transition=CrossEntropyMNL(solver='lbfgs'), model_initial=CrossEntropyMNL(solver='lbfgs'))

SHMM.set_inputs(covariates_initial = [], covariates_transition = [], covariates_emissions = [['var']])

SHMM.set_outputs([['out']])

SHMM.set_data([df])

SHMM.train()

I am trying to get the emission probability distribution in each state as a function of the emission covariate value. How can I model this probability? For each state, the emission model gives me two coefficients and two estimated standard errors for those coefficients.

How can I use these coefficients?

Appreciate your answers!
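For a linear-Gaussian emission, the density of the output in a given state is a normal distribution centred on the regression line. The sketch below assumes the two coefficients are an intercept and a slope, and that a residual standard deviation `sigma` for the state is available separately. Both are assumptions. The coefficient standard errors describe uncertainty in the coefficients, not the spread of the output, so they are not used here.

```python
import math


def ols_emission_logpdf(y, x, intercept, slope, sigma):
    """log N(y | intercept + slope * x, sigma^2) for one state (hypothetical helper)."""
    mean = intercept + slope * x
    z = (y - mean) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi)


# Density of out = 3.0 given var = 1.0 in a state with mean 1 + 2 * var and sigma = 1.
log_density = ols_emission_logpdf(3.0, 1.0, intercept=1.0, slope=2.0, sigma=1.0)
```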

Constraints

Is it possible to impose constraints on the parameters?
