infer-actively / pymdp
A Python implementation of active inference for Markov Decision Processes
License: MIT License
.github/workflows/python-package.yml:
- name: Lint with flake8
  run: |
    # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
    flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
While it is true that the code gets linted, this happens in the context of a GitHub docker image, and the results are lost after the tests run.
If you want auto-linting to happen before commits, there are a variety of ways to do it, but I suggest looking into pre-commit. If you set the expectation that code is linted before commit, this step can be updated so that it fails if linting changes the files at all, because that implies the files weren't linted prior to committing.
Nothing is perfect and there are ways around any such solution, but you can see how we're using pre-commit to make sure that static artifacts are built correctly before commits here: https://github.com/VersesTech/c4-verses-ai/blob/main/.pre-commit-config.yaml
Hey there,
First of all, I'm grateful for your work and your effort to make active inference more accessible for everyone. I'm quite new to this approach, so I started playing around with the provided notebooks.
I was wondering why the (controllable) transition matrix B (for the Agent class inside pymdp.agent) accepts direct state interactions only, i.e. it has to be of the form s^i_{t+1} = B(s^i_t, a) and not of a form with internal state interactions like s^i_{t+1} = B(s^i_t, s^j_t, a), where s^i, s^j refer to different hidden state factors. Am I wrong here? IMHO it shouldn't be a problem, since you can always define a joint world state \tilde{s}(s^i, s^j), but that makes the model more complicated. A rough sketch of that workaround is below.
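For illustration (all shapes and distributions here are made up, not from pymdp): one can build a joint transition tensor over \tilde{s} = (s^i, s^j) from factor-wise conditionals P(s^i' | s^i, s^j, a) and P(s^j' | s^j), so it fits the B[s_next, s_prev, action] convention:

import numpy as np

num_si, num_sj, num_actions = 3, 2, 4

# hypothetical conditional dynamics we actually want to express
T_i = np.random.dirichlet(np.ones(num_si), size=(num_si, num_sj, num_actions))  # P(s_i' | s_i, s_j, a)
T_j = np.random.dirichlet(np.ones(num_sj), size=num_sj)                          # P(s_j' | s_j)

joint = (num_si, num_sj)
B_joint = np.zeros((num_si * num_sj, num_si * num_sj, num_actions))
for si in range(num_si):
    for sj in range(num_sj):
        prev = np.ravel_multi_index((si, sj), joint)
        for a in range(num_actions):
            for si2 in range(num_si):
                for sj2 in range(num_sj):
                    nxt = np.ravel_multi_index((si2, sj2), joint)
                    B_joint[nxt, prev, a] = T_i[si, sj, a, si2] * T_j[sj, sj2]

# every column of B_joint[:, :, a] is a probability distribution
assert np.allclose(B_joint.sum(axis=0), 1.0)

The cost, as noted, is a (num_si * num_sj)^2 transition table per action, which is why native support for inter-factor dependencies would be nicer.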
Best,
Martin
Right now, the action precision (alpha), in the case of stochastic action sampling, defaults to alpha = 16.0 inside the control module's sample_action(..., alpha=16.0) function. Change this so that an instance of agent.Agent itself stores the action precision and passes it along in its self.sample_action() method. A rough sketch of the change is below.
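A minimal sketch of what this could look like (the class and attribute names here are illustrative, and the exact sample_action signature and module path are assumed rather than taken from the source):

from pymdp import control

class PrecisionAwareAgent:
    """Illustrative stand-in for Agent: stores alpha on the instance."""

    def __init__(self, q_pi, policies, num_controls, alpha=16.0):
        self.q_pi = q_pi                # posterior over policies
        self.policies = policies
        self.num_controls = num_controls
        self.alpha = alpha              # action precision lives on the agent

    def sample_action(self):
        # forward the stored precision instead of relying on the module default
        return control.sample_action(
            self.q_pi, self.policies, self.num_controls,
            action_selection="stochastic", alpha=self.alpha,
        )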
Originally posted by @helenegu in #4 (comment)
I think it would be helpful to have the badges in the README link to something other than the picture of the badge.
E.g. the PyPI badge could link to https://pypi.org/project/inferactively-pymdp/
Further, I would suggest a badge linking to your Read the Docs site, and, if you have CI set up for your pytest tests, a badge for that would be very helpful too.
Hi!
I think it would be useful to provide full API documentation for your package on your RTD website.
For example, I think the agent class and the envs subpackage are quite important, but I can't find API docs for them on the site.
I'd find it easier to navigate if they were listed in the modules section of the site; a separate full API documentation section would work too, I think.
Thanks!
AssertionError                            Traceback (most recent call last)
<ipython-input> in <module>
----> 1 A = utils.convert_A_stub_to_ndarray(A_stub, model_labels)

~\pymdp\core\utils.py in convert_A_stub_to_ndarray(A_stub, model_labels)
    490     for g, modality_name in enumerate(model_labels['observations'].keys()):
    491         A[g] = A_stub.loc[modality_name].to_numpy().reshape(num_obs[g], *num_states)
--> 492         assert (A[g].sum(axis=0) == 1.0).all(), 'A matrix not normalized! Check your initialization....\n'
    493
    494     return A

AssertionError: A matrix not normalized! Check your initialization....
or unrelated?
Thanks
Originally posted by @osaaso3 in #26 (comment)
In line 149 of agent.py, this assertion statement is used:

assert all([n_c == max_action for (n_c, max_action) in zip(self.num_controls, list(np.max(all_policies, axis=0)+1))]), "Maximum number of actions is not consistent with `num_controls`"

But sometimes certain actions will not be allowed (they will be pruned / absent in policies), and thus, even when taking the maximum across all policies, you won't see the maximum action taken within a given control factor.
Therefore, @tverbele suggests changing this to:

assert all([n_c >= max_action for (n_c, max_action) in zip(self.num_controls, list(np.max(all_policies, axis=0)+1))]), "Maximum number of actions is not consistent with `num_controls`"
When going through mmp with @conorheins yesterday, I noticed this section:
# past message
if t == 0:
    lnB_past = spm_log(prior[f])
else:
    past_msg = B[f][:, :, int(policy[t - 1, f])].dot(qs_seq[t - 1][f])
    lnB_past = spm_log(past_msg)

# future message
if t >= future_cutoff:
    lnB_future = qs_T[f]
else:
    future_msg = trans_B[f][:, :, int(policy[t, f])].dot(qs_seq[t + 1][f])
    lnB_future = spm_log(future_msg)
I am not sure whether the past_msg and future_msg lines are consistent in their use of t-1, t, and t+1. I think this is worth checking against 1) the original SPM, to see what they do, and 2) a decision on what the correct behaviour should be.
Add better error/warning messages for when you try to initialise an Agent() instance with the MMP inference algorithm (inference_algo="MMP"), but that choice doesn't make sense given inconsistent other choices (policy_depth, etc.).
Hi, thank you for this package I'm really enjoying learning about active inference.
I deeply appreciate the contributors to this package.
However, I have a question when I tried to implement the explore-exploit task in this article (Smith et al., A step-by-step tutorial on active inference and its application to empirical data, https://doi.org/10.1016/j.jmp.2021.102632) which is already implemented in MATLAB and "pymdp".
I tried to run a loop for active inference with a deep policy (two time-steps) according to the "complete recipe for active inference" written in the "pymdp" tutorial notebook, but I found that the "sample_action" method of the "Agent" class only samples the action from the first timestep of each policy (each policy has shape (2,2): the first dim is the number of timesteps and the second dim is the number of factors), using the "control.sample_policy" function as below (lines 674-675, control.py):

for factor_i in range(num_factors):
    selected_policy[factor_i] = policies[policy_idx][0, factor_i]
My setting of the agent class was:
timepoints = [0,1,2]
agent = Agent(
    A = A_gm,
    B = B,
    C = C,
    D = D_gm,
    E = E,
    pA = pA,
    pD = pD,
    policies = policies,
    policy_len = policies[0].shape[0],
    inference_horizon = len(timepoints),
    inference_algo="MMP",
    sampling_mode="full",
    modalities_to_learn=[1],
    use_BMA = True,
    policy_sep_prior = False,
)
In my view, to sample the action at the other timesteps of each policy, line 675 would be better changed to:
selected_policy[factor_i] = policies[policy_idx][timestep, factor_i]
If I have misunderstood the package, please let me know how to correct my usage.
Thank you!
In line 95 of agent.py
assert utils.is_normalized(self.B), "A matrix is not normalized (i.e. A.sum(axis = 0) must all equal 1.0"
I believe this is an error, as it should be the B matrix not the A matrix.
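Presumably the corrected line would read:

assert utils.is_normalized(self.B), "B matrix is not normalized (i.e. B.sum(axis = 0) must all equal 1.0)"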
Originally posted by @mahault in #4 (comment)
Thanks @conorheins and @Arun-Niranjan,
things are getting better.
The A-matrix worked with the A-matrix you provided, but the B-matrix is still not fixed, I think. A sample pandas matrix might help.
An issue remains with T-maze-learning (the T-maze demo works fine); I paste it here, but I can open an issue if needed.
Best
Originally posted by @osaaso3 in #26 (comment)
Dear researchers,
Unfortunately, this wonderful Python toolbox for active inference cannot be opened in Colab since Thursday evening, 23.3.23. What can be done?
I hope this can be fixed soon! Thank you in advance.
Best
Susanne Valavanis
I see that there are a lot of print statements built into the library (e.g. print(f'Factorized version: lnA at time {t}: {lnA}') in pymdp/algos/mmp.py). It would be preferable to use a canonical logging library. Here is a short article making the case for using logging rather than raw print statements: https://www.loggly.com/blog/4-reasons-a-python-logging-library-is-much-better-than-putting-print-statements-everywhere/
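For illustration, a minimal sketch of the logging pattern (the function and logger names here are hypothetical):

import logging

logger = logging.getLogger(__name__)

def report_lnA(t, lnA):
    # silenced by default; enabled app-wide via logging.basicConfig(level=logging.DEBUG)
    logger.debug("Factorized version: lnA at time %s: %s", t, lnA)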
This PR should do the trick: #140
Add an option to control.sample_action(), or perhaps create an entirely new function in control.py, that allows you to directly sample a policy from q_pi and then take the action entailed by that policy along each control factor, at the first timestep. A rough sketch is below.
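A rough sketch of what such a function might look like (the helper name and signature are hypothetical, not the existing control.py API):

import numpy as np

def sample_policy_then_action(q_pi, policies, num_factors):
    # sample a policy index directly from the posterior over policies
    idx = np.random.choice(len(q_pi), p=q_pi)
    # take the action that policy entails at the first timestep, per control factor
    return np.array([policies[idx][0, f] for f in range(num_factors)])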
Thanks to @helenegu for requesting this
Dear sir,
In the official document "Active Inference Demo: Epistemological Chains", starting with the 0-th modality, the Location observation modality A[0]:

A[0] = np.tile(np.expand_dims(np.eye(num_grid_points), (-2, -1)), (1, 1, num_states[1], num_states[2]))

I ran it but got the error "A matrix is not normalized (i.e. A.sum(axis = 0) must all be equal to 1.0)"; it looks as though A is initialized but not normalized.
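One quick way to locate the offending columns when this assertion fires (a sketch reusing the demo's A and plain numpy):

import numpy as np

col_sums = A[0].sum(axis=0)                    # should be 1.0 for every state combination
bad = np.argwhere(~np.isclose(col_sums, 1.0))  # indices of unnormalized columns
print(bad)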
Add functions to the learning module (and update the corresponding update_A() and update_B() methods of the Agent class) to allow learning to incorporate an entire past sequence of hidden state posteriors, rather than just a single hidden state posterior.
I thought it'd be good to get a discussion going on how we represent distributions in pymdp.
Currently we have the Categorical and Dirichlet classes, which I believe we are trying to move away from, as they're tricky to use and mean that all of the algorithms which call them have to constantly check whether their inputs are one of these classes or not.
It seems for now we're moving towards using numpy arrays of numpy arrays to shift data around. IIUC, this is necessary because we often have varying substructures depending on the number of factors involved (e.g. we have multi-dimensional tensors representing an agent's priors, policy choices, etc.).
I previously considered whether we should write our own dataclass for representing what we need, but I think it's likely to end up the same way as the Dirichlet/Categorical classes. It would be a step forward, but still vulnerable to the same problems of being fragile to refactor and improve upon.
I suggest we use the third party Awkward Array to solve this problem for us. They seem to have a stable API, and we should be able to use any existing numpy methods with those arrays as long as they are not jagged (which we have to sort out anyway in the current implementation).
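For a flavor of the API, a small sketch (shapes hypothetical) of per-factor posteriors of different sizes living in one jagged container:

import awkward as ak
import numpy as np

# two hidden state factors with different cardinalities (3 and 5 states)
qs = ak.Array([
    [0.2, 0.3, 0.5],
    [0.1, 0.1, 0.2, 0.3, 0.3],
])

# numpy ufuncs dispatch through the jagged structure
log_qs = np.log(qs)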
I think using this library would provide a number of benefits.
There is always a risk of introducing third party dependencies, but I am of the opinion that if it's good enough for particle physics (tracks and decay events) at the scale CERN is dealing with, it is almost certainly good enough for us.
@conorheins @alec-tschantz what do you think? I could have a go at introducing it in a PR if you think this idea is worth investigating further
(tutorial video here is nice: https://www.youtube.com/watch?v=WlnUF3LRBj4)
@OzanCatalVerses pointed out that norm_dist within the pymdp.utils module strangely treats the case of 3-tensors differently than other numbers of dimensions; see the code below.
def norm_dist(dist):
    """ Normalizes a Categorical probability distribution (or set of them) assuming sufficient statistics are stored in leading dimension"""
    if dist.ndim == 3:
        new_dist = np.zeros_like(dist)
        for c in range(dist.shape[2]):
            new_dist[:, :, c] = np.divide(dist[:, :, c], dist[:, :, c].sum(axis=0))
        return new_dist
    else:
        return np.divide(dist, dist.sum(axis=0))
Will correct this to the following (NumPy broadcasting already handles the 3-tensor case identically):
def norm_dist(dist):
    """ Normalizes a Categorical probability distribution (or set of them) assuming sufficient statistics are stored in leading dimension"""
    return np.divide(dist, dist.sum(axis=0))
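A quick sanity check that broadcasting makes the simplified version match the old 3-tensor branch (arbitrary example shape):

import numpy as np

dist = np.random.rand(4, 4, 3)
normed = np.divide(dist, dist.sum(axis=0))   # the leading-axis sums broadcast back over axis 0
assert np.allclose(normed.sum(axis=0), 1.0)  # every column is now a distribution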
Attribute error in agent.py at line 124 (Line 124 in 4a81f1b) when num_controls is passed in as an argument to the agent.Agent() constructor. We need an else statement after line 118 (Line 118 in 4a81f1b) so that num_controls gets assigned to a property of self if provided as an argument.
Originally posted by @conorheins in #4 (reply in thread)
Hi~
I want to know whether this package can fit a model to behavioral data collected in my experiment?
Or are there future plans to implement this functionality?
Thanks~
There is no log in the cross-entropy on line 368.
Hello,
Thanks so much with your work on bringing Active Inference to Python.
While going through your epistemic chaining demo, it appears there is a problem with the agent's navigation. When I moved the cue1 location from (2,0) to (0,1), the agent takes two moves downward and then tries to move "left" into the wall. The agent never recovers from this and doesn't seem to know to try a different direction.
I assume this is a problem with the B matrix, but I'm not smart enough to figure out whether this is some challenge in the agent class or in the rules set up during the demo itself for (["UP", "DOWN", "LEFT", "RIGHT", "STAY"]).
Any help/advice would be greatly appreciated! Please see the output log from the agent's movements below. The only change I make to your demo is in the my_env section, where I change cue1_loc to (0,1); you'll see that once it completes the second action it tries to go LEFT... then STAY, then tries LEFT a few more times:
Action at time 0: DOWN
Grid location at time 0: (1, 0)
Reward at time 0: Null
Action at time 1: DOWN
Grid location at time 1: (2, 0)
Reward at time 1: Null
Action at time 2: LEFT
Grid location at time 2: (2, 0)
Reward at time 2: Null
Action at time 3: STAY
Grid location at time 3: (2, 0)
Reward at time 3: Null
Action at time 4: STAY
Grid location at time 4: (2, 0)
Reward at time 4: Null
Action at time 5: STAY
Grid location at time 5: (2, 0)
Reward at time 5: Null
Action at time 6: LEFT
Grid location at time 6: (2, 0)
Reward at time 6: Null
Action at time 7: LEFT
Grid location at time 7: (2, 0)
Reward at time 7: Null
Action at time 8: LEFT
Grid location at time 8: (2, 0)
Reward at time 8: Null
Action at time 9: LEFT
Grid location at time 9: (2, 0)
Reward at time 9: Null
Hi guys, I have a question about calculating the free energy. When calculating the einsum, why do we only sum down to the first dimension:
arg_list = [X, list(range(X.ndim))] + list(chain(*([x[xdim_i],[dims[xdim_i]]] for xdim_i in range(len(x))))) + [[0]]
and then use:
spm_dot(likelihood, qs)[0]
to fetch only the first level of s0 (supposing the hidden state s is factorized as s0, s1, ...)?
In my opinion, we should sum over the whole tensor by removing the final + [[0]], so that Y would just be a scalar.
Am I totally wrong? If I am, please correct me, thanks a lot!
Allow users to specify an optional list of hidden state factor indices when passing in the A and B arrays, specifying which hidden state factors correspond to the lagging dimensions of these tensors. The purpose of this feature is to speed up computation and limit memory consumption. It leverages the (often-satisfied) assumption in generative models that not all observation modalities depend on all hidden state factors. This means we don't have to include, in the likelihood tensors, the many lagging dimensions that correspond to hidden state factors the observation modality in question doesn't depend on. The same goes for the B arrays, which are already accordingly sparse due to the baked-in mean-field approximation of the posterior (statistical independence of hidden state factors in the prior).

This amounts to two new optional arguments (one for A and one for B), structured as follows:
- A_factor_list, where A_factor_list[m] contains a list of the hidden state factor indices that observation modality m is conditionally dependent on.
- B_factor_list, where B_factor_list[f] is the list of hidden state factor indices that are needed to predict the dynamics of factor f. For now we will assume control factors are still fully conditionally independent, such that a B[f] sub-array only depends on the control state factor with index f, even though it may depend on other hidden state factors.

Completing the above will also require building in the option of having additional dependencies in each factor-specific B array on the other hidden state factors. A sketch of the proposed arguments is below.
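For concreteness, a sketch of how these arguments might look for a hypothetical model with 2 observation modalities and 3 hidden state factors:

A_factor_list = [
    [0],     # modality 0 depends only on hidden state factor 0
    [0, 2],  # modality 1 depends on factors 0 and 2
]
B_factor_list = [
    [0],     # factor 0's dynamics depend only on itself
    [1, 2],  # factor 1's dynamics also condition on factor 2
    [2],     # factor 2's dynamics depend only on itself
]
# A[1] would then carry lagging dims (num_states[0], num_states[2]) only,
# rather than the dimensions of all three factors.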
In line 180 of agent.py, the error message should be changed to:
"D vector is not normalized (i.e. D[f].sum() must all equal 1.0)"
Originally posted by @helenegu in #73 (comment)
Bug found by @SamGijsen:
Running "stochastic" action selection throws an error when using the current version. I believe it's because actions are sampled even if num_controls=1, in which case utils.sample() attempts to squeeze an array of size 1. Reverting to deterministic selection if
num_controls==1
has fixed this for me.
(control.sample_action() line 565if action_selection == 'deterministic' or num_controls[factor_i]==1:
) Happy to submit a PR if that's useful, otherwise I believe it's just this line of code.
Originally posted by @SamGijsen in #82 (comment)
I really enjoyed this tutorial, overall. It shows the power of Agent. I think it's a little hard to read in some places, and I wanted to flag some of these issues:
One thing that's not explained here is how the agent's inference mechanism differs from those of other, better-known agents from the RL literature. Having read the first few chapters of Richard Sutton's book eons ago, I wondered whether the inference is equivalent to a finite-horizon dynamic programming solution, or similar in spirit to an agent that uses a UCB heuristic or Thompson sampling. A few references about this in the tutorial would be great.
Hello!
I couldn't find an implementation of habit learning (updating of the prior over policies, E) or of precision (gamma) updating through a beta prior.
Am I missing it? And if not, is it planned for future releases?
Thanks!
The following new arXiv reference may be worth noting in the README and/or docs:
https://arxiv.org/abs/2201.03904
pymdp: A Python library for active inference in discrete state spaces
When action_selection == "deterministic" in control.sample_action(), but all the action probabilities are equal, we should sample rather than deterministically choose the first action (which is the default behavior of np.argmax). A possible fix is sketched below.
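A possible fix, sketched here (the helper name is hypothetical):

import numpy as np

def argmax_random_tiebreak(p):
    best = np.flatnonzero(p == p.max())  # all indices attaining the maximum
    return np.random.choice(best)        # break ties uniformly at random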
Creating an object array from a list of numpy arrays seems to break if the component arrays of the list have a matching leading axis. For example
arrs = [np.zeros((3, 6)), np.zeros((3, 4, 5))]
obj_arr = np.array(arrs, dtype=object)
will throw the following error:
ValueError: could not broadcast input array from shape (3,6) into shape (3,)
whereas
arrs = [np.zeros((4, 6)), np.zeros((3, 4, 5))]
obj_arr = np.array(arrs, dtype=object)
works just fine.
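A common workaround (sketched here) is to pre-allocate the object array and fill it element by element, so numpy never tries to broadcast the components:

import numpy as np

arrs = [np.zeros((3, 6)), np.zeros((3, 4, 5))]

obj_arr = np.empty(len(arrs), dtype=object)  # 1-D container of python objects
for i, a in enumerate(arrs):
    obj_arr[i] = a                           # no broadcasting is attempted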
Hi all,
First off, many thanks for your work, it looks very promising!
I was comparing some simple agents between the current implementation and the MATLAB one, and noticed that in terms of reversal learning the pymdp version appears to adapt considerably more slowly. I've played around with a variety of setups and hyperparameters, but the difference is quite significant.
Example setup: slot machine task without hints
2 actions: 'button' 0 and 'button' 1
A 'button 0 is better' context and a 'button 1 is better' context.
40 trials, with the hidden context switching after 20 trials. Here I plot the state posterior (black) of 100 agents starting with flat context priors, compared to the true but unknown state/context (red). Below I'll include the pymdp code. I'm assuming I'm using the package wrong, and would love to know my misunderstanding.
import pymdp
from pymdp import utils
from pymdp.agent import Agent
import numpy as np
import matplotlib.pyplot as plt

num_obs = [3, 3]     # 3 Rewards, 3 choice observations
num_states = [3, 2]  # 3 choice states, 2 hidden states
num_factors = len(num_states)

# Press one of two buttons
num_controls = [2, 1]

A_shapes = [[o_dim] + num_states for o_dim in num_obs]

# initialize the A array to all 0's
A = utils.obj_array_zeros(A_shapes)

# reward probabilities
r1 = 0.9
r2 = 0.9

# Reward observations
# STATE 1         Start  a0    a1
A[0][0, :, 0] = [0, 1 - r1, r2    ]  # Positive reward
A[0][1, :, 0] = [0, r1    , 1 - r2]  # Negative reward
A[0][2, :, 0] = [1, 0     , 0     ]  # Neutral (start state)
# STATE 2         Start  a0    a1
A[0][0, :, 1] = [0, r1    , 1 - r2]  # Positive
A[0][1, :, 1] = [0, 1 - r1, r2    ]  # Negative
A[0][2, :, 1] = [1, 0     , 0     ]  # Neutral (start state)

# No uncertainty about choice observations
A[1][:, :, 0] = np.eye(num_obs[1])
A[1][:, :, 1] = np.eye(num_obs[1])

B_shapes = [[s_dim, s_dim, num_controls[f]] for f, s_dim in enumerate(num_states)]
B = utils.obj_array_zeros(B_shapes)

for i in range(2):
    B[0][0, :, i] = np.ones(3)
B[0][:, 0, 0] = [0, 1, 0]  # action 0: Start -> a0
B[0][:, 0, 1] = [0, 0, 1]  # action 1: Start -> a1
B[1][:, :, 0] = np.eye(num_states[1])

C = utils.obj_array_zeros(num_obs)
C[0] = np.array([1, -1, 0])  # Prefer rewards

D = utils.obj_array_uniform(num_states)
D[0] = np.array([1, 0, 0])  # Start in the 'start'-state

# ENVIRONMENT
class my_env():
    def __init__(self, A):
        self.A = A

    def step(self, action, state):
        obs = utils.sample(self.A[0][:, action[0].astype(int) + 1, state])
        return [obs, action[0].astype(int) + 1]

# SIMULATIONS
T = 40
alpha = 16
gamma = 16
AS = "deterministic"

for run in range(100):
    D2 = D.copy()
    model = Agent(A=A, B=B, C=C, D=D, policy_len=1, action_selection=AS)
    switches = [20, 40, 50]
    state = 0
    states = []
    pstate = []
    pact = []
    e = my_env(A)
    model.infer_states([2, 0])  # 2 = neutral obs, 0 = start state
    for t in range(T):
        # if t > 0:
        #     D2[1] = model.qs[1]
        #     model = Agent(A=A, B=B, C=C, D=D2, policy_len=1, action_selection=AS)
        if t in switches:
            state = 1 - state
        states.append(state)
        # Start position for the trial (I believe you don't use this in the
        # tutorial, but it doesn't seem to matter much)
        model.infer_states([2, 0])  # 2 = neutral reward, 0 = start state observation
        q_pi, neg_efe = model.infer_policies()
        action = model.sample_action()
        obs = e.step(action, state=state)
        model.infer_states(obs)
        # Save belief and output
        pstate.append(model.qs[1][0])
        pact.append(q_pi[0])
    plt.plot([1 - s for s in pstate], label="p(s)", linewidth=3, alpha=0.1, color='k')

plt.plot([s * 1.1 - 0.05 for s in states], label="s", color="r", linewidth=3)
plt.xlabel("trial")
plt.ylim([-0.05, 1.05])
plt.title("Python")
There are two instances that read THe, which should be changed to The.
A heading is mislabeled:
Should read: