pymc-devs / pymc4
Experimental PyMC interface for TensorFlow Probability. Official work on this project has been discontinued.
License: Apache License 2.0
After running this, an error occurs.
pip install --user git+https://github.com/pymc-devs/pymc4.git#egg=pymc4
Could not find a version that satisfies the requirement tf-nightly==1.9.0.dev20180607
No matching distribution found for tf-nightly==1.9.0.dev20180607 (from pymc4)
My Platform is OSX 10.13.6, using Python 3.6.5 :: Anaconda, Inc.
PyMC3, as well as Edward and Edward2, requires (or strongly encourages) passing an additional name to each RV:
with pm.Model():
    x = pm.Normal('x', mu=0, sigma=1)
Note that the x appears twice. We need to do this because we can't easily infer at run-time that the variable x is called "x". But we do need to keep track of it and reference it later in the trace, which is why the additional argument is needed.
This is in contrast to DSL PPL systems like Stan, BUGS, or newer Julia-based PPLs, which don't need this.
It's also a common stumbling block and a frequent complaint among pymc3 users:
I hacked together a small prototype that inspects and transforms the AST to get this information and make the name optional (i.e. it can still be provided by the user):
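A minimal sketch of the idea, using only the standard ast module (the class name follows the traceback further down this page; the real prototype may differ in detail):

import ast

class AutoNameTransformer(ast.NodeTransformer):
    """Rewrite `x = pm4.Normal(...)` into `x = pm4.Normal(..., name='x')`."""

    def visit_Assign(self, node):
        # Only handle simple `name = SomeCall(...)` assignments.
        if isinstance(node.value, ast.Call) and isinstance(node.targets[0], ast.Name):
            call = node.value
            if not any(kw.arg == "name" for kw in call.keywords):
                call.keywords.append(
                    ast.keyword(arg="name", value=ast.Str(s=node.targets[0].id))
                )
        return node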
This is fairly straightforward, as you can see: we just fetch the name of the variable and add it to the code that instantiates the RV. So for pymc4 everything looks identical to before; it's just a thin layer that transforms the code before it gets executed.
This prototype matches the PyMC3 API, but I think we should ultimately change it to look something like this:
def create_RV():
    return pm4.Normal(mu=0, sigma=1, name='required')  # name must be provided, as we don't traverse into function calls (although we probably could)

@pm4.model(autoname=True)  # True by default?
def mymodel():
    x = pm4.Normal(mu=0, sigma=1)  # rewritten by the AST parser to -> x = pm4.Normal(mu=0, sigma=1, name='x')
    x2 = pm4.Normal(mu=0, sigma=1, name='but_i_want_to')  # a name can still be supplied
    y = create_RV()
    z = [pm4.Normal(mu=0, sigma=1, name='required {}'.format(i)) for i in range(5)]  # name must be provided
    x = pm4.Normal(mu=0, sigma=1)  # raises an exception, variable already defined
That code focuses mostly on edge cases where we can't easily infer the name; we don't need to handle those cases (I think they are rare).
The main initial response from developers seems to be "thou shalt not mess with the AST", but I don't think that's a good argument. The ast module is a standard, public Python API, the code executes in the same way (i.e. break-points can still be set), and the magic performed is pretty small but gives a huge improvement in user experience, and that's the focus of PyMC.
It would be helpful for contributors if a requirements.txt were added.
There is no exception handling for models with only observed variables. Maybe one could add an ndim method to the state object to perform all the checks in the sample method, but there is some ambiguity with shape support (#91). How can this best be fixed?
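One possible shape for the check, as a minimal sketch (initialize_state is borrowed from elsewhere on this page; free_variables is purely an assumed attribute, not actual pymc4 internals):

def sample(model, **kwargs):
    state = initialize_state(model)
    # A model with only observed variables has no free RVs to sample.
    if not state.free_variables:
        raise TypeError("Model has only observed variables; there is nothing to sample.")
    ...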
I am trying forward sampling on the mixture model example from the pymc4 examples, but I get an empty dict as a result.
The model is defined as:
@pm.model
def mixture(n_groups, n_points):
    centers = pm.Normal('centers', loc=tf.zeros((n_groups, 2)), scale=1)
    scales = pm.HalfNormal('scales', 0.4 * tf.ones(n_groups))
    rates = pm.Dirichlet('rates', concentration=tf.ones(n_groups))
    group_assignments = pm.Multinomial('group_assignments', total_count=n_points, probs=rates)
    for idx in range(n_groups):
        count = tf.to_int32(group_assignments[idx])
        pm.Normal(f'group_{idx}', loc=centers[idx] * tf.ones((count, 2)), scale=scales[idx])
Then I configured the model:
model = mixture.configure(n_groups=5, n_points=10_000)
I checked the TensorFlow graph and it's not empty. The graph node names are:
['zeros',
'centers/scale',
'centers/Identity',
'centers/Identity_1',
'ones/shape_as_tensor',
'ones/Const',
'ones',
'mul/x',
'mul',
'scales/scale',............]
What could be the reason?
Add observed RV support by exploring the use of the values attribute in tensorflow.distributions.
I can't seem to find the definition of ArrayStep in pymc4. Was this object experimental, or is it supposed to be somewhere?
https://github.com/pymc-devs/pymc4/blob/master/pymc4/_hmc/hmc.py#L3
Notebook demonstrating eight schools model
pymc3.stats has functions to evaluate models, such as Rhat, HPD, AIC, WAIC, and LOO. How about implementing these functions in PyMC4?
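In the meantime, ArviZ already implements most of these. A rough sketch, assuming the pymc4 trace can be converted into an InferenceData (the sample array below is illustrative):

import arviz as az
import numpy as np

# Illustrative posterior samples: 4 chains x 500 draws for one variable.
idata = az.from_dict(posterior={"mu": np.random.randn(4, 500)})
az.rhat(idata)     # Rhat convergence diagnostic
az.summary(idata)  # summary table with credible intervals
# az.waic / az.loo additionally need pointwise log-likelihood values.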
Change this to a dictionary to allow for easier debugging and a stronger ability to check that all variables are passed to the log_prob function:
https://github.com/pymc-devs/pymc4/blob/master/pymc4/_template_contexts.py#L47
Add traceplot() and plot_posterior() functions.
@pm.model
def t_test(sd_prior='half_normal'):
    mu = pm.Normal('mu', 0, 1)
    sd = pm.HalfNormal('sd', 1)
    # Problem here! sd still can't take negative values when passed into other vars
    pm.Normal('y_0', 0, 2 * sd)
    pm.Normal('y_1', mu, 2 * sd)

model = t_test.configure()
model._forward_context.vars
func = model.make_log_prob_function()
mu = -tf.ones((10,))
sd = -tf.ones((10,))
y_0 = tf.ones((10,))
y_1 = tf.ones((10,))
func(mu, sd, y_0, y_1)
This returns nan. The problem is actually not in line 2 of the model, but arises when sd is passed on in line 3. So what we need to do is back-transform the value for when it's seen by other RVs. I'm not quite sure where to do that, but maybe we can look at how PyMC3 does it.
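For intuition, a sketch of the needed behaviour using a tfp bijector (the names here are illustrative, not pymc4 internals):

import tensorflow as tf
import tensorflow_probability as tfp
tfb = tfp.bijectors

# HMC proposes an unconstrained value for sd on the real line...
unconstrained_sd = -tf.ones((10,))
# ...but downstream RVs (y_0, y_1) should see the back-transformed,
# strictly positive value instead:
sd = tfb.Exp().forward(unconstrained_sd)  # exp(-1) > 0, safe as a Normal scale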
After some internal discussion, we agreed that it would be good to be able to get a model's element-wise log_prob evaluation. This feature would help with:
I was just thinking about the test suite here - I think it makes sense to dynamically generate tests for each random variable to:
- .sample from it
- log_prob
- as_tensor
In other words, we should try to pull a similar trick for the tests to how we dynamically wrap the tfp distributions as random variables. I think this has several advantages (e.g. the tests write themselves!), but I'm unsure if this is best practice with tests... Any thoughts?
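For concreteness, a sketch of what the generated tests could look like (the distribution/parameter table is an assumption, not an existing fixture):

import pytest
import pymc4 as pm

# Hypothetical table of RVs and valid parameters to exercise.
DISTRIBUTIONS = [
    (pm.Normal, dict(mu=0.0, sigma=1.0)),
    (pm.HalfNormal, dict(sigma=1.0)),
]

@pytest.mark.parametrize("rv_cls, params", DISTRIBUTIONS)
def test_rv_round_trip(rv_cls, params):
    rv = rv_cls(name="x", **params)
    draw = rv.sample()   # can we sample from it?
    rv.log_prob(draw)    # can we evaluate log_prob on the draw?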
cc @canyon289 depending on how far you want to take #49, I'd be happy to do a follow up PR for tests.
We no longer need to point the README to the functional branch to install.
TFP's Normal uses NumPy's loc and scale. PyMC3 used mu and sigma (or alternatively tau or sd).
What are our thoughts on parametrization of distributions? Follow TFP only, or provide the alternative parametrizations as well?
Speaking for myself, I believe it's better for UX to keep the PyMC3 parametrizations - it means fewer things to change API-wise for an end-user.
Running the following
# Logp calculation for linear regression
@pm4.model(auto_name=True)
def rugby():
    # Define priors
    home = pm4.Normal(mu=0, sigma=2)
    sd_att = pm4.HalfNormal(sigma=2.5)
    sd_def = pm4.HalfNormal(sigma=2.5)
    intercept = pm4.Normal(mu=0, sigma=2)
    # team-specific model parameters
    atts_star = pm4.Normal(mu=0, sigma=tf.fill([6], sd_att))
    defs_star = pm4.Normal(mu=0, sigma=tf.fill([6], sd_def))
    atts = atts_star - tf.mean(atts_star)
    defs = defs_star - tf.mean(defs_star)
    home_theta = tt.exp(intercept + home + atts[home_team] + defs[away_team])
    away_theta = tt.exp(intercept + atts[away_team] + defs[home_team])
    # likelihood of observed data
    home_points = pm.Poisson(mu=home_theta, observed=observed_home_goals)
    away_points = pm.Poisson(mu=away_theta, observed=observed_away_goals)

model = rugby.configure()
forward_sample = model.forward_sample()
This fails with:
AttributeError Traceback (most recent call last)
<ipython-input-18-226106c12312> in <module>
1
2 # Logp calculation for linear regression
----> 3 @pm4.model(auto_name=True)
4 def rugby():
5 # Define priors
~/pymc4/pymc4/_model.py in wrap(func)
32 # convert to ast and apply visitor
33 tree = parse_snippet(*unc)
---> 34 AutoNameTransformer().visit(tree)
35 ast.fix_missing_locations(tree)
36 unc[0] = tree
~/miniconda3/envs/tfp36/lib/python3.6/ast.py in visit(self, node)
251 method = 'visit_' + node.__class__.__name__
252 visitor = getattr(self, method, self.generic_visit)
--> 253 return visitor(node)
254
255 def generic_visit(self, node):
~/miniconda3/envs/tfp36/lib/python3.6/ast.py in generic_visit(self, node)
306 for value in old_value:
307 if isinstance(value, AST):
--> 308 value = self.visit(value)
309 if value is None:
310 continue
~/miniconda3/envs/tfp36/lib/python3.6/ast.py in visit(self, node)
251 method = 'visit_' + node.__class__.__name__
252 visitor = getattr(self, method, self.generic_visit)
--> 253 return visitor(node)
254
255 def generic_visit(self, node):
~/miniconda3/envs/tfp36/lib/python3.6/ast.py in generic_visit(self, node)
306 for value in old_value:
307 if isinstance(value, AST):
--> 308 value = self.visit(value)
309 if value is None:
310 continue
~/miniconda3/envs/tfp36/lib/python3.6/ast.py in visit(self, node)
251 method = 'visit_' + node.__class__.__name__
252 visitor = getattr(self, method, self.generic_visit)
--> 253 return visitor(node)
254
255 def generic_visit(self, node):
~/pymc4/pymc4/ast_compiler.py in visit_Assign(self, tree_node)
119 rv_name = tree_node.targets[0].id
120 # Test if creation of known RV
--> 121 func = tree_node.value.func
122 if hasattr(func, "attr"):
123 call = func.attr
AttributeError: 'BinOp' object has no attribute 'func'
In PyMC3 we automatically transformed RVs to be on the real line, as required by e.g. ADVI and HMC. TFP has strong support for various bijectors we can use here. There are two options:
Initial discussions, I think, showed consensus around option 1, and I think we can just borrow from the PyMC3 implementation, which didn't cause us many problems. Especially with #70, it gives us the option to inherit from e.g. PositiveRandomVariable where appropriate.
There are no untransformed values (if the distribution requires transformation) in the trace object, since initialize_state turns state into sampling_state. Maybe we need to capture the untransformed values before the initialize_state call and then invert the appropriate RVs to get a more convenient trace. How can one best implement this?
Model:
@pm.model(auto_name=True)
def model():
    ....
    # The offending line is this one.
    nu = pm.Exponential(lam=1 / 29.0) + 1
    ...
Error message:
________________________________________________________________________ ERROR collecting pymc4/tests/test_hierarchical_comparison.py ________________________________________________________________________
pymc4/tests/test_hierarchical_comparison.py:41: in <module>
model = model.configure()
pymc4/_model.py:64: in configure
model._evaluate()
pymc4/_model.py:80: in _evaluate
self._template._func(*args, **kwargs)
pymc4/tests/test_hierarchical_comparison.py:33: in model
nu = pm.Exponential(lam=1 / 29.0) + 1
pymc4/random_variables/random_variable.py:134: in __init__
raise ValueError("No name was set. Supply one via the name kwarg.")
E ValueError: No name was set. Supply one via the name kwarg.
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 errors during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
=================================================================================== 52 deselected, 1 error in 1.79 seconds ===================================================================================
As a matter of cleanliness, we should probably dump the notebooks into a separate directory.
Maybe I don't understand the real purpose of the observed param in the sample method, but the executors do not care about these dicts and transform the distribution even if you provide new observed values. Is this intentional?
Notebook demonstrating the problem
Line 103 in a552387
Minimal example:
import copy

class My(object):
    def __init__(self):
        self.mutable = dict()

my1 = My()
my2 = copy.copy(my1)
my2.mutable.update(a=1)
print(my1.mutable)  # prints {'a': 1}, not {} - the dict is shared
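Because copy.copy only makes a shallow copy, my1.mutable and my2.mutable reference the same dict object. Avoiding that shared state would take either copy.deepcopy or a custom __copy__ that re-creates the mutable containers.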
Hello:
I have mostly worked with TF and TFP in the past, and I thought it'd be a great time to give PyMC4 a try, given that all the HMC work will be easily implemented in the backend. However, I can't seem to make a simple OLS model run properly.
I have my model code here with all the outputs: https://gist.github.com/sadatnfs/d5004f488ba00371333770059ab99776
If you look at the printed traces, you'll see that all 3 parameters of the OLS model stay at 0.5 (which I'm guessing is the value TF initialized them with), whereas when I ran the eight schools example as-is, the sampling worked fine...
I feel like I'm missing something super simple and trivial... any thoughts?
Thanks!
Currently our RVs do not accept shapes.
Following up from #54 (comment) and #54 (comment).
We rely on tfp to correctly implement many random variables. However, the random variables that we implement ourselves (e.g. the zero-inflated random variables) need to be tested for correctness.
As our random variables are now tensors, manipulations that users of pymc3 are familiar with might no longer be valid, requiring them to use tensorflow's particular syntax for manipulating tensors. I might be wrong, but advanced indexing of numpy arrays seems to no longer work and requires tf.gather_nd. Is this deviation from pymc3-like syntax desirable?
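For example, a small sketch of the difference:

import numpy as np
import tensorflow as tf

x = np.arange(12).reshape(3, 4)
x[[0, 2], [1, 3]]                 # numpy advanced indexing -> [1, 11]

t = tf.constant(x)
tf.gather_nd(t, [[0, 1], [2, 3]])  # tensorflow equivalent -> [1, 11]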
I suggest developing on 3.6, but will add Python 3.5 to Travis for now. I would love to hear if there are good reasons to keep supporting 3.5, since 3.7 is coming out so soon.
@csuter this question references #58, which is something I care deeply about - having APIs that match what a package's user base might consider idiomatic. (We all probably know about the fiasco going from py2 to py3.)
I can see how the TFP distributions API matched and extended numpy.random better than TF did for NumPy, which is admirable! I wonder if alternative parametrizations might be allowed at some point?
E.g.: in PyMC3, we used parameter names that were familiar to a statistician, e.g. "mu" and "sigma" rather than "loc" and "scale" (this admittedly breaks the general principle I usually advocate for; in this case, though, pm3's user base would be more familiar with mu and sigma than with loc and scale).
Another example: for other distributions, we allowed multiple parametrizations so that users could use the one they are most familiar with. I can foresee an objection - it's more code to maintain (though once implemented, it can be hardened by tests) - however, I think that from a user adoption and migration perspective, this would be really helpful!
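For illustration, one possible shape of such a shim (a sketch under these assumptions, not a proposed implementation):

import tensorflow_probability as tfp
tfd = tfp.distributions

def Normal(mu=0.0, sigma=None, tau=None, **kwargs):
    # Accept PyMC3-style parametrizations and translate to tfp's loc/scale.
    if tau is not None:
        sigma = tau ** -0.5  # tau is the precision: tau = 1 / sigma**2
    return tfd.Normal(loc=mu, scale=sigma, **kwargs)

Normal(mu=0.0, tau=4.0) and Normal(mu=0.0, sigma=0.5) would then construct the same distribution.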
In both azure-pipelines.yml and scripts/lint.sh, the pydocstyle check is commented out. It would be great for documentation to start catching up to development.
I started trying, but I need more context on how the classes and functions work together to do a decent job...
Following up on @junpenglao's suggestion, I decided to start converting my Bayesian analysis recipes notebook into PyMC4 code, with the hope that I could maybe explore API design and UX aspects of using PyMC4.
I was looking at the example notebooks provided by @anhhuyalex, and immediately saw in the second cell:
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
config.intra_op_parallelism_threads = 1
config.inter_op_parallelism_threads = 1
sess = tf.InteractiveSession(config=config)
The immediate thought that went through my mind was, "Wow, that's a lot of up-front config that an end-user has to do." The second thought that went through my mind was, "Now I'm reminded why I don't like TensorFlow, and would much rather work with Jax."
I'm wondering how much of this we can hide from end users. Previously with Theano, there wasn't much in-notebook configuration, with the exception of, say, setting floatX. Is hiding this configuration technically feasible, or not easily doable?
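One option might be to fold it all into a single helper on the library side. A sketch built from the exact config shown above (make_session is a hypothetical helper, not an existing API):

import tensorflow as tf

def make_session(xla=True, single_threaded=True):
    config = tf.ConfigProto()
    if xla:
        config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
    if single_threaded:
        config.intra_op_parallelism_threads = 1
        config.inter_op_parallelism_threads = 1
    return tf.InteractiveSession(config=config)

sess = make_session()  # the end-user sees one line instead of six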
Currently pymc3 enables the user to access RVs using model.RVname or model['RVname']. We should continue this functionality.
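A minimal sketch of how a model object could support both access styles (named_vars is an assumed attribute, not pymc4's actual storage):

class Model:
    def __init__(self):
        self.named_vars = {}  # assumed: name -> RV mapping

    def __getitem__(self, name):
        # model['RVname']
        return self.named_vars[name]

    def __getattr__(self, name):
        # model.RVname; only called when normal attribute lookup fails
        try:
            return self.__dict__["named_vars"][name]
        except KeyError:
            raise AttributeError(name)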
The directory structure is now pymc4/pymc4/hmc/_hmc/*.py. Is this intended?
#52 left out the DiscreteUniform distribution (it is commented out right now). I misunderstood the errors that TFP was raising, which were ironed out in tensorflow/probability#286. All that remains is to properly implement the DiscreteUniform distribution (most of it is written; we just need to iron out the type casting).
As part of our docker build we'll need to make sure to integrate properly with conda-forge.
If we eventually want to distribute on conda-forge, then users will likely be using a Python binary from conda-forge (https://anaconda.org/conda-forge/python/files?version=3.6.7) rather than defaults. Regardless, we can sort this out later; just wanted to chime in.
as @kyleabeauchamp said in #49
This is an issue that builds off from #83.
Background: We need transformations for continuous distributions, particularly bounded ones (e.g. uniform, half-normal). It helps with computation during inference.
Current design: We instantiate the appropriate distribution for an RV, and then a transformed distribution, during the initialization of the RV. The untransformed distribution is then used during forward sampling (RV.sample()), and the transformed distribution is used during log-probability computation (RV.log_prob()). I made this design choice to keep the codebase readable, so that newcomers to pm4 can read the code base and hopefully more easily grok the design of a PPL in general.
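Roughly, the design reads like this (a simplified sketch with assumed names, not the actual pm4 classes):

import tensorflow_probability as tfp
tfd, tfb = tfp.distributions, tfp.bijectors

class HalfNormalRV:
    def __init__(self, scale):
        self._untransformed = tfd.HalfNormal(scale=scale)
        # The same distribution mapped onto the real line for inference.
        self._transformed = tfd.TransformedDistribution(
            distribution=self._untransformed,
            bijector=tfb.Invert(tfb.Exp()),
        )

    def sample(self, *args, **kwargs):
        return self._untransformed.sample(*args, **kwargs)  # forward sampling

    def log_prob(self, value):
        return self._transformed.log_prob(value)  # log-probability computation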
Key questions:
Rather than debate this from a theoretical perspective only, I would like to make sure that we have measurements to back up our discussion. If there's overhead anywhere, we should be able to pinpoint where it is.
I don't see any blockers to supporting more samplers. Can anyone be assigned to this issue, or is somebody already working on it?
Implementing pymc4 is not a single-person effort, so we need a place to sync our progress. This issue should be the place where we communicate and update the status of every task on a regular basis.
- Max, Saurav: Implement the TensorFlow backend and make forward sampling work. Make sure the do operator works correctly; this is the core of the observed API (not yet discussed, should be possible with a prototype, but verbose).
- Max: Decide on the Transforms API. Should be independent from (1.).
Forward sampling and applying transforms are now split into independent pieces of functionality: forward sampling does not require transforms, and transform usage under the hood is not aware of the backend being used.
Link to implementation
Distributions currently raise an exception when logp is called. Transforms need to be implemented for all of them.
https://github.com/pymc-devs/pymc4/blob/master/pymc4/tests/test_random_variables.py#L95
Why would you want to limit yourself to only sampling once per call?
Remove the dtype argument that is hardcoded into Bernoulli, Zipf, and Categorical after the issue below is resolved.
Hello! I'm excited to see all the cool ideas going on in the new PyMC, and I'm looking forward to using it for real. I've been following the development a little, and had an idea I wanted to run by you. I'm still new to PyMC, so please correct me if I get anything wrong.
One of the distinctive features of PyMC is its usage of context managers for building models, like this:
with pm.Model() as model:
    eta = pm.Normal("eta", 0, 1, shape=J)
    mu = pm.Normal("mu", 0, sd=1e6)
    tau = pm.HalfCauchy("tau", 5)
    theta = pm.Deterministic("theta", mu + tau * eta)
    obs = pm.Normal("obs", theta, sd=sigma, observed=y)
    trace_h = pm.sample(1000)

plot_summary(model)
This kind of API is powerful in that it allows users to transparently access the sampling backend without extra work, and it makes common workflows really quick and easy. The decorator-based @pm.model API has similar advantages. The developer guide explains the power and flexibility that come out of this design.
The design also has some side effects:
I've been wondering about some possible API designs. Some of them may have been discussed and rejected already; please forgive me if I'm being redundant.
One idea that might be familiar to Python developers might be using a class per model, something like this:
@model
class MyModel:
    J = ConstantInteger()
    eta = Normal(0, 1, shape=J)
    mu = Normal(0, sd=1e6)
    tau = HalfCauchy(5)
    theta = Deterministic(mu + tau * eta)

# Any of these functions could be methods instead.
model = MyModel()
observed = observe(model, data)
trace = sample(observed)
plot_summary(trace)
I'm not 100% sure that it can do everything PyMC needs, but, from my (possibly naive) perspective, having an option like this might have some benefits:
There's a lot to explore in this design space. If this seems interesting to people, I'm happy to discuss or try out some implementation ideas, to see if something like this could be possible, and if it'd be nice. I'd love to hear your thoughts!
Currently, sample, sample_prior_predictive and sample_posterior_predictive all use a first call to some variant of evaluate_model just to get some meta information about the model (in the case of sample).
All of these will lead to problems for models that cannot generate samples from the prior, for example models with Flat priors (see this pymc3 issue).
I think the solution to this problem is to introduce a purely symbolic model parser to get the metadata we need. I think that we should do this with symbolic-pymc.
We've got a high-level sample function skeleton in place, but it would be nice to also include high-level APIs like pymc3's sample_prior_predictive and sample_posterior_predictive.
In pymc3, I can use pm.sample(5000, cores=4, tune=500, chains=4) to explicitly run 4 chains on 4 CPU cores. But I find no such parameters in pymc4's sample(model, num_results=5000, num_burnin_steps=3000, step_size=.4, num_leapfrog_steps=3, numpy=True). How can I do this?
Besides, since pymc4 builds on TensorFlow, can I manually switch to the GPU when sampling?
Thanks!
I thought it might be nice to add a logistic regression example at some point. I've tried to do this (see the code below), but I'm having some issues with constant chains and the following warning:
"tensorflow/core/common_runtime/executor.cc:642] Executor failed to create kernel. Internal: No function library
[[{{node MatVec/MatMul/pfor/cond}}]]"
I'm running on tfp-nightly (Oct 15), TF 2.0, and Python 3.6. Does anyone see anything obviously wrong here? I'm happy to add a notebook or test case for logistic regression once we get things running.
import numpy as np
import pymc4 as pm4
import tensorflow as tf
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()
X = data["data"].astype('float32')
# Standardize to avoid overflow issues
X -= X.mean(0)
X /= X.std(0)
y = data["target"]
n_samples, n_features = X.shape
X = tf.constant(X)

@pm4.model
def logistic_model():
    w = yield pm4.Normal("w", np.zeros(n_features, 'float32'), 0.01 * np.ones(n_features, 'float32'))
    z = tf.linalg.matvec(X, w)
    p = tf.math.sigmoid(z)
    obs = yield pm4.Bernoulli("obs", p, observed=y)
    return obs

def test_sample():
    tf_trace = pm4.inference.sampling.sample(
        logistic_model(), step_size=0.01, num_chains=1, num_samples=200, burn_in=0, xla=False
    )
    return tf_trace

tr = test_sample()
Filing this as an issue so we don't forget: #41 (comment)
I think we can achieve most of the important distributions using tfp.distributions.Mixture and tfp.bijectors (e.g. the zero-inflated distributions are just mixtures with the zero distribution).
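For example, a sketch of a zero-inflated Poisson built from those pieces (function name and psi parametrization are illustrative):

import tensorflow_probability as tfp
tfd = tfp.distributions

def zero_inflated_poisson(psi, rate):
    # With probability 1 - psi emit a structural zero, otherwise draw Poisson(rate).
    return tfd.Mixture(
        cat=tfd.Categorical(probs=[1.0 - psi, psi]),
        components=[tfd.Deterministic(loc=0.0), tfd.Poisson(rate=rate)],
    )

zip_dist = zero_inflated_poisson(psi=0.7, rate=3.0)
zip_dist.sample(10)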
@ColCarroll I'm assuming there will be a problem with having a distribution/RV named Deterministic? That's what tfp calls the thing that PyMC calls the Constant distribution. If so, I can write something to remap names.
Continuous:
Discrete:
Multivariate:
Timeseries: