uber / orbit

A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.

Home Page: https://orbit-ml.readthedocs.io/en/stable/

License: Other

Python 93.86% Stan 6.09% Dockerfile 0.05%

python forecasting bayesian exponential-smoothing pyro stan pystan probabilistic-programming probabilistic forecast

orbit's Issues

Revisit DLT to make both global and local trend damped

Implement DLT / LGT with a multiplicative option

The current implementation is the additive form. Applying a log transformation before fit and an exp transformation after predict turns it into a multiplicative model.

This can be integrated directly into the model classes.

It would also allow Backtesting to work as intended on the original scale of the data, without having to implement transformation function callbacks.
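
A minimal sketch of the log/exp wrapping described above. It assumes a stand-in additive model (a least-squares linear trend), not Orbit's LGT or DLT; the class names `LinearTrend` and `MultiplicativeWrapper` are hypothetical and exist only for illustration.

```python
import math


class LinearTrend:
    """Stand-in additive model (hypothetical): ordinary least-squares line."""

    def fit(self, y):
        n = len(y)
        t_mean = (n - 1) / 2
        y_mean = sum(y) / n
        sxx = sum((t - t_mean) ** 2 for t in range(n))
        sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
        self.slope = sxy / sxx
        self.intercept = y_mean - self.slope * t_mean
        self.n = n
        return self

    def predict(self, steps):
        # Forecast the next `steps` points after the training window.
        return [self.intercept + self.slope * (self.n + h) for h in range(steps)]


class MultiplicativeWrapper:
    """Fit the additive model on log(y); return forecasts on the original scale."""

    def __init__(self, model):
        self.model = model

    def fit(self, y):
        if min(y) <= 0:
            raise ValueError("a log transform requires strictly positive data")
        self.model.fit([math.log(v) for v in y])
        return self

    def predict(self, steps):
        return [math.exp(v) for v in self.model.predict(steps)]


# Exponential growth is linear on the log scale, so the wrapped model recovers it.
history = [100 * 1.1 ** t for t in range(10)]
forecast = MultiplicativeWrapper(LinearTrend()).fit(history).predict(2)
```

Because the wrapper owns both transformations, a caller such as a backtester only ever sees values on the original scale.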

Reparameterization of sigma

Right now we use bounded Cauchy for both MAP and NUTS:

obs_sigma ~ cauchy(0, CAUCHY_SD) T[0,];

It works well for MAP. However, it may be related to slowness in NUTS, so the following is suggested for NUTS:

real<lower=0, upper=pi()/2> obs_sigma_unif_dummy;
obs_sigma = CAUCHY_SD * tan(obs_sigma_unif_dummy);

We may need to split the Stan code to handle NUTS vs. the rest, or MAP vs. the rest.
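
The equivalence behind the suggested reparameterization can be checked numerically. The sketch below is plain Python rather than Stan: it draws a uniform variable on (0, pi/2), applies the tangent transform, and compares the sample median with the half-Cauchy median, which equals the scale. The value of `CAUCHY_SD` is an illustrative assumption; the source does not state it.

```python
import math
import random
import statistics

CAUCHY_SD = 2.0  # illustrative scale (assumption); not taken from the source


def sample_obs_sigma(rng, scale=CAUCHY_SD):
    """Draw from a half-Cauchy(0, scale) via the tangent transform."""
    u = rng.uniform(0.0, math.pi / 2)  # plays the role of obs_sigma_unif_dummy
    return scale * math.tan(u)


rng = random.Random(0)
draws = [sample_obs_sigma(rng) for _ in range(100_000)]

# The half-Cauchy median equals its scale, because tan(pi / 4) == 1.
median_sigma = statistics.median(draws)
```

The sampler then explores the bounded uniform variable instead of the heavy-tailed `obs_sigma` directly, which is the usual motivation for this transform.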

incompatible concatenation in MAP estimate

In some scenarios, such as one regular regressor and two positive regressors, the MAP estimate raises an error about an incompatible concatenation...

Refactor lgt vs. dlt constants grouping

Better Pyro Examples and Documentation

Publish Sphinx API documentation

Set up sphinx autodoc dependencies
Author documentation RST pages
Publish docs/ to GitHub Pages

Redesign Backtest Module

Redesign backtest module so Backtest is initialized only with the data.
Make models a function of _fit() instead of an attribute of Backtest.

Create Contributing.md document

Need to author documentation on the process for contributing

Regression Coef Penalty

get a dataset for benchmarking/testing
L1/L2 Penalty
Total Coef Penalty
Set Positive Reg Coef as zero instead of rejecting sample
Variable selection with spike and slab

Just some thoughts; we do not necessarily need to complete all of them.

Decimal Seasonality

For a fractional cycle, such as weekly seasonality

Reorganize stan input mapper and constants

create separate enum class to handle individual model

Implement estimator for multiple time series

Implement a class to handle multiple univariate time series cases. In other words, a vector of response columns, but each series is still independent

better log / warning / error handling

provide diagnostic plots for MCMC sampled params

histogram/density
paired plots
trace plot (not doable for the time being, because such info is not available)

Separate plotting module instead of utils

Update API docs and more examples with regressors

LGT / DLT docstrings
Better definitions of each parameter within docstrings
Examples with priors
Pyro docstrings

run_multi_series_backtest not working with MCMC models

Plot components of prediction with decomposed predicted_df

auto_scale=True with regressors errors out

When auto_scale=True is used with regressors, the following error is raised:

TypeError Traceback (most recent call last)
in
1 from sklearn.preprocessing import MinMaxScaler
2 regressor_min_max_scaler = MinMaxScaler(1, 2.719)
----> 3 df[regressor_col] = regressor_min_max_scaler.fit_transform(df[regressor_col])

~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
569 if y is None:
570 # fit method of arity 1 (unsupervised transformation)
--> 571 return self.fit(X, **fit_params).transform(X)
572 else:
573 # fit method of arity 2 (supervised transformation)

~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in fit(self, X, y)
337 # Reset internal state before fitting
338 self._reset()
--> 339 return self.partial_fit(X, y)
340
341 def partial_fit(self, X, y=None):

~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in partial_fit(self, X, y)
361 """
362 feature_range = self.feature_range
--> 363 if feature_range[0] >= feature_range[1]:
364 raise ValueError("Minimum of desired feature range must be smaller"
365 " than maximum. Got %s." % str(feature_range))

TypeError: 'int' object is not subscriptable
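
The traceback points at `MinMaxScaler(1, 2.719)`. The first positional parameter of `MinMaxScaler` is `feature_range`, so it appears to receive the integer `1` instead of a `(min, max)` tuple, which would explain `'int' object is not subscriptable`. A minimal sketch of the likely fix, using hypothetical regressor values:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical regressor values; fit_transform expects a 2-D array.
regressors = np.array([[0.0], [5.0], [10.0]])

# feature_range must be a (min, max) tuple, so pass it by keyword.
scaler = MinMaxScaler(feature_range=(1, 2.719))
scaled = scaler.fit_transform(regressors)
```

Passing the range by keyword also keeps the call valid on scikit-learn versions that accept the constructor arguments as keywords only.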

Regressor beta Logic with torch.tensor

line 414 in lgt.py:

regressor_beta = pr_beta or rr_beta

does not work with torch.Tensor.
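
The `or` operator truth-tests its left operand, and a tensor with more than one element raises when converted to a boolean. One possible replacement is an explicit `None` check, sketched below. The helper name `first_not_none` is hypothetical, and the sketch assumes an unused beta is represented by `None`; `AmbiguousTruth` is only a stand-in for a multi-element tensor so the example runs without torch.

```python
def first_not_none(*candidates):
    """Return the first candidate that is not None, without calling bool() on it."""
    for candidate in candidates:
        if candidate is not None:
            return candidate
    return None


class AmbiguousTruth:
    """Stand-in for a multi-element tensor: truth-testing it raises."""

    def __bool__(self):
        raise RuntimeError("Boolean value of Tensor with more than one value is ambiguous")


pr_beta, rr_beta = None, AmbiguousTruth()

# `pr_beta or rr_beta` would only work here because pr_beta is None;
# `rr_beta or pr_beta` would raise. The identity check never truth-tests.
regressor_beta = first_not_none(pr_beta, rr_beta)
```

Note that, like the original line, this picks a single beta; it does not combine positive and regular coefficients when both are present.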

Refactor to store `stan.sampling` directly to enable diagnostic methods

Currently we store stan.sampling.extract() as self.posterior_samples. However, we want to be able to retrieve chain level information from stan.sampling directly to enable diagnostic methods.

However, we can't currently retrieve stan.sampling directly since we never store the attribute.

There are two proposed solutions:

Store stan.sampling at the end of fit, and only call stan.sampling.extract() for downstream methods such as predict, plot, diagnostic.
Store self.posterior_samples as-is, and additionally store stan.sampling.to_dataframe() to something like self.posterior_samples_chain.

The first may require refactoring other methods for cases where we don't use stan.sampling, for example if we fit using VI, MAP, or Pyro.

The second requires storing double the information that we would otherwise. An alternative approach to the second method is to parse the dataframe to the same state that our current self.posterior_samples is in, but poses a challenge because matrix samples are stored as a single column in a dataframe.

Another alternative to the second method is to store only the chain info from the dataframe, but we'd have to guarantee order preservation between the dataframe and arrays in stan.sampling.extract()
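
A rough sketch of the first proposal: keep the fit object and derive `posterior_samples` on demand. `FakeStanFit` is a hypothetical stand-in for whatever Stan sampling returns, and the `Estimator` class here is illustrative only, not Orbit's estimator.

```python
class FakeStanFit:
    """Hypothetical stand-in for the object returned by Stan sampling."""

    def extract(self):
        return {"obs_sigma": [0.9, 1.1, 1.0]}


class Estimator:
    """Sketch of proposal 1: keep the fit object, derive samples lazily."""

    def __init__(self):
        self._stan_fit = None
        self._posterior_samples = None

    def fit(self, stan_fit):
        self._stan_fit = stan_fit
        self._posterior_samples = None  # invalidate any cached extraction
        return self

    @property
    def posterior_samples(self):
        # Extract once, on first access; later reads reuse the cached dict.
        if self._posterior_samples is None:
            if self._stan_fit is None:
                raise RuntimeError("call fit() before reading posterior_samples")
            self._posterior_samples = self._stan_fit.extract()
        return self._posterior_samples


estimator = Estimator().fit(FakeStanFit())
samples = estimator.posterior_samples
```

Downstream methods such as predict keep reading `posterior_samples`, while diagnostic methods can reach the stored fit object for chain-level information.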

Pyro Estimation with Regressors

RuntimeError Traceback (most recent call last)
in
1 # make prediction of past and future
----> 2 predicted_df = lgt_reg_map.predict(df=df, decompose=True)
3 predicted_df.head(5)

~/work/orbit-super/orbit/orbit/estimator.py in predict(self, df, decompose)
567 # prediction
568 predicted_dict = self._predict(
--> 569 df=df, include_error=False, decompose=decompose
570 )
571

~/work/orbit-super/orbit/orbit/lgt.py in _predict(self, df, include_error, decompose)
472 regressor_matrix = df[self.regressor_col].values
473 regressor_torch = torch.from_numpy(regressor_matrix)
--> 474 regressor_component = torch.matmul(regressor_torch, regressor_beta)
475 regressor_component = regressor_component.t()
476 else:

RuntimeError: Expected object of scalar type Double but got scalar type Float for argument #2 'mat2' in call to _th_mm
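
The error message indicates a dtype mismatch: `torch.from_numpy` preserves the float64 dtype of a default NumPy array, while tensors created by torch default to float32. A minimal sketch of one way to align them before the matrix product; the shapes and values are hypothetical.

```python
import numpy as np
import torch

# Hypothetical shapes: 3 rows, 2 regressors, 1 posterior sample.
regressor_matrix = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # float64
regressor_beta = torch.tensor([[0.5], [0.25]])  # float32 by default

# from_numpy keeps float64, so cast to the dtype of the coefficients before matmul.
regressor_torch = torch.from_numpy(regressor_matrix).to(regressor_beta.dtype)
regressor_component = torch.matmul(regressor_torch, regressor_beta).t()
```

Casting toward the dtype of the sampled coefficients is one choice; casting everything to double would work as well, as long as both operands agree.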

Better message/handling for invalid combinations of predict and sample methods

For example:

when the sample method is "map", show a better message and enforce the predict method "map"
when the sample method is "mcmc"/"vi" and the predict method is "map", enforce a "mean" predict method instead

Or should we just raise an error for those invalid combinations?
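
The two rules above could be sketched as a small resolver that warns and substitutes a compatible predict method. The function name `resolve_predict_method` is hypothetical; raising an error instead of warning would be a one-line change in each branch.

```python
import warnings


def resolve_predict_method(sample_method, predict_method):
    """Return a predict method compatible with the sample method, warning on changes."""
    if sample_method == "map" and predict_method != "map":
        warnings.warn(
            "sample_method='map' only supports predict_method='map'; "
            f"overriding {predict_method!r}."
        )
        return "map"
    if sample_method in ("mcmc", "vi") and predict_method == "map":
        warnings.warn(
            f"predict_method='map' is not available with sample_method={sample_method!r}; "
            "using 'mean' instead."
        )
        return "mean"
    # Any other combination is passed through unchanged.
    return predict_method
```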

Implement damped LGT class

Implement the concrete class for Damped LGT (DLT), which is now separate from the main LGT model.

implement the functionality of using backtest to do the hyper-parameter tuning

one main use case of backtest

Methods to plot and view distribution of posterior samples

bug with pyro and MAP

The following error is raised when running:
predicted_df = lgt_map.predict(df=test_df)

RuntimeError Traceback (most recent call last)
in
----> 1 predicted_df = lgt_map.predict(df=test_df)

~/work/orbit-super/orbit/orbit/estimator.py in predict(self, df, decompose)
566 # prediction
567 predicted_dict = self._predict(
--> 568 df=df, include_error=False, decompose=decompose
569 )
570

~/work/orbit-super/orbit/orbit/lgt.py in _predict(self, df, include_error, decompose)
504 trend_forecast_matrix
505 = torch.zeros((num_sample, trend_forecast_length), dtype=torch.double)
--> 506 trend_component = torch.cat((local_global_trend_sums, trend_forecast_matrix), dim=1)
507
508 last_local_trend_level = local_trend_levels[:, -1]

RuntimeError: Expected object of scalar type Float but got scalar type Double for sequence element 1 in sequence argument at position #1 'tensors'

Stability issue of DLT

organize, clean and validate inputs of inference engine x sample method x predict method

have fitted model objects saved in backtest iterations

According to some users, it would be helpful to have the models saved for future use, such as calculating in-sample errors and checking the posterior coefficient estimates...

More robust unit tests

Current unit tests should be parameterized instead of static
Stronger assertions with fixtures for actual values
Parameterization for more variants of init / fit / predict args
Speed up run time without compromising the tests themselves

Backtest batch methods should be written outside class as static functions

Batch backtesting methods such as fit_score_batch or _fit_batch can be refactored as static functions outside of the class.

Once implemented, we should deprecate fit_score_batch and remove from examples

Error with DLT._get_param_names()

Error message: 'DLT' object has no attribute 'global_trend_coef_min'
DLT inherits nonexistent params from LGT

Include meta data in backtest.py

Can we also include two pieces of metadata?

in _predicted_df, append the train end (if date_col is available in the splitter)
in _score_df, append the number of splits, so users know how many splits were conducted

Set up coverage tests for travis CI

fix demo script in read me

Warning raised by calling plotting utils should be investigated

orbit/orbit/utils/utils.py

Lines 13 to 15 in e449497

if os.environ.get('DISPLAY', '') == '':
    print('no display found. Using non-interactive Agg backend')
    matplotlib.use('Agg')

The above lines are raising a warning / exception in unit tests for test_backtest.py

Warning Output:

orbit/utils/utils.py:15
  /home/travis/build/uber/orbit/orbit/utils/utils.py:15: UserWarning: 
  This call to matplotlib.use() has no effect because the backend has already
  been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
  or matplotlib.backends is imported for the first time.
  
  The backend was *originally* set to 'TkAgg' by the following code:
    File "setup.py", line 66, in <module>
      'Programming Language :: Python :: 3.7',
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/setuptools/__init__.py", line 145, in setup
      return distutils.core.setup(**attrs)
    File "/opt/python/3.7.1/lib/python3.7/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/opt/python/3.7.1/lib/python3.7/distutils/dist.py", line 966, in run_commands
      self.run_command(cmd)
    File "/opt/python/3.7.1/lib/python3.7/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/setuptools/command/test.py", line 237, in run
      self.run_tests()
    File "setup.py", line 39, in run_tests
      errcode = pytest.main(self.test_args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/config/__init__.py", line 79, in main
      return config.hook.pytest_cmdline_main(config=config)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
      return self._hookexec(self, self.get_hookimpls(), kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
      return self._inner_hookexec(hook, methods, kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
      firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
      res = hook_impl.function(*args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 242, in pytest_cmdline_main
      return wrap_session(config, _main)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 209, in wrap_session
      session.exitstatus = doit(config, session) or 0
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 248, in _main
      config.hook.pytest_collection(session=session)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
      return self._hookexec(self, self.get_hookimpls(), kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
      return self._inner_hookexec(hook, methods, kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
      firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
      res = hook_impl.function(*args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 258, in pytest_collection
      return session.perform_collect()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 485, in perform_collect
      items = self._perform_collect(args, genitems)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 524, in _perform_collect
      self.items.extend(self.genitems(node))
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 762, in genitems
      for x in self.genitems(subnode):
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 759, in genitems
      rep = collect_one_node(node)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 407, in collect_one_node
      rep = ihook.pytest_make_collect_report(collector=collector)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
      return self._hookexec(self, self.get_hookimpls(), kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
      return self._inner_hookexec(hook, methods, kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
      firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
      res = hook_impl.function(*args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 289, in pytest_make_collect_report
      call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 226, in from_call
      result = func()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 289, in <lambda>
      call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 435, in collect
      self._inject_setup_module_fixture()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 447, in _inject_setup_module_fixture
      setup_module = _get_non_fixture_func(self.obj, "setUpModule")
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 251, in obj
      self._obj = obj = self._getobj()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 432, in _getobj
      return self._importtestmodule()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 499, in _importtestmodule
      mod = self.fspath.pyimport(ensuresyspath=importmode)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/py/_path/local.py", line 668, in pyimport
      __import__(modname)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/assertion/rewrite.py", line 296, in load_module
      six.exec_(co, mod.__dict__)
    File "/home/travis/build/uber/orbit/tests/test_backtest.py", line 6, in <module>
      from orbit.utils.constants import BacktestFitColumnNames
    File "/home/travis/build/uber/orbit/orbit/utils/constants.py", line 4, in <module>
      from orbit.utils.utils import get_parent_path
    File "/home/travis/build/uber/orbit/orbit/utils/utils.py", line 6, in <module>
      import matplotlib.pyplot as plt
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/matplotlib/pyplot.py", line 71, in <module>
      from matplotlib.backends import pylab_setup
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/matplotlib/backends/__init__.py", line 17, in <module>
      line for line in traceback.format_stack()
  
  
    matplotlib.use('Agg')
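
The warning text says `matplotlib.use()` must run before `matplotlib.pyplot` is imported, and the trace shows utils.py importing pyplot on line 6, ahead of the `use('Agg')` call on line 15. A minimal sketch of the reordering, which is an assumption about the fix rather than the change actually adopted:

```python
import os

import matplotlib

# Select the backend before pyplot is imported anywhere in the process.
if os.environ.get("DISPLAY", "") == "":
    matplotlib.use("Agg")

import matplotlib.pyplot as plt  # noqa: E402  (import intentionally after use())
```

Any module that imports pyplot earlier in the process would still win, so the backend selection belongs as close to the package entry point as possible.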

RuntimeError occurs when LBFGS fails in MAP estimate

For certain series, LBFGS, the default algorithm for the MAP estimate, may fail to converge, and a RuntimeError is then raised after call_sampler in pystan.

Implement estimator for multivariate time series

config of pyro

Enhance control of the Pyro configuration, such as steps, messages, etc.

Create args for stan and pyro map

There are some args that can be configured in pyro_map.

Rebuild backtest error metrics group by prediction horizons

1. append meta data of prediction steps
2. group by calculation
3. plotting

Items 1 and 2 should be ready.

lgt = LGT(
    regressor_cols=['feature1', 'feature2', 'feature3', 'feature4', 'feature5'],
    regressor_beta_prior={'feature1': 100, 'feature4': 1000} ,
    regressor_sigma_prior={'feature1': 20, 'feature4': 40}
)
