uber / orbit
A Python package for Bayesian forecasting with object-oriented design and probabilistic models under the hood.
Home Page: https://orbit-ml.readthedocs.io/en/stable/
License: Other
The current implementation is the additive form, but applying a log transformation before fit and an exp transformation after predict makes it a multiplicative model.
This can be integrated directly into the model classes.
Further, this allows backtesting to work as intended on the original scale of the data, without having to implement transformation function callbacks.
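A minimal sketch of the idea, assuming a model exposing `fit(df)` / `predict(df)` (the wrapper class, the toy `MeanModel`, and the interface are illustrative, not orbit's actual API):

```python
import numpy as np
import pandas as pd

class MultiplicativeWrapper:
    """Fit an additive model on log(y); exponentiate predictions back."""
    def __init__(self, model, response_col):
        self.model = model
        self.response_col = response_col

    def fit(self, df):
        df = df.copy()
        df[self.response_col] = np.log(df[self.response_col])  # additive on the log scale
        self.model.fit(df)
        return self

    def predict(self, df):
        predicted = self.model.predict(df)
        return np.exp(predicted)  # back to the original (multiplicative) scale

class MeanModel:
    """Toy additive model: always predicts the training mean."""
    def fit(self, df):
        self.mean_ = df['y'].mean()
    def predict(self, df):
        return pd.Series([self.mean_] * len(df))

train = pd.DataFrame({'y': [1.0, 10.0, 100.0]})
wrapper = MultiplicativeWrapper(MeanModel(), 'y').fit(train)
pred = wrapper.predict(train)  # geometric mean of y, i.e. 10.0
```

Because the wrapper owns both transforms, a backtester can score `pred` against the untransformed response directly.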
Right now we use a bounded Cauchy for both MAP and NUTS:
obs_sigma ~ cauchy(0, CAUCHY_SD) T[0,];
It works well for MAP. However, it may be related to slowness in NUTS, so the following reparameterization is suggested for NUTS:
real<lower=0, upper=pi()/2> obs_sigma_unif_dummy;
obs_sigma = CAUCHY_SD * tan(obs_sigma_unif_dummy);
We may need to split the Stan code to handle NUTS vs. the rest, or MAP vs. the rest.
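The reparameterization works because `scale * tan(u * pi/2)` with `u ~ Uniform(0, 1)` is exactly the inverse-CDF transform of a half-Cauchy. A quick stdlib-only sanity check of that identity (independent of the Stan code):

```python
import math

def half_cauchy_inv_cdf(u, scale):
    """Inverse CDF of half-Cauchy(0, scale), whose CDF is F(x) = (2/pi) * atan(x/scale)."""
    return scale * math.tan(u * math.pi / 2)

def half_cauchy_cdf(x, scale):
    return (2 / math.pi) * math.atan(x / scale)

# u = 0.5 maps to the median of half-Cauchy(0, scale), which equals scale
assert abs(half_cauchy_inv_cdf(0.5, scale=3.0) - 3.0) < 1e-12

# round-trip: CDF of the inverse CDF recovers u
for u in (0.1, 0.5, 0.9):
    assert abs(half_cauchy_cdf(half_cauchy_inv_cdf(u, 2.0), 2.0) - u) < 1e-12
```

The uniform dummy gives NUTS a bounded, light-tailed parameter to explore, which is the usual motivation for this transform.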
For some scenarios, such as one regular regressor and two positive regressors, an error message pops up regarding incompatible concatenation for the MAP estimate...
Deploy docs/ to GitHub Pages.
Redesign the backtest module so Backtest is initialized only with the data. Make models a function of _fit() instead of an attribute of Backtest.
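A hypothetical sketch of that redesigned interface; the constructor arguments and method names follow the note above but are assumptions, not the shipped design:

```python
class Backtest:
    """Initialized only with the data; models are per-call arguments."""
    def __init__(self, df, min_train_len, forecast_len, incremental_len):
        self.df = df
        self.min_train_len = min_train_len
        self.forecast_len = forecast_len
        self.incremental_len = incremental_len

    def _fit(self, model, train_df):
        # the model is passed in, not stored as an attribute of Backtest
        model.fit(train_df)
        return model

    def fit_score(self, model):
        """Iterate train/test splits, fitting via _fit() per split."""
        ...  # splitting and scoring logic elided
```

This keeps one Backtest instance reusable across many candidate models over the same data.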
Need to author documentation on the process for contributing
We can use it to track cost vs. accuracy.
Just some thoughts; not all of them are necessarily complete.
For fractional cycles such as weekly seasonality.
Create a separate enum class to handle individual models.
Implement a class to handle the multiple univariate time series case. In other words, a vector of response columns, where each series is still independent.
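A minimal sketch of such a wrapper; the class name, the `model_factory` pattern, and the per-model `fit(df, response_col=...)` signature are all hypothetical:

```python
class DummyModel:
    """Stand-in for a univariate model with an assumed fit(df, response_col) signature."""
    def fit(self, df, response_col):
        self.response_col = response_col
        return self

class MultiSeriesForecaster:
    """Fit one independent model per response column."""
    def __init__(self, model_factory, response_cols):
        self.model_factory = model_factory   # callable returning a fresh model
        self.response_cols = response_cols
        self.models_ = {}

    def fit(self, df):
        for col in self.response_cols:
            # each series gets its own independently fitted model
            self.models_[col] = self.model_factory().fit(df, response_col=col)
        return self

forecaster = MultiSeriesForecaster(DummyModel, ['y1', 'y2']).fit(df=None)
```

Since the series are independent, the per-column loop is also trivially parallelizable later.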
When auto_scale=True with regressors, the following error message pops out:
TypeError Traceback (most recent call last)
in
1 from sklearn.preprocessing import MinMaxScaler
2 regressor_min_max_scaler = MinMaxScaler(1, 2.719)
----> 3 df[regressor_col] = regressor_min_max_scaler.fit_transform(df[regressor_col])
~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
569 if y is None:
570 # fit method of arity 1 (unsupervised transformation)
--> 571 return self.fit(X, **fit_params).transform(X)
572 else:
573 # fit method of arity 2 (supervised transformation)
~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in fit(self, X, y)
337 # Reset internal state before fitting
338 self._reset()
--> 339 return self.partial_fit(X, y)
340
341 def partial_fit(self, X, y=None):
~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in partial_fit(self, X, y)
361 """
362 feature_range = self.feature_range
--> 363 if feature_range[0] >= feature_range[1]:
364 raise ValueError("Minimum of desired feature range must be smaller"
365 " than maximum. Got %s." % str(feature_range))
TypeError: 'int' object is not subscriptable
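The immediate cause is visible in the snippet itself: `MinMaxScaler(1, 2.719)` passes the range bounds as two separate positional arguments, so scikit-learn ends up with `feature_range=1` (an int) and later indexes into it (in newer scikit-learn versions the call fails even earlier, since `copy` is keyword-only). The bounds must be a single tuple:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X = np.array([[0.0], [5.0], [10.0]])

# wrong: bounds as two positional args -> TypeError
try:
    MinMaxScaler(1, 2.719).fit_transform(X)
except TypeError:
    pass

# right: pass the bounds as one feature_range tuple
scaled = MinMaxScaler(feature_range=(1, 2.719)).fit_transform(X)
# the column is rescaled linearly into [1, 2.719]
```

If auto_scale constructs the scaler internally, the fix is the same there: build it with `feature_range=(lower, upper)`.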
Line 414 in lgt.py:
regressor_beta = pr_beta or rr_beta
does not work with torch.Tensor.
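The `or` short-circuit evaluates the left operand's truth value, and a torch.Tensor with more than one element refuses bool coercion. An explicit None check sidesteps this (a sketch; the variable names mirror the line above, the values are made up):

```python
import torch

pr_beta = torch.ones(2, 3)    # multi-element tensor
rr_beta = torch.zeros(2, 3)

# bool(pr_beta) is ambiguous for multi-element tensors -> RuntimeError
try:
    regressor_beta = pr_beta or rr_beta
except RuntimeError:
    pass

# explicit None check works regardless of tensor contents
regressor_beta = pr_beta if pr_beta is not None else rr_beta
```

This also avoids the silent bug where an all-zeros 1-element tensor would be falsy and get skipped.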
Currently we store stan.sampling.extract() as self.posterior_samples. However, we want to be able to retrieve chain-level information from stan.sampling directly to enable diagnostic methods. We can't currently retrieve stan.sampling since we never store the attribute.
There are two proposed solutions:
1. Store stan.sampling at the end of fit, and only call stan.sampling.extract() for downstream methods such as predict, plot, and diagnostics.
2. Keep self.posterior_samples as-is, and additionally store stan.sampling.to_dataframe() in something like self.posterior_samples_chain.
The first may require refactoring of other methods for the cases where we don't use stan.sampling, for example if we fit using VI, MAP, or Pyro.
The second requires storing double the information we otherwise would. An alternative approach to the second method is to parse the dataframe into the same state that our current self.posterior_samples is in, but this poses a challenge because matrix samples are stored as a single column in a dataframe.
Another alternative to the second method is to store only the chain info from the dataframe, but we'd have to guarantee order preservation between the dataframe and the arrays in stan.sampling.extract()
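A minimal sketch of the first option, storing the raw fit object and extracting lazily (pure-Python stand-in; the real stan fit object has a much richer interface, and the attribute names here are assumptions):

```python
class Estimator:
    """Store the raw sampling result; derive posterior_samples on demand."""
    def fit(self, stan_fit):
        self._stan_fit = stan_fit          # kept for chain-level diagnostics
        self._posterior_samples = None     # extracted lazily, at most once
        return self

    @property
    def posterior_samples(self):
        if self._posterior_samples is None:
            self._posterior_samples = self._stan_fit.extract()
        return self._posterior_samples
```

Downstream methods keep reading `self.posterior_samples` unchanged, while diagnostics can reach `self._stan_fit` directly; non-Stan fit paths (VI, MAP, Pyro) would just set `_posterior_samples` themselves.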
RuntimeError Traceback (most recent call last)
in
1 # make prediction of past and future
----> 2 predicted_df = lgt_reg_map.predict(df=df, decompose=True)
3 predicted_df.head(5)
~/work/orbit-super/orbit/orbit/estimator.py in predict(self, df, decompose)
567 # prediction
568 predicted_dict = self._predict(
--> 569 df=df, include_error=False, decompose=decompose
570 )
571
~/work/orbit-super/orbit/orbit/lgt.py in _predict(self, df, include_error, decompose)
472 regressor_matrix = df[self.regressor_col].values
473 regressor_torch = torch.from_numpy(regressor_matrix)
--> 474 regressor_component = torch.matmul(regressor_torch, regressor_beta)
475 regressor_component = regressor_component.t()
476 else:
RuntimeError: Expected object of scalar type Double but got scalar type Float for argument #2 'mat2' in call to _th_mm
For example,
Or should we just raise an error for those wrong combinations?
Implement the concrete class for the damped LGT model (DLT), which is now separate from the main LGT model.
One main use case of backtest.
RuntimeError Traceback (most recent call last)
in
----> 1 predicted_df = lgt_map.predict(df=test_df)
~/work/orbit-super/orbit/orbit/estimator.py in predict(self, df, decompose)
566 # prediction
567 predicted_dict = self._predict(
--> 568 df=df, include_error=False, decompose=decompose
569 )
570
~/work/orbit-super/orbit/orbit/lgt.py in _predict(self, df, include_error, decompose)
504 trend_forecast_matrix
505 = torch.zeros((num_sample, trend_forecast_length), dtype=torch.double)
--> 506 trend_component = torch.cat((local_global_trend_sums, trend_forecast_matrix), dim=1)
507
508 last_local_trend_level = local_trend_levels[:, -1]
RuntimeError: Expected object of scalar type Float but got scalar type Double for sequence element 1 in sequence argument at position #1 'tensors'
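Both tracebacks stem from the same mismatch: `torch.from_numpy` on a float64 array yields a Double tensor, while tensors created elsewhere default to Float (or are explicitly Double while their partners are Float). Casting to a common dtype before `matmul`/`cat` resolves it (illustrative names and shapes, not orbit's code):

```python
import numpy as np
import torch

# numpy float64 -> Double tensor
regressor_torch = torch.from_numpy(np.ones((5, 3)))
regressor_beta = torch.rand(3, 4)                    # Float by default

# cast to a common dtype before mixing tensors in matmul
regressor_component = torch.matmul(regressor_torch, regressor_beta.double())

local_global_trend_sums = torch.rand(4, 5)           # Float
trend_forecast_matrix = torch.zeros((4, 2), dtype=torch.double)
# cast to a common dtype before cat as well
trend_component = torch.cat(
    (local_global_trend_sums.double(), trend_forecast_matrix), dim=1
)
```

A more systematic fix is to pick one dtype (e.g. `torch.double`) at the top of `_predict` and cast every tensor on creation.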
Per some users, it would be helpful to have the models saved for future use, such as calculating in-sample errors and checking the posterior coefficient estimates...
Batch backtesting methods such as fit_score_batch or _fit_batch can be refactored as static functions outside of the class.
Once implemented, we should deprecate fit_score_batch and remove it from the examples.
Error message: 'DLT' object has no attribute 'global_trend_coef_min'
DLT inherits nonexistent params from LGT.
Can we also include two pieces of metadata:
1. In _predicted_df, append the train end (if date_col is available in the splitter).
2. In _score_df, append the number of splits, as a reference so the user knows how many splits were conducted.
Lines 13 to 15 in e449497
The above lines raise a warning/exception in unit tests for test_backtest.py.
Warning Output:
orbit/utils/utils.py:15
/home/travis/build/uber/orbit/orbit/utils/utils.py:15: UserWarning:
This call to matplotlib.use() has no effect because the backend has already
been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
or matplotlib.backends is imported for the first time.
The backend was *originally* set to 'TkAgg' by the following code:
File "setup.py", line 66, in <module>
'Programming Language :: Python :: 3.7',
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/setuptools/__init__.py", line 145, in setup
return distutils.core.setup(**attrs)
File "/opt/python/3.7.1/lib/python3.7/distutils/core.py", line 148, in setup
dist.run_commands()
File "/opt/python/3.7.1/lib/python3.7/distutils/dist.py", line 966, in run_commands
self.run_command(cmd)
File "/opt/python/3.7.1/lib/python3.7/distutils/dist.py", line 985, in run_command
cmd_obj.run()
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/setuptools/command/test.py", line 237, in run
self.run_tests()
File "setup.py", line 39, in run_tests
errcode = pytest.main(self.test_args)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/config/__init__.py", line 79, in main
return config.hook.pytest_cmdline_main(config=config)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
return self._hookexec(self, self.get_hookimpls(), kwargs)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
return self._inner_hookexec(hook, methods, kwargs)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
res = hook_impl.function(*args)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 242, in pytest_cmdline_main
return wrap_session(config, _main)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 209, in wrap_session
session.exitstatus = doit(config, session) or 0
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 248, in _main
config.hook.pytest_collection(session=session)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
return self._hookexec(self, self.get_hookimpls(), kwargs)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
return self._inner_hookexec(hook, methods, kwargs)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
res = hook_impl.function(*args)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 258, in pytest_collection
return session.perform_collect()
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 485, in perform_collect
items = self._perform_collect(args, genitems)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 524, in _perform_collect
self.items.extend(self.genitems(node))
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 762, in genitems
for x in self.genitems(subnode):
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 759, in genitems
rep = collect_one_node(node)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 407, in collect_one_node
rep = ihook.pytest_make_collect_report(collector=collector)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
return self._hookexec(self, self.get_hookimpls(), kwargs)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
return self._inner_hookexec(hook, methods, kwargs)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
res = hook_impl.function(*args)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 289, in pytest_make_collect_report
call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 226, in from_call
result = func()
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 289, in <lambda>
call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 435, in collect
self._inject_setup_module_fixture()
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 447, in _inject_setup_module_fixture
setup_module = _get_non_fixture_func(self.obj, "setUpModule")
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 251, in obj
self._obj = obj = self._getobj()
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 432, in _getobj
return self._importtestmodule()
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 499, in _importtestmodule
mod = self.fspath.pyimport(ensuresyspath=importmode)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/py/_path/local.py", line 668, in pyimport
__import__(modname)
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/assertion/rewrite.py", line 296, in load_module
six.exec_(co, mod.__dict__)
File "/home/travis/build/uber/orbit/tests/test_backtest.py", line 6, in <module>
from orbit.utils.constants import BacktestFitColumnNames
File "/home/travis/build/uber/orbit/orbit/utils/constants.py", line 4, in <module>
from orbit.utils.utils import get_parent_path
File "/home/travis/build/uber/orbit/orbit/utils/utils.py", line 6, in <module>
import matplotlib.pyplot as plt
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/matplotlib/pyplot.py", line 71, in <module>
from matplotlib.backends import pylab_setup
File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/matplotlib/backends/__init__.py", line 17, in <module>
line for line in traceback.format_stack()
matplotlib.use('Agg')
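The warning means `matplotlib.use()` in orbit/utils/utils.py runs after pyplot has already been imported (by setup.py's test machinery here). The usual fix is to select the backend before the first pyplot import, or not to force a backend in library code at all:

```python
# Sketch of the ordering fix: pick the backend *before* pyplot is imported.
import matplotlib
matplotlib.use('Agg')            # must precede the first pyplot import to take effect cleanly
import matplotlib.pyplot as plt  # noqa: E402
```

An alternative is to drop the `matplotlib.use()` call from utils.py entirely and set `MPLBACKEND=Agg` in the CI environment instead, which keeps library imports side-effect free.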
For certain series, the default LBFGS algorithm for the MAP estimate may fail to converge, and something goes wrong after call_sampler in pystan.
Enhance control of the config inside Pyro, such as steps, messages, etc.; there are some args that can be configured in pyro_map.
1, 2 should be ready.
For daily data
Currently users need to use obj.aggregated_posteriors.get('median').get('rr_beta')
and only get an array.
Better public facing method to aggregate rr_beta
and pr_beta
with column names in a dataframe
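A hypothetical sketch of such an accessor; the function name, the pr-then-rr ordering, and the attribute layout are assumptions about orbit's internals:

```python
import numpy as np
import pandas as pd

def get_regression_coefs(aggregated_posteriors, regressor_cols, aggregate='median'):
    """Return positive (pr_beta) + regular (rr_beta) coefficients as one labeled DataFrame."""
    agg = aggregated_posteriors[aggregate]
    coefs = np.concatenate([
        agg.get('pr_beta', np.array([])),   # positive regressors first (assumed order)
        agg.get('rr_beta', np.array([])),
    ])
    return pd.DataFrame({'regressor': regressor_cols, 'coefficient': coefs})

# usage with toy posterior medians
posteriors = {'median': {'pr_beta': np.array([0.5]), 'rr_beta': np.array([1.2, -0.3])}}
coef_df = get_regression_coefs(posteriors, ['x1', 'x2', 'x3'])
```

The one real constraint is that `regressor_cols` must be ordered to match how the betas were concatenated at fit time.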
It is forced to run bt_expand.run(..., date_col=date_col, response_col=response_col, regressor_col=regressor_col) to fit models like Prophet.
This seems like a hacky solution.
Allow users to use a dictionary for the regressor_beta_prior and regressor_sigma_prior args to selectively input priors.
For example, suppose we have feature1 through feature5, but only want custom priors for feature1 and feature4. Users could set the following args:
lgt = LGT(
    regressor_cols=['feature1', 'feature2', 'feature3', 'feature4', 'feature5'],
    regressor_beta_prior={'feature1': 100, 'feature4': 1000},
    regressor_sigma_prior={'feature1': 20, 'feature4': 40}
)
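Internally this only needs a small helper that expands the sparse dict into full per-regressor prior vectors (the helper name and the default values are illustrative):

```python
def expand_priors(regressor_cols, prior_dict, default):
    """Fill unspecified regressors with the default prior value, preserving column order."""
    return [prior_dict.get(col, default) for col in regressor_cols]

beta_prior = expand_priors(
    ['feature1', 'feature2', 'feature3', 'feature4', 'feature5'],
    {'feature1': 100, 'feature4': 1000},
    default=0,
)
# beta_prior == [100, 0, 0, 1000, 0]
```

Unknown keys in the dict (typos, dropped columns) should probably raise, so silent mismatches with regressor_cols don't slip through.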