orbit's Issues

Implement DLT / LGT with a multiplicative option

The current implementation uses the additive form. Applying a log transformation before fit and an exp transformation after predict turns it into a multiplicative model.

This can be integrated directly into the model classes.

Further, this allows backtesting to work as intended on the original scale of the data, without having to implement transformation function callbacks.
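As a rough illustration of the idea, the sketch below wraps an additive model in log/exp transforms. `AdditiveTrend` and `MultiplicativeWrapper` are hypothetical stand-ins, not orbit classes, and the toy model simply fits a straight line by least squares.

```python
import math


class AdditiveTrend:
    """Toy additive model (hypothetical): least-squares line through the data."""

    def fit(self, y):
        n = len(y)
        t_mean = (n - 1) / 2
        y_mean = sum(y) / n
        cov = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
        var = sum((t - t_mean) ** 2 for t in range(n))
        self.slope = cov / var
        self.intercept = y_mean - self.slope * t_mean
        self.n = n
        return self

    def predict(self, steps):
        # Forecast the next `steps` points after the training window.
        return [self.intercept + self.slope * (self.n + h) for h in range(steps)]


class MultiplicativeWrapper:
    """Log-transform before fit, exp-transform after predict."""

    def __init__(self, model):
        self.model = model

    def fit(self, y):
        if min(y) <= 0:
            raise ValueError("multiplicative form needs a strictly positive response")
        self.model.fit([math.log(v) for v in y])
        return self

    def predict(self, steps):
        # Predictions come back on the original scale of the data.
        return [math.exp(v) for v in self.model.predict(steps)]


# Growth of 10% per step is linear on the log scale.
history = [100 * 1.1 ** t for t in range(12)]
forecast = MultiplicativeWrapper(AdditiveTrend()).fit(history).predict(3)
```

Because the wrapper returns values on the original scale, a backtest could score `forecast` against raw observations without any extra callbacks.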

Reparameterization of sigma

Right now we use a bounded (zero-truncated) Cauchy prior for both MAP and NUTS:

obs_sigma ~ cauchy(0, CAUCHY_SD) T[0,];

This works well for MAP. However, it may be contributing to slow sampling in NUTS, so for NUTS we suggest the following reparameterization:

real<lower=0, upper=pi()/2> obs_sigma_unif_dummy;
obs_sigma = CAUCHY_SD * tan(obs_sigma_unif_dummy); 

We may need to split the Stan code to handle NUTS vs. the rest, or MAP vs. the rest.
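For intuition on why the two forms describe the same prior: if the dummy is uniform on (0, pi/2), which is what the declared bounds imply when no other sampling statement is given, then CAUCHY_SD * tan(dummy) follows a half-Cauchy distribution with scale CAUCHY_SD, and the median of that distribution equals the scale. A quick numerical sanity check, using an illustrative CAUCHY_SD of 2.5 (not a value taken from the code base):

```python
import math
import random
import statistics

CAUCHY_SD = 2.5  # illustrative value only
rng = random.Random(0)

# Draw the bounded dummy, then transform it as in the proposed Stan code.
samples = [
    CAUCHY_SD * math.tan(rng.uniform(0.0, math.pi / 2))
    for _ in range(200_000)
]

# The half-Cauchy CDF is (2 / pi) * atan(x / scale), so half of the mass
# should fall below the scale parameter.
sample_median = statistics.median(samples)
frac_below_scale = sum(s <= CAUCHY_SD for s in samples) / len(samples)
```

Both `sample_median` and `frac_below_scale` should land close to CAUCHY_SD and 0.5 respectively, up to Monte Carlo noise.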

Redesign Backtest Module

Redesign the backtest module so that Backtest is initialized only with the data.
Make the model an argument of _fit() instead of an attribute of Backtest.
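One possible shape for this, as a sketch only: the class below holds the data and split settings, and the model is passed in at fit time. The constructor arguments, the expanding-window split, the MAE score, and `MeanModel` are all assumptions made for illustration, not the existing API.

```python
class MeanModel:
    """Hypothetical model used only for this sketch: forecasts the training mean."""

    def fit(self, train):
        self.mean_ = sum(train) / len(train)
        return self

    def predict(self, steps):
        return [self.mean_] * steps


class Backtest:
    """Holds only the data and split settings; models are passed to fit()."""

    def __init__(self, data, min_train_len, forecast_len):
        self.data = list(data)
        self.min_train_len = min_train_len
        self.forecast_len = forecast_len

    def _splits(self):
        # Expanding window: the training set grows by forecast_len each split.
        end = self.min_train_len
        while end + self.forecast_len <= len(self.data):
            yield self.data[:end], self.data[end:end + self.forecast_len]
            end += self.forecast_len

    def fit(self, model):
        """Return the mean absolute error of `model` for each split."""
        scores = []
        for train, test in self._splits():
            preds = model.fit(train).predict(len(test))
            scores.append(sum(abs(p - a) for p, a in zip(preds, test)) / len(test))
        return scores


bt = Backtest(data=[1, 2, 3, 4, 5, 6, 7, 8], min_train_len=4, forecast_len=2)
scores = bt.fit(MeanModel())  # the same Backtest instance can score other models too
```

Keeping the model out of the constructor means one Backtest instance can be reused to compare several models on identical splits.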

Regression Coef Penalty

  • get a dataset for benchmarking/testing
  • L1/L2 Penalty
  • Total Coef Penalty
  • Set Positive Reg Coef as zero instead of rejecting sample
  • Variable selection with spike and slab

These are just some thoughts; we do not necessarily need to complete all of them.

auto_scale=True with regressors errors out

When auto_scale=True is used with regressors, the following error is raised:


TypeError Traceback (most recent call last)
in
1 from sklearn.preprocessing import MinMaxScaler
2 regressor_min_max_scaler = MinMaxScaler(1, 2.719)
----> 3 df[regressor_col] = regressor_min_max_scaler.fit_transform(df[regressor_col])

~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/base.py in fit_transform(self, X, y, **fit_params)
569 if y is None:
570 # fit method of arity 1 (unsupervised transformation)
--> 571 return self.fit(X, **fit_params).transform(X)
572 else:
573 # fit method of arity 2 (supervised transformation)

~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in fit(self, X, y)
337 # Reset internal state before fitting
338 self._reset()
--> 339 return self.partial_fit(X, y)
340
341 def partial_fit(self, X, y=None):

~/Desktop/uTS-py/myenv/lib/python3.7/site-packages/sklearn/preprocessing/_data.py in partial_fit(self, X, y)
361 """
362 feature_range = self.feature_range
--> 363 if feature_range[0] >= feature_range[1]:
364 raise ValueError("Minimum of desired feature range must be smaller"
365 " than maximum. Got %s." % str(feature_range))

TypeError: 'int' object is not subscriptable
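The traceback suggests that the two numbers in MinMaxScaler(1, 2.719) were bound to separate parameters, so feature_range ended up as the integer 1 and feature_range[0] failed. The stand-in class below is hypothetical and only mirrors the (feature_range, copy) argument order to reproduce that mechanism without scikit-learn. With the real scaler, the corresponding call would be MinMaxScaler(feature_range=(1, 2.719)).

```python
class TinyMinMaxScaler:
    """Hypothetical stand-in that mirrors the (feature_range, copy) argument order."""

    def __init__(self, feature_range=(0, 1), copy=True):
        self.feature_range = feature_range
        self.copy = copy

    def fit_transform(self, values):
        low, high = self.feature_range[0], self.feature_range[1]
        if low >= high:
            raise ValueError("Minimum of desired feature range must be smaller than maximum.")
        v_min, v_max = min(values), max(values)
        scale = (high - low) / (v_max - v_min)
        return [low + (v - v_min) * scale for v in values]


values = [10.0, 15.0, 20.0]

# Positional call: 1 is bound to feature_range and 2.719 to copy.
try:
    TinyMinMaxScaler(1, 2.719).fit_transform(values)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing the range as a single tuple works as intended.
scaled = TinyMinMaxScaler(feature_range=(1, 2.719)).fit_transform(values)
```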

Refactor to store `stan.sampling` directly to enable diagnostic methods

Currently we store stan.sampling.extract() as self.posterior_samples. However, we want to be able to retrieve chain level information from stan.sampling directly to enable diagnostic methods.

However, we can't currently retrieve stan.sampling directly since we never store the attribute.

There are two proposed solutions:

  1. Store stan.sampling at the end of fit, and only call stan.sampling.extract() for downstream methods such as predict, plot, diagnostic.
  2. Store self.posterior_samples as-is, and additionally store stan.sampling.to_dataframe() to something like self.posterior_samples_chain.

The first may require refactoring other methods for the cases where we don't use stan.sampling, for example if we fit using VI, MAP, or Pyro.

The second requires storing twice as much information as we would otherwise. An alternative approach to the second method is to parse the dataframe into the same state that our current self.posterior_samples is in, but this poses a challenge because matrix samples are stored as a single column in a dataframe.

Another alternative to the second method is to store only the chain info from the dataframe, but we would have to guarantee order preservation between the dataframe and the arrays in stan.sampling.extract().
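A minimal sketch of the first option, under stated assumptions: `FakeStanFit` stands in for whatever object sampling returns, and `Estimator`, `fit_result`, and `chain_means` are hypothetical names. The real extract() may order or shape draws differently from this stand-in.

```python
class FakeStanFit:
    """Stand-in for the object returned by sampling (hypothetical, for illustration)."""

    def __init__(self, draws_by_chain):
        # {param: [[chain 0 draws], [chain 1 draws], ...]}
        self.draws_by_chain = draws_by_chain

    def extract(self):
        # Pool all chains into one list per parameter.
        return {
            param: [draw for chain in chains for draw in chain]
            for param, chains in self.draws_by_chain.items()
        }


class Estimator:
    def __init__(self):
        self.fit_result = None
        self._posterior_samples = None

    def fit(self, fit_result):
        # Keep the full fit object so chain-level diagnostics stay available.
        self.fit_result = fit_result
        self._posterior_samples = None
        return self

    @property
    def posterior_samples(self):
        # Extract lazily and cache, so predict/plot still see a plain dict.
        if self._posterior_samples is None:
            self._posterior_samples = self.fit_result.extract()
        return self._posterior_samples

    def chain_means(self, param):
        # Example of a diagnostic that needs chain-level information.
        return [sum(c) / len(c) for c in self.fit_result.draws_by_chain[param]]


est = Estimator().fit(FakeStanFit({"obs_sigma": [[1.0, 2.0], [3.0, 5.0]]}))
pooled = est.posterior_samples["obs_sigma"]
per_chain = est.chain_means("obs_sigma")
```

Fitting paths that do not produce such an object (VI, MAP, Pyro) would still need their own way to populate `posterior_samples`, which is the refactoring cost noted above.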

Pyro Estimation with Regressors


RuntimeError Traceback (most recent call last)
in
1 # make prediction of past and future
----> 2 predicted_df = lgt_reg_map.predict(df=df, decompose=True)
3 predicted_df.head(5)

~/work/orbit-super/orbit/orbit/estimator.py in predict(self, df, decompose)
567 # prediction
568 predicted_dict = self._predict(
--> 569 df=df, include_error=False, decompose=decompose
570 )
571

~/work/orbit-super/orbit/orbit/lgt.py in _predict(self, df, include_error, decompose)
472 regressor_matrix = df[self.regressor_col].values
473 regressor_torch = torch.from_numpy(regressor_matrix)
--> 474 regressor_component = torch.matmul(regressor_torch, regressor_beta)
475 regressor_component = regressor_component.t()
476 else:

RuntimeError: Expected object of scalar type Double but got scalar type Float for argument #2 'mat2' in call to _th_mm
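The message points to a dtype mismatch between the two matmul operands. One plausible cause, stated as an assumption since the surrounding code is not shown, is that torch.from_numpy keeps NumPy's default float64 while the coefficients are float32. A minimal sketch of aligning the dtypes before the product, with made-up values:

```python
import numpy as np
import torch

regressor_matrix = np.array([[1.0, 2.0], [3.0, 4.0]])  # NumPy defaults to float64
regressor_beta = torch.tensor([[0.5], [0.25]])          # torch defaults to float32

regressor_torch = torch.from_numpy(regressor_matrix)    # keeps float64 (Double)

# Cast one operand so both share a dtype before multiplying.
regressor_component = torch.matmul(
    regressor_torch.to(regressor_beta.dtype), regressor_beta
)
```

Casting the coefficients up to double instead would also remove the mismatch; which direction is preferable depends on the dtype the rest of the prediction code expects.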

Implement damped LGT class

Implement the concrete class for Damped LGT (DLT), which is now separate from the main LGT model.

bug with pyro and MAP

The following error is raised when running:
predicted_df = lgt_map.predict(df=test_df)

RuntimeError Traceback (most recent call last)
in
----> 1 predicted_df = lgt_map.predict(df=test_df)

~/work/orbit-super/orbit/orbit/estimator.py in predict(self, df, decompose)
566 # prediction
567 predicted_dict = self._predict(
--> 568 df=df, include_error=False, decompose=decompose
569 )
570

~/work/orbit-super/orbit/orbit/lgt.py in _predict(self, df, include_error, decompose)
504 trend_forecast_matrix
505 = torch.zeros((num_sample, trend_forecast_length), dtype=torch.double)
--> 506 trend_component = torch.cat((local_global_trend_sums, trend_forecast_matrix), dim=1)
507
508 last_local_trend_level = local_trend_levels[:, -1]

RuntimeError: Expected object of scalar type Float but got scalar type Double for sequence element 1 in sequence argument at position #1 'tensors'
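Here the zero matrix is created with a hard-coded dtype=torch.double while the tensor it is concatenated with appears to be Float. A sketch, assuming local_global_trend_sums is float32 as the message suggests, that derives the dtype from the existing tensor instead of fixing it (shapes and values are made up):

```python
import torch

num_sample, trend_forecast_length = 2, 3
local_global_trend_sums = torch.ones((num_sample, 4))  # float32 by default

# Reuse the dtype of the tensor we are about to concatenate with.
trend_forecast_matrix = torch.zeros(
    (num_sample, trend_forecast_length), dtype=local_global_trend_sums.dtype
)
trend_component = torch.cat((local_global_trend_sums, trend_forecast_matrix), dim=1)
```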

More robust unit tests

  • Current unit tests should be parameterized instead of static
  • Stronger assertions with fixtures for actual values
  • Parameterization for more variants of init / fit / predict args
  • Speed up run time without compromising the tests themselves
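As a small illustration of the first bullet, the sketch below runs one test body over a table of cases. It uses the standard library's unittest.subTest to stay self-contained; in a pytest suite, pytest.mark.parametrize plays the same role. `naive_forecast` is a hypothetical function under test, not part of orbit.

```python
import unittest


def naive_forecast(history, steps):
    """Hypothetical function under test: repeat the last observed value."""
    return [history[-1]] * steps


class NaiveForecastTest(unittest.TestCase):
    # Each case: (history, steps, expected forecast).
    CASES = [
        ([1, 2, 3], 1, [3]),
        ([1, 2, 3], 3, [3, 3, 3]),
        ([5.5], 2, [5.5, 5.5]),
        ([0, -1], 0, []),
    ]

    def test_forecast_values(self):
        for history, steps, expected in self.CASES:
            # subTest reports each failing case separately instead of stopping at the first.
            with self.subTest(history=history, steps=steps):
                self.assertEqual(naive_forecast(history, steps), expected)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(NaiveForecastTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Adding a new variant of init / fit / predict arguments then only means adding a row to the case table.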

Include meta data in backtest.py

Can we also include two pieces of metadata?

  • In _predicted_df, append the train end date (if date_col is available in the splitter).
  • In _score_df, append the number of splits, so users know how many splits were conducted.

Warning raised by calling plotting utils should be investigated

if os.environ.get('DISPLAY', '') == '':
    print('no display found. Using non-interactive Agg backend')
    matplotlib.use('Agg')

The above lines are raising a warning / exception in unit tests for test_backtest.py

Warning Output:

orbit/utils/utils.py:15
  /home/travis/build/uber/orbit/orbit/utils/utils.py:15: UserWarning: 
  This call to matplotlib.use() has no effect because the backend has already
  been chosen; matplotlib.use() must be called *before* pylab, matplotlib.pyplot,
  or matplotlib.backends is imported for the first time.
  
  The backend was *originally* set to 'TkAgg' by the following code:
    File "setup.py", line 66, in <module>
      'Programming Language :: Python :: 3.7',
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/setuptools/__init__.py", line 145, in setup
      return distutils.core.setup(**attrs)
    File "/opt/python/3.7.1/lib/python3.7/distutils/core.py", line 148, in setup
      dist.run_commands()
    File "/opt/python/3.7.1/lib/python3.7/distutils/dist.py", line 966, in run_commands
      self.run_command(cmd)
    File "/opt/python/3.7.1/lib/python3.7/distutils/dist.py", line 985, in run_command
      cmd_obj.run()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/setuptools/command/test.py", line 237, in run
      self.run_tests()
    File "setup.py", line 39, in run_tests
      errcode = pytest.main(self.test_args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/config/__init__.py", line 79, in main
      return config.hook.pytest_cmdline_main(config=config)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
      return self._hookexec(self, self.get_hookimpls(), kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
      return self._inner_hookexec(hook, methods, kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
      firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
      res = hook_impl.function(*args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 242, in pytest_cmdline_main
      return wrap_session(config, _main)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 209, in wrap_session
      session.exitstatus = doit(config, session) or 0
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 248, in _main
      config.hook.pytest_collection(session=session)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
      return self._hookexec(self, self.get_hookimpls(), kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
      return self._inner_hookexec(hook, methods, kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
      firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
      res = hook_impl.function(*args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 258, in pytest_collection
      return session.perform_collect()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 485, in perform_collect
      items = self._perform_collect(args, genitems)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 524, in _perform_collect
      self.items.extend(self.genitems(node))
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 762, in genitems
      for x in self.genitems(subnode):
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/main.py", line 759, in genitems
      rep = collect_one_node(node)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 407, in collect_one_node
      rep = ihook.pytest_make_collect_report(collector=collector)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/hooks.py", line 284, in __call__
      return self._hookexec(self, self.get_hookimpls(), kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 67, in _hookexec
      return self._inner_hookexec(hook, methods, kwargs)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/manager.py", line 61, in <lambda>
      firstresult=hook.spec.opts.get("firstresult") if hook.spec else False,
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/pluggy/callers.py", line 187, in _multicall
      res = hook_impl.function(*args)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 289, in pytest_make_collect_report
      call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 226, in from_call
      result = func()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/runner.py", line 289, in <lambda>
      call = CallInfo.from_call(lambda: list(collector.collect()), "collect")
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 435, in collect
      self._inject_setup_module_fixture()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 447, in _inject_setup_module_fixture
      setup_module = _get_non_fixture_func(self.obj, "setUpModule")
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 251, in obj
      self._obj = obj = self._getobj()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 432, in _getobj
      return self._importtestmodule()
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/python.py", line 499, in _importtestmodule
      mod = self.fspath.pyimport(ensuresyspath=importmode)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/py/_path/local.py", line 668, in pyimport
      __import__(modname)
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/_pytest/assertion/rewrite.py", line 296, in load_module
      six.exec_(co, mod.__dict__)
    File "/home/travis/build/uber/orbit/tests/test_backtest.py", line 6, in <module>
      from orbit.utils.constants import BacktestFitColumnNames
    File "/home/travis/build/uber/orbit/orbit/utils/constants.py", line 4, in <module>
      from orbit.utils.utils import get_parent_path
    File "/home/travis/build/uber/orbit/orbit/utils/utils.py", line 6, in <module>
      import matplotlib.pyplot as plt
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/matplotlib/pyplot.py", line 71, in <module>
      from matplotlib.backends import pylab_setup
    File "/home/travis/virtualenv/python3.7.1/lib/python3.7/site-packages/matplotlib/backends/__init__.py", line 17, in <module>
      line for line in traceback.format_stack()
  
  
    matplotlib.use('Agg')
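The warning text says the backend must be selected before matplotlib.pyplot is first imported, and the stack shows utils.py importing pyplot at line 6, before the use() call at line 15. One way to sidestep the ordering problem, sketched here under the assumption that setting an environment variable early in the process is acceptable, is matplotlib's MPLBACKEND environment variable, which is read when matplotlib is imported. The helper name is hypothetical.

```python
import os


def choose_headless_backend(environ=os.environ):
    """Request the non-interactive Agg backend when no display is available.

    Must run before the first `import matplotlib.pyplot` to have any effect.
    An explicit MPLBACKEND set by the user is left alone.
    """
    if environ.get("DISPLAY", "") == "" and "MPLBACKEND" not in environ:
        environ["MPLBACKEND"] = "Agg"
    return environ.get("MPLBACKEND")


# Demonstrated on plain dicts so the real environment is left untouched.
headless = {}
with_display = {"DISPLAY": ":0"}
choose_headless_backend(headless)
choose_headless_backend(with_display)
```

Alternatively, moving the existing matplotlib.use('Agg') call above the pyplot import in the same module addresses the ordering the warning complains about.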

config of pyro

Enhance control of the configuration inside Pyro, such as the number of steps, messages, etc.

Create public method to get regression coefficients

Currently, users need to call obj.aggregated_posteriors.get('median').get('rr_beta') and get back only an array, without column names.

A better public-facing method would aggregate rr_beta and pr_beta together with their column names in a dataframe.
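A sketch of what such a helper might do, assuming the nested-dict layout implied by the access pattern above. The function name, its arguments, and the example column names are hypothetical. It returns plain records, which could be passed to pandas.DataFrame to get the dataframe form.

```python
def regression_coef_records(aggregated_posteriors, pr_cols, rr_cols, point="median"):
    """Pair each regressor name with its aggregated coefficient (hypothetical helper)."""
    agg = aggregated_posteriors.get(point, {})
    records = []
    for cols, key in ((pr_cols, "pr_beta"), (rr_cols, "rr_beta")):
        coefs = agg.get(key, [])
        if len(coefs) != len(cols):
            raise ValueError(f"{key} has {len(coefs)} values for {len(cols)} columns")
        records.extend(
            {"regressor": col, "group": key, "coefficient": float(beta)}
            for col, beta in zip(cols, coefs)
        )
    return records


# Made-up values shaped like the nested dict in the access pattern above.
posteriors = {"median": {"pr_beta": [0.8], "rr_beta": [-0.2, 0.05]}}
rows = regression_coef_records(posteriors, pr_cols=["promo"], rr_cols=["price", "temp"])
```

The length check guards against silently misaligning names and coefficients, which is the main risk when the arrays carry no labels.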

Refactor run_group_backtest()

It is currently forced to call bt_expand.run(..., date_col=date_col, response_col=response_col, regressor_col=regressor_col) in order to fit models like Prophet.
This seems like a hacky solution.

Selective input of priors

Allow users to use dictionary for regressor_beta_prior and regressor_sigma_prior args to selectively input priors.

For example, suppose we have feature1 through feature5, but only want custom priors for feature1 and feature4. Users could set the following args:

lgt = LGT(
    regressor_cols=['feature1', 'feature2', 'feature3', 'feature4', 'feature5'],
    regressor_beta_prior={'feature1': 100, 'feature4': 1000},
    regressor_sigma_prior={'feature1': 20, 'feature4': 40}
)
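Internally, the dictionaries would need to be expanded into values aligned with the regressor order. A minimal sketch of that step follows; `expand_priors` is a hypothetical helper, and the defaults of 0 and 1 are placeholders rather than the library's actual default priors.

```python
def expand_priors(regressor_cols, prior, default):
    """Turn a partial {column: value} dict into a list aligned with regressor_cols."""
    if prior is None:
        return [default] * len(regressor_cols)
    unknown = set(prior) - set(regressor_cols)
    if unknown:
        # Fail loudly on typos instead of silently ignoring a prior.
        raise KeyError(f"priors given for unknown regressors: {sorted(unknown)}")
    return [prior.get(col, default) for col in regressor_cols]


cols = ['feature1', 'feature2', 'feature3', 'feature4', 'feature5']
beta_prior = expand_priors(cols, {'feature1': 100, 'feature4': 1000}, default=0)
sigma_prior = expand_priors(cols, {'feature1': 20, 'feature4': 40}, default=1)
```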
