
hierarchicalforecast's Introduction


NixtlaTS

Forecast using TimeGPT


NixtlaTS offers a collection of classes and methods to interact with the API of TimeGPT.

TimeGPT: Revolutionizing Time-Series Analysis

Developed by Nixtla, TimeGPT is a generative pre-trained transformer model dedicated to prediction tasks. Trained on an extensive dataset spanning financial, weather, energy, and sales data, TimeGPT brings time-series analysis right to your terminal.

In seconds, TimeGPT can discern complex patterns and predict future data points, transforming the landscape of data science and predictive analytics.

Fine-Tuning: For Precision Prediction

In addition to its core capabilities, TimeGPT supports fine-tuning, enhancing its specialization for specific prediction tasks. This feature is like training a machine learning model on a targeted data subset to improve its task-specific performance, making TimeGPT an even more versatile tool for your predictive needs.

NixtlaTS: Your Gateway to TimeGPT

With NixtlaTS, you can easily interact with TimeGPT through simple API calls, making the power of TimeGPT readily accessible in your projects.

Installation

Get NixtlaTS up and running with a simple pip command:

pip install "nixtlats>=0.1.0"

Quick Start

Get started with TimeGPT now:

import pandas as pd

df = pd.read_csv('https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short.csv')

from nixtlats import NixtlaClient
nixtla = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)
fcst_df = nixtla.forecast(df, h=24, level=[80, 90])

hierarchicalforecast's People

Contributors

azulgarza, cchallu, dluuo, hahnbeelee, jmoralez, kdgutier, mcsqr, melopeo, mergenthaler, nickto, rpmccarter, sugatoray


hierarchicalforecast's Issues

Wrong order of arguments in README tutorial

Discussed in #80

Originally posted by stefanhuber1510 October 19, 2022
The last line of the first big block of the tutorial on the readme file is the following:

Y_rec_df = hrec.reconcile(Y_hat_df, Y_df_train, S, tags)

It seems to me that Y_df_train is in the wrong position here and should come after tags (the arguments are positional, so the order matters).
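To make the mix-up concrete, here is a minimal sketch using a toy stand-in function. The parameter order (Y_hat_df, S, tags, Y_df) follows the reconcile signature visible in the traceback further down this page; the default value for Y_df and the string placeholders are assumptions for illustration, not the library's code.

```python
# Toy stand-in with the parameter order shown in the traceback on this page:
# reconcile(Y_hat_df, S, tags, Y_df, ...). The None default is an assumption.
def reconcile(Y_hat_df, S, tags, Y_df=None):
    # Return the bindings so that any shift between arguments is visible.
    return {"Y_hat_df": Y_hat_df, "S": S, "tags": tags, "Y_df": Y_df}

# Positional call in the order used by the README line: arguments shift.
wrong = reconcile("Y_hat_df", "Y_df_train", "S", "tags")

# Keyword call: the order of the arguments no longer matters.
right = reconcile(Y_hat_df="Y_hat_df", Y_df="Y_df_train", S="S", tags="tags")

print(wrong["S"])  # the training frame ended up bound to S
print(right["S"])
```

Passing the arguments by keyword, as in the second call, sidesteps the ordering question entirely.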

Reconciler class inheritance.

To simplify and reuse reconciler's code, it is convenient to declare a reconciler class with common methods like:

  • fit
  • predict
  • fit_predict
  • sample

Challenging HierarchicalForecast method input's indexing.

In its current version, some methods like HierarchicalForecast.reconcile need the Y_hat_df pd.DataFrame input to be indexed by 'unique_id' to operate correctly.

This makes running the methods challenging, in particular since the StatsForecast.predict output is not indexed by 'unique_id'.
I suggest making the needed set_index('unique_id') operation implicit in the reconcile method.
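A minimal sketch of what such an implicit step could look like. The helper name ensure_unique_id_index and the toy frame are hypothetical and only illustrate the idea; this is not the library's implementation.

```python
import pandas as pd

def ensure_unique_id_index(df: pd.DataFrame) -> pd.DataFrame:
    """Return df indexed by 'unique_id', whether it arrives as a column or as the index."""
    if df.index.name == "unique_id":
        return df  # already in the expected shape
    if "unique_id" in df.columns:
        return df.set_index("unique_id")
    raise ValueError("expected a 'unique_id' column or index")

# Toy forecasts with 'unique_id' as a plain column.
Y_hat_df = pd.DataFrame({
    "unique_id": ["total", "a", "b"],
    "ds": [1, 1, 1],
    "model": [3.0, 1.0, 2.0],
})

indexed = ensure_unique_id_index(Y_hat_df)
again = ensure_unique_id_index(indexed)  # calling it twice is harmless
print(indexed.index.name)
```

Calling a helper like this at the top of reconcile would accept both input layouts without changing anything for users who already set the index.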

Bug in the README example

The example (How to use) in README.md loads data with

Y_df, S, tags = HierarchicalData.load('./data', 'TourismLarge')

However, the HierarchicalData.load method returns a tuple of only 2 elements, so unpacking it into three names fails with "not enough values to unpack".

Python error message:

Traceback (most recent call last):
  File "hierarchicalforecast/example.py", line 11, in <module>
    Y_df, S, tags = HierarchicalData.load('./data', 'TourismLarge')
ValueError: not enough values to unpack (expected 3, got 2)

It would also be great if more examples can be provided.

Thanks!

[BUG] Ensure `Y_df` has forecasts for all base series in `S`

A user reported the following error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
/var/folders/01/4mysj8cx1bjg_w8rbw097jwr0000gp/T/ipykernel_22347/4254128446.py in <module>
      7 hrec = HierarchicalReconciliation(reconcilers=reconcilers)
      8 #Y_rec_df = hrec.reconcile(Y_hat_df, Y_df_train, S, tags)
----> 9 Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df, Y_df=Y_df_train, S=S, tags=tags)

~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/hierarchicalforecast/core.py in reconcile(self, Y_hat_df, S, tags, Y_df, level, bootstrap)
    148                 kwargs = {key: common_vals[key] for key in kwargs}
    149                 fcsts_model = reconcile_fn(y_hat=y_hat_model, **kwargs)
--> 150                 fcsts[f'{model_name}/{reconcile_fn_name}'] = fcsts_model['mean'].flatten()
    151                 if (pi and has_level and level is not None) or (bootstrap and level is not None):
    152                     for lv in level:

~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/pandas/core/frame.py in __setitem__(self, key, value)
   3653         else:
   3654             # set column
-> 3655             self._set_item(key, value)
   3656 
   3657     def _setitem_slice(self, key: slice, value):

~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/pandas/core/frame.py in _set_item(self, key, value)
   3830         ensure homogeneity.
   3831         """
-> 3832         value = self._sanitize_column(value)
   3833 
   3834         if (

~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/pandas/core/frame.py in _sanitize_column(self, value)
   4533 
   4534         if is_list_like(value):
-> 4535             com.require_length_match(value, self.index)
   4536         return sanitize_array(value, self.index, copy=True, allow_2d=True)
   4537 

~/opt/anaconda3/envs/ts_recon/lib/python3.9/site-packages/pandas/core/common.py in require_length_match(data, index)
    555     """
    556     if len(data) != len(index):
--> 557         raise ValueError(
    558             "Length of values "
    559             f"({len(data)}) "

ValueError: Length of values (4704) does not match length of index (2688)

The error is related to a mismatch between the number of base forecasts provided and the number of series expected by S.
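A check along these lines could surface the problem before the length mismatch appears deep inside pandas. This is a sketch under assumptions: S is taken to be a DataFrame whose index holds the series ids, Y_hat_df is taken to be indexed by 'unique_id', and the helper name missing_base_forecasts is hypothetical.

```python
import pandas as pd

def missing_base_forecasts(S: pd.DataFrame, Y_hat_df: pd.DataFrame) -> list:
    """List the series ids that appear in S's index but have no rows in Y_hat_df."""
    present = set(Y_hat_df.index.unique())
    return [uid for uid in S.index if uid not in present]

# Toy summing matrix: total = a + b.
S = pd.DataFrame([[1, 1], [1, 0], [0, 1]],
                 index=["total", "a", "b"], columns=["a", "b"])

# Base forecasts that are missing series "b".
Y_hat_df = pd.DataFrame(
    {"ds": [1, 1], "model": [3.0, 1.0]},
    index=pd.Index(["total", "a"], name="unique_id"),
)

missing = missing_base_forecasts(S, Y_hat_df)
if missing:
    print(f"Y_hat_df has no forecasts for: {missing}")
```

Raising a descriptive error when the list is non-empty would be easier to act on than the ValueError shown above.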

Documentation/Aesthetics

  • Change the crown icon back to "Hierarchical Crown Forecast", or better, use a favicon to replace the globe image in the browser tab with the library icon.
  • Change PERMBU description in the nbs/index.ipynb and add links to probabilistic methods in the description.

Example not working

I am trying to implement this on forecasts that I already have (both aggregations and low level).

I made my own summing matrix, but when running the tutorial to see how Y_hat_df was formatted, I ran into this error:

"""
TypeError Traceback (most recent call last)
in ()
2 models=[(ets, 4, 'ZZA')],
3 freq='QS', n_jobs=-1)
----> 4 Y_hat_df = fcst.forecast(h=8, fitted=True)
5 Y_fitted_df = fcst.forecast_fitted_values()

3 frames
/usr/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
655 return self._value
656 else:
--> 657 raise self._value
658
659 def _set(self, i, obj):

TypeError: seasonal_decompose() got an unexpected keyword argument 'period'

"""

This is odd because I use seasonal_decompose and know it has a period argument.

This happens on all tutorials and examples when opening them in Colab.

Do you have an example of an implementation with forecasts/tags you already have, or could you please fix the examples?

[BUG] Calculate evaluation per time series

For now, the evaluation is performed for each hierarchy level considering all observations pooled together, regardless of the time series they belong to. For metrics that are not "distributive" across several series, such as RMSE, the resulting performance numbers can be confusing.
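A small numeric sketch of why this matters for RMSE. The data and column names are made up for illustration; the point is only that the pooled RMSE and the average of per-series RMSEs are different quantities.

```python
import numpy as np
import pandas as pd

# Two toy series at the same level, with errors of size 1 and 3 respectively.
df = pd.DataFrame({
    "unique_id": ["a", "a", "b", "b"],
    "y": [10.0, 10.0, 100.0, 100.0],
    "y_hat": [11.0, 9.0, 103.0, 97.0],
})
sq_err = (df["y"] - df["y_hat"]) ** 2

# Pooled: all observations together, ignoring which series they belong to.
pooled_rmse = float(np.sqrt(sq_err.mean()))

# Per series: one RMSE per unique_id, then averaged.
per_series_rmse = np.sqrt(sq_err.groupby(df["unique_id"]).mean())
mean_of_rmse = float(per_series_rmse.mean())

print(pooled_rmse, mean_of_rmse)
```

Here the pooled value is the square root of 5 while the per-series average is 2, because the square root is applied before averaging in one case and after in the other.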

Pls update the Jupyter example notebook

Hi, the Google Colab notebook is not available. Kindly update it.

The 'Nixtla/hierarchicalforecast' repository doesn't contain the 'examples/TourismSmall.ipynb' path in 'main'.

ERM stationarity warning

In the presence of non-stationarity, ERM might degrade in performance.
It would be a good idea to add a test and a warning for the user if the series are non-stationary.

Wrong documentation at PyPI

Kindly update the example in the "How to use" section at the PyPI website, as well as the link to the Colab Notebook.

I've been stuck in that non-working example, but here in the repo, I found the correct one. You have developed a fantastic package, but folks like me who first face the PyPI documentation might find it not working while it is (with updated code).

The main issue I found at PyPI is using auto_arima and naive instead of AutoARIMA and Naive.

Thanks for your contribution =D

Guaranteed hierarchical ordering for reconciliation methods inputs.

A clever way to simplify lower level reconciliation functions is to assume 'hierarchically ordered series'.
It can be achieved by adding the following lines at the beginning of the reconcile method of core.HierarchicalForecast:

Y_df.unique_id = Y_df.unique_id.astype('category')
Y_df.unique_id = Y_df.unique_id.cat.set_categories(S_df.index)

If we decide to operate like this, we should add in the docstrings of the methods that are order dependent the assumption.
Currently the PERMBU method operates like this.
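The ordering idea can be sketched end to end on a toy frame. Note that this sketch adds an explicit sort_values step, which the two lines above do not show, and that the frames and column values are invented for illustration.

```python
import pandas as pd

# Toy summing matrix whose index defines the hierarchical order.
S_df = pd.DataFrame([[1, 1], [1, 0], [0, 1]],
                    index=["total", "a", "b"], columns=["a", "b"])

# Observations arriving in arbitrary order.
Y_df = pd.DataFrame({
    "unique_id": ["b", "total", "a", "b", "total", "a"],
    "ds": [1, 1, 1, 2, 2, 2],
    "y": [2.0, 3.0, 1.0, 4.0, 6.0, 2.0],
})

# Make unique_id categorical with categories in S_df's order, then sort.
# Sorting a categorical column follows the category order, not the alphabet.
Y_df["unique_id"] = pd.Categorical(Y_df["unique_id"], categories=S_df.index)
Y_sorted = Y_df.sort_values(["unique_id", "ds"]).reset_index(drop=True)

print(Y_sorted)
```

After the sort, the rows follow the order of S_df.index, so "total" comes first even though it would sort last alphabetically.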

These two issues are closely related: Challenging HierarchicalForecast method input's indexing.

[BUG] Min Trace `method="mint_shrink"`

Users have reported issues trying to run MinTrace(method='mint_shrink').

I suspect that it is related to datasets with time series of distinct lengths (in our examples all time series have the same length and the method works fine). In particular, I think that line 441 below is not working correctly. It should ignore the NaNs in the residuals matrix generated by the difference in lengths, but perhaps it is not ignoring them.

residuals = (y_insample - y_hat_insample).T
n, _ = residuals.shape
masked_res = np.ma.array(residuals, mask=np.isnan(residuals))
covm = np.ma.cov(masked_res, rowvar=False, allow_masked=True).data
if method == 'wls_var':
    W = np.diag(np.diag(covm))
elif method == 'mint_cov':
    W = covm
elif method == 'mint_shrink':
    tar = np.diag(np.diag(covm))
    corm = cov2corr(covm)
    xs = np.divide(residuals, np.sqrt(np.diag(covm)))
    xs = xs[~np.isnan(xs).any(axis=1), :]
    v = (1 / (n * (n - 1))) * (crossprod(xs ** 2) - (1 / n) * (crossprod(xs) ** 2))
    np.fill_diagonal(v, 0)
    corapn = cov2corr(tar)
    d = (corm - corapn) ** 2
    lmd = v.sum() / d.sum()
    lmd = max(min(lmd, 1), 0)
    W = lmd * tar + (1 - lmd) * covm
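The following sketch reruns only the NaN-handling lines of the snippet on a toy residual matrix with series of different lengths. It is an observation about how those NumPy calls behave, not a confirmed diagnosis: the row filter drops every time step where any series is missing, while n still counts all time steps. The data are invented.

```python
import numpy as np

# residuals has shape (time steps, series), as after the `.T` in the snippet.
# The first series is shorter, so its first two time steps are NaN.
residuals = np.array([
    [np.nan, 0.5],
    [np.nan, -0.2],
    [0.1, 0.3],
    [-0.4, 0.1],
    [0.2, -0.6],
])

n, _ = residuals.shape
masked_res = np.ma.array(residuals, mask=np.isnan(residuals))
covm = np.ma.cov(masked_res, rowvar=False, allow_masked=True).data

# Same standardization and row filter as in the snippet above.
xs = np.divide(residuals, np.sqrt(np.diag(covm)))
xs = xs[~np.isnan(xs).any(axis=1), :]

print(n, xs.shape[0])  # time steps counted vs. time steps kept
```

If the shrinkage term is computed from xs but scaled with n, the two counts disagree whenever lengths differ, which may be worth checking when investigating this issue.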


Separate `reconcile_fn` from `prob_reconcile_fn`

Both Bootstrap and Normality probabilistic reconcilers require inputs that depend on a previous run of mean hierarchical reconciliation methods.

Currently the instantiation of probabilistic reconcilers in HierarchicalForecast.reconcile is complicated due to the effort of sending samplers as part of the reconcile_fn arguments.

I believe that separating the responsibilities between reconcile_fn and prob_reconcile_fn, with the latter applied after reconcile_fn, can improve the code's readability and will help to simplify the instantiation of the samplers.

Small data set size issue

Getting this error:

RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.7/multiprocessing/pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "/usr/local/lib/python3.7/dist-packages/statsforecast/core.py", line 154, in forecast
raise error
File "/usr/local/lib/python3.7/dist-packages/statsforecast/core.py", line 149, in forecast
res_i = model.forecast(h=h, y=y_train, X=X_train, X_future=X_f, fitted=fitted, **kwargs)
File "/usr/local/lib/python3.7/dist-packages/statsforecast/models.py", line 481, in forecast
mod = ets_f(y, m=self.season_length, model=self.model)
File "/usr/local/lib/python3.7/dist-packages/statsforecast/ets.py", line 968, in ets_f
raise NotImplementedError('tiny datasets')
NotImplementedError: tiny datasets
"""

The above exception was the direct cause of the following exception:

NotImplementedError Traceback (most recent call last)
in
2 models=[ETS(season_length=4, model='ZAA')],
3 freq='M', n_jobs=-1)
----> 4 Y_hat_df = fcst.forecast(h=4, fitted=True)
5 Y_fitted_df = fcst.forecast_fitted_values()

Change matrix operations -> `.dot` method

Some methods like TopDown, BottomUp, ERM benefit a lot from sparse matrix operations.

Luckily, matrix multiplications in both numpy and scipy.sparse can be expressed with the .dot method.

It would be convenient to make the HierarchicalForecast library agnostic to the method's sparsity by changing all the corresponding matrix multiplications.
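A minimal sketch of the sparsity-agnostic idea, assuming scipy is installed. The helper name aggregate and the toy summing matrix are illustrative only.

```python
import numpy as np
from scipy import sparse

# Toy summing matrix: total = a + b, stored both dense and sparse.
S_dense = np.array([[1.0, 1.0],
                    [1.0, 0.0],
                    [0.0, 1.0]])
S_sparse = sparse.csr_matrix(S_dense)
y_bottom = np.array([2.0, 3.0])

def aggregate(S, y):
    # .dot exists on both np.ndarray and scipy.sparse matrices,
    # so the same line works for either representation.
    return S.dot(y)

dense_out = aggregate(S_dense, y_bottom)
sparse_out = aggregate(S_sparse, y_bottom)
print(dense_out, sparse_out)
```

Both calls return a dense one-dimensional array with the aggregated values, so downstream code does not need to know which representation was passed in.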

Saved Prefitted ARIMA Base Forecasts for TourismL

It would be convenient to have prefitted ARIMA base forecasts for medium and large datasets on S3.

  • We guarantee replicability.
  • We can save a lot of time for potential users and other researchers.
  • Circle CI tests that check time efficiency can greatly benefit from this dataset.

Probabilistic Forecast Reconciliation

After going through the package and the paper, I found that it supports only point forecast reconciliation. I would like to know if it also supports probabilistic forecast reconciliation. I would appreciate a quick response to my inquiry.

[FIX] Improve `Introduction to Hierarchical Forecasting`

The Base predictions and Reconciliation sections don't have a complete explanation of the hierarchical reconciliation methodology. Ideas to improve those sections:

Base predictions:

  • Define what we mean by "base predictions".
  • Use a different model. By construction, the Naive model is coherent, so this model does not well motivate the use of hierarchical reconciliation. In particular, RandomWalkWithDrift could work.

Reconciliation

  • Show that the base forecasts are not coherent, using plot_hierarchical_predictions_gap (fix #85 first).
  • Explain the methods used for reconciliation. In this case, the explanation for BottomUp is missing.

h Indexing in evaluate method

In the .evaluate method of the HierarchicalEvaluation class, h is set with h = len(Y_h.loc[Y_h.index[0]]), but when the forecast horizon is only 1 period, Y_h.loc[Y_h.index[0]] returns a Series of the row's values rather than a DataFrame.

The length of these values is then 2 because of the ds column and the forecast column. This then returns an error during the reshape of Y_test here: y_test_cats = Y_test.loc[cats, 'y'].values.reshape(-1, h) because it tries to reshape to 2 even though there is only 1 record.

Would changing h = len(Y_h.loc[Y_h.index[0]]) to h = len(Y_h.loc[[Y_h.index[0]]]) with the double brackets to ensure a dataframe is returned work?

Let me know if you need any additional detail or a reproducible example as this is my first issue statement
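The indexing behavior described above can be reproduced with plain pandas. The frame below is a toy with one forecast row per series and invented column names.

```python
import pandas as pd

# One row per series, i.e. a forecast horizon of 1.
Y_h = pd.DataFrame(
    {"ds": [1, 1], "model": [10.0, 20.0]},
    index=pd.Index(["a", "b"], name="unique_id"),
)
first = Y_h.index[0]

# Scalar label on a unique index: returns a Series holding the row's columns.
h_single = len(Y_h.loc[first])

# List of labels: always returns a DataFrame, here with a single row.
h_double = len(Y_h.loc[[first]])

print(h_single, h_double)
```

With the single-bracket form the length counts the columns (ds and the forecast), while the double-bracket form counts the rows for that series, which is the horizon.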

Probabilistic hierarchical forecasting

Although probabilistic hierarchical forecasting is still a niche topic, there is a clear need to extend hierarchical forecasting available methods beyond point/mean forecasting.

This paper has an R implementation that can be improved, along with pointers to other methods we can extend towards their predictive distributions.

MinT improved computational efficiency

Following a recommendation from Shanika, we can change the current matrix:
$P_{MinT} = (S^{t} W^{-1} S)^{-1} S^{t} W^{-1}$

This would avoid computing two inverses, each of which scales cubically.
Her paper contains an alternative that uses a single inverse of lower dimensionality.
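As a hedged illustration, one single-inverse form that appears in the MinT literature is $P = J - J W U (U^{t} W U)^{-1} U^{t}$, where $S$ stacks an aggregation block $C$ on top of an identity, $U^{t} = [I, -C]$ and $J = [0, I]$. Whether this is exactly the alternative meant here is an assumption. The sketch below checks numerically, on a toy hierarchy with a random positive definite W, that both forms give the same matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy hierarchy: 2 aggregated series over 3 bottom series.
C = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 0.0]])
n_a, n_b = C.shape
n = n_a + n_b
S = np.vstack([C, np.eye(n_b)])

# Random symmetric positive definite W, standing in for the error covariance.
A = rng.normal(size=(n, n))
W = A @ A.T + n * np.eye(n)

# Current form: inverts W (n x n) and S^t W^-1 S (n_b x n_b).
W_inv = np.linalg.inv(W)
P_two = np.linalg.inv(S.T @ W_inv @ S) @ S.T @ W_inv

# Alternative form: a single solve with U^t W U, which is only n_a x n_a.
J = np.hstack([np.zeros((n_b, n_a)), np.eye(n_b)])
U = np.vstack([np.eye(n_a), -C.T])  # so that U^t = [I, -C]
P_one = J - J @ W @ U @ np.linalg.solve(U.T @ W @ U, U.T)

print(np.allclose(P_one, P_two))
```

The alternative only solves a system whose size is the number of aggregated series, which is typically much smaller than the total number of series.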

Change reconcile input name from S to S_df

In the method core.HierarchicalForecast.reconcile,
I suggest we change the name of the S dataframe to S_df.

  • It homogenizes inputs across libraries.
  • It distinguishes the input from the method's internal matrix S.
  • The suffix _df helps our users' intuition about the type of input expected.

Continuous Integration Tests

It would be convenient to add continuous integration tests to monitor performance and fitting times of the reconciliation methods.
Automated tests will hasten the development and improvements of the code.

Question: Reconciliation of varying hierarchy size?

Dear colleagues,

I am working on a problem where I need to predict the performance of a financial portfolio formed by different assets. As of today, I am approaching this as a combination of a classification and a regression problem, but now I have been tasked with doing this for all of the portfolios that our firm has.

I can replicate the same method as before, since I am predicting at the asset level, but I have been thinking of framing this as a hierarchical forecasting problem.

The only thing is that the number of portfolios grows over time and the number of assets differs from portfolio to portfolio, and I am a bit unsure whether there is a way to reconcile forecasts if the hierarchy size changes over time without retraining the models.

Looking at the notation that R. Hyndman uses:

[image]

It seems that the matrices need to have compatible m x n dimensions.

Is this the right interpretation, or is there any way to reforecast at the hierarchical level without retraining the models?

Verbose/TQDM Progress when calling reconcile()

I followed the tutorial, and everything worked until I reached the following part:

reconcilers = [
    BottomUp(),
    TopDown(method='forecast_proportions'),
    MiddleOut(middle_level='filial/categoria', top_down_method='forecast_proportions')
]
hrec = HierarchicalReconciliation(reconcilers=reconcilers)
Y_rec_df = hrec.reconcile(Y_hat_df=Y_hat_df, Y_df=Y_train_df, S=S, tags=tags)

It works fine in small datasets, but I'm trying to use this in a dataset with the following structure:

Filial (36 unique)
----Categoria (50 unique)
--------Master(~30k unique)

Each series has 36 time steps.

The code is currently running the reconcile() function and has been stuck in this part for the last 2 hours. It would be nice to have an indication of how long it will probably take, or at least some sort of verbose output showing what is going on.

My computer has an i5-12500H, 16Gb of RAM, and an SSD for storage.

Reconcile function fail and try/except protections.

Discussed in #47

Originally posted by jakebajo August 25, 2022
Is there a way to write the HierarchicalReconciliation.reconcile() function so that if one reconcile method fails, the function will move on to the next one and still output a dataframe with the reconcile methods that worked?

I understand that you can loop through various reconcile methods instead but that is slower and inefficient. I don't think this would be a huge change in the function, just some try except essentially to the for loop. Could also output an error dataframe if necessary. I could possibly put some things together in a pull request if I have time but wanted to ask if the team has thought about this.
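The try/except idea can be sketched generically. Everything here is hypothetical: run_reconcilers, the zero-argument callables, and the method names used as dictionary keys stand in for whatever the real loop inside reconcile() calls.

```python
from typing import Callable, Dict, Tuple

def run_reconcilers(
    reconcilers: Dict[str, Callable[[], object]],
) -> Tuple[Dict[str, object], Dict[str, str]]:
    """Run each reconciler; keep successful outputs and record failures by name."""
    results: Dict[str, object] = {}
    errors: Dict[str, str] = {}
    for name, fn in reconcilers.items():
        try:
            results[name] = fn()
        except Exception as exc:  # keep going with the remaining methods
            errors[name] = repr(exc)
    return results, errors

def failing():
    # Stand-in for a method that breaks on the user's data.
    raise ValueError("singular matrix")

results, errors = run_reconcilers({
    "BottomUp": lambda: [1.0, 2.0],
    "MinTrace": failing,
})
print(list(results), errors)
```

The errors dictionary plays the role of the "error dataframe" mentioned above; it could be returned alongside the reconciled output or logged as a warning.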

Please ensure proper citation practice and proper assignment of credit in your academic papers

Kindly ensure you follow best academic citation practice in your academic papers, e.g., the accompanying paper here:
https://arxiv.org/abs/2207.03517

I'm aware that in industry it might not be common or expected to comment on prior art (or even considered counterproductive for marketing reasons), and that Nixtla is primarily a commercial venture. But, since you are publishing in the arXiv you are at least making the claim of adhering to basic scientific standards, so I think you should at least give it a try.

A proper "literature review" or "prior art" section is the minimum in any scientific paper, and that means commenting on the specific context of your contribution, not just providing a generic list of packages in an appendix.

As past contributors to sktime, you are probably aware of its hierarchical framework functionality, and the fact that the algorithms you present are also implemented there. See, e.g., this conference presentation from April:
https://github.com/sktime/sktime-tutorial-pydata-berlin-2022

Perhaps more important even are the earlier python packages when it comes to hierarchical forecasting, which have developed pertinent designs, and which you don't even cite, e.g.,:
https://github.com/carlomazzaferro/scikit-hts, FYI @carlomazzaferro
https://github.com/AngelPone/pyhts, FYI @AngelPone

And there is, of course, even more in R.

You can't claim to be a "reference" (in the title of your paper!) without making reference to what has come before. Pun intended.

From your mission statement: "We intend to continue maintaining and increasing the repository, promoting collaboration across the forecasting community."

I do hope to see that; giving due credit in your scientific papers would be a good start.

Let me know if you have any questions on best scientific practice.

`Normality` prob_reconciler covariance_type possibility

It would be good to specify in the code the type of covariance estimator used by the Normality class.
Further in the future, it would be good to add the ability to switch the type of covariance estimator used by the Normality reconciler through a covariance_type input.

[SILENT BUG] Fitted values not requested when needed

Some methods and functionality (bootstrapped prediction intervals) need the fitted values of the models. But the following iteration omits to request them.

if model_name in Y_df:
    y_hat_insample = Y_df.pivot(columns='ds', values=model_name).loc[uids].values
    y_hat_insample = y_hat_insample.astype(np.float32)
    if has_fitted:
        common_vals['y_hat_insample'] = y_hat_insample
    if bootstrap and has_level:
        common_vals['bootstrap_samples'] = _bootstrap_samples(
            y_insample=common_vals['y_insample'],
            y_hat_insample=y_hat_insample,
            y_hat=y_hat_model,
            n_samples=1_000
        )
        common_vals['bootstrap'] = bootstrap
        common_vals['level'] = level
else:
    # some methods have the residuals argument
    # but they don't need them
    # e.g. MinTrace(method='ols')
    common_vals['y_hat_insample'] = None

The above leads to unintelligible bugs.
