stldecompose's Introduction

STL Decompose

This is a relatively naive Python implementation of a seasonal and trend decomposition using Loess smoothing, commonly referred to as an "STL decomposition". Cleveland's 1990 paper is the canonical reference.

This implementation is a variation of (and takes inspiration from) the implementation of the seasonal_decompose method in statsmodels. In this implementation, the trend component is calculated by substituting a configurable Loess regression for the convolutional method used in seasonal_decompose. It also extends the existing DecomposeResult from statsmodels to allow for forecasting based on the calculated decomposition.
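
As a rough illustration of the idea above (not this package's actual code), the sketch below estimates the trend with the lowess smoother from statsmodels, then averages the detrended values at each position in the cycle to obtain the seasonal component. The function name naive_stl and its frac parameter are hypothetical and chosen only for this example.

```python
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess


def naive_stl(values, period, frac=0.6):
    """Split a 1-D series into trend, seasonal and residual parts.

    Hypothetical helper for illustration; assumes len(values) >= period.
    """
    y = np.asarray(values, dtype=float)
    t = np.arange(len(y), dtype=float)

    # Trend: Loess fit of the observations against time,
    # returned in the original order of the observations.
    trend = lowess(y, t, frac=frac, return_sorted=False)

    # Seasonal: mean detrended value at each position in the cycle,
    # centered so that one full cycle averages to zero.
    detrended = y - trend
    cycle = np.array([detrended[i::period].mean() for i in range(period)])
    cycle -= cycle.mean()
    seasonal = np.tile(cycle, len(y) // period + 1)[: len(y)]

    # Residual: whatever the trend and seasonal parts do not explain.
    resid = y - trend - seasonal
    return trend, seasonal, resid
```

By construction the three components add back up to the original observations, which is the additive model that seasonal_decompose also uses by default.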

Usage

The stldecompose package is relatively lightweight. It uses pandas.DataFrame for inputs and outputs, and exposes only a couple of primary methods, decompose() and forecast(), as well as a handful of built-in forecasting functions.

See the included IPython notebook for more details and usage examples.
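
To give a feel for how a forecast can be built from a decomposition, here is a minimal sketch that extrapolates the trend with a simple "drift" rule and adds back the last seasonal cycle. Both function names are hypothetical, and whether the package's built-in forecasting functions work exactly this way is an assumption; the notebook remains the reference for the real API.

```python
import numpy as np


def drift_forecast(trend, steps):
    """Extend the trend along the line through its first and last points."""
    trend = np.asarray(trend, dtype=float)
    slope = (trend[-1] - trend[0]) / (len(trend) - 1)
    return trend[-1] + slope * np.arange(1, steps + 1)


def forecast_from_components(trend, seasonal, period, steps):
    """Hypothetical helper: drift-extrapolated trend plus repeated seasonality."""
    seasonal = np.asarray(seasonal, dtype=float)
    # The last full cycle starts at the same phase as the first forecast step.
    last_cycle = seasonal[-period:]
    repeated = np.tile(last_cycle, steps // period + 1)[:steps]
    return drift_forecast(trend, steps) + repeated
```

The residual component is left out of the forecast on purpose, since it is assumed to be noise.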

Installation

A Python 3 virtual environment is recommended.

The preferred method of installation is via pip:

(env) $ pip install stldecompose

If you'd like the bleeding-edge version, you can also install from this GitHub repo:

(env) $ git clone git@github.com:jrmontag/STLDecompose.git
(env) $ cd STLDecompose; pip install .

stldecompose's People

Contributors

boegel, jrmontag

stldecompose's Issues

Check for any/proper DatetimeIndex in observations

The pandas wrapper in decompose() keeps track of the observation DatetimeIndex so it can be used properly in the forecast() method.

There are two potential issues here:

  • while the decomposition seems to work fine with non-continuous data (e.g. with missing observations), it does require a DatetimeIndex
  • though the decomposition would work with missing data, the forecast will fail awkwardly if there is a bad DatetimeIndex

A couple of possibilities:

  • the decompose method could raise an exception if the input is missing a DatetimeIndex
  • the forecast method could catch exceptions and re-raise them with a more helpful message
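
The first possibility above could look something like the following sketch. The function name require_regular_datetime_index is hypothetical and is not part of stldecompose; it only uses pandas calls (DatetimeIndex.freqstr and pd.infer_freq) to reject input that would later trip up a forecast.

```python
import pandas as pd


def require_regular_datetime_index(observations):
    """Return the index frequency string, or raise a descriptive error.

    Hypothetical validation helper, not part of the stldecompose API.
    """
    index = observations.index
    if not isinstance(index, pd.DatetimeIndex):
        raise TypeError(
            "observations must be indexed by a pandas DatetimeIndex, "
            f"got {type(index).__name__}"
        )
    if len(index) < 3:
        # pd.infer_freq needs at least three timestamps.
        raise ValueError("need at least 3 observations to infer a frequency")

    # Prefer an explicitly set frequency; otherwise try to infer one.
    freq = index.freqstr or pd.infer_freq(index)
    if freq is None:
        raise ValueError(
            "could not determine a regular frequency; "
            "the DatetimeIndex may have gaps or irregular spacing"
        )
    return freq
```

Calling a check like this at the top of decompose() would surface the problem early, instead of letting it show up as an obscure failure inside forecast().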

Consider a ramp-down of this library

It appears statsmodels has merged a PR to implement this years-old request! statsmodels/statsmodels#4044 Since this library primarily extends the existing statsmodels internals, they will probably do this more robustly and flexibly.

For those currently using this library, there are no plans to delete this library! I've still got some net new things I'd like to do with it.

Cannot call stldecompose library

Python version: 3.7.*
statsmodels version: statsmodels==0.11.0
Stldecompose version: stldecompose==0.0.5
Error: ImportError: cannot import name '_maybe_get_pandas_wrapper_freq' from 'statsmodels.tsa.filters._utils'
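
One way to confirm this kind of breakage before importing stldecompose is to check whether the private helper still exists in the installed statsmodels. The sketch below uses only the standard library; the function name has_attribute is hypothetical.

```python
import importlib


def has_attribute(module_name, attribute):
    """Return True if the module imports cleanly and defines the attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attribute)


# The module and helper names are taken from the error message above.
print(
    has_attribute(
        "statsmodels.tsa.filters._utils",
        "_maybe_get_pandas_wrapper_freq",
    )
)
```

If this prints False, the installed statsmodels does not provide the helper that the error message says stldecompose is trying to import.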

Include more than one period in seasonal decomposition

Currently, the decompose method takes a single period value for the decomposition. This could be extended to accept an iterable of frequency values which are sequentially factored out of the observation to build up the seasonal component.

Another option would be to automatically calculate the top frequencies/periodicities to factor out by calling out to a scipy FFT routine and applying a threshold on how large a frequency component must be to be included.
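
A minimal sketch of the FFT idea, using numpy's FFT routines in place of scipy's for brevity. The function name dominant_periods and its two parameters are hypothetical, and the thresholding rule (keep components whose power is at least a fixed fraction of the strongest one) is just one possible choice.

```python
import numpy as np


def dominant_periods(values, max_periods=3, min_power_fraction=0.1):
    """Return the strongest periodicities, in samples, strongest first.

    Hypothetical helper; assumes evenly spaced observations.
    """
    x = np.asarray(values, dtype=float)
    x = x - x.mean()  # remove the constant offset

    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0)

    # Drop the zero-frequency bin, which has no finite period.
    power, freqs = power[1:], freqs[1:]
    if power.size == 0 or power.max() == 0:
        return []

    # Rank bins by power and keep those above the threshold.
    order = np.argsort(power)[::-1][:max_periods]
    cutoff = min_power_fraction * power.max()
    return [float(1.0 / freqs[i]) for i in order if power[i] >= cutoff]
```

The returned periods could then be passed one at a time to the decomposition, factoring each seasonal component out of the remainder in turn.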

The seasonal and residual components are NaN

When I call decompose(X), the returned seasonal and residual values for every observation are NaN. Could you check this?
The code below produces the output that follows.
import stldecompose

result = stldecompose.decompose(X)
print(result.seasonal)  # same output for result.resid

1998-05-06 NaN
1998-05-07 NaN
1998-05-08 NaN
1998-05-09 NaN
1998-05-10 NaN
1998-05-11 NaN
1998-05-12 NaN
1998-05-13 NaN
1998-05-14 NaN
1998-05-15 NaN
1998-05-16 NaN
1998-05-17 NaN
1998-05-18 NaN
1998-05-19 NaN
1998-05-20 NaN
1998-05-21 NaN
1998-05-22 NaN
1998-05-23 NaN
1998-05-24 NaN
1998-05-25 NaN
1998-05-26 NaN
1998-05-27 NaN
1998-05-28 NaN
1998-05-29 NaN
1998-05-30 NaN
1998-05-31 NaN

Can STL process embedded time series data?

Dear jrmontag:
Hello! I'd like to find out whether there are decomposition methods that can process embedded data,
i.e. data whose shape has been transformed from (length,) to (length, embedding). Can STL do this?

broken scipy import paths

Dear Josh,

I’m working with time series of water supply for a city of more than 8 million inhabitants.
Your project is of great interest for my purposes.

However, when I install it under Python 3.7, I get an error: “cannot import name ‘factorial’ from ‘scipy.misc’”.
Your .whl installed scipy version 1.3.0, in which ‘scipy.misc.factorial’ has been removed (it was deprecated in version 1.0.0); the function
factorial is now under ‘scipy.special’.

Could you change this call to scipy.special.factorial, please? Otherwise, it will be necessary to downgrade scipy to version 0.19.1.

Any advice on this situation would be appreciated. Thanks in advance,

Raul

PS: the stldecomposemod project v=0.0.3 has the same error.
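
A common way to handle a moved function like this is a guarded import that prefers the new location and falls back to the old one. This is only a sketch of the pattern and assumes scipy is installed; it is not necessarily the change made in this project.

```python
try:
    # Location of factorial in current scipy releases.
    from scipy.special import factorial
except ImportError:
    # Fallback for old scipy releases that only expose it here.
    from scipy.misc import factorial

# exact=True asks for integer arithmetic rather than a floating-point result.
print(factorial(5, exact=True))
```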

stldecompose.forecast() fails on a dataframe with offsetalias/freq=T

I have been looking at this project lately and noticed that the forecast function fails when I pass a dataframe that has a frequency offset of a minute (not exactly a minute, but any frequency that is a multiple of T/min).

I tried to look closer, but could not figure out the exact reason. Even the usage example (https://github.com/jrmontag/STLDecompose/blob/master/STL%20usage%20example.ipynb) breaks when I resample with frequency in minutes. Any help/pointers would be great.

Include contributor guidelines

It'd be nice to have some guidance for how others can best contribute to this project e.g. a dev requirements.txt file (twine, wheel, ?), tests to check before submitting a PR, etc.
