Giter Club home page Giter Club logo

filippomb / time-series-classification-and-clustering-with-reservoir-computing Goto Github PK

View Code? Open in Web Editor NEW
330.0 20.0 83.0 14.15 MB

Library for implementing reservoir computing models (echo state networks) for multivariate time series classification and clustering.

Home Page: https://reservoir-computing.readthedocs.io/en/latest/

License: MIT License

Python 100.00%
machine-learning-algorithms reservoir-computing time-series-classification time-series-clustering

time-series-classification-and-clustering-with-reservoir-computing's Introduction

Time series classification and clustering with Reservoir Computing

arXiv Downloads

Figure 1: Overview of the RC classifier.

This library allows to quickly implement different architectures for time series data based on Reservoir Computing (RC), the family of approaches popularized in machine learning by Echo State Networks. This library is primarly design to perform classification and clustering of both univariate and multivariate time series. However, it can also be used to perform time series forecasting.

Installation

The recommended installation is with pip:

pip install reservoir-computing

Alternatively, you can install the library from source:

git clone https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing.git
cd Time-series-classification-and-clustering-with-Reservoir-Computing
pip install -e .

Quick start

The following scripts provide minimalistic examples that illustrate how to use the library for different tasks.

To run them, download the project and cd to the root folder:

git clone https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing.git
cd Time-series-classification-and-clustering-with-Reservoir-Computing

Classification

python examples/classification_example.py

You can also view the notebook or Open In Colab

Clustering

python examples/clustering_example.py

You can also view the notebook or Open In Colab

Forecasting

python examples/forecasting_example.py

You can also view the notebook or Open In Colab

Overview of the framework

In the following, we present the three main functionalities of this library.

Classification

Referring to Figure 1, the RC classifier consists of four different modules.

  1. The reservoir module specifies the reservoir configuration (e.g., bidirectional, leaky neurons, circle topology). Given a multivariate time series $\mathbf{X}$ it generates a sequence of the same length of Reservoir states $\mathbf{H}$.
  2. The dimensionality reduction module (optionally) applies a dimensionality reduction on the sequence of the reservoir's states $\mathbf{H}$ generating a new sequence $\mathbf{\bar H}$.
  3. The representation generates a vector $\mathbf{r}_\mathbf{X}$ from the sequence of reservoir's states, which represents in vector form the original time series $\mathbf{X}$.
  4. The readout module is a classifier that maps the representation $\mathbf{r}_\mathbf{X}$ into the class label $\mathbf{y}$, associated with the time series $\mathbf{X}$.

This library implements also the reservoir model space, a very powerful representation $\mathbf{r}_\mathbf{X}$ for the time series. Details about the methodology are found in the original paper.

The class RC_model contained in modules.py permits to specify, train and test an RC-model. Several options are available to customize the RC model, by selecting different configurations for each module.

The training and test function requires in input training and test data, which must be provided as multidimensional numpy arrays of shape [N,T,V], with:

  • N = number of samples
  • T = number of time steps in each sample
  • V = number of variables in each sample

Training and test labels (Ytr and Yte) must be provided in one-hot encoding format, i.e. a matrix [N,C], where C is the number of classes.

from reservoir_computing.modules import RC_model

clf = RC_model()
clf.fit(Xtr, Ytr) # Training
Yhat = clf.predict(Xte) # Prediction

Clustering

The representation $\mathbf{r}_\mathbf{X}$ obtained from the representation module (step 3) can be used to perform time series clustering. The same class RC_model used for classification can be configured to directly return the time series representations, which can be used in unsupervised tasks such as clustering and dimensionality reduction.

As in the case of classification, the data must be provided as multidimensional NumPy arrays of shape [N,T,V]

from reservoir_computing.modules import RC_model

clst = RC_model(readout_type=None)
clst.fit(X)
rX = clst.input_repr # representations of the input data

The representations rX can be used to perfrom clustering using traditional clustering algorithms for vectorial data, such as those here.

Forecasting

The sequences $\mathbf{H}$ and $\mathbf{\bar H}$ obtained at steps 1 and 2 can be directly used to forecast the future values of the time series.

The class RC_forecaster contained in modules.py permits to specify, train and test an RC-model for time series forecasting.

from reservoir_computing.modules import RC_forecaster

fcst = RC_forecaster()
fcst.fit(Xtr, Ytr) # Training
Yhat = fcst.predict(Xte) # Predictions

Here, Xtr, Ytr are current and future values, respectively, used for training.

Advanced examples

The following notebooks illustrate more advanced use-cases.

  • Perform dimensionality reduction, cluster analysis, and visualize the results: view or Open In Colab
  • Probabilistic forecasting with advanced regression models as readout: view or Open In Colab
  • Use advanced classifiers as readout: view or Open In Colab

Datasets

There are several datasets available to perform time series classification/clustering and forecasting.

Classification and clustering

from reservoir_computing.datasets import ClfLoader

downloader = ClfLoader()
downloader.available_datasets(details=True)  # Print available datasets
Xtr, Ytr, Xte, Yte = downloader.get_data('Libras')  # Download dataset and return data

Forecasting

Real-world time series

from reservoir_computing.datasets import PredLoader

downloader = PredLoader()
downloader.available_datasets(details=False)  # Print available datasets
X = downloader.get_data('CDR')  # Download dataset and return data

Synthetic time series

from reservoir_computing.datasets import SynthLoader

synth = SynthLoader()
synth.available_datasets()  # Print available datasets
Xs = synth.get_data('Lorenz')  # Generate synthetic time series

Citation

Please, consider citing the original paper if you are using this library in your reasearch

@article{bianchi2020reservoir,
  title={Reservoir computing approaches for representation and classification of multivariate time series},
  author={Bianchi, Filippo Maria and Scardapane, Simone and L{\o}kse, Sigurd and Jenssen, Robert},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2020},
  publisher={IEEE}
}

License

The code is released under the MIT License. See the attached LICENSE file.

time-series-classification-and-clustering-with-reservoir-computing's People

Contributors

filippomb avatar muhammadmooazam avatar sscardapane avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

time-series-classification-and-clustering-with-reservoir-computing's Issues

Please add Code license

Hi thanks for providing the code in github. As I don't know other ways to reach you , I am raising this as an issue. But my query is can we use your code for testing and if we use it for any competitions.may be adding a license info will benefit people like me. Thank you

application to meteorological forcasting variable

Hi,
I want to expériment your work on meteorological forcasting. I'm not professionnal coder and work on python and keras since 2 years (never directly on tensorflow)). I take a look to your code and I'm not sure what and where modify the code to change the classification method (MAE in lieu of cross_entropy).

I see the line 166 in MLP.py, my gess is to change this one to !? and remove the one hot enoder in the data preproscessing.

An other modification is the posibility to set the activate fonction to, exemple: 'relu' and the output to 'linear', if my comprehension is good, all output layer have the same activation function.

My goal is to use your esn to preprocessing my data and eventualy feed an more complex model.

my last question, can I use only only reservoir processing (with no bidirectionnal ) and get the process value to feed an other kind of model? If yes, what is the goot output var in the code?

Sorry for all this question, Im just a bit entthousiast about your work an I have so much idea to explore with it.

PS: I add 'softsign' activation to your code, it is a very effective one!
PPS: If you can just guide me I will try to integer my idea to your code and will summit you the modif for approbation if you dont have time.

ESN predicting constant values in classification

Hi,
I'm using the ESNClassification to classify a binary outcome from data which is input in the format (N,T,V). I've played around with the hyperparameters but no matter the combination, I get either only ones or only zeroes.

Here is the config set-up:

`config['seed'] = 1
np.random.seed(config['seed'])

Hyperarameters of the reservoir

config['n_internal_units'] = 100 # size of the reservoir
config['spectral_radius'] = 0.4 # largest eigenvalue of the reservoir
config['leak'] = 0.5 # amount of leakage in the reservoir state update (None or 1.0 --> no leakage)
config['connectivity'] = 0.1 # percentage of nonzero connections in the reservoir
config['input_scaling'] = 0.1 # scaling of the input weights
config['noise_level'] = 0.01 # noise in the reservoir state update
config['n_drop'] = 0 # transient states to be dropped
config['bidir'] = False # if True, use bidirectional reservoir
config['circ'] = False # use reservoir with circle topology

Dimensionality reduction hyperparameters

config['dimred_method'] ='tenpca' # options: {None (no dimensionality reduction), 'pca', 'tenpca'}
config['n_dim'] = 40 # number of resulting dimensions after the dimensionality reduction procedure

Type of MTS representation

config['mts_rep'] = 'last' # MTS representation: {'last', 'mean', 'output', 'reservoir'}
config['w_ridge_embedding'] = 5.0 # regularization parameter of the ridge regression

Type of readout

config['readout_type'] = 'mlp' # readout used for classification: {'lin', 'mlp', 'svm'}

Linear readout hyperparameters

config['w_ridge'] = 5.0 # regularization of the ridge regression readout

SVM readout hyperparameters

config['svm_gamma'] = 0.005 # bandwith of the RBF kernel
config['svm_C'] = 5.0 # regularization for SVM hyperplane

MLP readout hyperparameters

config['mlp_layout'] = (40,1) # neurons in each MLP layer
config['num_epochs'] = 20 # number of epochs
config['w_l2'] = 0.05 # weight of the L2 regularization
config['nonlinearity'] = 'relu' # type of activation function {'relu', 'tanh', 'logistic', 'identity'}`

And the format of the input for training:
print(X_tr.shape) print(X_te.shape) print(y_tr.shape) print(y_te.shape)
with output:
(7801, 200, 130) (1132, 200, 130) (7801, 1) (1132, 1)

when I train on X_tr and y_tr, the output are either all ones or all zeroes. Any guidance on what could be causing this error?
Cheers

Input weights shape

I'm facing a problem understanding the architecture of the reservoir if we look at it as a neural network. In here the shape of the input for the reservoir "current_input" in the code is (N, V), and the shape of the _input_weights is (internal_units, V).
And normally, the shape of the _input_weights should be (size of reservoir* size of the input).
Can I kindly ask why the size of the _input_weights is (internal_units, V) and not (internal_units, V*N)?

TensorPCA yields complex data type array which causes error in Ridge module

Hi @FilippoMB,

I noticed that for the dataset that I'm using, the result of tensorPCA yields a complex data type Numpy array. This in turn causes an error in the ridge module which says that it does not support complex data type. Specifically, error ValueError: Complex data not supported is generated at https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing/blob/master/code/modules.py#L205

I don't face this issue when I use PCA with the same dataset.

I tried to print out the eigenvalue and eigenvector data type at https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing/blob/master/code/tensorPCA.py#L33-L38 and both these vectors are of the data type, complex128, for the dataset I am using.

I tried Googling a bit and found some resources such as https://stackoverflow.com/questions/10420648/complex-eigen-values-in-pca-calculation and https://stackoverflow.com/questions/48695430/how-to-make-the-eigenvalues-and-eigenvectors-stay-real-instead-of-complex. From what I understood, due to some numerical error, the eigenvalues and eigenvectors can have a small imaginary value when linalg.eig is used. I'm not sure whether my understanding is correct.

Any thoughts on this?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.