Library for implementing reservoir computing models (echo state networks) for multivariate time series classification and clustering.

Home Page: https://reservoir-computing.readthedocs.io/en/latest/

License: MIT License

machine-learning-algorithms reservoir-computing time-series-classification time-series-clustering

Time series classification and clustering with Reservoir Computing

Figure 1: Overview of the RC classifier.

This library makes it quick to implement different architectures for time series data based on Reservoir Computing (RC), the family of approaches popularized in machine learning by Echo State Networks. It is primarily designed to perform classification and clustering of both univariate and multivariate time series, but it can also be used for time series forecasting.

Installation

The recommended installation is with pip:

pip install reservoir-computing

Alternatively, you can install the library from source:

git clone https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing.git
cd Time-series-classification-and-clustering-with-Reservoir-Computing
pip install -e .

Quick start

The following scripts provide minimalistic examples that illustrate how to use the library for different tasks.

To run them, download the project and cd to the root folder:

git clone https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing.git
cd Time-series-classification-and-clustering-with-Reservoir-Computing

Classification

python examples/classification_example.py

You can also view the notebook.

Clustering

python examples/clustering_example.py

You can also view the notebook.

Forecasting

python examples/forecasting_example.py

You can also view the notebook.

Overview of the framework

In the following, we present the three main functionalities of this library.

Classification

Referring to Figure 1, the RC classifier consists of four different modules.

1. The reservoir module specifies the reservoir configuration (e.g., bidirectional, leaky neurons, circle topology). Given a multivariate time series $\mathbf{X}$, it generates a sequence of reservoir states $\mathbf{H}$ of the same length.
2. The dimensionality reduction module (optionally) applies dimensionality reduction to the sequence of reservoir states $\mathbf{H}$, generating a new sequence $\mathbf{\bar H}$.
3. The representation module generates a vector $\mathbf{r}_\mathbf{X}$ from the sequence of reservoir states, which represents the original time series $\mathbf{X}$ in vector form.
4. The readout module is a classifier that maps the representation $\mathbf{r}_\mathbf{X}$ to the class label $\mathbf{y}$ associated with the time series $\mathbf{X}$.
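
The four modules above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the toy data, the sizes, the weight initialization, the use of the last state as representation, and the closed-form ridge readout are all assumptions chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: N series, T time steps, V variables, C classes (all values are made up).
N, T, V, C, R = 20, 30, 3, 2, 50
X = rng.normal(size=(N, T, V))
labels = rng.integers(0, C, size=N)
Y = np.eye(C)[labels]                      # one-hot labels, shape [N, C]

# 1) Reservoir: fixed random weights, rescaled to a chosen spectral radius.
W_in = rng.uniform(-0.1, 0.1, size=(R, V))
W = rng.normal(size=(R, R))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

H = np.zeros((N, T, R))                    # sequence of reservoir states
h = np.zeros((N, R))
for t in range(T):
    h = np.tanh(X[:, t, :] @ W_in.T + h @ W.T)
    H[:, t, :] = h

# 2) Dimensionality reduction (optional): PCA over all states, keeping D components.
D = 10
flat = H.reshape(-1, R)
mean = flat.mean(axis=0)
_, _, Vt = np.linalg.svd(flat - mean, full_matrices=False)
H_bar = (H - mean) @ Vt[:D].T              # shape [N, T, D]

# 3) Representation: here simply the last reduced state of each series.
r_X = H_bar[:, -1, :]                      # shape [N, D]

# 4) Readout: ridge regression from representations to one-hot labels.
lam = 1.0
A = np.hstack([r_X, np.ones((N, 1))])      # add a bias column
W_out = np.linalg.solve(A.T @ A + lam * np.eye(D + 1), A.T @ Y)
pred = np.argmax(A @ W_out, axis=1)        # predicted class index per series
```

Only the readout weights `W_out` are fitted; the reservoir weights stay fixed after random initialization, which is what keeps training cheap in this family of models.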

This library also implements the reservoir model space, a very powerful representation $\mathbf{r}_\mathbf{X}$ for the time series. Details about the methodology can be found in the original paper.

The class RC_model contained in modules.py lets you specify, train, and test an RC model. Several options are available to customize the RC model by selecting different configurations for each module.

The training and test functions take training and test data as input, which must be provided as multidimensional NumPy arrays of shape [N,T,V], with:

N = number of samples
T = number of time steps in each sample
V = number of variables in each sample

Training and test labels (Ytr and Yte) must be provided in one-hot encoding format, i.e. a matrix [N,C], where C is the number of classes.
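
As a small illustration of the expected formats, the sketch below builds a one-hot label matrix from integer class labels with plain NumPy. The label values and array sizes are made up, and the labels are assumed to be integers in the range 0..C-1.

```python
import numpy as np

labels = np.array([0, 2, 1, 2])   # integer class labels for N = 4 samples
C = labels.max() + 1              # number of classes, assuming labels are 0..C-1
Y = np.eye(C)[labels]             # one-hot matrix of shape [N, C]

X = np.zeros((4, 10, 3))          # placeholder data: N = 4, T = 10, V = 3
```

Each row of `Y` contains a single 1 in the column of its class, and `np.argmax(Y, axis=1)` recovers the integer labels.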

from reservoir_computing.modules import RC_model

clf = RC_model()
clf.fit(Xtr, Ytr) # Training
Yhat = clf.predict(Xte) # Prediction

Clustering

The representation $\mathbf{r}_\mathbf{X}$ obtained from the representation module (step 3) can be used to perform time series clustering. The same class RC_model used for classification can be configured to directly return the time series representations, which can be used in unsupervised tasks such as clustering and dimensionality reduction.

As in the case of classification, the data must be provided as multidimensional NumPy arrays of shape [N,T,V].

from reservoir_computing.modules import RC_model

clst = RC_model(readout_type=None)
clst.fit(X)
rX = clst.input_repr # representations of the input data

The representations rX can be used to perform clustering with traditional clustering algorithms for vectorial data.
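
As a rough sketch of this last step, the code below runs a small k-means written in NumPy on an array that stands in for rX. The `kmeans` helper, the choice of k-means itself, and the synthetic two-group data are assumptions for illustration; any clustering algorithm for vectors could be used instead.

```python
import numpy as np

def kmeans(rX, k, n_iter=50, seed=0):
    """Plain k-means (hypothetical helper): returns cluster assignments and centers."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct data points.
    centers = rX[rng.choice(len(rX), size=k, replace=False)]
    for _ in range(n_iter):
        # Distance from every representation to every center, shape [N, k].
        dists = np.linalg.norm(rX[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        # Move each center to the mean of the points assigned to it.
        for j in range(k):
            if np.any(assign == j):
                centers[j] = rX[assign == j].mean(axis=0)
    return assign, centers

# Stand-in for the representations: two well-separated groups of vectors.
rng = np.random.default_rng(1)
rX = np.concatenate([rng.normal(0.0, 0.1, size=(10, 5)),
                     rng.normal(5.0, 0.1, size=(10, 5))])

assign, centers = kmeans(rX, k=2)
```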

Forecasting

The sequences $\mathbf{H}$ and $\mathbf{\bar H}$ obtained at steps 1 and 2 can be directly used to forecast the future values of the time series.

The class RC_forecaster contained in modules.py lets you specify, train, and test an RC model for time series forecasting.

from reservoir_computing.modules import RC_forecaster

fcst = RC_forecaster()
fcst.fit(Xtr, Ytr) # Training
Yhat = fcst.predict(Xte) # Predictions

Here, Xtr and Ytr are the current and future values of the time series, respectively, used for training.
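
One way to build such pairs is to shift the series by the forecast horizon, as in the sketch below. The toy sine series, the horizon of 5 steps, the 80/20 split, and the [T, V] array layout are all assumptions; the exact input layout expected by RC_forecaster is not stated here.

```python
import numpy as np

series = np.sin(np.linspace(0, 20, 200))[:, None]   # toy series, shape [T, V] with V = 1
horizon = 5                                          # forecast this many steps ahead

X_all = series[:-horizon]   # current values
Y_all = series[horizon:]    # values `horizon` steps later

# Chronological split: the test set comes strictly after the training set.
n_train = int(0.8 * len(X_all))
Xtr, Ytr = X_all[:n_train], Y_all[:n_train]
Xte, Yte = X_all[n_train:], Y_all[n_train:]
```

Splitting chronologically rather than at random avoids training on values that come after the test period.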

Advanced examples

The following notebooks illustrate more advanced use-cases.

Perform dimensionality reduction, cluster analysis, and visualize the results.
Probabilistic forecasting with advanced regression models as readout.
Use advanced classifiers as readout.

Datasets

There are several datasets available to perform time series classification/clustering and forecasting.

Classification and clustering

from reservoir_computing.datasets import ClfLoader

downloader = ClfLoader()
downloader.available_datasets(details=True)  # Print available datasets
Xtr, Ytr, Xte, Yte = downloader.get_data('Libras')  # Download dataset and return data

Forecasting

Real-world time series

from reservoir_computing.datasets import PredLoader

downloader = PredLoader()
downloader.available_datasets(details=False)  # Print available datasets
X = downloader.get_data('CDR')  # Download dataset and return data

Synthetic time series

from reservoir_computing.datasets import SynthLoader

synth = SynthLoader()
synth.available_datasets()  # Print available datasets
Xs = synth.get_data('Lorenz')  # Generate synthetic time series

Citation

Please consider citing the original paper if you are using this library in your research:

@article{bianchi2020reservoir,
  title={Reservoir computing approaches for representation and classification of multivariate time series},
  author={Bianchi, Filippo Maria and Scardapane, Simone and L{\o}kse, Sigurd and Jenssen, Robert},
  journal={IEEE Transactions on Neural Networks and Learning Systems},
  year={2020},
  publisher={IEEE}
}

License

The code is released under the MIT License. See the attached LICENSE file.

time-series-classification-and-clustering-with-reservoir-computing's Issues

Test dataset is used in training in the example? classification_example.py

The example appears to use the full 640 MTS in "X" for training and then 370 of them, "Xte", for testing. Isn't Xte a subset of X?

Please add a code license

Hi, thanks for providing the code on GitHub. As I don't know of another way to reach you, I am raising this as an issue. My question is whether we can use your code for testing and in competitions. Adding license information would benefit people like me. Thank you

Application to forecasting a meteorological variable

Hi,
I want to experiment with your work on meteorological forecasting. I'm not a professional coder and have worked with Python and Keras for 2 years (never directly with TensorFlow). I took a look at your code and I'm not sure what to modify, and where, to change the classification loss (MAE in lieu of cross_entropy).

I see line 166 in MLP.py; my guess is to change this one to !? and remove the one-hot encoder in the data preprocessing.

Another modification is the possibility to set the activation function to, for example, 'relu' and the output to 'linear'; if my understanding is correct, all output layers have the same activation function.

My goal is to use your ESN to preprocess my data and eventually feed a more complex model.

My last question: can I use only the reservoir processing (with no bidirectional option) and get the processed values to feed another kind of model? If yes, what is the right output variable in the code?

Sorry for all these questions, I'm just a bit enthusiastic about your work and I have so many ideas to explore with it.

PS: I added the 'softsign' activation to your code, it is a very effective one!
PPS: If you can just guide me, I will try to integrate my ideas into your code and will submit the modifications for approval if you don't have time.

Are you available for freelancing

@FilippoMB
Are you interested in freelancing? I have a task. What is your email?

ESN predicting constant values in classification

Hi,
I'm using the ESNClassification to classify a binary outcome from data which is input in the format (N,T,V). I've played around with the hyperparameters but no matter the combination, I get either only ones or only zeroes.

Here is the config set-up:

`config['seed'] = 1
np.random.seed(config['seed'])

# Hyperparameters of the reservoir
config['n_internal_units'] = 100 # size of the reservoir
config['spectral_radius'] = 0.4 # largest eigenvalue of the reservoir
config['leak'] = 0.5 # amount of leakage in the reservoir state update (None or 1.0 --> no leakage)
config['connectivity'] = 0.1 # percentage of nonzero connections in the reservoir
config['input_scaling'] = 0.1 # scaling of the input weights
config['noise_level'] = 0.01 # noise in the reservoir state update
config['n_drop'] = 0 # transient states to be dropped
config['bidir'] = False # if True, use bidirectional reservoir
config['circ'] = False # use reservoir with circle topology

# Dimensionality reduction hyperparameters
config['dimred_method'] = 'tenpca' # options: {None (no dimensionality reduction), 'pca', 'tenpca'}
config['n_dim'] = 40 # number of resulting dimensions after the dimensionality reduction procedure

# Type of MTS representation
config['mts_rep'] = 'last' # MTS representation: {'last', 'mean', 'output', 'reservoir'}
config['w_ridge_embedding'] = 5.0 # regularization parameter of the ridge regression

# Type of readout
config['readout_type'] = 'mlp' # readout used for classification: {'lin', 'mlp', 'svm'}

# Linear readout hyperparameters
config['w_ridge'] = 5.0 # regularization of the ridge regression readout

# SVM readout hyperparameters
config['svm_gamma'] = 0.005 # bandwidth of the RBF kernel
config['svm_C'] = 5.0 # regularization for SVM hyperplane

# MLP readout hyperparameters
config['mlp_layout'] = (40,1) # neurons in each MLP layer
config['num_epochs'] = 20 # number of epochs
config['w_l2'] = 0.05 # weight of the L2 regularization
config['nonlinearity'] = 'relu' # type of activation function {'relu', 'tanh', 'logistic', 'identity'}`

And the format of the input for training:
print(X_tr.shape)
print(X_te.shape)
print(y_tr.shape)
print(y_te.shape)
with output:
(7801, 200, 130)
(1132, 200, 130)
(7801, 1)
(1132, 1)

When I train on X_tr and y_tr, the outputs are either all ones or all zeroes. Any guidance on what could be causing this?
Cheers

Input weights shape

I'm having trouble understanding the architecture of the reservoir when looking at it as a neural network. In the code, the shape of the reservoir input "current_input" is (N, V), and the shape of _input_weights is (internal_units, V).
Normally, I would expect the shape of _input_weights to be (size of the reservoir, size of the input).
May I ask why the shape of _input_weights is (internal_units, V) and not (internal_units, V*N)?
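
A possible way to read these shapes, sketched below under the assumption of a standard echo state network update (the library's actual code may differ): N is the number of samples processed in parallel at one time step, so it acts as a batch dimension and each sample has input size V. The names `current_input` and `previous_state` follow the issue text; `W_in`, `W`, and the sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
N, V, R = 4, 3, 8                        # samples, input variables, reservoir units (toy sizes)

W_in = rng.uniform(-1, 1, size=(R, V))   # input weights: one row per reservoir unit
W = rng.normal(scale=0.1, size=(R, R))   # recurrent weights

current_input = rng.normal(size=(N, V))  # the same time step taken from N different series
previous_state = np.zeros((N, R))

# Batched update: every sample is multiplied by the same W_in.
state = np.tanh(current_input @ W_in.T + previous_state @ W.T)   # shape (N, R)

# Equivalent per-sample loop, showing that N never enters the weight shapes.
looped = np.stack([np.tanh(W_in @ x + W @ h)
                   for x, h in zip(current_input, previous_state)])
assert np.allclose(state, looped)
```

Under this reading, a weight matrix of shape (internal_units, V*N) would tie the reservoir to one particular number of samples, whereas (internal_units, V) applies the same weights to any batch size.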

TensorPCA yields complex data type array which causes error in Ridge module

Hi @FilippoMB,

I noticed that for the dataset that I'm using, the result of tensorPCA yields a complex data type Numpy array. This in turn causes an error in the ridge module which says that it does not support complex data type. Specifically, error ValueError: Complex data not supported is generated at https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing/blob/master/code/modules.py#L205

I don't face this issue when I use PCA with the same dataset.

I tried to print out the eigenvalue and eigenvector data type at https://github.com/FilippoMB/Time-series-classification-and-clustering-with-Reservoir-Computing/blob/master/code/tensorPCA.py#L33-L38 and both these vectors are of the data type, complex128, for the dataset I am using.

I tried Googling a bit and found some resources such as https://stackoverflow.com/questions/10420648/complex-eigen-values-in-pca-calculation and https://stackoverflow.com/questions/48695430/how-to-make-the-eigenvalues-and-eigenvectors-stay-real-instead-of-complex. From what I understood, due to some numerical error, the eigenvalues and eigenvectors can have a small imaginary value when linalg.eig is used. I'm not sure whether my understanding is correct.

Any thoughts on this?
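
One common way to keep the decomposition real, sketched here on a made-up covariance matrix rather than on the library's tensorPCA code, is to use `np.linalg.eigh`, which assumes a symmetric matrix, instead of `np.linalg.eig`. Whether the matrix in tensorPCA.py is symmetric up to rounding error is an assumption that would need to be checked.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(200, 6))
cov = A.T @ A / len(A)             # symmetric up to rounding error
cov = (cov + cov.T) / 2            # force exact symmetry

# eigh assumes a symmetric (Hermitian) matrix and returns real eigenvalues
# in ascending order; for real input the eigenvectors are real too.
evals, evecs = np.linalg.eigh(cov)
order = np.argsort(evals)[::-1]    # sort descending, as PCA usually expects
evals, evecs = evals[order], evecs[:, order]

# If complex output from eig must be kept, np.real_if_close drops
# imaginary parts that are negligibly small.
cleaned = np.real_if_close(np.array([1.0 + 1e-18j, 2.0 + 0j]))
```

Note that `eigh` orders eigenvalues differently from `eig`, so any code that selects the leading components has to sort explicitly, as done above.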
