〰️ hctsa 〰️: highly comparative time-series analysis

hctsa is a software package for running highly comparative time-series analysis using Matlab (full support for versions R2018b or later).

The software provides a code framework that enables the extraction of thousands of time-series features from a time series (or a time-series dataset). It also provides a range of tools for visualizing and analyzing the resulting time-series feature matrix, including:

  1. Normalizing and clustering the data,
  2. Producing low-dimensional representations of the data,
  3. Identifying and interpreting discriminating features between different classes of time series,
  4. Learning multivariate classification models.

Feel free to email me for help with real-world applications of hctsa 🤓

Acknowledgement 👍

If you use this software, please read and cite these open-access articles:

Feedback, whether by email, GitHub issues, or pull requests, is much appreciated.

For commercial use of hctsa, including licensing and consulting, contact Engine Analytics.

Getting Started 😊

Documentation 📖

Comprehensive documentation for hctsa, from getting started through to more advanced analyses, is on GitBook.

Downloading the repository ⬇️

For users unfamiliar with git, the current version of the repository can be downloaded by clicking the green Clone or download button, and then clicking Download ZIP.

We recommend using the repository with git. To do so, fork it, clone your fork to your local machine, and then set an upstream remote to keep it synchronized with the main repository, e.g., using the following code:

git remote add upstream git@github.com:benfulcher/hctsa.git

(Make sure that you have generated an SSH key and associated it with your GitHub account.)

You can then update to the latest stable version of the repository by pulling the master branch to your local repository:

git pull upstream master

For analyzing specific datasets, we recommend working outside of the repository so that incremental updates can be pulled from the upstream repository. Details on how to merge the latest version of the repository with the local changes in your fork can be found here.
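The fork-and-upstream set-up described above can be sketched as a small shell helper. This is an illustration rather than part of hctsa: `set_upstream` is a hypothetical function name, `YOUR_USERNAME` is a placeholder for your GitHub account, and the SSH form of the repository URL is an assumption chosen to match the SSH-key note above.

```shell
# Sketch only: set_upstream is a hypothetical helper, not part of hctsa.
# It adds the main repository as the "upstream" remote of an existing clone,
# or updates the URL if an "upstream" remote is already configured.
set_upstream() {
    repo_dir=$1
    upstream_url=$2
    if git -C "$repo_dir" remote get-url upstream >/dev/null 2>&1; then
        git -C "$repo_dir" remote set-url upstream "$upstream_url"
    else
        git -C "$repo_dir" remote add upstream "$upstream_url"
    fi
}

# Typical use after forking on GitHub (YOUR_USERNAME is a placeholder):
#   git clone git@github.com:YOUR_USERNAME/hctsa.git
#   set_upstream hctsa git@github.com:benfulcher/hctsa.git
#   git -C hctsa pull upstream master
```

Because the helper checks for an existing remote first, it can be re-run safely, for example if the address of the main repository changes.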

Related resources

CompEngine 💥

CompEngine is an accompanying web resource for this project. It is a self-organizing database of time-series data that allows users to upload, explore, and compare thousands of diverse types of time-series data. This vast and growing collection of time-series data can also be downloaded. You can read more about it in our 📙preprint.

catch22 2️⃣2️⃣

Is over 7000 just a few too many features for your application? Do you not have access to a Matlab license? catch22 has all of your faux-rhetorical questions covered. This reduced set of 22 features, determined through a combination of classification performance and mutual redundancy as explained in this paper, is available here as an efficiently coded C implementation with wrappers for python and R.

hctsa datasets and example workflows 💾

There is a range of open datasets with pre-computed hctsa features, as well as some examples of hctsa workflows.

(If you have data to share and host, let me know and I'll add it to this list)

Running hctsa on a cluster 💻

Matlab code for computing features for an initialized HCTSA.mat file by distributing the computation across a large number of cluster jobs (using PBS or Slurm schedulers) is here.

Publications 📕

Here we provide a list of publications that have used hctsa.

Our publications

Where journal articles (📗) are not open access, we also provide a link to the preprint (📙). Links to GitHub code repositories (:octocat:) are provided where appropriate.

The development of hctsa and other resources for feature-based time-series analysis

See the following publications for details of how the highly-comparative approach to time-series analysis has developed since our initial publication in 2013:

Applications of hctsa

We have used hctsa to:

Other Publications

hctsa has been used to:

(Let me know if I've missed any!)

hctsa licenses

Internal licenses

There are two licenses applied to the core parts of the repository:

  1. The framework for running hctsa analyses and visualizations is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. A license for commercial use is available from Engine Analytics.

  2. Code for computing features from time-series data is licensed under the GNU General Public License version 3.

A range of external code packages is provided in the Toolboxes directory of the repository, and each has its own associated license (as outlined below).

External packages and dependencies

The following Matlab toolboxes are used by hctsa and are required for full functionality of the software. In the case that some toolboxes are unavailable, the hctsa software can still be used, but only a reduced set of time-series features will be computed.

  1. Statistics Toolbox
  2. Signal Processing Toolbox
  3. Curve Fitting Toolbox
  4. System Identification Toolbox
  5. Wavelet Toolbox
  6. Econometrics Toolbox

The following time-series analysis packages are provided with the software (in the Toolboxes directory), and are used by our main feature extraction algorithms to compute meaningful structural features from time series:

Other time-series analysis resources

Other good resources for time-series analysis, including in other programming languages (python and R), are listed here.

pyopy

This excellent repository allows users to run hctsa software from within python: pyopy.

hctsaAnalysisPython

Some beginner-level python code for analyzing the results of hctsa calculations is here.

Generating time-series data from synthetic models

A Matlab repository for generating time-series data from diverse model systems is here.

tsfresh

tsfresh is native python code for extracting hundreds of time-series features, with in-built feature filtering; cf. their paper.

tscompdata and tsfeatures

These R packages are by Rob Hyndman. The first, tscompdata, makes available existing collections of time-series data for analysis. The second, tsfeatures, includes implementations of a range of time-series features.

TSFEL

TSFEL, 'Time Series Feature Extraction Library', is a python package with implementations of 60 simple time-series features (with unit tests).

Khiva

Khiva is an open-source library of efficient algorithms to analyse time series on GPU and CPU.

pyunicorn

pyunicorn is a python-based code package for nonlinear time-series analysis and complex systems.

TSFuse (python)

TSFuse can extract features from multivariate time series.

Acknowledgements 👋

Many thanks go to Romesh Abeysuriya for helping with the MySQL database set-up and install scripts, and Santi Villalba for lots of helpful feedback and advice on the software.
