arundo / tsaug

A Python package for time series augmentation

Home Page: https://tsaug.readthedocs.io

License: Apache License 2.0

Python 100.00%

time-series data-augmentation deep-learning audio

tsaug's Introduction

tsaug

tsaug is a Python package for time series augmentation. It offers a set of augmentation methods for time series, as well as a simple API to connect multiple augmenters into a pipeline.

See https://tsaug.readthedocs.io for complete documentation.

Installation

Prerequisites: Python 3.5 or later.

It is recommended to install the most recent stable release of tsaug from PyPI.

pip install tsaug

Alternatively, you can install from source. This gives you the latest, but unstable, version of tsaug.

git clone https://github.com/arundo/tsaug.git
cd tsaug/
git checkout develop
pip install ./

Examples

A first-time user may start with two examples:

Examples of every individual augmenter can be found here

For full references of implemented augmentation methods, please refer to References.

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

Please see Contributing for more details.

License

tsaug is licensed under the Apache License 2.0. See the LICENSE file for details.

tsaug's Issues

How to augment multi_variate time series data?

I noticed that when augmenting multivariate time series data, the augmented data is concatenated along axis 0 instead of being stacked along a new (third) axis.
Suppose the data shape is (18, 1000). After augmentation it becomes (72, 1000), but I believe it should be (4, 18, 1000). Does simply reshaping with data.reshape(4, 18, 1000) resolve the problem or not?
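Whether a plain reshape is safe depends on the order in which the copies are concatenated, which this thread does not state. The NumPy-only sketch below uses unmodified copies as stand-ins for augmented series and shows both possible layouts; the variable names are illustrative, and the layout tsaug actually produces should be checked before reshaping.

```python
import numpy as np

n_series, n_steps, n_copies = 18, 1000, 4
X = np.arange(n_series * n_steps, dtype=float).reshape(n_series, n_steps)

# Layout A (copy-major): the whole batch is repeated back to back.
tiled = np.tile(X, (n_copies, 1))                      # shape (72, 1000)
by_copy_a = tiled.reshape(n_copies, n_series, n_steps)

# Layout B (series-major): each series is repeated consecutively.
repeated = np.repeat(X, n_copies, axis=0)              # shape (72, 1000)
by_copy_b = repeated.reshape(n_series, n_copies, n_steps).transpose(1, 0, 2)

# Reshaping layout B as if it were layout A mixes different series together.
naive = repeated.reshape(n_copies, n_series, n_steps)
```

With layout A, `reshape(4, 18, 1000)` groups the rows correctly. With layout B, the rows have to be reshaped to `(18, 4, 1000)` first and then transposed.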

can't find the deepad python package

In the quickstart notebook https://github.com/arundo/tsaug/blob/master/docs/quickstart.ipynb
from deepad.visualization import plot
where can you find the deepad package to install?

Equivalence in transformation names

Hello

I'm very interested in using the tsaug library in my personal project.

I have read the paper "Data Augmentation of Wearable Sensor Data for Parkinson's Disease Monitoring using Convolutional Neural Networks" and I'm quite confused about the names of the transformations.

What are the equivalents in the tsaug library of the transformations Jittering, Scaling, Rotation, Permutation, and MagWarp mentioned in this paper?

Also, I have read the blog post "https://www.arundo.com/arundo_tech_blog/tsaug-an-open-source-python-package-for-time-series-augmentation", and I didn't find the equivalents of RandomMagnify, RandomJitter, etc.

Could you help me with these questions?

Best regards

Oscar

Missing function calls in documentation

Hi!

I noticed that the documentation is missing a few important notes.

For instance, the first example contains this snippet:

>>> import numpy as np
>>> X = np.load("./X.npy")
>>> Y = np.load("./Y.npy")
>>> from tsaug.visualization import plot
>>> plot(X, Y)

and shows a chart, which suggests that the chart is rendered immediately after calling the plot function.

In the configurations I've seen and worked with, the plot function does not render a chart immediately. Instead it returns a Tuple[matplotlib.figure.Figure, matplotlib.axes._axes.Axes]. This means we need to take the first element of the returned tuple and call .show() on it, so the example should rather be:

>>> import numpy as np
>>> X = np.load("./X.npy")
>>> Y = np.load("./Y.npy")
>>> from tsaug.visualization import plot
>>> figure, _ = plot(X, Y)
>>> figure.show()

I can create a pull request with these corrections if you're open to contributions.

How to cite this repo?

Basically the title.
I used this awesome repo and I would like to cite it in my paper. How should I do that?
If you could provide a BibTeX entry, that would be great.

Static random augmentation across multiple time series

Hello,

I have a use case where I apply temporal augmentation with the same random anchor across multiple time series within a segmented object. I.e., I want certain augmentations to vary across objects, but remain constant within objects.

In TimeWarp, e.g., I've added an optional keyword argument (static_rand):

    def __init__(
        self,
        n_speed_change: int = 3,
        max_speed_ratio: Union[float, Tuple[float, float], List[float]] = 3.0,
        repeats: int = 1,
        prob: float = 1.0,
        seed: Optional[int] = _default_seed,
        static_rand: Optional[bool] = False,
    ):

which is used by:

    if self.static_rand:
        anchor_values = rand.uniform(low=0.0, high=1.0, size=self.n_speed_change + 1)
        anchor_values = np.tile(anchor_values, (N, 1))
    else:
        anchor_values = rand.uniform(
            low=0.0, high=1.0, size=(N, self.n_speed_change + 1)
        )

Thus, instead of having N time series with different random anchor_values, I generate N time series that share the same anchor values.
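The difference between the two branches can be checked in isolation with NumPy. The sketch below is a standalone illustration, not tsaug code: `draw_anchor_values` is a hypothetical helper that mirrors the snippet above, with `static_rand` taken from this proposal.

```python
import numpy as np


def draw_anchor_values(rand, n_series, n_speed_change, static_rand=False):
    """Draw anchor values with shape (n_series, n_speed_change + 1)."""
    if static_rand:
        # One draw, copied to every series: all rows are identical.
        anchors = rand.uniform(low=0.0, high=1.0, size=n_speed_change + 1)
        return np.tile(anchors, (n_series, 1))
    # Independent draw per series: rows differ.
    return rand.uniform(low=0.0, high=1.0, size=(n_series, n_speed_change + 1))


rand = np.random.RandomState(0)
shared = draw_anchor_values(rand, n_series=4, n_speed_change=3, static_rand=True)
independent = draw_anchor_values(rand, n_series=4, n_speed_change=3)
```

Both results have the same shape, so the rest of the augmenter would not need to change. Only the row-wise variation differs.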

I use this approach with TimeWarp and Drift. Would this be of any interest as a PR, or does it sound too specific?

Thanks for the nice library.

ValueError: The numbers of series in X and Y are different.

The shape of X is (54, 337) and the shape of y is (54,), but I am getting this error. I am using the following code:

from tsaug import TimeWarp, Crop, Quantize, Drift, Reverse
my_augmenter = (
    TimeWarp() * 5  # random time warping 5 times in parallel
    + Crop(size=300)  # random crop subsequences with length 300
    + Quantize(n_levels=[10, 20, 30])  # random quantize to 10-, 20-, or 30- level sets
    + Drift(max_drift=(0.1, 0.5)) @ 0.8  # with 80% probability, random drift the signal up to 10% - 50%
    + Reverse() @ 0.5  # with 50% probability, reverse the sequence
)
data, labels = my_augmenter.augment(data, labels)
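One possible explanation, offered here as an assumption rather than something confirmed in this thread, is that the second argument of augment is treated as a point-wise mask aligned with the time axis of X, so a 1-D array of 54 labels is read as a single series of length 54. Under that assumption, a minimal NumPy sketch of a workaround is to broadcast each per-series label along the time axis and read it back afterwards. The tsaug call is left as a comment, and the arrays are synthetic stand-ins.

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.rand(54, 337)               # 54 series, 337 time steps each
y = rng.randint(0, 2, size=54)      # one binary label per series

# Broadcast each label along the time axis so the mask matches X's shape.
Y_mask = np.repeat(y[:, np.newaxis], X.shape[1], axis=1)   # shape (54, 337)

# Hypothetical usage, assuming the augmenter accepts a mask shaped like X:
# X_aug, Y_aug = my_augmenter.augment(X, Y_mask)
# y_aug = Y_aug[:, 0]

# Every column of the mask carries the label, so the first one recovers it.
y_recovered = Y_mask[:, 0]
```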

Let's use type hints!

We would like to start using type hints as a better Python programming practice. For a first-time contributor, this is probably a nice starting point, as you will go through every part of the code base and familiarize yourself with the code structure.

To-do's:

Add type hints to all functions.
Modify docstrings accordingly, so sphinx-autodoc will automatically grab type info from type hints.
Add unit tests (with mypy?) for type checking.
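As a rough illustration of the first two to-do items, here is a small function with type hints whose docstring leaves the types to the annotations. `add_offset` is a made-up example, not a tsaug function, and the docstring layout is only one possible convention.

```python
from typing import Optional, Tuple

import numpy as np


def add_offset(
    X: np.ndarray, offset: float = 0.0, Y: Optional[np.ndarray] = None
) -> Tuple[np.ndarray, Optional[np.ndarray]]:
    """Shift every value of X by a constant and pass Y through unchanged.

    Parameters
    ----------
    X
        Time series to shift.
    offset
        Constant added to every value of X.
    Y
        Optional companion array, returned as-is.
    """
    return X + offset, Y


X_out, Y_out = add_offset(np.zeros(3), offset=1.5)
```

A static checker such as mypy could then flag a call like `add_offset(np.zeros(3), offset="1.5")` without running the code.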

Have fun!

How to understand normalisation in add_noise and how it is achieved?

Hello, I'd like to ask about the normalize parameter in add_noise. I set it to True and traced where the parameter is used, but I can't see how noise * (X.max(axis=1) - X.min(axis=1)) performs a normalisation. I thought it might be max-min normalisation, but the max-min normalisation formula I found is X* = (X - min(X)) / (max(X) - min(X)). How should I understand this code, and what is the role of normalize?

    if self.kind == "additive":
        if self.normalize:
            X_aug = X + noise * (
                X.max(axis=1, keepdims=True) - X.min(axis=1, keepdims=True)
            )
        else:
            X_aug = X + noise
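Read literally, the quoted branch does not rescale X at all. It multiplies the noise by each series' value range (max minus min), so the noise amplitude is relative to the scale of the series. The NumPy sketch below reproduces only that arithmetic on two synthetic series with very different ranges; it is an illustration of the quoted expression, not the library's full implementation.

```python
import numpy as np

rng = np.random.RandomState(0)

# Two series of length 100: one spans [0, 1], the other spans [0, 1000].
X = np.stack([np.linspace(0.0, 1.0, 100), np.linspace(0.0, 1000.0, 100)])
noise = rng.normal(loc=0.0, scale=0.1, size=X.shape)

# Per-series range, kept as a column so it broadcasts over the time axis.
value_range = X.max(axis=1, keepdims=True) - X.min(axis=1, keepdims=True)

X_aug_normalized = X + noise * value_range   # noise scaled to each series' range
X_aug_plain = X + noise                      # same absolute noise for both series
```

With the scaled version, a noise value of 0.1 perturbs the first series by 0.1 and the second by 100. Without scaling, both are perturbed by 0.1, which is negligible for the second series.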

Default _Augmentor arguments will raise an error

While working on #1 I found that the default args for initializing an _Augmentor object could lead to the code trying to call None when expecting a function.

See:

tsaug/src/tsaug/augmentor.py

Line 5 in ebf1955

def __init__(self, augmentor_func=None, is_random=None, **kwargs):

tsaug/src/tsaug/augmentor.py

Line 6 in ebf1955

self._augmentor_func = augmentor_func

and

tsaug/src/tsaug/augmentor.py

Line 47 in ebf1955

X_aug = self._augmentor_func(X_aug, **self._params)

I know that it's not intended to be initialized without an augmenter function, but I was wondering if you want to explicitly prevent an error here.

Or is something else supposed to be happening?
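One way to make the failure explicit is to validate the argument in the constructor. The sketch below uses a hypothetical `GuardedAugmentor` class that copies only the three quoted lines; the `run` method and the error message are assumptions for illustration, not the project's code.

```python
class GuardedAugmentor:
    """Hypothetical stand-in for _Augmentor with an explicit callable check."""

    def __init__(self, augmentor_func=None, is_random=None, **kwargs):
        # Fail early with a clear message instead of calling None later.
        if not callable(augmentor_func):
            raise TypeError(
                "augmentor_func must be callable, got "
                + type(augmentor_func).__name__
            )
        self._augmentor_func = augmentor_func
        self._is_random = is_random
        self._params = kwargs

    def run(self, X_aug):
        # Mirrors the quoted call: X_aug = self._augmentor_func(X_aug, **self._params)
        return self._augmentor_func(X_aug, **self._params)


def scale(X, factor):
    return X * factor


doubled = GuardedAugmentor(scale, factor=2).run(3)
```

With this check, `GuardedAugmentor()` raises a TypeError at construction time rather than failing later inside the augmentation call.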

Data augmentation for regression

First of all, many thanks for releasing this open source library! I have been trying to build a time series regression model with a limited data set. In particular, I have only 5000 data points (20 years of market/economic data) to model my company's returns. While searching for ways to augment time series, I came across this library. However, while checking the augmentation functions, I noticed that the target has to be binary. Would it be possible to use this library for continuous target values?

Many thanks in advance!

Regards,
Mehmet

_Augmenter should be exposed properly as tsaug.Augmenter

Might be related to #1

In the current state of the package, the _Augmenter class is an internal class that should not be used outside of the package itself... but it's also the base class for all usable classes from tsaug. This makes it very weird to type "generic" functions outside of tsaug, e.g.

# this should not appear in a normal Python code
from tsaug._augmenters.base import _Augmenter

def apply_transformation(aug: _Augmenter):
    ...

The _Augmenter class should be exposed as tsaug.Augmenter so that it can be used for proper typing outside of the tsaug package.
