
Comments (5)

titu1994 avatar titu1994 commented on July 23, 2024

Please refer to the comments on Issue #4.

The dimension shuffle layer is utilized to reduce the univariate time series problem to a multivariate time series problem with one time step. Doing so reduces the model capacity of the LSTM block, therefore it alone is not a strong classifier. It, however, works in conjunction with the FCN block, which is the basic feature extractor.

Our motivation for doing so was severalfold: a regular LSTM severely overfits the simple classification problems of the UCR datasets and gets much lower accuracy than the SOTA; an LSTM with dimension shuffle alone severely underfits the task due to its reduced capacity; the FCN alone performs well, but not as well as the concatenation of the FCN and LSTM branches; and we needed to process sequential information quickly without losing all of the sequential semantics of the data.

The dimension-shuffled LSTM trains quickly because it processes only a single time step (for a multivariate input with M variables, it processes M time steps), and it augments the performance of the CNN.
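To make the dimension shuffle concrete, here is a minimal numpy sketch of the transpose (the batch size and series length are illustrative, not taken from the paper; the actual model applies the equivalent permutation as a layer inside the network):

```python
import numpy as np

# Illustrative batch: 8 univariate series, each 128 steps long.
# Standard LSTM input layout: (batch, timesteps, features) = (8, 128, 1).
x = np.random.rand(8, 128, 1)

# Dimension shuffle: swap the time and feature axes, so the LSTM sees
# a single time step whose "features" are the 128 original time steps.
x_shuffled = np.transpose(x, (0, 2, 1))  # shape (8, 1, 128)

print(x_shuffled.shape)
```

With this layout the recurrent loop runs for only one step per sample, which is where the speedup discussed above comes from.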

As your work requires a multivariate input (2 variables), I suggest referring to our follow up work - https://arxiv.org/abs/1801.04503 which discusses the extension of this model to multivariate time series classification. The model architecture and training scripts are available at: https://github.com/titu1994/MLSTM-FCN

As for why we chose not to use a bidirectional LSTM or a stack of LSTMs: they simply overfit the simple UCR datasets, and the additional capacity reduces the overall performance of the LSTM-FCN model.

from lstm-fcn.

Goschjann avatar Goschjann commented on July 23, 2024

@titu1994 thanks a lot for your comprehensive answer, highly appreciated!

I started to question your approach because, in my case, the dimension shuffle had a negative effect compared to a non-shuffled version. But of course this may be because my problem is more complicated than the UCR problems. In that context (a model that is already too complex), bidirectional LSTMs/stacking indeed make no sense from your perspective (though I found them to improve performance).

Again, thanks for your time and work!


shaform avatar shaform commented on July 23, 2024

I have a follow-up question: what is the difference between
(1) dimension shuffle + LSTM with 1 time step, and
(2) simply feeding the whole time series to a fully connected layer with tanh activation, where the input size equals the number of time steps of the input series?


titu1994 avatar titu1994 commented on July 23, 2024

We had similar questions from others as well, which is why we performed an extensive ablation study which can be found here - Insights into LSTM Fully Convolutional Networks for Time Series Classification.

In it, we replaced the dimension shuffled LSTM with dimension shuffled GRU, basic RNN and a fully connected layer with sigmoid activation function (which is similar to your (2), but with sigmoid instead of tanh).

We find that LSTM with dimension shuffle beats the rest in a large majority of cases. In addition, we find that the simple fully connected layer with sigmoid activation performs closer to the LSTM than all the other RNNs.

We used sigmoid because 3 of the 4 LSTM activations are sigmoid-based, and we think the complex gating of the LSTM is what boosts performance compared to a single fully connected layer with sigmoid/tanh activation.
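To make the comparison in this thread concrete, here is a minimal numpy sketch (weights, sizes, and the random inputs are illustrative, not the trained model) contrasting option (2), a plain sigmoid dense layer, with a single gated LSTM step on the same dimension-shuffled input vector:

```python
import numpy as np

rng = np.random.default_rng(0)
T, H = 128, 8  # illustrative: series length, hidden units

x = rng.standard_normal(T)  # the whole series as one "feature" vector
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# (2) plain fully connected layer with sigmoid activation
Wd = rng.standard_normal((T, H)) * 0.1
dense_out = sigmoid(x @ Wd)

# (1) one LSTM step on the same vector: three sigmoid gates plus a tanh
# candidate, applied to an initial zero state. The extra gating is the
# structural difference from the plain dense layer.
Wi, Wf, Wo, Wc = (rng.standard_normal((T, H)) * 0.1 for _ in range(4))
i, f, o = sigmoid(x @ Wi), sigmoid(x @ Wf), sigmoid(x @ Wo)
c = i * np.tanh(x @ Wc)       # previous cell state is zero, so f is unused
lstm_out = o * np.tanh(c)

print(dense_out.shape, lstm_out.shape)
```

Both produce an H-dimensional output from the same input, but the LSTM step composes several sigmoid gates with a tanh candidate, which is the gating the comment above credits for the performance gap.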


philippreis7 avatar philippreis7 commented on July 23, 2024

@titu1994 I am currently working with the MALSTM-FCN architecture for a classification task and I am impressed by its performance so far. What I want to ask is: do you think a sliding window would increase performance? In time series it is often necessary to find small temporal patterns; wouldn't a sliding window help find them? Have you tested it?

Thank you in advance for a quick response!

