robromijnders / ae_ts

Auto encoder for time series

License: MIT License

Python 100.00%

ae_ts's Introduction

Auto encoder for time series

EDIT 3 December 2018: I receive many questions over email. I compiled the most common questions into a FAQ at the end of this readme.

This repo presents a simple auto encoder for time series. It visualizes the embeddings using both PCA and tSNE. I show this on a dataset of 5000 ECGs. The model doesn't use the labels during training. Yet, the produced clusters visually separate the classes of ECGs.

People repeatedly ask me how to find patterns in time series using ML. The usual wavelet transforms and other features fail to yield results. They wonder what ML has to offer.

For categorical data, the usual choices are techniques like PCA and tSNE.
For images, the usual choice is a convolutional auto encoder.
For time series, what is the usual choice?
- This repo implements a recurrent auto encoder.

Why use a Recurrent Neural Network in an auto encoder?

The length of a time series may vary from sample to sample. Conventional techniques only work on inputs of fixed size.
The patterns in time series can have an arbitrary time span and be non-stationary. The recurrent neural network can learn patterns at arbitrary time scales. The convolutional net, however, assumes only stationary patterns.

The network

From here on, RNN refers to our Recurrent Neural Network architecture, the Long Short-Term Memory (LSTM). Our network in AE_ts_model.py has four main blocks:

The encoder is an RNN that takes a sequence of input vectors.
The encoder to latent vector is a linear layer that maps the final hidden vector of the RNN to a latent vector.
The latent vector to decoder is a linear layer that maps the latent vector to the input vector for the decoder.
The decoder is an RNN that takes this single input vector and maps it to a sequence of output vectors.
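
The four blocks can be traced in a minimal pure-Python sketch. It is not the code from AE_ts_model.py: a plain tanh RNN cell stands in for the LSTM, the weights are random, softplus is an assumed way to keep sigma positive, and every name (rnn_step, encode, decode, enc_to_lat, lat_to_dec and so on) is hypothetical. The decoder here starts from the vector produced by the latent-to-decoder layer and receives a zero input at every step, which is one possible reading of the description above.

```python
import math
import random

random.seed(0)

def rand_matrix(rows, cols):
    """Small random weight matrix stored as a list of rows."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

def matvec(W, v):
    """Matrix-vector product for list-based matrices."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_step(W_in, W_rec, x, h):
    """One step of a plain tanh RNN cell (assumed stand-in for the LSTM)."""
    a = matvec(W_in, x)
    b = matvec(W_rec, h)
    return [math.tanh(ai + bi) for ai, bi in zip(a, b)]

def softplus(a):
    """Maps any real number to a positive value; used here for sigma."""
    return math.log1p(math.exp(a))

hidden_size, latent_size = 4, 2

# Hypothetical weights for the four blocks plus an output layer.
enc_in, enc_rec = rand_matrix(hidden_size, 1), rand_matrix(hidden_size, hidden_size)
enc_to_lat = rand_matrix(latent_size, hidden_size)
lat_to_dec = rand_matrix(hidden_size, latent_size)
dec_in, dec_rec = rand_matrix(hidden_size, 1), rand_matrix(hidden_size, hidden_size)
dec_to_out = rand_matrix(2, hidden_size)  # rows: mu, pre-activation of sigma

def encode(series):
    """Blocks 1 and 2: run the encoder RNN, map its final hidden vector to z."""
    h = [0.0] * hidden_size
    for value in series:
        h = rnn_step(enc_in, enc_rec, [value], h)
    return matvec(enc_to_lat, h)

def decode(z, length):
    """Blocks 3 and 4: map z to the decoder's start vector, then unroll."""
    h = matvec(lat_to_dec, z)
    outputs = []
    for _ in range(length):
        h = rnn_step(dec_in, dec_rec, [0.0], h)  # zero input at every step
        mu, raw_sigma = matvec(dec_to_out, h)
        outputs.append((mu, softplus(raw_sigma)))
    return outputs

series = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5]
z = encode(series)
reconstruction = decode(z, len(series))
print(len(z), len(reconstruction))
```

The encoder compresses a sequence of any length into a vector of fixed size, and the decoder emits one (mu, sigma) pair per time step of the requested length.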

Training Objective

An auto encoder learns the identity function, so the sequences of input and output vectors must be similar. In our case, we take a probabilistic approach. Every output is a tuple of a mean, mu, and a standard deviation, sigma. Let this mu and sigma parametrize a Gaussian distribution. Now we maximize the log-likelihood of the input under this distribution, or equivalently, minimize the negative log-likelihood. We train this using backpropagation into the weights of the encoder, decoder and linear layers.
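
As a rough illustration of this objective, the sketch below computes the Gaussian negative log-likelihood of an input sequence given one (mu, sigma) pair per time step. The function names gaussian_nll and sequence_loss are hypothetical, and averaging over time steps is an assumption; the repo may reduce the loss differently.

```python
import math

def gaussian_nll(x, mu, sigma):
    """Negative log-likelihood of x under a Gaussian with mean mu and std sigma."""
    return 0.5 * math.log(2.0 * math.pi) + math.log(sigma) + 0.5 * ((x - mu) / sigma) ** 2

def sequence_loss(inputs, outputs):
    """Mean NLL over a sequence; outputs holds one (mu, sigma) pair per step."""
    terms = [gaussian_nll(x, mu, sigma) for x, (mu, sigma) in zip(inputs, outputs)]
    return sum(terms) / len(terms)

inputs = [0.0, 1.0, 0.5]
good = [(0.0, 0.5), (1.0, 0.5), (0.5, 0.5)]   # means match the inputs
bad = [(1.0, 0.5), (0.0, 0.5), (-0.5, 0.5)]   # every mean is off by 1
print(sequence_loss(inputs, good), sequence_loss(inputs, bad))
```

The loss is lower when the predicted means track the inputs. Because sigma is also an output, the model can report larger uncertainty on time steps it reconstructs poorly.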

Example data

I showcase the recurrent auto encoder on a dataset of 5000 ECGs, accurately named ECG5000 on the UCR archive. I choose ECGs because humans understand them easily. Yet, their complexity remains challenging enough for a machine learning model.

Here are some examples. Each column represents a different input class.

Results

We run the recurrent auto encoder with a 20D latent space. The following figure plots the latent vectors with both PCA and tSNE.

This figure shows that the latent space exhibits structure. We color the vectors with their corresponding labels. The light blue and dark blue labels clearly cluster in different parts of the space. Interestingly, the lower left corner in the tSNE shows another cluster of orange points. That might be interesting for doctors to look at. (Note that the class distributions are highly unbalanced. The orange and green colored data occur less frequently.)

Conclusion

We present an auto encoder that learns structure in the time-series. Training is unsupervised. When we color the latent vectors with the actual labels, we show that the structure makes sense.

FAQ

To my great joy, I receive many questions and suggestions over email. I compiled some of the commonly asked questions so you can get started quickly.

How can I use the representations for purposes other than visualization? After training, you can fetch the representations by running sess.run(model.z_mu, feed_dict=my_feed_dict).
I get an import error. What versions do you use? See the docs/requirements.txt file for all versions.
How could I extract classes from the representations? The auto-encoder framework belongs to unsupervised learning. Hence, classes will only follow from some sort of clustering. You can apply a clustering model to the hidden representations. Or you could implement another model that naturally clusters time series, for example neural expectation maximization or simply HMMs. Moreover, if you do have supervision for your data, then I recommend using a supervised model, for example a linear classification model, linear dynamical systems or a normal recurrent/convolutional neural network.
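
As one way to apply a clustering model to the hidden representations, here is a small k-means sketch in pure Python. The latents list is made-up stand-in data for whatever sess.run(model.z_mu, ...) returns, and the kmeans helper is hypothetical. In practice, a library implementation would be the more usual choice.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length vectors.

    Centers start at the first k points to keep the sketch deterministic.
    """
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        labels = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: move each center to the mean of its assigned points.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Made-up stand-in for the fetched representations: six 2D latent vectors.
latents = [[0.1, 0.0], [5.0, 5.1], [0.0, 0.2], [5.2, 4.9], [-0.1, 0.1], [4.9, 5.0]]
labels, centers = kmeans(latents, k=2)
print(labels)
```

The resulting labels are cluster indices, not class names. Mapping them to meaningful classes still requires inspection or some labeled examples.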

The loss function on the latent space resembles the VAE loss function. How does your model differ from the VAE? For clarity, this question usually refers to the loss in tf.reduce_mean(tf.square(lat_mean) + lat_var - tf.log(lat_var) - 1). I see two immediate differences with the VAE:

	* The VAE follows from amortized inference on a latent variable model. All terms in the VAE model have a probabilistic interpretation. In contrast, our auto encoder learns according to the maximum likelihood principle. We implement this loss function only to improve our visualization.
	* The VAE penalizes the KL divergence with the prior for each representation. In contrast, we penalize the KL divergence with the marginal distribution on the representations. In other words, the VAE *wants* each representation to have zero mean and unit variance; our auto encoder wants all representations marginally to have zero mean and unit variance.
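
The second point can be made concrete with a small sketch. It assumes that lat_mean and lat_var are the per-dimension mean and (population) variance taken over a batch of latent vectors; the helper name latent_penalty and the toy batches are hypothetical.

```python
import math

def latent_penalty(batch):
    """Penalty on batch statistics, mirroring
    mean(square(lat_mean) + lat_var - log(lat_var) - 1) over latent dimensions.
    """
    n = len(batch)
    per_dim = []
    for column in zip(*batch):               # one column per latent dimension
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n   # population variance
        per_dim.append(mean ** 2 + var - math.log(var) - 1.0)
    return sum(per_dim) / len(per_dim)

centered = [[-1.0, 1.0], [1.0, -1.0]]   # every dimension: mean 0, variance 1
shifted = [[1.0, 3.0], [3.0, 1.0]]      # every dimension: mean 2, variance 1
print(latent_penalty(centered), latent_penalty(shifted))
```

In the centered batch no individual representation sits at the origin, yet the penalty vanishes because only the statistics of the whole batch are constrained.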

Please let me know if I forgot your questions in this FAQ section.

As always, I am curious about any comments and questions. Reach me at [email protected]


ae_ts's Issues

Training Objective

Hi, thanks for posting this code!

I'm not sure I understand the training objective you're using - is it a variational auto-encoder?

Is loss_seq the Kullback-Leibler divergence - line 141

and loss_lat_batch a reconstruction loss - line 113?

If you've got a link to a paper or book which describes your code, that would be really appreciated.

Thanks very much for your help,

Interpreting Output of main.py

Hi Rob,

Very interesting work. Just wondering, once I run main.py successfully and it outputs a bunch of files, how do I process these files to actually get to identifying the clusters of variables?

Thanks,
Lou

Decoder Initial State and Input

Hi @RobRomijnders ,
Thanks for sharing your code. I am very new to RNNs so forgive me for a silly question. In your decoder I am a little confused about the inputs and initial state. Shouldn't the input be the one obtained from the encoder (the latent space)? It seems in your code

initial_state_dec = tuple([(z_state, z_state)] * num_layers)
dec_inputs = [tf.zeros([batch_size, 1])] * sl 
outputs_dec, _ = tf.contrib.rnn.static_rnn(cell_dec, inputs=dec_inputs,
                                                       initial_state=initial_state_dec)

the initial state is obtained from the encoder and not the inputs? Could you please explain the reason for that?

reconstructed h_out are basically identical, any help?

Hello, I have downloaded the code here and tried it on my time-series dataset (each time series contains the last 150 days of data, and there are about 10000 rows). I basically followed your default settings. When I printed and plotted the reconstructed data h_out (h_mu and h_sigma), I found that all the reconstructed h_out are basically identical, and the h_mu time series is basically the mean of the input time series.

Also, the log shows that 'loss_seq' nearly dominates the losses.
Following are some logs:
At 0 / 200 train (3.173, 1.425, 1.748), val (3.112, 1.421,1.691) in order (total, seq, lat)
At 20 / 200 train (1.430, 1.406, 0.024), val (1.414, 1.412,0.001) in order (total, seq, lat)
At 40 / 200 train (1.441, 1.427, 0.014), val (1.412, 1.411,0.001) in order (total, seq, lat)
At 60 / 200 train (1.422, 1.415, 0.007), val (1.414, 1.412,0.002) in order (total, seq, lat)
At 80 / 200 train (1.432, 1.420, 0.012), val (1.418, 1.417,0.001) in order (total, seq, lat)
At 100 / 200 train (1.412, 1.404, 0.008), val (1.402, 1.402,0.001) in order (total, seq, lat)
At 120 / 200 train (1.413, 1.404, 0.010), val (1.383, 1.381,0.002) in order (total, seq, lat)
At 140 / 200 train (1.391, 1.384, 0.007), val (1.407, 1.406,0.001) in order (total, seq, lat)
At 160 / 200 train (1.431, 1.423, 0.008), val (1.329, 1.328,0.002) in order (total, seq, lat)
At 180 / 200 train (1.374, 1.368, 0.006), val (1.393, 1.392,0.001) in order (total, seq, lat)

Any ideas about the reason here? Is it overfitting somewhere, or did I misunderstand something? Thanks.

The code sends the labels obtained from the dataset to the plot_z_run function

Isn't it supposed to be Unsupervised?
What if we had datasets with no labels and we wanted to visualize the resultant clusters.
Where are they? How do we obtain them?

@RobRomijnders @tejaslodaya

How to get decoded output with same dimension as input?

Hi, thanks for the great work. I am currently trying to detect outliers from the network output. Basically, I want to detect samples with high reconstruction error. May I know how to obtain the decoded outputs?
Thanks in advance.

Tensorboard cannot show the projection embedding

Thanks a lot for the great project.

When I open Tensorboard, the projector cannot show any embedding. May I ask what I can do to correct this issue?

Many thanks

tensorflow

in ()
1 """Training time!"""
----> 2 model = Model(config)
3 sess = tf.Session()
4 perf_collect = np.zeros((2,int(np.floor(max_iterations/plot_every))))
5

~/AE_ts_model.py in __init__(self, config)
105 initial_state_enc = cell_enc.zero_state(batch_size, tf.float32)
106
--> 107 outputs_enc, _ = tf.contrib.legacy_seq2seq.rnn_decoder(tf.unstack(self.x_exp, axis=2), initial_state_enc, cell_enc)
108 cell_output = outputs_enc[-1] # Only use the final hidden state #tensor in [batch_size,hidden_size]
109 with tf.name_scope("Enc_2_lat") as scope:

~/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py in rnn_decoder(decoder_inputs, initial_state, cell, loop_function, scope)
150 if i > 0:
151 variable_scope.get_variable_scope().reuse_variables()
--> 152 output, state = cell(inp, state)
153 outputs.append(output)
154 if loop_function is not None:

~/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py in __call__(self, inputs, state, scope)
711 self._recurrent_input_noise,
712 self._input_keep_prob)
--> 713 output, new_state = self._cell(inputs, state, scope)
714 if _should_dropout(self._state_keep_prob):
715 new_state = self._dropout(new_state, "state",

~/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py in __call__(self, inputs, state, scope)
951 state, [0, cur_state_pos], [-1, cell.state_size])
952 cur_state_pos += cell.state_size
--> 953 cur_inp, new_state = cell(cur_inp, cur_state)
954 new_states.append(new_state)
955 new_states = (tuple(new_states) if self._state_is_tuple else

~/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py in __call__(self, inputs, state, scope)
396 with _checked_scope(self, scope or "lstm_cell",
397 initializer=self._initializer,
--> 398 reuse=self._reuse) as unit_scope:
399 if self._num_unit_shards is not None:
400 unit_scope.set_partitioner(

~/anaconda3/lib/python3.6/contextlib.py in __enter__(self)
80 def __enter__(self):
81 try:
---> 82 return next(self.gen)
83 except StopIteration:
84 raise RuntimeError("generator didn't yield") from None

~/anaconda3/lib/python3.6/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py in _checked_scope(cell, scope, reuse, **kwargs)
75 "this error will remain until then.)"
76 % (cell, cell_scope.name, scope_name, type(cell).__name__,
---> 77 type(cell).__name__))
78 else:
79 weights_found = False

ValueError: Attempt to reuse RNNCell <tensorflow.contrib.rnn.python.ops.core_rnn_cell_impl.LSTMCell object at 0x12084c470> with a different variable scope than its first use. First use of cell was with scope 'Encoder/rnn_decoder/multi_rnn_cell/cell_0/lstm_cell', this attempt is with scope 'Encoder/rnn_decoder/multi_rnn_cell/cell_1/lstm_cell'. Please create a new instance of the cell if you would like it to use a different set of weights. If before you were using: MultiRNNCell([LSTMCell(...)] * num_layers), change to: MultiRNNCell([LSTMCell(...) for _ in range(num_layers)]). If before you were using the same cell instance as both the forward and reverse cell of a bidirectional RNN, simply create two instances (one for forward, one for reverse). In May 2017, we will start transitioning this cell's behavior to use existing stored weights, if any, when it is called with scope=None (which can lead to silent model degradation, so this error will remain until then.)
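
The error message itself names the fix: build the layers with a list comprehension instead of list multiplication. The Python behaviour behind it can be shown without TensorFlow. In this sketch, Cell is a hypothetical stand-in for an RNN cell object, not a TensorFlow class.

```python
class Cell:
    """Stand-in for an RNN cell object that owns its own weights."""
    def __init__(self):
        self.weights = []

num_layers = 2

shared = [Cell()] * num_layers                   # one object, referenced twice
separate = [Cell() for _ in range(num_layers)]   # one object per layer

print(shared[0] is shared[1])      # True: both layers are the same cell
print(separate[0] is separate[1])  # False: each layer has its own cell
```

With list multiplication, every layer points at the same cell instance, so the second layer tries to reuse the first layer's variables under a different scope. The comprehension creates an independent cell per layer.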
