philipperemy / cond_rnn

Conditional RNNs for Tensorflow / Keras.

License: MIT License

Python 100.00%

deep-learning lstm rnn tensorflow time-series

cond_rnn's Introduction

Solving Artificial Intelligence one step at a time 👋

cond_rnn's Issues

CondLSTM with Embedding layer

Hello, and first of all thank you very much for your work!
I want to use an embedding layer for categorical features before CondLSTM, as shown below, but I get an error:

forward_layer = ConditionalRecurrent(LSTM(units=256, return_sequences=True))
backward_layer = ConditionalRecurrent(LSTM(units=256, return_sequences=True, go_backwards=True))

i1 = Input(shape=(24, 14))
ic_1 = Input(shape=(4,))
norm = normalizer(i1)
v = vectorize_layer(ic_1)
embedding = Embedding(49, 4, input_length=4)(v)
inputs = (norm, embedding)
x = Bidirectional(layer=forward_layer,
                  backward_layer=backward_layer)(inputs)
x = Flatten()(x)
x = Dropout(.25)(x)
output = Dense(units=4, activation='linear')(x)
model = keras.Model([i1, ic_1], output)

in user code:

File "/usr/local/lib/python3.8/dist-packages/cond_rnn/cond_rnn.py", line 86, in call  *
    cond = self._standardize_condition(cond[0])
File "/usr/local/lib/python3.8/dist-packages/cond_rnn/cond_rnn.py", line 54, in _standardize_condition  *
    raise Exception('Initial cond should have shape: [2, batch_size, hidden_size] '

Exception: ('Initial cond should have shape: [2, batch_size, hidden_size] or [batch_size, hidden_size]. Shapes do not match.', TensorShape([None, 4, 4]))

Call arguments received by layer 'forward_conditional_recurrent_10' (type ConditionalRecurrent):
• inputs=('tf.Tensor(shape=(None, 24, 14), dtype=float32)', 'tf.Tensor(shape=(None, 4, 4), dtype=float32)')
• training=None
• kwargs=<class 'inspect._empty'>
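
The exception above reports a condition of shape (None, 4, 4), while the message asks for [batch_size, hidden_size] (or [2, batch_size, hidden_size]). One possible workaround, sketched here with NumPy purely to illustrate the shapes, is to flatten the embedded categorical features into a single vector per sample before using them as the condition. In Keras this would typically be a Flatten or Reshape layer applied to the embedding output. The array names below are hypothetical stand-ins, not part of cond_rnn.

```python
import numpy as np

batch_size = 8
# Stand-in for the Embedding output: 4 categorical features, each embedded in 4 dimensions.
embedded = np.random.rand(batch_size, 4, 4).astype("float32")

# Flatten the (features, embedding_dim) axes into a single condition vector per sample.
cond = embedded.reshape(batch_size, -1)

print(embedded.shape)  # (8, 4, 4): rank 3, the shape rejected in the traceback
print(cond.shape)      # (8, 16): rank 2, one condition vector per sample
```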

Encoder-Decoder

When I try to use an encoder-decoder LSTM model, I encounter this error:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 10), dtype=tf.float32, name='input_18'), name='input_18', description="created by layer 'input_18'") at layer "conditional_recurrent_3". The following previous layers were accessed without issue: []

Adding dropout layer to stacked conditional RNNs

I'm trying to build the following architecture, inspired by the stacked LSTM example code in the repository. The only difference is that I also include Dropout layers between the stacked LSTM layers.

x = ConditionalRecurrent(LSTM(64,
                              batch_input_shape=(batchSize, num_samples, num_features), 
                              activation='tanh', #'relu', 
                              return_sequences=True, stateful=stateful))([i, c])
x = Dropout(0.2)(x)    
x = ConditionalRecurrent(LSTM(128, activation='relu', return_sequences=True, stateful=stateful))([x, c])
x = Dropout(0.2)(x)    
x = ConditionalRecurrent(LSTM(256, activation='relu', return_sequences=True, stateful=stateful))([x, c])
x = Dropout(0.2)(x)
x = Dense(units=78)(x)
x = LeakyReLU()(x)

This however ends up giving me the below error due to the Dropout layer:

Exception encountered when calling layer "dropout_114" (type Dropout).

Attempt to convert a value (<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>) with an unsupported type (<class 'cond_rnn.cond_rnn.ConditionalRecurrent'>) to a Tensor.

Call arguments received:
  • inputs=<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>
  • training=False
Traceback (most recent call last):
  File "<ipython-input-111-1334bb85e127>", line 295, in run_LSTM
    history, model, Y_pred = train_and_get_predictions(initial_layer, model_x_train, Y_train, model_x_test)
  File "<ipython-input-111-1334bb85e127>", line 169, in train_and_get_predictions
    trainingModel = createModel_v2(initial_layer,
  File "<ipython-input-111-1334bb85e127>", line 130, in createModel_v2
    x = Dropout(0.2)(x)
  File "/Users/aditya/miniconda3/envs/kaggle/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/aditya/miniconda3/envs/kaggle/lib/python3.9/site-packages/tensorflow/python/framework/constant_op.py", line 102, in convert_to_eager_tensor
    return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Exception encountered when calling layer "dropout_114" (type Dropout).

Attempt to convert a value (<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>) with an unsupported type (<class 'cond_rnn.cond_rnn.ConditionalRecurrent'>) to a Tensor.

Call arguments received:
  • inputs=<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>
  • training=False

Could you help me understand whether adding Dropout layers is supported and I'm doing it wrong, or whether it's known that it's not supported? Thanks! Happy to provide more context if needed.

Error with adding Embedding layer before ConditionalRecurrent

Hi @philipperemy, thank you for providing this library! I'm getting an error when adding an Embedding layer before ConditionalRecurrent. Could you help take a look? Perhaps I'm not using it correctly. I appreciate your time and input! Here is my code:

model = Sequential()
model.add(Embedding(input_dim=5, output_dim=4, input_length=35))
model.add(ConditionalRecurrent(GRU(units=64, return_sequences=True)))
model.add(Flatten())
model.add(Dense(units=6, activation='linear'))

the error is:
----> [3] model.add(ConditionalRecurrent(GRU(units=64, return_sequences=True)))
AssertionError: Exception encountered when calling layer "conditional_recurrent_52" (type ConditionalRecurrent).
in user code:
File "~/local/lib/python3.9/site-packages/cond_rnn/cond_rnn.py", line 74, in call *
assert isinstance(inputs, (list, tuple)) and len(inputs) >= 2
AssertionError:
Call arguments received by layer "conditional_recurrent_52" (type ConditionalRecurrent):
• inputs=tf.Tensor(shape=(None, 35, 4), dtype=float32)
• training=None
• kwargs=<class 'inspect._empty'>
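
The assertion shown in the traceback, `isinstance(inputs, (list, tuple)) and len(inputs) >= 2`, suggests that the layer expects the time series and at least one condition to be passed together, which a Sequential stack handing over a single tensor cannot satisfy. The plain-Python sketch below only replicates that check; `looks_like_valid_inputs` is a hypothetical helper, and strings stand in for tensors.

```python
def looks_like_valid_inputs(inputs):
    """Mirror the assertion shown in the traceback (hypothetical helper)."""
    return isinstance(inputs, (list, tuple)) and len(inputs) >= 2

# Strings stand in for tensors; only the container structure matters here.
single_tensor = "embedded_sequence"                      # what Sequential passes along
series_and_condition = ["embedded_sequence", "condition"]

print(looks_like_valid_inputs(single_tensor))         # False
print(looks_like_valid_inputs(series_and_condition))  # True
```

With the functional API, the equivalent call passes a list such as [x, c] to the layer, as several other issues on this page do.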

[QUESTION] How to predict a future, unseen dataframe?

Hello, I have tried the "lstm.py" example file.

In the example, the time range starts at 1995-01-01 and ends at 2020-05-13.
Now, how do I predict unseen data, starting from 2020-05-14 up to a desired date?

With a plain autoregressive GRU model, I succeeded. But I don't know how to forecast a multivariate time series into the future, since I don't have any condition data after 2020-05-13.

[QUESTION] Difference between ConditionalRNN and Other Approach

Hi,

First of all, thanks for the package, which makes it possible to use time series (time-variant) data together with time-invariant data (conditions). I think there are many scenarios where both types of data need to be used at the same time.

I'm in a similar situation: I have time series data, currently modeled with an LSTM, and I have conditions (time-invariant features) that I want to incorporate into the problem.

The most straightforward solution was to duplicate the conditions into the time series, so that they stay constant over time. I had no problem doing this, and it boosted my performance a lot (compared to only using conditions/time invariant features alone). However, I also read some papers that advise against this combination, because it pollutes the time-variant information with time-invariant data and may just create a harder problem to solve.
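
For reference, the duplication approach described above can be sketched with NumPy as follows. The sizes and array names are made up for illustration.

```python
import numpy as np

n_samples, timesteps, n_features, n_conditions = 5, 7, 14, 3
series = np.random.rand(n_samples, timesteps, n_features)
conditions = np.random.rand(n_samples, n_conditions)

# Repeat each sample's condition vector at every timestep: (5, 3) -> (5, 7, 3).
tiled = np.repeat(conditions[:, np.newaxis, :], timesteps, axis=1)

# Append the tiled conditions to the time-varying features along the last axis.
augmented = np.concatenate([series, tiled], axis=-1)

print(augmented.shape)  # (5, 7, 17)
```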

Other papers suggest an alternative to the above approach: set the hidden state at the initial step from the time-invariant features. I was trying to do this on my own (I had not found your package yet) and came up with the following model definition (example):

from tensorflow import keras
from tensorflow.keras.layers import LSTM

# Time series input: 3 timesteps with 3 features each
lstm_input = keras.Input(shape=(3, 3), name="lstm_input")
# Time-invariant conditions: 3 features per sample
pre_go_live_input_hidden_states = keras.Input(shape=(3,), name="pre-go-live")

# Project the conditions to the LSTM state size (50)
pre_go_live_fc_layer_1 = keras.layers.Dense(50, activation='relu')(pre_go_live_input_hidden_states)
# Use the projected conditions as both the initial hidden state and the initial cell state
lstm_features = LSTM(50)
outputs = lstm_features(lstm_input, initial_state=[pre_go_live_fc_layer_1, pre_go_live_fc_layer_1])

# Single-unit prediction head on top of the final LSTM output
pred = keras.layers.Dense(1, name="prediction_layer")(outputs)

# End-to-end model taking both the time series and the conditions as inputs
model = keras.Model(
    inputs=[lstm_input, pre_go_live_input_hidden_states],
    outputs=[pred]
)

Here, my conditions are the pre_go_live_input_hidden_states variable, which needs a Dense layer afterwards to project the data to size 50, so that it can be used as the initial_state of the LSTM (ignore the fact that I'm also passing it as the initial cell state).

My question is about the difference between the model definition above and how ConditionalRNN works behind the scenes. Is it similar?

I'm having trouble understanding the ConditionalRecurrent class, mostly the treatment given to the conditions input.

My current time series data has shape (29000, 7, 14), and my conditions data has shape (29000, 124). Of those 124 features, some are encoded variables and some are numerical variables.

Is it possible to use your package and still use the 124 conditions to initialize the hidden state at the initial step?

Thanks,
Francisco Parrilla A

Sequential vs Functional API. Drop in model accuracy

Hi, I am doing research on time-series forecasting for electric load. I am quite new to TensorFlow, so please excuse me if I make errors.
I came across your library and thought it was very interesting! I am building a 24-ahead MIMO forecasting model. I have noticed that a pure LSTM may disregard features that I consider important, so I have experimented with this framework.
I have tried using the conditional RNN cell on the meter data, following the examples in the repository. For auxiliary features I used features of the predicted time step (e.g. one-hot encoded day of week) and important lags as features.
First, writing and fitting the sequential model:

ConditionalRNN = Sequential(layers=[ConditionalRNN(32, cell='LSTM'),
                                    Dense(HORIZON)])
ConditionalRNN.compile(optimizer=tf.optimizers.Adam(), loss='mse', metrics=[tf.metrics.MeanSquaredError()])
history = ConditionalRNN.fit([train_inputs['X'],train_c1,train_c2], train_inputs['target'], batch_size=32, epochs=MAX_EPOCHS, validation_split=0.15, callbacks=[early_stopping], verbose=1)

The model works fine and shows better accuracy than the LSTM with the full data passed through it.
However, I was getting this warning from TensorFlow:

WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor, but we receive a <class 'tuple'> input: (<tf.Tensor 'IteratorGetNext:0' shape=(None, 24, 5) dtype=float32>, <tf.Tensor 'IteratorGetNext:1' shape=(None, 8) dtype=float64>, <tf.Tensor 'IteratorGetNext:2' shape=(None, 2) dtype=float64>)
Consider rewriting this model with the Functional API.

I have rewritten the same model with the functional API:

i = Input(shape=[HORIZON, 5], name='time-series')
c1 = Input(shape=[8], name='one-hot')
c2 = Input(shape=[2], name='lags-as-features')
x = ConditionalRNN(32, cell='LSTM', name='cond_rnn_0')([i, c1, c2])
#x = Dense(HORIZON)(x)
#ConditionalRNN = Model(inputs=[i, c1, c2], outputs=[x])

This way there are no warnings, but the accuracy has dropped significantly.
Do you have any explanation for why that might be the case?

Basic conditional LSTM

Hello,

I currently have a very basic LSTM NN, which predicts the next array of n_features from the past 10 time steps, i.e.

model = Sequential()
model.add(LSTM(n_lookback, input_shape=(n_lookback, n_features)))
model.add(Dense(n_features))
model.compile()

where n_features is the number of features in each timestep and n_lookback is the number of previous time steps provided to the LSTM.
Since I am also trying to add conditions that are not time dependent, I thought I'd try your cond_rnn.
I am having some issues getting started, though, since the examples are a bit too complex for me; I only started working with NNs and Keras a few months ago.
The condition would be a float, i.e. a value of a physical property, such as 3.1 or 4.5.
Could you offer any help on how to start out?

Thanks in advance. You have already been so helpful with the MacOS import issue 😊

Use regularization method in cond_rnn

Thank you so much for your work; this is an excellent library. I'm trying to use it in my project, but I ran into a problem: I'd like to use a regularizer such as 'kernel_regularizer' in the ConditionalRNN layer, but it does not seem to be supported. I checked the TensorFlow documentation, and it seems that "tf.keras.layers.rnn" doesn't have a "kernel_regularizer" argument. How can I use a regularizer in ConditionalRNN? Thanks

Failed to load model with CondRNN layer

I have built a net that contains a CondRNN layer and saved the compiled model using model.save('./cond_rnn_lstm.h5'). But when I load the saved model from the file ./cond_rnn_lstm.h5 with tensorflow.keras.models.load_model('./cond_rnn_lstm.h5'), I get the error ValueError: Unknown layer: ConditionalRNN. Has anyone encountered this error, and how can I load a model with a CondRNN layer inside? Thanks!

My model net:

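import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, TensorBoard
from tensorflow.keras.layers import Dense, TimeDistributed
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

from cond_rnn import ConditionalRNN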

model = Sequential(layers=[
    ConditionalRNN(units=64, cell='LSTM', return_sequences=True),  # num_cells = 10
    TimeDistributed(Dense(Y_train_others.shape[2], activation='relu'),name = 'others_output')
])
optim = Adam(lr=0.003,)
model.compile(optimizer=optim, loss={'output_1': 'mse'}, metrics={'output_1': 'mse'})

callbacks = [
                EarlyStopping(patience=30, monitor='val_mse'),
                TensorBoard(log_dir='./training_logs_0508/seq'),
            ]

out = model.fit(x=[X_train, categorical_appid, categorical_advertiser], 
          y=Y, epochs=100, batch_size = 1024, 
          verbose=2, callbacks=callbacks, workers = 100, validation_data=([X_eval, categorical_appid_eval, categorical_advertiser_eval], Y_eval),
          sample_weight=None)
model.save('./cond_rnn_lstm.h5')

previous_model = tf.keras.models.load_model('./cond_rnn_lstm.h5')

Details of the error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-86-1bf00b883022> in <module>

----> 25 previous_model = tf.keras.models.load_model('./cond_rnn_lstm.h5')

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/saving/save.py in load_model(filepath, custom_objects, compile, options)
    205           (isinstance(filepath, h5py.File) or h5py.is_hdf5(filepath))):
    206         return hdf5_format.load_model_from_hdf5(filepath, custom_objects,
--> 207                                                 compile)
    208 
    209       filepath = path_to_string(filepath)

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py in load_model_from_hdf5(filepath, custom_objects, compile)
    182     model_config = json_utils.decode(model_config.decode('utf-8'))
    183     model = model_config_lib.model_from_config(model_config,
--> 184                                                custom_objects=custom_objects)
    185 
    186     # set weights

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/saving/model_config.py in model_from_config(config, custom_objects)
     62                     '`Sequential.from_config(config)`?')
     63   from tensorflow.python.keras.layers import deserialize  # pylint: disable=g-import-not-at-top
---> 64   return deserialize(config, custom_objects=custom_objects)
     65 
     66 

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/serialization.py in deserialize(config, custom_objects)
    175       module_objects=LOCAL.ALL_OBJECTS,
    176       custom_objects=custom_objects,
--> 177       printable_module_name='layer')

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
    356             custom_objects=dict(
    357                 list(_GLOBAL_CUSTOM_OBJECTS.items()) +
--> 358                 list(custom_objects.items())))
    359       with CustomObjectScope(custom_objects):
    360         return cls.from_config(cls_config)

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/sequential.py in from_config(cls, config, custom_objects)
    491     for layer_config in layer_configs:
    492       layer = layer_module.deserialize(layer_config,
--> 493                                        custom_objects=custom_objects)
    494       model.add(layer)
    495     if (not model.inputs and build_input_shape and

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/serialization.py in deserialize(config, custom_objects)
    175       module_objects=LOCAL.ALL_OBJECTS,
    176       custom_objects=custom_objects,
--> 177       printable_module_name='layer')

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
    345     config = identifier
    346     (cls, cls_config) = class_and_config_for_serialized_keras_object(
--> 347         config, module_objects, custom_objects, printable_module_name)
    348 
    349     if hasattr(cls, 'from_config'):

/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/utils/generic_utils.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
    294   cls = get_registered_object(class_name, custom_objects, module_objects)
    295   if cls is None:
--> 296     raise ValueError('Unknown ' + printable_module_name + ': ' + class_name)
    297 
    298   cls_config = config['config']

ValueError: Unknown layer: ConditionalRNN

How to handle conditional LSTM with input conv1d layers?

I would like to use a conditional LSTM for a guitar amplifier model, where the conditional part would be user controls such as gain/drive on an amplifier (reference https://github.com/GuitarML/GuitarLSTM repo). Would the cond_rnn layer work if my model has two conv1d layers prior to the conditional LSTM? For example:

    model = Sequential()
    model.add(Conv1D(conv1d_filters, 12, strides=conv1d_strides, activation=None, padding='same',input_shape=(input_size,1)))
    model.add(Conv1D(conv1d_filters, 12, strides=conv1d_strides, activation=None, padding='same'))
    model.add(LSTM(hidden_units))
    model.add(Dense(1, activation=None))

Where "input_size" is something like 100, for an input shape of (100,1) to the first conv1d layer. Would it work to replace the LSTM here with your conditional rnn? Or is a different approach required where the condition is introduced at the first layer.

Thanks!

Question on processing time-series text data

Thanks for the example implementation and making this into a library! Very helpful!

I am writing to see if you have any suggestions on processing time-series text data? I haven't been able to find much help in this area since most examples only show time series numerical data. Specifically, do you have any ideas on how we can process sequences of text data for a patient?

E.g. A patient can have multiple doctors' notes attached to the patient's profile; each doctor's note occurs at a different timestamp. As a result (and expectedly), each patient in the data can have wildly different timestamps. The order of the doctor's notes is important. More recent doctor notes are more important than older ones. It's also important to keep track of the patient ID. The goal is to estimate mortality risk for the patients in the next year.

General Usage + Validation Data Usage + Batch Size

When fitting an RNN model, we usually pass validation data to monitor the loss on it as well. Is it possible to do this with ConditionalRecurrent? I couldn't find it in the examples.

history = model.fit(X_train, y_train, epochs=10, batch_size=8, validation_data=(X_test, y_test), verbose=0)

My other question: if we set batch_size when fitting (as above), is the same batch size also applied to the external data?
I haven't used ConditionalRecurrent yet, and I only have a medium level of coding knowledge, so please bear with me if my question is naive. I just want to be sure of what I'm doing. Let's assume that I have input data with the temperature values of 3 cities, plus some additional data. While I want to predict the temperature of one of the cities, I also want to use the additional data, but I don't want to look back too far in it because that introduces noise. So I want to look back only 3 days in the additional data, while in my original data I want to build samples that look back 30 days to make a prediction. This library lets me do that, am I right?
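
One way to prepare the data for the setup described above is sketched below with NumPy: each sample pairs a 30-day window of the main series with the flattened last 3 days of the additional data, which could then serve as the condition vector. This is only an illustration under assumed shapes; the column counts, array names, and the choice of city 0 as the target are hypothetical, and whether a 3-day window is a suitable "condition" is a modeling decision.

```python
import numpy as np

n_days = 100
main = np.random.rand(n_days, 3)    # e.g. temperatures of 3 cities
extra = np.random.rand(n_days, 2)   # hypothetical additional data, 2 columns
main_lookback, extra_lookback = 30, 3

x_main, x_extra, y = [], [], []
for t in range(main_lookback, n_days):
    x_main.append(main[t - main_lookback:t])              # last 30 days of main data
    x_extra.append(extra[t - extra_lookback:t].ravel())   # last 3 days, flattened
    y.append(main[t, 0])                                  # value to predict for city 0

x_main, x_extra, y = np.array(x_main), np.array(x_extra), np.array(y)
print(x_main.shape, x_extra.shape, y.shape)  # (70, 30, 3) (70, 6) (70,)
```

Because all arrays share the same first axis, Keras's fit takes the same batch_size rows from each input when they are passed together as a list.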

Confusion between conditions?

Hello,

I have an LSTM RNN with the ConditionalRecurrent layer, for a time series prediction where I input the last 10 timesteps and a condition. It seems to me that at times the conditions get confused: when testing after training, the outputs of some predictions look extremely similar to the ones expected for a different condition than the one provided.
More precisely:
Let's say I have 5 possible conditions, A,B,C,D or E (that are float values). While training and checking the predictions every N epochs, I noticed that the model gets really good at predicting when the condition is A, B, C or D but when the condition is E, the output looks similar to what is to be expected if the condition was A, for example. If I train for more epochs, the situation is the same, but for different conditions, i.e. A, B and E work fine, but D predictions look like B predictions.

Is this something that can happen? Otherwise it means that my implementation of the ConditionalRecurrent layer is incorrect.

Installation issue

Hi, I tried to install cond-rnn but it failed. Below is the error:
pip install cond-rnn
Collecting cond-rnn
Using cached cond-rnn-3.2.1.tar.gz (6.8 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

PyTorch version available?

Your cond_rnn is great work, and I would like to use it in my own work.
Unfortunately, we use PyTorch more often.
Could you also provide a PyTorch version, or give me some hints on how to implement it?

Best Regards,
kingaza

Dummy stations example

cond_rnn/examples/dummy_stations_example.py

Line 13 in b35616f

continuous_variables_per_station = 3 # A,B,C where C is the target.

First of all, thanks a lot for your work! Sorry I'm having trouble understanding parts of the code. We say C is the target, but I don't see C getting any special treatment further down; in fact, it appears we're predicting all of A,B and C for each timestep, since our y includes all of these values. Could you please let me know what's actually going on here?

How to generate input data with proper shapes

I have a list of 564 time series. Each time series is a 1-D vector of a different length (e.g. 366, 558, 812, ...).
I also have a corresponding dataframe of shape 564x3 that holds the conditions (categorical features).
My batch size is 4 and the number of timesteps is 10.
What is the best way to transform these data into the proper inputs for a conditional RNN
(a 3-D tensor with shape [batch_size, timesteps, input_dim] from the list of time series, and a 2-D tensor or list of tensors with shape [batch_size, cond_dim])?
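
A common way to get from variable-length series to the shapes named above is a sliding window, repeating each series' condition row for every window cut from it. The sketch below assumes a next-step prediction target and uses three made-up series in place of the 564; all names and values are hypothetical.

```python
import numpy as np

timesteps = 10
# Hypothetical stand-ins: three series of different lengths, one condition row each.
series_list = [np.arange(366.0), np.arange(558.0), np.arange(812.0)]
conditions = np.array([[0, 1, 2], [1, 0, 2], [2, 2, 0]])

x, c, y = [], [], []
for series, cond in zip(series_list, conditions):
    for start in range(len(series) - timesteps):
        x.append(series[start:start + timesteps])   # window of 10 past values
        y.append(series[start + timesteps])         # value right after the window
        c.append(cond)                              # same condition for every window

x = np.array(x)[..., np.newaxis]   # (num_windows, timesteps, input_dim=1)
c = np.array(c)                    # (num_windows, cond_dim=3)
y = np.array(y)
print(x.shape, c.shape, y.shape)   # (1706, 10, 1) (1706, 3) (1706,)
```

The batch size of 4 would then simply be passed to fit. Categorical conditions are usually one-hot encoded or embedded before use, which changes cond_dim accordingly.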

Why not merge conditions in only one vector?

This is an inspiring project. Thanks.

I am relatively new to neural networks. I do not see why you use separate dense layers for different conditions. It seems to me that merging the condition vectors into one vector and feeding that vector into a dense layer would achieve the same result.

Take predicting air quality for example. If we have two features that are not time series, say city and number of vehicles, the first feature will be converted to one-hot encoding, and the second feature will be directly represented as a number. Can I append the number of vehicles to the city vector and feed the new vector to a dense layer? That seems natural to me.

What are your concerns when you use different dense layers for different conditions?
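
As a purely linear-algebra observation (it says nothing about how cond_rnn actually combines conditions), one linear layer applied to the concatenated vector is equivalent to the sum of two separate linear projections, because the weight matrix splits into one block per condition. The NumPy sketch below checks this with random weights; the vehicle count is a made-up value, and biases and nonlinearities are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
city = np.array([0.0, 1.0, 0.0])   # one-hot encoded city
vehicles = np.array([1250.0])      # number of vehicles (hypothetical value)
units = 4

w_city = rng.normal(size=(3, units))
w_vehicles = rng.normal(size=(1, units))

# Two separate linear projections, summed.
separate = city @ w_city + vehicles @ w_vehicles

# One projection of the concatenated vector, using the stacked weight matrix.
merged = np.concatenate([city, vehicles]) @ np.vstack([w_city, w_vehicles])

print(np.allclose(separate, merged))  # True
```

The equivalence breaks down once each branch has its own nonlinearity or the branches are combined in some other way than a plain sum.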

shape of the input tensor when using conditional RNN

Thanks for sharing this interesting project. I have been trying to understand how to shape the input tensor using conditional RNN in Keras but I am still very unclear about how to present the input data in the correct shape.

I am working on 10 stations (num_stations=10). For each station, I have one-year (timesteps = 365) records of three continuous variables: A and B are predictive variables (thus, dim_input=2) and C is the target variable. For each station, I also have two conditions (a categorical condition that can take 5 classes (dim_cond1 = 5) and a continuous condition (dim_cond2 = 1)).

What I want to do is to have a model that is trained on information from the ten stations, taking into account the two conditions for every station (I call this model the global model).

What I am confused about is the shape of the input tensor that I should feed into the model. I know that for an LSTM model that is trained based on the time series of only one station (I call this model the local model), the shape of the input tensor takes the form [batch_size, timesteps, input_dim]. For the local model, I am able to use a generator that extracts and yields a tuple (samples, targets), where samples (one batch of input data) and targets are from one station.

But the global model should sample a part from the data of each station before completing each epoch.

Appending the time series of different stations on top of the other and iterating through all the rows, does not make sense in the context of my problem since the date goes, for instance, from 2020-12-29 (a winter day) to 1986-07-01 (a summer day).

I have trouble understanding how batch extraction should move from one station to another in the global model. I see two possible solutions:

1- To be able to use a generator similar to that of the local model: create a training loop on stations and train on data from each station one after another and update the weights but reset the state to differentiate between time series.

2- Otherwise, is there a way to build a generator that could somehow yield a batch from all stations?
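
The second option can be sketched as a generator that indexes every valid window as a (station, start) pair and shuffles those pairs, so that each batch mixes stations while no window ever crosses a station boundary. This is a minimal illustration with random data; `global_batches`, the 30-step window, and the choice to concatenate both conditions into one 6-dimensional vector are assumptions, not part of the library.

```python
import numpy as np

def global_batches(station_series, station_conds, timesteps, batch_size, rng):
    """Yield (x, c, y) batches with windows drawn at random from all stations."""
    # Index every valid window as a (station, start) pair.
    index = [(s, start)
             for s, series in enumerate(station_series)
             for start in range(len(series) - timesteps)]
    order = rng.permutation(len(index))
    for i in range(0, len(order), batch_size):
        picks = [index[j] for j in order[i:i + batch_size]]
        # Columns 0-1 (A, B) are the inputs; column 2 (C) is the target.
        x = np.stack([station_series[s][t:t + timesteps, :2] for s, t in picks])
        y = np.array([station_series[s][t + timesteps, 2] for s, t in picks])
        c = np.stack([station_conds[s] for s, _ in picks])
        yield x, c, y

rng = np.random.default_rng(0)
series = [rng.random((365, 3)) for _ in range(10)]   # columns A, B, C per station
conds = rng.random((10, 6))                          # 5 one-hot classes + 1 continuous
x, c, y = next(global_batches(series, conds, timesteps=30, batch_size=16, rng=rng))
print(x.shape, c.shape, y.shape)  # (16, 30, 2) (16, 6) (16,)
```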

Thanks for your thoughts.

Can Conditional-BiLSTM work?

I wonder if this is possible, thx!

Implementation as Keras Layer

Hi,
thank you very much for sharing this code. I'm wondering if one could implement the Conditional RNN as a true Keras layer (inheriting from tf.keras.layers.Layer) so that it could be used in a Keras model.

This would help people who are not familiar with the TensorFlow low-level API.

Lars

example real data

Thanks for sharing your code. Could you provide an example on real data? By real I mean data from real-life measurements, e.g. https://www.kaggle.com/sudalairajkumar/daily-temperature-of-major-cities .
The example so far proves that you have working code, not that the model is superior.
I can't pass all possible regularizers to the model, I think you could pass them at line 38 and 44 in your code .
Here it could be nice to use things like kernel regularizers etc.
Your code is compatible with tensorflow 2.2.0 RC. This should be changed in the requirements.
Note, the only thing I could find with realistic examples is Modern Techniques for Forecasting Time Series.
I am not active in the Kaggle competition; just wanted to give an example.

Condition vector size different than LSTM hidden states?

Hi there,

Thank you for this wonderful repo. I just wanted to ask a theoretical question. Since we expect the condition vector to be small (say, size 10) and the LSTM layer to be larger (say, 64 hidden units), how do we initialize the very first time step of the LSTM layer? I think that by mapping the vector as in ''𝑣⃗ =𝐖𝑥⃗ +𝑏⃗ '' you obtain matching dimensions, but would that not cause a lot of zeros in the matrix?
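
To make the mapping in the question concrete, here is a small NumPy sketch of an affine projection from a 10-dimensional condition to a 64-dimensional state. The weights are random stand-ins for what would be learned parameters, and this is not claimed to be the library's exact code. Because a dense W mixes all 10 inputs into each of the 64 outputs, the result is generally dense; zeros would be expected with zero-padding, not with a projection like this.

```python
import numpy as np

rng = np.random.default_rng(0)
cond_dim, hidden_size, batch_size = 10, 64, 4

x = rng.normal(size=(batch_size, cond_dim))          # condition vectors
W = rng.normal(size=(cond_dim, hidden_size)) * 0.1   # learned weights in practice
b = np.zeros(hidden_size)

v = x @ W + b   # candidate initial state, one row of size 64 per sample

print(v.shape)                         # (4, 64)
print(np.count_nonzero(v) == v.size)   # True: no entry is exactly zero here
```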

Warm Regards.

Simple LSTM

Hello, since you have been so helpful so far I thought I'd try again 😄

I have this simple LSTM, where the inputs are the time series, of shape (None, 10, 500), i.e. 10 time-steps and 500 features, and the condition is a single value per input so (None, 1).
My first attempt was designing the network without specifying the input shape and plugging in the data at training:

n_features = 500
model = Sequential()
model.add(ConditionalRecurrent(LSTM(128)))
model.add(Dense(n_features))
model.add(Dense(n_features))
model.compile(loss='mae', optimizer='adam')

# Train the model
history = model.fit(x = [train_x, train_c], y = train_y,
                validation_data=([test_x, test_c], test_y),
                epochs=1,
                batch_size=int(train_x.shape[0]/2),
                verbose=1, shuffle=True)

As expected, I get this warning:

WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor. 
Received: inputs=(<tf.Tensor 'IteratorGetNext:0' shape=(3430, 10, 500) dtype=float32>,
 <tf.Tensor 'IteratorGetNext:1' shape=(3430, 1) dtype=float32>).
 Consider rewriting this model with the Functional API.

but the results are surprisingly good, better than before introducing the cond_rnn.

Trying to solve the above issue, I have tried following another one of your examples and ended up with this:

i = Input(shape=[10,500], name='input_0')
c = Input(shape=[1], name='input_1')

x = ConditionalRecurrent(LSTM(128, return_sequences=True, name='cond_rnn_0'))([i,c])

x = ConditionalRecurrent(LSTM(128, return_sequences=False, name='cond_rnn_1'))([x,c])

x = Dense(units=1, activation='softmax')(x)

model = Model(inputs=[i,c], outputs=[x])

model.compile(optimizer='adam',loss='mae')

In this case I get no error regarding the inputs, but the results are completely useless: the output is always zero for every feature.

Any idea on what is wrong in my code? I have followed your examples as much as possible while trying to apply it to my case, so maybe I have messed up something in the process.
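
One detail in the functional version that may be worth checking: it ends in Dense(units=1, activation='softmax'), whereas the Sequential version ended in Dense(n_features) with a linear output. Softmax normalizes across the units of a layer, so with a single unit the output is the same constant for any input. The NumPy sketch below illustrates only this property; it may not be the sole cause of the behaviour described.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

# Three samples, one output unit each, with very different logits.
logits = np.array([[-5.0], [0.0], [12.3]])
print(softmax(logits).ravel())  # [1. 1. 1.]
```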

EDIT: Solved part of the issue myself, so I removed that part of the question

Confusion between double conditions?

Hi, this is similar to a previous issue I had opened here.
I basically have a model with two conditional parameters. Previously I had only one, and I found that feeding it twice was the best way to get the model to avoid confusion. Without feeding it twice to the model, the predictions would 'confuse the conditions' i.e. predict similar outputs for any condition.

I have now introduced another condition, which can be either 1 or 0 since it acts as a boolean switch. To avoid the confusion that affected the previous model, I have fed this condition twice as well, but this time it is not enough. This is the diagram of the model:

As you can see, the two (None,1) conditional inputs are each fed twice, but at the same time. This method still does not give good results.

Any idea on how this can be improved?

Use with Keras

Can the ConditionalRNN layer be used with Keras' functional API?

def nn_model(ahead = 2):
    # descriptive data from umap
    input_vector = Input(name='vector',shape = [100])
    
    x1 = Dense(256, use_bias=False)(input_vector)
    x1 = BatchNormalization()(x1)
    x1 = LeakyReLU()(x1)
    x1 = Dropout(0.2)(x1, training=True)
    for _ in range(2):
        x1 = Dense(256, use_bias=False)(x1)
        x1 = BatchNormalization()(x1)
        x1 = LeakyReLU()(x1)
        x1 = Dropout(0.2)(x1, training=True)
    
    # rnn
    input_series = Input(name='ts', shape=(4,9))
    rnn = ConditionalRNN(256, cell='LSTM', cond=x1, activation=None, use_bias=False)(input_series)
    rnn = BatchNormalization()(rnn)
    rnn = Activation('tanh')(rnn)
    rnn = Dropout(0.2)(rnn, training=True)
    
    x2 = Dense(1024, use_bias=False)(rnn)
    x2 = BatchNormalization()(x2)
    x2 = LeakyReLU()(x2)
    x2 = Dropout(0.2)(x2, training=True)
    for _ in range(4):
        x2 = Dense(1024, use_bias=False)(x2)
        x2 = BatchNormalization()(x2)
        x2 = LeakyReLU()(x2)
        x2 = Dropout(0.2)(x2, training=True)
        
    output = Dense(ahead)(x2)
    output = Reshape(target_shape = [ahead])(output)
    model = Model(inputs = [input_vector, input_series], outputs = output)
    model.compile(optimizer = 'Adam', loss='mean_squared_error', metrics=['mean_squared_error'])
    
    return model

model = nn_model()
model.summary()

Getting this error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-67-9e49694ef66e> in <module>()
     44     return model
     45 
---> 46 model = nn_model()
     47 model.summary()

<ipython-input-67-9e49694ef66e> in nn_model(ahead)
     16     # rnn
     17     input_series = Input(name='ts', shape=(4,9))
---> 18     rnn = ConditionalRNN(256, cell='LSTM', cond=x1, activation=None, use_bias=False)(input_series)
     19     rnn = BatchNormalization()(rnn)
     20     rnn = Activation('tanh')(rnn)

~/anaconda3/envs/main/lib/python3.6/site-packages/cond_rnn/cond_rnn.py in __init__(self, units, cell, cond, *args, **kwargs)
     47             cond = self._standardize_condition(cond)
     48             if cond is not None:
---> 49                 self.init_state = tf.keras.layers.Dense(units=units)(cond)
     50                 self.init_state = tf.unstack(self.init_state, axis=0)
     51         self.rnn = tf.keras.layers.RNN(cell=self._cell, *args, **kwargs)

~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
    559       # framework.
    560       if base_layer_utils.needs_keras_history(inputs):
--> 561         base_layer_utils.create_keras_history(inputs)
    562 
    563     # Handle Keras mask propagation from previous layer to current layer.

~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_utils.py in create_keras_history(tensors)
    198     keras_tensors: The Tensors found that came from a Keras Layer.
    199   """
--> 200   _, created_layers = _create_keras_history_helper(tensors, set(), [])
    201   return created_layers
    202 

~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_utils.py in _create_keras_history_helper(tensors, processed_ops, created_layers)
    244             constants[i] = backend.function([], op_input)([])
    245       processed_ops, created_layers = _create_keras_history_helper(
--> 246           layer_inputs, processed_ops, created_layers)
    247       name = op.name
    248       node_def = op.node_def.SerializeToString()

~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_utils.py in _create_keras_history_helper(tensors, processed_ops, created_layers)
    251       created_layers.append(op_layer)
    252       op_layer._add_inbound_node(  # pylint: disable=protected-access
--> 253           layer_inputs, op.outputs)
    254       processed_ops.update([op])
    255   return processed_ops, created_layers

~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in _add_inbound_node(self, input_tensors, output_tensors, arguments)
   1793     """
   1794     inbound_layers = nest.map_structure(lambda t: t._keras_history.layer,
-> 1795                                         input_tensors)
   1796     node_indices = nest.map_structure(lambda t: t._keras_history.node_index,
   1797                                       input_tensors)

~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/util/nest.py in map_structure(func, *structure, **kwargs)
    513 
    514   return pack_sequence_as(
--> 515       structure[0], [func(*x) for x in entries],
    516       expand_composites=expand_composites)
    517 

~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/util/nest.py in <listcomp>(.0)
    513 
    514   return pack_sequence_as(
--> 515       structure[0], [func(*x) for x in entries],
    516       expand_composites=expand_composites)
    517 

~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in <lambda>(t)
   1792             `call` method of the layer at the call that created the node.
   1793     """
-> 1794     inbound_layers = nest.map_structure(lambda t: t._keras_history.layer,
   1795                                         input_tensors)
   1796     node_indices = nest.map_structure(lambda t: t._keras_history.node_index,

AttributeError: 'tuple' object has no attribute 'layer'

ConditionalRNN or ConditionalRecurrent?

I am trying to import the layer, which now seems to be called ConditionalRNN, although the README does not mention this. If I change only that name in the example code, I get an error:

from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM

from cond_rnn import ConditionalRNN

time_steps, input_dim, output_dim, batch_size, cond_size = 128, 6, 12, 32, 5
inputs = Input(batch_input_shape=(batch_size, time_steps, input_dim))
cond_inputs = Input(batch_input_shape=(batch_size, cond_size))

outputs = ConditionalRNN(LSTM(units=output_dim))([inputs, cond_inputs])
print(outputs.shape)  # (batch_size, output_dim)

TypeError: '<' not supported between instances of 'LSTM' and 'int'

Any idea on what is going on?

Using cond_rnn with different type of static data

Hi philipperemy,

I could not figure out how to contact you, so I created this issue.

Thank you for sharing this code.

Before looking into the details I wanted to know if this would be suitable for my problem.

I noticed that you have conditions (one-hot encoded vectors) such as: [0 0 1] or [1 0 0], etc. Referring to the blog you originally built this code for, I understand that these condition vectors can be different cities that might have different weather conditions if I understand it correctly.

I have static and time series data and I want my static features to affect my predictions (forecast). Output is a sequence (not classification).

Time-series data example: 100, 90, 60, 40, 20, .... (single feature with multiple examples with varying sequence length)
Static data: 7 different features for each example at time t=0. Example: [ 5, 100, 0.8, 0.5, 10, 3.65, 7] --> This is not a one-hot encoded vector such as [1 0 0], etc. and each number has an effect on how the time series progresses for each example.

Can this code be used for a problem like this or does the input have to be a one-hot encoded vector?

Any help is appreciated. If you are interested in my problem, and if this works out, I'm planning to write a paper using the data I have and I can include your name in the paper.

Thank you.
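
A traceback quoted elsewhere on this page shows the condition being passed through a Dense layer to build the initial RNN state (tf.keras.layers.Dense(units=units)(cond)). If that still holds for the installed version, nothing in that step requires a one-hot vector: an affine projection accepts any real-valued vector. The sketch below mimics that projection in plain Python. The dense helper, the random weights and hidden_size are assumptions made for illustration; only the 7-feature vector comes from the post.

```python
import random


def dense(vector, weights, bias):
    """Affine map with one output per row of `weights` (no activation)."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


random.seed(0)
hidden_size = 4  # hypothetical RNN state size

# Real-valued static features for one example, as given in the post.
static_features = [5, 100, 0.8, 0.5, 10, 3.65, 7]

# Randomly initialized weights stand in for what training would learn.
weights = [[random.uniform(-0.1, 0.1) for _ in static_features]
           for _ in range(hidden_size)]
bias = [0.0] * hidden_size

# The projected vector plays the role of the RNN's initial state.
initial_state = dense(static_features, weights, bias)
print(len(initial_state))
```

Since the features in the example sit on very different scales (0.5 next to 100), standardizing them before feeding them to the network is a common precaution.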

No more than 10 conditional features

I get "IndexError: list index out of range" when there are more than 10 conditional features.
In the following code, as long as c11 is included, the model throws the IndexError.

kwargs for tf.keras.layers.RNN

Hi,

I've just started to use cond_rnn and I'm trying to replace LSTM layers in my model with ConditionalRNN.

The README says that one should be able to use keywords like return_sequences.

*args, **kwargs: Any parameters of the tf.keras.layers.RNN class, such as return_sequences, return_state, stateful, unroll...

However, since be1e97f, ConditionalRNN derives from tf.keras.layers.Layer, which rejects these kwargs. Only the following kwargs are allowed for tf.keras.layers.Layer:

input_shape
batch_input_shape
batch_size
weights
activity_regularizer
autocast

Due to the super().__init__(**kwargs) call, keywords for the RNN layer are considered to be invalid and an exception is raised.

My environment:

tensorflow==2.1.0
Keras==2.3.1
cond-rnn==2.1
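
The failure described above, and one possible way around it, can be sketched without TensorFlow. All three classes below are hypothetical stand-ins written for this illustration: BaseLayer imitates a base class that accepts only the kwargs listed above, ForwardEverything reproduces the super().__init__(**kwargs) pattern, and SplitKwargs pops the RNN-specific kwargs first. This is not the actual cond_rnn code.

```python
class BaseLayer:
    """Stand-in for a base class that only accepts a fixed set of kwargs."""

    ALLOWED = {"input_shape", "batch_input_shape", "batch_size",
               "weights", "activity_regularizer", "autocast"}

    def __init__(self, **kwargs):
        unknown = set(kwargs) - self.ALLOWED
        if unknown:
            raise TypeError(f"Keyword argument not understood: {sorted(unknown)}")
        self.base_kwargs = kwargs


class ForwardEverything(BaseLayer):
    """Reproduces the reported failure: RNN kwargs reach the base class."""

    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units


class SplitKwargs(BaseLayer):
    """Pops the RNN-specific kwargs before calling the base constructor."""

    RNN_KEYS = ("return_sequences", "return_state", "stateful", "unroll")

    def __init__(self, units, **kwargs):
        self.rnn_kwargs = {k: kwargs.pop(k) for k in self.RNN_KEYS if k in kwargs}
        super().__init__(**kwargs)
        self.units = units


failure = None
try:
    ForwardEverything(32, return_sequences=True)
except TypeError as err:
    failure = str(err)
    print(failure)

layer = SplitKwargs(32, return_sequences=True, batch_size=8)
print(layer.rnn_kwargs, layer.base_kwargs)
```

In the second variant the RNN kwargs stay available for whatever inner RNN the wrapper builds, while the base class only ever sees the kwargs it understands.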

Bidirectional Layer with Functional API

Hi Philippe Remy,

I have been trying to use the ConditionalRecurrent wrapper with the Bidirectional layer in the Functional API, so that I can stack layers, with no success so far. It runs successfully for layers such as LSTM and GRU, but not for Bidirectional. The code looks something like this:

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, GRU, LSTM, Bidirectional
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import Input, Model

from cond_rnn import ConditionalRecurrent

patience = 50

early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                      patience=patience,
                                                      mode='min')
optimizer = tf.keras.optimizers.Adam(
    learning_rate=0.00001,
    beta_1=0.95,
    beta_2=0.999,
    epsilon=1e-07,
    amsgrad=False)

# Weight initialization
initializer = tf.keras.initializers.Orthogonal()

# Functional API
i = Input(shape=[6, 3], name='input_0')
c = Input(shape=[3], name='input_1')

# Bi-directional layers
forward_layer = ConditionalRecurrent(GRU(units=30, return_sequences=True))
backward_layer = ConditionalRecurrent(GRU(units=30, return_sequences=True, go_backwards=True))

x = Bidirectional(layer=forward_layer,backward_layer=backward_layer, name='cond_rnn_1')([i,c])
x = Dense(units=2, activation='linear')(x)
model = Model(inputs=[i, c], outputs=[x])

model.compile(optimizer=optimizer, loss='mse', metrics = ['mae','mape'])
history = model.fit(w1.train, epochs=40000, validation_data=w1.val, verbose=2,callbacks=[early_stopping])

The error I'm getting is:

ValueError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_100012\3844763617.py in
30 backward_layer = ConditionalRecurrent(GRU(units=30, return_sequences=True, go_backwards=True))
31
---> 32 x = Bidirectional(layer=forward_layer,backward_layer=backward_layer, name='cond_rnn_1')([i,c])
33 x = Dense(units=2, activation='linear')(x)
34 model = Model(inputs=[i, c], outputs=[x])

~\AppData\Local\Continuum\anaconda3\envs\Tensorflow\lib\site-packages\keras\layers\wrappers.py in call(self, inputs, initial_state, constants, **kwargs)
598 if num_states % 2 > 0:
599 raise ValueError(
--> 600 'When passing initial_state to a Bidirectional RNN, '
601 'the state should be a list containing the states of '
602 'the underlying RNNs. '

ValueError: When passing `initial_state` to a Bidirectional RNN, the state should be a list containing the states of the underlying RNNs. Received: [<KerasTensor: shape=(None, 3) dtype=float32 (created by layer 'input_1')>]

My guess is that the Bidirectional RNN itself does not like the [i,c] inputs that I'm trying to pass it, but I'm not sure whether I'm correct here.

Another question I have in mind:
If I were to add an Encoder-Decoder architecture, could I still use ConditionalRecurrent, or would that have similar issues to the Bidirectional layer? I haven't tried this yet, but it is something I want to try.

Thank you.

[QUESTION]

Non-issue: fix exception message

cond_rnn/cond_rnn/cond_rnn.py

Line 56 in c555fd9

 raise Exception('Initial cond should have shape: [2, batch_size, hidden_size]\n' 

As titled, this is a non-issue, but I still spent a couple of minutes trying to figure out what "\nor" means. Is it a negated OR? )))
Either remove the newline or put initial_cond_shape inside the exception message string manually.
In your case the repr of the tuple ("msg", initial_cond_shape) is printed, so the escape characters are shown as-is instead of being interpreted.
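
The behaviour described above can be reproduced in plain Python. In the sketch below the message text follows the line quoted from cond_rnn.py, while the shape value and the "Got:" suffix are made up for illustration.

```python
shape = (None, 4, 4)  # illustrative shape, not taken from a real run

# Two positional arguments: str() shows the repr of the args tuple,
# so the newline appears as a backslash followed by the letter n.
two_args = Exception(
    'Initial cond should have shape: [2, batch_size, hidden_size]\n'
    'or [batch_size, hidden_size]. Shapes do not match.', shape)
print(str(two_args))

# One pre-formatted string: the newline is rendered as a real line break.
one_arg = Exception(
    'Initial cond should have shape: [2, batch_size, hidden_size]\n'
    f'or [batch_size, hidden_size]. Shapes do not match. Got: {shape}')
print(str(one_arg))
```

Formatting the shape into the message keeps the exception at a single argument, which is what makes the line break render as intended.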

Bidirectional LSTM

I would like to use your ConditionalRNN code with a bidirectional LSTM layer, e.g. like here https://keras.io/examples/nlp/bidirectional_lstm_imdb/.
Is this possible?

Problem with one example found in Stackoverflow

Hi,

Thank you very much for writing the library.

I was looking into feeding non-sequential input to my LSTM and found this code snippet on Stack Overflow.

import numpy as np

from tensorflow.keras.models import Model
from tensorflow.keras.layers import LSTM, Dense, Input
from cond_rnn import ConditionalRNN

inputs = Input(name='in',shape=(5,5)) # Each observation has 5 dimensions à 5 time-steps each
x = Dense(64)(inputs)

inputs_aux = Input(name='in_aux', shape=[5]) # For each of the 5 dimensions, a non-time-series observation too
x = ConditionalRNN(7, cell='LSTM', cond=inputs_aux)(x)

predictions = Dense(1)(x)
model = Model(inputs=[inputs, inputs_aux], outputs=predictions)
#model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop', loss='mean_squared_error', metrics=['mse'])
data = np.random.standard_normal([100,5,5]) # Sample of 100 observations with 5 dimensions à 5 time-steps each
data_aux = np.random.standard_normal([100,5]) # Sample of 100 observations with 5 dimensions à only 1 non-time-series value each
labels = np.random.standard_normal(size=[100]) # For each of the 100 obs., a corresponding (single) outcome variable

model.fit([data,data_aux], labels)

I just tried to run it but got an error:

TypeError: ('Keyword argument not understood:', 'cond')

According to the API, the condition input should be passed as follows:

outputs = cond_rnn.ConditionalRNN(units=NUM_CELLS, cell='GRU')([inputs, cond])

but it is not clear to me why this sample code passes the condition as a parameter. Can you help me correct the above code? Thank you very much for your time.

Embedding Layer with ConditionalRecurrent

Hi,

First, thanks for this great library, it's very useful and inspiring.

I'm wondering whether using an embedding layer before the ConditionalRecurrent layers could have any limitations.
I tried it and it worked.

I just want to make sure, from your point of view, that I'm not missing anything.

Thanks,
Alexandre

Question on conditions

Hi,
I came across this repo through Stack Overflow and really liked the idea! I'm trying to use what you did in order to predict the goals of players in a football league but I'm a bit confused on how I should structure the inputs.

Let's suppose I'm just focusing on strikers, in a league of 20 teams, with around 3 strikers per team. Drawing a parallel with your example, the stations would be the strikers, 20x3=60 in total. The number of timesteps would be (20-1)x2=38.
The variables I have are:

Expected goals of player/minute, at timestep t-1;
Goals of player for the game at timestep t; (= temperature)
Player, one-hot encoded; (= city)
Age;
My player's team;
Opposing team for the game at timestep t.

I would use the first two as continuous variables and the remaining as conditions, but I'm not really confident about this choice.
Also, I don't get why I have num_stations conditions and not just one. Probably because I need one condition for each prediction, so num_stations conditions, but wanted to be sure about this as well.

Matteo

What shape should the static data have?

I have a glucose dataset for 200 patients, plus some static data that does not depend on time. The static data comes from a case form that every patient fills in: about 40 columns and 200 rows, one per patient. I also have roughly 3000 glucose rows for each patient. I want to predict glucose 30 minutes ahead. What should I do? How can I use this library for my work?
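
One way to arrange such data, sketched below in plain Python, is to cut each patient's series into sliding windows and attach the same static row to every window of that patient. This would give a time-series input of shape (n_windows, n_steps, 1) and a condition input of shape (n_windows, 40). The make_windows helper, the window length and the horizon (expressed in steps, since the post does not state the sampling interval) are all assumptions, and the numbers are synthetic.

```python
def make_windows(series, static_row, n_steps, horizon):
    """Slice one patient's series into (window, condition, target) samples."""
    windows, conditions, targets = [], [], []
    last_start = len(series) - n_steps - horizon
    for start in range(last_start + 1):
        end = start + n_steps
        windows.append([[v] for v in series[start:end]])  # (n_steps, 1)
        conditions.append(list(static_row))  # identical for every window
        targets.append(series[end + horizon - 1])  # value `horizon` steps ahead
    return windows, conditions, targets


# Synthetic stand-ins: 10 readings and 40 static form answers for one patient.
series = list(range(10))
static_row = [0.0] * 40

windows, conditions, targets = make_windows(series, static_row, n_steps=4, horizon=2)
print(len(windows), len(windows[0]), len(windows[0][0]))  # samples, steps, features
print(len(conditions[0]))
print(targets)
```

Running the helper once per patient and concatenating the results would produce the two inputs and the target for the whole dataset.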

Predicted time series for multiple users

Hi Philippe,

I was going through this repository and found it very interesting. Great work ! :)

I am trying to predict sales for multiple stores. My goal is to build a single LSTM model that predicts sales from these parallel time series, each of which has multiple features.

My input features for training would be:
+----------+-------+--------------+-------+
| Date     | Store | DayOfTheWeek | Sales |
+----------+-------+--------------+-------+
| 1/1/2019 | A     | 2            | 100   |
| 1/2/2019 | A     | 3            | 200   |
| 1/3/2019 | A     | 4            | 150   |
| 1/1/2019 | B     | 2            | 300   |
| 1/2/2019 | B     | 3            | 550   |
| 1/3/2019 | B     | 4            | 1000  |
+----------+-------+--------------+-------+

and my output for training would be

+----------+-------+--------------+-------+
| Date     | Store | DayOfTheWeek | Sales |
+----------+-------+--------------+-------+
| 1/4/2019 | A     | 5            | 220   |
| 1/4/2019 | B     | 5            | 700   |
+----------+-------+--------------+-------+

The problem is that an LSTM takes 3D input, i.e. (n_samples, n_timesteps, n_features), so I can only pass a single time series for a specific store.

How can I predict parallel multivariate time series? Is there a way to tell the LSTM input layer that, for the problem above, there are 2 time series with 2 features each, i.e. (2*2)?

Thank you in advance.
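
One possible arrangement for the tables above, sketched in plain Python: treat each store as one sample, keep the time-varying columns (DayOfTheWeek, Sales) as the sequence input, and encode the store identity as a one-hot condition vector. The variable names and the choice of a one-hot encoding are assumptions made for illustration; the numbers are copied from the tables.

```python
# (date, store, day_of_week, sales) rows from the training input table.
rows = [
    ("1/1/2019", "A", 2, 100),
    ("1/2/2019", "A", 3, 200),
    ("1/3/2019", "A", 4, 150),
    ("1/1/2019", "B", 2, 300),
    ("1/2/2019", "B", 3, 550),
    ("1/3/2019", "B", 4, 1000),
]
# Sales on 1/4/2019 from the training output table.
targets_by_store = {"A": 220, "B": 700}

stores = sorted({store for _, store, _, _ in rows})

series, conditions, targets = [], [], []
for index, store in enumerate(stores):
    # Time-varying features, shape (n_timesteps, n_features).
    # Rows are assumed to be in date order already, as in the table.
    series.append([[day, sales] for _, s, day, sales in rows if s == store])
    # Static condition: one-hot store identity, shape (n_stores,).
    conditions.append([1.0 if i == index else 0.0 for i in range(len(stores))])
    targets.append(targets_by_store[store])

print(len(series), len(series[0]), len(series[0][0]))  # samples, steps, features
print(conditions)
print(targets)
```

The series and conditions lists would then correspond to the two inputs of a conditional layer called as [inputs, cond], as in the examples quoted on this page.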

AttributeError: module 'tensorflow' has no attribute 'layers'

Hi!

I'm trying to use cond_rnn in my project and hit an AttributeError when initializing ConditionalRNN:

outputs = ConditionalRNN(units=NUM_CELLS, cell='LSTM', cond=cond_list, activation='relu')(inputs)

Here is the error output:
WARNING:tensorflow:From C:\Users\nosova.ea\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1630: calling BaseResourceVariable.init (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.

AttributeError Traceback (most recent call last)
in ()
----> 1 outputs = ConditionalRNN(units=NUM_CELLS, cell='LSTM', cond=cond_list, activation='relu')(inputs)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\cond_rnn\cond_rnn.py in init(self, units, cell, cond, *args, **kwargs)
40 for cond in cond:
41 init_state_list.append(tf.keras.layers.Dense(units=units)(cond))
---> 42 multi_cond_projector = tf.layers.Dense(1, activation=None, use_bias=True)
43 multi_cond_state = multi_cond_projector(tf.stack(init_state_list, axis=-1))
44 multi_cond_state = tf.squeeze(multi_cond_state, axis=-1)

AttributeError: module 'tensorflow' has no attribute 'layers'
