philipperemy / cond_rnn
Conditional RNNs for Tensorflow / Keras.
License: MIT License
Hello and first of all thank you very much for your work!
I want to use an injection layer (an Embedding) for categorical features before CondLSTM, as shown below, but I get an error:
forward_layer = ConditionalRecurrent(LSTM(units=256, return_sequences=True))
backward_layer = ConditionalRecurrent(LSTM(units=256, return_sequences=True, go_backwards=True))
i1 = Input(shape=(24, 14))
ic_1 = Input(shape=(4,))
norm = normalizer(i1)
v = vectorize_layer(ic_1)
embeding = Embedding(49, 4, input_length=4)(v)
inputs = (norm, embeding)
x = Bidirectional(layer=forward_layer,
                  backward_layer=backward_layer)(inputs)
x = Flatten()(x)
x = Dropout(.25)(x)
output = Dense(units=4, activation='linear')(x)
model = keras.Model([i1, ic_1], output)
in user code:
File "/usr/local/lib/python3.8/dist-packages/cond_rnn/cond_rnn.py", line 86, in call *
cond = self._standardize_condition(cond[0])
File "/usr/local/lib/python3.8/dist-packages/cond_rnn/cond_rnn.py", line 54, in _standardize_condition *
raise Exception('Initial cond should have shape: [2, batch_size, hidden_size] '
Exception: ('Initial cond should have shape: [2, batch_size, hidden_size] or [batch_size, hidden_size]. Shapes do not match.', TensorShape([None, 4, 4]))
Call arguments received by layer 'forward_conditional_recurrent_10' (type ConditionalRecurrent):
• inputs=('tf.Tensor(shape=(None, 24, 14), dtype=float32)', 'tf.Tensor(shape=(None, 4, 4), dtype=float32)')
• training=None
• kwargs=<class 'inspect._empty'>
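The exception above says the condition must have shape [2, batch_size, hidden_size] or [batch_size, hidden_size], while the Embedding output is 3-D: (None, 4, 4). A minimal NumPy sketch of one possible fix, assuming it is acceptable to concatenate the four per-feature embeddings into a single vector per sample (the array contents are made-up stand-ins):

```python
import numpy as np

batch_size = 8
# Stand-in for the Embedding output: 4 categorical features, each embedded in 4 dims.
embedded = np.random.rand(batch_size, 4, 4).astype("float32")

# Collapse the per-feature embeddings into one condition vector per sample.
cond = embedded.reshape(batch_size, -1)

print(embedded.shape)  # (8, 4, 4)
print(cond.shape)      # (8, 16)
```

In the Keras model itself, a Flatten layer (already used further down in the same model) applied to the Embedding output would be the usual counterpart of this reshape. Whether that resolves every error in this model has not been verified here.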
When I try to use an encoder-decoder LSTM model, I encounter this error:
ValueError: Graph disconnected: cannot obtain value for tensor KerasTensor(type_spec=TensorSpec(shape=(None, 10), dtype=tf.float32, name='input_18'), name='input_18', description="created by layer 'input_18'") at layer "conditional_recurrent_3". The following previous layers were accessed without issue: []
I'm trying to build the following architecture, inspired by the stacked LSTM example code in the repository. The only difference is that I also include Dropout layers between the stacked LSTM layers.
x = ConditionalRecurrent(LSTM(64,
                              batch_input_shape=(batchSize, num_samples, num_features),
                              activation='tanh',  # 'relu',
                              return_sequences=True, stateful=stateful))([i, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(128, activation='relu', return_sequences=True, stateful=stateful))([x, c])
x = Dropout(0.2)(x)
x = ConditionalRecurrent(LSTM(256, activation='relu', return_sequences=True, stateful=stateful))([x, c])
x = Dropout(0.2)(x)
x = Dense(units=78)(x)
x = LeakyReLU()(x)
This however ends up giving me the below error due to the Dropout layer:
Exception encountered when calling layer "dropout_114" (type Dropout).
Attempt to convert a value (<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>) with an unsupported type (<class 'cond_rnn.cond_rnn.ConditionalRecurrent'>) to a Tensor.
Call arguments received:
• inputs=<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>
• training=False
Traceback (most recent call last):
File "<ipython-input-111-1334bb85e127>", line 295, in run_LSTM
history, model, Y_pred = train_and_get_predictions(initial_layer, model_x_train, Y_train, model_x_test)
File "<ipython-input-111-1334bb85e127>", line 169, in train_and_get_predictions
trainingModel = createModel_v2(initial_layer,
File "<ipython-input-111-1334bb85e127>", line 130, in createModel_v2
x = Dropout(0.2)(x)
File "/Users/aditya/miniconda3/envs/kaggle/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/Users/aditya/miniconda3/envs/kaggle/lib/python3.9/site-packages/tensorflow/python/framework/constant_op.py", line 102, in convert_to_eager_tensor
return ops.EagerTensor(value, ctx.device_name, dtype)
ValueError: Exception encountered when calling layer "dropout_114" (type Dropout).
Attempt to convert a value (<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>) with an unsupported type (<class 'cond_rnn.cond_rnn.ConditionalRecurrent'>) to a Tensor.
Call arguments received:
• inputs=<cond_rnn.cond_rnn.ConditionalRecurrent object at 0x157ce4520>
• training=False
Could you help me understand whether adding Dropout layers is supported and I'm doing it wrong, or whether it is known to be unsupported? Thanks! Happy to provide more context if needed.
Hi @philipperemy, thank you for providing this library! I'm getting an error when adding an embedding layer before ConditionalRecurrent. Could you help take a look? Perhaps I'm not using it correctly. I appreciate your time and input! Here is my code:
model = Sequential()
model.add(Embedding(input_dim=5, output_dim=4, input_length=35))
model.add(ConditionalRecurrent(GRU(units=64, return_sequences=True)))
model.add(Flatten())
model.add(Dense(units=6, activation='linear'))
the error is:
----> [3] model.add(ConditionalRecurrent(GRU(units=64, return_sequences=True)))
AssertionError: Exception encountered when calling layer "conditional_recurrent_52" (type ConditionalRecurrent).
in user code:
File "~/local/lib/python3.9/site-packages/cond_rnn/cond_rnn.py", line 74, in call *
assert isinstance(inputs, (list, tuple)) and len(inputs) >= 2
AssertionError:
Call arguments received by layer "conditional_recurrent_52" (type ConditionalRecurrent):
• inputs=tf.Tensor(shape=(None, 35, 4), dtype=float32)
• training=None
• kwargs=<class 'inspect._empty'>
Hello, I have tried the "lstm.py" example file.
In the example, the data runs from 1995-01-01 to 2020-05-13.
How can I predict unseen data, starting from 2020-05-14 up to a desired date?
With a plain autoregressive GRU model I was successful, but I don't know how to predict a multivariate time series into the future, since I don't have any condition dataframe after 2020-05-13.
Hi,
First of all, thanks for the package, which makes it possible to combine time series (time-variant) data with time-invariant data (conditions). I think there are many scenarios where both types of data need to be used at the same time.
I'm in a similar situation: I have time series data, for which I currently use an LSTM, and conditions (time-invariant features) that I want to incorporate into the problem.
The most straightforward solution was to duplicate the conditions along the time series, so that they do not change over time. I had no problem doing this, and it boosted my performance a lot (compared to using the conditions/time-invariant features alone). However, I also read some papers that advise against this combination, because it pollutes the time-variant information with time-invariant data and may just create a harder problem to solve.
Other papers suggest alternatives to the above approach; one of them is to set the hidden state at the initial step from the time-invariant data. I was trying to do this on my own (I had not found your package at that point) and came up with the following model definition (example):
from tensorflow import keras
from tensorflow.keras.layers import LSTM

# Time series input: 3 timesteps with 3 features each
lstm_input = keras.Input(shape=(3, 3), name="lstm_input")
# Time-invariant conditions: 3 features per sample
pre_go_live_input_hidden_states = keras.Input(shape=(3,), name="pre-go-live")
# Project the conditions to the LSTM state size (50)
pre_go_live_fc_layer_1 = keras.layers.Dense(50, activation='relu')(pre_go_live_input_hidden_states)
lstm_features = LSTM(50)
# Use the projected conditions as both the initial hidden state and the initial cell state
outputs = lstm_features(lstm_input, initial_state=[pre_go_live_fc_layer_1, pre_go_live_fc_layer_1])
# Single-unit output layer on top of the LSTM features
pred = keras.layers.Dense(1, name="prediction_layer")(outputs)
# End-to-end model taking both the time series and the conditions
model = keras.Model(
    inputs=[lstm_input, pre_go_live_input_hidden_states],
    outputs=[pred]
)
Here, my conditions are the pre_go_live_input_hidden_states variable, which needs a Dense layer afterwards to reduce the data to size 50 so that it can be passed as the initial_state of the LSTM (ignore the fact that I'm also passing the same values as the initial cell state).
My question is about the difference between the model definition above and how ConditionalRNN works behind the scenes. Is it similar?
I'm having trouble understanding the ConditionalRecurrent class, mostly the treatment given to the conditions input.
My current time series data has shape (29000, 7, 14), and my conditions data has shape (29000, 124); those 124 features include encoded variables and numerical variables.
Is it possible to use your package and still use the 124 conditions to initialize the hidden state at the initial step?
Thanks,
Francisco Parrilla A
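A traceback quoted further down this page shows the layer calling tf.keras.layers.Dense(units=units) on the condition, which suggests a learned linear projection to the state size, much like the Dense(50) above. A minimal NumPy sketch of that projection for the 124-feature case, assuming a plain linear map (the weights here are random stand-ins, not trained values):

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, cond_dim, units = 32, 124, 50

cond = rng.random((batch_size, cond_dim))          # time-invariant features
W = rng.normal(scale=0.1, size=(cond_dim, units))  # stand-in for a learned Dense kernel
b = np.zeros(units)                                # stand-in for a learned Dense bias

# Linear projection of the conditions to the recurrent state size.
initial_state = cond @ W + b
print(initial_state.shape)  # (32, 50)
```

The condition width (124) and the state size (50) are independent; only the projection matrix has to bridge them.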
Hi, I am doing research on time-series forecasting for electric load. I am quite new to TensorFlow, so please excuse me if I make errors.
I came across your library and thought it was very interesting! I am building a 24-ahead MIMO forecasting model. I have noticed that a pure LSTM may disregard features that I consider important, so I have experimented with this framework.
I have tried using the conditional RNN cell on the meter data, following the examples in the repository. As auxiliary features I used features of the predicted time step (e.g. one-hot encoded day of week) and important lags.
First, writing and fitting the sequential model:
ConditionalRNN = Sequential(layers=[ConditionalRNN(32, cell='LSTM'),
                                    Dense(HORIZON)])
ConditionalRNN.compile(optimizer=tf.optimizers.Adam(), loss='mse', metrics=[tf.metrics.MeanSquaredError()])
history = ConditionalRNN.fit([train_inputs['X'],train_c1,train_c2], train_inputs['target'], batch_size=32, epochs=MAX_EPOCHS, validation_split=0.15, callbacks=[early_stopping], verbose=1)
The model works fine and shows better accuracy than the LSTM with the full data passed through it.
However, I was getting this warning from TensorFlow:
WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor, but we receive a <class 'tuple'> input: (<tf.Tensor 'IteratorGetNext:0' shape=(None, 24, 5) dtype=float32>, <tf.Tensor 'IteratorGetNext:1' shape=(None, 8) dtype=float64>, <tf.Tensor 'IteratorGetNext:2' shape=(None, 2) dtype=float64>)
Consider rewriting this model with the Functional API.
I have rewritten the same model using the functional API:
i = Input(shape=[HORIZON, 5], name='time-series')
c1 = Input(shape=[8], name='one-hot')
c2 = Input(shape=[2], name='lags-as-features')
x = ConditionalRNN(32, cell='LSTM', name='cond_rnn_0')([i, c1, c2])
#x = Dense(HORIZON)(x)
#ConditionalRNN = Model(inputs=[i, c1, c2], outputs=[x])
This way there are no warnings, but the accuracy has dropped significantly.
Do you have any explanation on why it might be the case?
Hello,
I currently have a very basic LSTM NN, which predicts the next array of n_features, from the past 10 time steps, i.e.
model = Sequential()
model.add(LSTM(n_lookback, input_shape=(n_lookback, n_features)))
model.add(Dense(n_features))
model.compile()
where n_features is the number of features in each timestep and n_lookback is the number of previous time steps provided to the LSTM.
Since I am also trying to add conditions that are not time dependent, I thought I'd try your cond_RNN.
I am having some issues getting started, though, since the examples are a bit too complex for me; I only started working with NNs and Keras a few months ago.
The condition would be a float, i.e. a value of a physical property, such as 3.1 or 4.5 etc.
Could you offer any help on how to start out?
Thanks in advance. You have already been so helpful with the macOS import issue 😊
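As a starting point, the data side can be sketched without any Keras code. Error messages elsewhere on this page indicate that a condition is expected as a 2-D array of shape [batch_size, hidden_size], so a single float per sample needs an explicit feature axis. A minimal sketch, where the sizes and values are made-up assumptions:

```python
import numpy as np

n_samples, n_lookback, n_features = 32, 10, 6

# Time series windows: one (n_lookback, n_features) block per sample.
x = np.random.rand(n_samples, n_lookback, n_features)
# One physical-property value per sample, e.g. 3.1 or 4.5.
physical_property = np.random.uniform(3.0, 5.0, size=n_samples)

# Add a feature axis: (n_samples,) -> (n_samples, 1)
c = physical_property.reshape(-1, 1)

print(x.shape, c.shape)  # (32, 10, 6) (32, 1)
```

The two arrays would then be passed together, as in the [inputs, condition] pairs used by other snippets on this page.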
Thank you so much for your work; this is an excellent library. I'm trying to use it in my project, but I have run into a problem: I'd like to use a regularizer such as 'kernel_regularizer' in the ConditionalRNN layer, but it does not seem to be supported. I checked the TensorFlow documentation, and it seems that "tf.keras.layers.rnn" doesn't have a "kernel_regularizer" argument, so may I ask how to use a regularizer in ConditionalRNN? Thanks
I have built a net that contains a CondRNN layer and saved the compiled model using model.save('./cond_rnn_lstm.h5'). But when I load the saved model from the file ./cond_rnn_lstm.h5 with tensorflow.keras.models.load_model('./cond_rnn_lstm.h5'), I get the error ValueError: Unknown layer: ConditionalRNN. Has anyone encountered this error, and how can I load a model with a CondRNN layer inside? Thanks!
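import tensorflow as tf
from tensorflow.keras.callbacks import EarlyStopping, TensorBoard
from tensorflow.keras.layers import Dense, TimeDistributed
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from cond_rnn import ConditionalRNN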
model = Sequential(layers=[
    ConditionalRNN(units=64, cell='LSTM', return_sequences=True),  # num_cells = 10
    TimeDistributed(Dense(Y_train_others.shape[2], activation='relu'), name='others_output')
])
optim = Adam(lr=0.003)
model.compile(optimizer=optim, loss={'output_1': 'mse'}, metrics={'output_1': 'mse'})
callbacks = [
    EarlyStopping(patience=30, monitor='val_mse'),
    TensorBoard(log_dir='./training_logs_0508/seq'),
]
out = model.fit(x=[X_train, categorical_appid, categorical_advertiser],
                y=Y, epochs=100, batch_size=1024,
                verbose=2, callbacks=callbacks, workers=100,
                validation_data=([X_eval, categorical_appid_eval, categorical_advertiser_eval], Y_eval),
                sample_weight=None)
model.save('./cond_rnn_lstm.h5')
previous_model = tf.keras.models.load_model('./cond_rnn_lstm.h5')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-86-1bf00b883022> in <module>
----> 25 previous_model = tf.keras.models.load_model('./cond_rnn_lstm.h5')
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/saving/save.py in load_model(filepath, custom_objects, compile, options)
205 (isinstance(filepath, h5py.File) or h5py.is_hdf5(filepath))):
206 return hdf5_format.load_model_from_hdf5(filepath, custom_objects,
--> 207 compile)
208
209 filepath = path_to_string(filepath)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/saving/hdf5_format.py in load_model_from_hdf5(filepath, custom_objects, compile)
182 model_config = json_utils.decode(model_config.decode('utf-8'))
183 model = model_config_lib.model_from_config(model_config,
--> 184 custom_objects=custom_objects)
185
186 # set weights
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/saving/model_config.py in model_from_config(config, custom_objects)
62 '`Sequential.from_config(config)`?')
63 from tensorflow.python.keras.layers import deserialize # pylint: disable=g-import-not-at-top
---> 64 return deserialize(config, custom_objects=custom_objects)
65
66
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/serialization.py in deserialize(config, custom_objects)
175 module_objects=LOCAL.ALL_OBJECTS,
176 custom_objects=custom_objects,
--> 177 printable_module_name='layer')
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
356 custom_objects=dict(
357 list(_GLOBAL_CUSTOM_OBJECTS.items()) +
--> 358 list(custom_objects.items())))
359 with CustomObjectScope(custom_objects):
360 return cls.from_config(cls_config)
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/engine/sequential.py in from_config(cls, config, custom_objects)
491 for layer_config in layer_configs:
492 layer = layer_module.deserialize(layer_config,
--> 493 custom_objects=custom_objects)
494 model.add(layer)
495 if (not model.inputs and build_input_shape and
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/layers/serialization.py in deserialize(config, custom_objects)
175 module_objects=LOCAL.ALL_OBJECTS,
176 custom_objects=custom_objects,
--> 177 printable_module_name='layer')
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/utils/generic_utils.py in deserialize_keras_object(identifier, module_objects, custom_objects, printable_module_name)
345 config = identifier
346 (cls, cls_config) = class_and_config_for_serialized_keras_object(
--> 347 config, module_objects, custom_objects, printable_module_name)
348
349 if hasattr(cls, 'from_config'):
/opt/conda/lib/python3.7/site-packages/tensorflow/python/keras/utils/generic_utils.py in class_and_config_for_serialized_keras_object(config, module_objects, custom_objects, printable_module_name)
294 cls = get_registered_object(class_name, custom_objects, module_objects)
295 if cls is None:
--> 296 raise ValueError('Unknown ' + printable_module_name + ': ' + class_name)
297
298 cls_config = config['config']
ValueError: Unknown layer: ConditionalRNN
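The traceback shows that load_model accepts a custom_objects argument, and the failure occurs because the class name ConditionalRNN is not found among the registered objects. A minimal sketch of passing the class explicitly follows; load_cond_rnn_model is a hypothetical helper name, and whether the layer is fully restored is an assumption that depends on the installed cond_rnn version supporting Keras serialization:

```python
def load_cond_rnn_model(path):
    """Load a saved Keras model that contains a ConditionalRNN layer."""
    # Imported lazily so that defining this helper does not require
    # TensorFlow or cond_rnn to be installed.
    import tensorflow as tf
    from cond_rnn import ConditionalRNN

    # Map the class name stored in the file back to the Python class.
    return tf.keras.models.load_model(
        path, custom_objects={"ConditionalRNN": ConditionalRNN}
    )
```

With both packages installed, the call from the report would become load_cond_rnn_model('./cond_rnn_lstm.h5').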
I would like to use a conditional LSTM for a guitar amplifier model, where the conditional part would be user controls such as gain/drive on an amplifier (reference https://github.com/GuitarML/GuitarLSTM repo). Would the cond_rnn layer work if my model has two conv1d layers prior to the conditional LSTM? For example:
model = Sequential()
model.add(Conv1D(conv1d_filters, 12, strides=conv1d_strides, activation=None, padding='same',input_shape=(input_size,1)))
model.add(Conv1D(conv1d_filters, 12, strides=conv1d_strides, activation=None, padding='same'))
model.add(LSTM(hidden_units))
model.add(Dense(1, activation=None))
Where "input_size" is something like 100, for an input shape of (100,1) to the first conv1d layer. Would it work to replace the LSTM here with your conditional RNN? Or is a different approach required, where the condition is introduced at the first layer?
Thanks!
Thanks for the example implementation and making this into a library! Very helpful!
I am writing to see if you have any suggestions on processing time-series text data? I haven't been able to find much help in this area since most examples only show time series numerical data. Specifically, do you have any ideas on how we can process sequences of text data for a patient?
E.g. A patient can have multiple doctors' notes attached to the patient's profile; each doctor's note occurs at a different timestamp. As a result (and expectedly), each patient in the data can have wildly different timestamps. The order of the doctor's notes is important. More recent doctor notes are more important than older ones. It's also important to keep track of the patient ID. The goal is to estimate mortality risk for the patients in the next year.
history = model.fit(X_train, y_train, epochs=10, batch_size=8, validation_data=(X_test, y_test), verbose=0)
My other question: if we set batch_size when fitting (as above), is the same batch size also applied to the external data?
I haven't used Conditional Recurrent yet, and I have only a medium level of coding knowledge, so if my question is too silly, please bear with me and answer accordingly. I'm just asking to be sure of what I'm doing. Let's assume that I have input data of 3 cities' temperature values, and I also have additional data like that given below. While I want to predict the temperature of one of the cities, I also want to use the additional data, but I don't want to look back too far in the additional data because that adds noise. For this reason, I want to go back only 3 days for the additional data; in my original data, however, I want to create samples that look back 30 days to make a prediction. The Conditional Recurrent library lets me do that, am I right?
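The two look-back lengths described here can be sketched on the data side alone. The following assumes, as one possible design, that the last 3 days of additional data are flattened into a single fixed-size vector and used as the condition; all sizes and arrays are made-up stand-ins, and whether a short window suits a layer intended for time-invariant conditions is a judgment call:

```python
import numpy as np

n_days, n_cities, n_extra = 100, 3, 2
temps = np.random.rand(n_days, n_cities)  # main series: temperatures of 3 cities
extra = np.random.rand(n_days, n_extra)   # additional data
main_lookback, extra_lookback = 30, 3

X, C, y = [], [], []
for t in range(main_lookback, n_days):
    X.append(temps[t - main_lookback:t])            # 30 days of temperatures, shape (30, 3)
    C.append(extra[t - extra_lookback:t].ravel())   # last 3 days of extra data, shape (6,)
    y.append(temps[t, 0])                           # next-day temperature of the first city
X, C, y = np.array(X), np.array(C), np.array(y)

print(X.shape, C.shape, y.shape)  # (70, 30, 3) (70, 6) (70,)
```

Every sample keeps the same index across X, C and y, so any batch drawn during fitting pairs each 30-day window with its own 3-day condition.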
Hello,
I have a LSTM RNN with the ConditionalRecurrent layer, for a time series prediction where I input the last 10 timesteps and a condition. It seems to me that at times, the conditions get confused, since when testing after training, the outputs of some predictions look extremely similar to the ones expected for a different condition than the one provided.
More precisely:
Let's say I have 5 possible conditions, A,B,C,D or E (that are float values). While training and checking the predictions every N epochs, I noticed that the model gets really good at predicting when the condition is A, B, C or D but when the condition is E, the output looks similar to what is to be expected if the condition was A, for example. If I train for more epochs, the situation is the same, but for different conditions, i.e. A, B and E work fine, but D predictions look like B predictions.
Is this something that can happen? Otherwise it means that my implementation of the ConditionalRecurrent layer is incorrect.
Hi, I tried to install the conditional RNN package but it failed. Below is the error:
pip install cond-rnn
Collecting cond-rnn
Using cached cond-rnn-3.2.1.tar.gz (6.8 kB)
Preparing metadata (setup.py) ... error
error: subprocess-exited-with-error
× python setup.py egg_info did not run successfully.
│ exit code: 1
╰─> [8 lines of output]
Traceback (most recent call last):
File "", line 2, in
File "", line 34, in
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
× Encountered error while generating package metadata.
╰─> See above for output.
note: This is an issue with the package mentioned above, not pip.
hint: See above for details.
Hi,
Your cond_rnn is great work, and I would like to use it in my own work.
Unfortunately, we use PyTorch more often.
Could you also provide a PyTorch version, or give me some hints on how to implement it?
Best Regards,
kingaza
C is the target, but I don't see C getting any special treatment further down; in fact, it appears we're predicting all of A, B and C for each timestep, since our y includes all of these values. Could you please let me know what's actually going on here?
I have a list of 564 time series. Each time series is a 1-D vector of a different size (e.g. 366, 558, 812, ...).
I also have a corresponding dataframe of shape 564×3 that holds the conditions (categorical features).
My batch size is 4 and the number of timesteps is 10.
What is the best way to transform this data into the proper format for a conditional RNN
(a 3-D tensor with shape [batch_size, timesteps, input_dim] built from the list of time series, and a 2-D tensor or list of tensors with shape [batch_size, cond_dim])?
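One common way to handle series of different lengths is to cut each one into fixed-length windows and repeat the series' condition row for every window. A minimal sketch under that assumption, using three stand-in series with the lengths mentioned above and made-up integer category codes (one-hot encoding the codes would likely be a further step):

```python
import numpy as np

rng = np.random.default_rng(0)
lengths = [366, 558, 812]                                  # three of the series lengths
series_list = [rng.random(n) for n in lengths]             # stand-in 1-D series
conditions = rng.integers(0, 5, size=(len(lengths), 3))    # one row of 3 codes per series
timesteps = 10

X, C, y = [], [], []
for series, cond in zip(series_list, conditions):
    for start in range(len(series) - timesteps):
        X.append(series[start:start + timesteps, None])    # window of shape (10, 1)
        C.append(cond)                                     # same condition for every window
        y.append(series[start + timesteps])                # value right after the window
X = np.asarray(X, dtype="float32")
C = np.asarray(C)
y = np.asarray(y, dtype="float32")

print(X.shape, C.shape, y.shape)  # (1706, 10, 1) (1706, 3) (1706,)
```

The batch size of 4 then only determines how many of these rows are drawn per training step.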
This is an inspiring project. Thanks.
I am relatively new to neural networks. I do not see why you use separate dense layers for different conditions. It seems to me that merging the condition vectors into one vector and feeding that vector into a single dense layer can achieve the same result.
Take predicting air quality for example. If we have two features that are not time series, say city and number of vehicles, the first feature will be converted to one-hot encoding, and the second feature will be directly represented as a number. Can I append the number of vehicles to the city vector and feed the new vector to a dense layer? That seems natural to me.
What are your concerns when you use different dense layers for different conditions?
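For purely linear projections that are added together, the intuition in this question can be checked numerically: separate dense maps whose outputs are summed equal one dense map on the concatenated vector. The sketch below assumes exactly that setup (summed linear projections, random stand-in weights); the library may combine its per-condition projections differently, in which case the equivalence would not hold exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, units = 4, 8
city = np.eye(5)[rng.integers(0, 5, size=batch)]  # one-hot city, shape (4, 5)
vehicles = rng.random((batch, 1))                 # number of vehicles, shape (4, 1)

W_city = rng.normal(size=(5, units))
W_veh = rng.normal(size=(1, units))
b = rng.normal(size=units)

# Two separate projections, summed.
separate = city @ W_city + vehicles @ W_veh + b
# One projection of the concatenated vector with stacked weights.
merged = np.concatenate([city, vehicles], axis=1) @ np.vstack([W_city, W_veh]) + b

print(np.allclose(separate, merged))  # True
```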
Thanks for sharing this interesting project. I have been trying to understand how to shape the input tensor using conditional RNN in Keras but I am still very unclear about how to present the input data in the correct shape.
I am working on 10 stations (num_stations=10). For each station, I have one year (timesteps = 365) of records of three continuous variables: A and B are predictive variables (thus, dim_input=2) and C is the target variable. For each station, I also have two conditions: a categorical condition that can take 5 classes (dim_cond1 = 5) and a continuous condition (dim_cond2 = 1).
What I want to do is to have a model that is trained based on information from the ten stations taking into account the two conditions for every station (I call this mode the global model).
What I am confused about is the shape of the input tensor that I should feed into the model. I know that for an LSTM model trained on the time series of only one station (I call this the local model), the input tensor takes the shape [batch_size, timesteps, input_dim]. For the local model, I am able to use a generator that extracts and yields a tuple (samples, targets), where samples (one batch of input data) and targets are from one station.
But the global model should sample a part from the data of each station before completing each epoch.
Appending the time series of different stations on top of the other and iterating through all the rows, does not make sense in the context of my problem since the date goes, for instance, from 2020-12-29 (a winter day) to 1986-07-01 (a summer day).
I have trouble understanding how the passage and batch extraction from one station to another should take place in the global model. I see two possible solutions:
1- To be able to use a generator similar to that of the local model: create a training loop on stations and train on data from each station one after another and update the weights but reset the state to differentiate between time series.
2- Otherwise, is there a way to build a generator that could somehow yield a batch from all stations?
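Option 2 can be sketched as a generator that picks a random station and a random window start for every sample in the batch, so that windows never cross station boundaries. This is only one possible design; global_batches is a hypothetical helper, the window length of 30 is an assumption, and the two conditions are concatenated here into a single 6-wide vector (one-hot class plus the continuous value) for brevity:

```python
import numpy as np

def global_batches(series, cond, timesteps, batch_size, rng):
    """Yield ([x, c], y) batches whose samples come from randomly chosen stations.

    series: (num_stations, total_steps, dim_input + 1), last column is the target C.
    cond:   (num_stations, dim_cond), one condition row per station.
    """
    num_stations, total_steps, _ = series.shape
    while True:
        stations = rng.integers(0, num_stations, size=batch_size)
        starts = rng.integers(0, total_steps - timesteps, size=batch_size)
        # Inputs A and B over the window; target C at the step right after it.
        x = np.stack([series[s, t:t + timesteps, :-1] for s, t in zip(stations, starts)])
        y = np.array([series[s, t + timesteps, -1] for s, t in zip(stations, starts)])
        yield [x, cond[stations]], y

rng = np.random.default_rng(0)
series = rng.random((10, 365, 3))                        # 10 stations, 365 days, A, B, C
one_hot = np.eye(5)[rng.integers(0, 5, size=10)]         # categorical condition, 5 classes
cond = np.concatenate([one_hot, rng.random((10, 1))], axis=1)

(x, c), y = next(global_batches(series, cond, timesteps=30, batch_size=16, rng=rng))
print(x.shape, c.shape, y.shape)  # (16, 30, 2) (16, 6) (16,)
```

Because each sample carries its own station's condition row, no state reset between stations is needed in this non-stateful setup.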
Thanks for your thoughts.
I wonder if this is possible, thx!
Hi,
thank you very much for sharing this code. I'm wondering if one could implement the Conditional RNN as a true Keras layer (inheriting from tf.keras.layers.Layer) so that it could be used in a Keras model.
This would help people who are not familiar with the TensorFlow low-level API.
Lars
Thanks for sharing your code. Could you provide an example on real data, with real I mean data from life measurements e.g. https://www.kaggle.com/sudalairajkumar/daily-temperature-of-major-cities .
The example so far proves that you have working code, not that the model is superior.
I can't pass all possible regularizers to the model, I think you could pass them at line 38 and 44 in your code .
Here it could be nice to use things like kernel regularizers etc.
Your code is compatible with tensorflow 2.2.0 RC. This should be changed in the requirements.
Note, the only thing I could find with realistic examples is Modern Techniques for Forecasting Time Series.
I am not active in the Kaggle competition; just wanted to give an example.
Hi there,
Thank you for this wonderful repo. I just wanted to drop a theoretical question. Since we expect a small condition vector (say, of size 10) and a bigger LSTM layer (say, with a hidden state of size 64), how can we initialize the very first time step of the LSTM layer? I think that by mapping the vector as in $\vec{v} = \mathbf{W}\vec{x} + \vec{b}$ you ensure matching dimensions; however, would that not cause a lot of zeros in the matrix?
Warm Regards.
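The mapping in the question can be tried numerically. With a dense 64×10 matrix, every one of the 64 outputs is a weighted combination of all 10 inputs, so no zero padding is involved. A minimal sketch with random stand-in values for W and b (trained weights would differ):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(10)             # condition vector of size 10
W = rng.normal(size=(64, 10))  # dense projection matrix (hypothetical values)
b = rng.normal(size=64)

v = W @ x + b                  # candidate initial state of size 64
print(v.shape, np.count_nonzero(v))
```

Zeros would only appear systematically if W itself were sparse, for example if the condition were simply padded to 64 entries instead of projected.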
Hello, since you have been so helpful so far I thought I'd try again 😄
I have this simple LSTM, where the inputs are the time series, of shape (None, 10, 500), i.e. 10 time-steps and 500 features, and the condition is a single value per input so (None, 1).
My first attempt was designing the network without specifying the input shape and plugging in the data at training:
n_features = 500
model = Sequential()
model.add(ConditionalRecurrent(LSTM(128)))
model.add(Dense(n_features))
model.add(Dense(n_features))
model.compile(loss='mae', optimizer='adam')
# Train the model
history = model.fit(x=[train_x, train_c], y=train_y,
                    validation_data=([test_x, test_c], test_y),
                    epochs=1,
                    batch_size=int(train_x.shape[0] / 2),
                    verbose=1, shuffle=True)
Obviously I get the warning:
WARNING:tensorflow:Layers in a Sequential model should only have a single input tensor.
Received: inputs=(<tf.Tensor 'IteratorGetNext:0' shape=(3430, 10, 500) dtype=float32>,
<tf.Tensor 'IteratorGetNext:1' shape=(3430, 1) dtype=float32>).
Consider rewriting this model with the Functional API.
but the results are surprisingly good, better than before introducing the cond_rnn.
Trying to solve the above issue, I have tried following another one of your examples and ended up with this:
i = Input(shape=[10,500], name='input_0')
c = Input(shape=[1], name='input_1')
x = ConditionalRecurrent(LSTM(128, return_sequences=True, name='cond_rnn_0'))([i,c])
x = ConditionalRecurrent(LSTM(128, return_sequences=False, name='cond_rnn_1'))([x,c])
x = Dense(units=1, activation='softmax')(x)
model = Model(inputs=[i,c], outputs=[x])
model.compile(optimizer='adam',loss='mae')
In this case I get no error regarding the inputs, but the results are completely useless: the output is always zero for every feature.
Any idea on what is wrong in my code? I have followed your examples as much as possible while trying to apply it to my case, so maybe I have messed up something in the process.
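One detail in the functional version is worth checking: the last layer is Dense(units=1, activation='softmax'). Softmax normalizes across the units of the layer, so with a single unit the output is the same constant for every input and cannot follow a regression target. A pure-Python sketch of that property (this softmax is a plain reimplementation, not the Keras function):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# With a single unit, the result does not depend on the logit at all.
for logit in (-5.0, 0.0, 3.2):
    print(softmax([logit]))  # [1.0] each time
```

Note also that the Sequential version above ends in Dense(n_features) with no activation, whereas this version outputs a single value; matching the output width and using a linear activation would be the first thing to try, though that is a suggestion rather than a confirmed fix.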
EDIT: Solved part of the issue myself, so I removed that part of the question
Hi, this is similar to a previous issue I had opened here.
I basically have a model with two conditional parameters. Previously I had only one, and I found that feeding it twice was the best way to get the model to avoid confusion. Without feeding it twice to the model, the predictions would 'confuse the conditions' i.e. predict similar outputs for any condition.
I now have introduced another condition, that can either be a 1 or a 0 since it acts as a switch or boolean. To avoid the confusion that affected the previous model, I have fed the condition twice this time as well. This time this is not enough. This is the diagram of the model:
As you can see the two (None,1) layers (conditional) are fed twice but at the same time. This method still does not give good results.
Any idea on how this can be improved?
Can the ConditionalRNN layer be used with Keras' functional API?
def nn_model(ahead = 2):
    # descriptive data from umap
    input_vector = Input(name='vector',shape = [100])
    x1 = Dense(256, use_bias=False)(input_vector)
    x1 = BatchNormalization()(x1)
    x1 = LeakyReLU()(x1)
    x1 = Dropout(0.2)(x1, training=True)
    for _ in range(2):
        x1 = Dense(256, use_bias=False)(x1)
        x1 = BatchNormalization()(x1)
        x1 = LeakyReLU()(x1)
        x1 = Dropout(0.2)(x1, training=True)
    # rnn
    input_series = Input(name='ts', shape=(4,9))
    rnn = ConditionalRNN(256, cell='LSTM', cond=x1, activation=None, use_bias=False)(input_series)
    rnn = BatchNormalization()(rnn)
    rnn = Activation('tanh')(rnn)
    rnn = Dropout(0.2)(rnn, training=True)
    x2 = Dense(1024, use_bias=False)(rnn)
    x2 = BatchNormalization()(x2)
    x2 = LeakyReLU()(x2)
    x2 = Dropout(0.2)(x2, training=True)
    for _ in range(4):
        x2 = Dense(1024, use_bias=False)(x2)
        x2 = BatchNormalization()(x2)
        x2 = LeakyReLU()(x2)
        x2 = Dropout(0.2)(x2, training=True)
    output = Dense(ahead)(x2)
    output = Reshape(target_shape = [ahead])(output)
    model = Model(inputs = [input_vector, input_series], outputs = output)
    model.compile(optimizer = 'Adam', loss='mean_squared_error', metrics=['mean_squared_error'])
    return model
model = nn_model()
model.summary()
Getting this error:
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-67-9e49694ef66e> in <module>()
44 return model
45
---> 46 model = nn_model()
47 model.summary()
<ipython-input-67-9e49694ef66e> in nn_model(ahead)
16 # rnn
17 input_series = Input(name='ts', shape=(4,9))
---> 18 rnn = ConditionalRNN(256, cell='LSTM', cond=x1, activation=None, use_bias=False)(input_series)
19 rnn = BatchNormalization()(rnn)
20 rnn = Activation('tanh')(rnn)
~/anaconda3/envs/main/lib/python3.6/site-packages/cond_rnn/cond_rnn.py in __init__(self, units, cell, cond, *args, **kwargs)
47 cond = self._standardize_condition(cond)
48 if cond is not None:
---> 49 self.init_state = tf.keras.layers.Dense(units=units)(cond)
50 self.init_state = tf.unstack(self.init_state, axis=0)
51 self.rnn = tf.keras.layers.RNN(cell=self._cell, *args, **kwargs)
~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
559 # framework.
560 if base_layer_utils.needs_keras_history(inputs):
--> 561 base_layer_utils.create_keras_history(inputs)
562
563 # Handle Keras mask propagation from previous layer to current layer.
~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_utils.py in create_keras_history(tensors)
198 keras_tensors: The Tensors found that came from a Keras Layer.
199 """
--> 200 _, created_layers = _create_keras_history_helper(tensors, set(), [])
201 return created_layers
202
~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_utils.py in _create_keras_history_helper(tensors, processed_ops, created_layers)
244 constants[i] = backend.function([], op_input)([])
245 processed_ops, created_layers = _create_keras_history_helper(
--> 246 layer_inputs, processed_ops, created_layers)
247 name = op.name
248 node_def = op.node_def.SerializeToString()
~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer_utils.py in _create_keras_history_helper(tensors, processed_ops, created_layers)
251 created_layers.append(op_layer)
252 op_layer._add_inbound_node( # pylint: disable=protected-access
--> 253 layer_inputs, op.outputs)
254 processed_ops.update([op])
255 return processed_ops, created_layers
~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in _add_inbound_node(self, input_tensors, output_tensors, arguments)
1793 """
1794 inbound_layers = nest.map_structure(lambda t: t._keras_history.layer,
-> 1795 input_tensors)
1796 node_indices = nest.map_structure(lambda t: t._keras_history.node_index,
1797 input_tensors)
~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/util/nest.py in map_structure(func, *structure, **kwargs)
513
514 return pack_sequence_as(
--> 515 structure[0], [func(*x) for x in entries],
516 expand_composites=expand_composites)
517
~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/util/nest.py in <listcomp>(.0)
513
514 return pack_sequence_as(
--> 515 structure[0], [func(*x) for x in entries],
516 expand_composites=expand_composites)
517
~/anaconda3/envs/main/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py in <lambda>(t)
1792 `call` method of the layer at the call that created the node.
1793 """
-> 1794 inbound_layers = nest.map_structure(lambda t: t._keras_history.layer,
1795 input_tensors)
1796 node_indices = nest.map_structure(lambda t: t._keras_history.node_index,
AttributeError: 'tuple' object has no attribute 'layer'
I am trying to import the function, which is now called ConditionalRNN, but the README does not mention this. If I change only that name in the example code, I get an error:
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM
from cond_rnn import ConditionalRNN
time_steps, input_dim, output_dim, batch_size, cond_size = 128, 6, 12, 32, 5
inputs = Input(batch_input_shape=(batch_size, time_steps, input_dim))
cond_inputs = Input(batch_input_shape=(batch_size, cond_size))
outputs = ConditionalRNN(LSTM(units=output_dim))([inputs, cond_inputs])
print(outputs.shape) # (batch_size, output_dim)
TypeError: '<' not supported between instances of 'LSTM' and 'int'
Any idea on what is going on?
Hi philipperemy,
I could not figure out how to contact you, so I created this issue.
Thank you for sharing this code.
Before looking into the details I wanted to know if this would be suitable for my problem.
I noticed that you have conditions (one-hot encoded vectors) such as [0 0 1] or [1 0 0]. Referring to the blog post you originally built this code for, these condition vectors seem to represent different cities that might have different weather conditions, if I understand it correctly.
I have static and time series data and I want my static features to affect my predictions (forecast). Output is a sequence (not classification).
Time-series data example: 100, 90, 60, 40, 20, .... (single feature with multiple examples with varying sequence length)
Static data: 7 different features for each example at time t=0. Example: [ 5, 100, 0.8, 0.5, 10, 3.65, 7] --> This is not a one-hot encoded vector such as [1 0 0], etc. and each number has an effect on how the time series progresses for each example.
Can this code be used for a problem like this or does the input have to be a one-hot encoded vector?
Any help is appreciated. If you are interested in my problem, and if this works out, I'm planning to write a paper using the data I have and I can include your name in the paper.
Thank you.
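A note on the question above: a traceback elsewhere on this page shows the library building the initial state with tf.keras.layers.Dense(units=units)(cond), which suggests the condition is simply projected to the hidden size and so need not be one-hot. The pure-Python sketch below illustrates that idea only. The helper name project_condition, the sizes, and the random weights are assumptions made for illustration and are not the library's code.

```python
import random


def project_condition(cond, weights, bias):
    """Affine projection h0 = W @ cond + b, the same form a Dense layer computes."""
    return [
        sum(w * c for w, c in zip(row, cond)) + b
        for row, b in zip(weights, bias)
    ]


random.seed(0)
hidden_size, cond_size = 4, 7  # hypothetical sizes

# Randomly initialised parameters; in a real model these would be trained.
weights = [[random.uniform(-0.1, 0.1) for _ in range(cond_size)]
           for _ in range(hidden_size)]
bias = [0.0] * hidden_size

static_features = [5, 100, 0.8, 0.5, 10, 3.65, 7]  # real-valued example from the question
one_hot = [0, 0, 1, 0, 0, 0, 0]                    # one-hot example of the same length

h0_static = project_condition(static_features, weights, bias)
h0_one_hot = project_condition(one_hot, weights, bias)

# Both kinds of condition yield an initial state of the same size.
print(len(h0_static), len(h0_one_hot))
```

With a one-hot condition the projection just selects one column of the weight matrix; with real-valued features it mixes all columns. In practice it is probably worth scaling continuous static features to comparable ranges before feeding them in.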
Hi,
I've just started to use cond_rnn and I'm trying to replace LSTM layers in my model with ConditionalRNN.
The README says that one should be able to use keywords like return_sequences.
*args, **kwargs: Any parameters of the tf.keras.layers.RNN class, such as return_sequences, return_state, stateful, unroll...
However, since be1e97f, ConditionalRNN derives from tf.keras.layers.Layer, which accepts only a fixed set of keyword arguments and therefore rejects these.
Because of the super().__init__(**kwargs) call, keywords meant for the RNN layer are considered invalid and an exception is raised.
My environment:
tensorflow==2.1.0
Keras==2.3.1
cond-rnn==2.1
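One common way around this kind of problem is to separate the RNN-only keywords from the rest before calling the base-class constructor. The sketch below uses toy stand-in classes (BaseLayer, InnerRNN, ConditionalWrapper) and a made-up allow-list, all of which are assumptions for illustration. It is not the actual Keras or cond_rnn code, and the real fix in the library may look different.

```python
class BaseLayer:
    """Toy stand-in for a base layer that rejects unknown keyword arguments."""

    ALLOWED = {"name"}  # hypothetical allow-list for this toy class only

    def __init__(self, **kwargs):
        unknown = sorted(set(kwargs) - self.ALLOWED)
        if unknown:
            raise TypeError(f"Keyword argument not understood: {unknown}")
        self.name = kwargs.get("name")


class InnerRNN:
    """Toy stand-in for the wrapped recurrent layer."""

    def __init__(self, units, return_sequences=False, return_state=False):
        self.units = units
        self.return_sequences = return_sequences
        self.return_state = return_state


class ConditionalWrapper(BaseLayer):
    """Routes RNN-only keywords to the inner RNN and the rest to the base class."""

    RNN_KWARGS = ("return_sequences", "return_state")

    def __init__(self, units, **kwargs):
        # Remove RNN-only keywords before the base class can reject them.
        rnn_kwargs = {k: kwargs.pop(k) for k in self.RNN_KWARGS if k in kwargs}
        super().__init__(**kwargs)
        self.rnn = InnerRNN(units, **rnn_kwargs)


layer = ConditionalWrapper(8, name="cond", return_sequences=True)
print(layer.name, layer.rnn.units, layer.rnn.return_sequences)
```

Passing return_sequences straight to BaseLayer would raise the TypeError, which mirrors the behaviour described in the report.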
Hi Philippe Remy,
I have been trying to use the ConditionalRecurrent wrapper with the Bidirectional layer in the Functional API, so that I can stack layers, but with no success so far. It runs successfully for layers such as LSTM and GRU, but not for Bidirectional. The code looks something like this:
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, GRU, LSTM, Bidirectional
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import Input, Model
from cond_rnn import ConditionalRecurrent
patience = 50
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                                  patience=patience,
                                                  mode='min')
optimizer = tf.keras.optimizers.Adam(
    learning_rate=0.00001,
    beta_1=0.95,
    beta_2=0.999,
    epsilon=1e-07,
    amsgrad=False)
# Weight initialization
initializer = tf.keras.initializers.Orthogonal()
# Functional API
i = Input(shape=[6, 3], name='input_0')
c = Input(shape=[3], name='input_1')
# Bi-directional layers
forward_layer = ConditionalRecurrent(GRU(units=30, return_sequences=True))
backward_layer = ConditionalRecurrent(GRU(units=30, return_sequences=True, go_backwards=True))
x = Bidirectional(layer=forward_layer,backward_layer=backward_layer, name='cond_rnn_1')([i,c])
x = Dense(units=2, activation='linear')(x)
model = Model(inputs=[i, c], outputs=[x])
model.compile(optimizer=optimizer, loss='mse', metrics = ['mae','mape'])
history = model.fit(w1.train, epochs=40000, validation_data=w1.val, verbose=2,callbacks=[early_stopping])
The error I'm getting is:
ValueError Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_100012\3844763617.py in
30 backward_layer = ConditionalRecurrent(GRU(units=30, return_sequences=True, go_backwards=True))
31
---> 32 x = Bidirectional(layer=forward_layer,backward_layer=backward_layer, name='cond_rnn_1')([i,c])
33 x = Dense(units=2, activation='linear')(x)
34 model = Model(inputs=[i, c], outputs=[x])
~\AppData\Local\Continuum\anaconda3\envs\Tensorflow\lib\site-packages\keras\layers\wrappers.py in call(self, inputs, initial_state, constants, **kwargs)
598 if num_states % 2 > 0:
599 raise ValueError(
--> 600 'When passing `initial_state` to a Bidirectional RNN, '
601 'the state should be a list containing the states of '
602 'the underlying RNNs. '
ValueError: When passing `initial_state` to a Bidirectional RNN, the state should be a list containing the states of the underlying RNNs. Received: [<KerasTensor: shape=(None, 3) dtype=float32 (created by layer 'input_1')>]
My guess is that the Bidirectional RNN itself does not like the ([i,c]) inputs that I'm trying to pass to it, but I'm not sure if I'm correct here.
Another question I have in mind:
If I were to add an Encoder-Decoder architecture, could I still use ConditionalRecurrent, or would that run into similar issues as the Bidirectional layer? I haven't tried this yet, but it is something I want to try.
Thank you.
Line 56 in c555fd9
As titled, a non-issue, but I still spent a couple of minutes trying to figure out what "\nor" means. Is it a negated OR? )))
Either remove the newline (\n) or format initial_cond_shape into the exception message string manually.
In your case, the repr of the tuple ("msg", initial_cond_shape) is printed, so escape sequences are shown escaped rather than interpreted.
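The behaviour described above can be reproduced in plain Python. This is a minimal sketch: the message text and the shape value are made-up examples, not the library's exact strings.

```python
initial_cond_shape = (None, 4, 4)  # example shape, for illustration only

# Two positional arguments: str(exc) is the repr of the args tuple,
# so the newline is displayed as the two characters "\" and "n".
two_args = Exception('Shapes do not match.\nExpected rank 2 or 3.', initial_cond_shape)

# One pre-formatted string: the newline is kept as a real line break.
one_arg = Exception(f'Shapes do not match.\nExpected rank 2 or 3. Got: {initial_cond_shape}')

print(str(two_args))
print(str(one_arg))
```

Formatting the shape into a single string keeps the message readable whether or not it contains a line break.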
I would like to use your ConditionalRNN code with a bidirectional LSTM layer, e.g. like here https://keras.io/examples/nlp/bidirectional_lstm_imdb/.
Is this possible?
Hi,
Thank you very much for writing the library.
I was looking into using non-sequential input with my LSTM and found this code snippet on Stack Overflow.
import numpy as np
from tensorflow.keras.models import Model
from tensorflow.keras.layers import LSTM, Dense, Input
from cond_rnn import ConditionalRNN
inputs = Input(name='in',shape=(5,5)) # Each observation has 5 dimensions à 5 time-steps each
x = Dense(64)(inputs)
inputs_aux = Input(name='in_aux', shape=[5]) # For each of the 5 dimensions, a non-time-series observation too
x = ConditionalRNN(7, cell='LSTM', cond=inputs_aux)(x)
predictions = Dense(1)(x)
model = Model(inputs=[inputs, inputs_aux], outputs=predictions)
#model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop', loss='mean_squared_error', metrics=['mse'])
data = np.random.standard_normal([100,5,5]) # Sample of 100 observations with 5 dimensions à 5 time-steps each
data_aux = np.random.standard_normal([100,5]) # Sample of 100 observations with 5 dimensions à only 1 non-time-series value each
labels = np.random.standard_normal(size=[100]) # For each of the 100 obs., a corresponding (single) outcome variable
model.fit([data,data_aux], labels)
I just tried to run it but got an error:
TypeError: ('Keyword argument not understood:', 'cond')
According to the API, the condition input should be passed as follows:
outputs = cond_rnn.ConditionalRNN(units=NUM_CELLS, cell='GRU')([inputs, cond])
but it is not clear to me why this sample code passes the condition as a parameter. Can you help me correct the above code? Thank you very much for your time.
Hi,
First, thanks for this great library; it's very useful and inspiring.
I'm wondering whether using an embedding layer before the conditional recurrent layers could have any limitation.
I tried it and it worked.
I just want to check, from your point of view, that I'm not missing anything.
Thanks,
Alexandre
Hi,
I came across this repo through Stack Overflow and really liked the idea! I'm trying to use what you did to predict the goals scored by players in a football league, but I'm a bit confused about how I should structure the inputs.
Let's suppose I'm just focusing on strikers, in a league of 20 teams with around 3 strikers per team. Drawing a parallel with your example, the stations would be the strikers, 20x3=60 in total. The number of timesteps would be (20-1)x2=38.
The variables I have are:
I would use the first two as continuous variables and the remaining as conditions, but I'm not really confident about this choice.
Also, I don't get why I have num_stations conditions and not just one. Probably because I need one condition for each prediction, so num_stations conditions, but wanted to be sure about this as well.
Matteo
I have a glucose dataset for 200 patients, along with some static data that does not vary with time. The static data come from a case form that every patient fills in: about 40 columns and 200 rows, one per patient. I have roughly 3000 glucose rows for each patient. I want to predict glucose 30 minutes ahead. What should I do? How can I use this library for my work?
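One possible way to frame this: cut each patient's glucose series into sliding windows (the sequence input) and pair every window with that patient's static form answers (the condition), in line with the [inputs, cond] call shown elsewhere on this page. The sketch below only prepares such samples in plain Python. The 5-minute sampling interval, the window length of 12, the toy values, and the helper name make_windows are all assumptions, since the question does not state them.

```python
def make_windows(series, static, window, horizon):
    """Return (window_values, static_features, target) triples for one patient.

    The target lies `horizon` steps after the last value of the window.
    """
    samples = []
    for start in range(len(series) - window - horizon + 1):
        end = start + window
        samples.append((series[start:end], static, series[end + horizon - 1]))
    return samples


# Toy data for one patient; real data would have about 3000 readings and 40 static columns.
glucose = list(range(100, 130))   # 30 made-up readings
static = [0.0] * 40               # placeholder for the 40 case-form columns

# Assuming one reading every 5 minutes, 30 minutes ahead is 6 steps.
samples = make_windows(glucose, static, window=12, horizon=6)

first_window, first_static, first_target = samples[0]
print(len(samples), first_window[-1], first_target)
```

Repeating this for all 200 patients and stacking the results would give one array of windows, one array of conditions, and one array of targets.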
Hi Philippe,
I was going through this repository and found it very interesting. Great work! :)
I am trying to predict sales for multiple stores. My goal is to build a single LSTM model that predicts sales from these parallel time series, each with multiple features.
My input features for training would be:
+----------+-------+--------------+-------+
| Date | Store | DayOfTheWeek | Sales |
+----------+-------+--------------+-------+
| 1/1/2019 | A | 2 | 100 |
| 1/2/2019 | A | 3 | 200 |
| 1/3/2019 | A | 4 | 150 |
| 1/1/2019 | B | 2 | 300 |
| 1/2/2019 | B | 3 | 550 |
| 1/3/2019 | B | 4 | 1000 |
+----------+-------+--------------+-------+
and my output for training would be
+----------+-------+--------------+-------+
| Date | Store | DayOfTheWeek | Sales |
+----------+-------+--------------+-------+
| 1/4/2019 | A | 5 | 220 |
| 1/4/2019 | B | 5 | 700 |
+----------+-------+--------------+-------+
The problem is that an LSTM takes 3D input, i.e. (n_samples, n_timesteps, n_features), so I can pass a single time series for a specific store.
But how can I predict parallel multivariate time series? Is there a way to tell the input LSTM layer that, for the problem above, there are 2 time series with 2 features each, i.e. (2*2)?
Thank you in advance.
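One possible framing, in the spirit of this library: treat each store as one sample of shape (n_timesteps, n_features) and encode the store identity as a one-hot condition vector. The pure-Python sketch below only reshapes the table above into those two inputs. The variable names are illustrative, and it assumes rows are already in date order within each store.

```python
# Training rows from the table above: (date, store, day_of_week, sales).
rows = [
    ("1/1/2019", "A", 2, 100),
    ("1/2/2019", "A", 3, 200),
    ("1/3/2019", "A", 4, 150),
    ("1/1/2019", "B", 2, 300),
    ("1/2/2019", "B", 3, 550),
    ("1/3/2019", "B", 4, 1000),
]

stores = sorted({store for _, store, _, _ in rows})

series = []  # shape: (n_stores, n_timesteps, n_features)
conds = []   # shape: (n_stores, n_stores), one-hot store identity
for store in stores:
    # Keep the two time-varying features for this store.
    series.append([[dow, sales] for _, s, dow, sales in rows if s == store])
    conds.append([1 if s == store else 0 for s in stores])

print(len(series), len(series[0]), len(series[0][0]))
print(conds)
```

Here series plays the role of the 3D sequence input and conds the role of the condition, so a single model can serve both stores.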
Hi!
I'm trying to use cond_rnn in my project and hit an AttributeError when initializing ConditionalRNN:
outputs = ConditionalRNN(units=NUM_CELLS, cell='LSTM', cond=cond_list, activation='relu')(inputs)
Here's the error:
WARNING:tensorflow:From C:\Users\nosova.ea\AppData\Roaming\Python\Python36\site-packages\tensorflow_core\python\ops\resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
AttributeError Traceback (most recent call last)
in ()
----> 1 outputs = ConditionalRNN(units=NUM_CELLS, cell='LSTM', cond=cond_list, activation='relu')(inputs)
~\AppData\Local\Continuum\anaconda3\lib\site-packages\cond_rnn\cond_rnn.py in __init__(self, units, cell, cond, *args, **kwargs)
40 for cond in cond:
41 init_state_list.append(tf.keras.layers.Dense(units=units)(cond))
---> 42 multi_cond_projector = tf.layers.Dense(1, activation=None, use_bias=True)
43 multi_cond_state = multi_cond_projector(tf.stack(init_state_list, axis=-1))
44 multi_cond_state = tf.squeeze(multi_cond_state, axis=-1)
AttributeError: module 'tensorflow' has no attribute 'layers'