mmalekzadeh / dana

DANA: Dimension-Adaptive Neural Architecture (UbiComp '21, ACM IMWUT)

Home Page: https://arxiv.org/abs/2008.02397

License: MIT License

Python 1.52% Jupyter Notebook 98.48%

deep-learning time-series sensor activity-recognition neural-networks deep-neural-networks adaptive-learning human-activity-recognition sensor-data accelerometer


dana's Issues

Error when training DANA model

Hi,

When I try to train the DANA model in Google Colab, using the notebook from the repo, I get the following error:

I have tried to print the various shapes.

Note that the TensorFlow version in Google Colab is now 2.6; most likely you used an older version for training. Even so, the shape of X does look a little strange.

I have not changed anything in your notebook (except for setting from_logits=False).

Regards & thanks
Kapil

DAT - Potential difference between the paper & the code

Hi,

I am using the 13 August copy of your paper.

The algorithm DAT from the paper is:

When I read this algorithm, I get the impression that, for every iteration in an epoch, we construct a few batches (e.g. B = 5) and, for each batch:

a) Compute the loss for the given batch (to which dimension randomization has been applied).
b) Compute the gradient for this batch.
c) Accumulate the gradients. Most importantly, do not apply them yet.

Once all the batches (i.e. 5 of them) have been processed, apply the accumulated gradients.

Now, when I look at the code, I see the following:

for epoch in range(num_epochs):  

        ## Training
        train_dataset = tf.data.Dataset.from_tensor_slices((X_train, Y_train))
        train_dataset = iter(train_dataset.shuffle(len(X_train)).batch(batch_size))
        n_iterations_per_epoch = len(X_train)//(batch_size*n_batch_per_train_setp)
        epoch_loss_avg = tf.keras.metrics.Mean()           

        for i in range(n_iterations_per_epoch):
            rnd_order_H = np.random.permutation(len(H_combinations))
            rnd_order_W = np.random.permutation(len(W_combinations))
            n_samples = 0.
            with tf.GradientTape() as tape:
                accum_loss = tf.Variable(0.)
                for j in range(n_batch_per_train_setp):
                    try:
                        X, Y = next(train_dataset)
                    except:
                        break
                    X = X.numpy()
                    sample_weight = [data_class_weights[y] for y in Y.numpy()]
                    
                    ### Dimension Randomization 
                    ####### Random Sensor Selection
                    rnd_H = H_combinations[rnd_order_H[j%len(rnd_order_H)]]                    
                    X = X[:,:,rnd_H,:] 
                    ####### Random Sampling Rate Selection  
                    rnd_W = W_combinations[rnd_order_W[j%len(rnd_order_W)]]
                    X = tf.image.resize(X, (rnd_W, len(rnd_H)))    

                    logits =  model(X)               
                    accum_loss = accum_loss + loss_fn(Y, logits, sample_weight)
                    n_samples = n_samples + 1.
            gradients = tape.gradient(accum_loss, model.trainable_weights)
            gradients = [g*(1./n_samples) for g in gradients]
            optimizer.apply_gradients(zip(gradients, model.trainable_weights))
            epoch_loss_avg.update_state(accum_loss*(1./n_samples))

If I understood the code properly, you:

a) accumulate the loss over the 5 batches,
b) then compute the gradient of this accumulated loss,
c) then take the average of the gradients,
d) then apply the gradients.

The flow in the code and the algorithm in the paper seem different, but maybe the end result is the same.

Would appreciate it if you could clarify/confirm.

Regards & thanks
Kapil
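
A small sketch may help with the question above. Because differentiation is linear, the gradient of a sum of batch losses equals the sum of the per-batch gradients, so both flows should give the same averaged gradient. The toy model below (a single weight w with a mean-squared-error loss) and all function names are illustrative assumptions, not code from the repo or the paper. The "code-style" gradient is obtained by numerically differentiating the accumulated loss, so the comparison does not assume the result it is checking.

```python
# Toy check: per-batch gradient accumulation vs. one gradient of the summed loss.
# Hypothetical model: y_hat = w * x, loss = mean squared error per batch.

def batch_loss(w, xs, ys):
    """Mean squared error of the toy model on one batch."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def batch_grad(w, xs, ys):
    """Analytic d(loss)/dw for one batch."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

def grad_paper_style(w, batches):
    """Compute one gradient per batch, accumulate them, average at the end."""
    accum = 0.0
    for xs, ys in batches:
        accum += batch_grad(w, xs, ys)  # accumulate, do not apply yet
    return accum / len(batches)

def grad_code_style(w, batches, eps=1e-4):
    """Accumulate the loss first, differentiate it once, then scale by 1/n."""
    def accumulated_loss(v):
        return sum(batch_loss(v, xs, ys) for xs, ys in batches)
    # Central finite difference stands in for tape.gradient(accum_loss, ...).
    d = (accumulated_loss(w + eps) - accumulated_loss(w - eps)) / (2.0 * eps)
    return d / len(batches)

# Batches of different sizes, loosely mimicking varying input dimensions.
batches = [
    ([1.0, 2.0], [2.0, 4.1]),
    ([3.0], [5.9]),
    ([0.5, 1.5, 2.5], [1.0, 3.2, 4.8]),
]

g_paper = grad_paper_style(0.3, batches)
g_code = grad_code_style(0.3, batches)
print(abs(g_paper - g_code) < 1e-6)
```

Under these assumptions the two orderings differ only in when the differentiation happens. The sketch says nothing about memory use: recording every forward pass on a single tape, as the quoted code does, may cost more memory than accumulating gradients batch by batch.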

Potential bug - Usage of from_logits=True on the model that has softmax activated output

Hi,

First of all, thank you for an excellent paper and also for developing a very good corresponding code base.

I tried to run your notebook in Google Colab and observed a few issues.

def Ordonez2016DeepOriginal(inp_shape, out_shape):   
    nb_filters = 64 
    drp_out_dns = .5 
    nb_dense = 128 
    
    inp = Input(inp_shape)

    x = Conv2D(nb_filters, kernel_size = (5,1),
              strides=(1,1), padding='valid', activation='relu')(inp)    
    x = Conv2D(nb_filters, kernel_size = (5,1),
              strides=(1,1), padding='valid', activation='relu')(x)
    x = Conv2D(nb_filters, kernel_size = (5,1), 
              strides=(1,1), padding='valid', activation='relu')(x)
    x = Conv2D(nb_filters, kernel_size = (5,1), 
              strides=(1,1), padding='valid', activation='relu')(x)    
    x = Reshape((x.shape[1],x.shape[2]*x.shape[3]))(x)
    act = LSTM(nb_dense, return_sequences=True, activation='tanh', name="lstm_1")(x)        
    act = Dropout(drp_out_dns, name= "dot_1")(act)
    act = LSTM(nb_dense, activation='tanh', name="lstm_2")(act)        
    act = Dropout(drp_out_dns, name= "dot_2")(act)
    out_act = Dense(out_shape, activation='softmax',  name="act_smx")(act)
    
    model = keras.models.Model(inputs=inp, outputs=out_act)
    return model

def standard_training(model, X_train, Y_train, X_val, Y_val, data_class_weights,
                      batch_size=128, num_epochs=128, save_dir=None):
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

Since your model's last layer, i.e. out_act, uses softmax, I think you should use:

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False)

TensorFlow 2.6 now generates a warning to flag this kind of situation.

After I changed it, the test accuracy of the standard model is 0.9213 and the training loss is zero, though of course there is always an element of randomness here.

This issue of from_logits=True is also present when you use the DANA model.

However, I am not able to run the DANA model because of another problem. I am creating a separate issue for that.

Regards & thanks
Kapil
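
To illustrate why the mismatch matters, here is a minimal pure-Python sketch. It assumes that a loss configured for logits applies a softmax internally to whatever it receives; the helper names are hypothetical and this is not the Keras implementation. Feeding already-normalized probabilities into such a loss applies softmax twice. Since every probability lies in [0, 1], the second softmax can assign the true class at most e / (e + K - 1) for K classes, so the loss cannot fall below log(1 + (K - 1) / e) even for a perfectly confident, correct prediction.

```python
import math

def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def ce_from_probs(probs, label):
    """Cross-entropy when the inputs are probabilities (from_logits=False)."""
    return -math.log(probs[label])

def ce_from_logits(logits, label):
    """Cross-entropy when the inputs are treated as logits (from_logits=True)."""
    return -math.log(softmax(logits)[label])

# A confident, correct 4-class prediction coming out of a softmax layer.
probs = softmax([12.0, 0.0, 0.0, 0.0])
label = 0

loss_matched = ce_from_probs(probs, label)      # softmax applied once
loss_mismatched = ce_from_logits(probs, label)  # softmax applied twice

# Lower bound on the mismatched loss for K classes.
loss_floor = math.log(1.0 + (len(probs) - 1) / math.e)

print(loss_matched, loss_mismatched, loss_floor)
```

In this sketch the matched loss is close to zero, while the mismatched loss stays at or above the floor however confident the prediction is, which may flatten the training signal.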
