vincent-leguen / dilate

Code for our NeurIPS 2019 paper "Shape and Time Distortion Loss for Training Deep Time Series Forecasting Models"

License: Other
Really interesting stuff!
Can this be used as a loss function for boosted learners? I'm thinking of GBMs/GBDTs, which require the loss to be twice continuously differentiable. Is this the case for DILATE?
Were you able to test this model on financial time series? Does it work well with more complex series, such as FOREX or stock-market forecasting?
Hi, I am trying to run your code, and it reports an error no matter which numba version I try with Python 3.9. Do you know which numba version and other environment settings are needed to run the code?
In my case, the problem happens in "from numba.np import npyimpl", called by "path, sim = dtw_path(target_k_cpu, output_k_cpu)".
With numba 0.53 or 0.56, the final error message, from numba/np/npyimpl.py, is: kernel = kernels[ufunc]; KeyError: <ufunc 'invert'>
With numba 0.47, the final error message is: ImportError: cannot import name 'npyimpl' from 'numba.np'
Hi @vincent-leguen,
nice work, congrats.
I am still studying the way you compute the DILATE loss.
However, I think there is a small bug in main.py at line 78:
bath_size, N_output = target.shape[0:2]
should be:
batch_size, N_output = target.shape[0:2]
It is a small typo, but I don't think it impacts the normal flow of the code.
Cheers
Dear author, thank you so much for sharing your work. May I ask whether there is a potential misuse of a parameter? Could you please take the time to check it?
In your main.py, lines 53 to 54, you use:
if (loss_type=='dilate'):
loss, loss_shape, loss_temporal = dilate_loss(target,outputs,alpha, gamma, device)
so the first argument is target (the ground truth) and the second is outputs (the prediction).
But in your dilate_loss.py, line 5, you use:
def dilate_loss(outputs, targets, alpha, gamma, device):
So here the first argument is the output (prediction) and the second is the target (ground truth).
May I ask whether the code in main.py, lines 53 to 54, should therefore be changed to:
if (loss_type=='dilate'):
loss, loss_shape, loss_temporal = dilate_loss(outputs,target,alpha, gamma, device)
Because in your paper, you mention that the latter parameter is the ground truth.
Thanks a lot! Looking forward to your reply!
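Not the author, but one quick way to check whether the order matters in practice, using only the signature quoted above (the shapes and tensor names here are placeholders of my own):

import torch
from loss.dilate_loss import dilate_loss  # signature as quoted above: dilate_loss(outputs, targets, alpha, gamma, device)

device = torch.device("cpu")
pred = torch.randn(4, 20, 1)   # stand-in predictions, shape (batch, N_output, 1)
truth = torch.randn(4, 20, 1)  # stand-in ground truth, same shape

l_ab, _, _ = dilate_loss(pred, truth, 0.5, 0.01, device)
l_ba, _, _ = dilate_loss(truth, pred, 0.5, 0.01, device)
print(l_ab.item(), l_ba.item())  # if these differ, the argument order matters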
The data-loading stage is quite involved. Is there possibly a way to showcase how a general pandas DataFrame with a date index can be converted to fit your format here? For context, the current synthetic-data pipeline is:
X_train_input,X_train_target,X_test_input,X_test_target,train_bkp,test_bkp = create_synthetic_dataset(N,N_input,N_output,sigma)
dataset_train = SyntheticDataset(X_train_input,X_train_target, train_bkp)
dataset_test = SyntheticDataset(X_test_input,X_test_target, test_bkp)
trainloader = DataLoader(dataset_train, batch_size=batch_size,shuffle=True, num_workers=1)
testloader = DataLoader(dataset_test, batch_size=batch_size,shuffle=False, num_workers=1)
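Not the author, but here is a minimal sketch of one way to do that conversion. It assumes (my assumption, not something stated in the repo) that the model expects 2-D arrays of shape (num_samples, window_length), as in the ECG example later in this thread; df_to_windows and the "value" column are hypothetical names:

import numpy as np
import pandas as pd

def df_to_windows(df, col, N_input, N_output, stride=1):
    # Slide a window of length N_input + N_output over the series and
    # split each window into an input part and a target part.
    values = df[col].to_numpy(dtype=np.float32)
    total = N_input + N_output
    inputs, targets = [], []
    for start in range(0, len(values) - total + 1, stride):
        window = values[start:start + total]
        inputs.append(window[:N_input])
        targets.append(window[N_input:])
    return np.stack(inputs), np.stack(targets)

# Example: a daily series with a date index
df = pd.DataFrame({"value": np.sin(np.linspace(0, 20, 500))},
                  index=pd.date_range("2020-01-01", periods=500, freq="D"))
X_input, X_target = df_to_windows(df, "value", N_input=20, N_output=20)

The resulting arrays can then be split into train/test parts and passed to SyntheticDataset. The breakpoint arrays (train_bkp/test_bkp) are specific to the synthetic data, so for real data a vector of ones of length len(X_input) is a plausible stand-in (again, an assumption, not something from the repo).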
Hi,
I really appreciate your great work and enjoyed reading through your paper and code.
I am curious how long your model takes to train (with and without a GPU) on long time series, such as the ECG5000 and traffic cases?
Thanks in advance
Thanks a lot for the code release. I have a small question to ask you.
Why does running the code output only two plots, when there are three in the paper (Seq2Seq MSE, Seq2Seq DTW, Seq2Seq DILATE)?
Looking forward to your reply
Hello everyone! Christmas is almost here, Merry Christmas to you all :)
Unfortunately, my code is not happy... when I try to reproduce the results from Vincent's paper on the ECG5000 dataset, I fail...
The ECG5000 dataset that I used is http://storage.googleapis.com/download.tensorflow.org/data/ecg.csv
Maybe it's a problem with the dataset?
My result for the sequence-to-sequence model with the MSE loss function:
epoch 0 loss 1.0772087574005127 loss shape 0 loss temporal 0
Eval mse= 0.8767168291977474 dtw= 6.805486422927132 tdi= 2.0440532069970843
epoch 50 loss 0.3731319308280945 loss shape 0 loss temporal 0
Eval mse= 0.3903078040906361 dtw= 3.0475928054320987 tdi= 1.055505193148688
epoch 100 loss 0.24237403273582458 loss shape 0 loss temporal 0
Eval mse= 0.3061082886798041 dtw= 2.4698390916706847 tdi= 0.9395733418367348
epoch 150 loss 0.2584376335144043 loss shape 0 loss temporal 0
Eval mse= 0.22645755005734308 dtw= 1.9507582848166412 tdi= 0.7841070517492711
epoch 200 loss 0.15192793309688568 loss shape 0 loss temporal 0
Eval mse= 0.24554287110056197 dtw= 2.016065729844596 tdi= 0.6677225765306123
epoch 250 loss 0.1566656529903412 loss shape 0 loss temporal 0
Eval mse= 0.2019440990473543 dtw= 1.880212425006847 tdi= 0.7010162172011662
epoch 300 loss 0.12690874934196472 loss shape 0 loss temporal 0
Eval mse= 0.19364993029407093 dtw= 1.852122344428214 tdi= 0.6694605502915453
epoch 350 loss 0.12332551181316376 loss shape 0 loss temporal 0
Eval mse= 0.1977188979940755 dtw= 1.8767601223560113 tdi= 0.6332683126822158
epoch 400 loss 0.10750801116228104 loss shape 0 loss temporal 0
Eval mse= 0.21163474129778997 dtw= 1.8957054875381265 tdi= 0.7194324890670553
epoch 450 loss 0.10328985005617142 loss shape 0 loss temporal 0
Eval mse= 0.19005996425236973 dtw= 1.8016342832546404 tdi= 0.6326609876093293
epoch 500 loss 0.0954132080078125 loss shape 0 loss temporal 0
Eval mse= 0.1958838226539748 dtw= 1.7668837248907803 tdi= 0.6169864249271136
epoch 550 loss 0.09286423027515411 loss shape 0 loss temporal 0
Eval mse= 0.19250875785946847 dtw= 1.7858421110580047 tdi= 0.6433294460641399
epoch 600 loss 0.09554877132177353 loss shape 0 loss temporal 0
Eval mse= 0.19318228970680917 dtw= 1.8026873947053852 tdi= 0.6590691508746356
epoch 650 loss 0.06814754754304886 loss shape 0 loss temporal 0
Eval mse= 0.19417715136493954 dtw= 1.7603379046970657 tdi= 0.672698250728863
epoch 700 loss 0.07659073919057846 loss shape 0 loss temporal 0
Eval mse= 0.21282084967408862 dtw= 1.7930373014810026 tdi= 0.6734083454810496
epoch 750 loss 0.07163602858781815 loss shape 0 loss temporal 0
Eval mse= 0.20653479067342623 dtw= 1.7746248434154144 tdi= 0.6520079263848396
epoch 800 loss 0.06505869328975677 loss shape 0 loss temporal 0
Eval mse= 0.19753393722432 dtw= 1.7156214133114422 tdi= 0.6494909803206996
epoch 850 loss 0.07344229519367218 loss shape 0 loss temporal 0
Eval mse= 0.194216572280441 dtw= 1.7329767024008997 tdi= 0.6313555029154518
epoch 900 loss 0.06015300750732422 loss shape 0 loss temporal 0
Eval mse= 0.20844823292323522 dtw= 1.741320480761508 tdi= 0.6895331632653061
epoch 950 loss 0.05017583444714546 loss shape 0 loss temporal 0
Eval mse= 0.20004522502422334 dtw= 1.7107445075037588 tdi= 0.6221432215743441
The result from Vincent's paper: mse: 0.212 dtw: 0.178 tdi: 0.827
Compared to my result, the mse and tdi are OK, but the dtw is far too high, and I don't know why...
I really hope someone can help me!
The code is the same as Vincent's, but I will also copy it here in case I made some stupid mistake:
import numpy as np
import torch
from data.synthetic_dataset import create_synthetic_dataset, SyntheticDataset
from models.seq2seq import EncoderRNN, DecoderRNN, Net_GRU
from loss.dilate_loss import dilate_loss
from torch.utils.data import DataLoader
import random
from tslearn.metrics import dtw, dtw_path
import matplotlib.pyplot as plt
import warnings; warnings.simplefilter('ignore')
import pandas as pd

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
random.seed(0)

# Load ECG5000 and split each series into an input part (84 steps) and a target part (56 steps)
dataframe = pd.read_csv('http://storage.googleapis.com/download.tensorflow.org/data/ecg.csv', header=None)
X_train_input = dataframe.iloc[0:500, 0:84].values
X_test_input = dataframe.iloc[500:4000, 0:84].values
X_train_target = dataframe.iloc[0:500, 84:140].values
X_test_target = dataframe.iloc[500:4000, 84:140].values

batch_size = 50
N_input = 84
N_output = 56
gamma = 0.01

dataset_train = SyntheticDataset(X_train_input, X_train_target)
dataset_test = SyntheticDataset(X_test_input, X_test_target)
trainloader = DataLoader(dataset_train, batch_size=batch_size, shuffle=True, num_workers=0, drop_last=True)
testloader = DataLoader(dataset_test, batch_size=batch_size, shuffle=False, num_workers=0, drop_last=True)

def train_model(net, loss_type, learning_rate, epochs=1000, gamma=0.01,
                print_every=50, eval_every=50, verbose=1, Lambda=1, alpha=0.5):
    optimizer = torch.optim.Adam(net.parameters(), lr=learning_rate)
    criterion = torch.nn.MSELoss()
    for epoch in range(epochs):
        for i, data in enumerate(trainloader, 0):
            inputs, target = data
            inputs = torch.tensor(inputs, dtype=torch.float32).to(device)
            target = torch.tensor(target, dtype=torch.float32).to(device)
            batch_size, N_output = target.shape[0:2]
            # forward + backward + optimize
            outputs = net(inputs)
            loss_mse, loss_shape, loss_temporal = torch.tensor(0), torch.tensor(0), torch.tensor(0)
            if (loss_type == 'mse'):
                loss_mse = criterion(target, outputs)
                loss = loss_mse
            if (loss_type == 'dilate'):
                loss, loss_shape, loss_temporal = dilate_loss(target, outputs, alpha, gamma, device)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        if (verbose):
            if (epoch % print_every == 0):
                print('epoch ', epoch, ' loss ', loss.item(), ' loss shape ', loss_shape.item(), ' loss temporal ', loss_temporal.item())
                eval_model(net, testloader, gamma, verbose=1)

def eval_model(net, loader, gamma, verbose=1):
    criterion = torch.nn.MSELoss()
    losses_mse = []
    losses_dtw = []
    losses_tdi = []
    for i, data in enumerate(loader, 0):
        loss_mse, loss_dtw, loss_tdi = torch.tensor(0), torch.tensor(0), torch.tensor(0)
        # get the inputs
        inputs, target = data
        inputs = torch.tensor(inputs, dtype=torch.float32).to(device)
        target = torch.tensor(target, dtype=torch.float32).to(device)
        batch_size, N_output = target.shape[0:2]
        outputs = net(inputs)
        # MSE
        loss_mse = criterion(target, outputs)
        loss_dtw, loss_tdi = 0, 0
        # DTW and TDI, computed per sample on CPU
        for k in range(batch_size):
            target_k_cpu = target[k, :, 0:1].view(-1).detach().cpu().numpy()
            output_k_cpu = outputs[k, :, 0:1].view(-1).detach().cpu().numpy()
            path, sim = dtw_path(target_k_cpu, output_k_cpu)
            loss_dtw += sim
            Dist = 0
            for i, j in path:
                Dist += (i - j) * (i - j)
            loss_tdi += Dist / (N_output * N_output)
        loss_dtw = loss_dtw / batch_size
        loss_tdi = loss_tdi / batch_size
        # collect statistics
        losses_mse.append(loss_mse.item())
        losses_dtw.append(loss_dtw)
        losses_tdi.append(loss_tdi)
    print(' Eval mse= ', np.array(losses_mse).mean(), ' dtw= ', np.array(losses_dtw).mean(), ' tdi= ', np.array(losses_tdi).mean())

encoder = EncoderRNN(input_size=1, hidden_size=128, num_grulstm_layers=1, batch_size=batch_size).to(device)
decoder = DecoderRNN(input_size=1, hidden_size=128, num_grulstm_layers=1, fc_units=16, output_size=1).to(device)
net_gru_mse = Net_GRU(encoder, decoder, N_output, device).to(device)
train_model(net_gru_mse, loss_type='mse', learning_rate=0.001, epochs=1000, gamma=gamma, print_every=50, eval_every=50, verbose=1)
I also changed the format a little in synthetic_dataset.py, but I don't think that matters:
class SyntheticDataset(torch.utils.data.Dataset):
    def __init__(self, X_input, X_target):
        super(SyntheticDataset, self).__init__()
        self.X_input = X_input
        self.X_target = X_target

    def __len__(self):
        return (self.X_input).shape[0]

    def __getitem__(self, idx):
        return (self.X_input[idx, :, np.newaxis], self.X_target[idx, :, np.newaxis])
Have a good day!!! Hopefully someone can answer this :)
I have a question about my loss_shape: I don't understand why its value is negative. I think that's not right, but I don't know why.
Dear author,
I appreciate your work. I notice that the Traffic dataset is mentioned in the original paper, but this version does not include the relevant code for it, especially the data-input part. Would you please send me that code? Thank you! My email address is [email protected]
Looking forward to your reply!
Kind regards
I am looking forward to your code! It's an interesting paper.
Is your arXiv paper already the final version, or is it still possible to suggest improvements? My main suggestion is to rework some of the notation. Just a short example: you use
\{x_i\}_{i \in 1:N}
which is generally understood as notation for a sequence, but you use it to describe a set. Instead, I'd suggest using set-builder notation: https://en.m.wikipedia.org/wiki/Set-builder_notation
The notation for the pairwise cost matrix is also difficult to follow.
Is <> denoting an inner product? How is it defined if it is not the standard inner product? Logically, I think you want the element-wise Hadamard product, whose symbol is a circle with a dot in the middle (⊙, and it should still be introduced as such), together with an L2 norm: || A ⊙ \Delta(...) ||_2
How is A chosen in (2)?
It would be nice if the license were indicated in the LICENSE file or somewhere else.
Without a proper license, we cannot use the code without legal concerns.
Thanks a lot for the code release.
I had a doubt about your code: how can I extend the loss function to cases with multi-dimensional outputs, such as 2D vehicle trajectory forecasting? Currently, I assume the code does not support this.
Any suggestions on the changes I would need to make to the code, or other references, would be really helpful.
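Not the author, but one plausible starting point (an assumption of mine, not the official extension): DILATE builds an N_output x N_output pairwise cost matrix between predicted and target time steps, so for D-dimensional outputs it may be enough to replace the scalar squared difference with a squared Euclidean distance over the D dimensions, e.g.:

import torch

def pairwise_sq_euclidean(x, y):
    # x: (N, D) predicted steps; y: (N, D) target steps.
    # Returns the (N, N) matrix of squared distances ||x_i - y_j||^2
    # that the soft-DTW recursion inside DILATE would consume.
    x_norm = (x ** 2).sum(dim=1, keepdim=True)      # (N, 1)
    y_norm = (y ** 2).sum(dim=1, keepdim=True).t()  # (1, N)
    return x_norm + y_norm - 2.0 * x @ y.t()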
In the paper "Laura et al., Assessing energy forecasting inaccuracy by simultaneously considering temporal and absolute errors, 2017", TDI is defined on the range [0, 1].
However, your results (Table 1) are in the range [0.xx, 2.xx].
What is the difference between the TDI measure defined in that paper and yours?
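For what it's worth, the evaluation code quoted earlier in this thread computes TDI from the optimal DTW path as
TDI = (1/N^2) \sum_{(i,j) \in path} (i - j)^2
which is not normalized to [0, 1], so values above 1 are possible; that alone could explain the different range (my reading of the code, not an official answer).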
I love the idea behind DILATE and would like to include it in pytorch-forecasting. However, a GPU-only implementation is probably needed for wider adoption. Do you plan on a CUDA or performant pure PyTorch implementation?
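Not the authors, but for illustration, here is a minimal pure-PyTorch sketch of the batched soft-DTW recursion at the core of DILATE (my own sketch, assuming gamma > 0; it is autograd-differentiable and stays on the GPU, though the Python loops still cap throughput compared to a fused CUDA kernel):

import torch

def soft_dtw_batch(D, gamma):
    # D: (B, N, M) batch of pairwise cost matrices; returns (B,) soft-DTW values.
    # Recursion: R[i, j] = D[i-1, j-1] + softmin_gamma(R[i-1, j-1], R[i-1, j], R[i, j-1]).
    B, N, M = D.shape
    inf = torch.full((B,), float("inf"), device=D.device, dtype=D.dtype)
    zero = torch.zeros(B, device=D.device, dtype=D.dtype)
    prev = [zero] + [inf] * M     # row 0 of the DP table: R[0, 0] = 0, R[0, j>0] = inf
    for i in range(1, N + 1):
        curr = [inf] * (M + 1)    # left boundary: R[i, 0] = inf
        for j in range(1, M + 1):
            stacked = torch.stack([prev[j - 1], prev[j], curr[j - 1]])
            softmin = -gamma * torch.logsumexp(-stacked / gamma, dim=0)
            curr[j] = D[:, i - 1, j - 1] + softmin
        prev = curr
    return prev[M]

A production version would vectorize over anti-diagonals so each wavefront is a single batched op, or use a custom CUDA kernel; that is where most of the speed would come from.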
I want to apply this loss function to time-series datasets of variable length. For example, I have a target tensor of shape B x L x D (batch size, length, dimension), but the true length of each sequence is stored in a tensor of shape B x 1.
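Not the author, but a minimal sketch of one workaround (my own assumption, not part of the repo): slice each sample to its true length and average the per-sample losses, at the cost of a Python loop over the batch. Note the repo's loss currently assumes univariate series (D = 1):

import torch
from loss.dilate_loss import dilate_loss  # repo's loss, signature as in dilate_loss.py

def dilate_loss_varlen(outputs, targets, lengths, alpha, gamma, device):
    # outputs, targets: (B, L, D); lengths: (B,) or (B, 1) true sequence lengths.
    # Calls the repo's loss on each sample's valid prefix (as a batch of one).
    total = 0.0
    for b in range(outputs.shape[0]):
        T = int(lengths[b])
        loss_b, _, _ = dilate_loss(outputs[b:b + 1, :T], targets[b:b + 1, :T],
                                   alpha, gamma, device)
        total = total + loss_b
    return total / outputs.shape[0]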