
Comments (25)

cookieminions commented on May 16, 2024

Hi,

I made a graph to help explain how test samples are taken from the test series; I hope it is helpful:
[image]

For example, the first sample is [0, 168] (seq_len) + [168-48, 168] (label_len) + [168, 168+48] (pred_len), and the second sample is [1, 169] (seq_len) + [169-48, 169] (label_len) + [169, 169+48] (pred_len). The label_len sequence is part of the seq_len sequence.
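
To make the windowing concrete, here is a minimal sketch of the index arithmetic described above (plain Python, not the repository code; it assumes seq_len=168, label_len=48, pred_len=48 as in the example):

def sample_windows(i, seq_len=168, label_len=48, pred_len=48):
    # Encoder input window, e.g. [0, 168] for the first sample (i=0)
    enc = (i, i + seq_len)
    # Decoder start token: the last label_len points of the encoder window, e.g. [120, 168]
    dec_token = (i + seq_len - label_len, i + seq_len)
    # Prediction window (ground truth), e.g. [168, 216]
    pred = (i + seq_len, i + seq_len + pred_len)
    return enc, dec_token, pred

print(sample_windows(0))  # ((0, 168), (120, 168), (168, 216))
print(sample_windows(1))  # ((1, 169), (121, 169), (169, 217))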

As for the figure, testlr1 is just a description of that experiment, and the results were obtained a long time ago (before the data scaling change in #41). Because of the generative decoder, it is normal to get different figures across repeated runs with the same parameters. So you can try different datasets, prediction lengths (with different parameters), or repeated experiments; your figure does not have to match mine.

I hope Informer can be helpful to your work on other specific tasks.


cookieminions commented on May 16, 2024

I also still can't reproduce the level of accuracy as reported in the paper in univariate mode.

Don’t worry, we will update the experimental results and provide experimental parameters in a few days.


cookieminions commented on May 16, 2024

Yes, your understanding is correct.

But the Train, Validation, and Test sizes you see when running the code are the sizes of the dataloaders. In exp/exp_informer.py, lines 85-90, we use drop_last=True, so the dataloader drops the last incomplete batch (this does not affect the test comparison, because we use the same batch_size for all methods and the test dataloader is not shuffled).

So the length of data_x is not the same as the length of the dataloader when running the code.

# Excerpt from exp/exp_informer.py (lines 85-90)
from torch.utils.data import DataLoader

if flag == 'test':
    shuffle_flag = False; drop_last = True; batch_size = args.batch_size
else:
    shuffle_flag = True; drop_last = True; batch_size = args.batch_size

# drop_last=True discards the final, incomplete batch
data_loader = DataLoader(
    data_set,
    batch_size=batch_size,
    shuffle=shuffle_flag,
    num_workers=args.num_workers,
    drop_last=drop_last)
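
As a quick illustration of the drop_last effect, here is a small sketch (plain Python) using batch_size=32 and the test size 2857 that appears in the logs elsewhere in this thread:

# With drop_last=True the last incomplete batch is discarded, so the number of
# evaluated samples is a multiple of the batch size.
n_samples = 2857   # printed test size
batch_size = 32

n_batches = n_samples // batch_size     # 89 full batches
n_evaluated = n_batches * batch_size    # 2848 samples used for the metrics

print(n_batches, n_evaluated)           # 89 2848, matching the (89, 32, 24, 7) and (2848, 24, 7) shapes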


cookieminions commented on May 16, 2024

The target of Informer is point prediction, so our code does not provide prediction confidence intervals at the moment, but it is possible to add them. You could refer to DeepAR and modify the model's last layer and the loss function in the experiment.
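
For anyone who wants to try this, below is a minimal PyTorch sketch of the DeepAR-style change suggested above. It is not part of the Informer codebase, and GaussianHead and gaussian_nll are hypothetical names: the idea is to replace the point-forecast head with one that outputs a mean and a scale, and to train with a Gaussian negative log-likelihood instead of MSE.

import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Maps decoder features to a predictive mean and standard deviation."""
    def __init__(self, d_model, c_out):
        super().__init__()
        self.mu = nn.Linear(d_model, c_out)
        self.log_sigma = nn.Linear(d_model, c_out)  # log-parameterised so sigma stays positive

    def forward(self, dec_out):
        return self.mu(dec_out), torch.exp(self.log_sigma(dec_out))

def gaussian_nll(mu, sigma, target):
    """Negative log-likelihood of the target under N(mu, sigma^2); replaces the MSE loss."""
    return -torch.distributions.Normal(mu, sigma).log_prob(target).mean()

# A prediction interval then follows from the predictive distribution, e.g. mu +/- 1.96 * sigma.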


cookieminions commented on May 16, 2024
  1. Yes, data_x in the test_set uses seq_len from the vali_set.
  2. Yes, it's okay to use the model to predict on new data without running the test.
  3. You just need to change scale to False when you initialize the dataset, and the data will not be scaled (see the sketch just below).
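
For point 3, here is a hedged sketch of what that looks like. The constructor arguments are assumed from data/data_loader.py's Dataset_ETT_hour; check the signature in your version of the repo.

from data.data_loader import Dataset_ETT_hour

data_set = Dataset_ETT_hour(
    root_path='./data/ETT/',
    data_path='ETTh1.csv',
    flag='train',
    size=[96, 48, 24],   # [seq_len, label_len, pred_len]
    features='M',
    scale=False,         # keep raw values; no standardisation applied
)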


cookieminions commented on May 16, 2024

Hi,

ETTh2 has different characteristics from ETTh1, so you have to change the hyperparameters.
You can try seq_len=48; label_len=24; pred_len=24; d_ff=2048; e_layers=2; d_layers=1 for the ETTh2 prediction experiment. My results are:

>>>>>>>testing : informer_ETTh2_ftM_sl48_ll48_pl24_dm512_nh8_el2_dl1_df2048_atprob_fc5_ebtimeF_dtTrue_test_0<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 2857
test shape: (89, 32, 24, 7) (89, 32, 24, 7)
test shape: (2848, 24, 7) (2848, 24, 7)
mse:0.49008972642133736, mae:0.5386070956473465

>>>>>>>testing : informer_ETTh2_ftM_sl48_ll48_pl24_dm512_nh8_el2_dl1_df2048_atprob_fc5_ebtimeF_dtTrue_test_1<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 2857
test shape: (89, 32, 24, 7) (89, 32, 24, 7)
test shape: (2848, 24, 7) (2848, 24, 7)
mse:0.5389239759632527, mae:0.5776364176629023

We tried many different combinations of seq_len, label_len, e_layers, and d_layers in the experiments (the sequence-length values are all multiples of 24), and you can also experiment with data that has different characteristics.

By the way, our current code has updated the data scaling method compared with the paper on arXiv, so we will update the experimental results later.

Thanks for your attention to our work.


zhouhaoyi commented on May 16, 2024

Since there seems to be no more discussion, I will close this issue now.


18kiran12 commented on May 16, 2024

Hi Again,

Thanks a lot for the quick response. Would it be possible for you to provide the hyperparameters for all the datasets used in the univariate and multivariate settings (for each sequence length), so that I can do a comparison? Sorry for the delay in responding.

Additionally, I am getting much better results on the ETTm1 dataset when using the hyperparameters from the ETTh1 dataset.

MSE           MAE
0.1510243027  0.2733270268
0.1615392239  0.283628478

Could you please confirm whether these values make sense, or am I missing something?

Thanks again.


cookieminions commented on May 16, 2024

Hi,

It seems that we named the ETTm1 and ETTm2 files incorrectly in ETDataset, and we will fix this error immediately.

We will also publish two new tables of experimental results (due to the change in data scaling) on March 20th, and we will provide the hyperparameter settings for each experiment at that time.

Please stay tuned. Thanks!


Erickurashi commented on May 16, 2024

(quoting the earlier reply above with the suggested ETTh2 hyperparameters and results)

Hi,
I have changed the code for ETTh2 according to this; my results are:

testing : informer_ETTh2_ftM_sl48_ll24_pl24_dm512_nh8_el2_dl1_df2048_atprob_ebtimeF_dtTrue_exp_0<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
test 2857
test shape: (89, 32, 24, 7) (89, 32, 24, 7)
test shape: (2848, 24, 7) (2848, 24, 7)
mse:0.5467308275443579, mae:0.574276831705922

However, the prediction figure does not match. Am I missing something? Thank you very much.
[image]

I am also very interested in how to tune the model to reproduce this figure:
[image]


cookieminions commented on May 16, 2024

Hi,
I just checked the past results, and the figure I drew looks like this:
[image]

The results in the table are just the MSE and MAE calculated from the predictions and the ground truth, without considering how the plots look. According to our observations in the experiments, LSTM can also obtain very small MSE and MAE, but you will get a straight line when you plot its result for a longer sequence prediction.

The figure in our appendix is a sample selected from the results on ETTm1 (actually it is ETTm2; we made a mistake in the naming of the file...). I checked a past result, but I'm not sure whether this sample location is correct; the figure looks like this:
[image]

You can also try experimenting with the parameters shown in the 'path'.
Thanks!


Erickurashi commented on May 16, 2024

@cookieminions Hi, thank you for your answer. For ETTh2, the pred and true curves still do not match well after I changed all the parameters to match yours; is there anything else I can change to improve the results? I also couldn't find the true curve in the ETTh2 dataset. Has the result data been scaled? Does the first prediction point start at the (8569(train)+2857(val)+500(i)+48(input_len)+24(label_len))th point?
[image]

Thank you very much.


cookieminions commented on May 16, 2024

Hi,

  1. The ground truth used in the figure is scaled.
  2. The first prediction point starts at the (8569(train)+2857(val)+500(i))th point, because the encoder input starts at the (8569(train)+2857(val)+500(i)-48(input_len))th point and the decoder start token starts at the (8569(train)+2857(val)+500(i)-24(label_len))th point (see the arithmetic sketch after this list).
  3. My suggestion for the prediction figure is to try --lradj 'type2' with a larger seq_len and e_layers. Another suggestion is to experiment with a larger pred_len, such as pred_len=48; Informer performs better than other methods in long sequence prediction. I just drew some figures using past experiment results:
    [image]
    However, we only calculated the MSE and MAE of the model after we got the ETTh2 results, and did not make more detailed adjustments for the plots. We also pay attention to this (especially the plotting of short sequence predictions) and will make further experiments and improvements in future work.
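
A quick arithmetic sketch of point 2, using the numbers exactly as stated there (plain Python, just index bookkeeping):

train, val, i = 8569, 2857, 500
input_len, label_len = 48, 24

pred_start      = train + val + i          # 11926: where the first predicted point starts
enc_input_start = pred_start - input_len   # 11878: where the encoder input starts
dec_token_start = pred_start - label_len   # 11902: where the decoder start token starts

print(pred_start, enc_input_start, dec_token_start)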

Thank you very much for your attention to and help with our work.
Wish you all the best with your work and research!


Erickurashi commented on May 16, 2024

Hi @cookieminions
Sorry for asking so many questions, but I still couldn't get results similar to yours.
In my run, the setting is informer_ETTh2_ftM_sl168_ll48_pl48_dm512_nh8_el3_dl2_df512_atprob_ebfixed_dtTrue_exp_0,
whereas yours is
informer_ETTh2_ftM_sl168_ll48_pl48_dm512_nh8_el3_dl2_df512_atprob_ebfixed_testlr1_1.
How can I change mine to testlr1_1?
[image]

I am unsure about the testing procedure. If I understand the testing part correctly, each 168(input_len)+48(label_len)+48(pred_len)=264 points forms one section, and the process then repeats on the next section (264 points) for prediction. So the 48 prediction points use the model trained on the previous training and validation data, and then depend on the 168(input_len) and 48(label_len) points to make the prediction.

Could you please also check whether any of my code needs to be changed?
Sorry for the inconvenience, and thank you so much for your help!
Copy of Informer.ipynb - Colaboratory.pdf


Erickurashi commented on May 16, 2024

(quoting the explanation of test samples above)

Thank you for your kindness. I really appreciate your help!


cookieminions commented on May 16, 2024

You are welcome. Thanks for your attention to our work.
Wish you all the best with your work and research!


Erickurashi commented on May 16, 2024

(quoting the reply above about the scaled ground truth and the prediction start point)

Hi @cookieminions,

When I try to find where the true points are in the dataset, I notice that the first prediction actually starts at the (8569(train)+2857(val)+500(i)+48+48)th point, where the input_len is 48, label_len is 24, and pred_len is 24.
I don't understand why there are 2*input_len = 96 points before the prediction.
Thank you.


cookieminions commented on May 16, 2024

I think your confusion can be resolved by the code in data/data_loader.py, lines 48-51, 78-79, and 95-96:

# data/data_loader.py, lines 48-51
border1s = [0, 12*30*24 - self.seq_len, 12*30*24+4*30*24 - self.seq_len]
border2s = [12*30*24, 12*30*24+4*30*24, 12*30*24+8*30*24]
border1 = border1s[self.set_type]
border2 = border2s[self.set_type]

# lines 78-79
self.data_x = data[border1:border2]
self.data_y = data[border1:border2]

# lines 95-96
def __len__(self):
    return len(self.data_x) - self.seq_len - self.pred_len + 1

# With the test flag:
# data_x: [12*30*24+4*30*24 - self.seq_len : 12*30*24+8*30*24]
# data_x: [train_len+val_len-seq_len : train_len+val_len+test_len]
# test_len = 4*30*24

The length of data_x in the test_dataset is test_len+seq_len, and the size of the test_dataset is given in line 96: len(self.data_x) - self.seq_len - self.pred_len + 1, which means the test_dataset size has nothing to do with seq_len (len(data_x) = test_len+seq_len, so the test_dataset size is test_len+seq_len-seq_len-pred_len+1 = test_len-pred_len+1). So you can adjust seq_len freely in the experiment.

When you load the first sample of the test_dataset, you get an encoder input of length seq_len and a ground-truth sequence of length pred_len. No matter how long your seq_len is, the first sample starts from the same place.
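
As a quick sanity check of the formula above, here is a small sketch that applies it to the ETT hourly splits; for concreteness it assumes seq_len=96 and pred_len=24 (the setting whose printed sizes appear later in this thread):

seq_len, pred_len = 96, 24
train_len, val_len, test_len = 12*30*24, 4*30*24, 4*30*24   # 8640, 2880, 2880 raw points

# Lengths of data_x per split, following the borders above
len_x = {
    'train': train_len,              # [0 : 12*30*24]
    'val':   val_len + seq_len,      # [12*30*24 - seq_len : 16*30*24]
    'test':  test_len + seq_len,     # [16*30*24 - seq_len : 20*30*24]
}

for name, n in len_x.items():
    print(name, n - seq_len - pred_len + 1)   # train 8521, val 2857, test 2857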


Qiuzhuang commented on May 16, 2024

I also still can't reproduce the level of accuracy as reported in the paper in univariate mode.


Erickurashi commented on May 16, 2024

(quoting the data/data_loader.py explanation above)

Hi @cookieminions,
Thank you for your explanation; I am not sure I understand it correctly.
[image]
When I run the code, I can see the sizes of the Train, Validation, and Test data:

start training : informer_ETTh1_ftM_sl96_ll48_pl24_dm512_nh8_el3_dl2_df512_atprob_ebtimeF_dtTrue_exp_0>>>>>>>>>>>>>>>>>>>>>>>>>>
train 8521
val 2857
test 2857

In the script, I can see
border1s = [0, 12*30*24 - self.seq_len, 12*30*24+4*30*24 - self.seq_len]
border2s = [12*30*24, 12*30*24+4*30*24, 12*30*24+8*30*24]
With seq_len = 96, this gives
border1s = [0, 8544, 11424]
border2s = [8640, 11520, 14400]

Therefore,
Train = 8640,
Val = 2976,
Test = 2976.

These values differ from the code output above.


Erickurashi commented on May 16, 2024

(quoting the drop_last explanation above)

Great! Thank you very much.


Erickurashi commented on May 16, 2024

Hi @cookieminions
I have a few more questions; I hope you can help me.

  1. Does the first prediction use seq_len from the validation set?
  2. When training and validation are finished, is it possible to add new data and make a quick prediction without retraining the model?
  3. My data ranges from -3 to 3 and does not need to be normalised. If I don't want my data to be normalised and scaled, do I just go to data_loader.py and change scale=True to False? Will it influence the softmax and the prediction result?
    Thank you very much.


Erickurashi commented on May 16, 2024

Thank you. Also, is it possible to output a confidence interval or a prediction interval?


Erickurashi commented on May 16, 2024
(quoting the three answers above)

Hi @cookieminions,
I don't understand why data_x in the test_set uses seq_len from the vali_set. Shouldn't we use new data from the test set as the seq_len input, so that no information leaks into the prediction?
Thank you.


cookieminions commented on May 16, 2024
(quoting the answers and the follow-up question above)

Hi,

If we used new data from the test set for the seq_len window, a different seq_len would give a different number of prediction samples, because the length of the test series is fixed. By taking the seq_len window from the vali set instead, the vali data is only used as the encoder's input and never as the prediction series, so it does not influence the test set samples.
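
A small arithmetic sketch of this point, using the ETT hourly test length (4*30*24 = 2880) and pred_len = 24 for concreteness:

test_len, pred_len = 4*30*24, 24   # 2880 raw test points

for seq_len in (48, 96, 168, 336):
    warm_up_from_val = test_len - pred_len + 1             # always 2857, independent of seq_len
    test_series_only = test_len - seq_len - pred_len + 1   # shrinks as seq_len grows
    print(seq_len, warm_up_from_val, test_series_only)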

