Comments (11)
As I read it, he's saying he excluded the current time step. I'm still squinting at this code myself; it's really complicated, with seven layers.
BTW the code throws a device-mismatch error on a GPU. You need to import device from constants and call .to(device) on the tensors returned by torch.zeros() in modules.py. Then it works.
They also deprecated torch.autograd.Variable back in PyTorch 0.4 (it's a no-op wrapper by 1.0), so you don't need to call it anymore.
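A minimal sketch of both fixes together (names like `hidden` are mine, not the repo's; `constants.device` is the module-level device object mentioned above):

```python
import torch

# Pick GPU when available; this is what a constants.device would hold.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# torch.zeros (note the spelling) creates CPU tensors by default, which is
# what triggers the device mismatch -- move them explicitly:
hidden = torch.zeros(1, 128, 64).to(device)

# Since PyTorch 0.4, Tensors carry autograd state themselves, so the old
# Variable(...) wrapper can simply be dropped:
x = torch.randn(16, 10, requires_grad=True)
```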
from da-rnn.
Yeah, he didn't explain that well; it took me a while to notice that the word "contemporary" refers to data at time T.
The paper specifies that the inputs are the values of the target series from 1 to T-1, along with all the values from all exogenous series from 1 to T. He's trying to predict the target series at T.
He's worried that the network will learn that it can ignore all the older data, focus only on the data from the very last time point, and simply return the sum of the prices at time T. You're basically giving it all the information it needs to cheat if you include data at T.
I'm not sure why the paper includes it either. "T" means "tomorrow". Today is T-1. An algorithm to predict tomorrow's NASDAQ shouldn't be requiring tomorrow's individual prices as input. I can do that in Excel.
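To make the leakage concrete, here's a toy sketch (my own numbers, not the repo's code): if the target is just the sum of the exogenous series, a "model" that is handed x at time T can return that sum and be exactly right without learning anything.

```python
import numpy as np

rng = np.random.default_rng(0)
prices = rng.uniform(10, 100, size=(50, 81))  # 50 time steps, 81 "stocks"
index = prices.sum(axis=1)                    # toy target: sum of prices

T = 10  # we want to predict index[T]

# Leaky setup: the window includes the exogenous values at time T, so
# "prediction" degenerates into recomputing the sum -- zero error.
leaky_pred = prices[T].sum()
assert np.isclose(leaky_pred, index[T])

# Honest setup: the window stops before time T, forcing extrapolation.
history = prices[:T]  # x up to (but not including) time T
```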
So, what I'm getting from this discussion is that the paper appears to include the current time step and is thus doing regression. However, does Chandler's code also include the current time step? Have I replicated this error?
It's been a while since I've used this code, and I don't like how I did the data pre-processing here; it's quite hard to read. So if you don't know the answer to my question, that's fine; I can figure it out later.
I'm trying to dig into this dataloader (really sophisticated, by the way).
What I've noticed: on dummy "linear" data it skips y_history (or y_target, depending on your point of view) at time T. That is, the features are 1, 2, ..., 9 (for T = 10), y_hist is 11, 12, ..., 19, and the y_target generated by the prep_train_data function is [21]. To me it should either be 1, 2, ..., 10 and 11, 12, ..., 20 with [21] as y_target, or keep the window as-is with [20] as the target. I can't see the reason for shrinking the time window from 10 to 9 steps during data processing.
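A toy reconstruction of the slicing being described (my own code, not the repo's actual prep_train_data), which reproduces the gap: the window has only T-1 steps, and the value at index i+T-1 (here 20) is skipped.

```python
import numpy as np

T = 10
exog = np.arange(1, 31)   # dummy exogenous series: 1, 2, ..., 30
targ = exog + 10          # dummy target series:    11, 12, ..., 40

i = 0  # window start
feats = exog[i : i + T - 1]   # 1..9  -- only T-1 steps, not T
y_hist = targ[i : i + T - 1]  # 11..19
y_target = targ[i + T]        # 21 -- targ[i + T - 1] == 20 is skipped
```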
@jtiscione Yeah, it is meaningless to add the current exogenous sequence for prediction; that turns the task into something more like an auxiliary measurement than a forecast. I also suggest you look at the later paper GeoMAN: Multi-level Attention Networks for Geo-sensory Time Series Prediction. Its ideas are consistent with DA-RNN, and the code is open source.
Another problem is that I can't get this program to saturate the GPU: utilization sits around 20% and memory utilization at 9%. I saw that you mentioned changes for running on GPU. Could you give me more detailed guidance? Looking forward to your reply.
@Seanny123 Yes, you didn't use the current exogenous sequence, and Chandler doesn't either. Your approach is not the one described in the original paper, but it is more meaningful, even though the results are worse.
In addition, preprocessing the data all together like this is not realistic, because we would not know the future series in advance. But if the training set and the test set are processed separately, the results don't fluctuate much, so the shortcut is acceptable. The original author did the same thing.
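A minimal sketch of the fit-on-train-only alternative (plain NumPy standing in for StandardScaler; the split point and series are made up):

```python
import numpy as np

series = np.linspace(0.0, 100.0, 200)  # toy target series
train, test = series[:140], series[140:]

# Realistic: statistics come from the training split only,
# then are reused unchanged on the test split.
mu, sigma = train.mean(), train.std()
train_scaled = (train - mu) / sigma
test_scaled = (test - mu) / sigma

# Leaky (what scaling everything together amounts to): the test set's
# future values influence the statistics, which is impossible to do
# in deployment.
mu_all, sigma_all = series.mean(), series.std()
```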
@notonlyvandalzzz For this problem, pay attention to every T-1 in the code, especially where the raw data is read in.
@lyq1471 Yes, I see this. Replacing every T-1 with T gives gapless data, but both Chandler and @Seanny123 wrote their code with T-1.
@notonlyvandalzzz Maybe you should trust the paper itself rather than any particular person. Besides, are you running on GPU? Are you able to use all of its resources? I suspect the PyTorch version (0.3.0) or the latency of sending and receiving data is what keeps my GPU underutilized.
@jtiscione The original DA-RNN paper on arxiv.org describes mapping y(1...T-1) and x(1...T) to y(T).
Quoted from the abstract of the DA-RNN paper: "The Nonlinear autoregressive exogenous (NARX) model, which predicts the current value of a time series based upon its previous values as well as the current and past values of multiple driving (exogenous) series, has been studied for decades."
Their scenario setup is based on NARX.
Googling "NARX" turns up two problem definitions, which differ in whether U(t) (the exogenous driver at time t) is included:
https://en.wikipedia.org/wiki/Nonlinear_autoregressive_exogenous_model
https://www.mathworks.com/help/deeplearning/ug/design-time-series-narx-feedback-neural-networks.html
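Spelling the two definitions out (my paraphrase of those two pages, with d the window length):

```latex
% Wikipedia-style NARX: the exogenous input at time t IS included
y_t = F\big(y_{t-1}, \dots, y_{t-d},\; u_t, u_{t-1}, \dots, u_{t-d}\big)

% MathWorks-style NARX: only past exogenous inputs, u_t excluded
y(t) = f\big(y(t-1), \dots, y(t-d),\; u(t-1), \dots, u(t-d)\big)
```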
The paper gives two experiments, SML 2010 and NASDAQ 100.
In SML 2010, checking the attribute definitions (https://archive.ics.uci.edu/ml/datasets/SML2010), it makes sense to include U(t) in the prediction of Y(t), since we don't know the relation between the Temp. target and exogenous drivers such as wind speed, CO2 ppm, date, etc.
But in NASDAQ 100 the experiment setup is somewhat ambiguous, since the NASDAQ 100 index can be computed directly in real time with a market-cap-weighted method, given the prices of its 100 constituents. So if U covered all 100 constituents, U(t) could produce Y(t) at 100% accuracy, and it would become a broken prediction problem. But the paper says: "In the NASDAQ 100 Stock dataset, we collected the stock prices of 81 major corporations under NASDAQ 100, which are used as the driving time series. The index value of the NASDAQ 100 is used as the target series."
Only 81 constituents are used in U, so including U(t) in the prediction of Y(t) can still be called a meaningful problem setup, though it contributes most of the learning signal.
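A toy illustration of why full coverage would break the setup: a market-cap-weighted index is a deterministic function of its constituents' prices, so knowing all of them at time t pins down Y(t) exactly (every number below is made up, and the real NASDAQ 100 divisor and weighting rules are more involved).

```python
import numpy as np

shares = np.array([3.0e9, 1.5e9, 0.8e9])   # shares outstanding (made up)
prices_t = np.array([150.0, 300.0, 95.0])  # constituent prices at time t
divisor = 1.0e8                            # index divisor (made up)

# Market-cap-weighted index value at time t:
index_t = (shares * prices_t).sum() / divisor

# With ALL constituents known at time t, "predicting" index_t is just
# arithmetic; with only 81 of 100 known, a residual remains to learn.
```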