damitkwr / esrnn-gpu
PyTorch GPU implementation of the ES-RNN model for time series forecasting
License: MIT License
How would one update the config for Hourly or Daily?
On Daily I seem to be getting errors.
On Hourly I have in the config:
'chop_val': 200,
'variable': "Hourly",
'dilations': ((1, 12), (12, 24)),
'state_hsize': 50,
'seasonality': 24,
'input_size': 24,
'output_size': 48,
'level_variability_penalty': 50
and I get the error:
input.size(-1) must be equal to input_size. Expected 30, got 25
On Daily (default) I have:
'chop_val': 200,
'variable': "Daily",
'dilations': ((1, 7), (14, 28)),
'state_hsize': 50,
'seasonality': 7,
'input_size': 7,
'output_size': 14,
'level_variability_penalty': 50
and I get the error:
ValueError: Item wrong length 4226 instead of 4227.
Any advice on how to proceed would be appreciated.
Hi @damitkwr
How would you use the model with different data, say univariate daily data?
Many thanks,
Best,
Andrew
Great code, thanks.
Could you clarify whether it will work for multivariate time series prediction, both regression and classification?
1. Where all values are continuous:
   weight  height  age  target
1  56      160     34   1.2
2  77      170     54   3.5
3  87      167     43   0.7
4  55      198     72   0.5
5  88      176     32   2.3
2. Or where the values are a mixture of continuous and categorical, for example 2 dimensions with continuous values and 3 dimensions with categorical values:
   color   weight  gender  height  age  target
1  black   56      m       160     34   yes
2  white   77      f       170     54   no
3  yellow  87      m       167     43   yes
4  white   55      m       198     72   no
5  white   88      f       176     32   yes
When I run the code in Python, an "es_rnn module not found" error occurs.
Edit: Never mind. The loss is just getting averaged, as batch_num is outside of the for-loop that it increases in.
In trainer.py, inside the train method, after an epoch finishes the epoch_loss is divided by batch_num + 1. That means that after every batch, the epoch_loss is forcefully decreased as the denominator (batch_num) is constantly getting bigger:
epoch_loss = epoch_loss / (batch_num + 1)
Maybe I'm misunderstanding something here, but it doesn't seem right that the loss is getting artificially decreased simply based on which batch the training loop is on. I looked through the original C++ implementation, but couldn't find anything that looked like the above line (I don't know C++ very well, so that may be why).
P.S.
Thanks for the python/torch implementation of this project btw, it's a great resource for learning some good forecasting methods/strategies.
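The point made in the edit note can be checked with a minimal sketch. It assumes the loop adds one loss per batch and divides a single time after the loop ends; the function name mean_epoch_loss and the example losses are illustrative and are not taken from trainer.py.

```python
def mean_epoch_loss(batch_losses):
    """Accumulate per-batch losses, then divide once after the loop."""
    epoch_loss = 0.0
    batch_num = -1  # stays at -1 if there are no batches
    for batch_num, loss in enumerate(batch_losses):
        epoch_loss += loss
    if batch_num < 0:
        raise ValueError("no batches were processed")
    # batch_num is the last zero-based index, so batch_num + 1 is the
    # number of batches and the result is the plain mean batch loss.
    return epoch_loss / (batch_num + 1)


print(mean_epoch_loss([2.0, 4.0, 6.0]))
```

Because the division happens outside the loop, the value is an average over the epoch rather than a loss that shrinks with every batch.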
Hi, I've tried to understand the dataset and how the model is actually trained on it, but it seems that information about the competition is no longer available.
Can you explain how the data is loaded for example from monthly Train.csv?
Any chance you could push an end-to-end example as a jupyter notebook? It's really hard to follow the codebase... thanks!
Hey, thank you for publishing your results, very impressive.
What is the csv formatting for Train and Test? I've noticed that read_file creates arrays of different shapes:
Train: shape (number_of_series,)
Test: shape (number_of_series, time_steps)
I would like to reproduce it on my data. How should I format the Train csv with pd.to_csv so that it is processed properly by your code?
Thanks!
Best regards
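One possible reading of the two shapes is that the training series have different lengths while every test series has the same horizon. The sketch below illustrates that idea only. It assumes a layout with one series per row, an identifier in the first column, and shorter rows padded with empty cells; that layout, the helper names read_series_rows and is_rectangular, and the sample values are all assumptions, not something confirmed against read_file.

```python
import csv
import io


def read_series_rows(csv_text):
    """Parse one series per row, dropping the id column and empty padding."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    return [[float(v) for v in row[1:] if v != ""] for row in reader]


def is_rectangular(series):
    """True when every series has the same number of observations."""
    return len({len(s) for s in series}) <= 1


# Hypothetical sample data: S2 is shorter than S1 in the training file.
train_csv = "V1,V2,V3,V4\nS1,1.0,2.0,3.0\nS2,4.0,5.0,\n"
test_csv = "V1,V2,V3\nS1,4.0,5.0\nS2,6.0,7.0\n"

train = read_series_rows(train_csv)
test = read_series_rows(test_csv)
print(is_rectangular(train), is_rectangular(test))
```

A ragged collection like train can only be held as a one-dimensional array of per-series arrays, whereas a rectangular one like test fits a two-dimensional array.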
def read_file(C:\Users\welcome\Downloads\m4 forecast\M4-methods-master\Dataset()):
SyntaxError: unexpected character after line continuation character
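The SyntaxError comes from writing a Windows path where the def statement expects a parameter name: outside a string literal, a backslash is read as a line continuation character. A minimal sketch of the usual pattern follows, keeping a parameter name in the definition and passing the path as a raw string at the call site. The function describe_path and the example path are hypothetical and stand in for whatever the real reader does.

```python
from pathlib import PureWindowsPath


def describe_path(file_location):
    """Return the file name and its parent folder for a Windows-style path."""
    path = PureWindowsPath(file_location)
    return path.name, path.parent.name


# The r prefix keeps the backslashes literal instead of treating them as escapes.
name, parent = describe_path(r"C:\Users\example\Downloads\Dataset\Train.csv")
print(name, parent)
```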
I'm running the code nearly unchanged on a Google Cloud Compute instance with 2x Nvidia V100, 60GB RAM, 16 CPUs. config.py is unchanged.
With 15 epochs on the Quarterly data, total training time is 16.01 minutes, almost double the 8.94 minutes shown in the paper. However, the validation results at the end of epoch 15 are nearly identical to the paper's reported results:
{'Demographic': 10.814908027648926, 'Finance': 10.71678638458252, 'Industry': 7.436440944671631, 'Macro': 9.547700881958008, 'Micro': 11.63847827911377, 'Other': 7.911505699157715, 'Overall': 10.091866493225098, 'loss': 7.8162946701049805}
When I remove the model saving step, training time decreases to 15.76 minutes.
I downloaded the dataset from the provided link, and made no changes.
I'm using updated package versions, although I wouldn't expect this to halve performance:
What hardware configuration was the authors' testing done on? I'm using dual V100s, the highest-end GPUs available on GCP. I'd expect to match or outperform the reported benchmarks. Do you have any thoughts on why my performance is considerably worse in my situation?
AttributeError: module 'tensorboard.summary._tf.summary' has no attribute 'FileWriter'
How can I fix it?
Sorry to bother you. I can't find which part of this code implements the prediction interval (PI) function. Is this code just for point forecasts (PF)?
Thank you.
I have a problem installing. I typed pip install git+https://github.com/damitkwr/ESRNN-GPU.git and an error occurred. I then typed pip install git+https://github.com/damitkwr/ESRNN-GPU.git#egg=ESRNN-GPU and an error also showed. Please help.
Hi,
I am trying to use the code, but I have run into several issues. One is that the following lines in main.py cause problems:
train_path = '../data/Train/%s-train.csv' % (config['variable'])
test_path = '../data/Test/%s-test.csv' % (config['variable'])
If I understand the code correctly, this sets the path to "../data/Train/Daily-train.csv", which is a CSV file that does not exist. The other issue is that I do not really understand what kind of information info.csv should contain. Would you please help me deal with these problems?
Thanks
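Before launching training, it can help to confirm which of the expected files are actually present. The sketch below rebuilds the two paths following the pattern quoted from main.py and lists the ones that are missing. The helper names m4_paths and missing_files and the data_root default are assumptions made for illustration.

```python
from pathlib import Path


def m4_paths(variable, data_root="../data"):
    """Build the train and test CSV paths using the pattern quoted above."""
    root = Path(data_root)
    return {
        "train": root / "Train" / f"{variable}-train.csv",
        "test": root / "Test" / f"{variable}-test.csv",
    }


def missing_files(paths):
    """Return the keys whose files do not exist on disk."""
    return [name for name, path in paths.items() if not path.exists()]


paths = m4_paths("Daily")
print(paths["train"].as_posix())
print(missing_files(paths))
```

Relative paths such as ../data are resolved against the current working directory, so the result depends on where the script is started from.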
For Weekly example
Hi,
It seems that max_loss in function train_epochs() at esrnn/trainer.py is not being updated appropriately:
def train_epochs(self):
    max_loss = 1e8
    start_time = time.time()
    for e in range(self.max_epochs):
        self.scheduler.step()
        epoch_loss = self.train()
        if epoch_loss < max_loss:
            self.save()
        epoch_val_loss = self.val()
        if e == 0:
            file_path = os.path.join(self.csv_save_path, 'validation_losses.csv')
            with open(file_path, 'w') as f:
                f.write('epoch,training_loss,validation_loss\n')
        with open(file_path, 'a') as f:
            f.write(','.join([str(e), str(epoch_loss), str(epoch_val_loss)]) + '\n')
    print('Total Training Mins: %5.2f' % ((time.time()-start_time)/60))
Thanks!
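In the quoted method, max_loss is compared against but never reassigned, so any epoch with a loss below 1e8 triggers a save. A minimal sketch of the usual best-so-far pattern is shown here in isolation. The function epochs_to_save and its inputs are hypothetical, and it returns the epochs that would be checkpointed instead of calling self.save().

```python
def epochs_to_save(epoch_losses, initial_best=1e8):
    """Return the indices of epochs that improve on the best loss so far."""
    best_loss = initial_best
    saved = []
    for epoch, loss in enumerate(epoch_losses):
        if loss < best_loss:
            best_loss = loss  # the update missing from the quoted loop
            saved.append(epoch)
    return saved


print(epochs_to_save([5.0, 6.0, 4.0, 4.5]))
```

In train_epochs the equivalent change would be a single assignment, max_loss = epoch_loss, inside the if branch next to the save call.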
Hi there,
My dataset only has two columns: Date and Price. Is it possible to use the price as both the input and the output? If so, how should I divide the data into x_train, y_train, x_test, and y_test? Will the algorithm do it for me automatically? Or, with only these two columns, will I not be able to use this algorithm?
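For a single price column, a common generic approach is to hold out the last observations as a test set and cut the remaining series into sliding input/output windows. The sketch below shows that idea only. It is not how this repository prepares its data, and the names split_train_test and make_windows, the window sizes, and the prices are all illustrative.

```python
def split_train_test(series, test_size):
    """Hold out the last test_size observations (test_size must be positive)."""
    return series[:-test_size], series[-test_size:]


def make_windows(series, input_size, output_size):
    """Slide over the series, pairing each input window with the values after it."""
    windows = []
    last_start = len(series) - input_size - output_size
    for start in range(last_start + 1):
        x = series[start:start + input_size]
        y = series[start + input_size:start + input_size + output_size]
        windows.append((x, y))
    return windows


prices = [10, 11, 12, 13, 14, 15]  # the Price column, ordered by Date
train, test = split_train_test(prices, test_size=2)
pairs = make_windows(prices, input_size=3, output_size=2)
print(train, test)
print(pairs)
```

Each pair plays the role of one (x, y) example, so the price serves as both the input and the target.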
Hello,
I've tried to install ESRNN following the instructions at this link: https://pypi.org/project/ESRNN/
which were: pip install ESRNN
However, when I try to run the following code:
from ESRNN.m4_data import prepare_m4_data
from ESRNN.utils_evaluation import evaluate_prediction_owa
from ESRNN import ESRNN
I get the following error:
Traceback (most recent call last):
  File "ESRNN.py", line 2, in <module>
    from ESRNN.m4_data import prepare_m4_data
  File "C:\Users\mario\Documents\Python Benjamin\ESRNN.py", line 2, in <module>
    from ESRNN.m4_data import prepare_m4_data
ModuleNotFoundError: No module named 'ESRNN.m4_data'; 'ESRNN' is not a package
I've tried adding ESRNN as a path variable and still get the same error.
Could anyone please assist?
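The traceback shows the import being resolved against the script itself, ESRNN.py, which suggests the script's name is shadowing the installed package: Python searches the script's own directory first, finds ESRNN.py, and treats it as a plain module rather than a package. A small check for this situation could look like the sketch below. The helper shadows_package is hypothetical.

```python
from pathlib import Path


def shadows_package(script_path, package_name):
    """True when a script's file name would be imported in place of the package."""
    return Path(script_path).stem == package_name


print(shadows_package("ESRNN.py", "ESRNN"))
print(shadows_package("run_forecast.py", "ESRNN"))
```

If the check is true, renaming the script (run_forecast.py above is just an example name) and deleting any stale __pycache__ entry for the old name would likely let the import reach the installed package.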
You import it in logger.py
Great work on this project! Having used a version of the original ES-RNN code, I have a few questions about the differences between the implementations and the results presented in the paper.
The paper mentions "Note that for monthly data, Smyl et al. (2018) were running the algorithm of 6 pairs of 2 workers and for quarterly data, 4 pairs of 2 workers were used." For the results, did the times reported in the paper represent running the ESRNN-GPU implementation with multiple workers in aggregate (CPU Time), multiple workers concurrently (Wall Clock Time) or was the time reported for a single worker?
What GPU did you test on? Testing the M4 data set with CUDA-enabled PyTorch on a notebook graphics card (Nvidia GeForce GTX 1050) vs non-CUDA-enabled PyTorch showed the CPU-only version (i7 8550) to be faster by about 3x. It is likely that the PyTorch CPU version is still faster than the original ESRNN, but I have not confirmed that.
Is there any plan to implement the future work for Variable Length Series mentioned in Section 8.1? What would be required?