
time-series-forecasting-pytorch's Introduction

Predicting Stock Prices with Deep Learning

Project Overview

Deep learning is part of a broader family of machine learning methods based on artificial neural networks, which are inspired by our brain's own network of neurons. Among the popular deep learning paradigms, Long Short-Term Memory (LSTM) is a specialized architecture that can "memorize" patterns from historical sequences of data and extrapolate such patterns for future events.

Since the financial market naturally consists of historical sequences of equity prices, more and more quantitative researchers and finance professionals are using LSTM to model and predict market price movements. In this project, we will go through the end-to-end machine learning workflow of developing an LSTM model to predict stock market prices using PyTorch and the Alpha Vantage APIs.

The project is grouped into the following sections, which are representative of a typical machine learning workflow:

  • Installing Python dependencies
  • Data preparation: acquiring financial market data from Alpha Vantage
  • Data preparation: normalizing raw data
  • Data preparation: generating training and validation datasets
  • Defining the LSTM model
  • Model training
  • Model evaluation
  • Predicting future stock prices

By the end of this project, you will have a fully functional LSTM model that predicts future stock prices based on historical price movements, all in a single Python file. This tutorial is written so that all the essential code snippets are embedded inline, and you should be able to develop, train, and test your machine learning model without referring to other external pages or documents.

Let's get started!

Installing Python dependencies

We recommend using Python 3.6 or higher for this project. If you do not have Python installed in your local environment, please visit python.org for the latest download instructions.

With Python installed, please go to the Command Line interface of your operating system and use the "pip install" prompts below to install Numpy, PyTorch, Matplotlib, and Alpha Vantage, respectively.
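A minimal set of commands, assuming the standard PyPI package names (for a GPU-enabled build of PyTorch, follow the install selector on pytorch.org instead):

pip install numpy
pip install torch
pip install matplotlib
pip install alpha_vantage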

Create your Python file and follow along with the code in this tutorial

Now, create a new Python file named project.py and paste the following code into the file:

import numpy as np

import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

import matplotlib.pyplot as plt
from matplotlib.pyplot import figure

from alpha_vantage.timeseries import TimeSeries 

print("All libraries loaded")

If you have successfully installed all the Python dependencies above, you should see the text "All libraries loaded" after running the project.py file.

Now append the following code to the project.py file. Don't forget to replace "YOUR_API_KEY" with your actual Alpha Vantage API key, which can be obtained from the support page.

config = {
    "alpha_vantage": {
        "key": "YOUR_API_KEY", # Claim your free API key here: https://www.alphavantage.co/support/#api-key
        "symbol": "IBM",
        "outputsize": "full",
        "key_adjusted_close": "5. adjusted close",
    },
    "data": {
        "window_size": 20,
        "train_split_size": 0.80,
    }, 
    "plots": {
        "xticks_interval": 90, # show a date every 90 days
        "color_actual": "#001f3f",
        "color_train": "#3D9970",
        "color_val": "#0074D9",
        "color_pred_train": "#3D9970",
        "color_pred_val": "#0074D9",
        "color_pred_test": "#FF4136",
    },
    "model": {
        "input_size": 1, # since we are only using 1 feature, close price
        "num_lstm_layers": 2,
        "lstm_size": 32,
        "dropout": 0.2,
    },
    "training": {
        "device": "cpu", # "cuda" or "cpu"
        "batch_size": 64,
        "num_epoch": 100,
        "learning_rate": 0.01,
        "scheduler_step_size": 40,
    }
}

Over the course of this project, we will continue adding new code blocks to the project.py file. By the time you reach the end of the tutorial, you should have a fully functional LSTM machine learning model to predict stock market price movements, all in a single Python script. Please feel free to compare your project.py with the official copy if you would like to have a "sanity check" anytime during the project.

Data preparation: acquiring financial market data from Alpha Vantage

In this project, we will train an LSTM model to predict stock price movements. Before we can build the "crystal ball" to predict the future, we need historical stock price data to train our deep learning model. To this end, we will query the Alpha Vantage stock data API via a popular Python wrapper. For this project, we will obtain over 20 years of daily close prices for IBM from November 1999 to April 29, 2021.

[Figure: historical daily close prices]

Append the following code block to your project.py file. If you re-run the file now, it should generate a graph similar to above thanks to the powerful matplotlib library.

def download_data(config):
    ts = TimeSeries(key=config["alpha_vantage"]["key"])
    data, meta_data = ts.get_daily_adjusted(config["alpha_vantage"]["symbol"], outputsize=config["alpha_vantage"]["outputsize"])

    data_date = [date for date in data.keys()]
    data_date.reverse()

    data_close_price = [float(data[date][config["alpha_vantage"]["key_adjusted_close"]]) for date in data.keys()]
    data_close_price.reverse()
    data_close_price = np.array(data_close_price)

    num_data_points = len(data_date)
    display_date_range = "from " + data_date[0] + " to " + data_date[num_data_points-1]
    print("Number data points", num_data_points, display_date_range)

    return data_date, data_close_price, num_data_points, display_date_range

data_date, data_close_price, num_data_points, display_date_range = download_data(config)

# plot

fig = figure(figsize=(25, 5), dpi=80)
fig.patch.set_facecolor((1.0, 1.0, 1.0))
plt.plot(data_date, data_close_price, color=config["plots"]["color_actual"])
xticks = [data_date[i] if ((i%config["plots"]["xticks_interval"]==0 and (num_data_points-i) > config["plots"]["xticks_interval"]) or i==num_data_points-1) else None for i in range(num_data_points)] # make x ticks nice
x = np.arange(0,len(xticks))
plt.xticks(x, xticks, rotation='vertical')
plt.title("Daily close price for " + config["alpha_vantage"]["symbol"] + ", " + display_date_range)
plt.grid(visible=None, which='major', axis='y', linestyle='--') # `visible=` replaces the deprecated `b=` argument in newer versions of matplotlib
plt.show()

Please note that we are using the adjusted close field of Alpha Vantage's daily adjusted API to remove any artificial price turbulence due to stock splits and dividend payout events. It is generally considered an industry best practice to use split/dividend-adjusted prices instead of raw prices to model stock price movements.

Data preparation: normalizing raw data

Machine learning algorithms (such as our LSTM) that use gradient descent as the optimization technique require data to be scaled. This is because the scale of the feature values affects the step size of gradient descent, potentially skewing the LSTM model in unexpected ways.

This is where data normalization comes in. Normalization can increase the accuracy of your model and help the gradient descent algorithm converge more quickly towards the target minima. By bringing the input data on the same scale and reducing its variance, none of the weights in the artificial neural network will be wasted on normalizing tasks, which means the LSTM model can more efficiently learn from the data and store patterns in the network. Furthermore, LSTMs are intrinsically sensitive to the scale of the input data. For the above reasons, it is crucial to normalize the data.

Since stock prices can range from the tens to the hundreds or thousands of dollars - roughly $40 to $160 in the case of IBM - we will perform normalization on the stock prices to standardize the range of these values before feeding the data to the LSTM model. The following code snippet rescales the data to have a mean of 0 and a standard deviation of 1.
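As a quick illustration of what this means (a toy sketch with made-up prices, not part of project.py):

# toy sketch: z-score normalization on three hypothetical prices
import numpy as np

prices = np.array([40.0, 100.0, 160.0])
normalized = (prices - prices.mean()) / prices.std()
print(normalized)  # approximately [-1.22, 0, 1.22] -> mean 0, standard deviation 1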

Append the following code block to your project.py file.

class Normalizer():
    def __init__(self):
        self.mu = None
        self.sd = None

    def fit_transform(self, x):
        self.mu = np.mean(x, axis=(0), keepdims=True)
        self.sd = np.std(x, axis=(0), keepdims=True)
        normalized_x = (x - self.mu)/self.sd
        return normalized_x

    def inverse_transform(self, x):
        return (x*self.sd) + self.mu

# normalize
scaler = Normalizer()
normalized_data_close_price = scaler.fit_transform(data_close_price)

Data preparation: generating training and validation datasets

Supervised machine learning methods such as LSTM learn a mapping function from input variables (X) to an output variable (Y). Learning from the training dataset can be thought of as a teacher supervising the learning process, where the teacher knows all the right answers.

In this project, we will train the model to predict the 21st day's price based on the past 20 days' close prices. The number of days, 20, was selected for a few reasons:

  • When LSTM models are used in natural language processing, the number of words in a sentence typically ranges from 15 to 20 words
  • Gradient descent considerations: attempting to back-propagate across very long input sequences may result in vanishing gradients
  • Longer sequences tend to have much longer training times

After transforming the dataset into input features and output labels, the shape of our X is (5388, 20): 5388 rows, each containing a sequence of the past 20 days' prices. The corresponding Y data shape is (5388,), which matches the number of rows in X.
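To make the windowing concrete, here is a toy sketch (illustrative only, with a hypothetical window size of 3 over 6 prices; it is not part of project.py):

# toy sketch: turning a price series into (X, Y) pairs with window_size=3
import numpy as np

prices = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
window_size = 3
X = np.array([prices[i:i+window_size] for i in range(len(prices) - window_size)])
Y = prices[window_size:]
print(X.shape, Y.shape)  # (3, 3) (3,)
# X[0] = [1, 2, 3] is paired with Y[0] = 4, X[1] = [2, 3, 4] with Y[1] = 5, and so on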

We also split the dataset into two parts, for training and validation. We split the data 80:20 - 80% of the data is used for training, with the remaining 20% used to verify the model's performance in predicting future prices. (Alternatively, another common practice is to split the initial data into train, validation, and test sets (70/20/10), where the test dataset is not used at all during the training process; a sketch of this alternative appears after the next code block.) The graph below shows the portions of data used for training and validation: roughly, data from before 2017 is used for training and data after that for validating the model's performance.

[Figure: dataset split into training and validation portions]

Append the following code block to your project.py file. If you re-run the file now, it should generate a graph similar to above, where the training data is colored in green and validation data is colored in blue.

def prepare_data_x(x, window_size):
    # perform windowing: each row holds `window_size` consecutive prices
    n_row = x.shape[0] - window_size + 1
    output = np.lib.stride_tricks.as_strided(x, shape=(n_row, window_size), strides=(x.strides[0], x.strides[0]))
    # the last window has no label yet; keep it aside as the "unseen" input for predicting the next day
    return output[:-1], output[-1]


def prepare_data_y(x, window_size):
    # # perform simple moving average
    # output = np.convolve(x, np.ones(window_size), 'valid') / window_size

    # use the next day as label
    output = x[window_size:]
    return output

data_x, data_x_unseen = prepare_data_x(normalized_data_close_price, window_size=config["data"]["window_size"])
data_y = prepare_data_y(normalized_data_close_price, window_size=config["data"]["window_size"])

# split dataset

split_index = int(data_y.shape[0]*config["data"]["train_split_size"])
data_x_train = data_x[:split_index]
data_x_val = data_x[split_index:]
data_y_train = data_y[:split_index]
data_y_val = data_y[split_index:]

# prepare data for plotting

to_plot_data_y_train = np.zeros(num_data_points)
to_plot_data_y_val = np.zeros(num_data_points)

to_plot_data_y_train[config["data"]["window_size"]:split_index+config["data"]["window_size"]] = scaler.inverse_transform(data_y_train)
to_plot_data_y_val[split_index+config["data"]["window_size"]:] = scaler.inverse_transform(data_y_val)

to_plot_data_y_train = np.where(to_plot_data_y_train == 0, None, to_plot_data_y_train)
to_plot_data_y_val = np.where(to_plot_data_y_val == 0, None, to_plot_data_y_val)

## plots

fig = figure(figsize=(25, 5), dpi=80)
fig.patch.set_facecolor((1.0, 1.0, 1.0))
plt.plot(data_date, to_plot_data_y_train, label="Prices (train)", color=config["plots"]["color_train"])
plt.plot(data_date, to_plot_data_y_val, label="Prices (validation)", color=config["plots"]["color_val"])
xticks = [data_date[i] if ((i%config["plots"]["xticks_interval"]==0 and (num_data_points-i) > config["plots"]["xticks_interval"]) or i==num_data_points-1) else None for i in range(num_data_points)] # make x ticks nice
x = np.arange(0,len(xticks))
plt.xticks(x, xticks, rotation='vertical')
plt.title("Daily close prices for " + config["alpha_vantage"]["symbol"] + " - showing training and validation data")
plt.grid(visible=None, which='major', axis='y', linestyle='--')
plt.legend()
plt.show()
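For the 70/20/10 alternative mentioned above, a minimal sketch of the index arithmetic on the data_x and data_y arrays could look like this (not part of project.py; the variable names are hypothetical):

# optional sketch: a 70/20/10 train/validation/test split of the windowed arrays
train_end = int(data_y.shape[0] * 0.70)
val_end = int(data_y.shape[0] * 0.90)

x_train, x_val, x_test = data_x[:train_end], data_x[train_end:val_end], data_x[val_end:]
y_train, y_val, y_test = data_y[:train_end], data_y[train_end:val_end], data_y[val_end:]
# the test portion would be held out entirely until a final evaluation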

We will train our model using the PyTorch framework, a machine learning library written in Python. At the heart of PyTorch's data loading utility is the DataLoader class, an efficient data generation scheme that leverages the full potential of your computer's Graphics Processing Unit (GPU) during the training process where applicable. DataLoader requires a Dataset object that defines the data to load. We use a map-style Dataset, which implements the __getitem__() and __len__() protocols and represents a map from indices to data samples.

Append the following code block to your project.py file.

class TimeSeriesDataset(Dataset):
    def __init__(self, x, y):
        x = np.expand_dims(x, 2) # in our case, we have only 1 feature, so we need to convert `x` into [batch, sequence, features] for LSTM
        self.x = x.astype(np.float32)
        self.y = y.astype(np.float32)
        
    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return (self.x[idx], self.y[idx])

dataset_train = TimeSeriesDataset(data_x_train, data_y_train)
dataset_val = TimeSeriesDataset(data_x_val, data_y_val)

print("Train data shape", dataset_train.x.shape, dataset_train.y.shape)
print("Validation data shape", dataset_val.x.shape, dataset_val.y.shape)

train_dataloader = DataLoader(dataset_train, batch_size=config["training"]["batch_size"], shuffle=True)
val_dataloader = DataLoader(dataset_val, batch_size=config["training"]["batch_size"], shuffle=True)

Defining the LSTM model

With the training and evaluation data now fully normalized and prepared, we are ready to build our LSTM model!

As mentioned before, LSTM is a specialized artificial neural network architecture that can "memorize" patterns from historical sequences of data and extrapolate such patterns for future events. Specifically, it belongs to a group of artificial neural networks called Recurrent Neural Networks (RNNs).

LSTM is a popular artificial neural network architecture because it manages to overcome many technical limitations of RNNs. For example, RNNs fail to learn when the data sequence is longer than roughly 5 to 10 time steps, due to the vanishing gradient problem: the gradients become vanishingly small, effectively preventing the model from learning. LSTMs can learn long sequences of data by enforcing constant error flow through self-connected hidden layers, which contain memory cells and corresponding gate units. If you are interested in learning more about the inner workings of LSTMs and RNNs, this is an excellent explainer for your reference.

Our artificial neural network will have three main layers, with each layer designed with a specific logical purpose:

  • linear layer 1 (linear_1): to map input values into a high dimensional feature space, transforming the features for the LSTM layer
  • LSTM (lstm): to learn the data in sequence
  • linear layer 2 (linear_2): to produce the predicted value based on LSTM's output

We also add dropout, in which randomly selected neurons are ignored during training, thereby regularizing the network to prevent overfitting and improve overall model performance. As an optional step, we also initialize the LSTM model's weights, as some researchers have observed that this can help the model learn better.

Append the following code block to your project.py file.

class LSTMModel(nn.Module):
    def __init__(self, input_size=1, hidden_layer_size=32, num_layers=2, output_size=1, dropout=0.2):
        super().__init__()
        self.hidden_layer_size = hidden_layer_size

        self.linear_1 = nn.Linear(input_size, hidden_layer_size)
        self.relu = nn.ReLU()
        # the LSTM's input size equals hidden_layer_size because linear_1 projects each single-feature time step into that space
        self.lstm = nn.LSTM(hidden_layer_size, hidden_size=self.hidden_layer_size, num_layers=num_layers, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.linear_2 = nn.Linear(num_layers*hidden_layer_size, output_size)
        
        self.init_weights()

    def init_weights(self):
        for name, param in self.lstm.named_parameters():
            if 'bias' in name:
                 nn.init.constant_(param, 0.0)
            elif 'weight_ih' in name:
                 nn.init.kaiming_normal_(param)
            elif 'weight_hh' in name:
                 nn.init.orthogonal_(param)

    def forward(self, x):
        batchsize = x.shape[0]

        # layer 1
        x = self.linear_1(x)
        x = self.relu(x)
        
        # LSTM layer
        lstm_out, (h_n, c_n) = self.lstm(x)

        # reshape output from hidden cell into [batch, features] for `linear_2`
        x = h_n.permute(1, 0, 2).reshape(batchsize, -1) 
        
        # layer 2
        x = self.dropout(x)
        predictions = self.linear_2(x)
        return predictions[:,-1]

Model training

The LSTM model learns by iteratively making predictions given the training data X. We use mean squared error as the cost function, which measures the difference between the predicted values and the actual values. When the model is making bad predictions, the error value returned by the cost function will be relatively high. The model will fine-tune its weights through backpropagation, improving its ability to make better predictions. Learning stops when the algorithm achieves an acceptable level of performance, where the cost function on the validation dataset is no longer showing incremental improvements.
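As a quick numeric illustration of mean squared error (a toy sketch with made-up numbers, not part of project.py):

# toy sketch: mean squared error on three hypothetical predictions
import torch
import torch.nn as nn

pred = torch.tensor([1.0, 2.0, 3.0])
actual = torch.tensor([1.5, 2.0, 2.0])
print(nn.MSELoss()(pred, actual))  # ((-0.5)**2 + 0**2 + 1**2) / 3, approximately 0.4167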

We use the Adam optimizer, which updates the model's parameters based on the learning rate through its step() method. This is how the model learns and fine-tunes its predictions. The learning rate controls how quickly the model converges. A learning rate that is too large can cause the model to converge too quickly to a suboptimal solution, whereas a learning rate that is too small requires more training iterations and may take much longer to find an optimal solution. We also use the StepLR scheduler to reduce the learning rate during the training process. You may also try the ReduceLROnPlateau scheduler, which reduces the learning rate when the cost function has stopped improving for a "patience" number of epochs. Choosing the proper learning rate for your project is both art and science, and is a heavily researched topic in the machine learning community.
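If you would like to experiment with ReduceLROnPlateau, a minimal sketch could look like the following, reusing the optimizer, run_epoch, and dataloaders defined in the training code below (patience=5 is a hypothetical value); note that this scheduler steps on the monitored validation loss, and the scheduler.get_last_lr() call inside run_epoch may need to be replaced with optimizer.param_groups[0]["lr"]:

# optional sketch: swapping StepLR for ReduceLROnPlateau (not part of the main tutorial flow)
scheduler = optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.1, patience=5)

for epoch in range(config["training"]["num_epoch"]):
    loss_train, lr_train = run_epoch(train_dataloader, is_training=True)
    loss_val, lr_val = run_epoch(val_dataloader)
    scheduler.step(loss_val)  # reduce the learning rate when the validation loss stops improving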

Using mean squared error as the loss function, we calculate the loss on both the training and validation sets to track how well the model is doing on each. After every epoch, a smaller loss value indicates that the model is learning, and 0.0 would mean that no mistakes were made. From the console logs, loss train gives an idea of how well the model is learning, while loss test shows how well the model generalizes to the validation dataset. A well-trained model is identified by training and validation losses that decrease to the point of stability, with a relatively small gap between the two final loss values (at this stage, we say the model has "converged"). Generally, the model's loss will be lower on the training set than on the validation set.

Append the following code block to your project.py file and re-run the file to start the model training process.

def run_epoch(dataloader, is_training=False):
    epoch_loss = 0

    if is_training:
        model.train()
    else:
        model.eval()

    for idx, (x, y) in enumerate(dataloader):
        if is_training:
            optimizer.zero_grad()

        batchsize = x.shape[0]

        x = x.to(config["training"]["device"])
        y = y.to(config["training"]["device"])

        out = model(x)
        loss = criterion(out.contiguous(), y.contiguous())

        if is_training:
            loss.backward()
            optimizer.step()

        epoch_loss += (loss.detach().item() / batchsize)

    lr = scheduler.get_last_lr()[0]

    return epoch_loss, lr

train_dataloader = DataLoader(dataset_train, batch_size=config["training"]["batch_size"], shuffle=True)
val_dataloader = DataLoader(dataset_val, batch_size=config["training"]["batch_size"], shuffle=True)

model = LSTMModel(input_size=config["model"]["input_size"], hidden_layer_size=config["model"]["lstm_size"], num_layers=config["model"]["num_lstm_layers"], output_size=1, dropout=config["model"]["dropout"])
model = model.to(config["training"]["device"])

criterion = nn.MSELoss()
optimizer = optim.Adam(model.parameters(), lr=config["training"]["learning_rate"], betas=(0.9, 0.98), eps=1e-9)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=config["training"]["scheduler_step_size"], gamma=0.1)

for epoch in range(config["training"]["num_epoch"]):
    loss_train, lr_train = run_epoch(train_dataloader, is_training=True)
    loss_val, lr_val = run_epoch(val_dataloader)
    scheduler.step()
    
    print('Epoch[{}/{}] | loss train:{:.6f}, test:{:.6f} | lr:{:.6f}'
              .format(epoch+1, config["training"]["num_epoch"], loss_train, loss_val, lr_train))

Console showing the loss and learning rate during training:

Epoch[1/100] | loss train:0.063952, test:0.001398 | lr:0.010000
Epoch[2/100] | loss train:0.011749, test:0.002024 | lr:0.010000
Epoch[3/100] | loss train:0.009831, test:0.001156 | lr:0.010000
Epoch[4/100] | loss train:0.008264, test:0.001022 | lr:0.010000
Epoch[5/100] | loss train:0.009385, test:0.001015 | lr:0.010000
Epoch[6/100] | loss train:0.008516, test:0.003903 | lr:0.010000
Epoch[7/100] | loss train:0.008886, test:0.001576 | lr:0.010000
Epoch[8/100] | loss train:0.007902, test:0.001232 | lr:0.010000
Epoch[9/100] | loss train:0.007905, test:0.001029 | lr:0.010000
Epoch[10/100] | loss train:0.007072, test:0.000994 | lr:0.010000
Epoch[11/100] | loss train:0.008044, test:0.001322 | lr:0.010000
Epoch[12/100] | loss train:0.008069, test:0.000992 | lr:0.010000
Epoch[13/100] | loss train:0.007636, test:0.001639 | lr:0.010000
Epoch[14/100] | loss train:0.007856, test:0.001927 | lr:0.010000
Epoch[15/100] | loss train:0.007416, test:0.001013 | lr:0.010000
Epoch[16/100] | loss train:0.008047, test:0.002594 | lr:0.010000
Epoch[17/100] | loss train:0.008144, test:0.002044 | lr:0.010000
Epoch[18/100] | loss train:0.007806, test:0.001147 | lr:0.010000
Epoch[19/100] | loss train:0.007478, test:0.002960 | lr:0.010000
Epoch[20/100] | loss train:0.007797, test:0.001557 | lr:0.010000
Epoch[21/100] | loss train:0.008145, test:0.001245 | lr:0.010000
Epoch[22/100] | loss train:0.007863, test:0.001053 | lr:0.010000
Epoch[23/100] | loss train:0.007435, test:0.001079 | lr:0.010000
Epoch[24/100] | loss train:0.007417, test:0.001010 | lr:0.010000
Epoch[25/100] | loss train:0.007635, test:0.001027 | lr:0.010000
Epoch[26/100] | loss train:0.007400, test:0.001297 | lr:0.010000
Epoch[27/100] | loss train:0.007509, test:0.001041 | lr:0.010000
Epoch[28/100] | loss train:0.008337, test:0.001614 | lr:0.010000
Epoch[29/100] | loss train:0.007826, test:0.001279 | lr:0.010000
Epoch[30/100] | loss train:0.007793, test:0.001495 | lr:0.010000
Epoch[31/100] | loss train:0.007048, test:0.001033 | lr:0.010000
Epoch[32/100] | loss train:0.008357, test:0.001570 | lr:0.010000
Epoch[33/100] | loss train:0.007990, test:0.001980 | lr:0.010000
Epoch[34/100] | loss train:0.007471, test:0.001482 | lr:0.010000
Epoch[35/100] | loss train:0.007357, test:0.001051 | lr:0.010000
Epoch[36/100] | loss train:0.006991, test:0.001138 | lr:0.010000
Epoch[37/100] | loss train:0.007270, test:0.001125 | lr:0.010000
Epoch[38/100] | loss train:0.007387, test:0.000984 | lr:0.010000
Epoch[39/100] | loss train:0.006867, test:0.001046 | lr:0.010000
Epoch[40/100] | loss train:0.007030, test:0.001595 | lr:0.010000
Epoch[41/100] | loss train:0.006688, test:0.001046 | lr:0.001000
Epoch[42/100] | loss train:0.006341, test:0.000997 | lr:0.001000
Epoch[43/100] | loss train:0.006123, test:0.000978 | lr:0.001000
Epoch[44/100] | loss train:0.006261, test:0.000995 | lr:0.001000
Epoch[45/100] | loss train:0.005989, test:0.001014 | lr:0.001000
Epoch[46/100] | loss train:0.005994, test:0.001145 | lr:0.001000
Epoch[47/100] | loss train:0.006461, test:0.000987 | lr:0.001000
Epoch[48/100] | loss train:0.006413, test:0.000999 | lr:0.001000
Epoch[49/100] | loss train:0.006014, test:0.000994 | lr:0.001000
Epoch[50/100] | loss train:0.006043, test:0.001018 | lr:0.001000
Epoch[51/100] | loss train:0.005996, test:0.000981 | lr:0.001000
Epoch[52/100] | loss train:0.006212, test:0.001185 | lr:0.001000
Epoch[53/100] | loss train:0.006276, test:0.000977 | lr:0.001000
Epoch[54/100] | loss train:0.006167, test:0.001004 | lr:0.001000
Epoch[55/100] | loss train:0.006218, test:0.001005 | lr:0.001000
Epoch[56/100] | loss train:0.006108, test:0.001009 | lr:0.001000
Epoch[57/100] | loss train:0.006190, test:0.000995 | lr:0.001000
Epoch[58/100] | loss train:0.005824, test:0.001067 | lr:0.001000
Epoch[59/100] | loss train:0.006158, test:0.001027 | lr:0.001000
Epoch[60/100] | loss train:0.005736, test:0.000982 | lr:0.001000
Epoch[61/100] | loss train:0.006156, test:0.001029 | lr:0.001000
Epoch[62/100] | loss train:0.006316, test:0.001019 | lr:0.001000
Epoch[63/100] | loss train:0.006164, test:0.001000 | lr:0.001000
Epoch[64/100] | loss train:0.005888, test:0.001150 | lr:0.001000
Epoch[65/100] | loss train:0.006050, test:0.000976 | lr:0.001000
Epoch[66/100] | loss train:0.006155, test:0.000984 | lr:0.001000
Epoch[67/100] | loss train:0.006651, test:0.001187 | lr:0.001000
Epoch[68/100] | loss train:0.006552, test:0.001027 | lr:0.001000
Epoch[69/100] | loss train:0.006071, test:0.000986 | lr:0.001000
Epoch[70/100] | loss train:0.006078, test:0.000993 | lr:0.001000
Epoch[71/100] | loss train:0.006517, test:0.001004 | lr:0.001000
Epoch[72/100] | loss train:0.006313, test:0.000989 | lr:0.001000
Epoch[73/100] | loss train:0.006478, test:0.000990 | lr:0.001000
Epoch[74/100] | loss train:0.005913, test:0.000989 | lr:0.001000
Epoch[75/100] | loss train:0.006185, test:0.000995 | lr:0.001000
Epoch[76/100] | loss train:0.006031, test:0.000975 | lr:0.001000
Epoch[77/100] | loss train:0.006308, test:0.000994 | lr:0.001000
Epoch[78/100] | loss train:0.006097, test:0.001066 | lr:0.001000
Epoch[79/100] | loss train:0.006468, test:0.001050 | lr:0.001000
Epoch[80/100] | loss train:0.006486, test:0.001007 | lr:0.001000
Epoch[81/100] | loss train:0.006180, test:0.000985 | lr:0.000100
Epoch[82/100] | loss train:0.006095, test:0.000979 | lr:0.000100
Epoch[83/100] | loss train:0.006220, test:0.000981 | lr:0.000100
Epoch[84/100] | loss train:0.006348, test:0.000986 | lr:0.000100
Epoch[85/100] | loss train:0.006135, test:0.000978 | lr:0.000100
Epoch[86/100] | loss train:0.005958, test:0.000975 | lr:0.000100
Epoch[87/100] | loss train:0.006027, test:0.000996 | lr:0.000100
Epoch[88/100] | loss train:0.006102, test:0.000966 | lr:0.000100
Epoch[89/100] | loss train:0.005950, test:0.000970 | lr:0.000100
Epoch[90/100] | loss train:0.005957, test:0.000970 | lr:0.000100
Epoch[91/100] | loss train:0.006242, test:0.000995 | lr:0.000100
Epoch[92/100] | loss train:0.005978, test:0.000968 | lr:0.000100
Epoch[93/100] | loss train:0.005997, test:0.000999 | lr:0.000100
Epoch[94/100] | loss train:0.006033, test:0.000981 | lr:0.000100
Epoch[95/100] | loss train:0.005783, test:0.000975 | lr:0.000100
Epoch[96/100] | loss train:0.006214, test:0.001000 | lr:0.000100
Epoch[97/100] | loss train:0.006143, test:0.000972 | lr:0.000100
Epoch[98/100] | loss train:0.006267, test:0.000974 | lr:0.000100
Epoch[99/100] | loss train:0.006168, test:0.000985 | lr:0.000100
Epoch[100/100] | loss train:0.006102, test:0.000972 | lr:0.000100

Model evaluation

To visually inspect our model's performance, we will use the newly trained model to make predictions on the training and validation datasets we've created earlier in this project. If we see that the model can predict values that closely mirror the training dataset, it shows that the model managed to memorize the data. And if the model can predict values that resemble the validation dataset, it has managed to learn the patterns in our sequential data and generalize the patterns to unseen data points.

[Figure: actual vs. predicted prices]

Append the following code block to your project.py file. Re-running the file should generate a graph similar to the figure above.

# re-initialize the dataloaders without shuffling, so the predictions can be plotted in date order

train_dataloader = DataLoader(dataset_train, batch_size=config["training"]["batch_size"], shuffle=False)
val_dataloader = DataLoader(dataset_val, batch_size=config["training"]["batch_size"], shuffle=False)

model.eval()

# predict on the training data, to see how well the model managed to learn and memorize

predicted_train = np.array([])

for idx, (x, y) in enumerate(train_dataloader):
    x = x.to(config["training"]["device"])
    out = model(x)
    out = out.cpu().detach().numpy()
    predicted_train = np.concatenate((predicted_train, out))

# predict on the validation data, to see how the model does

predicted_val = np.array([])

for idx, (x, y) in enumerate(val_dataloader):
    x = x.to(config["training"]["device"])
    out = model(x)
    out = out.cpu().detach().numpy()
    predicted_val = np.concatenate((predicted_val, out))

# prepare data for plotting

to_plot_data_y_train_pred = np.zeros(num_data_points)
to_plot_data_y_val_pred = np.zeros(num_data_points)

to_plot_data_y_train_pred[config["data"]["window_size"]:split_index+config["data"]["window_size"]] = scaler.inverse_transform(predicted_train)
to_plot_data_y_val_pred[split_index+config["data"]["window_size"]:] = scaler.inverse_transform(predicted_val)

to_plot_data_y_train_pred = np.where(to_plot_data_y_train_pred == 0, None, to_plot_data_y_train_pred)
to_plot_data_y_val_pred = np.where(to_plot_data_y_val_pred == 0, None, to_plot_data_y_val_pred)

# plots

fig = figure(figsize=(25, 5), dpi=80)
fig.patch.set_facecolor((1.0, 1.0, 1.0))
plt.plot(data_date, data_close_price, label="Actual prices", color=config["plots"]["color_actual"])
plt.plot(data_date, to_plot_data_y_train_pred, label="Predicted prices (train)", color=config["plots"]["color_pred_train"])
plt.plot(data_date, to_plot_data_y_val_pred, label="Predicted prices (validation)", color=config["plots"]["color_pred_val"])
plt.title("Compare predicted prices to actual prices")
xticks = [data_date[i] if ((i%config["plots"]["xticks_interval"]==0 and (num_data_points-i) > config["plots"]["xticks_interval"]) or i==num_data_points-1) else None for i in range(num_data_points)] # make x ticks nice
x = np.arange(0,len(xticks))
plt.xticks(x, xticks, rotation='vertical')
plt.grid(visible=None, which='major', axis='y', linestyle='--')
plt.legend()
plt.show()

From our results, we can see that the model has managed to learn and predict on both training (green) and validation (blue) datasets very well, as the Predicted prices lines significantly overlap with the Actual prices values.

Let's zoom into the chart and look closely at the blue Predicted price (validation) segment by comparing it against the actual price values.

[Figure: predicted validation prices, zoomed in]

Append the following code block to your project.py file and re-run the script.

# prepare data for plotting the zoomed in view of the predicted prices (on validation set) vs. actual prices

to_plot_data_y_val_subset = scaler.inverse_transform(data_y_val)
to_plot_predicted_val = scaler.inverse_transform(predicted_val)
to_plot_data_date = data_date[split_index+config["data"]["window_size"]:]

# plots

fig = figure(figsize=(25, 5), dpi=80)
fig.patch.set_facecolor((1.0, 1.0, 1.0))
plt.plot(to_plot_data_date, to_plot_data_y_val_subset, label="Actual prices", color=config["plots"]["color_actual"])
plt.plot(to_plot_data_date, to_plot_predicted_val, label="Predicted prices (validation)", color=config["plots"]["color_pred_val"])
plt.title("Zoom in to examine predicted price on validation data portion")
xticks = [to_plot_data_date[i] if ((i%int(config["plots"]["xticks_interval"]/5)==0 and (len(to_plot_data_date)-i) > config["plots"]["xticks_interval"]/6) or i==len(to_plot_data_date)-1) else None for i in range(len(to_plot_data_date))] # make x ticks nice
xs = np.arange(0,len(xticks))
plt.xticks(xs, xticks, rotation='vertical')
plt.grid(visible=None, which='major', axis='y', linestyle='--')
plt.legend()
plt.show()

What a beautiful graph! You can see that the predicted prices (blue) significantly overlap with the actual prices (black) of IBM.

It is also worth noting that model training & evaluation is an iterative process. Please feel free to go back to the "model training" step to fine-tune the model and re-evaluate the model to see if there is a further performance boost.

Predicting future stock prices

By now, we have trained an LSTM model that can (fairly accurately) predict the next day's price based on the past 20 days' close prices. This means we now have a crystal ball in hand! Let's supply the past 20 days' close prices to the model and see what it predicts for the next trading day (i.e., the future!). Append the following code to your project.py file and re-run the script one last time.

[Figure: predicted close price for the next trading day]

# predict the closing price of the next trading day

model.eval()

x = torch.tensor(data_x_unseen).float().to(config["training"]["device"]).unsqueeze(0).unsqueeze(2) # this is the data type and shape required, [batch, sequence, feature]
prediction = model(x)
prediction = prediction.cpu().detach().numpy()

# prepare plots

plot_range = 10
to_plot_data_y_val = np.zeros(plot_range)
to_plot_data_y_val_pred = np.zeros(plot_range)
to_plot_data_y_test_pred = np.zeros(plot_range)

to_plot_data_y_val[:plot_range-1] = scaler.inverse_transform(data_y_val)[-plot_range+1:]
to_plot_data_y_val_pred[:plot_range-1] = scaler.inverse_transform(predicted_val)[-plot_range+1:]

to_plot_data_y_test_pred[plot_range-1] = scaler.inverse_transform(prediction)

to_plot_data_y_val = np.where(to_plot_data_y_val == 0, None, to_plot_data_y_val)
to_plot_data_y_val_pred = np.where(to_plot_data_y_val_pred == 0, None, to_plot_data_y_val_pred)
to_plot_data_y_test_pred = np.where(to_plot_data_y_test_pred == 0, None, to_plot_data_y_test_pred)

# plot

plot_date_test = data_date[-plot_range+1:]
plot_date_test.append("tomorrow")

fig = figure(figsize=(25, 5), dpi=80)
fig.patch.set_facecolor((1.0, 1.0, 1.0))
plt.plot(plot_date_test, to_plot_data_y_val, label="Actual prices", marker=".", markersize=10, color=config["plots"]["color_actual"])
plt.plot(plot_date_test, to_plot_data_y_val_pred, label="Past predicted prices", marker=".", markersize=10, color=config["plots"]["color_pred_val"])
plt.plot(plot_date_test, to_plot_data_y_test_pred, label="Predicted price for next day", marker=".", markersize=20, color=config["plots"]["color_pred_test"])
plt.title("Predicted close price of the next trading day")
plt.grid(visible=None, which='major', axis='y', linestyle='--')
plt.legend()
plt.show()

print("Predicted close price of the next trading day:", round(to_plot_data_y_test_pred[plot_range-1], 2))

The red dot in the graph is what our model predicts for IBM's close price on the next trading day. Is the prediction good enough? How about other stocks such as TSLA, AAPL, or the hugely popular GameStop stock (GME)? What about other asset classes such as forex or cryptocurrencies? Beyond the close prices, is there any other external data we can feed to the LSTM model to make it even more robust - for example, one of the 50+ technical indicators from the Alpha Vantage APIs?
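As a starting point for the technical-indicator idea, here is a hedged sketch of pulling a 20-day simple moving average through the same alpha_vantage wrapper; the TechIndicators class and get_sma parameters are shown to the best of our knowledge, so please double-check the wrapper's documentation (and note that some indicator endpoints may require a premium API key):

# optional sketch: fetching a 20-day SMA as a candidate extra feature
from alpha_vantage.techindicators import TechIndicators

ti = TechIndicators(key=config["alpha_vantage"]["key"])
sma_data, sma_meta = ti.get_sma(symbol=config["alpha_vantage"]["symbol"], interval="daily", time_period=20, series_type="close")
# sma_data maps dates to {"SMA": value}; it could be aligned with data_date and stacked alongside the close prices as a second input feature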

We will now pass the baton to you, our fearless reader!

Disclaimer: This content is for educational purposes only and is NOT investment advice.

For Jupyter Notebook Users:

Running the codes on Google Colab

Colab is a free hosted Jupyter notebook environment with access to a GPU; you can examine and run the code here.

Running the codes on your local Jupyter Notebook

Download the files with:

git clone https://github.com/jinglescode/time-series-forecasting-pytorch.git

Then, you can install all the packages used in this project with:

pip install -r requirements.txt

time-series-forecasting-pytorch's People

Contributors

alphavantagesupport, jinglescode


time-series-forecasting-pytorch's Issues

free API key not sufficient to download the price chart?

Hello,

I'm a student from the Technical University of Munich and I would like to try out this code for educational reasons. With my free API key I get this error:

ValueError: Thank you for using Alpha Vantage! This is a premium endpoint and there are multiple ways to unlock premium endpoints: (1) become a holder of Alpha Vantage Coin (AVC), an Ethereum-based cryptocurrency that provides various utility & governance functions within the Alpha Vantage ecosystem (AVC mining guide: https://www.alphatournament.com/avc_mining_guide/) to unlock all premium endpoints, (2) subscribe to any of the premium plans at https://www.alphavantage.co/premium/ to instantly unlock all premium endpoints

Since I'm interested more in learning than in using it for real investment, does Alpha Vantage also provide randomised data for feeding the LSTM model?
It would really help, as it would take some time and understanding to adapt other free data sources to match exactly the structure of the price chart data downloaded from the Alpha Vantage API.

thank you!

Additional features

Hello,

Please show how to add additional features (such as a technical indicator) to the script to help in prediction.

Is it possible to test with another stock than IBM ?

If I try Amazon (AMZ) in config, I get the following error message:

Traceback (most recent call last):
File "c:\daten\Python\boerse_deep_learning\boerse_deep_learning.py", line 72, in
data_date, data_close_price, num_data_points, display_date_range = download_data(config)
File "c:\daten\Python\boerse_deep_learning\boerse_deep_learning.py", line 54, in download_data
data, meta_data = ts.get_daily_adjusted(config["alpha_vantage"]["symbol"],
call_response, data_key, meta_data_key = func(
File "C:\daten\programme\python\python310\lib\site-packages\alpha_vantage\alphavantage.py", line 160, in _call_wrapper
return self._handle_api_call(url), data_key, meta_data_key
File "C:\daten\programme\python\python310\lib\site-packages\alpha_vantage\alphavantage.py", line 359, in _handle_api_call
raise ValueError(json_response["Error Message"])
ValueError: Invalid API call. Please retry or visit the documentation (https://www.alphavantage.co/documentation/) for TIME_SERIES_DAILY_ADJUSTED.

Regards
Lu

It seems like the predicted price is off by 1 day on IBM, or maybe I misunderstood something

Hi,
I was very impressed with the prediction graph for IBM, but then I noticed that the prediction seems to trail the actual data by one day, so I decided to print out a piece of the data with this:

print("Actual/Predicted prices:")
for i in range(len(data_close_price)):
    print(str(data_close_price[i]) + '\t ' + str(to_plot_data_y_train_pred[i]) + '\t' + str(to_plot_data_y_val_pred[i]))

and here is a piece of output ( i grabbed a piece where the stock went up to make it clear):
data_close_price to_plot_data_y_train_pred to_plot_data_y_val_pred
61.5541637158 62.48051591375582 None
63.1859297539 61.81105258680695 None
65.5087283847 63.22269362984531 None
65.2532205353 65.42579108472087 None
70.2646585813 65.24432286147622 None
69.5387840091 70.2044966634442 None

Let's check the 2nd row: data_close_price shows 63.1859 while the predicted price is 61.8110,
and then on the next day data_close_price is 65.508 and the prediction is 63.222,
so it seems that the predicted price is one day behind???
Please advise, maybe I don't understand it.

Lookahead Bias

The Normalizer class computes the mean and standard deviation over the full data set.

Doesn't this introduce lookahead bias in the model essentially feeding information about future stock prices into you past price data?

is this model any good?

If I were to build a model that predicts yesterday's closing price as today's closing price, we would also get a similar graph, but the model would be useless. Where are the prediction accuracy numbers? I bet they would be terrible.

Small bug

All cases of:
plt.grid(b=None, which='major', axis='y', linestyle='--')
should be replaced by:
plt.grid(visible=None, which='major', axis='y', linestyle='--')

Predicted price is far different from actual price

Hi Hong,

I tried the project with IBM, and the result is amazing. However, when I tried other stocks, e.g. AAPL, the predicted price is much lower than the actual price.
What should I do to make the predicted price closer to the actual price? Which line of code should I look at to make the adjustment?

Thank you,
Mark

Future prediction beyond 1 day

Hi,
I am curious how the code needs to be adapted so that one is able to look e.g. 5 or more days into the future?
Can you help me with this?
Thanks,
Alex

Error keyword grid_b is not recognized

I'm getting this error following your steps

All libraries loaded
Number data points 5868 from 1999-11-01 to 2023-02-27
Traceback (most recent call last):
File "project.py", line 80, in
plt.grid(b=None, which='major', axis='y', linestyle='--')
File "C:\Python38\lib\site-packages\matplotlib\pyplot.py", line 2589, in grid
return gca().grid(visible=visible, which=which, axis=axis, **kwargs)
File "C:\Python38\lib\site-packages\matplotlib\axes_base.py", line 3198, in grid
self.yaxis.grid(visible, which=which, **kwargs)
File "C:\Python38\lib\site-packages\matplotlib\axis.py", line 1655, in grid
self.set_tick_params(which='major', **gridkw)
File "C:\Python38\lib\site-packages\matplotlib\axis.py", line 927, in set_tick_params
kwtrans = self._translate_tick_params(kwargs)
File "C:\Python38\lib\site-packages\matplotlib\axis.py", line 1071, in _translate_tick_params
raise ValueError(
ValueError: keyword grid_b is not recognized; valid keywords are ['size', 'width', 'color', 'tickdir', 'pad', 'labelsize', 'labelcolor', 'zorder', 'gridOn', 'tick1On', 'tick2On', 'label1On', 'label2On', 'length', 'direction', 'left', 'bottom', 'right', 'top', 'labelleft', 'labelbottom', 'labelright', 'labeltop', 'labelrotation', 'grid_agg_filter', 'grid_alpha', 'grid_animated', 'grid_antialiased', 'grid_clip_box', 'grid_clip_on', 'grid_clip_path', 'grid_color', 'grid_dash_capstyle', 'grid_dash_joinstyle', 'grid_dashes', 'grid_data', 'grid_drawstyle', 'grid_figure', 'grid_fillstyle', 'grid_gapcolor', 'grid_gid', 'grid_in_layout', 'grid_label', 'grid_linestyle', 'grid_linewidth', 'grid_marker', 'grid_markeredgecolor', 'grid_markeredgewidth', 'grid_markerfacecolor', 'grid_markerfacecoloralt', 'grid_markersize', 'grid_markevery', 'grid_mouseover', 'grid_path_effects', 'grid_picker', 'grid_pickradius', 'grid_rasterized', 'grid_sketch_params', 'grid_snap', 'grid_solid_capstyle', 'grid_solid_joinstyle', 'grid_transform', 'grid_url', 'grid_visible', 'grid_xdata', 'grid_ydata', 'grid_zorder', 'grid_aa', 'grid_c', 'grid_ds', 'grid_ls', 'grid_lw', 'grid_mec', 'grid_mew', 'grid_mfc', 'grid_mfcalt', 'grid_ms']

Alternative for get_daily_adjusted?

I have no issue eventually paying for premium endpoints but I am just now learning - your code is MUCH appreciated! - and I would like to avoid paying for learning. Whenever I run the ts.get_daily_adjusted call, I am told it is a premium endpoint - what free endpoint can I use that would not break the rest of your code?

Problems with free api

I am following the documentation to write a project that uses AI to predict prices, but I have a problem with the first stage, as shown in the figure below.

[error screenshot omitted]

According to the error message, am I unable to perform these operations with the free API? Is it because I have to use a premium subscription?

question about normalization

hey

How would you adapt the z-normalization for multiple stocks?

Do you think it would be more suitable to apply the normalization to the 20+1 values before and after entering the NN,

or

would you consider z-normalizing each stock's prices separately?
regards

S
