Trading Bitcoin with Reinforcement Learning

This post describes how to apply a reinforcement learning algorithm to trade Bitcoin. This repository provides an implementation that aims to reproduce its results.

  • BnH

    A buy-and-hold strategy that always holds 2 Bitcoins, starting from the beginning of the test period.

  • RL

    A trained RL agent that decides to hold between 0 and 4 Bitcoins given the current market conditions.

  • MMT

    A momentum strategy that holds 4 Bitcoins when the 30-period SMA crosses over the current closing price and 0 Bitcoins otherwise.
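The two baseline strategies above can be sketched as position series with pandas. This is a minimal illustration, not the repository's code: the function names are hypothetical, and the momentum rule assumes that the cross-over condition means the closing price is above its 30-period SMA.

```python
import pandas as pd


def bnh_positions(close: pd.Series, size: int = 2) -> pd.Series:
    """Buy-and-hold: a constant position for the whole test period."""
    return pd.Series(size, index=close.index)


def mmt_positions(close: pd.Series, window: int = 30, size: int = 4) -> pd.Series:
    """Momentum: hold `size` coins while close > SMA(window), else 0.

    Assumption: the position stays flat until the SMA has a full window.
    """
    sma = close.rolling(window).mean()
    # Comparisons against NaN are False, so the warm-up period stays at 0.
    return (close > sma).astype(int) * size


# Toy prices that rise steadily, so the close sits above its trailing average.
prices = pd.Series([float(p) for p in range(100, 140)])
bnh = bnh_positions(prices)
mmt = mmt_positions(prices)
```

With these toy prices the momentum position is 0 for the first 29 bars (no full SMA window yet) and 4 afterwards.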

Dependencies

  • Python 3.6
  • NumPy 1.17.1
  • Pandas 0.25.1
  • Matplotlib 3.1.1
  • PyTorch 1.2.0 (CPU only)

Data

The minute-by-minute data is downloaded from Kaggle. I resample it into 15-minute intervals and compute all the features we need. Then I save the two dataframes under bitcoin-historical-data.
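A resampling step of this kind might look like the sketch below. It is an assumption-laden illustration rather than the code in the Data class: the column names (Open, High, Low, Close, Volume) and the function name are hypothetical, and the feature computation is not shown.

```python
import numpy as np
import pandas as pd


def resample_ohlcv(df: pd.DataFrame, rule: str = "15min") -> pd.DataFrame:
    """Aggregate 1-minute OHLCV bars into coarser bars."""
    agg = {
        "Open": "first",
        "High": "max",
        "Low": "min",
        "Close": "last",
        "Volume": "sum",
    }
    # Empty bins have NaN prices, so dropna() removes them.
    return df.resample(rule).agg(agg).dropna()


# Thirty synthetic 1-minute bars starting on a 15-minute boundary.
idx = pd.date_range("2017-01-01 00:00", periods=30, freq="1min")
close = np.arange(30, dtype=float) + 100.0
one_min = pd.DataFrame(
    {"Open": close, "High": close + 1, "Low": close - 1, "Close": close, "Volume": 1.0},
    index=idx,
)
bars = resample_ohlcv(one_min)
```

Here the 30 one-minute rows collapse into two 15-minute bars, each taking the first open, highest high, lowest low, last close, and summed volume of its window.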

Note that:

  • I delete the row indexed 2017-04-15 23:00:00 after resampling since it contains a clear error. This is done in the remove_outlier() method of the Data class.

  • By request, I include the 15-minute data in bitcoin-historical-data. Due to the size constraint on GitHub, I cannot upload the 1-minute data or the feature dataframe generated from the 15-minute data.
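Dropping a single timestamped row, as in the first note, can be sketched as follows. The body of remove_outlier() is not shown in this README, so the helper below is a hypothetical stand-in, and the values in the toy frame are made up.

```python
import pandas as pd


def drop_bad_row(df: pd.DataFrame, when: str = "2017-04-15 23:00:00") -> pd.DataFrame:
    """Return a copy of `df` without the row at timestamp `when`, if present."""
    # errors="ignore" makes the call a no-op when the timestamp is absent.
    return df.drop(index=pd.Timestamp(when), errors="ignore")


# Toy 15-minute frame: 22:30, 22:45, 23:00, 23:15 (values are placeholders).
idx = pd.date_range("2017-04-15 22:30", periods=4, freq="15min")
frame = pd.DataFrame({"Close": [1.0, 2.0, 999.0, 3.0]}, index=idx)
cleaned = drop_bad_row(frame)
```

Calling the helper a second time on `cleaned` changes nothing, which makes it safe to run on data that has already been cleaned.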

How to run

# E.g. clone to local (say to Downloads)
cd ~/Downloads/trading-bitcoin-with-reinforcement-learning/

# Usage: python main.py <path-to-one-minute-data>
# If no argument is provided, the default file path
# './bitcoin-historical-data/coinbaseUSD_1-min_data.csv' is used
python main.py ./bitcoin-historical-data/coinbaseUSD_1-min_data.csv
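The default-path behaviour described in the usage comment could be implemented along these lines. This is only a sketch of the idea; main.py's actual argument handling is not shown here, and `resolve_data_path` is a hypothetical name.

```python
import sys

DEFAULT_PATH = "./bitcoin-historical-data/coinbaseUSD_1-min_data.csv"


def resolve_data_path(argv):
    """Return the first command-line argument, or the default path if none is given."""
    return argv[1] if len(argv) > 1 else DEFAULT_PATH


# In a script this would be called as resolve_data_path(sys.argv).
path_without_arg = resolve_data_path(["main.py"])
path_with_arg = resolve_data_path(["main.py", "./my-data.csv"])
```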

Note: I observed substantial variability in the test results, so the equity curve you get may not exactly match mine.

trading-bitcoin-with-reinforcement-learning's Issues

Zero Cum log returns

Thank you for sharing this repo and pytorch implementation of a trading example.

I encountered a problem: the cumulative log returns are zero after the model was trained with your original code.
Could you point out what might be wrong?
Thank you for considering my question and request.

Why zero/change init of model?

Hi there, another question: during the actor's initialization, you use a normal init for the hidden layer and a zero-weight, one-bias init for the fc layer. Why is this?

    nn.init.normal(self.hidden[0].weight.data,
                   mean=0., std=math.sqrt(2 / self.hidden[0].in_features))
    nn.init.constant(self.hidden[0].bias.data, 0.)

    # zeroing output layer
    nn.init.constant(self.out.weight.data, 0.)
    nn.init.constant(self.out.bias.data, 1.)
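One observable effect of this initialization can be checked numerically. The sketch below is a NumPy stand-in, not the repository's model: the layer sizes, the ReLU activation, and the softmax over 5 actions (positions 0 to 4) are all assumptions. With zero output weights and equal biases, every action starts with the same probability regardless of the input. Whether that was the author's motivation is not stated in this thread.

```python
import math

import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_actions = 8, 16, 5  # hypothetical sizes

# Hidden layer: normal init with std = sqrt(2 / fan_in), zero bias.
w_hidden = rng.normal(0.0, math.sqrt(2 / n_in), size=(n_hidden, n_in))
b_hidden = np.zeros(n_hidden)

# Output layer: zero weights, constant bias of 1.
w_out = np.zeros((n_actions, n_hidden))
b_out = np.ones(n_actions)


def policy(x: np.ndarray) -> np.ndarray:
    """ReLU hidden layer followed by a softmax over actions (assumed architecture)."""
    h = np.maximum(w_hidden @ x + b_hidden, 0.0)
    logits = w_out @ h + b_out
    z = np.exp(logits - logits.max())  # subtract the max for numerical stability
    return z / z.sum()


probs = policy(rng.normal(size=n_in))
```

As a side note, in current PyTorch the in-place initializers are spelled `nn.init.normal_` and `nn.init.constant_`; the names without the trailing underscore are deprecated.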

Where to tweak num of bitcoins?

As the Bitcoin price has gone up so much, buying and holding up to 4 Bitcoins is difficult for retail investors. Can someone help me find the line of code to tweak that number? This would also help apply the algorithm to other coins.

Thank you.

Test data is trained in the model

def roll_out(env, model, train_mode):
    model.eval()

    ret = 0      # episode return
    r_lst = []   # store reward
    p_lst = []   # store price
    P_lst = []   # store position
    buffer = []

    s = env.reset()
    done = False
    while not done:

        # sample action
        A_Pr = model.forward(Variable(s))
        #print('model:', A_Pr.data)
        act = torch.multinomial(A_Pr.data, num_samples=1)
        i_act = act[0][0]
        #print('act:', i_act)

        # apply action
        s_, r, done = env.step(i_act)

        # tracker
        ret += r
        r_lst.append(r)
        p_lst.append(env.curr_OHLCV()[3])
        P_lst.append(i_act)

        # Save transitions
        buffer.append((s, act, r))

        if done: break

        # Swap states
        s = s_

    # Learning when episode finishes
    model.train()

    S, A, R = zip(*buffer)
    del buffer[:]

    S = Variable(torch.cat(S))
    A = Variable(torch.cat(A))

    # Compute target
    Q = []
    ret = 0
    for r in reversed(R):
        ret = r + .9 * ret
        Q.append(ret)
    Q.reverse()

    # standardize Q
    Q = np.array(Q).astype(np.float32)
    Q -= Q.mean()
    Q /= Q.std() + 1e-6
    Q = Q.clip(min=-10, max=10)  # ndarray.clip is not in-place, so keep the result
    Q = np.expand_dims(Q, axis=1)
    Q = Variable(torch.from_numpy(Q))

    # PG update
    if train_mode:  # modified: only update the model in training mode
        A_Pr = model.forward(S).gather(1, A).clamp(min=1e-7, max=1 - 1e-7)

        loss = -(Q * torch.log(A_Pr)).mean()
        model.optim.zero_grad()
        loss.backward()
        model.optim.step()

    model.eval()
    return ret, r_lst, p_lst, P_lst

The code should check whether the mode is training or testing, and update the model only during training.

Why standardize Q values?

In main.py, you standardize the Q-vector (I assume this is the discounted cumulative reward) using its mean and std. Why do you do this? Aren't you normalizing out any notion of positive/negative returns?
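The effect being asked about can be reproduced on a toy reward sequence. The sketch below is a standalone NumPy illustration with hypothetical function names; it reuses the discount factor 0.9, the 1e-6 epsilon, and the clip range of 10 from the roll_out code quoted in the issue above, and it does not speak for the author's intent.

```python
import numpy as np


def discounted_returns(rewards, gamma=0.9):
    """Reward-to-go: G_t = r_t + gamma * G_{t+1}."""
    out, g = [], 0.0
    for r in reversed(rewards):
        g = r + gamma * g
        out.append(g)
    out.reverse()
    return np.array(out, dtype=np.float32)


def standardize(q, eps=1e-6, limit=10.0):
    """Zero-mean, unit-variance returns, clipped to [-limit, limit]."""
    q = (q - q.mean()) / (q.std() + eps)
    return np.clip(q, -limit, limit)


rewards = [1.0, 2.0, 3.0]        # every reward is positive
q = discounted_returns(rewards)  # approximately [5.23, 4.7, 3.0]
q_std = standardize(q)
```

Even though every reward is positive, the last standardized value is negative. Subtracting the mean acts like a baseline in the policy-gradient update: actions are pushed up or down according to whether their return was above or below the episode average, so the sign of the raw return is indeed not preserved.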
