
TradingGym


TradingGym is a toolkit for training and backtesting reinforcement learning algorithms. It was inspired by OpenAI Gym and follows a similar framework design. Beyond the training environment, it also provides a backtesting environment, and a real-time trading environment built on the Interactive Brokers API is planned for the future.

The training environment was originally designed for tick data, but the OHLC data format is also supported (work in progress).

Installation

git clone https://github.com/Yvictor/TradingGym.git
cd TradingGym
python setup.py install
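
Note: on newer Python toolchains, running setup.py install directly is deprecated; installing with pip from the cloned directory should be equivalent:

pip install .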

Getting Started

import random
import numpy as np
import pandas as pd
import trading_env

df = pd.read_hdf('dataset/SGXTW.h5', 'STW')

env = trading_env.make(env_id='training_v1', obs_data_len=256, step_len=128,
                       df=df, fee=0.1, max_position=5, deal_col_name='Price', 
                       feature_names=['Price', 'Volume', 
                                      'Ask_price','Bid_price', 
                                      'Ask_deal_vol','Bid_deal_vol',
                                      'Bid/Ask_deal', 'Updown'])

env.reset()
env.render()

state, reward, done, info = env.step(random.randrange(3))

### randomly choose actions and show the transaction details
for i in range(500):
    print(i)
    state, reward, done, info = env.step(random.randrange(3))
    print(state, reward)
    env.render()
    if done:
        break
env.transaction_details
  • obs_data_len: observation data length
  • step_len: each call to step advances the rolling window by step_len
  • df example:
   datetime             bid     ask     price   volume  serial_number  dealin
0  2010-05-25 08:45:00  7188.0  7188.0  7188.0   527.0            0.0     0.0
1  2010-05-25 08:45:00  7188.0  7189.0  7189.0     1.0            1.0     1.0
2  2010-05-25 08:45:00  7188.0  7189.0  7188.0     1.0            2.0    -1.0
3  2010-05-25 08:45:00  7188.0  7189.0  7188.0     4.0            3.0    -1.0
4  2010-05-25 08:45:00  7188.0  7189.0  7188.0     2.0            4.0    -1.0
  • df: the DataFrame that contains the data for trading

serial_number -> serial number of each deal, reset at the start of each day

  • fee: the fee paid on each deal; set it according to your product.
  • max_position: the maximum market position (in shares) you may hold.
  • deal_col_name: the name of the column used to calculate the reward.
  • feature_names: a list of the feature columns to include in the trading state.
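
For reference, here is a minimal, purely illustrative sketch that builds a toy tick-data DataFrame with the columns used in the example above (all values are synthetic):

import numpy as np
import pandas as pd

# synthetic tick data; the column names follow the example above
n = 1024
price = 7188 + np.cumsum(np.random.randn(n)).round(1)
df = pd.DataFrame({
    'datetime': pd.date_range('2010-05-25 08:45:00', periods=n, freq='s'),
    'Price': price,
    'Volume': np.random.randint(1, 100, n).astype(float),
    'Ask_price': price + 1.0,
    'Bid_price': price - 1.0,
    'Ask_deal_vol': np.random.randint(0, 50, n).astype(float),
    'Bid_deal_vol': np.random.randint(0, 50, n).astype(float),
    'Bid/Ask_deal': np.random.choice([-1.0, 1.0], n),
    'Updown': np.random.choice([-1.0, 0.0, 1.0], n),
    'serial_number': np.arange(n, dtype=float),
})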


Training

Simple DQN

  • WIP

Policy Gradient

  • WIP

Actor-Critic

  • WIP

A3C with RNN

  • WIP

Backtesting

  • load the env just like in training
env = trading_env.make(env_id='backtest_v1', obs_data_len=1024, step_len=512,
                       df=df, fee=0.1, max_position=5, deal_col_name='Price',
                       feature_names=['Price', 'Volume',
                                      'Ask_price','Bid_price',
                                      'Ask_deal_vol','Bid_deal_vol',
                                      'Bid/Ask_deal', 'Updown'])
  • load your own agent
class YourAgent:
    def __init__(self):
        # build your network and so on
        pass
    def choice_action(self, state):
        ## your rule-based condition, your max Q-value action, or your policy-gradient action
        # action=0 -> do nothing
        # action=1 -> buy 1 share
        # action=2 -> sell 1 share
        ## in this testing case we just build a simple random policy
        return np.random.randint(3)
  • start to backtest
agent = YourAgent()

transactions = []
while not env.backtest_done:
    state = env.backtest()
    done = False
    while not done:
        state, reward, done, info = env.step(agent.choice_action(state))
        #print(state, reward)
        #env.render()
        if done:
            transactions.append(info)
            break
transaction = pd.concat(transactions)
transaction
     step    datetime             transact  transact_type  price  share  price_mean  position   reward_fluc        reward  reward_sum  color  rotation
2      1537  2013-04-09 10:58:45  Buy       new            277.1    1.0  277.100000       1.0  0.000000e+00  0.000000e+00    0.000000      1         1
5      3073  2013-04-09 11:47:26  Sell      cover          276.8   -1.0  277.100000       0.0 -4.000000e-01 -4.000000e-01   -0.400000      2         2
10     5633  2013-04-09 13:23:40  Sell      new            276.9   -1.0  276.900000      -1.0  0.000000e+00  0.000000e+00   -0.400000      2         1
11     6145  2013-04-09 13:30:36  Sell      new            276.7   -1.0  276.800000      -2.0  1.000000e-01  0.000000e+00   -0.400000      2         1
...     ...  ...                  ...       ...              ...    ...         ...       ...           ...           ...         ...    ...       ...
211  108545  2013-04-19 13:18:32  Sell      new            286.7   -1.0  286.525000      -2.0 -4.500000e-01  0.000000e+00   30.650000      2         1
216  111105  2013-04-19 16:02:01  Sell      new            289.2   -1.0  287.416667      -3.0 -5.550000e+00  0.000000e+00   30.650000      2         1
217  111617  2013-04-19 17:54:29  Sell      new            289.2   -1.0  287.862500      -4.0 -5.650000e+00  0.000000e+00   30.650000      2         1
218  112129  2013-04-19 21:36:21  Sell      new            288.0   -1.0  287.890000      -5.0 -9.500000e-01  0.000000e+00   30.650000      2         1
219  112129  2013-04-19 21:36:21  Buy       cover          288.0    5.0  287.890000       0.0  0.000000e+00 -1.050000e+00   29.600000      1         2

[128 rows × 13 columns]
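
As a usage note, the concatenated log is an ordinary pandas DataFrame, so summarizing it is straightforward; a small sketch, assuming the columns shown above:

# total realized reward across all closed trades
print(transaction['reward'].sum())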

Example of rule-based usage

  • MA crossover and crossunder
env = trading_env.make(env_id='backtest_v1', obs_data_len=10, step_len=1,
                       df=df, fee=0.1, max_position=5, deal_col_name='Price', 
                       feature_names=['Price', 'MA'])
class MaAgent:
    def __init__(self):
        pass

    def choice_action(self, state):
        # state rows are [Price, MA]; compare the last two observations
        if state[-1][0] > state[-1][1] and state[-2][0] <= state[-2][1]:
            return 1  # price crossed above the MA -> buy
        elif state[-1][0] < state[-1][1] and state[-2][0] >= state[-2][1]:
            return 2  # price crossed below the MA -> sell
        else:
            return 0  # no crossover -> do nothing
# then same as above
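
Note that the df passed in must already contain an 'MA' column, since feature_names only selects existing columns. A minimal sketch of preparing it, using a hypothetical 20-period rolling mean of Price:

# hypothetical 20-period moving average used as the 'MA' feature
df['MA'] = df['Price'].rolling(window=20).mean()
df = df.dropna().reset_index(drop=True)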


TradingGym's Issues

Is this a bug or did I miss something?

Hi Yvictor

Thanks for your TradingGym project which is really interesting and helpful.

I'm a bit unclear about two things in trading_env.py.

(1) Will the line [next_index = self.step_st+self.obs_len+1] in the self.step function result in a blank trading day?

Suppose obs_len = 10 and step_len = 5. Then the initial self.obs_res = self.obs_features[0:10] and next_index = self.step_st+self.obs_len+1 = 11, so where is the 10th day's info? Since Python slicing excludes the last element, is this a bug or did I miss something?
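
A minimal sketch of the indexing concern, using a plain list in place of self.obs_features:

features = list(range(20))                 # stand-in for self.obs_features
obs_len, step_st = 10, 0
obs = features[step_st:step_st + obs_len]  # [0, 1, ..., 9]; element 10 is excluded
next_index = step_st + obs_len + 1         # 11, so element 10 falls in the gap
print(obs[-1], next_index)                 # 9 11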

(2) Would it be nicer if the reward value were a percentage return rather than an absolute value?
Something like
self.reward_fluctuant = (self.price_current*self.position_share - self.transaction_details.iloc[-1]['price_mean']*self.position_share - self.fee*abs_pos) / self.transaction_details.iloc[-1]['price_mean']

By the way, I notice that every step returns a reward that is actually a stock value (the cumulative return) rather than a flow value (the interval return). I'm not sure which is more reasonable; would you mind explaining this? It would be really appreciated. Thanks for your code and time.

Have a good day.

step_len is added twice when checking whether it exceeds len(self.price)

self.step_st += self.step_len
# observation part
self.obs_state = self.obs_features[self.step_st: self.step_st+self.obs_len]
self.obs_posi = self.posi_arr[self.step_st: self.step_st+self.obs_len]
# position variation
self.obs_posi_var = self.posi_variation_arr[self.step_st: self.step_st+self.obs_len]
# position entry or cover :new_entry->1 increase->2 cover->-1 decrease->-2
self.obs_posi_entry_cover = self.posi_entry_cover_arr[self.step_st: self.step_st+self.obs_len]
self.obs_price = self.price[self.step_st: self.step_st+self.obs_len]
self.obs_price_mean = self.price_mean_arr[self.step_st: self.step_st+self.obs_len]
self.obs_reward_fluctuant = self.reward_fluctuant_arr[self.step_st: self.step_st+self.obs_len]
self.obs_makereal = self.reward_makereal_arr[self.step_st: self.step_st+self.obs_len]
self.obs_reward = self.reward_arr[self.step_st: self.step_st+self.obs_len]
# change part
self.chg_posi = self.obs_posi[-self.step_len:]
self.chg_posi_var = self.obs_posi_var[-self.step_len:]
self.chg_posi_entry_cover = self.obs_posi_entry_cover[-self.step_len:]
self.chg_price = self.obs_price[-self.step_len:]
self.chg_price_mean = self.obs_price_mean[-self.step_len:]
self.chg_reward_fluctuant = self.obs_reward_fluctuant[-self.step_len:]
self.chg_makereal = self.obs_makereal[-self.step_len:]
self.chg_reward = self.obs_reward[-self.step_len:]
done = False
if self.step_st+self.obs_len+self.step_len >= len(self.price):

Maybe line 207 should be changed from

 if self.step_st+self.obs_len+self.step_len >= len(self.price):

to

 if self.step_st+self.obs_len >= len(self.price):
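
A rough numeric illustration of the difference between the two checks (hypothetical sizes, not taken from the repository):

# hypothetical sizes: 100 prices, window of 10, stride of 5
length, obs_len, step_len = 100, 10, 5
old_stop = next(s for s in range(0, length, step_len)
                if s + obs_len + step_len >= length)
new_stop = next(s for s in range(0, length, step_len)
                if s + obs_len >= length)
print(old_stop, new_stop)  # 85 90 -> the original check stops one window early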

Need Help

Hey, I'm new to GitHub and I saw that you are discussing trading algo software. Could you make me an algo for gold trading for TradingView?

AttributeError: 'trading_env' object has no attribute 'backtest_done'

Hey! Great project, really excited about it but having a small problem.

I can't run the agent: env.backtest was causing a problem, so I added the parameter backtest=1 to the trading_env.make call, and now backtest_done is reported as a missing attribute.

This seems to be a problem with threading, from what I can read online.

Has anyone had success beyond the point of this error? Is it a package version I can downgrade?

Attached my notebook.
trading-gym-andy.zip

No matter what I add to trading_env.make, including backtest_done=0 or 1, it still fails to see it as an attribute.

Error while rendering environment?

Thank you for this; I was looking for something like it. When I try it out, I get an AttributeError like this:

self.ax.lines.remove(self.price_plot[0])
AttributeError: 'trading_env' object has no attribute 'ax'

DataFrame sample

>>> df.head()
   serial_number        Date   Open   High    Low  Close      Volume
0              0  1998-01-02  13.63  16.25  13.50  16.25   6411700.0
1              1  1998-01-05  16.50  16.56  15.19  15.88   5820300.0
2              2  1998-01-06  15.94  20.00  14.75  18.94  16182800.0
3              3  1998-01-07  18.81  19.00  17.31  17.50   9300200.0
4              4  1998-01-08  17.44  18.62  16.94  18.19   6910900.0

I want to use the Volume column alone as my observation state. This is the whole code:

import random
import numpy as np
import pandas as pd
import trading_env

df = pd.read_csv('./dataset/AAPL.csv')

df.rename(columns={'Unnamed: 0':'serial_number'},inplace=True)

env = trading_env.make(env_id='training_v1', obs_data_len=50, step_len=14,
                       df=df, fee=0.1, max_position=5, deal_col_name='Close', 
                       feature_names=['Volume'])


env.reset()
env.render()

state, reward, done, info = env.step(random.randrange(3))

### randomly choose actions and show the transaction details
for i in range(500):
    print(i)
    state, reward, done, info = env.step(random.randrange(3))
    print(state, reward)
    env.render()
    if done:
        break

env.transaction_details

IMPORTANT: Can you please tell me the action space index for buy, sell, and hold? I can see the action space consists of three integers, but I can't find where they are mapped.

Environment explanation

May I know if there is any description of the environment, such as what the observation, reward, and action are?

Also, may I know the difference between the backtest and training environments?

Thanks!

AttributeError: 'trading_env' object has no attribute 'backtest'

Backtesting
loading env just like training

env = trading_env.make(env_id='backtest_v1', obs_data_len=1024, step_len=512,
                       df=df, fee=0.1, max_position=5, deal_col_name='Price',
                       feature_names=['Price', 'Volume',
                                      'Ask_price','Bid_price',
                                      'Ask_deal_vol','Bid_deal_vol',
                                      'Bid/Ask_deal', 'Updown'])

start to backtest

agent = YourAgent()

transactions = []
while not env.backtest_done:
    state = env.backtest()
    done = False
    while not done:
        state, reward, done, info = env.step(agent.choice_action(state))
        #print(state, reward)
        #env.render()
        if done:
            transactions.append(info)
            break
transaction = pd.concat(transactions)
transaction

Making new env: backtest_v1
Traceback (most recent call last):
File "\TradingGym-master\BackTest.py", line 32, in
state = env.backtest()
AttributeError: 'trading_env' object has no attribute 'backtest'

How could I fix this?

RL example strategy

Hi, thanks for another great gym environment!

It's not an actual issue, more of a question:
I am curious about examples of actual RL-trained strategies (not necessarily profitable ones), because I didn't find any, and OpenAI didn't pay much attention to financial gyms. It would be much easier to learn from existing examples how to tune the architecture or parameters. I would be very grateful if you could point me to useful links if you know of some. Thanks!
