davisvaughan / flyingfox

An R Interface to the Quantopian Zipline Financial Backtester

Home Page: https://rstudio.cloud/project/38291

License: Other

R 96.61% Python 3.39%

flyingfox's Introduction

flyingfox

The goal of flyingfox is to connect Quantopian’s zipline financial backtesting package with R.

Installation

flyingfox has not been released on CRAN, so the following does not work yet:

# Not available: flyingfox is not on CRAN
install.packages("flyingfox")

Install the development version from GitHub instead:

# install.packages("devtools")
devtools::install_github("DavisVaughan/flyingfox")

Setup

(Using reticulate >= 1.7.1)

To get started with zipline, you’ll need the zipline Python module. Install it with:

install_zipline()

By default zipline will be installed into the virtualenv, r-reticulate, as recommended by reticulate.

Next, you’ll need data to run the backtest on. The easiest way to get it is to:

1. Create a free account on Quandl and find your API key in Account Settings.
2. Add the API key as the R environment variable QUANDL_API_KEY (open your .Renviron file with usethis::edit_r_environ()).
3. “Ingest” the Quandl data with flyingfox::fly_ingest().

fly_ingest()
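
Before ingesting, it can help to confirm that the API key is actually visible to the current R session. The following is a minimal sketch using only base R; the helper name has_quandl_key() is hypothetical and is not part of flyingfox.

```r
# Hypothetical helper (not part of flyingfox):
# TRUE if QUANDL_API_KEY is set to a non-empty value in this session.
has_quandl_key <- function() {
  nzchar(Sys.getenv("QUANDL_API_KEY", unset = ""))
}

if (!has_quandl_key()) {
  message("QUANDL_API_KEY is not set; add it to .Renviron and restart R.")
}
```

Changes to .Renviron are only read at startup, so a restart of the R session is needed after editing the file.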

Example

flyingfox backtests are run using a combination of two main functions. fly_initialize() sets up variables you might need during the backtest along with giving you a chance to schedule functions to run periodically. fly_handle_data() is called at a daily/minutely frequency and runs your algorithm, orders assets, and records data for future inspection.
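
The division of labour between the two functions can be pictured with a toy driver written in base R. This is only an illustrative sketch, not how zipline or flyingfox is implemented: run_toy_backtest(), toy_initialize(), and toy_handle_data() are hypothetical names, and the "bars" are just integers standing in for market data.

```r
# Toy driver (hypothetical): call initialize once, then handle_data per bar.
run_toy_backtest <- function(initialize, handle_data, bars) {
  context <- new.env()          # persistent state shared across calls
  initialize(context)
  for (bar in bars) {
    handle_data(context, bar)   # called once for every bar of data
  }
  context
}

# Set up state before the simulation starts
toy_initialize <- function(context) {
  context$i <- 0L
}

# Count how many bars have been seen so far
toy_handle_data <- function(context, data) {
  context$i <- context$i + 1L
}

ctx <- run_toy_backtest(toy_initialize, toy_handle_data, bars = 1:5)
ctx$i
```

Because the context is modified in place, anything stored in it during setup or on an earlier bar is still available on later bars.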

Below, we are going to create a basic mean reversion strategy to demonstrate the basics of running an algorithm.

library(flyingfox)

First, set up an initialize function. It must take context as the argument. Think of context as a persistent environment where you can store variables and assets that you want to access at any point in the simulation.

fly_initialize <- function(context) {

  # We want to track what day we are on. The mean reversion algo we use
  # should have at least 300 days of data before doing anything
  context$i <- 0L

  # We want to trade apple stock
  context$asset <- fly_symbol("AAPL")
}

Next, create a data handling function that accepts context and data. Think of data as an environment containing functions for accessing historical and current price data about the assets you are using in your simulation.

The below implementation of fly_handle_data() demonstrates a mean reversion algorithm.

fly_handle_data <- function(context, data) {

  # Increment day
  context$i <- context$i + 1L

  # While < 300 days of data, return
  if(context$i < 300L) {
    return()
  }

  # Calculate a short term (100 day) moving average
  # by pulling history for the asset (apple) and taking an average
  short_hist <- fly_data_history(data, context$asset, "price", bar_count = 100L, frequency = "1d")
  short_mavg <- mean(short_hist)

  # Calculate a long term (300 day) moving average
  long_hist <- fly_data_history(data, context$asset, "price", bar_count = 300L, frequency = "1d")
  long_mavg <- mean(long_hist)

  # If short > long, go 100% in apple
  if(short_mavg > long_mavg) {
    fly_order_target_percent(asset = context$asset, target = 1)
  }
  # Else if we hit the crossover, dump all of apple
  else if (short_mavg < long_mavg) {
    fly_order_target_percent(asset = context$asset, target = 0)
  }

  # Record today's data
  # We record the current apple price, along with the value of the short and long
  # term moving average
  fly_record(
    AAPL = fly_data_current(data, context$asset, "price"),
    short_mavg = short_mavg,
    long_mavg = long_mavg
  )

}
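
The decision rule inside fly_handle_data() can be tried out without zipline on a plain numeric vector. The sketch below uses only base R; crossover_target() is a hypothetical helper, not a flyingfox function, and the prices are synthetic.

```r
# Hypothetical helper (not part of flyingfox): target portfolio weight for
# one day, given the prices observed up to and including that day.
crossover_target <- function(prices, short_n = 100L, long_n = 300L) {
  # Not enough history yet: make no decision
  if (length(prices) < long_n) {
    return(NA_real_)
  }

  short_mavg <- mean(tail(prices, short_n))
  long_mavg  <- mean(tail(prices, long_n))

  if (short_mavg > long_mavg) {
    1          # go 100% into the asset
  } else if (short_mavg < long_mavg) {
    0          # sell the whole position
  } else {
    NA_real_   # averages are equal: leave the position unchanged
  }
}

# Synthetic, steadily rising prices: the short average is above the long one
rising <- seq(100, 200, length.out = 300)
crossover_target(rising)
```

Reversing the series with rev(rising) gives steadily falling prices, for which the short average is below the long one and the target weight is 0.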

Run the algo over a certain time period.

performance <- fly_run_algorithm(
  initialize  = fly_initialize,
  handle_data = fly_handle_data,
  start       = as.Date("2013-01-01"),
  end         = as.Date("2016-01-01")
)

tail(performance)
#> # A tibble: 6 x 41
#>   date      AAPL algo_volatility algorithm_period… alpha benchmark_period…
#>   <chr>    <dbl>           <dbl>             <dbl> <dbl>             <dbl>
#> 1 2015-12…  109.           0.181             0.550 0.161          -0.0121 
#> 2 2015-12…  108.           0.181             0.550 0.161          -0.0137 
#> 3 2015-12…  107.           0.181             0.550 0.161          -0.0159 
#> 4 2015-12…  109.           0.181             0.550 0.159          -0.00545
#> 5 2015-12…  107.           0.181             0.550 0.160          -0.0125 
#> 6 2015-12…  105.           0.181             0.550 0.162          -0.0224 
#> # ... with 35 more variables: benchmark_volatility <dbl>, beta <dbl>,
#> #   capital_used <dbl>, ending_cash <dbl>, ending_exposure <dbl>,
#> #   ending_value <dbl>, excess_return <dbl>, gross_leverage <dbl>,
#> #   long_exposure <dbl>, long_mavg <dbl>, long_value <dbl>,
#> #   longs_count <dbl>, max_drawdown <dbl>, max_leverage <dbl>,
#> #   net_leverage <dbl>, orders <list>, period_close <dttm>,
#> #   period_label <chr>, period_open <dttm>, pnl <dbl>,
#> #   portfolio_value <dbl>, positions <list>, returns <dbl>, sharpe <dbl>,
#> #   short_exposure <dbl>, short_mavg <dbl>, short_value <dbl>,
#> #   shorts_count <dbl>, sortino <dbl>, starting_cash <dbl>,
#> #   starting_exposure <dbl>, starting_value <dbl>, trading_days <dbl>,
#> #   transactions <list>, treasury_period_return <dbl>

From the performance tibble, we can look at the recorded value of Apple’s stock price.

library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following object is masked from 'package:ggplot2':
#> 
#>     vars
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date

performance <- performance %>%
  mutate(date = as.POSIXct(date, tz = "UTC"))

performance %>%
  filter(date >= "2014-08-01") %>%
  ggplot(aes(x = date, y = AAPL)) +
  geom_line()

We can also look at the value of our portfolio over time.

first_order <- performance %>%
  filter(row_number() == 300) %>%
  pull(date)

performance %>%
  mutate(date = as.POSIXct(date, tz = "UTC")) %>%
  ggplot(aes(x = date, y = portfolio_value)) +
  geom_line() +
  geom_vline(xintercept = first_order, color = "red") +
  annotate("text", x = first_order - days(50), y = 10500, 
           label = "First Order", color = "red")

flyingfox's Issues

error in install_zipline

Hi David,

Thank you for the fantastic job. One quick question. You mention:

To get started with zipline, you’ll need the zipline Python module. Install it with:
install_zipline()

I tried the following:

library(flyingfox)
install_zipline()
Solving environment: ...working... failed

PackagesNotFoundError: The following packages are not available from current channels:

zipline

Current channels:

However, installing zipline directly in Anaconda worked.

Here is the conda info:

(base) C:\Windows\system32>conda info

 active environment : base
active env location : C:\ProgramData\Anaconda3
        shell level : 1
   user config file : C:\Users\mark\.condarc

populated config files : C:\Users\mark\.condarc
conda version : 4.5.4
conda-build version : 3.10.9
python version : 3.5.5.final.0
base environment : C:\ProgramData\Anaconda3 (writable)
channel URLs : https://conda.anaconda.org/Qantopian/win-64
https://conda.anaconda.org/Qantopian/noarch
https://repo.anaconda.com/pkgs/main/win-64
https://repo.anaconda.com/pkgs/main/noarch
https://repo.anaconda.com/pkgs/free/win-64
https://repo.anaconda.com/pkgs/free/noarch
https://repo.anaconda.com/pkgs/r/win-64
https://repo.anaconda.com/pkgs/r/noarch
https://repo.anaconda.com/pkgs/pro/win-64
https://repo.anaconda.com/pkgs/pro/noarch
https://repo.anaconda.com/pkgs/msys2/win-64
https://repo.anaconda.com/pkgs/msys2/noarch
package cache : C:\ProgramData\Anaconda3\pkgs
C:\Users\mark\AppData\Local\conda\conda\pkgs
envs directories : C:\ProgramData\Anaconda3\envs
C:\Users\mark\AppData\Local\conda\conda\envs
C:\Users\mark\.conda\envs
platform : win-64
user-agent : conda/4.5.4 requests/2.19.1 CPython/3.5.5 Windows/7 Windows/6.1.7601
administrator : True
netrc file : None
offline mode : False

(base) C:\Windows\system32>

Default data bundle changed to quantopian quandl

Hi,

First of all, thank you for bringing the Quantopian backtesting module to R! While running your basic example (the classic dual moving average crossover), I get the following error after running this chunk:

performance <- fly_run_algorithm(
  initialize  = fly_initialize,
  handle_data = fly_handle_data,
  start       = as.Date("2013-01-01"),
  end         = as.Date("2016-01-01")
)

Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: no data for bundle 'quandl' on or before 2018-08-12 22:48:44.663246+00:00
maybe you need to run: $ zipline ingest -b quandl

Detailed traceback:
File "", line 50, in py_run
File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\utils\run_algo.py", line 430, in run_algorithm
blotter=blotter,
File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\utils\run_algo.py", line 141, in _run
bundle_timestamp,
File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\core.py", line 521, in load
timestr = most_recent_data(name, timestamp, environ=environ)
File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\core.py", line 497, in most_recent_data
timestamp=timestamp,

After trying zipline ingest -b quandl the following error emerges:

Traceback (most recent call last):
File "C:\Users\Simon\Anaconda3\Scripts\zipline-script.py", line 11, in <module>
load_entry_point('zipline==1.3.0', 'console_scripts', 'zipline')()
File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 697, in main
rv = self.invoke(ctx)
File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 1066, in invoke
return process_result(sub_ctx.command.invoke(sub_ctx))
File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 535, in invoke
return callback(*args, **kwargs)
File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\__main__.py", line 348, in ingest
show_progress,
File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\core.py", line 451, in ingest
pth.data_path([name, timestr], environ=environ),
File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\quandl.py", line 209, in quandl_bundle
environ.get('QUANDL_DOWNLOAD_ATTEMPTS', 5)
File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\quandl.py", line 113, in fetch_data_tabl
"Failed to download Quandl data after %d attempts." % (retries)
ValueError: Failed to download Quandl data after 5 attempts.

After googling around a bit, it seems the Quantopian team has changed the default bundle to quantopian-quandl, since the command zipline ingest -b quantopian-quandl works.

What needs to be changed when calling your example code so that it looks for data in the quantopian-quandl bundle rather than the quandl bundle?

security list

Hi,

This is not a bug, but I don't know where else to put my question.
I tried to assign a security list to the context, like this:

python:
context.security_list = [sid(24), sid(5061), sid(39840), sid(21435)]

flyingfox:

fly_initialize <- function(context) {
  context$i <- 0L
  sList <- c("ARNC", "AAPL", "ABT", "ADSK", "TAP", "ADBE", "ADI", "ADM")
  context$asset <- sList
}

fly_handle_data <- function(context, data) {
  
  # Increment day
  context$i <- context$i + 1L
  
  # While < 21 days of data, return
  if(context$i < 21L) {
    return()
  }

  price_hist <- fly_data_history(data, context$asset, "price", bar_count = 21L, frequency = "1d")

}

Error:

debugSource('~/ShortTermMeanReversion.R')
Error in py_call_impl(callable, dots$args, dots$keywords) :
RuntimeError: Evaluation error: ValueError: invalid literal for int() with base 10: 'ARNC'.

Can you help? Or do you have some example that I can follow?

thanks

Mark
