flyingfox

The goal of flyingfox is to connect Quantopian’s zipline financial backtesting package with R.

Installation

flyingfox has not been released on CRAN, so the usual CRAN install does not work yet:

# Not available yet: flyingfox is not on CRAN
install.packages("flyingfox")

Install the development version from GitHub with:

# install.packages("devtools")
devtools::install_github("DavisVaughan/flyingfox")

Setup

(Using reticulate >= 1.7.1)

To get started with zipline, you’ll need the zipline Python module. Install it with:

install_zipline()

By default, zipline is installed into the r-reticulate virtualenv, as recommended by reticulate.

Next, you’ll need data to run the backtest on. The easiest way to do this is to:

  1. Create a free account on Quandl and find your API Key in Account Settings.

  2. Add the API key as the R environment variable QUANDL_API_KEY (open your .Renviron file with usethis::edit_r_environ()).

  3. “Ingest” the Quandl data with flyingfox::fly_ingest().

fly_ingest()
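
Before ingesting, it can help to confirm that R actually sees the key. The following is a minimal sketch in base R; the helper name has_quandl_key() is hypothetical and not part of flyingfox.

```r
# Hypothetical helper (not part of flyingfox): TRUE when the
# QUANDL_API_KEY environment variable is set and non-empty.
has_quandl_key <- function() {
  nzchar(Sys.getenv("QUANDL_API_KEY", unset = ""))
}

if (!has_quandl_key()) {
  # .Renviron is read at startup, so a restart is needed after editing it
  message("QUANDL_API_KEY is not set; add it to .Renviron and restart R.")
}
```

If the check fails right after editing .Renviron, restarting the R session is usually enough, since the file is only read at startup.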

Example

flyingfox backtests are run using a combination of two main functions. fly_initialize() sets up variables you might need during the backtest and gives you a chance to schedule functions to run periodically. fly_handle_data() is called at a daily or minutely frequency; it runs your algorithm, orders assets, and records data for later inspection.

Below, we are going to create a basic mean reversion strategy to demonstrate the basics of running an algorithm.

library(flyingfox)

First, set up an initialize function. It must take context as the argument. Think of context as a persistent environment where you can store variables and assets that you want to access at any point in the simulation.

fly_initialize <- function(context) {

  # We want to track what day we are on. The mean reversion algo we use
  # needs at least 300 days of data before doing anything
  context$i <- 0L

  # We want to trade Apple stock
  context$asset <- fly_symbol("AAPL")
}
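
To see why values stored on context survive from one call to the next, here is a small sketch that uses a plain R environment as a stand-in. This is an assumption for illustration only: during a real backtest the context object is supplied by zipline, and toy_context, toy_initialize(), and toy_handle() are hypothetical names.

```r
# Stand-in for the context object: a plain R environment (assumption for
# illustration; zipline supplies the real object during a backtest).
toy_context <- new.env()

# Hypothetical counterpart of an initialize function
toy_initialize <- function(context) {
  context$i <- 0L
}

# Hypothetical counterpart of a per-bar handler
toy_handle <- function(context) {
  # Environments are modified in place, so the update persists across calls
  context$i <- context$i + 1L
  invisible(context$i)
}

toy_initialize(toy_context)
for (day in 1:3) toy_handle(toy_context)
toy_context$i
#> [1] 3
```

Nothing is returned and reassigned here; the counter advances because every call works on the same object.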

Next, create a data handling function that accepts context and data. Think of data as an environment containing functions for accessing historical and current price data about the assets you are using in your simulation.

The below implementation of fly_handle_data() demonstrates a mean reversion algorithm.

fly_handle_data <- function(context, data) {

  # Increment day
  context$i <- context$i + 1L

  # While < 300 days of data, return
  if(context$i < 300L) {
    return()
  }

  # Calculate a short term (100 day) moving average
  # by pulling history for the asset (apple) and taking an average
  short_hist <- fly_data_history(data, context$asset, "price", bar_count = 100L, frequency = "1d")
  short_mavg <- mean(short_hist)

  # Calculate a long term (300 day) moving average
  long_hist <- fly_data_history(data, context$asset, "price", bar_count = 300L, frequency = "1d")
  long_mavg <- mean(long_hist)

  # If short > long, go 100% in apple.
  # Else if we hit the crossover, dump all of apple
  if (short_mavg > long_mavg) {
    fly_order_target_percent(asset = context$asset, target = 1)
  } else if (short_mavg < long_mavg) {
    fly_order_target_percent(asset = context$asset, target = 0)
  }

  # Record today's data
  # We record the current apple price, along with the value of the short and long
  # term moving average
  fly_record(
    AAPL = fly_data_current(data, context$asset, "price"),
    short_mavg = short_mavg,
    long_mavg = long_mavg
  )

}
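
The decision rule inside fly_handle_data() can be tried out without zipline. The sketch below applies the same comparison to made-up price vectors in base R; crossover_target() is a hypothetical helper, and it returns the target weight (1 or 0), or NA when no order would be placed.

```r
# Hypothetical helper: target portfolio weight implied by the two averages.
crossover_target <- function(prices, short_n = 100L, long_n = 300L) {
  # Not enough history yet: mirrors the early return in fly_handle_data()
  if (length(prices) < long_n) {
    return(NA_real_)
  }

  short_mavg <- mean(tail(prices, short_n))
  long_mavg  <- mean(tail(prices, long_n))

  if (short_mavg > long_mavg) {
    1          # go 100% into the asset
  } else if (short_mavg < long_mavg) {
    0          # close the position
  } else {
    NA_real_   # averages are equal: no order
  }
}

# Made-up data, not real prices
rising  <- seq(100, 130, length.out = 300)
falling <- rev(rising)

crossover_target(rising)        # short average above long average
crossover_target(falling)       # short average below long average
crossover_target(rising[1:50])  # fewer than 300 observations
```

On the rising series the short average sits above the long one, so the target is 1; on the falling series it is 0; with only 50 observations the function returns NA.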

Run the algo over a certain time period.

performance <- fly_run_algorithm(
  initialize  = fly_initialize,
  handle_data = fly_handle_data,
  start       = as.Date("2013-01-01"),
  end         = as.Date("2016-01-01")
)

tail(performance)
#> # A tibble: 6 x 41
#>   date      AAPL algo_volatility algorithm_period… alpha benchmark_period…
#>   <chr>    <dbl>           <dbl>             <dbl> <dbl>             <dbl>
#> 1 2015-12…  109.           0.181             0.550 0.161          -0.0121 
#> 2 2015-12…  108.           0.181             0.550 0.161          -0.0137 
#> 3 2015-12…  107.           0.181             0.550 0.161          -0.0159 
#> 4 2015-12…  109.           0.181             0.550 0.159          -0.00545
#> 5 2015-12…  107.           0.181             0.550 0.160          -0.0125 
#> 6 2015-12…  105.           0.181             0.550 0.162          -0.0224 
#> # ... with 35 more variables: benchmark_volatility <dbl>, beta <dbl>,
#> #   capital_used <dbl>, ending_cash <dbl>, ending_exposure <dbl>,
#> #   ending_value <dbl>, excess_return <dbl>, gross_leverage <dbl>,
#> #   long_exposure <dbl>, long_mavg <dbl>, long_value <dbl>,
#> #   longs_count <dbl>, max_drawdown <dbl>, max_leverage <dbl>,
#> #   net_leverage <dbl>, orders <list>, period_close <dttm>,
#> #   period_label <chr>, period_open <dttm>, pnl <dbl>,
#> #   portfolio_value <dbl>, positions <list>, returns <dbl>, sharpe <dbl>,
#> #   short_exposure <dbl>, short_mavg <dbl>, short_value <dbl>,
#> #   shorts_count <dbl>, sortino <dbl>, starting_cash <dbl>,
#> #   starting_exposure <dbl>, starting_value <dbl>, trading_days <dbl>,
#> #   transactions <list>, treasury_period_return <dbl>

From the performance tibble, we can look at the recorded value of Apple’s stock price.

library(ggplot2)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following object is masked from 'package:ggplot2':
#> 
#>     vars
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(lubridate)
#> 
#> Attaching package: 'lubridate'
#> The following object is masked from 'package:base':
#> 
#>     date

performance <- performance %>%
  mutate(date = as.POSIXct(date, tz = "UTC"))

performance %>%
  filter(date >= "2014-08-01") %>%
  ggplot(aes(x = date, y = AAPL)) +
  geom_line()

We can also look at the value of our portfolio over time.

first_order <- performance %>%
  filter(row_number() == 300) %>%
  pull(date)

performance %>%
  ggplot(aes(x = date, y = portfolio_value)) +
  geom_line() +
  geom_vline(xintercept = first_order, color = "red") +
  annotate("text", x = first_order - days(50), y = 10500,
           label = "First Order", color = "red")


flyingfox's Issues

error in install_zipline

Hi Davis,

Thank you for the fantastic work. One quick question. You mention:

To get started with zipline, you’ll need the zipline Python module. Install it with:
install_zipline()

I tried the following:

library(flyingfox)
install_zipline()
Solving environment: ...working... failed

PackagesNotFoundError: The following packages are not available from current channels:

  • zipline

Current channels:

However, installing zipline directly in Anaconda worked.

Here is the conda info output:

(base) C:\Windows\system32>conda info

     active environment : base
    active env location : C:\ProgramData\Anaconda3
            shell level : 1
       user config file : C:\Users\mark\.condarc
 populated config files : C:\Users\mark\.condarc
          conda version : 4.5.4
    conda-build version : 3.10.9
         python version : 3.5.5.final.0
       base environment : C:\ProgramData\Anaconda3 (writable)
           channel URLs : https://conda.anaconda.org/Qantopian/win-64
                          https://conda.anaconda.org/Qantopian/noarch
                          https://repo.anaconda.com/pkgs/main/win-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/free/win-64
                          https://repo.anaconda.com/pkgs/free/noarch
                          https://repo.anaconda.com/pkgs/r/win-64
                          https://repo.anaconda.com/pkgs/r/noarch
                          https://repo.anaconda.com/pkgs/pro/win-64
                          https://repo.anaconda.com/pkgs/pro/noarch
                          https://repo.anaconda.com/pkgs/msys2/win-64
                          https://repo.anaconda.com/pkgs/msys2/noarch
          package cache : C:\ProgramData\Anaconda3\pkgs
                          C:\Users\mark\AppData\Local\conda\conda\pkgs
       envs directories : C:\ProgramData\Anaconda3\envs
                          C:\Users\mark\AppData\Local\conda\conda\envs
                          C:\Users\mark\.conda\envs
               platform : win-64
             user-agent : conda/4.5.4 requests/2.19.1 CPython/3.5.5 Windows/7 Windows/6.1.7601
          administrator : True
             netrc file : None
           offline mode : False

(base) C:\Windows\system32>

Default data bundle changed to quantopian quandl

Hi,

First of all, thank you for bringing the Quantopian backtesting module to R! While running your basic example (the classic dual moving average crossover), I get the following error after running this chunk:

performance <- fly_run_algorithm(
  initialize  = fly_initialize,
  handle_data = fly_handle_data,
  start       = as.Date("2013-01-01"),
  end         = as.Date("2016-01-01")
)

Error in py_call_impl(callable, dots$args, dots$keywords) :
ValueError: no data for bundle 'quandl' on or before 2018-08-12 22:48:44.663246+00:00
maybe you need to run: $ zipline ingest -b quandl

Detailed traceback:
  File "<string>", line 50, in py_run
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\utils\run_algo.py", line 430, in run_algorithm
    blotter=blotter,
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\utils\run_algo.py", line 141, in _run
    bundle_timestamp,
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\core.py", line 521, in load
    timestr = most_recent_data(name, timestamp, environ=environ)
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\core.py", line 497, in most_recent_data
    timestamp=timestamp,

After trying zipline ingest -b quandl the following error emerges:

Traceback (most recent call last):
  File "C:\Users\Simon\Anaconda3\Scripts\zipline-script.py", line 11, in <module>
    load_entry_point('zipline==1.3.0', 'console_scripts', 'zipline')()
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 697, in main
    rv = self.invoke(ctx)
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 1066, in invoke
    return process_result(sub_ctx.command.invoke(sub_ctx))
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\click\core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\__main__.py", line 348, in ingest
    show_progress,
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\core.py", line 451, in ingest
    pth.data_path([name, timestr], environ=environ),
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\quandl.py", line 209, in quandl_bundle
    environ.get('QUANDL_DOWNLOAD_ATTEMPTS', 5)
  File "C:\Users\Simon\Anaconda3\Lib\site-packages\zipline\data\bundles\quandl.py", line 113, in fetch_data_tabl
    "Failed to download Quandl data after %d attempts." % (retries)
ValueError: Failed to download Quandl data after 5 attempts.

After googling around a bit, it seems Quantopian has changed the default bundle to quantopian-quandl, since the command zipline ingest -b quantopian-quandl works.

What needs to be changed when calling your example code so that it looks for data in the quantopian-quandl bundle and not the quandl bundle?

security list

Hi,

This is not a bug, but I don't know where else to put my question.
I tried to assign a security list to the context, like this:

python:
context.security_list = [sid(24), sid(5061), sid(39840), sid(21435)]

flyingfox:

fly_initialize <- function(context) {
  context$i <- 0L
  sList <- c("ARNC","AAPL","ABT","ADSK","TAP","ADBE","ADI","ADM")
  context$asset <- sList
}

fly_handle_data <- function(context, data) {
  
  # Increment day
  context$i <- context$i + 1L
  
  # While < 21 days of data, return
  if(context$i < 21L) {
    return()
  }

  price_hist <- fly_data_history(data, context$asset, "price", bar_count = 21L, frequency = "1d")

}

Error:

debugSource('~/ShortTermMeanReversion.R')
Error in py_call_impl(callable, dots$args, dots$keywords) :
RuntimeError: Evaluation error: ValueError: invalid literal for int() with base 10: 'ARNC'.

Can you help? Or do you have some example that I can follow?

thanks

Mark
