Giter Club home page Giter Club logo

future.apply's Introduction

future.apply: Apply Function to Elements in Parallel using Futures

Introduction

The purpose of this package is to provide worry-free parallel alternatives to base-R functions lapply() and friends. The goal is that in most cases one should be able to just replace lapply() in the code with the futurized equivalent future_lapply() and things will just work. For example, instead of doing:

library("stats")
x <- 1:10
y <- lapply(x, FUN = quantile, probs = 1:3/4)

one can do:

library("future")
plan(multiprocess)
library("stats")
x <- 1:10
y <- future_lapply(x, FUN = quantile, probs = 1:3/4)

Reproducibility is part of the core design, which means that perfect, parallel random number generation (RNG) is supported regardless of the amount of chunking, type of load balancing, and future backend being used. To enable parallel RNG, use argument future.seed = TRUE.

Role

Where does the future.apply package fit in the software stack? You can think of it as a sibling to foreach, BiocParallel, plyr, etc. Just as parallel provides parLapply(), foreach provides foreach(), BiocParallel provides bplapply(), and plyr provides llply(), future.apply provides future_lapply(). Below is a table summarizing this idea:

Package Functions Backends
future.apply

Future-versions of common goto *apply() functions available in core R (the 'base' package and some 'stats' package):
future_apply(), future_eapply(), future_lapply(), future_mapply(), future_rapply(), future_sapply(), future_tapply(), future_vapply(), future_aggregate(), ...

Note, several of these functions are yet to be implemented.
All future backends
parallel mclapply(), parApply(), parLapply(), parSapply(), ... Built-in and conditional on operating system
foreach foreach(), times() All future backends via doFuture
BiocParallel Bioconductor's parallel mappers:
bpaggregate(), bpiterate(), bplapply(), and bpvec()
All future backends via doFuture (because it supports foreach) or via BiocParallel.FutureParam (direct BiocParallelParam support; prototype)
plyr **ply(..., .parallel = TRUE) functions:
aaply(), ddply(), dlply(), llply(), ...
All future backends via doFuture (because it uses foreach internally)

Note that, except for the parallel package, none of these higher-level APIs implement their own parallel backends, but they rather enhance existing ones. The foreach framework leverages backends such as doParallel, doMC and doFuture, and the future.apply framework leverages the future ecosystem and therefore backends such as built-in parallel and future.batchtools.

By separating future_lapply() and friends from the future package, it helps clarifying the purpose of the future package, which is to define and provide the core Future API, which higher-level parallel APIs can build on and for which any futurized parallel backends can be plugged into.

Roadmap

  1. Implement future_*apply() versions for all common *apply() functions that exist in base R. This also involves writing a large set of package tests asserting the correctness and the same behavior as the corresponding *apply() functions.

  2. Harmonize all future_*apply() functions with each other, e.g. the future-specific arguments.

  3. Consider additional future_*apply() functions and features that fit in this package but don't necessarily have a corresponding function in base R. Examples of this may be "apply" functions that return futures rather than values, mechanisms for benchmarking, and richer control over load balancing.

The API and identity of the future.apply package will be kept close to the *apply() functions in base R. In other words, it will neither keep growing nor be expanded with new, more powerful apply-like functions beyond those core ones in base R. Such extended functionality should be part of a separate package.

Installation

R package future.apply is available on CRAN and can be installed in R as:

install.packages('future.apply')

Contributions

This Git repository uses the Git Flow branching model (the git flow extension is useful for this). The develop branch contains the latest contributions and other code that will appear in the next release, and the master branch contains the code of the latest release, which is exactly what is currently on CRAN.

Contributing to this package is easy. Just send a pull request. When you send your PR, make sure develop is the destination branch on the future.apply repository. Your PR should pass R CMD check --as-cran, which will also be checked by Travis CI and AppVeyor CI when the PR is submitted.

Software status

Resource: CRAN Travis CI Appveyor
Platforms: Multiple Linux & macOS Windows
R CMD check CRAN version Build status Build status
Test coverage Coverage Status

future.apply's People

Contributors

henrikbengtsson avatar

Watchers

James Cloos avatar Davis Vaughan avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.