standardize's Introduction

standardize

The goal of standardize is to ease the creation of data frame and array [ methods by standardizing i, j, and ... and by providing information about the user’s [ call.

Installation

You can install the development version from GitHub with:

# install.packages("remotes")
remotes::install_github("DavisVaughan/standardize")

Example

library(standardize)

When you create a [ method for data frames, it generally has a signature like this:

`[.foo_df` <- function(x, i, j, drop = FALSE) {
  
}

This seems straightforward, but it gets a little confusing when you need to tell whether i, j, and drop have been supplied by the user or are missing. For example, if i is supplied but j is missing, as in x[i], then you probably want to perform some kind of column subsetting operation rather than row subsetting. But if x[i,] is supplied, this is actually a row subset, and j is considered to be present but empty.
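To make the ambiguity concrete, here is a minimal base R sketch that does not use standardize at all. The class name `probe` and the returned fields are made up for illustration. It shows that missing(j) alone cannot distinguish x[i] from x[i,], and that you have to count arguments with nargs() instead:

```r
# Hypothetical `[` method that only reports what it can see about the call
`[.probe` <- function(x, i, j, drop = FALSE) {
  drop_supplied <- !missing(drop)
  list(
    i_missing = missing(i),
    j_missing = missing(j),
    # nargs() counts `x` and any blank positional arguments, so subtract
    # `x` itself and `drop` (when supplied) to get the number of subscripts
    n_subscripts = nargs() - 1L - as.integer(drop_supplied)
  )
}

x <- structure(list(a = 1:3), class = "probe")

str(x[1])                # j is missing, 1 subscript
str(x[1, ])              # j is still missing, but 2 subscripts
str(x[1, drop = FALSE])  # 1 subscript, `drop` is not counted
```

In both x[1] and x[1, ], missing(j) is TRUE, so only the subscript count tells them apart. This is the bookkeeping that every [ method would otherwise have to repeat by hand.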

The goal of standardize is to help with this by standardizing these arguments so that figuring out what to do with them becomes trivial. To do this, add collect_subscripts() as the first line of your [ method and pass it all the subscript-related information.

`[.foo_df` <- function(x, i, j, drop = FALSE) {
  info <- collect_subscripts(i, j)
  str(info)
}

collect_subscripts() will do all of the counting for you, and will interpret calls like x[i] as x[,j] so that you as the developer just have to focus on the values of i and j. You’ll interpret NULL values to mean that your user didn’t supply that subscript. For example:

df <- data.frame(x = 1:5)
class(df) <- c("foo_df", class(df))

# Column subsetting, standardized to `i = NULL, j = 1`
df[1]
#> List of 4
#>  $ i          : NULL
#>  $ j          : num 1
#>  $ dots       : list()
#>  $ transformed: logi TRUE
#> NULL

# Row subsetting with implicit `j = NULL`
df[1,]
#> List of 4
#>  $ i          : num 1
#>  $ j          : NULL
#>  $ dots       : list()
#>  $ transformed: logi FALSE
#> NULL

# Still considered column subsetting
df[1, drop = FALSE]
#> List of 4
#>  $ i          : NULL
#>  $ j          : num 1
#>  $ dots       : list()
#>  $ transformed: logi TRUE
#> NULL
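
Once you have the standardized list, the rest of a [ method can be quite short. The following is only a sketch of one way to consume it. The helper apply_subscripts() is hypothetical and not part of standardize, and the info lists are written by hand to mirror the output shown above, so the example runs without the package:

```r
# Hypothetical helper: apply a standardized `info` list to a plain data frame.
# A NULL subscript is taken to mean "not supplied", so that step is skipped.
apply_subscripts <- function(df, info) {
  out <- df
  if (!is.null(info$j)) {
    out <- out[, info$j, drop = FALSE]
  }
  if (!is.null(info$i)) {
    out <- out[info$i, , drop = FALSE]
  }
  out
}

df <- data.frame(x = 1:5, y = letters[1:5])

# Hand-written stand-ins for what `df[1]` and `df[1,]` produced above
info_cols <- list(i = NULL, j = 1, dots = list(), transformed = TRUE)
info_rows <- list(i = 1, j = NULL, dots = list(), transformed = FALSE)

apply_subscripts(df, info_cols)  # all 5 rows, first column only
apply_subscripts(df, info_rows)  # first row, both columns
```

Because the helper works on a plain data frame, a real method would typically strip its own class first (or call NextMethod()) to avoid dispatching back to itself.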

It also helps with more complex array subsetting. In this case, you also need to count the higher-dimensional subscripts passed through .... Again, implicit dimensions are returned as NULL to indicate that they were counted but are missing. With array subsetting, you probably don’t want x[i] to be interpreted as x[,j], so you can turn that off with column_transform = FALSE.

`[.foo_array` <- function(x, i, j, ..., drop = FALSE) {
  info <- collect_subscripts(i, j, ..., column_transform = FALSE)
  str(info)
}
x <- array(1:5)
class(x) <- c("foo_array", class(x))

# Not interpreted as `x[,j]`
x[1]
#> List of 4
#>  $ i          : num 1
#>  $ j          : NULL
#>  $ dots       : list()
#>  $ transformed: logi FALSE
#> NULL

# Compare the following two results. Note the implicit `NULL` for the
# 3rd dimension in the second call.
x[1, 2]
#> List of 4
#>  $ i          : num 1
#>  $ j          : num 2
#>  $ dots       : list()
#>  $ transformed: logi FALSE
#> NULL
x[1, 2,]
#> List of 4
#>  $ i          : num 1
#>  $ j          : num 2
#>  $ dots       :List of 1
#>   ..$ : NULL
#>  $ transformed: logi FALSE
#> NULL

# Things can get a little crazy in high dimensional space, but this should
# be fairly interpretable.
x[1, 2, , 3, , 5, drop = TRUE]
#> List of 4
#>  $ i          : num 1
#>  $ j          : num 2
#>  $ dots       :List of 4
#>   ..$ : NULL
#>   ..$ : num 3
#>   ..$ : NULL
#>   ..$ : num 5
#>  $ transformed: logi FALSE
#> NULL
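
One way to use this output for arrays is to rebuild a base [ call from it. The sketch below assumes a hypothetical helper, subset_array(), that is not part of standardize. It also assumes the number of collected subscripts equals the number of dimensions, and it uses a hand-written info list shaped like the x[1, 2,] output above:

```r
# Hypothetical helper: apply a standardized `info` list to a plain array.
subset_array <- function(x, info, drop = FALSE) {
  # Gather subscripts in dimension order; list() keeps the NULL placeholders
  subscripts <- c(list(info$i, info$j), info$dots)
  # NULL means "counted but missing", so select everything along that dimension
  subscripts <- lapply(subscripts, function(s) if (is.null(s)) TRUE else s)
  do.call(`[`, c(list(x), subscripts, list(drop = drop)))
}

arr <- array(1:24, dim = c(2, 3, 4))

# Hand-written stand-in for what `x[1, 2,]` produced above
info <- list(i = 1, j = 2, dots = list(NULL), transformed = FALSE)

res <- subset_array(arr, info)
dim(res)  # 1 1 4, the same as arr[1, 2, , drop = FALSE]
```

Replacing NULL with TRUE relies on logical recycling to select a whole dimension. Other encodings are possible, and this one is chosen only to keep the sketch short.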

standardize's Issues

Enforce that `i` and `j` are adjacent

If the signature is x[i, k, j] for some strange reason, then x[i,] won’t be counted correctly.

This could be done efficiently with a single match(ij, fml_names) call that detects whether both are present. You then subtract the integer positions and ensure the distance between them is 1.
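
A rough sketch of the check described above, written as a standalone function. The name check_adjacent() and the error messages are hypothetical. The issue only proposes the match() and distance idea, and nothing here reflects how the package implements it:

```r
# Hypothetical check: are `i` and `j` adjacent formals, in that order?
check_adjacent <- function(fn) {
  fml_names <- names(formals(fn))
  # A single match() finds both positions (NA if either is absent)
  pos <- match(c("i", "j"), fml_names)
  if (anyNA(pos)) {
    stop("Both `i` and `j` must be formal arguments.", call. = FALSE)
  }
  if (pos[2] - pos[1] != 1L) {
    stop("`j` must come directly after `i` in the signature.", call. = FALSE)
  }
  invisible(TRUE)
}

check_adjacent(function(x, i, j, drop = FALSE) NULL)  # passes silently
try(check_adjacent(function(x, i, k, j) NULL))        # reports an error
```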

Integrate in ellipsis?

Why:

  • Downstream packages are accessing tibble's internals to implement [.foo() 🙃

Why there:

  • Out of scope for rlang?
  • A borderline fit for ellipsis?
  • Doesn't warrant a separate package?
  • Faster time to CRAN
  • Don't want to rewrite the code to use rlang's internal API

Return `transformed` indicator

To let you know whether x[i] was transformed to x[,j], since otherwise you can’t tell them apart. Tibble would use this to throw a warning when x[i, drop=T] is specified.
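
As an illustration of how such an indicator might be consumed, here is a small sketch. The helper warn_if_drop_ignored(), its drop_supplied argument, and the warning text are all hypothetical and are not taken from standardize or tibble:

```r
# Hypothetical helper: warn when `drop` was supplied alongside `x[i]`,
# which was transformed into column subsetting.
warn_if_drop_ignored <- function(info, drop_supplied) {
  if (isTRUE(info$transformed) && drop_supplied) {
    warning("`drop` is ignored when subsetting columns with `x[i]`.", call. = FALSE)
  }
  invisible(info)
}

# Hand-written stand-in for the output of `df[1, drop = FALSE]`
info <- list(i = NULL, j = 1, dots = list(), transformed = TRUE)

# Inside a `[` method, `drop_supplied` could be computed as `!missing(drop)`
outcome <- tryCatch(
  {
    warn_if_drop_ignored(info, drop_supplied = TRUE)
    "no warning"
  },
  warning = function(w) "warned"
)
outcome
```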
