winvector / wrapr



Wrap R for Sweet R Code

Home Page: https://winvector.github.io/wrapr/

License: Other

R 91.57% HTML 8.43%

wrapr's Introduction

wrapr is an R package that supplies powerful tools for writing and debugging R code.

Introduction

Primary wrapr services include:

%.>% (dot arrow pipe)
unpack/to (assign to multiple values)
as_named_list (build up a named list quickly)
build_frame() / draw_frame() (data.frame builders and formatters)
bc() (blank concatenate)
qc() (quoting concatenate)
:= (named map builder)
%?% (coalesce)
%.|% (reduce/expand args)
uniques() (safe unique() replacement)
partition_tables() / execute_parallel()
DebugFnW() (function debug wrappers)
λ() (anonymous function builder)
let() (let block)
evalb()/si() (evaluate with bquote / string interpolation)
sortv() (sort a data.frame by a set of columns)
stop_if_dot_args() (check for unexpected arguments)

library(wrapr)
packageVersion("wrapr")
 #  [1] '2.1.0'
date()
 #  [1] "Sat Aug 19 09:06:13 2023"

`%.>%` (dot pipe or dot arrow)

The %.>% dot arrow pipe is a pipe with the following intended semantics:

“a %.>% b” is to be treated approximately as if the user had written “{ . <- a; b };” with “%.>%” being treated as left-associative.

Other R pipes include magrittr and pipeR.

The following two expressions should be equivalent:

cos(exp(sin(4)))
 #  [1] 0.8919465

4 %.>% sin(.) %.>% exp(.) %.>% cos(.)
 #  [1] 0.8919465
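
As a rough illustration of the stated semantics, the pipeline above can be expanded by hand in base R. This is only a sketch of the "{ . <- a; b }" reading, not wrapr's implementation. The names stage1, stage2, and stage3 are hypothetical, and local() is used so that "." does not leak into the calling environment.

```r
# Hand expansion of: 4 %.>% sin(.) %.>% exp(.) %.>% cos(.)
# Left-associativity means each stage sees the previous stage's value as ".".
stage1 <- local({ . <- 4;      sin(.) })
stage2 <- local({ . <- stage1; exp(.) })
stage3 <- local({ . <- stage2; cos(.) })

# The expanded form agrees with the nested call.
print(all.equal(stage3, cos(exp(sin(4)))))
```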

The notation is quite powerful as it treats pipe stages as expressions parameterized over the variable “.”. This means you do not need to introduce functions to express stages. The following is a valid dot-pipe:

1:4 %.>% .^2 
 #  [1]  1  4  9 16

The notation is also very regular as we show below.

1:4 %.>% sin
 #  [1]  0.8414710  0.9092974  0.1411200 -0.7568025
1:4 %.>% sin(.)
 #  [1]  0.8414710  0.9092974  0.1411200 -0.7568025
1:4 %.>% base::sin
 #  [1]  0.8414710  0.9092974  0.1411200 -0.7568025
1:4 %.>% base::sin(.)
 #  [1]  0.8414710  0.9092974  0.1411200 -0.7568025

1:4 %.>% function(x) { x + 1 }
 #  [1] 2 3 4 5
1:4 %.>% (function(x) { x + 1 })
 #  [1] 2 3 4 5

1:4 %.>% { .^2 } 
 #  [1]  1  4  9 16
1:4 %.>% ( .^2 )
 #  [1]  1  4  9 16

Regularity can be a big advantage in teaching and comprehension. Please see “In Praise of Syntactic Sugar” for more details. Some formal documentation can be found here.

Some obvious “dot-free” right-hand sides are rejected. Pipelines are meant to move values through a sequence of transforms, and not just for side-effects. Example: `5 %.>% 6` deliberately stops as `6` is a right-hand side that obviously does not use its incoming value. This check is only applied to values, not functions on the right-hand side.
Trying to pipe into a “zero argument function evaluation expression” such as `sin()` is prohibited as it looks too much like the user declaring `sin()` takes no arguments. One must pipe into either a function, function name, or a non-trivial expression (such as `sin(.)`). A useful error message is returned to the user: `wrapr::pipe does not allow direct piping into a no-argument function call expression (such as "sin()" please use sin(.))`.
Some reserved words cannot be piped into. For example, `5 %.>% return(.)` is prohibited as the obvious pipe implementation would not actually escape from user functions as users may intend.
Obvious de-references (such as `$`, `::`, `@`, and a few more) on the right-hand side are performed (example: `5 %.>% base::sin(.)`).
Outer parentheses on the right-hand side are removed (example: `5 %.>% (sin(.))`).
Anonymous function constructions are evaluated so the function can be applied (example: `5 %.>% function(x) {x+1}` returns 6, just as `5 %.>% (function(x) {x+1})(.)` does).
Checks and transforms are not performed on items inside braces (example: `5 %.>% { function(x) {x+1} }` returns `function(x) {x+1}`, not 6).
The dot arrow pipe has S3/S4 dispatch. However, as the right-hand side of the pipe is normally held unevaluated, we don’t know the type except in special cases (such as the right-hand side being referred to by a name or variable). To force the evaluation of a pipe term, simply wrap it in `.()`.
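
To make the "evaluate the right-hand side with `.` bound, and apply it if it turns out to be a function" idea concrete, here is a toy operator in base R. It is a hypothetical sketch, not wrapr's implementation: the name `%toy>%` is invented for illustration, and none of the checks listed above (dot-free values, `sin()`, `return(.)`, special handling of braces) are reproduced.

```r
# Toy dot-pipe: illustrates the core idea only, NOT wrapr's implementation.
`%toy>%` <- function(lhs, rhs) {
  rhs_expr <- substitute(rhs)                  # keep the right-hand side unevaluated
  env <- new.env(parent = parent.frame())      # scratch environment that holds "."
  assign(".", lhs, envir = env)                # bind the incoming value to "."
  value <- eval(rhs_expr, envir = env)         # evaluate the stage with "." available
  if (is.function(value)) value <- value(lhs)  # a function result is applied to the input
  value
}

squares <- 1:4 %toy>% .^2                    # expression stage
sines   <- 1:4 %toy>% sin                    # function-name stage
plus1   <- 5 %toy>% function(x) { x + 1 }    # anonymous-function stage
print(squares)
print(plus1)
```

Unlike the real operator, this toy would also apply a function returned from inside braces, so it should be read as a teaching aid rather than a substitute.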

The dot pipe is also user configurable through standard S3/S4 methods.

The dot pipe has been formally written up in the R Journal.

@article{RJ-2018-042,
  author = {John Mount and Nina Zumel},
  title = {{Dot-Pipe: an S3 Extensible Pipe for R}},
  year = {2018},
  journal = {{The R Journal}},
  url = {https://journal.r-project.org/archive/2018/RJ-2018-042/index.html}
}

`unpack`/`to` multiple assignments

Unpack a named list into the current environment by name (for a position-based multiple assignment operator please see zeallot; for another name-based multiple assignment please see vadr::bind).

d <- data.frame(
  x = 1:9,
  group = c('train', 'calibrate', 'test'),
  stringsAsFactors = FALSE)

unpack[
  train_data = train,
  calibrate_data = calibrate,
  test_data = test
  ] := split(d, d$group)

knitr::kable(train_data)

	x	group
1	1	train
4	4	train
7	7	train
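
For comparison, the same by-name assignment can be written out by hand in base R. This sketch only shows what the unpack call saves you from typing; the intermediate name parts is hypothetical.

```r
d <- data.frame(
  x = 1:9,
  group = c('train', 'calibrate', 'test'),
  stringsAsFactors = FALSE)

parts <- split(d, d$group)   # a named list with one data.frame per group

# Pick each piece out by name rather than by position.
train_data     <- parts[['train']]
calibrate_data <- parts[['calibrate']]
test_data      <- parts[['test']]

print(train_data$x)
```

Selecting by name is what makes this robust to the order in which split() happens to return the groups.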

`as_named_list`

Build up named lists. Very convenient for managing workspaces when used with unpack/to.

as_named_list(train_data, calibrate_data, test_data)
 #  $train_data
 #    x group
 #  1 1 train
 #  4 4 train
 #  7 7 train
 #  
 #  $calibrate_data
 #    x     group
 #  2 2 calibrate
 #  5 5 calibrate
 #  8 8 calibrate
 #  
 #  $test_data
 #    x group
 #  3 3  test
 #  6 6  test
 #  9 9  test
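
The idea of naming list entries after the variables that supplied them can be sketched in base R. The helper named_list_sketch below is hypothetical and much simpler than as_named_list(); it assumes every argument is a plain variable name.

```r
# Hypothetical helper: name each list entry after the expression that produced it.
named_list_sketch <- function(...) {
  arg_exprs <- as.list(substitute(list(...)))[-1]        # unevaluated arguments
  arg_names <- vapply(arg_exprs, deparse, character(1))  # their source text
  values <- list(...)                                    # evaluated arguments
  names(values) <- arg_names
  values
}

alpha <- 1:3
beta <- "b"
res <- named_list_sketch(alpha, beta)
print(names(res))
```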

`build_frame()` / `draw_frame()`

build_frame() is a convenient way to type in a small example data.frame in natural row order. This can be very legible and saves having to perform a transpose in one’s head. draw_frame() is the complementary function that formats a given data.frame (and is a great way to produce neatened examples).

x <- build_frame(
   "measure"                   , "training", "validation" |
   "minus binary cross entropy", 5         , -7           |
   "accuracy"                  , 0.8       , 0.6          )
print(x)
 #                       measure training validation
 #  1 minus binary cross entropy      5.0       -7.0
 #  2                   accuracy      0.8        0.6
str(x)
 #  'data.frame':   2 obs. of  3 variables:
 #   $ measure   : chr  "minus binary cross entropy" "accuracy"
 #   $ training  : num  5 0.8
 #   $ validation: num  -7 0.6
cat(draw_frame(x))
 #  x <- wrapr::build_frame(
 #     "measure"                     , "training", "validation" |
 #       "minus binary cross entropy", 5         , -7           |
 #       "accuracy"                  , 0.8       , 0.6          )

`qc()` (quoting concatenate)

qc() is a quoting variation on R’s concatenate operator c(). It allows code such as the following:

qc(a = x, b = y)
 #    a   b 
 #  "x" "y"

qc(one, two, three)
 #  [1] "one"   "two"   "three"

qc() also allows bquote() driven .()-style argument escaping.

aname <- "I_am_a"
yvalue <- "six"

qc(.(aname) := x, b = .(yvalue))
 #  I_am_a      b 
 #     "x"  "six"

Notice the := notation is required for syntactic reasons.
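
The basic quoting behaviour (without the .() escaping) can be approximated in base R by capturing the arguments unevaluated and converting them to text. The helper quote_args_sketch is hypothetical and is not how qc() is implemented.

```r
# Hypothetical helper: return the source text of each argument instead of its value.
quote_args_sketch <- function(...) {
  arg_exprs <- as.list(substitute(list(...)))[-1]   # arguments, still unevaluated
  vapply(arg_exprs,
         function(e) paste(deparse(e), collapse = ""),
         character(1))                              # argument names are carried over
}

plain <- quote_args_sketch(one, two, three)
named <- quote_args_sketch(a = x, b = y)
print(plain)
print(named)
```

Because the text is produced from parsed R expressions, anything the parser normalizes is normalized in the result as well.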

`:=` (named map builder)

:= is the “named map builder”. It allows code such as the following:

'a' := 'x'
 #    a 
 #  "x"

The important property of named map builder is it accepts values on the left-hand side allowing the following:

name <- 'variableNameFromElsewhere'
name := 'newBinding'
 #  variableNameFromElsewhere 
 #               "newBinding"

A nice property is := commutes (in the sense of algebra or category theory) with R’s concatenation function c(). That is the following two statements are equivalent:

c('a', 'b') := c('x', 'y')
 #    a   b 
 #  "x" "y"

c('a' := 'x', 'b' := 'y')
 #    a   b 
 #  "x" "y"

The named map builder is designed to synergize with seplyr.
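
The commuting property can be checked with a base-R analogue, using stats::setNames() in place of := (the two are assumed to play the same role here only for the purpose of illustration).

```r
# Name a whole vector at once ...
paired_then_named <- setNames(c('x', 'y'), c('a', 'b'))
# ... or name each element and then concatenate.
named_then_paired <- c(setNames('x', 'a'), setNames('y', 'b'))

print(identical(paired_then_named, named_then_paired))
```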

`%?%` (coalesce)

The coalesce operator tries to replace elements of its first argument with elements from its second argument. In particular %?% replaces NULL vectors and NULL/NA entries of vectors and lists.

Example:

c(1, NA) %?% list(NA, 20)
 #  [1]  1 20
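
A simplified version of this behaviour for atomic vectors can be written in base R. The function coalesce_sketch is hypothetical; it assumes both arguments have the same length and does not cover every case %?% handles.

```r
# Hypothetical helper: fill NULL / NA entries of `primary` from `fallback`.
coalesce_sketch <- function(primary, fallback) {
  if (is.null(primary)) return(fallback)   # a NULL first argument is replaced outright
  fallback <- unlist(fallback)             # accept a list or a vector as the fallback
  missing_idx <- is.na(primary)
  primary[missing_idx] <- fallback[missing_idx]
  primary
}

filled <- coalesce_sketch(c(1, NA), list(NA, 20))
print(filled)
```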

`%.|%` (reduce/expand args)

x %.|% f stands for f(x[[1]], x[[2]], ..., x[[length(x)]]). f %|.% x also stands for f(x[[1]], x[[2]], ..., x[[length(x)]]). The two operators are the same; the variation just allows the user to choose the order in which they write things. The mnemonic is: “data goes on the dot-side of the operator.”

args <- list('prefix_', c(1:3), '_suffix')

args %.|% paste0
 #  [1] "prefix_1_suffix" "prefix_2_suffix" "prefix_3_suffix"

paste0 %|.% args
 #  [1] "prefix_1_suffix" "prefix_2_suffix" "prefix_3_suffix"
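
In base R the same "spread a list over a function's arguments" step is usually written with do.call(). The comparison below is a sketch of the equivalence described above, not a statement about how the operators are implemented.

```r
args <- list('prefix_', c(1:3), '_suffix')

spread  <- do.call(paste0, args)                     # f(x[[1]], x[[2]], x[[3]])
by_hand <- paste0(args[[1]], args[[2]], args[[3]])   # the same call written out

print(identical(spread, by_hand))
```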

`DebugFnW()`

DebugFnW() wraps a function for debugging. If the function throws an exception the execution context (function arguments, function name, and more) is captured and stored for the user. The function call can then be reconstituted, inspected and even re-run with a step-debugger. Please see our free debugging video series and vignette('DebugFnW', package='wrapr') for examples.
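
The general capture-on-error pattern can be sketched in base R with tryCatch(). Everything below is hypothetical and far simpler than DebugFnW(): the helper debug_wrap_sketch, the storage environment captured, and the field name last_failure are all invented for illustration.

```r
# Hypothetical wrapper: on error, store the function and its arguments, then re-raise.
debug_wrap_sketch <- function(f, save_env) {
  force(f)
  function(...) {
    args <- list(...)
    tryCatch(
      f(...),
      error = function(e) {
        save_env$last_failure <- list(fn = f, args = args, error = e)
        stop(e)   # re-signal the original error to the caller
      })
  }
}

captured <- new.env()
checked_log <- debug_wrap_sketch(
  function(x) { if (x < 0) stop("negative input"); log(x) },
  captured)

res <- try(checked_log(-1), silent = TRUE)   # fails, and the context is saved
print(captured$last_failure$args)

# The failing call could later be reconstituted, for example under a debugger:
# do.call(captured$last_failure$fn, captured$last_failure$args)
```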

`λ()` (anonymous function builder)

λ() is a concise abstract function creator or “lambda abstraction”. It is a placeholder that allows the use of the λ-character for very concise function abstraction.

Example:

# Make sure lambda function builder is in our environment.
wrapr::defineLambda()

# square numbers 1 through 4
sapply(1:4, λ(x, x^2))
 #  [1]  1  4  9 16

`let()`

let() allows execution of arbitrary code with substituted variable names (note this is subtly different than binding values for names as with base::substitute() or base::with()).

The function is simple and powerful. It treats strings as variable names and re-writes expressions as if you had used the denoted variables. For example the following block of code is equivalent to having written “a + a”.

a <- 7

let(
  c(VAR = 'a'),
  
  VAR + VAR
)
 #  [1] 14

This is useful in re-adapting non-standard evaluation interfaces (NSE interfaces) so one can script or program over them.
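
The name-substitution idea can be illustrated with base::substitute() applied to an explicit name mapping. This is a sketch of the concept only; let() adds checks and options that are not shown here.

```r
a <- 7

# Replace the placeholder symbol VAR with the symbol `a`, then evaluate.
expr <- substitute(VAR + VAR, list(VAR = as.name('a')))
print(expr)

value <- eval(expr)
print(value)
```

The key point is that the string 'a' is converted to a name before substitution, so the rewritten code refers to the variable a rather than to the character value "a".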

We are trying to make let() self-teaching and self-documenting (to the extent that makes sense). For example, try the argument eval=FALSE to prevent execution and see what would have been executed, or debugPrint=TRUE to have the replaced code printed in addition to being executed:

let(
  c(VAR = 'a'),
  eval = FALSE,
  {
    VAR + VAR
  }
)
 #  {
 #      a + a
 #  }

let(
  c(VAR = 'a'),
  debugPrint = TRUE,
  {
    VAR + VAR
  }
)
 #  $VAR
 #  [1] "a"
 #  
 #  {
 #      a + a
 #  }
 #  [1] 14

Please see vignette('let', package='wrapr') for more examples. Some formal documentation can be found here. wrapr::let() was inspired by gtools::strmacro() and base::bquote(), please see here for some notes on macro methods in R.

`evalb()`/`si()` (evaluate with `bquote` / string interpolation)

wrapr supplies unified notation for quasi-quotation and string interpolation.

angle = 1:10
variable <- "angle"

# # execute code
# evalb(
#   plot(x = .(-variable), y = sin(.(-variable)))
# )

# alter string
si("plot(x = .(variable), y = .(variable))")
 #  [1] "plot(x = \"angle\", y = \"angle\")"

The extra .(-x) form is a shortcut for .(as.name(x)).
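
The .(as.name(x)) form mentioned above can be tried directly with base::bquote(). The sketch below builds and evaluates sin(angle) from a variable name held in a string, and uses sprintf() as a stand-in for string interpolation; the variable code_text is hypothetical.

```r
angle <- 1:10
variable <- "angle"

# Quasi-quotation: splice the *name* angle into the expression.
expr <- bquote(sin(.(as.name(variable))))
print(expr)
values <- eval(expr)

# Plain string interpolation with base R, for comparison.
code_text <- sprintf("plot(x = %s, y = sin(%s))", variable, variable)
print(code_text)
```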

`sortv()` (sort a data.frame by a set of columns)

This is the sort command that is missing from R: sort a data.frame by a chosen set of columns specified in a variable.

d <- data.frame(
  x = c(2, 2, 3, 3, 1, 1), 
  y = 6:1,
  z = 1:6)
order_cols <- c('x', 'y')

sortv(d, order_cols)
 #    x y z
 #  6 1 1 6
 #  5 1 2 5
 #  2 2 5 2
 #  1 2 6 1
 #  4 3 3 4
 #  3 3 4 3
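
For reference, the same ordering can be obtained in base R by passing the chosen columns to order() through do.call(). This is a sketch of the underlying idea, not necessarily how sortv() is written; unname() is used so that column names cannot collide with order()'s own argument names.

```r
d <- data.frame(
  x = c(2, 2, 3, 3, 1, 1),
  y = 6:1,
  z = 1:6)
order_cols <- c('x', 'y')

# Build the row permutation from the columns named in order_cols.
row_order <- do.call(order, unname(as.list(d[order_cols])))
sorted <- d[row_order, , drop = FALSE]
print(sorted)
```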

Installation

Install with:

install.packages("wrapr")

More Information

More details on wrapr capabilities can be found in the following two technical articles:

Note

Note: wrapr is meant only for “tame names”, that is: variables and column names that are also valid simple (without quotes) R variable names.


wrapr's Issues

Licensing : possible to use LGPL instead?

Sorry to be a nuisance, but is it possible for wrapr to be on a slightly more 'permissive' license, like LGPL or perhaps GPL with 'loading/importing/linking' exception? (the terminology for what happens when one R package 'imports' or 'loads' a function from another R package is so confusing....)

For what it is worth, magrittr has an MIT license ... but wrapr seems to have strong advantages.

add international character tests

Hi again,

So, it is working great so far with bc in the admittedly not that frequent situations where I need it, but in those cases it's really nice to have and saves a lot of tedious editing.

Here are some tests that would be nice to have for bc() - I have no experience in using git, so right now I'll have to write it like this:



library(wrapr)
library(tinytest)
# test of lowercase non-english letters (Danish: æ, ø and å)
expect_equal(
	bc('person_id, geography, danish_letter_æ, danish_letter_ø, danish_letter_å'),
  c("person_id", "geography", "danish_letter_æ", "danish_letter_ø", "danish_letter_å")
)

# test of mix of upcase non-english letters (Danish: Æ, Ø and Å)
expect_equal(
	bc('person_id, geography, danish_letter_Æ, danish_letter_æ, danish_letter_Ø, danish_letter_Å'),
  c("person_id", "geography", "danish_letter_Æ", "danish_letter_æ",  "danish_letter_Ø", "danish_letter_Å")
)

Originally posted by @emilBeBri in #12 (comment)

Character encoding causes order to error-out

From: WinVector/cdata#6 .

Only happens in knitr or reprex contexts:

ct = data.frame(
  variable = c("privée", "publique"),
  value = c("privée", "publique"),
  stringsAsFactors = FALSE
)
wrapr::has_no_dup_rows(ct)
#> Error in (function (..., na.last = TRUE, decreasing = FALSE, method = c("auto", : Character encoding must be UTF-8, Latin-1 or bytes

Created on 2019-03-25 by the reprex package (v0.2.1)

Macros in R, as.call

The Macros in R article is a well-made analysis. Thank you very much for sharing.
As feedback, I would like to note that AFAIK eval(as.call(c(as.name("fun"), ...))) should be preferred over do.call. The latter may impose an overhead when constructing arguments for do.call.
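
The two call styles being compared can be written side by side as follows. This sketch only shows that they produce the same value for a simple case; it makes no claim about their relative speed, and mean() is used purely as a stand-in for "fun".

```r
args <- list(1:10)

# Style 1: do.call with the function name and an argument list.
via_do_call <- do.call("mean", args)

# Style 2: build the call object explicitly, then evaluate it.
call_obj <- as.call(c(list(as.name("mean")), args))
via_as_call <- eval(call_obj)

print(identical(via_do_call, via_as_call))
```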

Feature Request: Option to enable benchmarking on pipes

Hi all! I really like this lib; I have practically replaced magrittr with it.

Now, after some time playing with it, I think it is very important in languages like R to be able to know the timing of each line.

R is not "uniform": there are many operations where a small change in how you do something has a large impact on performance.

So I was thinking it would be great to have a way to get benchmarks, that is, the time taken by every line, to measure performance and know which lines took the most time so one can focus on improving that code.

I was thinking like:

options(wrapr_pipe_benchmark = TRUE)

This is the basis of the idea; there are still some open points:

How to collect/show the data
There must be a way to handle depth: benchmarks on one function, on several functions, per function

I know that for a full benchmark there is one missing piece: an operator to replace <- with similar behavior. But we can also do funny things like f(x) %.>% . and get the same values :3

Thx!
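
One way to picture the request is a small base-R helper that times each stage and records the result. This is a hypothetical sketch of the idea only: timed_stage, timing_log, and the stage labels are invented here, and the option named in the request is not known to exist in wrapr.

```r
# Hypothetical per-stage timing log.
timing_log <- new.env()
timing_log$timings <- list()

# Run one stage, record its elapsed time under `label`, and pass the value on.
timed_stage <- function(value, f, label) {
  start <- proc.time()[['elapsed']]
  out <- f(value)
  timing_log$timings[[label]] <- proc.time()[['elapsed']] - start
  out
}

result <- timed_stage(4, sin, 'sin')
result <- timed_stage(result, exp, 'exp')
result <- timed_stage(result, cos, 'cos')

print(names(timing_log$timings))
```

Collecting timings in an environment keeps the pipeline's values untouched, which addresses the "how to collect the data" point without changing what each stage returns.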

unexpected behaviour: qc() removes leading zeroes

Hi,

qc(), which is a favorite of mine and an integral part of my workflow, produces one unexpected result - given that the name of the function is "quoting concatenate":

# produces '0' as expected
wrapr::qc(0)
# produces '1' as expected
wrapr::qc(1)
# produces '1' - not as expected
wrapr::qc(01)

The background is I have some identification numbers that are numbers-only, but some contain leading zeroes. I can't use qc() to select them :)

> sessionInfo()
R version 4.1.1 (2021-08-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=en_DK.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_DK.UTF-8        LC_COLLATE=en_DK.UTF-8    
 [5] LC_MONETARY=en_DK.UTF-8    LC_MESSAGES=en_DK.UTF-8   
 [7] LC_PAPER=en_DK.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_DK.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] wrapr_2.0.8

loaded via a namespace (and not attached):
[1] compiler_4.1.1
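
A plausible explanation, assuming qc() works from arguments that R has already parsed, is that the leading zero is lost before any function sees it. The base-R check below shows what the parser does with 01; the ids vector is a hypothetical workaround using ordinary quoted strings.

```r
parsed <- quote(01)      # the parser reads 01 as the number 1
print(class(parsed))
print(deparse(parsed))

# Quoting the identifiers explicitly keeps the leading zeroes.
ids <- c('01', '007')
print(ids)
```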

Unexpected Behavior

John,
long time user of your products but the updated qc is not acting like I believe it should.

t1 <- c('ABC-PB','DEF-PB')
t2 <- qc(ABC-PB,DEF-PB)
identical(t1,t2)

t2 is appearing as ('ABC - PB','DEF - PB') - notice the spaces around the "-"
thanks for all your efforts!
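
The spaces are consistent with ABC-PB being parsed as a subtraction and then converted back to text. The base-R check below shows that behaviour; whether qc() takes exactly this path is an assumption.

```r
parsed <- quote(ABC-PB)   # parsed as the call `-`(ABC, PB)
print(class(parsed))
print(deparse(parsed))    # deparse() puts spaces around the binary minus
```

Under that assumption, values containing operators such as "-" need to be written as quoted strings to survive unchanged.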

Some inconsistencies with `%.>%` and parens use

In the couple of cases below, wrapr is not consistent where the alternatives are.

library(magrittr)
library(pipeR)
library(wrapr)

1:5 %>% mean
#> [1] 3
1:5 %.>% mean
#> [1] 3
1:5 %>>% mean
#> [1] 3

mean <- "foo"

1:5 %>% mean
#> [1] 3
1:5 %.>% mean
#> Error: wrapr::apply_right_S4 default called with classes:
#>   integer 
#>  mean character 
#>   must have a more specific S4 method defined to dispatch
1:5 %>>% mean
#> [1] 3

1:5 %>% mean()
#> [1] 3
1:5 %.>% mean(.)
#> [1] 3
1:5 %>>% mean()
#> [1] 3

test %>% substitute
#> Error in eval(lhs, parent, parent): object 'test' not found
test %.>% substitute
#> test
test %>>% substitute
#> Error in test %>>% substitute: object 'test' not found

`%.%` <- function(e1,e2){
  eval.parent(eval(substitute(substitute(e2,list(. = substitute(e1))))))
}
test %>% substitute(.)
#> Error in eval(lhs, parent, parent): object 'test' not found
test %.>% substitute(.)
#> Error in eval(pipe_left_arg, envir = pipe_environment, enclos = pipe_environment): object 'test' not found
test %>>% substitute(.)
#> Error in test %>>% substitute(.): object 'test' not found
test %.% substitute(.)
#> test

Created on 2018-12-16 by the reprex package (v0.2.0).

Invalid names in alias values

I'm writing a script where all operations are done over a user-supplied CSV. The column names might not be standard -- most often, a column might begin with a number (such as "4GS"). At one point I iterate over the columns with wrapr::let:

lapply(colnames(df), function(col_name) {
  wrapr::let(
    alias = list(COL_NAME = col_name),
    exprs = {
      df %>%
        dplyr::group_by(COL_NAME)
        .
        .
        .
    }
  )
})

Which leads to an "alias value not a valid name".

I understand that this is part of the wrapr::let design philosophy ("let deliberately checks that it is mapping only to legal R names; this is to discourage the use of let to make names to arbitrary values, as that is more properly left to R's environment systems"). Do you have any plans to change that philosophy and allow invalid names? Alternatively, could you recommend another approach to dealing with this data?

Thank you!
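
One possible workaround, offered here only as a sketch and not as the maintainers' recommendation, is to rename the columns to syntactic R names before iterating, keeping a lookup so the original names can be restored. base::make.names() performs that conversion; the vectors raw_names, safe_names, and name_lookup are hypothetical.

```r
raw_names <- c('4GS', 'ok_name', 'has space')

# Convert to syntactically valid, unique R names.
safe_names <- make.names(raw_names, unique = TRUE)
print(safe_names)

# Keep a mapping so results can be relabelled with the original names later.
name_lookup <- setNames(raw_names, safe_names)
print(name_lookup[['X4GS']])
```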

Unexpected Behaviour with Aliased List Variables

There seems to be inconsistency in referencing named items in a list passed in as a parameter:

params.list <- list(
  MQT_TBL_1_ = 'dir_p_pts',
  MQT_AGG_TYPE_ = 'max'
)

applyAgg <- function(params.list.agg){
  wrapr::let(
    alias=list(MQT_TBL_1_=params.list.agg$MQT_TBL_1_),
    expr={
      print(params.list.agg$MQT_TBL_1_)         # This doesn't work (aliased)
      print(params.list.agg[["MQT_TBL_1_"]])   # This does work     (aliased)
      
      print(params.list.agg$MQT_AGG_TYPE_)       # This does work (not aliased)     
      print(params.list.agg[["MQT_AGG_TYPE_"]]) # This does work (not aliased)
    })
}

applyAgg(params.list)
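
The difference between the `$` and `[[ ]]` forms can be reproduced with base::substitute(), assuming let() replaces matching symbols in the code: the name after `$` is a symbol and gets rewritten, while the string inside `[[ ]]` is left alone. The list p below is a hypothetical stand-in for params.list.agg.

```r
p <- list(MQT_TBL_1_ = 'dir_p_pts')
mapping <- list(MQT_TBL_1_ = as.name('dir_p_pts'))

dollar_expr  <- substitute(p$MQT_TBL_1_, mapping)        # symbol after $ is replaced
bracket_expr <- substitute(p[['MQT_TBL_1_']], mapping)   # string is not a symbol

print(dollar_expr)
print(eval(dollar_expr))    # no element named dir_p_pts, so NULL
print(eval(bracket_expr))
```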

Feature Request: wrapr::let map a name to something 'other than a name.'

If the 'methods of' mapping from desired names to names used in the data were liberalized, then this liberalization would be very useful. For example, a name could map to 'other than a name.'

# sessionInfo() # [1] wrapr_0.2.0

let( alias=list(SORT_COLUMNS = "cyl"), { 
  head(plyr::arrange(mtcars, SORT_COLUMNS)) }, 
subsMethod = 'stringsubs')

   mpg cyl  disp hp drat    wt  qsec vs am gear carb
1 22.8   4 108.0 93 3.85 2.320 18.61  1  1    4    1
2 24.4   4 146.7 62 3.69 3.190 20.00  1  0    4    2
3 22.8   4 140.8 95 3.92 3.150 22.90  1  0    4    2
4 32.4   4  78.7 66 4.08 2.200 19.47  1  1    4    1
5 30.4   4  75.7 52 4.93 1.615 18.52  1  1    4    2
6 33.9   4  71.1 65 4.22 1.835 19.90  1  1    4    1

let( alias=list(SORT_COLUMNS = "cyl, disp"), {  
  head(plyr::arrange(mtcars, SORT_COLUMNS )) }
, subsMethod = 'stringsubs') 

Error in prepareAlias(alias) :
  wrapr:let alias value not a valid name: " cyl, disp "

eval(parse(text=stringr::str_interp("

  require(magrittr)
  head(plyr::arrange(mtcars, ${SORT_COLUMNS} )) %>%
  print
  
", list(SORT_COLUMNS = "cyl, disp"))))

WORKS

   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
1 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
2 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
3 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
4 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
5 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
6 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1

LET <- function(alias = NULL, string_expr = NULL) {
  eval(parse(text=stringr::str_interp(string = string_expr, env = alias)))
}
LET(alias = list(SORT_COLUMNS = "cyl, disp")
  , string_expr = "head(plyr::arrange(mtcars, ${SORT_COLUMNS} ))"  
)

WORKS

Without the hard-coded data.frame mtcars
Good enough for piping

LETP <- function(payload = NULL, alias = NULL, string_expr = NULL) {
  x <- payload
  eval(parse(text=stringr::str_interp(string = string_expr, env = alias)))
}

library(magrittr)

mtcars %>%
 { LETP(payload = .
, alias = list(SORT_COLUMNS = "cyl, disp")
, string_expr = "head(plyr::arrange(x, ${SORT_COLUMNS} ))"  ) }

WORKS

shortest working way to write

mtcars %>% LETP(., list(SORT_COLUMNS = "cyl, disp"), "head(plyr::arrange(x, ${SORT_COLUMNS} ))")

WORKS

   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
1 33.9   4  71.1  65 4.22 1.835 19.90  1  1    4    1
2 30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
3 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
4 27.3   4  79.0  66 4.08 1.935 18.90  1  1    4    1
5 30.4   4  95.1 113 3.77 1.513 16.90  1  1    5    2
6 22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1

suggestion for qc function

Hi, I love the qc() function, thank you for that. I was wondering if you wanted to add another qc-like function for the following use case:

You're copy-pasting something into R, where the spaces are the markers between different elements in the vector, like so:

259 289 287

You'd like to quickly turn it into this:

vector_x <- c(259, 289, 287)

By doing something like this:
vector_x <- bc(259 289 287)

where bc() (blank concatenate) is a function that understands that the spaces mark different elements.

or perhaps the best that could be done would be to put the whole vector in quotes and then let the function convert the spaces into commas and send it to c() (not sure how to approach this problem in programming)

vector_x <- bc("259 289 2872")

I'm aware that it would be bad form to put something like that into stable code, of course. But when you're working with something you're just trying out, it would actually save a lot time in the long run.

It could also be elements for a vector where something with tab or line changes denotes the different elements, but where the common form is that something like:

323                           9813                          3  
           234

should be translated into standard c() arguments like this:

vector_y <- c(323, 9813, 3, 234)

While I have made my first package a while ago, I don't have the expertise to do this, otherwise I would.

Hope you find the idea useful, otherwise just disregard it!
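
For numeric input, base R's scan() already handles the quoted-string variant of this idea, splitting on any run of spaces, tabs, or line breaks. The sketch below shows that approach; it is not the bc() function itself, and the variable pasted is hypothetical.

```r
# Space-separated values pasted as one string.
vector_x <- scan(text = '259 289 287', quiet = TRUE)
print(vector_x)

# Mixed spacing and a line break are treated the same way.
pasted <- '323                           9813                          3  \n           234'
vector_y <- scan(text = pasted, quiet = TRUE)
print(vector_y)
```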

How can you use dplyr::case_when programmatically with wrapr::let?

I want to be able to use dplyr::case_when to dynamically cut a database column, similar to how base::cut might work. I can generate a function to do this with rlang (below), but I find wrapr::let much more readable. How would this similar approach be done with let? Both the construction of the case_expr list and passing that list as the argument to case_when?

library(RSQLite)

cut_column_from_vector <- function(column_name, cut_vector){
    # get names in various formats 
    new_column_name   <- paste0(column_name, '_filter__')
    s_column_name     <- rlang::sym(column_name)
    s_new_column_name <- rlang::sym(new_column_name)
    
    # the vector shouldn't have names, but if it has them, use those names instead of the
    # canned ones then NULL out the names
    if (!is.null(names(cut_vector))){
        cut_names <- names(cut_vector)
        cut_vector <- unname(cut_vector)
    } else {
        cut_names <- cut_vector
    }
    
    # construct the object case_when needs to work 
    case_expr <- lapply(c(0, seq_along(cut_vector)), function(i){
        if (i == 0){
            lab <- sprintf('x<=%s', cut_names[i+1]) # a label
            rlang::expr(!!s_column_name <= cut_vector[!!i+1] ~ !!lab) # the expression
        } else if (i == length(cut_vector)) {
            lab <- sprintf('x>%s', cut_names[i])
            rlang::expr(!!s_column_name > cut_vector[!!i] ~ !!lab)
        } else {
            lab <- sprintf('%s<x<=%s', cut_names[i], cut_names[i+1])
            rlang::expr(!!s_column_name > cut_vector[!!i] & !!s_column_name <= cut_vector[!!i+1] ~ !!lab)
        }
    })
    
    # return the function
    return(function(data){
        dplyr::mutate(data, !!s_new_column_name := dplyr::case_when(!!!case_expr))
    })
}

# reprex
db <- dbConnect(SQLite(), ':memory:')
dbWriteTable(db, 'tbl_mtcars',  mtcars)
tbl_mtcars <- dplyr::tbl(db, 'tbl_mtcars')

cut_fn <- cut_column_from_vector('hp', c(100,200,300))
cut_fn(tbl_mtcars) # creates column hp_filter__

Apologies for the strange approach to generating a function that operates on the whole table--it makes sense in the context of the project.

wrapr::qc doesn't always replace c

Certain things work with c() but fail with qc()

Pls checkout SO question here

wrapr::let: "stringsubs" as the subsMethod default

IMHO, keep "stringsubs" as the subsMethod default. People are already using it in this way. The method is not bad.

IMHO, keep all three subsMethod methods. People exist that need 'let' to work in different(new) ways.
