
ggtrace's Introduction

I am a Ph.D. candidate in Linguistics at the University of Pennsylvania, studying psycholinguistics and language acquisition.

I'm also an R enthusiast and data visualization hobbyist. Outside of linguistics research, I develop open source software for statistical computing and graphics, data quality assurance, and (interfaces to) data APIs.

ggtrace's People

Contributors

yjunechoe

ggtrace's Issues

ggedit

New function ggedit() should work similarly to ggtrace(), except that it only takes the method/obj and calls trace() internally with edit = TRUE.

ggtrace_aes wrapper

takes an aesthetic and gets its value at stage, after_stat, after_scale

stage could be nice if data is inherited but data arg in layer is a function, so value of aes at stage is not transparent

#19 can be motivation

Early returns mean a trace at a particular step may not get triggered

Step 2 returns empty df if data is null, so breaks early here:

> ggtrace(
  ggplot2:::Layer$map_statistic,
  seq_len(length(ggbody(ggplot2:::Layer$map_statistic))),
  quote(1 + 1)
)
> ggplot()
Triggering trace on ggplot2:::Layer$map_statistic

[Step 1]> 1 + 1
[1] 2

[Step 2]> 1 + 1
[1] 2

Call `last_ggtrace()` to get the trace dump.
Untracing ggplot2:::Layer$map_statistic on exit.

Step 8 returns data if there are no calculated or staged aesthetics, so it returns early for StatIdentity as well

> ggtrace(
  ggplot2:::Layer$map_statistic,
  seq_len(length(ggbody(ggplot2:::Layer$map_statistic))),
  quote(1 + 1)
)
> ggplot(mtcars, aes(mpg, hp)) + geom_point()
Triggering trace on ggplot2:::Layer$map_statistic

[Step 1]> 1 + 1
[1] 2

[Step 2]> 1 + 1
[1] 2

[Step 3]> 1 + 1
[1] 2

[Step 4]> 1 + 1
[1] 2

[Step 5]> 1 + 1
[1] 2

[Step 6]> 1 + 1
[1] 2

[Step 7]> 1 + 1
[1] 2

[Step 8]> 1 + 1
[1] 2

Call `last_ggtrace()` to get the trace dump.
Untracing ggplot2:::Layer$map_statistic on exit.
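
The same effect can be reproduced with base `trace()` on a toy function. This is a minimal sketch that assumes nothing about ggtrace internals; `toy_stat` and `n_hits` are hypothetical names used only for illustration.

```r
# Toy function with an early return at step 2 (step 1 is the opening `{`)
toy_stat <- function(data) {
  if (is.null(data)) return("empty")
  data <- data + 1
  data * 2
}

n_hits <- 0

# The same tracer is attached to steps 2 to 4; it only counts how often it runs
trace("toy_stat", tracer = quote(n_hits <<- n_hits + 1), at = 2:4, print = FALSE)

toy_stat(NULL)          # returns early, so the tracers at steps 3 and 4 never run
hits_early <- n_hits

n_hits <- 0
toy_stat(1)             # no early return, so all three tracers run
hits_full <- n_hits

untrace("toy_stat")
c(early = hits_early, full = hits_full)
```

A trace placed on a step after an early return is simply never reached, which is why the number of triggered steps depends on the plot being built.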

remove all instances of ~line where it's not standalone

head(~line) appears in a couple places and should instead be passed in as ~line with .print = FALSE

The wording of the ggtrace() documentation should also change to say that only a standalone ~line will be substituted for the expression at the current step. Current wording:

To simply run a step (or reference the expression at a step), you can use the ~line keyword. All instances of ~line will get substituted by the expression inside the debugging environment.

Better string conversion for ggproto objects

ggproto objects with long names get truncated. Full names (+ not enclosed in <>) would be nice for messages (ex: returning the corresponding gguntrace() code as a message when using ggtrace(once = FALSE))

The offending line from ggtrace(): obj_name <- rlang::as_label(obj)

Reprex:

rlang::as_label(ggplot2::StatBoxplot)
[1] "<SttBxplt>"
rlang::expr_deparse(ggplot2::StatBoxplot, width = Inf)
[1] "<SttBxplt>"

rlang::as_label(ggplot2::StatBin)
[1] "<StatBin>"
rlang::expr_deparse(ggplot2::StatBin, width = Inf)
[1] "<StatBin>"

The solution probably exists somewhere in ggplot2 docs(?)
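
One possible workaround, sketched under the assumption that the first element of the class vector carries the full, untruncated name (as the `class()` output for `StatIdentity` elsewhere in these issues suggests). `ggproto_label()` is a hypothetical helper, and a mock object stands in for a real ggproto object so the snippet runs without ggplot2.

```r
# Hypothetical helper: label a ggproto object by its full class name
ggproto_label <- function(obj) {
  stopifnot(inherits(obj, "ggproto"))
  class(obj)[[1]]
}

# Mock object with the same class vector shape as a ggproto Stat
mock_stat <- structure(new.env(), class = c("StatBoxplot", "Stat", "ggproto", "gg"))

ggproto_label(mock_stat)
#> [1] "StatBoxplot"
```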

Finish first draft of ggtrace tests

Should minimally cover the documented usecases

  • General (naming, print & message, persistent trace)
  • Untracing (gguntrace())
  • Tracedumps (last_ggtrace()/global_ggtrace())
  • Inspect (expressions return values)
  • Capture (expressions return environments)
  • Inject (expressions modify the runtime environment)
  • Error handling (all explicit rlang::abort() cases)
  • Options (ggtrace.as_tibble, ggtrace.suppressMessages --- use {withr})

document injection + bang-bang combo

If new_data is a modified form of data retrieved from the same location, you can inject it by overriding data with an assignment in the next trace.

So this works:

ggtrace(
    method = PositionJitter$compute_layer,
    trace_steps = 12,
    trace_exprs = rlang::expr(data <- !!new_data),
    .print = FALSE
)
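
The reason this works is that `!!` inlines the value of `new_data` into the expression when the trace is created, so the tracer does not need a variable called `new_data` to exist when the method runs. Below is a rough base R analogue using `bquote()`, chosen here only to keep the snippet dependency-free; `method_env` is a hypothetical stand-in for the method's runtime environment.

```r
new_data <- data.frame(x = c(10, 20))

# bquote(.(x)) plays the role of rlang::expr(!!x): the data frame itself is inlined
inject_expr <- bquote(data <- .(new_data))

# Stand-in for the environment in which the traced method runs
method_env <- new.env()
method_env$data <- data.frame(x = c(1, 2))

# Remove the variable to show that the expression no longer depends on it
expected <- new_data
rm(new_data)

eval(inject_expr, method_env)
identical(method_env$data, expected)
```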

add test for conditional injection

Can you ensure that the injection expression is only evaluated when a condition is met?

For example, in this example from the tests, if the order of the two layers is switched and you want to change how staged aesthetics are handled via Layer, you need once = FALSE to reach the second layer, where this is relevant. But can you do this without modifying self$stat of the first layer, by making the injection conditional?

p <- ggplot(data.frame(value = 16)) +
  geom_point(aes(stage(value, after_stat = x), 0), colour = "black", size = 10) +
  geom_point(aes(value, 0), colour = "red", size = 10) +
  scale_x_sqrt(limits = c(0, 16), breaks = c(0, 4, 16))

(also this should be another test but worth considering while resolving this one -- can you uniquely identify a self/layer by its position in the plot (code) w/o relying on its content? also should check whether the injection in the linked test is actually ephemeral by checking the state of p$layers[[2]]$stat$retransform or something like that, however you access layers from ggplot object. If you want it to be truly ephemeral, might as well copy the geom_point layer environment, change its stat property, and assign that whole thing to self)

wrap common trace workflows

Something like this?:

  • powertrace(template = "<<name>>", ...)
  • register_powertrace(trace_fn = ... )

Ex1: track down how aes gets resolved in stage(), after_stat(), after_scale() (#19)

Ex2: wraps this workflow - returns the data every time data changes inside ggplot_build.ggplot:

library(ggtrace)
library(ggplot2)
library(rlang)

# Bar plot using computed/"mapped" aesthetics with `after_stat()` and `after_scale()`
barplot_plot <- ggplot(data = palmerpenguins::penguins) +
  geom_bar(
    mapping = aes(
      x = species,                           # Discrete x-axis representing species
      y = after_stat(count / sum(count)),    # Bars represent count of species as proportions
      color = species,                       # The outline of the bars are colored by species
      fill = after_scale(alpha(color, 0.5))  # The fill of the bars are lighter than the outline color
    ),
    size = 3
  )
barplot_plot

ggbody(ggplot2:::ggplot_build.ggplot)

data_assigns <- vapply(ggbody(ggplot2:::ggplot_build.ggplot), function(x) {
  is_call(x) && !is.null(call_name(x)) && call_name(x) == "<-" && call_args(x)[[1]] == "data"
}, logical(1))

which(data_assigns)
inspection_exprs <- lapply(ggbody(ggplot2:::ggplot_build.ggplot)[data_assigns], function(x) { call_args(x)[[2]] })

ggtrace(
  method = ggplot2:::ggplot_build.ggplot,
  trace_steps = which(data_assigns),
  trace_exprs = inspection_exprs,
  use_names = FALSE,
  print_output = FALSE
)
is_traced(ggplot2:::ggplot_build.ggplot)

barplot_plot

tracedump <- last_ggtrace()
tracedump_layer1 <- lapply(tracedump, `[[`, 1)

names(tracedump)
ggbody(ggplot2:::ggplot_build.ggplot)[data_assigns]

turning printing on causes expr to be evaluated twice

library(ggtrace) # v0.4.1

aaa <- function() {
  a <- 1
  b <- 1
  c <- 1
  a + b + c
}
original <- aaa()

ggtrace(aaa, -1, quote(a <- a + 10), verbose = FALSE)
#> aaa now being traced.
no_print <- aaa()
#> Triggering trace on aaa
#> Untracing aaa on exit.

ggtrace(aaa, -1, quote(a <- a + 10))
#> aaa now being traced.
yes_print <- aaa()
#> Triggering trace on aaa
#> 
#> [Step 5]> a <- a + 10
#> [1] 11
#> 
#> Call `last_ggtrace()` to get the trace dump.
#> Untracing aaa on exit.

original
#> [1] 3
no_print
#> [1] 13
yes_print
#> [1] 23
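
The output suggests the expression runs once for printing and again for the trace dump. A minimal sketch of the evaluate-once pattern follows; it is not ggtrace's actual implementation, and `run_step()` is a hypothetical helper.

```r
run_step <- function(expr, env, print_output = TRUE) {
  value <- eval(expr, env)          # evaluate exactly once
  if (print_output) print(value)    # print the stored value, do not re-evaluate
  invisible(value)
}

env_quiet <- new.env()
env_quiet$a <- 1
env_loud <- new.env()
env_loud$a <- 1

run_step(quote(a <- a + 10), env_quiet, print_output = FALSE)
run_step(quote(a <- a + 10), env_loud, print_output = TRUE)

# With a single evaluation, printing no longer changes the result
c(quiet = env_quiet$a, loud = env_loud$a)
```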

ggtrace_generic for tracing s3/s4 methods

ggplot2:::ggplot_add.Layer
get("ggplot_add.Layer", envir = asNamespace("ggplot2"))
trace("ggplot_add.Layer", where = asNamespace("ggplot2"))

Some indirect heuristics

  • :: or ::: present
  • LHS of :: or ::: passes rlang::is_installed()
  • $ absent
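
The heuristics above could be combined into a single check on the unevaluated expression. This is a sketch only: `is_namespaced_function()` is a hypothetical name, and `system.file()` is used as a base R stand-in for `rlang::is_installed()`.

```r
# Hypothetical predicate: does `expr` look like pkg::fn or pkg:::fn?
is_namespaced_function <- function(expr) {
  if (!is.call(expr) || !is.symbol(expr[[1]])) return(FALSE)
  op <- as.character(expr[[1]])
  # A ggproto method such as pkg:::Layer$map_statistic has `$` as its outermost
  # call, so requiring `::` or `:::` at the top also rules out `$`
  if (!op %in% c("::", ":::")) return(FALSE)
  pkg <- as.character(expr[[2]])
  # The left-hand side must name an installed package
  nzchar(system.file(package = pkg))
}

is_namespaced_function(quote(stats::median))                   # TRUE
is_namespaced_function(quote(ggplot2:::Layer$map_statistic))   # FALSE
is_namespaced_function(quote(median))                          # FALSE
```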

step_expr evaluating to NULL are removed or fail to be named

Removed when last element evaluates to NULL

> ggtrace(Stat$compute_layer, c(1, 1), list(hi = quote(1), bye = quote(NULL)))
> boxplot_plot

[Step 1]> 1
[1] 1

[Step 1]> NULL
NULL

> last_ggtrace()
$hi
[1] 1

if NULL is in middle, gets ignored and names are shifted up

> ggtrace(Stat$compute_layer, c(1, 2, 3), list(hi = quote(1), bye = quote(NULL), byebye = quote(2)), verbose = FALSE)
> boxplot_plot
> last_ggtrace()
$hi
[1] 1

$byebye
NULL

[[3]]
[1] 2

Problematic for conditional statements if you only care about the if case and you're returning NULL silently in else
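
One plausible cause (an assumption, since the code that stores the results is not shown here) is R's list semantics: assigning `NULL` with `[[<-` deletes the element instead of storing it. A minimal illustration:

```r
dump <- list(hi = 1)

# Assigning NULL with [[<- removes (or never creates) the element
dump[["bye"]] <- NULL
names(dump)
#> [1] "hi"

# Wrapping the value in list() stores NULL and keeps the name
dump["bye"] <- list(NULL)
length(dump)
#> [1] 2
is.null(dump[["bye"]])
#> [1] TRUE
```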

add a warning about making assignments to environments/closures

For the Inject workflow, assigning to self$... while tracing will make modifications to that layer object, for example.

Interacting with self should be reserved for Inspect, like retrieving the name of the geom/stat/position that called the method (ex: ggplot2:::snakeize(class(self$geom)[[1]])) or an (inherited) property (ex: self$stat$retransform)

As an aside, if you really want to modify self$..., you could make a deep copy of the ggproto object (which is essentially an environment) and give it the same classes. Then make changes to the methods/properties of the copy and assign the copy to self.

> rlang::env_label(geom_point()$stat)
[1] "0000019D1EC29B68"
> rlang::env_label(geom_text()$stat)
[1] "0000019D1EC29B68"
> identical(geom_point()$stat, geom_text()$stat)
[1] TRUE

> StatIdentity2 <- rlang::env_clone(geom_point()$stat)
> StatIdentity2
<environment: 0x0000019d2609cd90>

> class(geom_point()$stat)
[1] "StatIdentity" "Stat"         "ggproto"      "gg"          
> class(StatIdentity2) <- class(geom_point()$stat)

> StatIdentity2
<ggproto object: Class StatIdentity, Stat, gg>
    aesthetics: function
    compute_group: function
    compute_layer: function
    compute_panel: function
    default_aes: uneval
    extra_params: na.rm
    finish_layer: function
    non_missing_aes: 
    optional_aes: 
    parameters: function
    required_aes: 
    retransform: FALSE
    setup_data: function
    setup_params: function
    super:  <ggproto object: Class Stat, gg>

> StatIdentity
<ggproto object: Class StatIdentity, Stat, gg>
    aesthetics: function
    compute_group: function
    compute_layer: function
    compute_panel: function
    default_aes: uneval
    extra_params: na.rm
    finish_layer: function
    non_missing_aes: 
    optional_aes: 
    parameters: function
    required_aes: 
    retransform: FALSE
    setup_data: function
    setup_params: function
    super:  <ggproto object: Class Stat, gg>

> identical(StatIdentity, StatIdentity2)
[1] FALSE

Evaluate tracer function conditionally

An additional argument to ggtrace() that takes an expression which evaluates to TRUE/FALSE.

This expression should be evaluated at the top of the tracer in an if clause and simply cause the tracer to exit early when the condition fails, so as not to change the scope in which the rest of the function is evaluated.
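
One way to picture this, sketched with base `trace()` rather than ggtrace itself: wrap the injected expression in an `if` so that the tracer is a no-op unless the condition holds. `scale_value` is a hypothetical toy function.

```r
scale_value <- function(x, by) {
  out <- x * by
  out
}

# The injection only fires when `by` is greater than 1; otherwise the tracer does nothing
trace(
  "scale_value",
  tracer = quote(if (by > 1) x <- 0),
  at = 2,
  print = FALSE
)

unchanged <- scale_value(5, by = 1)   # condition is FALSE, x is left alone
injected  <- scale_value(5, by = 2)   # condition is TRUE, x is overwritten before step 2

untrace("scale_value")
c(unchanged = unchanged, injected = injected)
```

Because the condition and the injection are evaluated in the function's own frame, no extra scope is introduced around the remaining steps.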

Refactor the fallback case when only method is provided

Is it possible to refactor this part of the code?

if (rlang::is_missing(obj)) {
  method_expr <- rlang::enexpr(method)
  split <- eval(rlang::expr(split_ggproto_method(!!method_expr)))
  method <- split[[1]]
  obj <- split[[2]]
}

To be something like this?

if (rlang::is_missing(obj)) {
  split <- some_function(method)
  method <- split[[1]]
  obj <- split[[2]]
}

Or is that too much metaprogramming enexpr-ception? Especially since it needs to wrap around the split_ggproto_method() helper
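
For what it's worth, base R can recover the user's original expression from one level down with a nested `substitute()`. The sketch below is only an illustration: `some_function()` and `toy_ggtrace()` are hypothetical, and unlike the real helper it returns the object as an unevaluated expression.

```r
# Hypothetical helper: recover the expression that the caller's caller supplied
some_function <- function(arg) {
  # substitute(substitute(arg)) builds the call substitute(<symbol used by the caller>);
  # evaluating that call in the caller's frame yields the user's original expression
  method_expr <- eval.parent(substitute(substitute(arg)))
  stopifnot(is.call(method_expr), identical(method_expr[[1]], as.name("$")))
  list(as.character(method_expr[[3]]), method_expr[[2]])
}

# Toy stand-in for the fallback branch when only `method` is provided
toy_ggtrace <- function(method, obj) {
  if (missing(obj)) {
    split <- some_function(method)
    method <- split[[1]]
    obj <- split[[2]]
  }
  list(method = method, obj = obj)
}

# `Stat` does not need to exist here because it is never evaluated
res <- toy_ggtrace(Stat$compute_layer)
res$method
#> [1] "compute_layer"
```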

persistent tracing support

untracing on exit is safe and a good default but it'd be nice to have an option to not untrace on exit.

would benefit from a mechanism like gguntrace(method = , obj = ) and gguntrace_all(), as well as something like ggcurtrace() to keep track.

this perhaps also calls for a bulkier last_ggtrace() if there's gonna be multiple trace dumps happening in same ggproto object/plot.

Minor readme edits

  • example 3 step 4 should showcase use_names = TRUE by actually using the names to subset the tracedump
  • example 4 step 4 should use ggplotGrob() to capture the output plot and store it into a variable, and demonstrate ability to render modified plot later

function to rebuild source code from callstack

To make reprex code from experimenting with ggedit().

This is close enough, could be slightly better:

cat(paste0(unlist(lapply(ggbody(StatSmooth$compute_group)[-1], rlang::expr_deparse, width = Inf)), collapse = "\n"))

data <- flip_data(data, flipped_aes)
if (length(unique(data$x)) < 2) {
  return(new_data_frame())
}
if (is.null(data$weight)) data$weight <- 1
if (is.null(xseq)) {
  if (is.integer(data$x)) {
    if (fullrange) {
      xseq <- scales$x$dimension()
    } else {
      xseq <- sort(unique(data$x))
    }
  } else {
    if (fullrange) {
      range <- scales$x$dimension()
    } else {
      range <- range(data$x, na.rm = TRUE)
    }
    xseq <- seq(range[1], range[2], length.out = n)
  }
}
if (identical(method, "loess")) {
  method.args$span <- span
}
if (is.character(method)) {
  if (identical(method, "gam")) {
    method <- mgcv::gam
  } else {
    method <- match.fun(method)
  }
}
if (identical(method, mgcv::gam) && is.null(method.args$method)) {
  method.args$method <- "REML"
}
base.args <- list(quote(formula), data = quote(data), weights = quote(weight))
model <- do.call(method, c(base.args, method.args))
prediction <- predictdf(model, xseq, se, level)
prediction$flipped_aes <- flipped_aes
flip_data(prediction, flipped_aes)
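
A base R variant of the same idea, as a sketch: `deparse()` each top-level step of the function body and join the pieces. `rebuild_source()` and `clamp()` are hypothetical names, and the formatting may differ slightly from what `rlang::expr_deparse()` produces.

```r
rebuild_source <- function(fn) {
  steps <- as.list(body(fn))[-1]   # drop the leading `{`
  lines <- vapply(
    steps,
    function(step) paste(deparse(step, width.cutoff = 500L), collapse = "\n"),
    character(1)
  )
  paste(lines, collapse = "\n")
}

clamp <- function(x, lo, hi) {
  if (x < lo) {
    return(lo)
  }
  if (x > hi) return(hi)
  x
}

src <- rebuild_source(clamp)
cat(src, "\n")

# The rebuilt text parses back into the same number of top-level steps
n_steps_rebuilt <- length(parse(text = src))
n_steps_rebuilt == length(body(clamp)) - 1
```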

Document behavior of invisible()

invisible(ggplot-object) doesn't trigger trace.

Maybe there should be a formal option to suppress printing anything when trace is triggered? Although idk how comfortable I am with that --- tracing is dangerous and untracing is informative, so I'm sorta fine with messages being forced on people

option for ggbody to fetch inherited method

  • ggbody(..., inherit = FALSE) by default
  • Recurse through class(obj) and trycatch ggbody() until it returns something
  • Also return the corresponding ggbody() code like ggbody(Stat$compute_layer)
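
A rough sketch of the lookup, using mock objects so it runs without ggplot2. It walks the parent chain instead of looking up class names, and it assumes parents are reachable through a `super` function stored on the child (modelled loosely on ggproto). `mock_proto()` and `find_inherited()` are hypothetical helpers.

```r
# Build a mock ggproto-like object: an environment with a class vector and,
# for children, a `super` function that returns the parent
mock_proto <- function(name, members, parent = NULL) {
  obj <- list2env(members, envir = new.env())
  if (!is.null(parent)) assign("super", function() parent, envir = obj)
  class(obj) <- if (is.null(parent)) name else c(name, class(parent))
  obj
}

# Return the ggbody()-style label of the first ancestor that defines `method`
find_inherited <- function(obj, method) {
  while (!is.null(obj)) {
    if (exists(method, envir = obj, inherits = FALSE)) {
      return(paste0(class(obj)[[1]], "$", method))
    }
    super <- get0("super", envir = obj, inherits = FALSE)
    obj <- if (is.function(super)) super() else NULL
  }
  NULL
}

MockStat <- mock_proto("Stat", list(compute_layer = function(data) data))
MockStatBoxplot <- mock_proto(
  "StatBoxplot",
  list(compute_group = function(data) data),
  parent = MockStat
)

own_label <- find_inherited(MockStatBoxplot, "compute_group")
inherited_label <- find_inherited(MockStatBoxplot, "compute_layer")
c(own_label, inherited_label)
```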

More robust test of the ~line keyword

Should it be evaluated differently if the line involves assignment?

Does it break when you try to do assignment? (trace_exprs = { temp <- ~line })

(re)move the obj argument

Nowhere in the docs do we ever use the form ggtrace(method = , obj = ) where both are specified. I'm always just showing ggtrace(method = , trace_steps = , trace_exprs = )

Is it cumbersome that obj gets in the way between the trace_* arguments? ggtrace() is the only function that has more than method and obj as arguments, so it won't break much if I move obj to the end, maybe?

ggtrace(method, trace_steps, trace_exprs, obj, once, .print)?

This would allow really short code like ggtrace(StatBoxplot$compute_group, 2:4), which would just run steps 2-4 (requires #13 )

Makes more sense to me from convenience pov so should decide on this ASAP
