
Using wrapr::let() with tidyeval

While going over some of the discussion related to my last post I came up with a really neat way to use wrapr::let() and rlang/tidyeval together.

Please read on to see the situation and example.

Suppose we want to parameterize over a couple of names, one denoting a variable coming from the current environment and one denoting a column name. Further suppose we are worried the two names may be the same.

We can actually handle this quite neatly, using rlang/tidyeval to denote intent (in this case using “!!” to specify “take from environment instead of the data frame”) and allowing wrapr::let() to perform the substitutions.

suppressPackageStartupMessages(library("dplyr"))
library("wrapr")

mass_col_name = 'mass'
mass_const_name = 'mass'

mass <- 100

let(
  c(MASS_COL = mass_col_name,
    MASS_CONST = mass_const_name),
  
  starwars %>%
    transmute(height,
              (!! MASS_CONST), # `mass` from environment
              MASS_COL,        # `mass` from data.frame
              h100 = height * (!! MASS_CONST),  # env
              hm = height * MASS_COL            # data
    ) %>%
    head()
)

#> # A tibble: 6 x 5
#>   height `(100)`  mass  h100    hm
#>    <int>   <dbl> <dbl> <dbl> <dbl>
#> 1    172     100    77 17200 13244
#> 2    167     100    75 16700 12525
#> 3     96     100    32  9600  3072
#> 4    202     100   136 20200 27472
#> 5    150     100    49 15000  7350
#> 6    178     100   120 17800 21360

All in all, that is pretty neat.

(Note: rlang/tidyeval uses the “(!! )” dereference notation in a number of ways; here we are using it only to specify the environment, not for substitution.)

2 thoughts on “Using wrapr::let() with tidyeval”

  1. Some new wrapr::let() features (found in the development version) include eval=FALSE and debugPrint=TRUE modes. With these options you can see what would be executed or what is being executed. These are great for learning wrapr::let() and debugging.

    For instance, in the example above we could run:

    let(
      c(MASS_COL = mass_col_name,
        MASS_CONST = mass_const_name),
      
      starwars %>%
        transmute(height,
                  (!! MASS_CONST), # `mass` from environment
                  MASS_COL,        # `mass` from data.frame
                  h100 = height * (!! MASS_CONST),  # env
                  hm = height * MASS_COL            # data
        ) %>%
        head(),
      
      eval = FALSE
    )
    

    This results in:

    # starwars %>% transmute(height, (!(!mass)), mass, h100 = height * 
    #    (!(!mass)), hm = height * mass) %>% head()
    

    which is exactly the rewritten code that wrapr::let() has prepared for execution (one can even pass it to eval() to run it). This is an excellent way to see what wrapr::let() does and to work out whether it does what you want. The “(!(!mass))” is just how R prints (!! mass), and, as you can see, it executes the same way.
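
    The double-negation reading can be checked in base R alone. The following is a minimal sketch (the object name `e` is ours, not from the post); it only inspects the parsed expression and does not involve rlang or wrapr. How the expression prints can differ between R versions, so no printed form is claimed here.

    ```r
    # To the R parser, `!!` is not a special operator: it is `!` applied twice.
    e <- quote(!!mass)

    # The outer call is `!`, and its single argument is itself a call to `!`.
    outer_is_not <- identical(e[[1]], as.name("!"))
    inner_is_not <- identical(e[[2]][[1]], as.name("!"))

    # The innermost argument is the plain symbol `mass`.
    inner_symbol <- e[[2]][[2]]

    # Evaluated by base R (no tidyeval involved) this is ordinary double
    # negation: !100 is FALSE, and !FALSE is TRUE.
    mass <- 100
    base_value <- eval(e)
    print(base_value)
    ```

    Only inside tidyeval-aware functions such as transmute() is this pair of negations reinterpreted before evaluation.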

    With debugPrint=TRUE wrapr::let() both prints the replaced expression and then executes as usual.
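
    The eval = FALSE mode can also be tried on a much smaller expression. The sketch below assumes a wrapr version that supports eval = FALSE as described above; the names `x_name`, `X`, `prepared`, and `result` are hypothetical and chosen only for illustration.

    ```r
    library("wrapr")

    x_name <- "mass"   # hypothetical: name of the variable to substitute in
    mass <- 100

    # With eval = FALSE, let() is expected to return the rewritten
    # expression instead of running it.
    prepared <- let(
      c(X = x_name),
      X + 1,
      eval = FALSE
    )

    print(prepared)            # the expression after substitution

    # Run the prepared expression explicitly.
    result <- eval(prepared)
    print(result)
    ```

    Keeping the substitution step separate from the evaluation step like this makes it easy to inspect the prepared code before committing to running it.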

  2. If you want to be very strict (and completely unambiguous) you can use the .data$ pronoun form to force references to the data.frame. We show this below.

    suppressPackageStartupMessages(library("dplyr"))
    library("wrapr")
    
    mass_col_name = 'mass'
    mass_const_name = 'mass'
    
    mass <- 100
    
    let(
      c(MASS_COL = mass_col_name,
        MASS_CONST = mass_const_name),
      debugPrint = TRUE,
      
      starwars %>%
        transmute(height,
                  (!! MASS_CONST),        # `mass` from environment
                  const = .data$MASS_COL, # `mass` from data.frame
                  h100 = height * (!! MASS_CONST),  # env
                  hm = height * .data$MASS_COL      # data
        ) %>%
        head()
    )
    

    We do not currently recommend using the pronoun in the form .data[[my_var]]. If you use `rlang`/`tidyeval` to perform substitutions, *always* write something such as .data[[!!my_var]] (some details here). This is due to complications described in `dplyr` issues 2904 and 2916.

    This is one of the reasons we advise using `wrapr::let()` for substitution, even if you are using `rlang`/`tidyeval` (which is why you might end up using them together).

    The `rlang`/`tidyeval` substitution issues can be subtle, and they are possibly why the data-pronoun example in the June 13, 2017 `dplyr` 0.7.0 announcement is not correct, even with the development versions of `dplyr` and `rlang`/`tidyeval` as of June 30, 2017.

    Notice that when we re-run the start of that example, the data.frame is altered in an unexpected way: an extra column named “my_var” is added, and the data is grouped by the column “my_var” rather than by the column “homeworld” as in the earlier non-pronoun example (which this example was presumably supposed to match). This will be an issue if one tries to use or join this data after a `summarize()` step, as only named variables and the grouping variable survive `summarize()` (so “homeworld” will not be present for downstream code expecting to use it).

    suppressPackageStartupMessages(library("dplyr"))
    packageVersion("dplyr")
    #> [1] '0.7.1.9000'
    packageVersion("rlang")
    #> [1] '0.1.1.9000'
    
    my_var <- "homeworld"
    
    starwars %>%
      group_by(.data[[my_var]]) 
    #> # A tibble: 87 x 14
    #> # Groups:   my_var [49]
    #>                  name height  mass    hair_color  skin_color eye_color
    #>                 <chr>  <int> <dbl>         <chr>       <chr>     <chr>
    #>  1     Luke Skywalker    172    77         blond        fair      blue
    #>  2              C-3PO    167    75          <NA>        gold    yellow
    #>  3              R2-D2     96    32          <NA> white, blue       red
    #>  4        Darth Vader    202   136          none       white    yellow
    #>  5        Leia Organa    150    49         brown       light     brown
    #>  6          Owen Lars    178   120   brown, grey       light      blue
    #>  7 Beru Whitesun lars    165    75         brown       light      blue
    #>  8              R5-D4     97    32          <NA>  white, red       red
    #>  9  Biggs Darklighter    183    84         black       light     brown
    #> 10     Obi-Wan Kenobi    182    77 auburn, white        fair blue-gray
    #> # ... with 77 more rows, and 8 more variables: birth_year <dbl>,
    #> #   gender <chr>, homeworld <chr>, species <chr>, films <list>,
    #> #   vehicles <list>, starships <list>, my_var <chr>
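
    For comparison, here is a minimal sketch of the same grouping with the substitution done by `wrapr::let()`, as advised above. The placeholder `GROUP_COL` and the result name `by_world` are our own choices, and behavior may vary with the installed `dplyr` and `wrapr` versions.

    ```r
    suppressPackageStartupMessages(library("dplyr"))
    library("wrapr")

    my_var <- "homeworld"

    # GROUP_COL is a placeholder name (our choice); let() replaces it with
    # the column name stored in my_var before dplyr sees the expression.
    by_world <- let(
      c(GROUP_COL = my_var),
      starwars %>%
        group_by(GROUP_COL) %>%
        summarize(n = n())
    )

    # Inspect which columns survive the summarize() step.
    print(names(by_world))
    ```

    Because `dplyr` only ever sees the concrete column name, the grouping column should keep its real name through `summarize()`, which is what downstream joins would expect.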
    
