
R Tip: Use drop = FALSE with data.frames

Another R tip. Get in the habit of using drop = FALSE when indexing (using [ , ] on) data.frames.
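As a minimal sketch of why this habit matters (the small data.frame `d` below is made up for illustration and is not from the original article): selecting a single column with `[ , ]` quietly converts the result to a plain vector, while `drop = FALSE` keeps it a data.frame.

```r
# Hypothetical example data, for illustration only
d <- data.frame(x = 1:3, y = c("a", "b", "c"))

# Selecting one column with [ , ] silently drops to a vector
one_col <- d[, "x"]
class(one_col)
#> [1] "integer"

# drop = FALSE keeps the result a data.frame
kept <- d[, "x", drop = FALSE]
class(kept)
#> [1] "data.frame"
```

Code that later calls `nrow()` or `colnames()` on the result behaves the same for one column as for many only in the second form.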


Prince Rupert’s drops (img: Wikimedia Commons)



R Tip: Force Named Arguments

R tip: force the use of named arguments when designing function signatures.

R’s named function argument binding is a great aid in writing correct programs. It is a good idea, if practical, to force optional arguments to only be usable by name. To do this declare the additional arguments after “...” and enforce that none got lost in the “... trap” by using a checker such as wrapr::stop_if_dot_args().

Example:

#' Increment x by inc.
#' 
#' @param x item to add to
#' @param ... not used for values, forces later arguments to bind by name
#' @param inc (optional) value to add
#' @return x+inc
#'
#' @examples
#'
#' f(7) # returns 8
#'
f <- function(x, ..., inc = 1) {
   wrapr::stop_if_dot_args(substitute(list(...)), "f")
   x + inc
}

f(7)
#> [1] 8

f(7, inc = 2)
#> [1] 9


f(7, q = mtcars)
#> Error: f unexpected arguments: q = mtcars

f(7, 2)
#> Error: f unexpected arguments: 2 

By R’s function evaluation rules, any unexpected or undeclared arguments are captured by the “...” argument. “wrapr::stop_if_dot_args()” then inspects “...” and signals an error if it is not empty. The "f" string is included in the error message; in this case I chose the name of the function. The “substitute(list(...))” part is R’s way of making the contents of “...” available for inspection.
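To see what the “substitute(list(...))” expression hands to a checker, here is a small base-R sketch. The function name `show_dots` is hypothetical and used only for illustration; it returns the captured call without evaluating any of the arguments.

```r
# Hypothetical helper: return the unevaluated contents of `...`
show_dots <- function(...) {
  substitute(list(...))
}

captured <- show_dots(2, q = mtcars)
captured
#> list(2, q = mtcars)

class(captured)
#> [1] "call"
```

Because the result is an unevaluated call, a checker can report the offending arguments as the user typed them.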

You can also use the technique on required arguments. wrapr::stop_if_dot_args() is a simple low-dependency helper function intended to make writing code such as the above easier. This is under the rubric that hidden errors are worse than thrown exceptions. It is best to find and signal problems early, and near the cause.
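A sketch of the same idea applied to required arguments, written in base R only so it runs without extra packages. The function `weighted_mean` and its hand-rolled check are hypothetical illustrations, not part of wrapr; in practice a helper such as wrapr::stop_if_dot_args() would take the place of the `if` block.

```r
# Hypothetical function: both required arguments come after `...`,
# so callers must supply them by name.
weighted_mean <- function(..., values, weights) {
  # substitute(list(...)) has length 1 when `...` is empty
  if (length(substitute(list(...))) > 1) {
    stop("weighted_mean: unexpected arguments; pass values and weights by name")
  }
  sum(values * weights) / sum(weights)
}

weighted_mean(values = c(1, 2, 3), weights = c(1, 1, 2))
#> [1] 2.25

# Positional use lands in `...` and is rejected early
positional <- tryCatch(
  weighted_mean(c(1, 2, 3), c(1, 1, 2)),
  error = function(e) conditionMessage(e)
)
positional
```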

The idea is that you should not expect a user to remember the positions of more than 1 to 3 arguments; the rest should only be referable by name. Do not make your users count along long sequences of arguments; the human brain may have special cases only for small sequences.

If you have a procedure with 10 parameters, you probably missed some.

Alan Perlis, “Epigrams on Programming”, ACM SIGPLAN Notices 17 (9), September 1982, pp. 7–13.

Note that the “substitute(list(...))” part is the R idiom for capturing the unevaluated contents of “...”. I felt it best to use standard R as much as possible rather than introduce any additional magic invocations.


R Tip: Use [[ ]] Wherever You Can

R tip: use [[ ]] wherever you can.

In R the [[ ]] is the operator that (when supplied a simple scalar argument) pulls a single element out of lists (and the [ ] operator pulls out sub-lists).

For vectors [[ ]] and [ ] appear to be synonyms (modulo the issue of names). However, for a vector [[ ]] checks that the indexing argument is a scalar, so if you intend to retrieve one element this is a good way of getting an extra check and documenting intent. Also, when writing reusable code you may not always be sure if your code is going to be applied to a vector or list in the future.

It is safer to get into the habit of always using [[ ]] when you intend to retrieve a single element.

Example with lists:

list("a", "b")[1]
#> [[1]]
#> [1] "a"

list("a", "b")[[1]]
#> [1] "a"

Example with vectors:

c("a", "b")[1]
#> [1] "a"

c("a", "b")[[1]]
#> [1] "a"

The idea is: in situations where both [ ] and [[ ]] apply we rarely see [[ ]] being the worse choice.
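The scalar check and the names difference mentioned above can be seen in a short sketch. The vector `v` is a made-up example, and the error is caught with tryCatch() so the snippet runs to completion.

```r
# Hypothetical named vector, for illustration only
v <- c(x = "a", y = "b")

v[1]     # keeps the name "x"
v[[1]]   # returns just the value, without the name

# [ ] quietly accepts a longer index and returns several elements
v[c(1, 2)]

# [[ ]] on an atomic vector rejects a non-scalar index
result <- tryCatch(
  v[[c(1, 2)]],
  error = function(e) "error: index was not a scalar"
)
result
#> [1] "error: index was not a scalar"
```

Note that on a list a vector index to [[ ]] is interpreted as recursive indexing rather than rejected, so the extra check described here is specific to atomic vectors.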


Note on this article series.

This R tips series consists of short, simple notes on R best practices and additional packaged tools. The intent is to show both how to perform common tasks and how to avoid common pitfalls. I hope to share about 20 of these, roughly one every other day, to learn from the community which issues resonate and also to introduce some of the features of some of our packages. It is an opinionated series and will sometimes touch on coding style, and will also try to showcase appropriate Win-Vector LLC R tools.


R Tip: Use seq_len() to Avoid The Backwards Sequence Trap

Another R tip. Use seq_len() to avoid the backwards sequence trap.

Many R users use the “colon sequence” notation to build sequences. For example:

for(i in 1:5) {
  print(paste(i, i*i))
}
#> [1] "1 1"
#> [1] "2 4"
#> [1] "3 9"
#> [1] "4 16"
#> [1] "5 25"

However, the colon notation can be unsafe as it does not properly handle the empty sequence case:

n <- 0

1:n
#> [1] 1 0

Notice the above example built a reversed sequence, instead of an empty sequence.

This leads to the backwards sequence trap: writing code of the form “1:length(x)” is often wrong. For example, “for(i in 1:length(x)) { statements involving x[[i]] }” will fail for length-zero x.

To avoid this use seq_len() or seq_along():

seq_len(5)
#> [1] 1 2 3 4 5

n <- 0
seq_len(n)
#> integer(0)

“integer(0)” is a length-zero sequence of integers (not a sequence containing the value zero).
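A minimal sketch of the trap inside a loop, using a made-up empty input `x` and hypothetical counter names: the colon form runs the loop body twice on a length-zero input, while seq_along() runs it zero times.

```r
x <- character(0)   # a length-zero input

bad_passes <- 0
for (i in 1:length(x)) {    # 1:0 is c(1, 0), so the body runs twice
  bad_passes <- bad_passes + 1
}
bad_passes
#> [1] 2

good_passes <- 0
for (i in seq_along(x)) {   # seq_along(x) is integer(0), so the body never runs
  good_passes <- good_passes + 1
}
good_passes
#> [1] 0
```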


R Tip: Use qc() For Fast Legible Quoting

Here is an R tip. Need to quote a lot of names at once? Use qc().

This is particularly useful in selecting columns from data.frames:

library("wrapr")  # get qc() definition

head(mtcars[, qc(mpg, cyl, wt)])

#                    mpg cyl    wt
# Mazda RX4         21.0   6 2.620
# Mazda RX4 Wag     21.0   6 2.875
# Datsun 710        22.8   4 2.320
# Hornet 4 Drive    21.4   6 3.215
# Hornet Sportabout 18.7   8 3.440
# Valiant           18.1   6 3.460

Or even to install many packages at once:

install.packages(qc(vtreat, cdata, WVPlots))
# shorter than the alternative:
#  install.packages(c("vtreat", "cdata", "WVPlots"))
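For readers curious how a quoting helper of this kind can work at all, here is a minimal base-R sketch. `quote_names` is a hypothetical name, and this is not a claim about how wrapr implements qc(); it only handles bare names and does none of the extra work a packaged tool would do.

```r
# Hypothetical helper: turn bare names into a character vector
quote_names <- function(...) {
  # capture the unevaluated arguments, drop the leading `list` symbol
  args <- as.list(substitute(list(...)))[-1]
  vapply(args, deparse, character(1))
}

cols <- quote_names(mpg, cyl, wt)
cols
#> [1] "mpg" "cyl" "wt"

# usable wherever a character vector of column names is expected
head(mtcars[, cols, drop = FALSE], 2)
```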