Posted on Categories Coding, TutorialsTags , , 8 Comments on R Tip: Use seq_len() to Avoid The Backwards Sequence Trap

R Tip: Use seq_len() to Avoid The Backwards Sequence Trap

Another R tip. Use seq_len() to avoid the backwards sequence trap.

Many R users use the “colon sequence” notation to build sequences. For example:

for(i in 1:5) {
  print(paste(i, i*i))
}
#> [1] "1 1"
#> [1] "2 4"
#> [1] "3 9"
#> [1] "4 16"
#> [1] "5 25"

However, the colon notation can be unsafe as it does not properly handle the empty sequence case:

n <- 0

1:n
#> [1] 1 0

Notice the above example built a reversed sequence, instead of an empty sequence.

This leads to the backwards sequence trap: writing code of the form “1:length(x)” is often wrong. For example “for(i in 1:length(x)) { statements involving x[[i]] }“, which will fail for length-zero x.

To avoid this use seq_len() or seq_along():

seq_len(5)
#> [1] 1 2 3 4 5

n <- 0
seq_len(n)
#> integer(0)

integer(0)” is a length zero sequence of integers (not a sequence containing the value zero).

Posted on Categories Coding, StatisticsTags , , , , 1 Comment on R Tip: Use qc() For Fast Legible Quoting

R Tip: Use qc() For Fast Legible Quoting

Here is an R tip. Need to quote a lot of names at once? Use qc().

This is particularly useful in selecting columns from data.frames:

library("wrapr")  # get qc() definition

head(mtcars[, qc(mpg, cyl, wt)])

#                    mpg cyl    wt
# Mazda RX4         21.0   6 2.620
# Mazda RX4 Wag     21.0   6 2.875
# Datsun 710        22.8   4 2.320
# Hornet 4 Drive    21.4   6 3.215
# Hornet Sportabout 18.7   8 3.440
# Valiant           18.1   6 3.460

Or even to install many packages at once:

install.packages(qc(vtreat, cdata, WVPlots))
# shorter than the alternative:
#  install.packages(c("vtreat", "cdata", "WVPlots"))
Posted on Categories data science, Practical Data Science, Pragmatic Data Science, Programming, Statistics, TutorialsTags , , , , 16 Comments on The Zero Bug

The Zero Bug

I am going to write about an insidious statistical, data analysis, and presentation fallacy I call “the zero bug” and the habits you need to cultivate to avoid it.


The zero bug

The zero bug

Here is the zero bug in a nutshell: common data aggregation tools often can not “count to zero” from examples, and this causes problems. Please read on for what this means, the consequences, and how to avoid the problem. Continue reading The Zero Bug