# Neglected R Super Functions

`R` has a lot of under-appreciated, super-powerful functions. I list a few of our favorites below.

Atlas, carrying the sky. Royal Palace (Paleis op de Dam), Amsterdam.

• `stats::approx()`: approximate a curve/function.
• `base::cumsum()`: cumulative ordered sum.
• `stats::ecdf()`: estimate the cumulative distribution function.
• `base::findInterval()`: assign values to bins.
• `base::match()`: bulk computation of first match. Can look up and sort data and even find non-duplicated data.
• `base::Reduce()`: nifty functional method to combine multiple function evaluations.
• `base::tapply()`: grouped summary function.
• `base::unlist()`: build arrays of atomic values from more complicated nested structures.
• `base::Vectorize()`: convert scalar functions into functions ready to operate on arrays.
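
As a quick illustration of the list above, here is a minimal sketch that exercises each function once. All input values and variable names are made up for the example and do not come from the original post.

```r
# Illustrative data only (assumed for this sketch).
x <- c(1, 2, 3)
y <- c(10, 20, 40)

# stats::approx(): linear interpolation along the curve through (x, y).
interp <- stats::approx(x, y, xout = c(1.5, 2.5))$y    # 15 30

# base::cumsum(): running total, in the order given.
running <- cumsum(c(1, 2, 3, 4))                        # 1 3 6 10

# stats::ecdf(): returns a function estimating P(X <= t) from a sample.
Fhat <- stats::ecdf(c(1, 2, 2, 3))
p_le_2 <- Fhat(2)                                       # 0.75

# base::findInterval(): index of the bin each value falls into.
bins <- findInterval(c(0.5, 1.5, 3.2), vec = c(0, 1, 2, 3))  # 1 2 4

# base::match(): position of the first match of each value in a table.
keys <- c("b", "c", "a")
pos <- match(c("a", "b", "z"), keys)                    # 3 1 NA

# base::Reduce(): fold a binary function over a sequence.
total <- Reduce(`+`, list(1, 2, 3, 4))                  # 10
partial <- Reduce(`+`, 1:4, accumulate = TRUE)          # 1 3 6 10

# base::tapply(): one summary value per group.
g <- c("a", "b", "a", "b")
group_means <- tapply(c(1, 2, 3, 4), g, mean)           # a = 2, b = 3

# base::unlist(): flatten a nested list into an atomic vector.
flat <- unlist(list(1, list(2, 3), list(list(4))))      # 1 2 3 4

# base::Vectorize(): lift a scalar-only function to work on vectors.
scalar_f <- function(a) if (a > 0) "pos" else "non-pos"
vec_f <- Vectorize(scalar_f)
labels <- vec_f(c(-1, 2, 0))                            # "non-pos" "pos" "non-pos"
```

Note that `scalar_f` would fail on a vector of length greater than one in current versions of `R`, because `if` requires a single condition; `Vectorize()` works around that by calling it once per element.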

## 14 thoughts on “Neglected R Super Functions”

1. Nathan says:

Not sure if they’re “neglected,” but I use `setdiff` and `intersect` a lot. They’re both just calls to `match` with a little extra logic, but they definitely improve readability.

1. Good point: `setdiff()`, `intersect()`, and `unique()` are good to keep in mind.
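
To make the connection to `match()` concrete, here is a rough sketch of how these set operations can be written with it. This is an illustration of the idea on made-up vectors, not a copy of the base `R` source, and the `my_*` names are hypothetical.

```r
# Made-up example vectors.
x <- c("a", "b", "b", "c")
y <- c("b", "c", "d")

# match(x, y, nomatch = 0L) gives the position in y of each element of x,
# or 0 where there is no match.
hits <- match(x, y, nomatch = 0L)            # 0 1 1 2

# Elements present in both: index y by the hits (zeros are dropped).
my_intersect <- unique(y[hits])              # "b" "c"

# Elements of x not present in y: keep the positions with no match.
my_setdiff <- unique(x[hits == 0L])          # "a"

# Non-duplicated values: keep elements whose first match is themselves.
my_unique <- x[match(x, x) == seq_along(x)]  # "a" "b" "c"
```

The named functions say what is meant directly, which is the readability gain mentioned above.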

2. Jeffrey Magouirk says:

`table()` is a function I use a great deal.

3. Anonymous says:

base::aggregate() works similarly to tapply but returns a data frame
base::mapply() can substitute some nested for loops by taking multiple arguments
base::assign() is very useful when creating several variables from a for loop
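
A small sketch of the three functions in the comment above, on made-up data. The data frame `df` and the variable names `v1` to `v3` are assumptions for illustration only.

```r
# Made-up data frame: group label g and value v.
df <- data.frame(g = c("a", "b", "a", "b"), v = c(1, 2, 3, 4))

# tapply() returns a named array of group means...
by_tapply <- tapply(df$v, df$g, mean)

# ...while aggregate() returns the same summary as a data frame.
by_aggregate <- aggregate(v ~ g, data = df, FUN = mean)

# mapply() walks several argument vectors in parallel,
# which can replace an index-based loop.
sums <- mapply(function(a, b) a + b, 1:3, 4:6)   # 5 7 9

# assign() creates variables whose names are built at run time.
for (i in 1:3) {
  assign(paste0("v", i), i^2)                    # creates v1, v2, v3
}
```

Here `by_aggregate` has one row per group with columns `g` and `v`, which is often easier to merge back into other data than the array from `tapply()`.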

4. Bill Venables says:

Why “neglected”? I use all of these on a regular basis. The ones you may have missed are stats::approxfun() and stats::splinefun(). I find the functional versions of approx() and spline() much easier to get my head around.

1. “neglected” may be a stretch, but these functions are so great they definitely deserve an extra call-out.
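
For readers who have not met the functional versions, a minimal sketch on made-up points: `approxfun()` and `splinefun()` return functions that can be evaluated later, rather than a list of interpolated values. The names `f_lin` and `f_spl` are chosen for this example.

```r
# Made-up sample points on y = x^2.
x <- c(1, 2, 3, 4)
y <- x^2

# approxfun(): returns a piecewise-linear interpolating function.
f_lin <- stats::approxfun(x, y)
mid_lin <- f_lin(2.5)          # 6.5, halfway between 4 and 9

# By default (rule = 1) it returns NA outside the range of x.
outside <- f_lin(5)            # NA

# splinefun(): returns a cubic-spline interpolating function
# that passes through the supplied points.
f_spl <- stats::splinefun(x, y)
at_knots <- f_spl(x)           # reproduces y at the original points
```

Because the result is an ordinary function, it can be passed straight to things like `curve()`, `integrate()`, or `sapply()`.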

5. My favorites not in your list are the hdquantile function of the Hmisc package, sapply from base, and probably Matrix from the Matrix package, with its compressed matrix formats.

As a general technique, splines seem under-mentioned, whether Akima interpolation via the akima package and its aspline function, or the pspline package and its smooth.Pspline function.

6. Forgot good old `rank()`.

1. Bill Venables says:

You mean I don’t need to use match(x, sort(x)) any more?
Next you’ll be telling me there’s a function for match(sort(x), x) called ‘order(x)’ or something…
Sheesh!

1. Again, more in the spirit of: I remembered an odd sorting application that `rank()` is convenient for (in contrast to `rank()`’s obvious utility in ranking things).
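
The identities joked about in this thread can be checked directly. A small sketch on made-up vectors; the equivalences hold as written only when the values are distinct, and the tie-handling line shows where they part ways.

```r
# Distinct values: the match() idioms agree with rank() and order().
x <- c(30, 10, 20)
r_match <- match(x, sort(x))             # 3 1 2
r_rank  <- rank(x)                       # 3 1 2
o_match <- match(sort(x), x)             # 2 3 1
o_order <- order(x)                      # 2 3 1

# With ties, match() always reports the first position, which
# corresponds to rank()'s "min" method rather than its default "average".
xt <- c(2, 1, 2)
t_match <- match(xt, sort(xt))           # 2 1 2
t_avg   <- rank(xt)                      # 2.5 1 2.5
t_min   <- rank(xt, ties.method = "min") # 2 1 2
```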

7. Alan Haynes says:

For plotting, I find graphics::grconvertX() and graphics::grconvertY() very useful, particularly with boxplots. graphics::layout() is also very handy.

8. I like `pretty` especially in base R plots. Setting `ylim=range(pretty(X))` makes plots (boxplots, barplots, scatterplots)… prettier :)
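
A minimal sketch of that `ylim` trick, using a made-up vector `X`. `pretty()` picks round tick values covering the data, so their range gives axis limits that land on round numbers.

```r
# Made-up data for illustration.
X <- c(3.2, 7.9, 12.4)

# Round-valued breakpoints that cover the range of X.
ticks <- pretty(X)

# Use the outermost ticks as the axis limits.
ylim <- range(ticks)

# Only draw when run interactively, so the sketch is safe in scripts.
if (interactive()) {
  boxplot(X, ylim = ylim)
}
```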

9. Scott Locklin says:

It’s funny, all of these are essentially base primitives in APL languages. It’s kind of amazing Iverson thought of everything before he even had a REPL to work with.

1. Iverson punched way above his weight.