Categories: Coding, Opinion

Quasiquotation in R via bquote()

In August of 2003 Thomas Lumley added bquote() to R 1.8.1. This gave R and R users an explicit Lisp-style quasiquotation capability. bquote() and quasiquotation are actually quite powerful. Professor Thomas Lumley should get, and should continue to receive, a lot of credit and thanks for introducing the concept into R.
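For readers who have not used it, here is a minimal base R sketch of what bquote() does. The data frame d, the column name x, and the threshold value are made up for illustration; this is not the dplyr adaptation discussed in the full post.

```r
# Hypothetical inputs: a column name held as a value, and a cutoff.
col <- as.name("x")
threshold <- 3

# bquote() quotes its argument, but substitutes in the value of
# anything wrapped in .() before returning the expression.
expr <- bquote(.(col) > .(threshold))
print(expr)
# x > 3

# The assembled expression can then be evaluated against data.
d <- data.frame(x = c(1, 5, 10))
result <- eval(expr, envir = d)
print(result)
# [1] FALSE  TRUE  TRUE
```

The point of the sketch is that the expression is built from values (col and threshold) rather than typed literally, which is the essence of quasiquotation.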

In fact bquote() is already powerful enough to build a version of dplyr 0.5.0 with quasiquotation semantics quite close (from a user perspective) to what is now claimed in tidyeval/rlang.

Let’s take a look at that.

Continue reading Quasiquotation in R via bquote()

Categories: Coding, Opinion

Running the Same Task in Python and R

According to a KDD poll fewer respondents (by rate) used only R in 2017 than in 2016. At the same time more respondents (by rate) used only Python in 2017 than in 2016.

Let’s take this as an excuse to take a quick look at what happens when we try a task in both systems.

Continue reading Running the Same Task in Python and R

Categories: Coding, data science, Programming, Tutorials

Using a Column as a Column Index

We recently saw a great recurring R question: “how do you use one column to choose a different value for each row?” That is: how do you use a column as an index? Please read on for some idiomatic base R, data.table, and dplyr solutions.
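To make the question concrete, here is one possible base R sketch using matrix indexing. The data frame and its column names (x, y, choice) are invented for this illustration, and this is not necessarily the solution given in the full post.

```r
# Hypothetical data: "choice" names the column to read from in each row.
d <- data.frame(
  x = c(1, 2, 3),
  y = c(10, 20, 30),
  choice = c("x", "y", "x"),
  stringsAsFactors = FALSE
)

# Work on a numeric matrix of just the value columns, so the
# character column does not force everything to character.
vals <- as.matrix(d[, c("x", "y")])

# A two-column (row, column) index matrix selects one cell per row.
idx <- cbind(seq_len(nrow(d)), match(d$choice, colnames(vals)))
d$selected <- vals[idx]

print(d$selected)
# [1]  1 20  3
```

Restricting to the value columns before calling as.matrix() matters: converting the whole data frame would coerce the numbers to strings.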

Continue reading Using a Column as a Column Index

Categories: Coding, Tutorials

R Tip: Be Wary of “…”

R Tip: be wary of “...”.

The following code example contains an easy error in using the R function unique().

vec1 <- c("a", "b", "c")
vec2 <- c("c", "d")
unique(vec1, vec2)
# [1] "a" "b" "c"

Notice that none of the novel values from vec2 are present in the result. Our mistake was that we (improperly) tried to use unique() with multiple value arguments, as one would use union(). Also notice that no error or warning was signaled: we used unique() incorrectly and nothing pointed this out to us. What compounded our error was R’s “...” function signature feature.

In this note I will talk a bit about how to defend against this kind of mistake. I am going to apply the principle that a design that makes committing mistakes more difficult (or even impossible) is a good thing, and not a sign of carelessness, laziness, or weakness. I am well aware that every time I admit to making a mistake (I have indeed made the above mistake) those who claim to never make mistakes have a laugh at my expense. Honestly I feel the reason I see more mistakes is I check a lot more.
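As a rough sketch of the idea (and not necessarily the defense the full post recommends), here is what the intended computation looks like, plus a small wrapper that refuses unexpected extra arguments. The name strict_unique is hypothetical and exists only for this example.

```r
vec1 <- c("a", "b", "c")
vec2 <- c("c", "d")

# What was probably intended: combine the vectors, then de-duplicate.
combined <- union(vec1, vec2)
print(combined)
# [1] "a" "b" "c" "d"

# Hypothetical defensive wrapper: any extra argument is an error,
# instead of being silently absorbed.
strict_unique <- function(x, ...) {
  if (length(list(...)) > 0) {
    stop("strict_unique: unexpected extra arguments")
  }
  unique(x)
}

# Now the mistaken call fails loudly instead of returning a wrong answer.
attempt <- try(strict_unique(vec1, vec2), silent = TRUE)
print(inherits(attempt, "try-error"))
# [1] TRUE
```

The design choice here is simply to turn a silent wrong answer into a visible error at the call site.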

Continue reading R Tip: Be Wary of “…”

Categories: Administrativia, Coding, Programming

wrapr 1.5.0 available on CRAN

The R package wrapr 1.5.0 is now available on CRAN.

wrapr includes a lot of tools for writing better R code:

I’ll be writing articles on a number of the new capabilities. For now I’ll just leave you with the nifty coalesce operator notation.
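For readers unfamiliar with the term, a coalesce operator fills in missing values on its left with values from its right. The base R sketch below only illustrates the concept: the operator name %or% is hypothetical and is not wrapr's own notation or implementation.

```r
# Hypothetical coalesce operator, written in base R for illustration.
`%or%` <- function(lhs, rhs) {
  # A NULL left-hand side is replaced outright.
  if (is.null(lhs)) {
    return(rhs)
  }
  # Otherwise replace only the NA entries.
  ifelse(is.na(lhs), rhs, lhs)
}

filled <- c(1, NA, 3) %or% 0
print(filled)
# [1] 1 0 3

fallback <- NULL %or% "default"
print(fallback)
# [1] "default"
```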

Continue reading wrapr 1.5.0 available on CRAN

Categories: Coding, Opinion, Tutorials

magrittr and wrapr Pipes in R, an Examination

Let’s consider piping in R both using the magrittr package and using the wrapr package.

Continue reading magrittr and wrapr Pipes in R, an Examination

Categories: Coding, Opinion, Pragmatic Data Science, Statistics, Tutorials

R Tip: Think in Terms of Values

R tip: first organize your tasks in terms of data, values, and desired transformation of values, not initially in terms of concrete functions or code.

I know I write a lot about coding in R. But it is in the service of supporting statistics, analysis, predictive analytics, and data science.

R without data is like going to the theater to watch the curtain go up and down.

(Adapted from Ben Katchor’s Julius Knipl, Real Estate Photographer: Stories, Little, Brown, and Company, 1996, page 72, “Excursionist Drama 2”.)

Usually you come to R to work with data. If you think and plan in terms of data and values (including introducing more data to control processing), you will usually work in a much faster, more explainable, and more maintainable fashion.
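One small way to picture "introducing more data to control processing" is to replace branching code with a lookup table. The sketch below is only an illustration under made-up names (d, code, labels); it is not taken from the full post.

```r
# Hypothetical data: short codes we want to translate into labels.
d <- data.frame(code = c("a", "b", "a", "c"), stringsAsFactors = FALSE)

# The mapping is itself data (a named vector), not a chain of if/else.
labels <- c(a = "alpha", b = "beta", c = "gamma")

# Applying the mapping is a single vectorized lookup by name.
d$label <- unname(labels[d$code])

print(d$label)
# [1] "alpha" "beta"  "alpha" "gamma"
```

Changing the behavior then means editing the labels table, not the code that uses it.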

Continue reading R Tip: Think in Terms of Values