Posted on Categories Programming, TutorialsTags , , , , , Leave a comment on Piping into ggplot2

Piping into ggplot2

In our wrapr pipe RJournal article we used piping into ggplot2 layers/geoms/items as an example.

Being able to use the same pipe operator for data processing steps and for ggplot2 layering is a question that comes up from time to time (for example: Why can’t ggplot2 use %>%?). In fact the primary ggplot2 package author wishes that magrittr piping was the composing notation for ggplot2 (though it is obviously too late to change).

There are some fundamental difficulties in trying to use the magrittr pipe in such a way. In particular magrittr looks for its own pipe by name in un-evaluated code, and thus is difficult to engineer over (though it can be hacked around). The general concept is: pipe stages are usually functions or function calls, and ggplot2 components are objects (verbs versus nouns); and at first these seem incompatible.

However, the wrapr dot-arrow-pipe was designed to handle such distinctions.

Let’s work an example.

Continue reading Piping into ggplot2

Posted on Categories Opinion, ProgrammingTags , 4 Comments on A Better Example of the Confused By The Environment Issue

A Better Example of the Confused By The Environment Issue

Our interference from then environment issue was a bit subtle. But there are variations that can be a bit more insidious.

Please consider the following.

Continue reading A Better Example of the Confused By The Environment Issue

Posted on Categories Opinion, Programming, TutorialsTags , 5 Comments on A Subtle Flaw in Some Popular R NSE Interfaces

A Subtle Flaw in Some Popular R NSE Interfaces

It is no great secret: I like value oriented interfaces that preserve referential transparency. It is the side of the public debate I take in R programming.

"One of the most useful properties of expressions is that called by Quine referential transparency. In essence this means that if we wish to find the value of an expression which contains a sub-expression, the only thing we need to know about the sub-expression is its value."

Christopher Strachey, "Fundamental Concepts in Programming Languages", Higher-Order and Symbolic Computation, 13, 1149, 2000, Kluwer Academic Publishers (lecture notes written by Christopher Strachey for the International Summer School in Computer Programming at Copenhagen in August, 1967).

Please read on for discussion of a subtle bug shared by a few popular non-standard evaluation interfaces.

Continue reading A Subtle Flaw in Some Popular R NSE Interfaces

Posted on Categories Opinion, Programming, TutorialsTags , , , Leave a comment on Timing Column Indexing in R

Timing Column Indexing in R

I’ve ended up (almost accidentally) collecting a number of different solutions to the “use a column to choose values from other columns in R” problem.

Please read on for a brief benchmark comparing these methods/solutions.

Continue reading Timing Column Indexing in R

Posted on Categories Coding, data science, Programming, TutorialsTags , , , , 15 Comments on Using a Column as a Column Index

Using a Column as a Column Index

We recently saw a great recurring R question: “how do you use one column to choose a different value for each row?” That is: how do you use a column as an index? Please read on for some idiomatic base R, data.table, and dplyr solutions.

Continue reading Using a Column as a Column Index

Posted on Categories Administrativia, Exciting Techniques, ProgrammingTags , , , 2 Comments on Dot-Pipe Paper Accepted by the R Journal!!!

Dot-Pipe Paper Accepted by the R Journal!!!

We are thrilled to announce our (my and Nina Zumel’s) paper on the dot-pipe has been accepted by the R-Journal!

Untitled

Continue reading Dot-Pipe Paper Accepted by the R Journal!!!

Posted on Categories Opinion, Programming, TutorialsTags , , , , 3 Comments on Parameterizing with bquote

Parameterizing with bquote

One thing that is sure to get lost in my long note on macros in R is just how concise and powerful macros are. The problem is macros are concise, but they do a lot for you. So you get bogged down when you explain the joke.

Let’s try to be concise.

Continue reading Parameterizing with bquote

Posted on Categories Opinion, ProgrammingTags , , 2 Comments on Better R Code with wrapr Dot Arrow

Better R Code with wrapr Dot Arrow

Our R package wrapr supplies a "piping operator" that we feel is a real improvement in R code piped-style coding.

The idea is: with wrapr‘s "dot arrow" pipe "%.>%" the expression "A %.>% B" is treated very much like "{. <- A; B}". In particular this lets users think of "A %.>% B(.)" as a left-to-right way to write "B(A)" (i.e. under the convention of writing-out the dot arguments, the pipe looks a bit like left to right function composition, call this explicit dot notation).

This sort of notation becomes useful when we compose many steps. Some consider "A %.>% B(.) %.>% C(.) %.>% D(.)" to be easier to read and easier to maintain than "D(C(B(A)))".

Continue reading Better R Code with wrapr Dot Arrow

Posted on Categories Programming, TutorialsTags , , , 5 Comments on Announcing wrapr 1.6.2

Announcing wrapr 1.6.2

wrapr 1.6.2 is now up on CRAN. We have some neat new features for R users to try (in addition to many earlier wrapr goodies).



Continue reading Announcing wrapr 1.6.2

Posted on Categories Programming, TutorialsTags 2 Comments on A Quick Appreciation of the R transform Function

A Quick Appreciation of the R transform Function

R users who also use the dplyr package will be able to quickly understand the following code that adds an estimated area column to a data.frame.

suppressPackageStartupMessages(library("dplyr"))

iris %>%
  mutate(
    ., 
    Petal.Area = (pi/4)*Petal.Width*Petal.Length) %>%
  head(.)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species Petal.Area
## 1          5.1         3.5          1.4         0.2  setosa  0.2199115
## 2          4.9         3.0          1.4         0.2  setosa  0.2199115
## 3          4.7         3.2          1.3         0.2  setosa  0.2042035
## 4          4.6         3.1          1.5         0.2  setosa  0.2356194
## 5          5.0         3.6          1.4         0.2  setosa  0.2199115
## 6          5.4         3.9          1.7         0.4  setosa  0.5340708

The notation we used above is the "explicit argument" variation we recommend for readability. What a lot of dplyr users do not seem to know is: base-R already has this functionality. The function is called transform().

To demonstrate this, let’s first detach dplyr to show that we are not using functions from dplyr.

detach("package:dplyr", unload = TRUE)

Now let’s write the equivalent pipeline using exclusively base-R.

iris ->.
   transform(
     ., 
     Petal.Area = (pi/4)*Petal.Width*Petal.Length) ->.
   head(.)
##   Sepal.Length Sepal.Width Petal.Length Petal.Width Species Petal.Area
## 1          5.1         3.5          1.4         0.2  setosa  0.2199115
## 2          4.9         3.0          1.4         0.2  setosa  0.2199115
## 3          4.7         3.2          1.3         0.2  setosa  0.2042035
## 4          4.6         3.1          1.5         0.2  setosa  0.2356194
## 5          5.0         3.6          1.4         0.2  setosa  0.2199115
## 6          5.4         3.9          1.7         0.4  setosa  0.5340708

The "->." notation is the end-of-line variation of the Bizarro Pipe. The transform() function has been part of R since 1998. dplyr::mutate() was introduced in 2014.

git log --all -p --reverse --source -S 'transform <-'

commit 41c2f7338c45dbf9eac99c210206bc3657bca98a refs/remotes/origin/tags/R-0-62-4
Author: pd <pd@00db46b3-68df-0310-9c12-caf00c1e9a41>
Date:   Wed Feb 11 18:31:12 1998 +0000

    Added the frametools functions subset() and transform()
    
    git-svn-id: https://svn.r-project.org/R/trunk@709 00db46b3-68df-0310-9c12-caf00c1e9a41