
Quasiquotation in R via bquote()

In August of 2003 Thomas Lumley added bquote() to R 1.8.1. This gave R and R users an explicit Lisp-style quasiquotation capability. bquote() and quasiquotation are actually quite powerful. Professor Thomas Lumley should get, and should continue to receive, a lot of credit and thanks for introducing the concept into R.
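As a small illustration of the idea (a minimal sketch, not taken from the post itself): bquote() quotes an expression, but anything wrapped in .() is evaluated first and spliced in. The names `x_name`, `threshold`, and the data frame `d` below are hypothetical and chosen only for the example.

```r
# bquote() returns its argument unevaluated, except that sub-expressions
# wrapped in .() are evaluated and substituted into the result.
x_name <- as.name("height")   # hypothetical column name, held as a symbol
threshold <- 10               # hypothetical cutoff value

# Build the expression `height > 10` without typing it literally.
expr <- bquote(.(x_name) > .(threshold))
print(expr)

# The quoted expression can later be evaluated against a data frame.
d <- data.frame(height = c(5, 12, 20))
res <- eval(expr, d)
print(res)
```

The point of the sketch is that the column name and the cutoff are ordinary values held in variables, yet the resulting expression reads as if they had been written in directly.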

In fact bquote() is already powerful enough to build a version of dplyr 0.5.0 with quasiquotation semantics quite close (from a user perspective) to what is now claimed in tidyeval/rlang.

Let’s take a look at that.



Piping into ggplot2

In our wrapr pipe R Journal article we used piping into ggplot2 layers/geoms/items as an example.

Whether the same pipe operator can be used for both data processing steps and ggplot2 layering is a question that comes up from time to time (for example: Why can’t ggplot2 use %>%?). In fact, the primary ggplot2 package author wishes that magrittr piping were the composing notation for ggplot2 (though it is obviously too late to change).

There are some fundamental difficulties in trying to use the magrittr pipe in such a way. In particular, magrittr looks for its own pipe by name in un-evaluated code, and thus is difficult to engineer over (though it can be hacked around). The general issue is that pipe stages are usually functions or function calls, while ggplot2 components are objects (verbs versus nouns), and at first these seem incompatible.

However, the wrapr dot-arrow-pipe was designed to handle such distinctions.

Let’s work an example.



Some R Guides: tidyverse and data.table Versions

Saghir Bashir of ilustat recently shared a nice guide to getting started with R and the tidyverse.


In addition, they were generous enough to link to Dirk Eddelbuettel’s later adaptation of the guide to use data.table.


This type of cooperation and user choice is what keeps the R community vital. Please encourage it. (Heck, please insist on it!)


Running the Same Task in Python and R

According to a KDD poll, fewer respondents (by rate) used only R in 2017 than in 2016. At the same time, more respondents (by rate) used only Python in 2017 than in 2016.

Let’s take this as an excuse to take a quick look at what happens when we try a task in both systems.



Quick Significance Calculations for A/B Tests in R

Introduction

Let’s take a quick look at a very important and common experimental problem: checking if the difference in success rates of two Binomial experiments is statistically significant. This can arise in A/B testing situations such as online advertising, sales, and manufacturing.

We already share a free video course on a Bayesian treatment of planning and evaluating A/B tests (including a free Shiny application). Let’s now take a look at the should-be-simple task of building a summary statistic that includes a classic frequentist significance.
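One way such a calculation can be sketched in base R (not necessarily the approach the full post takes) is with prop.test() from the stats package, which tests whether two Binomial success rates are equal. The counts below are made up purely for illustration.

```r
# Hypothetical A/B outcome counts (illustrative only, not real data).
successes <- c(A = 82, B = 110)     # number of conversions per arm
trials    <- c(A = 1000, B = 1000)  # number of exposures per arm

# Observed success rate for each arm.
rates <- successes / trials
print(rates)

# Two-sample test of equal proportions; by default prop.test() applies
# a continuity correction and reports a two-sided p-value.
test <- prop.test(successes, trials)
p_value <- test$p.value
print(p_value)
```

For small counts an exact method such as fisher.test() on the 2x2 table of successes and failures may be a more appropriate choice.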



Modeling Multi-category Outcomes With vtreat

vtreat is a powerful R package for preparing messy real-world data for machine learning. We have further extended the package with a number of features including rquery/rqdatatable integration (allowing vtreat application at scale on Apache Spark or data.table!).

In addition, vtreat can now effectively prepare data for multi-class classification or multinomial modeling.
