Posted on Categories Coding, TutorialsTags , , 4 Comments on Getting Started With rquery

Getting Started With rquery

To make getting started with rquery (an advanced query generator for R) easier we have re-worked the package README for various data-sources (including SparkR!).

Continue reading Getting Started With rquery

Posted on Categories Coding, OpinionTags , , , , Leave a comment on Playing With Pipe Notations

Playing With Pipe Notations

Recently Hadley Wickham prescribed pronouncing the magrittr pipe as “then” and using right-assignment as follows:

NewImage

I am not sure if it is a good or bad idea. But let’s play with it a bit, and perhaps readers can submit their experience and opinions in the comments section.

Continue reading Playing With Pipe Notations

Posted on Categories Coding, Opinion, TutorialsTags , , , 7 Comments on Timing the Same Algorithm in R, Python, and C++

Timing the Same Algorithm in R, Python, and C++

While developing the RcppDynProg R package I took a little extra time to port the core algorithm from C++ to both R and Python.

This means I can time the exact same algorithm implemented nearly identically in each of these three languages. So I can extract some comparative “apples to apples” timings. Please read on for a summary of the results.

Continue reading Timing the Same Algorithm in R, Python, and C++

Posted on Categories Coding, Opinion, Programming, TutorialsTags , , Leave a comment on Quoting Concatenate

Quoting Concatenate

In our last note we used wrapr::qe() to help quote expressions. In this note we will discuss quoting and code-capturing interfaces (interfaces that capture user source code) a bit more.

Continue reading Quoting Concatenate

Posted on Categories Coding, Exciting Techniques, Programming, TutorialsTags , , Leave a comment on Reusable Pipelines in R

Reusable Pipelines in R

Pipelines in R are popular, the most popular one being magrittr as used by dplyr.

This note will discuss the advanced re-usable piping systems: rquery/rqdatatable operator trees and wrapr function object pipelines. In each case we have a set of objects designed to extract extra power from the wrapr dot-arrow pipe %.>%.

Continue reading Reusable Pipelines in R

Posted on Categories Coding, OpinionTags , , , Leave a comment on Timing Grouped Mean Calculation in R

Timing Grouped Mean Calculation in R

This note is a comment on some of the timings shared in the dplyr-0.8.0 pre-release announcement.

The original published timings were as follows:

With performance metrics: measurements are marketing. So let’s dig in the above a bit.

Continue reading Timing Grouped Mean Calculation in R

Posted on Categories Coding, OpinionTags , , 2 Comments on Quasiquotation in R via bquote()

Quasiquotation in R via bquote()

In August of 2003 Thomas Lumley added bquote() to R 1.8.1. This gave R and R users an explicit Lisp-style quasiquotation capability. bquote() and quasiquotation are actually quite powerful. Professor Thomas Lumley should get, and should continue to receive, a lot of credit and thanks for introducing the concept into R.

In fact bquote() is already powerful enough to build a version of dplyr 0.5.0 with quasiquotation semantics quite close (from a user perspective) to what is now claimed in tidyeval/rlang.

Let’s take a look at that.

Continue reading Quasiquotation in R via bquote()

Posted on Categories Coding, OpinionTags , , 15 Comments on Running the Same Task in Python and R

Running the Same Task in Python and R

According to a KDD poll fewer respondents (by rate) used only R in 2017 than in 2016. At the same time more respondents (by rate) used only Python in 2017 than in 2016.

Let’s take this as an excuse to take a quick look at what happens when we try a task in both systems.

Continue reading Running the Same Task in Python and R

Posted on Categories Coding, data science, Programming, TutorialsTags , , , , 15 Comments on Using a Column as a Column Index

Using a Column as a Column Index

We recently saw a great recurring R question: “how do you use one column to choose a different value for each row?” That is: how do you use a column as an index? Please read on for some idiomatic base R, data.table, and dplyr solutions.

Continue reading Using a Column as a Column Index