To make getting started with `rquery` (an advanced query generator for `R`) easier, we have re-worked the package `README` for various data-sources (including `SparkR`!).

# Tag: R

## Playing With Pipe Notations

Recently Hadley Wickham prescribed pronouncing the `magrittr` pipe as “then” and using right-assignment as follows:

I am not sure if it is a good or bad idea. But let’s play with it a bit, and perhaps readers can submit their experience and opinions in the comments section.
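For readers who have not seen the combination, here is a small sketch of a `magrittr` pipeline that ends in a right-assignment. It is an illustration written for this note, not the example from the original recommendation, and the variable name `total` is made up.

```r
library(magrittr)

# Read as: take the numbers, THEN take square roots, THEN sum,
# and finally assign the result to `total` (right-assignment).
c(1, 4, 9) %>%
  sqrt() %>%
  sum() -> total

print(total)
```

The pipeline reads left to right, with the destination name at the very end instead of the very beginning.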

## Query Generation in R

## PDSwR2 Free Excerpt and New Discount Code

Manning has a new discount code and a free excerpt of our book Practical Data Science with R, 2nd Edition: here.

This section is elementary, but things really pick up speed later on (also available in a paid preview).

## cdata Control Table Keys

In our `cdata` `R` package and training materials we emphasize the record-oriented thinking and how to design a transform control table. We now have an additional exciting new feature: control table keys.

The user can now control which columns of a `cdata` control table are the keys, including composite keys (that is, keys that are spread across more than one column). This is easiest to demonstrate with an example.
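A minimal sketch of a composite key follows. It assumes the `controlTableKeys` argument of `cdata::rowrecs_to_blocks()`; the data frame `flowers` and its values are invented for illustration and are not the example from the full article.

```r
library(cdata)

# One row-record per flower: all four measurements sit in a single row.
flowers <- data.frame(
  id           = c(1, 2),
  Petal.Length = c(1.4, 4.7),
  Petal.Width  = c(0.2, 1.4),
  Sepal.Length = c(5.1, 7.0),
  Sepal.Width  = c(3.5, 3.2)
)

# Control table: Part and Measure together form the composite key;
# the Value column names the source column for each cell.
control_table <- data.frame(
  Part    = c("Petal", "Petal", "Sepal", "Sepal"),
  Measure = c("Length", "Width", "Length", "Width"),
  Value   = c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width"),
  stringsAsFactors = FALSE
)

# Convert each row-record into a block of four rows keyed by
# (Part, Measure), carrying the id column along.
blocks <- rowrecs_to_blocks(
  flowers,
  controlTable = control_table,
  controlTableKeys = c("Part", "Measure"),
  columnsToCopy = "id"
)

print(blocks)
```

With a single-column key the same layout would need one combined label such as `Petal.Length`; splitting the key across two columns keeps `Part` and `Measure` available as separate variables.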

## PDSwR2: New Chapters!

We have two new chapters of *Practical Data Science with R, Second Edition* online and available for review!

The newly available chapters cover:

**Data Engineering And Data Shaping** – Explores how to use R to organize or wrangle data into a shape useful for analysis. The chapter covers applying data transforms, data manipulation packages, and more.

**Choosing and Evaluating Models** – The chapter starts with exploring machine learning approaches and then moves to studying key model evaluation topics like mapping business problems to machine learning tasks, evaluating model quality, and how to explain model predictions.

If you haven’t signed up for our book’s MEAP (Manning Early Access Program), we encourage you to do so. The MEAP includes a free copy of *Practical Data Science with R*, First Edition, as well as early access to chapter drafts of the second edition as we complete them.

For those of you who have already subscribed — thank you! We hope you enjoy the new chapters, and we look forward to your feedback.

## Function Objects and Pipelines in R

Composing functions and sequencing operations are core programming concepts.

Some notable realizations of sequencing or pipelining operations include:

- Unix’s `|`-pipe.
- CMS Pipelines.
- `F#`’s forward pipe operator `|>`.
- Haskell’s Data.Function `&` operator.
- The `R` `magrittr` forward pipe.
- Scikit-learn’s `sklearn.pipeline.Pipeline`.

The idea is: many important calculations can be considered as a sequence of transforms applied to a data set. Each step may be a function taking many arguments. It is often the case that only one of each function’s arguments is primary, and the rest are parameters. For data science applications this is particularly common, so having convenient pipeline notation can be a plus. An example of a non-trivial data processing pipeline can be found here.

In this note we will discuss the advanced `R` pipeline operator "dot arrow pipe" and an `S4` class (`wrapr::UnaryFn`) that makes working with pipeline notation much more powerful and much easier.
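As a small taste of the notation, here is a sketch using the `wrapr` dot arrow pipe `%.>%`. The input values are made up, and the sketch does not cover `wrapr::UnaryFn` itself.

```r
library(wrapr)

# The dot marks where the piped value is placed in each step.
result <- c(1, 4, 9) %.>% sqrt(.) %.>% sum(.)
print(result)

# The piped value is the primary argument; `digits` is a parameter.
rounded <- c(3.14159, 2.71828) %.>% round(., digits = 2)
print(rounded)
```

Because the dot is written explicitly, the piped value can appear in any argument position, not only the first.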

## Fully General Record Transforms with cdata

One of the design goals of the `cdata` `R` package is that very powerful and arbitrary record transforms should be convenient and take only one or two steps. In fact it is the goal to take just about any record shape to any other in two steps: first convert to row-records, then re-block the data into arbitrary record shapes (please see here and here for the concepts).

But as with all general ideas, it is much easier to see what we mean by the above with a concrete example.

## Make Teaching R Quasi-Quotation Easier

To make teaching `R` quasi-quotation easier it would be nice if `R` string-interpolation and quasi-quotation both used the same notation. They are related concepts. So some commonality of notation would actually be clarifying, and help teach the concepts. We will define both of the above terms, and demonstrate the relation between the two concepts.
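To make the two terms concrete, here is a minimal sketch in base `R` only: `bquote()` for quasi-quotation and `sprintf()` for string interpolation. The variable names are made up for illustration. Note that the two facilities mark the substitution point differently (`.(x)` versus `%s`), which is the mismatch in notation described above.

```r
x <- 7

# Quasi-quotation: build an unevaluated expression, substituting the
# current value of x where .(x) appears and leaving y as a symbol.
expr <- bquote(.(x) + y)
print(expr)

# The expression can be evaluated later, once y is known.
value <- eval(expr, list(y = 1))
print(value)

# String interpolation: substitute the value of x into a string.
msg <- sprintf("x is %s", x)
print(msg)
```

In both cases a value is spliced into a template; the template is code in the first case and text in the second.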

## R Tip: Use Inline Operators For Legibility

`R` Tip: use inline operators for legibility.

A `Python` feature I miss when working in `R` is the convenience of `Python`’s inline `+` operator. In `Python`, `+` does the right thing for some built-in data types:

- It concatenates lists: `[1,2] + [3]` is `[1, 2, 3]`.
- It concatenates strings: `'a' + 'b'` is `'ab'`.

And, of course, it adds numbers: `1 + 2` is `3`.

The inline notation is very convenient and legible. In this note we will show how to use a related notation in `R`.
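One way to get a similar effect is `R`'s user-defined `%name%` operators. The sketch below defines two such operators in base `R`; the names `%c%` and `%p%` are chosen here for illustration and are defined locally, so no package is assumed.

```r
# Inline concatenation of vectors or lists.
`%c%` <- function(e1, e2) c(e1, e2)

# Inline concatenation of strings (no separator).
`%p%` <- function(e1, e2) paste0(e1, e2)

v <- c(1, 2) %c% 3
print(v)

s <- "a" %p% "b"
print(s)
```

Any function of two arguments can be bound to a name of the form `%name%` and then used inline, which keeps longer expressions readable from left to right.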