
## Fully General Record Transforms with cdata

One of the design goals of the `cdata` `R` package is that very powerful and arbitrary record transforms should be convenient and take only one or two steps. In fact, the goal is to take just about any record shape to any other in two steps: first convert to row-records, then re-block the data into arbitrary record shapes (please see here and here for the concepts).

But as with all general ideas, it is much easier to see what we mean by the above with a concrete example.


## Make Teaching R Quasi-Quotation Easier

To make teaching `R` quasi-quotation easier, it would be nice if `R` string-interpolation and quasi-quotation both used the same notation. They are related concepts, so a common notation would be clarifying and would help teach both. We will define both of the above terms, and demonstrate the relation between the two concepts.
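As a rough illustration of how the two ideas relate, here is a minimal sketch in base `R`. It assumes `sprintf()` as a simple stand-in for string interpolation and `bquote()` with its `.()` marker for quasi-quotation; the variable names are made up for the example, and this is not necessarily the notation the post goes on to propose.

```r
# String interpolation: substitute a value into a string template.
name <- "x"
msg <- sprintf("the column is %s", name)

# Quasi-quotation: substitute a value into quoted (unevaluated) code.
# In bquote(), .() marks the part to be evaluated and spliced in.
var <- as.name(name)
expr <- bquote(.(var) + 1)

# The quoted code can be evaluated later, once the symbol has a value.
x <- 41
result <- eval(expr)
```

In both cases a value is substituted into a template; the difference is that the first template is a string and the second is code.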


## R Tip: Use Inline Operators For Legibility

`R` Tip: use inline operators for legibility.

A `Python` feature I miss when working in `R` is the convenience of `Python`'s inline `+` operator. In `Python`, `+` does the right thing for some built-in data types:

• It concatenates lists: `[1,2] + [3]` is `[1, 2, 3]`.
• It concatenates strings: `'a' + 'b'` is `'ab'`.

And, of course, it adds numbers: `1 + 2` is `3`.

The inline notation is very convenient and legible. In this note we will show how to use a related notation in `R`.
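One way to get a similar effect in base `R` is a user-defined infix operator. The sketch below is only an illustration: the operator name `%+%` and its dispatch rule are assumptions made for this example, not necessarily the notation the note recommends.

```r
# Hypothetical inline operator; the name %+% is chosen for illustration.
`%+%` <- function(a, b) {
  if (is.character(a) && is.character(b)) {
    paste0(a, b)   # concatenate strings
  } else {
    c(a, b)        # concatenate vectors or lists
  }
}

s <- "a" %+% "b"
v <- c(1, 2) %+% 3
l <- list(1, 2) %+% list(3)
```

Any function whose name is wrapped in `%` signs can be used inline this way, which keeps the call site close to the `Python` form.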


## Practical Data Science with R, 2nd Edition discount!

The second edition of our best-selling book Practical Data Science with R (Zumel, Mount) is featured as deal of the day at Manning.

The second edition isn’t finished yet, but chapters 1 through 4 are available in the Manning Early Access Program (MEAP), and we have finished chapters 5 and 6 which are now in production at Manning (so they should be available soon). The authors are hard at work on chapters 7 and 8 right now.

The discount gets you half off. Also, the 2nd edition comes with a free e-copy of the first edition (so you can jump ahead).

Here are the details in Tweetable form:

Deal of the Day January 13: Half off Practical Data Science with R, Second Edition. Use code dotd011319au at http://bit.ly/2SKAxe9.


## R Tip: Use seqi() For Indexes

`R` Tip: use `seqi()` for indexing.

`R`'s “`1:0` trap” is a mal-feature that confuses newcomers and is a reliable source of bugs. This note will show how to use `seqi()` to write more reliable code and document intent.
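To make the trap concrete, here is a small sketch in base `R`. It does not use `seqi()` itself; the helper `seq_up()` is a hypothetical stand-in written for this example to show the increasing-only behavior one usually wants from an index sequence.

```r
n <- 0

# The trap: 1:n counts *down* when n is 0, so this loop runs twice.
trap_count <- 0
for (i in 1:n) {
  trap_count <- trap_count + 1
}

# seq_len(n) is empty when n is 0, so this loop runs zero times.
safe_count <- 0
for (i in seq_len(n)) {
  safe_count <- safe_count + 1
}

# Hypothetical helper: an increasing-only sequence from a to b,
# empty when b < a.
seq_up <- function(a, b) {
  if (b < a) integer(0) else seq.int(a, b)
}
```

With this helper, `seq_up(1, 0)` is empty rather than the descending `c(1, 0)` that `1:0` produces.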


## A Beautiful 2 by 2 Matrix Identity

While working on a variation of the `RcppDynProg` algorithm we derived the following beautiful identity of 2 by 2 real matrices:

Here the superscript “top” denotes the transpose operation, ||.||^2_2 denotes the sum of squares norm, and the single |.| denotes the determinant.

This is derived from one of the check equations for the Moore–Penrose inverse and we have details of the derivation here, and details of the messy algebra here.


## Timing the Same Algorithm in R, Python, and C++

While developing the `RcppDynProg` `R` package I took a little extra time to port the core algorithm from `C++` to both `R` and `Python`.

This means I can time the exact same algorithm, implemented nearly identically, in each of these three languages, and extract some comparative “apples to apples” timings. Please read on for a summary of the results.


## What does it mean to write “vectorized” code in R?

One often hears that `R` cannot be fast (false), or more correctly that for fast code in `R` you may have to consider “vectorizing.”

A lot of knowledgeable `R` users are not comfortable with the term “vectorize”, and not really familiar with the method.

“Vectorize” is just a slightly high-handed way of saying:

`R` naturally stores data in columns (or in column major order), so if you are not coding to that pattern you are fighting the language.

In this article we will make the above clear by working through a non-trivial example of writing vectorized code.
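As a much smaller illustration than the article's own example, the sketch below computes the same result in scalar style and in vectorized style. The data and variable names are made up for this example.

```r
x <- c(1.5, 2.0, 3.5, 4.0)
y <- c(2.0, 0.5, 1.0, 3.0)

# Scalar style: visit one element at a time.
loop_result <- numeric(length(x))
for (i in seq_along(x)) {
  loop_result[i] <- x[i] * y[i] + 1
}

# Vectorized style: operate on whole columns at once.
vec_result <- x * y + 1
```

Both forms give the same values; the vectorized form is shorter and leaves the element-by-element work to `R`'s built-in arithmetic.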