Categories: Opinion, Programming, Tutorials

wrapr 1.4.1 now up on CRAN

wrapr 1.4.1 is now available on CRAN. wrapr is a really neat R package for organizing, meta-programming, and debugging R code. This update generalizes the S3 features of the dot-pipe.

Please give it a try!


Categories: Opinion, Statistics, Tutorials

Ready Made Plots make Work Easier

A while back Simon Jackson and Kara Woo shared some great ideas and graphs on grouped bar charts and density plots (link). Win-Vector LLC's Nina Zumel just added a graph of this type to the development version of WVPlots.

Nina has, as usual, some great documentation here.


Categories: Administrativia, Opinion, Programming, Statistics

rquery: SQL from R

My BARUG rquery talk went very well. Thank you very much to the attendees for being an attentive and generous audience.

(John teaching rquery at BARUG, photo credit: Timothy Liu)

I am now looking for invitations to give a streamlined version of this talk privately to groups using R who want to work with SQL (with databases such as PostgreSQL or big data systems such as Apache Spark). rquery has a number of features that greatly improve team productivity in this environment (strong separation of concerns, strong error checking, high usability, specific debugging features, and high-performance queries).

If your group is in the San Francisco Bay Area and using R to work with a SQL-accessible data source, please reach out to me at jmount@win-vector.com. I would be honored to show your team how to speed up their project and lower development costs with rquery. If you are a big data vendor and some of your clients use R, I am especially interested in getting in touch: our system can help R users start working with your installation.

Categories: Administrativia, data science, Exciting Techniques, Opinion, Pragmatic Data Science, Pragmatic Machine Learning, Statistics, Tutorials

Upcoming speaking engagements

I have a couple of public appearances coming up soon.


Categories: Coding, Opinion, Programming, Statistics, Tutorials

R Tip: Use Slices

R tip: use slices.


R has a very powerful array slicing ability that allows for some very slick data processing.
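As a rough illustration of the kind of slicing meant here, the following base-R sketch selects rows with a logical index, reorders rows with an integer index, and assigns into a slice. The data frame `d` and its columns are made up for the example and are not taken from the full article.

```r
# Hypothetical example data (not from the original post).
d <- data.frame(
  x = c(1, 2, 3, 4),
  g = c("a", "b", "a", "b"),
  stringsAsFactors = FALSE
)

# Logical slice: keep only the rows where g is "a".
# drop = FALSE keeps the result a data.frame even if one column remains.
a_rows <- d[d$g == "a", , drop = FALSE]

# Integer slice: re-order the rows (here, reversed).
rev_rows <- d[rev(seq_len(nrow(d))), , drop = FALSE]

# Slice assignment: overwrite x only where g is "b".
d$x[d$g == "b"] <- 0
```

The same bracket notation covers selection, re-ordering, and in-place update, which is much of what makes slices convenient for data processing.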


Categories: data science, Opinion, Practical Data Science, Pragmatic Data Science, Pragmatic Machine Learning, Statistics

cdata Update

The R package cdata now has version 0.7.0 available from CRAN.

cdata is a data manipulation package that subsumes many higher order data manipulation operations including pivot/un-pivot, spread/gather, and cast/melt. The record-to-record transforms are specified by drawing a table that expresses the record structure. This table, called the “control table”, is also the link between the key concepts of row-records and block-records.

What can be quickly specified and achieved using these concepts and notations is amazing and quite teachable. These transforms can be run in-memory or in remote database or big-data systems (such as Spark).

The concepts are taught in Nina Zumel’s excellent tutorial.


And in John Mount’s quick screencast/lecture.

link, slides

The 0.7.0 update adds local versions of the operators in addition to the Spark and database implementations. These methods should now be a bit safer for in-memory complex/annotated types such as dates and times.

Categories: Opinion, Programming, Statistics

Neglected R Super Functions

R has a lot of under-appreciated super powerful functions. I list a few of our favorites below.


Atlas, carrying the sky. Royal Palace (Paleis op de Dam), Amsterdam.

Photo: Dominik Bartsch, CC some rights reserved.


Categories: Coding, Opinion, Tutorials

magrittr and wrapr Pipes in R, an Examination

Let’s consider piping in R both using the magrittr package and using the wrapr package.


Categories: Administrativia, data science, Opinion, Practical Data Science, Pragmatic Data Science, Statistics

Four Years of Practical Data Science with R

Four years ago today authors Nina Zumel and John Mount received our author’s copies of Practical Data Science with R!
