Please give it a try!
A while back Simon Jackson and Kara Woo shared some great ideas and graphs on grouped bar charts and density plots. Win-Vector LLC’s Nina Zumel just added a graph of this type to the development version of WVPlots.
Nina has, as usual, some great documentation here.
(rquery at BARUG, photo credit: Timothy Liu)
I am now looking for invitations to give a streamlined version of this talk privately to groups using R who want to work with SQL (with databases such as PostgreSQL or big data systems such as Apache Spark).
rquery has a number of features that greatly improve team productivity in this environment (strong separation of concerns, strong error checking, high usability, specific debugging features, and high-performance queries).
If your group is in the San Francisco Bay Area and using R to work with a SQL-accessible data source, please reach out to me at email@example.com. I would be honored to show your team how to speed up their project and lower development costs with rquery. If you are a big data vendor and some of your clients use R, I am especially interested in getting in touch: our system can help R users start working with your installation.
I have a couple of public appearances coming up soon.
- The East Bay R Language Beginners Group: Preparing Datasets – The Ugly Truth & Some Solutions, Tuesday, May 1, 2018 at Robert Half Technologies, 1999 Harrison Street, Oakland, CA, 94612.
- Official May 2018 BARUG Meeting: rquery: a Query Generator for Working With SQL Data, Tuesday, May 8, 2018 at Intuit, Building 20, 2600 Marine Way, Mountain View, CA.
cdata is a data manipulation package that subsumes many higher-order data manipulation operations, including pivot/un-pivot, spread/gather, and cast/melt. The record-to-record transforms are specified by drawing a table that expresses the record structure. This table, called the “control table”, is also the link between the key concepts of row-records and block-records.
What can be quickly specified and achieved using these concepts and notations is amazing and quite teachable. These transforms can be run in-memory or in remote database or big-data systems (such as Spark).
The concepts are taught in Nina Zumel’s excellent tutorial and in John Mount’s quick screencast/lecture.
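To make the control-table idea concrete, here is a minimal sketch in plain Python. It is not the cdata API: the function names, the `point` key column, and the line-segment data are all hypothetical, chosen only for illustration. It assumes that the control table's key column names each row of the block-record, and that every other cell names the row-record column that supplies the value at that position.

```python
# Illustrative sketch of the control-table idea in plain Python.
# This is NOT the cdata API; all names and data here are hypothetical.

# Each control-table row describes one row of the block-record.
# The key column ("point") labels the row; every other cell holds the
# name of the row-record column that fills that position.
CONTROL_TABLE = [
    {"point": "start", "x": "x_start", "y": "y_start"},
    {"point": "end", "x": "x_end", "y": "y_end"},
]
KEY_COLUMN = "point"


def rowrec_to_block(row, control_table, key_column, columns_to_copy):
    """Expand one row-record (a dict) into a block-record (a list of dicts)."""
    block = []
    for control_row in control_table:
        # Record identifiers are copied onto every row of the block.
        out = {c: row[c] for c in columns_to_copy}
        out[key_column] = control_row[key_column]
        for block_col, source_col in control_row.items():
            if block_col != key_column:
                out[block_col] = row[source_col]
        block.append(out)
    return block


def block_to_rowrec(block, control_table, key_column, columns_to_copy):
    """Collapse one block-record back into a single row-record."""
    by_key = {r[key_column]: r for r in block}
    row = {c: block[0][c] for c in columns_to_copy}
    for control_row in control_table:
        block_row = by_key[control_row[key_column]]
        for block_col, target_col in control_row.items():
            if block_col != key_column:
                row[target_col] = block_row[block_col]
    return row


# One line segment stored as a single row-record.
segment = {"id": 1, "x_start": 0, "y_start": 0, "x_end": 3, "y_end": 4}
block = rowrec_to_block(segment, CONTROL_TABLE, KEY_COLUMN, ["id"])
restored = block_to_rowrec(block, CONTROL_TABLE, KEY_COLUMN, ["id"])
```

In this sketch the same table drives both directions, which is one way to read the claim that the control table links row-records and block-records: the round trip returns the original record.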
The 0.7.0 update adds local versions of the operators in addition to the Spark and database implementations. These methods should now be a bit safer for in-memory complex/annotated types such as dates and times.
R has a lot of under-appreciated, super-powerful functions. I list a few of our favorites below.
Atlas, carrying the sky. Royal Palace (Paleis op de Dam), Amsterdam.
Photo: Dominik Bartsch, CC some rights reserved.
Four years ago today, we (authors Nina Zumel and John Mount) received our author’s copies of Practical Data Science with R!