cdata is a data manipulation package that subsumes many higher order data manipulation operations including pivot/un-pivot, spread/gather, or cast/melt. The record to record transforms are specified by drawing a table that expresses the record structure (called the “control table” and also the link between the key concepts of row-records and block-records).
What can be quickly specified and achieved using these concepts and notations is amazing and quite teachable. These transforms can be run in-memory or in remote database or big-data systems (such as Spark).
The 0.7.0 update adds local versions of the operators in addition to the Spark and database implementations. These methods should now be a bit safer for in-memory complex/annotated types such as dates and times.
Rtip: first organize your tasks in terms of data, values, and desired transformation of values, not initially in terms of concrete functions or code.
I know I write a lot about coding in R. But it is in the service of supporting statistics, analysis, predictive analytics, and data science.
R without data is like going to the theater to watch the curtain go up and down.
(Adapted from Ben Katchor’s Julius Knipl, Real Estate Photographer: Stories, Little, Brown, and Company, 1996, page 72, “Excursionist Drama 2”.)
Usually you come to R to work with data. If you think and plan in terms of data and values (including introducing more data to control processing) you will usually work in much faster, explainable, and maintainable fashion.
Take care if trying the new RPostgres database connection package. By default it returns some non-standard types that code developed against other database drivers may not expect, and may not be ready to defend against.