Categories: data science, Statistics, Tutorials

New improved cdata instructional video

We have a new, improved version of the “how to design a cdata/data_algebra data transform” video up!

The original article, the Python example, and the R example have all been updated to use the new video.

Please check it out!

Categories: data science, Opinion, Pragmatic Data Science, Tutorials

New Timings for a Grouped In-Place Aggregation Task

I’d like to share some new timings on a grouped in-place aggregation task. A client of mine was seeing some slow performance, so I decided to time a very simple abstraction of one of the steps of their workflow.
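As a rough illustration of the kind of task being timed, here is a minimal sketch in plain Python. It assumes "grouped in-place aggregation" means computing a per-group summary and attaching it to every row without collapsing the rows; the function name `grouped_in_place_sum`, the column names, and the synthetic data are all hypothetical and are not taken from the client workflow or the original timings.

```python
from collections import defaultdict
import timeit


def grouped_in_place_sum(rows, group_key, value_key, out_key):
    """Return copies of rows with the group's total attached to each row."""
    # First pass: accumulate one total per group.
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_key]] += row[value_key]
    # Second pass: keep every row, adding its group's total as a new field.
    return [{**row, out_key: totals[row[group_key]]} for row in rows]


# Small hand-checkable example (hypothetical data).
rows = [
    {"g": "a", "x": 1.0},
    {"g": "a", "x": 2.0},
    {"g": "b", "x": 5.0},
]
result = grouped_in_place_sum(rows, "g", "x", "x_group_sum")

# Time the same step on a larger synthetic input.
big = [{"g": i % 1000, "x": float(i)} for i in range(100_000)]
elapsed = timeit.timeit(
    lambda: grouped_in_place_sum(big, "g", "x", "x_group_sum"),
    number=3,
)
print(f"3 runs on {len(big)} rows: {elapsed:.3f} seconds")
```

The row count is unchanged by the operation, which is what distinguishes it from an ordinary group-and-summarize step. Timings from a sketch like this depend heavily on the machine and the data shape, so they are only useful for comparing alternatives side by side.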


Categories: Administrativia, Computer Science, Pragmatic Data Science

Better SQL Generation via the data_algebra

In our recent note “What is new for rquery December 2019” we mentioned an ugly processing pipeline that translates into SQL of varying size and quality, depending on the query generator used. In this note we try a close relative of that query in the data_algebra.
