Categories: data science, Opinion, Pragmatic Data Science, Tutorials

New Timings for a Grouped In-Place Aggregation Task

I’d like to share some new timings on a grouped in-place aggregation task. A client of mine was seeing slow performance, so I decided to time a very simple abstraction of one step of their workflow.
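
For readers who want a concrete picture of the task, here is a minimal sketch in plain Python. It assumes that "grouped in-place aggregation" means computing a per-group summary and writing it back onto every row of that group, without collapsing the rows. The function name `grouped_in_place_max`, the column names `g`, `v`, and `max_v`, and the sample data are all hypothetical; this is not the client's workflow or the code that was timed.

```python
from time import perf_counter


def grouped_in_place_max(rows, group_key, value_key, out_key):
    """Return a copy of rows where each row also carries its group's maximum."""
    # First pass: find the maximum value seen in each group.
    group_max = {}
    for row in rows:
        g = row[group_key]
        v = row[value_key]
        if g not in group_max or v > group_max[g]:
            group_max[g] = v
    # Second pass: attach the group summary to every row,
    # preserving the original row count and order.
    return [{**row, out_key: group_max[row[group_key]]} for row in rows]


# Hypothetical example data: two groups, three rows.
rows = [
    {"g": "a", "v": 1.0},
    {"g": "a", "v": 3.0},
    {"g": "b", "v": 2.0},
]

# Time the step with a monotonic high-resolution clock.
start = perf_counter()
result = grouped_in_place_max(rows, "g", "v", "max_v")
elapsed = perf_counter() - start

for row in result:
    print(row)
print(f"elapsed seconds: {elapsed:.6f}")
```

A single run on three rows says nothing about real performance; a meaningful timing would repeat the call many times on data of realistic size and compare the distribution of run times across implementations.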


Categories: Administrativia, data science, Opinion, Pragmatic Data Science, Pragmatic Machine Learning, Tutorials

Introducing data_algebra

This article introduces the data_algebra project: a family of data processing tools available in R and Python. These tools are designed to transform data either in memory or on remote databases.

In particular, we will discuss the Python implementation (also called data_algebra) and its relation to the mature R implementations (rquery and rqdatatable).
