Tag Archives: Medium Scale Data

SQL Screwdriver

We discuss a “medium scale data” technique that we call “SQL Screwdriver.”

Previously we discussed some of the issues of large scale data analytics. A lot of the work done at the MapReduce scale is necessarily limited to mere aggregation and report generation. But what of medium scale? That is data too large to perform all steps in your favorite tool (R, Excel or something else) but small enough that you are expected to produce sophisticated models, decisions and analysis. At this scale, if properly prepared, you don’t need large scale tools and their limitations. With extra preparation you can continue to use your preferred tools. We call this the realm of medium scale data and discuss a preparation tool style we call “screwdriver” (as opposed to larger hammers).

We stand the “no SQL” movement on its head and discuss the beneficial use of SQL without a server (as opposed to their vision of a key-value store without SQL). Database servers can be a nuisance- but that is not enough reason to give up the power of relational query languages.
Continue reading