Categories: Opinion, Rants, Statistics

You can’t do that in statistics

There are a number of statistical principles that are perhaps more honored in the breach than in the observance. For fun I am going to name a few, and show why they are not always the “precision surgical knives of thought” one would hope for (working more like large hammers).


Categories: math programming, Mathematics, Statistics

Sequential Analysis

We here at Win-Vector LLC have been working through an ad-hoc series about A/B testing that combines elements of both the operations research and statistical points of view.

Our most recent article was a dynamic programming solution to the A/B test problem. Explicitly solving such dynamic programs gets long and tedious, so you are well served by finding and introducing clever invariants to track (something better than just raw win-rates). That clever idea is called “sequential analysis” and was introduced by Abraham Wald (somebody we have written about before). If you have ever heard of a test plan such as “first process to get more than 30 wins ahead of the other is the one we choose” you have seen methods derived from Wald’s sequential analysis technique.
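As a rough illustration of that kind of stopping rule (a minimal sketch, not Wald's actual procedure and not the plan from our dynamic programming article), the following Python simulates two processes side by side and stops once one of them is more than a fixed number of wins ahead. The function name `first_to_lead`, the win rates, and the `max_steps` cap are all hypothetical choices made for this example.

```python
import random


def first_to_lead(rate_a, rate_b, lead=30, max_steps=1_000_000, seed=0):
    """Simulate paired trials of processes A and B (hypothetical helper).

    Each step runs one trial of A (win probability rate_a) and one trial
    of B (win probability rate_b). We track only the difference in wins
    and stop as soon as one process is more than `lead` wins ahead.

    Returns (winner, steps) where winner is "A", "B", or None if no
    process got far enough ahead within max_steps (an assumed safety cap).
    """
    rng = random.Random(seed)
    diff = 0  # wins of A minus wins of B
    for step in range(1, max_steps + 1):
        win_a = rng.random() < rate_a
        win_b = rng.random() < rate_b
        diff += int(win_a) - int(win_b)
        if diff > lead:
            return "A", step
        if diff < -lead:
            return "B", step
    return None, max_steps
```

Calling something like `first_to_lead(0.11, 0.10, lead=30)` returns the chosen process and how many paired trials it took; the exact result depends on the seed. Note that the only state carried between steps is the running difference in wins, which is the sort of compact invariant the paragraph above is pointing at.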

Wald’s famous airplane armor problem

In this “statistics as it should be” article we will discuss Wald’s sequential analysis.

Categories: Administrativia

Wald’s sequential analysis technique

Microsoft Revolution Analytics has just posted our latest article on A/B testing: Wald’s graphical sequential inspection procedure. It is a fun appreciation of a really cool procedure and I hope you check it out.

Figure 14, Section 6.4.2, page 111, Abraham Wald, Sequential Analysis, Dover 2004 (reprinting a 1947 edition).

Categories: Opinion, Statistics

What was data science before it was called data science?

“Data Science” is obviously a trendy term making its way through the hype cycle. Either nobody is good enough to be a data scientist (unicorns) or everybody is too good to be a data scientist (or the truth is somewhere in the middle).


Gartner hype cycle (Wikipedia).

And there is a quarter that grumbles that we are merely talking about statistics under a new name (see here and here).

It has always been the case that advances in data engineering (such as punch cards, or data centers) make analysis practical at new scales (though I still suspect Map/Reduce was a plot designed to trick engineers into being excited about ETL and report generation).


Data Science 1832: Semen Korsakov card.

However, in the 1940s and 1950s the field was called “operations research” (even when performed by statisticians). When you read John F. Magee (2002), “Operations Research at Arthur D. Little, Inc.: The Early Years”, Operations Research 50(1):149-153, you really come away with the impression you are reading about a study of online advertising performed in the 1940s (okay, mail advertising, but mail was “the email of its time”).

In this spirit, next week we will write about the sequential analysis solution for A/B testing, invented in the 1940s by one of the greats of statistics and operations research: Abraham Wald (whom we have written about before).


Abraham Wald

Categories: History, Statistics

Deming, Wald and Boyd: cutting through the fog of analytics

This article is a quick appreciation of some of the statistical, analytic and philosophical techniques of Deming, Wald and Boyd. Many of these techniques have become pillars of modern industry through the sciences of statistics and operations research.