
A deeper theory of testing

In some of my recent public talks (for example: here and here) I have mentioned a desire for “a deeper theory of fitting and testing.” I thought I would expand on what I meant by this.

In this note I am going to cover a lot of different topics to try to suggest some perspective. I won’t have my usual luxury of fully defining my terms or working concrete examples. Hopefully a number of these ideas (which are related, but don’t seem to synthesize easily) will be subjects of their own later articles.

Introduction

The focus of this article is that the true goal of predictive analytics is always to build a model that works well in production. Training and testing procedures are designed to estimate this unknown future model performance, but they can be expensive and can also fail.

What we want is a good measure of future model performance, and to apply that measure in picking a model without running deep into Goodhart’s law (“When a measure becomes a target, it ceases to be a good measure.”).

Most common training and testing procedures are destructive in the sense that they use up data: data used for one step may not be safely reused for another step in an unbiased fashion (one example is excess generalization error). In this note I expand on ideas for improving statistical efficiency, that is, getting more out of your training data while avoiding overfitting.
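To make the "using up data" point concrete, here is a minimal simulation sketch (my own illustration, not a procedure from the talks). It assumes every candidate "model" is a pure coin flip, so each one has a true accuracy of exactly 0.5. Picking the winner by test-set score and then reporting that same score is optimistic; scoring the winner on fresh data that played no part in the selection is not. The function name `selection_bias_demo` and all parameter values are hypothetical choices made for the example.

```python
import random


def accuracy(preds, labels):
    """Fraction of predictions that match the labels."""
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def coin_flips(rng, n):
    """n independent fair 0/1 draws."""
    return [1 if rng.random() < 0.5 else 0 for _ in range(n)]


def selection_bias_demo(n_models=50, n_test=100, n_fresh=100,
                        n_repeats=100, seed=0):
    """Average score of the test-set winner on the test set vs. on fresh data.

    Every candidate guesses at random, so any apparent skill of the
    winner on the test set is an artifact of selecting on that set.
    """
    rng = random.Random(seed)
    total_test, total_fresh = 0.0, 0.0
    for _ in range(n_repeats):
        test_labels = coin_flips(rng, n_test)
        fresh_labels = coin_flips(rng, n_fresh)
        best_test, best_fresh = -1.0, 0.0
        for _ in range(n_models):
            test_acc = accuracy(coin_flips(rng, n_test), test_labels)
            fresh_acc = accuracy(coin_flips(rng, n_fresh), fresh_labels)
            if test_acc > best_test:
                # Selection looks only at the test score; the fresh score
                # is recorded but never used to choose the winner.
                best_test, best_fresh = test_acc, fresh_acc
        total_test += best_test
        total_fresh += best_fresh
    return total_test / n_repeats, total_fresh / n_repeats


if __name__ == "__main__":
    mean_test, mean_fresh = selection_bias_demo()
    print(f"winner's mean accuracy on the selection set: {mean_test:.3f}")
    print(f"winner's mean accuracy on fresh data:        {mean_fresh:.3f}")
```

The first number should come out noticeably above 0.5 and the second close to 0.5, even though no candidate has any real skill. The test set was "spent" on choosing the model, so it can no longer serve as an unbiased estimate of production performance.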


[Figure: Destructive testing.]

I will outline a few variations of model construction and testing techniques that one should keep in mind.



A bit more on testing

If you liked Nina Zumel’s article on the limitations of Random Test/Train splits, you might want to check out her recent article on predictive analytics product evaluation hosted by our friends at Fliptop.