Posted on 2 Comments on Bandit Formulations for A/B Tests: Some Intuition

## Bandit Formulations for A/B Tests: Some Intuition

Controlled experiments embody the best scientific design for establishing a causal relationship between changes and their influence on user-observable behavior.

— Kohavi, Henne, Sommerfeld, “Practical Guide to Controlled Experiments on the Web” (2007)

A/B tests are one of the simplest ways of running controlled experiments to evaluate the efficacy of a proposed improvement (a new medicine, compared to an old one; a promotional campaign; a change to a website). To run an A/B test, you split your population into a control group (let’s call them “A”) and a treatment group (“B”). The A group gets the “old” protocol, the B group gets the proposed improvement, and you collect data on the outcome that you are trying to achieve: the rate that patients are cured; the amount of money customers spend; the rate at which people who come to your website actually complete a transaction. In the traditional formulation of A/B tests, you measure the outcomes for the A and B groups, determine which is better (if either), and whether or not the difference observed is statistically significant. This leads to questions of test size: how big a population do you need to get reliably detect a difference to the desired statistical significance? And to answer that question, you need to know how big a difference (effect size) matters to you.

The irony is that to detect small differences accurately you need a larger population size, even though in many cases, if the difference is small, picking the wrong answer matters less. It can be easy to lose sight of that observation in the struggle to determine correct experiment sizes.

There is an alternative formulation for A/B tests that is especially suitable for online situations, and that explicitly takes the above observation into account: the so-called multi-armed bandit problem. Imagine that you are in a casino, faced with K slot machines (which used to be called “one-armed bandits” because they had a lever that you pulled to play (the “arm”) — and they pretty much rob you of all your money). Each of the slot machines pays off at a different (unknown) rate. You want to figure out which of the machines pays off at the highest rate, then switch to that one — but you don’t want to lose too much money to the suboptimal slot machines while doing so. What’s the best strategy?

The “pulling one lever at a time” formulation isn’t a bad way of thinking about online transactions (as opposed to drug trials); you can imagine all your customers arriving at your site sequentially, and being sent to bandit A or bandit B according to some strategy. Note also, that if the best bandit and the second-best bandit have very similar payoff rates, then settling on the second best bandit, while not optimal, isn’t necessarily that bad a strategy. You lose winnings — but not much.

Traditionally, bandit games are infinitely long, so analysis of bandit strategies is asymptotic. The idea is that you test less as the game continues — but the testing stage can go on for a very long time (often interleaved with periods of pure exploitation, or playing the best bandit). This infinite-game assumption isn’t always tenable for A/B tests — for one thing, the world changes; for another, testing is not necessarily without cost. We’ll look at finite games below.

Posted on Categories Mathematics, Opinion, Pragmatic Data Science, Rants, Statistics

## What is meant by regression modeling?

What is meant by regression modeling?

Linear Regression is one of the most common statistical modeling techniques. It is very powerful, important, and (at first glance) easy to teach. However, because it is such a broad topic it can be a minefield for teaching and discussion. It is common for angry experts to accuse writers of carelessness, ignorance, malice and stupidity. If the type of regression the expert reader is expecting doesn’t match the one the writer is discussing then the writer is assumed to be ill-informed. The writer is especially vulnerable to experts when writing for non-experts. In such writing the expert finds nothing new (as they already know the topic) and is free to criticize any accommodation or adaption made for the intended non-expert audience. We argue that many of the corrections are not so much evidence of wrong ideas but more due a lack of empathy for the necessary informality necessary in concise writing. You can only define so much in a given space, and once you write too much you confuse and intimidate a beginning audience. Continue reading What is meant by regression modeling?

Posted on Categories Coding, Expository Writing, Programming, Tutorials9 Comments on You don’t need to understand pointers to program using R

## You don’t need to understand pointers to program using R

R is a statistical analysis package based on writing short scripts or programs (versus being based on GUIs like spreadsheets or directed workflow editors). I say “writing short scripts” because R’s programming language (itself called S) is a bit of an oddity that you really wouldn’t be using except it gives you access to superior analytics data structures (R’s data.frame and treatment of missing values) and deep ready to go statistical libraries. For longer pure programming tasks you are better off using something else (be it Python, Ruby, Java, C++, Javascript, Go, ML, Julia, or something else). However, the S language has one feature that makes it pleasant to learn (despite any warts): it can be initially used and taught without having the worry about the semantics of references or pointers. Continue reading You don’t need to understand pointers to program using R