Let’s take a look at this powerful notation.
There is interest in converting relational query languages (that work both over SQL databases and on local data) into data.table commands, to take advantage of data.table’s superior performance. Obviously, if one wants to use data.table it is best to learn data.table. But if we want code that can run in multiple places, a translation layer may be in order.
In this note we look at how this translation is commonly done.
In this note, we discuss the use of Cohen’s D for planning difference-of-mean experiments.
Estimating sample size
Let’s imagine you are testing a new weight loss program and comparing it to some existing weight loss regimen. You want to run an experiment to determine if the new program is more effective than the old one. You’ll put a control group on the old plan, and a treatment group on the new plan, and after three months, you’ll measure how much weight the subjects lost, and see which plan does better on average.
The question is: how many subjects do you need to run a good experiment?
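As a rough illustration of the kind of calculation involved, here is a minimal sketch in Python. It assumes a two-sided, two-sample comparison with equal group sizes and uses the standard normal approximation; the function name `subjects_per_group` and the default `alpha` and `power` values are illustrative assumptions, not taken from the full article.

```python
from math import ceil
from statistics import NormalDist


def subjects_per_group(effect_size, alpha=0.05, power=0.8):
    """Approximate subjects needed in EACH group to detect a given Cohen's d.

    Assumes a two-sided test, equal group sizes, and the normal
    approximation n = 2 * ((z_{1 - alpha/2} + z_{power}) / d)^2.
    """
    if effect_size <= 0:
        raise ValueError("effect_size (Cohen's d) must be positive")
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # significance threshold
    z_power = standard_normal.inv_cdf(power)          # desired power
    return ceil(2 * ((z_alpha + z_power) / effect_size) ** 2)


# Smaller effects need many more subjects than larger ones.
for d in (0.2, 0.5, 0.8):
    print(d, subjects_per_group(d))
```

Under these assumptions a "medium" effect of d = 0.5 calls for 63 subjects per group. An exact t-test based calculation gives a slightly larger number, so treat this as a planning estimate only.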
It usually gives us a chuckle when we find that some natural and seemingly easy data science question is NP-hard. For instance, we have written that variable pruning is NP-hard when one insists on finding a minimal-sized set of variables (and also why there are no obvious methods for exact large permutation tests).
In this note we show that finding a minimal set of columns that form a primary key in a database is also NP-hard.
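To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the construction used in the note. The function name `minimal_key` and the representation of a table as a list of dictionaries are illustrative assumptions. The search tries every column subset in order of increasing size, so its running time grows exponentially with the number of columns, which is consistent with the hardness claim.

```python
from itertools import combinations


def minimal_key(rows, columns):
    """Return a smallest tuple of columns whose values identify every row.

    rows: list of dicts mapping column name -> hashable value.
    Returns None when the table contains fully duplicated rows,
    since then no set of columns can be a key.
    """
    for size in range(1, len(columns) + 1):
        for candidate in combinations(columns, size):
            seen = set()
            for row in rows:
                projected = tuple(row[c] for c in candidate)
                if projected in seen:
                    break  # two rows collide, so candidate is not a key
                seen.add(projected)
            else:
                return candidate  # no collisions: candidate is a key
    return None


table = [
    {"a": 1, "b": 1},
    {"a": 1, "b": 2},
    {"a": 2, "b": 1},
]
print(minimal_key(table, ["a", "b"]))
```

In this small table neither column is unique on its own, so the search has to move on to the pair of columns.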
We are sharing a chalk talk rehearsal on applied probability. We use basic notions of probability theory to work through the estimation of sample size needed to reliably estimate event rates. This expands basic calculations, and then moves to the ideas of: Sample size and power for rare events.
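For a feel of why rare events demand large samples, here is a minimal Python sketch. It is not the calculation from the talk. It assumes independent trials and the normal approximation to the binomial, and the function name `samples_for_rate` and its parameters are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def samples_for_rate(rate, relative_error=0.1, confidence=0.95):
    """Approximate sample size to estimate an event rate.

    Targets an estimate within +/- relative_error * rate of the true
    rate at the given confidence, using the normal approximation
    n = z^2 * p * (1 - p) / (relative_error * p)^2.
    """
    if not 0 < rate < 1:
        raise ValueError("rate must be strictly between 0 and 1")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = relative_error * rate  # absolute error we are willing to accept
    return ceil(z ** 2 * rate * (1 - rate) / margin ** 2)


# The rarer the event, the more observations are needed
# for the same relative precision.
for p in (0.1, 0.01, 0.001):
    print(p, samples_for_rate(p))
```

Under these assumptions, a tenfold drop in the event rate raises the required sample size by roughly a factor of ten.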
Please check it out here.
Nina and I have been sending out drafts of our book Practical Data Science with R 2nd Edition for technical review. A few of the reviews came back from reviewers that described themselves with variations of:
Senior Business Analyst for COMPANYNAME. I have been involved in presenting graphs of data for many years.
To us this reads as somebody with deep experience, confidence, and a bit of humility. They do something technical and valuable, but because they understand it they do not consider it to be arcane magic.
In this note we describe what can happen if such a person (or a junior version of such a person) acquires 1 or 2 technical books.