## Correlation and R-Squared

What is R^{2}? In the context of predictive models (usually linear regression), where *y* is the true outcome and *f* is the model’s prediction, the definition that I see most often is:

$$ R^2 = 1 - \frac{\sum_i (y_i - f_i)^2}{\sum_i (y_i - \bar{y})^2} $$

In words, R^{2} is a measure of how much of the variance in *y* is explained by the model, *f*.
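As a minimal numpy sketch of the "variance explained" definition, on made-up toy values for *y* and *f* (the data here is purely illustrative):

```python
import numpy as np

# Toy data: true outcomes y and model predictions f (hypothetical values).
y = np.array([3.0, 1.0, 4.0, 1.5, 5.0, 9.0])
f = np.array([2.5, 1.2, 3.8, 2.0, 5.5, 8.5])

# R^2 as the fraction of variance explained: 1 - RSS/TSS.
rss = np.sum((y - f) ** 2)           # residual sum of squares
tss = np.sum((y - np.mean(y)) ** 2)  # total sum of squares
r_squared = 1 - rss / tss
print(r_squared)
```

An R^{2} near 1 means the residual sum of squares is small relative to the total variance of *y*.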

Under “general conditions”, as Wikipedia says, R^{2} is also the square of the correlation (written as ρ, the Greek letter “rho”) between the actual and predicted outcomes:

$$ R^2 = \rho^2(y, f) $$

I prefer the “squared correlation” definition, as it gets more directly at what is usually my primary concern: prediction. If R^{2} is close to one, then the model’s predictions mirror the true outcome tightly. If R^{2} is low, then either the model does not mirror the true outcome at all, or it only mirrors it loosely: a “cloud” that — hopefully — is oriented in the right direction. Of course, plotting the predictions against the true outcomes always helps.

The question we will address here is: how do you get from R^{2} to correlation?
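Before deriving the connection, we can check it numerically. The sketch below (synthetic data, assumed setup: ordinary least squares with an intercept, the "general conditions" under which the two definitions coincide) computes R^{2} both ways and shows they agree:

```python
import numpy as np

# Synthetic data: a noisy linear relationship.
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 2.0 * x + 1.0 + rng.normal(size=200)

# Fit OLS with an intercept and form predictions f.
slope, intercept = np.polyfit(x, y, 1)
f = slope * x + intercept

# Definition 1: fraction of variance explained, 1 - RSS/TSS.
r2_variance = 1 - np.sum((y - f) ** 2) / np.sum((y - np.mean(y)) ** 2)

# Definition 2: squared correlation between actual and predicted outcomes.
r2_correlation = np.corrcoef(y, f)[0, 1] ** 2

print(r2_variance, r2_correlation)  # agree up to floating-point error
```

For models fit without an intercept, or for nonlinear models, the two quantities can differ, which is exactly why the "general conditions" caveat matters.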