Categories: Rants, Statistics. 3 comments.

CRU graph yet again (with R)

IowaHawk has an excellent article attempting to reproduce the infamous CRU climate graph using OpenOffice: Fables of the Reconstruction. We thought we would show how to produce similarly bad results using R.

Categories: Coding, Statistics, Tutorials. 4 comments.

R examine objects tutorial

This article is a quick, concrete example of how to use the techniques from Survive R to lower the steepness of The R Project for Statistical Computing’s learning curve (so, an apology to all readers who are not interested in R). What follows is for people who already use R and want to achieve more control of the software.

Categories: Pragmatic Machine Learning, Statistics. 22 comments.

Survive R

New PDF slides version (presented at the Bay Area R Users Meetup October 13, 2009).

We at Win-Vector LLC appear to like R a bit more than some of our, perhaps wiser, colleagues (see: Choose your weapon: Matlab, R or something else? and R and data). While we do like R (see: Exciting Technique #1: The “R” language), we also understand the need to defend oneself against the abuse regularly dished out by R. Here we will quickly share a few fighting techniques.

Categories: Exciting Techniques, Expository Writing, Mathematics, Pragmatic Data Science, Pragmatic Machine Learning, Statistics. 7 comments.

Good Graphs: Graphical Perception and Data Visualization

What makes a good graph? When faced with a slew of numeric data, graphical visualization can be a more efficient way of getting a feel for the data than going through the rows of a spreadsheet. But do we know if we are getting an accurate or useful picture? How do we pick an effective visualization that neither obscures important details nor drowns us in confusing clutter? In 1985, William Cleveland published a text called The Elements of Graphing Data, inspired by Strunk and White’s classic writing handbook The Elements of Style. The Elements of Graphing Data puts forward Cleveland’s philosophy about how to produce good, clear graphs, not only for presenting one’s experimental results to peers, but also for the purposes of data analysis and exploration. Cleveland’s approach is based on a theory of graphical perception: how well the human perceptual system accomplishes certain tasks involved in reading a graph. For a given data analysis task, the goal is to align the information being presented with the perceptual tasks the viewer accomplishes best.

Categories: Exciting Techniques, Pragmatic Machine Learning, Statistics. 2 comments.

Exciting Technique #1: The “R” language.

Our first “exciting technique” article is about a statistical language called “R.”

R is a language for statistical analysis available from http://cran.r-project.org/. The things you can immediately do with it are incredible. You can import a spreadsheet and immediately spot relationships, trends and anomalies. R gives you instant access to top-notch visualization methods and sophisticated statistical methods.
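
As a rough illustration of that workflow, here is a minimal sketch. It is not taken from the article: R's built-in `cars` data set is written to a temporary CSV file to stand in for a spreadsheet, and the names `csv_path`, `d` and `model` are made up for the example.

```r
# Assumption: the built-in 'cars' data set stands in for a real spreadsheet.
csv_path <- tempfile(fileext = ".csv")
write.csv(cars, csv_path, row.names = FALSE)

# Import the spreadsheet (exported as CSV) into a data frame.
d <- read.csv(csv_path)

# Per-column numeric summary: a quick way to notice odd ranges.
print(summary(d))

# Relationship and trend: scatter plot with a least-squares line.
model <- lm(dist ~ speed, data = d)
plot(d$speed, d$dist, xlab = "speed", ylab = "dist")
abline(model)

# Anomalies: the three rows with the largest absolute residuals.
print(d[order(-abs(residuals(model)))[1:3], ])
```

With a real spreadsheet, the first two statements would be replaced by a single `read.csv()` call on the exported file.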
