A quick R mini-tip: don’t use data.matrix when you mean model.matrix. If you do, you may lose (without noticing) a lot of your model’s explanatory power, because data.matrix encodes categorical variables poorly. Continue reading R minitip: don’t use data.matrix when you mean model.matrix
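A minimal sketch of the difference, using a made-up four-row data frame (the column names color and y are illustrative assumptions, not taken from the post). data.matrix replaces a factor with its integer level codes, while model.matrix expands it into indicator columns:

```r
# Toy data (hypothetical): the outcome is high for "red" and "blue", low for "green".
d <- data.frame(
  color = factor(c("red", "green", "blue", "green")),
  y     = c(1, 0, 1, 0)
)

# data.matrix() turns the factor into one column of integer level codes
# (levels sort alphabetically: blue = 1, green = 2, red = 3).
dm <- data.matrix(d["color"])

# model.matrix() builds an intercept plus one indicator column per
# non-reference level (here: colorgreen and colorred).
mm <- model.matrix(~ color, data = d)

# Least-squares fits on each encoding, compared by residual sum of squares.
rss_int <- sum(lm.fit(cbind(Intercept = 1, dm), d$y)$residuals^2)
rss_ind <- sum(lm.fit(mm, d$y)$residuals^2)

print(dm)
print(mm)
print(c(integer_codes = rss_int, indicators = rss_ind))
```

In this toy case the integer codes impose an ordering (blue < green < red) that has nothing to do with the outcome, so the linear fit on them explains none of the variation, while the indicator encoding fits the data exactly.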
While following up on Nina Zumel’s excellent Trimming the Fat from glm() Models in R, I got to thinking about code style in R. And I realized: you can make your code much prettier by designing more of your functions to return data.frames. That may seem needlessly heavyweight, but it has a lot of downstream advantages. Continue reading R style tip: prefer functions that return data frames
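One way to picture the tip, as a sketch under assumed names (fit_summary, dat, and the columns g, x, y are hypothetical and not from the post): a helper that returns a one-row data.frame can be applied per group and the results stacked with rbind, with no custom unpacking code.

```r
# Toy data (hypothetical): two groups with different linear trends.
dat <- data.frame(
  g = rep(c("a", "b"), each = 3),
  x = c(1, 2, 3, 1, 2, 3),
  y = c(2, 4, 6, 1, 1, 1)
)

# Hypothetical helper: fit a line to one group and return the result
# as a one-row data.frame rather than a bare vector or list.
fit_summary <- function(d, group) {
  fit <- lm(y ~ x, data = d)
  data.frame(
    group     = group,
    intercept = coef(fit)[["(Intercept)"]],
    slope     = coef(fit)[["x"]],
    n         = nrow(d),
    stringsAsFactors = FALSE
  )
}

# Because every call returns a data.frame, the pieces stack directly.
parts <- split(dat, dat$g)
res <- do.call(rbind, lapply(names(parts), function(g) fit_summary(parts[[g]], g)))
print(res)
```

The stacked result is itself a data.frame, so it can be passed straight on to subsetting, merging, or plotting code.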
Been reading a lot of Gelman, Carlin, Stern, Dunson, Vehtari, Rubin “Bayesian Data Analysis” 3rd edition lately. Overall, in the Bayesian framework some ideas (such as regularization and imputation) are way easier to justify (though calculating some seemingly basic quantities becomes tedious). A big advantage (and weakness) of this formulation is that statistics has a much less “shrink wrapped” feeling than in the classic frequentist presentations. You feel like the material is being written to peers instead of written to calculators (of the human or mechanical variety). In the Bayesian formulation you don’t feel like you will be yelled at for using 1 tablespoon of sugar when the recipe calls for 3 teaspoons (at least if you live in the United States).
Some other material reads differently after this, though. Continue reading Skimming statistics papers for the ideas (instead of the complete procedures)
There are a lot of good books on statistics, machine learning, analytics, and R. So it is valid to ask: how does Practical Data Science with R stand out? Why should a data scientist or an aspiring data scientist buy it?
We admit, it isn’t the only book we own. Some relevant books from the Win-Vector LLC company library include:
Continue reading How does Practical Data Science with R stand out?