Categories: data science, Pragmatic Machine Learning, Statistics, Tutorials

Modeling Trick: Masked Variables

A primary problem data scientists face again and again is how to properly adapt or treat variables so they are the best possible components of a regression. Some analysts at this point delegate control to a shape-choosing system like neural nets. I feel such a choice gives up far too much statistical rigor, transparency and control without real benefit in exchange. There are other, better ways to solve the reshaping problem. A good, rigorous way to treat variables is to try to find stabilizing transforms, introduce splines (parametric or non-parametric) or use generalized additive models. A practical or pragmatic approach we advise, to get some of the piecewise reshaping power of splines or generalized additive models, is a modeling trick we call “masked variables.” This article works a quick example using masked variables.
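The excerpt above does not spell out how a masked variable is built, so the following is only a minimal sketch under an assumed reading: a masked variable is taken to be the original variable multiplied by a 0/1 indicator for a region of its range, added alongside the indicator itself so that an ordinary linear regression can fit a different slope and offset in that region. The cut point `5.0`, the synthetic data, and the helper names `add_masked_columns` and `fit_rmse` are all hypothetical and chosen for illustration.

```python
import numpy as np


def add_masked_columns(x, cut):
    """Design matrix: intercept, x, indicator(x > cut), and x masked by that indicator.

    Assumption: this is one plausible construction of a "masked variable",
    not necessarily the one used in the full article.
    """
    mask = (x > cut).astype(float)
    return np.column_stack([np.ones_like(x), x, mask, mask * x])


def fit_rmse(design, y):
    """Fit ordinary least squares and return the in-sample root mean squared error."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(np.sqrt(np.mean(resid ** 2)))


# Synthetic data (hypothetical): flat below x = 5, linear above it, plus noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 10.0, size=500)
y = np.where(x > 5.0, 3.0 * (x - 5.0), 0.0) + rng.normal(0.0, 0.5, size=500)

# Plain linear model versus the same model with masked columns added.
plain_design = np.column_stack([np.ones_like(x), x])
rmse_plain = fit_rmse(plain_design, y)
rmse_masked = fit_rmse(add_masked_columns(x, cut=5.0), y)

print("plain RMSE:", rmse_plain)
print("masked RMSE:", rmse_masked)
```

In this sketch the model stays an ordinary regression, so coefficients and diagnostics remain inspectable; the price is that the analyst has to pick the cut point, which here is simply assumed to be known.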

Categories: data science, Opinion, Pragmatic Data Science, Pragmatic Machine Learning

Congratulations to both Dr. Nina Zumel and EMC- great job

A big congratulations to Win-Vector LLC’s Dr. Nina Zumel for authoring and teaching portions of EMC’s new Data Science and Big Data Analytics training and certification program. A big congratulations to EMC, EMC Education Services and Greenplum for creating a great training course. Finally, a huge thank you to EMC, EMC Education Services and Greenplum for inviting Win-Vector LLC to contribute to this great project.
