We have just released a major update of the cdata R package to CRAN. If you work with R and data, now is the time to check out the cdata package.
Continue reading Update on coordinatized or fluid data
I am pleased to announce that vtreat version 0.6.0 is now available to R users on CRAN.
vtreat is an excellent way to prepare data for machine learning, statistical inference, and predictive analytic projects. If you are an R user we strongly suggest you incorporate vtreat into your projects.
Continue reading Upcoming data preparation and modeling article series
Somebody nice reached out and gave us this wonderful feedback on our new Supervised Learning in R: Regression (paid) video course.
Thanks for a wonderful course on DataCamp on Random forest. I was struggling with Xgboost earlier and Vtreat has made my life easy now :).
Continue reading Thank You For The Very Nice Comment
The Win-Vector public R packages now all have new pkgdown documentation sites! (And, a thank-you to Hadley Wickham for developing the pkgdown package.)
Please check them out (hint: vtreat is our favorite).
Continue reading More documentation for Win-Vector R packages
The development version of my new seplyr package is performing in practical applications with dplyr 0.7.* much better than even I (the seplyr package author) expected.
I think I have hit a very good set of trade-offs, and I have now spent significant time creating documentation and examples.
I wish there had been such a package weeks ago, and that I had started using this approach in my own client work at that time. If you are already a dplyr user I strongly suggest trying seplyr in your own analysis projects.
Please see here for details.
Win-Vector LLC has recently been teaching how to use R with big data through sparklyr. We have also been helping clients become productive on R/Spark infrastructure through direct consulting and bespoke training. I thought this would be a good time to talk about the power of working with big data using R, share some hints, and even admit to some of the warts found in this combination of systems.
The ability to perform sophisticated analyses and modeling on “big data” with R is rapidly improving, and this is the time for businesses to invest in the technology. Win-Vector can be your key partner in methodology development and training (through our consulting and training practices).
The field is exciting, rapidly evolving, and even a touch dangerous. We invite you to start using R and are starting a new series of articles tagged “R and big data” to help you produce production-quality solutions quickly.
Please read on for a brief description of our new article series: “R and big data.”
Continue reading New series: R and big data (concentrating on Spark and sparklyr)
Our book Practical Data Science with R has just been reviewed in Association for Computing Machinery Special Interest Group on Algorithms and Computation Theory (ACM SIGACT) News by Dr. Allan M. Miller (U.C. Berkeley)!
The book is half off at Manning March 21st 2017 using the following code (please share/Tweet):
Deal of the Day March 21: Half off my book Practical Data Science with R. Use code dotd032117au at https://www.manning.com/dotd
Please read on for links and excerpts from the review.
Continue reading Practical Data Science with R: ACM SIGACT News Book Review and Discount!