Please help share our news and this discount.
The second edition of our best-selling book Practical Data Science with R (Zumel, Mount) is featured as deal of the day at Manning.
The second edition isn’t finished yet, but chapters 1 through 4 are available in the Manning Early Access Program (MEAP), and we have finished chapters 5 and 6 which are now in production at Manning (so they should be available soon). The authors are hard at work on chapters 7 and 8 right now.
The discount gets you half off. Also, the second edition comes with a free e-copy of the first edition (so you can jump ahead).
Here are the details in Tweetable form:
Deal of the Day January 13: Half off Practical Data Science with R, Second Edition. Use code dotd011319au at http://bit.ly/2SKAxe9.
We are thrilled to announce that our paper (mine and Nina Zumel’s) on the dot-pipe has been accepted by the R Journal!
Continue reading Dot-Pipe Paper Accepted by the R Journal!!!
Some more Practical Data Science with R news.
Practical Data Science with R is the book we wish we had when we started in data science. Practical Data Science with R, Second Edition is the revision of that book with the packages we wish had been available at that time (in particular wrapr). A second edition also lets us correct some omissions.
For your part: please help us get the word out about this book. Practical Data Science with R, Second Edition; R in Action, Second Edition; and Think Like a Data Scientist are Manning’s August 20th 2018 “Deal of the Day” (use code dotd082018au at https://www.manning.com/dotd).
For our part, we are busy revising chapters and setting up a new GitHub repository for examples, code, and other reader resources.
We are pleased and excited to announce that we are working on a second edition of Practical Data Science with R!
Continue reading Announcing Practical Data Science with R, 2nd Edition
rquery and rqdatatable are new R packages for data wrangling, either at scale (in databases, or big data systems such as Apache Spark) or in-memory. The packages speed up both execution (through optimizations) and development (through a good mental model and up-front error checking) for data wrangling tasks.
Win-Vector LLC’s John Mount will be speaking on the rquery and rqdatatable packages at the East Bay R Language Beginners Group on Tuesday, August 7, 2018 (Oakland, CA).
Continue reading John Mount speaking on rquery and rqdatatable
We here at Win-Vector LLC have some really big news that we would like the R community’s help sharing.
vtreat version 1.2.0 is now available on CRAN, and this version of vtreat can now implement its data cleaning and preparation steps on databases and big data systems.
vtreat is a very complete and rigorous tool for preparing messy real world data for supervised machine-learning tasks. It implements a technique we call “safe y-aware processing” using cross-validation or stacking techniques. It is very easy to use: you show it some data and it designs a data transform for you.
Thanks to the rquery package, this data preparation transform can now be directly applied to databases, or big data systems such as Apache Spark or Google BigQuery. Or, thanks to the rqdatatable packages, even fast large in-memory transforms are possible.
We have some basic examples of the new vtreat capabilities here and here.
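As a rough illustration of the "show it some data and it designs a data transform" workflow described above, here is a minimal in-memory sketch. The toy data, column names, and train/new split are all made up for the example; only `mkCrossFrameNExperiment()` and `prepare()` come from vtreat, and this sketch does not cover the database path.

```r
library(vtreat)

# Hypothetical toy data: one numeric and one categorical input, numeric outcome y
set.seed(2018)
n <- 200
d <- data.frame(
  x_num = rnorm(n),
  x_cat = sample(c("a", "b", "c"), n, replace = TRUE),
  stringsAsFactors = FALSE
)
d$y <- 2 * d$x_num + ifelse(d$x_cat == "a", 1, 0) + rnorm(n)

# Make the inputs messy: knock out some values after y has been computed
d$x_num[sample.int(n, 20)] <- NA
d$x_cat[sample.int(n, 20)] <- NA

train <- d[1:150, ]
new <- d[151:200, ]

# Design the treatment plan and, in the same step, get a cross-validated
# ("cross-frame") treated copy of the training data
cfe <- mkCrossFrameNExperiment(
  train,
  varlist = c("x_num", "x_cat"),
  outcomename = "y",
  verbose = FALSE
)
train_treated <- cfe$crossFrame  # use this to fit a model
plan <- cfe$treatments           # the designed data transform

# Apply the designed transform to data that was not used to design it
new_treated <- prepare(plan, new)
```

The cross-frame is used for training rather than calling `prepare()` on the same rows that designed the plan; that is the cross-validation step the text refers to as "safe y-aware processing".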
R package wrapr 1.5.0 is now available on CRAN.
wrapr includes a lot of tools for writing better R code.
I’ll be writing articles on a number of the new capabilities. For now I will just leave you with the nifty coalesce operator notation.
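For readers who have not seen it, the coalesce notation in wrapr is the `%?%` operator, which fills missing entries on the left from the value on the right. A small sketch with made-up values:

```r
library(wrapr)

# Hypothetical input with a missing value
x <- c(1, NA, 3)

# Replace NA entries on the left with the value on the right
x_filled <- x %?% 0

# NULL on the left is also replaced by the right-hand value
setting <- NULL %?% "default"
```

This reads left to right as "use `x`, or else `0`", which keeps default-filling code short.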
Continue reading wrapr 1.5.0 available on CRAN
The rquery talk went very well; thank you very much to the attendees for being an attentive and generous audience.
(rquery at BARUG, photo credit: Timothy Liu)
I am now looking for invitations to give a streamlined version of this talk privately to groups using R who want to work with SQL (with databases such as PostgreSQL or big data systems such as Apache Spark).
rquery has a number of features that greatly improve team productivity in this environment (strong separation of concerns, strong error checking, high usability, specific debugging features, and high performance queries).
If your group is in the San Francisco Bay Area and using R to work with a SQL-accessible data source, please reach out to me at firstname.lastname@example.org; I would be honored to show your team how to speed up their project and lower development costs with rquery. If you are a big data vendor and some of your clients use R, I am especially interested in getting in touch: our system can help R users start working with your installation.
Four years ago today, authors Nina Zumel and John Mount received our authors’ copies of Practical Data Science with R!
Continue reading Four Years of Practical Data Science with R