Please give it a try!
vtreat is a system for preparing messy real world data for predictive modeling tasks (classification, regression, and so on). In particular it is very good at re-coding high-cardinality string-valued (or categorical) variables for later use.
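To give a feel for what re-coding a high-cardinality categorical variable can mean, here is a minimal sketch of a smoothed "impact code": each level is replaced by how far its average outcome sits from the overall average. This is an illustration only, not vtreat's implementation. The function names `fit_impact_code` and `apply_impact_code`, the `smoothing` parameter, and the toy data are all hypothetical.

```python
from collections import defaultdict


def fit_impact_code(levels, outcomes, smoothing=1.0):
    """Learn a smoothed per-level offset from the grand mean of the outcome.

    Hypothetical helper for illustration; not part of the vtreat API.
    """
    if len(levels) != len(outcomes):
        raise ValueError("levels and outcomes must have the same length")
    if not outcomes:
        raise ValueError("need at least one training row")
    grand_mean = sum(outcomes) / len(outcomes)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for level, y in zip(levels, outcomes):
        sums[level] += y
        counts[level] += 1
    # Shrink each level's mean toward the grand mean; rare levels shrink more.
    return {
        level: (sums[level] + smoothing * grand_mean) / (counts[level] + smoothing)
        - grand_mean
        for level in counts
    }


def apply_impact_code(levels, code):
    """Replace each level with its learned offset; unseen levels map to 0.0."""
    return [code.get(level, 0.0) for level in levels]


# Toy data (made up): a string-valued variable and a numeric outcome.
zip_codes = ["94110", "94110", "10001", "10001", "10001", "60601"]
prices = [10.0, 14.0, 4.0, 6.0, 5.0, 9.0]

code = fit_impact_code(zip_codes, prices)
encoded = apply_impact_code(["94110", "10001", "99999"], code)
```

Note that fitting the code and applying it to the same rows lets the outcome leak into the new variable, so a sketch like this should be learned on data held out from the rows it is applied to.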
For all our remote learners, we are sharing a free coupon code for our R video course Introduction to Data Science. The code is ITDS2020, and it can be used at this URL: https://www.udemy.com/course/introduction-to-data-science/?couponCode=ITDS2020 . Please check it out and share it!
A big thank you to Dmytro Perepolkin for sharing a “Keep Calm and Use vtreat” poster!
Also, we have translated the Python vtreat steps from our recent “Cross-Methods are a Leak/Variance Trade-Off” article into R vtreat steps here.
This R port demonstrates the fit/prepare notation, which is new to R vtreat!
We want vtreat to be a platform-agnostic (works in R, works in Python, works elsewhere), well-documented standard methodology.
To this end, Nina and I have reorganized the basic vtreat usage documentation as follows:
- Regression:
R regression example, fit/prepare
R regression example, design/prepare/experiment
- Classification:
R classification example, fit/prepare
R classification example, design/prepare/experiment
- Unsupervised tasks:
R unsupervised example, fit/prepare
R unsupervised example, design/prepare/experiment
- Multinomial classification:
R multinomial classification example, design/prepare/experiment
We have a new data scientist sticker!
If you see Nina or John at a conference/MeetUp, please ask us for a sticker!
For the next version of the R package wrapr we are going to be removing a number of under-used functions/methods and classes. This update will likely happen in March 2020, and is the start of the wrapr 2.* series.
Most of the items being removed are different abstractions for helping with function composition. We ended up moving most of our work to category-theory based composition, so we don’t think these various frameworks are needed any longer. If you have been using these items in your own projects, please reach out and we will try to find a way to help you out.
In an off-topic post, we would like to share a series of horror narrations based on translations of Uruguayan author Horacio Quiroga by Win Vector LLC’s very own Nina Zumel. This is a free series produced by Rue Morgue.
The first is: “The Feather Pillow.” DO NOT LISTEN TO THIS IN BED!