vtreat‘s purpose is to produce pure numeric
data.frames that are ready for supervised predictive modeling (predicting a value from other values). By ready we mean: a purely numeric data frame with no missing values and a reasonable number of columns (missing-values re-encoded with indicators, and high-degree categorical re-encode by effects codes or impact codes).
In this note we will discuss a small aspect of the
vtreat package: variable screening.
Continue reading vtreat Variable Importance
In our last note we used
wrapr::qe() to help quote expressions. In this note we will discuss quoting and code-capturing interfaces (interfaces that capture user source code) a bit more.
Continue reading Quoting Concatenate
R are popular, the most popular one being
magrittr as used by
This note will discuss the advanced re-usable piping systems:
rqdatatable operator trees and
wrapr function object pipelines. In each case we have a set of objects designed to extract extra power from the
wrapr dot-arrow pipe
Continue reading Reusable Pipelines in R
Reusable modeling pipelines are a practical idea that gets re-developed many times in many contexts.
wrapr supplies a particularly powerful pipeline notation, and a pipe-stage re-use system (notes here). We will demonstrate this with the
vtreat data preparation system.
Continue reading Sharing Modeling Pipelines in R
This note is a comment on some of the timings shared in the dplyr-0.8.0 pre-release announcement.
The original published timings were as follows:
With performance metrics: measurements are marketing. So let’s dig in the above a bit.
Continue reading Timing Grouped Mean Calculation in R
Our group has done a lot of work with non-standard calling conventions in
Our tools work hard to eliminate non-standard calling (as is the purpose of
wrapr::let()), or at least make it cleaner and more controllable (as is done in the wrapr dot pipe). And even so, we still get surprised by some of the side-effects and mal-consequences of the over-use of non-standard calling conventions in
Please read on for a recent example.
Continue reading Very Non-Standard Calling in R