Off topic: Horror Translations by Nina Zumel

In this off-topic post, we would like to share a series of horror narrations based on translations of Uruguayan author Horacio Quiroga by Win Vector LLC’s very own Nina Zumel. This is a free series produced by Rue Morgue.

The first is: “The Feather Pillow.” DO NOT LISTEN TO THIS IN BED!

(YouTube link, Rue Morgue link, Ephemera link)

More of Nina’s literary work can be found at: Ephemera Experiments in Writing, and Multo (Ghost).

Why we wrote wrapr to/unpack

One reason we are developing the wrapr to/unpack methods is the following: we wanted to spruce up the R vtreat interface a bit.

Using unpack to Manage Your R Environment

In our last note we stated that unpack is a good tool for loading R RDS files into your working environment. Here is the idea expanded into a worked example.

unpack Your Values in R

I would like to introduce an exciting feature in the upcoming 1.9.6 version of the wrapr R package: value unpacking.

sklearn Pipe Step Interface for vtreat

We’ve been experimenting with this for a while, and the next release of the R vtreat package will include a back-port of the Python vtreat package’s sklearn pipe step interface (in addition to the standard R interface).

New vtreat Feature: Nested Model Bias Warning

For quite a while we have been teaching that estimating variable re-encodings on the exact same data that is later naively used to train a model leads to an undesirable nested model bias. The vtreat package (in both its R and Python versions) incorporates a cross-frame method that allows one to use all the training data both to learn variable re-encodings and to correctly train a subsequent model (for an example please see our recent PyData LA talk).

The next version of vtreat will warn the user if they have improperly used the same data for both vtreat impact code inference and downstream modeling. So in addition to us warning you not to do this, the package now also checks for and warns against this situation. vtreat has had methods for avoiding nested model bias for a very long time; we are now adding new warnings to confirm users are using them.
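To make the nested model bias concrete, here is a minimal pure-Python sketch. It does not use vtreat or its API; the function names (`impact_code`, `naive_encode`, `cross_frame_encode`) and the simple index-based fold assignment are our own illustrative assumptions. The categorical variable is pure noise, so an honest encoding of it should carry no signal about the outcome.

```python
import random
from statistics import mean


def impact_code(levels, y):
    """Estimate mean(y | level) - mean(y) for each level from the given rows."""
    grand_mean = mean(y)
    by_level = {}
    for level, value in zip(levels, y):
        by_level.setdefault(level, []).append(value)
    return {level: mean(values) - grand_mean for level, values in by_level.items()}


def naive_encode(levels, y):
    """Encode every row with codes estimated from those same rows (the biased way)."""
    codes = impact_code(levels, y)
    return [codes[level] for level in levels]


def cross_frame_encode(levels, y, n_folds=5):
    """Encode each row with codes estimated only from rows in the other folds."""
    n = len(y)
    encoded = [0.0] * n
    for fold in range(n_folds):
        rest = [i for i in range(n) if i % n_folds != fold]
        codes = impact_code([levels[i] for i in rest], [y[i] for i in rest])
        for i in range(fold, n, n_folds):
            # A level never seen outside this fold gets a neutral code of 0.
            encoded[i] = codes.get(levels[i], 0.0)
    return encoded


def correlation(a, b):
    """Pearson correlation of two equal-length sequences."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (z - mean_b) for x, z in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((z - mean_b) ** 2 for z in b)
    return cov / (var_a * var_b) ** 0.5 if var_a > 0 and var_b > 0 else 0.0


# Assumed toy data: a 200-level categorical variable unrelated to the outcome y.
rng = random.Random(2020)
n_rows = 1000
levels = [f"lev_{rng.randrange(200)}" for _ in range(n_rows)]
y = [rng.gauss(0, 1) for _ in range(n_rows)]

naive_corr = correlation(naive_encode(levels, y), y)
cross_corr = correlation(cross_frame_encode(levels, y), y)
print(f"naive encoding vs. y:       {naive_corr:.3f}")
print(f"cross-frame encoding vs. y: {cross_corr:.3f}")
```

On this toy data the naive encoding should look clearly correlated with the outcome even though the variable is noise, while the cross-frame style encoding should stay near zero. A model trained on the naive encoding would trust a variable that has no real predictive value, which is the situation the new warning is meant to catch.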

Set up the Example

This example is excerpted from some of our classification documentation.

New Year’s Resolution 2020: Work on more R Data Science Projects

We had such a positive reception to our last Introduction to Data Science promotion that we are going to try to make the course available to more people by lowering the base price to $29.99. We are also creating a one-month promotional price of $20.99. To get a permanent subscription to the course for less than $21, just visit https://www.udemy.com/course/introduction-to-data-science/ and use the discount code ITDS21 any time in January of 2020.

Combine this with the new second edition of Practical Data Science with R, and you have a great study set to succeed at substantial statistical modeling and analytics tasks using the R programming language.


(Image: PDSwR2Lego)

(Note: Lego mini-fig not included!)

Manning Deal of the Day January 3, 2020 : Half off Practical Data Science with R, Second Edition

Practical Data Science with R, Second Edition is half off as the Manning Deal of the Day for January 3, 2020. Use code dotd010320au at http://bit.ly/39vD1G4

Please share!

New Timings for a Grouped In-Place Aggregation Task

I’d like to share some new timings on a grouped in-place aggregation task. A client of mine was seeing some slow performance, so I decided to time a very simple abstraction of one of the steps of their workflow.

Introduction to Data Science in R, Free for 3 days

To celebrate the new year and the recent release of Practical Data Science with R 2nd Edition, we are offering a free coupon for our video course “Introduction to Data Science.”

The following URL and code should get you permanent free access to the video course if used between now and January 1st, 2020:

https://www.udemy.com/course/introduction-to-data-science/ code: PDSWR2