Categories: Programming, Tutorials

R Tip: How To Look Up Matrix Values Quickly

R is a powerful data science language because, like Matlab, numpy, and Pandas, it exposes vectorized operations. That is, a user can perform operations on hundreds (or even billions) of cells by merely specifying the operation on the column or vector of values.
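As a small illustration of what "vectorized" means here, the base R sketch below converts a made-up vector of temperatures in one expression and compares the result with an explicit loop. The data and variable names are hypothetical and are not taken from the linked post.

```r
# Hypothetical data: one million temperatures in Celsius.
celsius <- seq(-50, 50, length.out = 1e6)

# Vectorized: the arithmetic is written once and applied to every element.
fahrenheit <- celsius * 9 / 5 + 32

# Equivalent explicit loop, shown only for comparison.
fahrenheit_loop <- numeric(length(celsius))
for (i in seq_along(celsius)) {
  fahrenheit_loop[i] <- celsius[i] * 9 / 5 + 32
}

# Both approaches should agree.
same_result <- isTRUE(all.equal(fahrenheit, fahrenheit_loop))
```

The vectorized form is shorter and hands the iteration to R's internals, which is typically much faster than the interpreted loop.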

Of course, sometimes it takes a while to figure out how to do this. Please read on for a great R matrix lookup problem and solution.
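The full problem and solution are in the linked post and are not reproduced in this excerpt. As a hedged sketch of one standard base R idiom for this kind of task, a matrix can be indexed by a two-column matrix of (row, column) pairs, which looks up one cell per pair in a single vectorized step. The matrix and index vectors below are made up for illustration.

```r
# Hypothetical 3 x 4 matrix, filled column by column with 1..12.
m <- matrix(1:12, nrow = 3, ncol = 4)

# We want the cells (1, 4), (3, 1), and (2, 2).
rows <- c(1, 3, 2)
cols <- c(4, 1, 2)

# Indexing with a two-column matrix returns one value per (row, col) pair.
values <- m[cbind(rows, cols)]

# By contrast, m[rows, cols] would return the full 3 x 3 sub-matrix
# formed by crossing the selected rows with the selected columns.
sub_matrix <- m[rows, cols]
```

Here `values` holds the three requested cells, while `sub_matrix` holds nine cells, which is usually not what a pairwise lookup wants.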


Categories: Administrativia, Practical Data Science, Pragmatic Data Science, Pragmatic Machine Learning, Statistics, Tutorials

Re-Share: vtreat Data Preparation Documentation and Video

I would like to re-share the vtreat (R version, Python version) data preparation documentation for machine learning tasks.

vtreat is a system for preparing messy real world data for predictive modeling tasks (classification, regression, and so on). In particular it is very good at re-coding high-cardinality string-valued (or categorical) variables for later use.
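To give a feel for what re-coding a categorical variable can look like, here is a deliberately simplified base R sketch of impact (target) coding, where each level is replaced by how far its average outcome sits from the overall average. This is an illustration of the general idea under made-up data and names; it is not vtreat's implementation or API.

```r
# Hypothetical training data: a string-valued variable and a numeric outcome.
d <- data.frame(
  zip = c("a", "a", "b", "b", "b", "c"),
  y = c(1, 3, 10, 12, 14, 5),
  stringsAsFactors = FALSE
)

# Overall mean of the outcome.
grand_mean <- mean(d$y)

# Mean outcome for each level of the categorical variable.
level_means <- tapply(d$y, d$zip, mean)

# Re-code each row as (its level's mean outcome) - (overall mean).
d$zip_impact <- as.numeric(level_means[d$zip]) - grand_mean
```

The result is a single numeric column regardless of how many distinct levels the original variable has. A naive version like this one reuses the outcome on the same rows it codes, so a production tool needs extra care to avoid overfitting.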


Categories: Administrativia, data science, Practical Data Science, Pragmatic Data Science, Pragmatic Machine Learning, Statistics To English Translation, Tutorials

What is New For vtreat 1.5.2?

vtreat version 1.5.2 just became available from CRAN.

We have logged a few improvements in the NEWS. The changes are small and incremental, as the package is already in a great stable state for production use.


Categories: data science, Statistics, Tutorials

New improved cdata instructional video

We have a new, improved version of the “how to design a cdata/data_algebra data transform” video up!

The original article, the Python example, and the R example have all been updated to use the new video.

Please check it out!

Categories: data science, Practical Data Science, Pragmatic Data Science, Pragmatic Machine Learning, Statistics, Tutorials

Data re-Shaping in R and in Python

Nina Zumel and I have two new tutorials on fluid data wrangling/shaping. They are written in a parallel structure, with the R version of the tutorial being almost identical to the Python version of the tutorial.

This reflects our opinion on the “which is better for data science: R or Python?” question. They are both great. So start with one, and expect to eventually work with both (if you are lucky).


Categories: Administrativia, data science, Practical Data Science, Pragmatic Data Science, Pragmatic Machine Learning, Statistics, Tutorials

wrapr 1.9.6 is now up on CRAN

wrapr 1.9.6 is now up on CRAN.

We unfortunately usually forget to say this. A big thank you to the staff and volunteers at CRAN.


Categories: data science, Statistics, Tutorials

Using unpack to Manage Your R Environment

In our last note we stated that unpack is a good tool for loading R RDS files into your working environment. Here is the idea expanded into a worked example.
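The worked example itself is in the linked post. As a rough base R sketch of the underlying pattern (saving several values as one named list in an RDS file, then binding selected names into an environment), something like the following could be used. It relies only on `saveRDS()`, `readRDS()`, and `list2env()`, and is an assumed analogue rather than wrapr's unpack interface; the file path and data are made up.

```r
# Hypothetical data: store two data frames together as one named list.
path <- tempfile(fileext = ".RDS")
saveRDS(
  list(train = data.frame(x = 1:3), test = data.frame(x = 4:5)),
  path
)

# Read the list back and bind only the names we expect into an environment.
loaded <- readRDS(path)
work_env <- new.env()
list2env(loaded[c("train", "test")], envir = work_env)

# The values are now available by name inside work_env.
n_train <- nrow(work_env$train)
n_test <- nrow(work_env$test)

# Remove the temporary file.
unlink(path)
```

Naming the expected entries explicitly, rather than dumping everything in the file into the workspace, keeps it clear which variables a script depends on.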


Categories: Exciting Techniques, Tutorials

unpack Your Values in R

I would like to introduce an exciting feature in the upcoming 1.9.6 version of the wrapr R package: value unpacking.


Categories: Administrativia, Opinion, Practical Data Science, Pragmatic Data Science, Pragmatic Machine Learning

New Year’s Resolution 2020: Work on more R Data Science Projects

We had such a positive reception to our last Introduction to Data Science promotion that we are going to try to make the course available to more people by lowering the base price to $29.99. We are also creating a one-month promotional price of $20.99. To get a permanent subscription to the course for less than $21, just visit this link https://www.udemy.com/course/introduction-to-data-science/ and use the discount code ITDS21 any time in January of 2020.

Combine this with the new second edition of Practical Data Science with R, and you have a great study set to succeed at substantial statistical modeling and analytics tasks using the R programming language.


[Image: PDSwR2Lego]

(Note: Lego mini-fig not included!)