Posted on Categories Administrativia, Coding, Statistics, TutorialsTags , , , 4 Comments on R Tip: Use [[ ]] Wherever You Can

R Tip: Use [[ ]] Wherever You Can

R tip: use [[ ]] wherever you can.

In R the [[ ]] is the operator that (when supplied a simple scalar argument) pulls a single element out of lists (and the [ ] operator pulls out sub-lists).

For vectors [[ ]] and [ ] appear to be synonyms (modulo the issue of names). However, for a vector [[ ]] checks that the indexing argument is a scalar, so if you intend to retrieve one element this is a good way of getting an extra check and documenting intent. Also, when writing reusable code you may not always be sure if your code is going to be applied to a vector or list in the future.

It is safer to get into the habit of always using [[ ]] when you intend to retrieve a single element.

Example with lists:

list("a", "b")[1]
#> [[1]]
#> [1] "a"

list("a", "b")[[1]]
#> [1] "a"

Example with vectors:

c("a", "b")[1]
#> [1] "a"

c("a", "b")[[1]]
#> [1] "a"

The idea is: in situations where both [ ] and [[ ]] apply we rarely see [[ ]] being the worse choice.


Note on this article series.

This R tips series is short simple notes on R best practices, and additional packaged tools. The intent is to show both how to perform common tasks, and how to avoid common pitfalls. I hope to share about 20 of these about every other day to learn from the community which issues resonate and to also introduce some of features from some of our packages. It is an opinionated series and will sometimes touch on coding style, and also try to showcase appropriate Win-Vector LLC R tools.

Posted on Categories Administrativia, data science, Exciting Techniques, Practical Data Science, Pragmatic Data Science, Pragmatic Machine Learning, Statistics, TutorialsTags , , , Leave a comment on Data Reshaping with cdata

Data Reshaping with cdata

I’ve just shared a short webcast on data reshaping in R using the cdata package.

(link)

We also have two really nifty articles on the theory and methods:

Please give it a try!

This is the material I recently presented at the January 2017 BARUG Meetup.

NewImage

Posted on Categories Administrativia, Exciting Techniques, Practical Data Science, Pragmatic Data Science, Pragmatic Machine Learning, StatisticsTags , , , , Leave a comment on Big cdata News

Big cdata News

I have some big news about our R package cdata. We have greatly improved the calling interface and Nina Zumel has just written the definitive introduction to cdata.

cdata is our general coordinatized data tool. It is what powers the deep learning performance graph (here demonstrated with R and Keras) that I announced a while ago.

KerasPlot

However, cdata is much more than that.

Continue reading Big cdata News

Posted on Categories Administrativia, Opinion, Pragmatic Data Science, Pragmatic Machine Learning, Programming, StatisticsTags , , , , , , , , 4 Comments on Getting started with seplyr

Getting started with seplyr

A big “thank you!!!” to Microsoft for hosting our new introduction to seplyr. If you are working R and big data I think the seplyr package can be a valuable tool.


Safety
Continue reading Getting started with seplyr

Posted on Categories Administrativia, Statistics, TutorialsTags , , , , , 8 Comments on Update on coordinatized or fluid data

Update on coordinatized or fluid data

We have just released a major update of the cdata R package to CRAN.

Cdata

If you work with R and data, now is the time to check out the cdata package. Continue reading Update on coordinatized or fluid data

Posted on Categories Administrativia, data science, StatisticsTags , , , 1 Comment on Some Announcements

Some Announcements

Some Announcements:

  • Dr. Nina Zumel will be presenting “Myths of Data Science: Things you Should and Should Not Believe”,
    Sunday, October 29, 2017
    10:00 AM to 12:30 PM at the She Talks Data Meetup (Bay Area).
  • ODSC West 2017 is soon. It is our favorite conference and we will be giving both a workshop and a talk.
    • Thursday Nov 2 2017,
      2:00 PM,
      Room T2,
      “Modeling big data with R, Sparklyr, and Apache Spark”,
      Workshop/Training intermediate, 4 hours,
      by Dr. John Mount (link).

    • Friday Nov 3 2017,
      4:15 PM,
      Room TR2
      “Myths of Data Science: Things you Should and Should Not Believe”,
      Data Science lecture beginner/intermediate, 45 minutes,
      by Dr. Nina Zumel (link, length, abstract, and title to be corrected).

    • We really hope you can make these talks.

  • On the “R for big data” front we have some big news: the replyr package now implements pivot/un-pivot (or what tidyr calls spread/gather) for big data (databases and Sparklyr). This data shaping ability adds a lot of user power. We call the theory “coordinatized data” and the work practice “fluid data”.
Posted on Categories Administrativia, Opinion, Practical Data Science, Pragmatic Data Science, Pragmatic Machine Learning, Statistics, TutorialsTags , , , , , , 3 Comments on Upcoming data preparation and modeling article series

Upcoming data preparation and modeling article series

I am pleased to announce that vtreat version 0.6.0 is now available to R users on CRAN.


Vtreat

vtreat is an excellent way to prepare data for machine learning, statistical inference, and predictive analytic projects. If you are an R user we strongly suggest you incorporate vtreat into your projects. Continue reading Upcoming data preparation and modeling article series

Posted on Categories Administrativia, Opinion, StatisticsTags , ,

Thank You For The Very Nice Comment

Somebody nice reached out and gave us this wonderful feedback on our new Supervised Learning in R: Regression (paid) video course.

Thanks for a wonderful course on DataCamp on XGBoost and Random forest. I was struggling with Xgboost earlier and Vtreat has made my life easy now :).

Continue reading Thank You For The Very Nice Comment

Posted on Categories Administrativia, data science, Practical Data Science, Pragmatic Data Science, StatisticsTags , 4 Comments on Supervised Learning in R: Regression

Supervised Learning in R: Regression

We are very excited to announce a new (paid) Win-Vector LLC video training course: Supervised Learning in R: Regression now available on DataCamp

Shield image course 3851 20170725 24872 3f982z Continue reading Supervised Learning in R: Regression