
Wanted: cdata Test Pilots

I need a few volunteers to “test pilot” the development version of the R package cdata, please.

Jackie Cochran at 1938 Bendix Race
Jacqueline Cochran: at the time of her death, no other pilot held more speed, distance, or altitude records in aviation history than Cochran.

Our cdata package is using an upcoming new feature called “build_frame()” that allows for the very easy and legible entry of example data.frames. It allows one to type in a data.frame in a row-oriented form (much like tibble::tribble()).

Using build_frame() we can type in a by-hand example data.frame as follows:

library("cdata")

d <- build_frame(
  "names", "x", "y" |
  "a"    , 1  ,  1  |
  "b"    , 2  ,  4  |
  "c"    , 3  ,  9  )

The idea is that the code above looks very much like the printed result (both are in row-oriented form):

print(d)

#   names x  y
# 1     a 1  1
# 2     b 2  4
# 3     c 3  9

The number of columns is inferred from the location of the first infix operator “|”, and all other formatting details are irrelevant.
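To make the row-oriented idea concrete, here is a minimal base R sketch of the reshaping step: once the column count is known, a flat, row-ordered run of cells can be cut into a header and columns. The helper name rows_to_frame() is hypothetical and is not part of cdata, and the column count is passed explicitly here, whereas build_frame() infers it from the first “|”. This is an illustration under those assumptions, not a description of how build_frame() is implemented.

```r
# Hypothetical helper (NOT part of cdata): build a data.frame from a flat,
# row-ordered list of cells, given the number of columns.
rows_to_frame <- function(cells, ncol) {
  # Need a full header row plus at least one full data row.
  stopifnot(length(cells) > ncol, length(cells) %% ncol == 0)
  header <- unlist(cells[seq_len(ncol)])
  body <- cells[-seq_len(ncol)]
  # Column j is every ncol-th cell of the body, starting at position j.
  cols <- lapply(seq_len(ncol), function(j) {
    unlist(body[seq(j, length(body), by = ncol)])
  })
  names(cols) <- header
  data.frame(cols, stringsAsFactors = FALSE)
}

d2 <- rows_to_frame(
  list("names", "x", "y",
       "a"    , 1  , 1  ,
       "b"    , 2  , 4  ,
       "c"    , 3  , 9  ),
  ncol = 3)

print(d2)
```

The whitespace and alignment in the call are purely cosmetic, which is the same property that lets the build_frame() example be laid out like a printed table.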

cdata also includes a printer that prints any simple data.frame (one containing only numeric, character, and logical values) in a ready-to-share format:

cat(draw_frame(
  data.frame(names = c('a', 'b', 'c'),
             x = c(1, 2, 3),
             y = c(1, 4, 9))
  ))

# build_frame(
#    "names", "x", "y" |
#    "a"    , 1  , 1   |
#    "b"    , 2  , 4   |
#    "c"    , 3  , 9   )
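For readers curious what such a printer involves, the following is a rough base R sketch that renders a simple data.frame as build_frame() source text. The function name sketch_frame_text() is hypothetical and not part of cdata. Unlike the draw_frame() output above, it does not pad cells for alignment, and it makes no attempt to escape quotes inside strings or to handle NA values.

```r
# Hypothetical sketch (NOT cdata's draw_frame): render a simple data.frame
# as build_frame() source text, without column alignment.
sketch_frame_text <- function(d) {
  # Quote character-like cells; leave numeric and logical cells unquoted.
  fmt_col <- function(col) {
    if (is.character(col) || is.factor(col)) {
      paste0('"', as.character(col), '"')
    } else {
      as.character(col)
    }
  }
  header <- paste0('"', names(d), '"')
  body <- do.call(cbind, lapply(d, fmt_col))
  rows <- rbind(header, body)
  # One text line per row, cells separated by commas.
  lines <- apply(rows, 1, paste, collapse = ", ")
  # Rows are joined by the "|" infix operator, as in the examples above.
  paste0("build_frame(\n  ",
         paste(lines, collapse = " |\n  "),
         " )\n")
}

txt <- sketch_frame_text(
  data.frame(names = c("a", "b", "c"),
             x = c(1, 2, 3),
             y = c(1, 4, 9)))
cat(txt)
```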

More information and examples can be found in the method documentation (help(build_frame), help(draw_frame), and help(qchar_frame)) and in the frame tools vignette.

This is new functionality, and I wouldn’t mind a few “test pilots” to generate some feedback before submitting this update to CRAN. Please consider filing issues on the project, or emailing me (email in the project description).

To try it, you need to install the development version of cdata, which in R can be done with a series of commands such as the following:

install.packages("devtools")
devtools::install_github("WinVector/wrapr")
devtools::install_github("WinVector/cdata")
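After installing, a quick check like the one below can confirm which of the two packages R can find, and at what version. It is only a convenience sketch using base R functions (requireNamespace() and utils::packageVersion()), and the variable name installed is arbitrary.

```r
# Report whether each package is available and, if so, its version.
pkgs <- c("wrapr", "cdata")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

for (pkg in pkgs) {
  if (installed[[pkg]]) {
    message(pkg, " ", as.character(utils::packageVersion(pkg)), " is installed")
  } else {
    message(pkg, " is not installed")
  }
}
```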

I also strongly recommend that R users check out cdata’s primary function: next-generation fluid reshaping of data (more teaching materials: 1, 2, 3, and 4). It is a new thing to learn, but it takes your data engineering and data wrangling to a whole new level.