I have recently been working on projects using Amazon EC2 (elastic compute cloud), and RStudio Server. I thought I would share some of my working notes.
Amazon EC2 supplies near instant access to on-demand disposable computing in a variety of sizes (billed in hours). RStudio Server supplies an interactive user interface to your remote R environment that is nearly indistinguishable from a local RStudio console. The idea is: for a few dollars you can work interactively on R tasks requiring hundreds of GB of memory and tens of CPUs and GPUs.
If you are already an Amazon EC2 user with some Unix experience it is very easy to quickly stand up a powerful R environment, which is what I will demonstrate in this note.
Continue reading Setting up RStudio Server quickly on Amazon EC2
A good fraction of R users use Apple computers. Apple machines historically have sat at a sweet spot of convenience, power, and utility:
- Convenience: Apple machines are available at retail stores, come with purchasable support, and can run a lot of common commercial software.
R packages such as
Rcpp work better on top of a Posix environment.
- Utility: OSX was good at interoperating with the Linux your big data systems are likely running on, and some R packages expect a native operating system supporting a Posix environment (which historically has not been a Microsoft Windows, strength despite claims to the contrary).
Frankly the trade-off is changing:
- Apple is neglecting its computer hardware and operating system in favor of phones and watches. And (for claimed license prejudice reasons) the lauded OSX/macOS “Unix userland” is woefully out of date (try “
bash --version” in an Apple Terminal; it is about 10 years out of date!).
- Microsoft Windows Unix support is improving (
Windows 10 bash is interesting, though R really can’t take advantage of that yet).
- Linux hardware support is improving (though not fully there for laptops, modern trackpads, touch screens, or even some wireless networking).
Our current R platform remains Apple macOS. But our next purchase is likely a Linux laptop with the addition of a legal copy of Windows inside a virtual machine (for commercial software not available on Linux). It has been a while since Apple last “sparked joy” around here, and if Linux works out we may have a few Apple machines sitting on the curb with paper bags over their heads (Marie Kondo’s advice for humanely disposing of excess inanimate objects that “see”, such as unloved stuffed animals with eyes and laptops with cameras).
That being said: how does one update an existing Apple machine to macOS Sierra and then restore enough functionality to resume working? Please read on for my notes on the process. Continue reading Upgrading to macOS Sierra (nee OSX) for R users
Alexander Mordvintsev, Christopher Olah, and Mike Tyka, recently posted a great research blog article where they tried to visualize what a image classification neural net “wants to see.” They achieve this by optimizing the input to correspond to a fixed pattern of neural net internal node activation. This generated truly beautiful and fascinating phantasmagorical images (or an “image salad” by analogy to word salad). It is sort of like a search for eigenfaces (but a lot more fun).
A number of researchers had previously done this (many cited in their references), but the authors added more good ideas:
- Enforce a “natural image constraint” through insisting on near-pixel correlations.
- Start the search from another real image. For example: if the net is internal activation is constrained to recognize buildings and you start the image optimization from a cloud you can get a cloud with building structures. This is a great way to force interesting pareidolia like effects.
- They then “apply the algorithm iteratively on its own outputs and apply some zooming after each iteration.” This gives them wonderful fractal architecture with repeating motifs and beautiful interpolations.
- Freeze the activation pattern on intermediate layers of the neural network.
- (not claimed, but plausible given the look of the results) Use the access to the scoring gradient for final image polish (likely cleans up edges and improves resolution).
From Michael Tyka’s Inceptionism gallery
Likely this used a lot of GPU cycles. The question is, can we play with some of the ideas on our own (and on the cheap)? The answer is yes.
I share complete instructions, and complete code for a baby (couple of evenings) version of related effects. Continue reading Neural net image salad again (with code)
I am going to come-out and say it: I am emotionally done with 32 bit machines and operating systems. My sympathy for them is at an end.
I know that ARM is still 32 bit, but in that case you get something big back in exchange: the ability to deploy on smartphones and tablets. For PCs and servers 32 bit addressing’s time is long past, yet we still have to code for and regularly run into these machines and operating systems. The time/space savings of 32 bit representations is nothing compared to the loss of capability in sticking with that architecture and the wasted effort in coding around it. My work is largely data analysis in a server environment, and it is just getting ridiculous to not be able to always assume at least a 64 bit machine. Continue reading I am done with 32 bit machines
There is no excuse for a digital creative person to not use some sort of version control or source control. In the past disk space was too dear, version control systems were too expensive and software was not powerful enough; this is no longer the case. Unless your work is worthless both back it up and version control it. We will demonstrate a minimal set of version control commands that will one day save your bacon. Continue reading Minimal Version Control Lesson: Use It
I tend to prefer command line Linux and full window OSX for my work. The development and data handling tool chain is a bit better in Linux and the user interface reliability of the complete vertical stack is a bit better in OSX. I repeat here a couple of tips I found to improve the OSX finder.
Continue reading Enhance OSX Finder
Re-read Fred Brooks “The Mythical Man Month” over vacation. Book remains insightful about computer science and project management. Continue reading “The Mythical Man Month” is still a good read
Having worked with Unix (BSD, HPUX, IRIX, Linux and OSX), Windows (NT4, 2000, XP, Vista and 7) for quite a while I have seen a lot of different software tools. I would like to quickly exhibit my “must have” list. These are the packages that I find to be the single “must have offerings” in a number of categories. I have avoided some categories (such as editors, email programs, programing language, IDEs, photo editors, backup solutions, databases, database tools and web tools) where I have no feeling of having seen a single absolute best offering.
The spirit of the list is to pick items such that: if you disagree with an item in this list then either you are wrong or you know something I would really like to hear about.
Continue reading Must Have Software