Ambitious analytics projects have a tangible risk of failure. Uncertainty breeds anxiety. There are known techniques to lower the uncertainty, guarantee failure and shift the blame onto others. We outline a few proven methods of analytics sabotage and their application. In honor of Stephen Potter we call this activity “statsmanship,” which we define as pursuing the goal of making your analytics group cry.

Read more…
Fast Portfolio re-Balancing as a Fractional Linear Program is an example of the kind of work we have done encoding client problems (in this case optimal portfolio selection) as optimization problems (so we can use purchased software to solve them). It's a bit mathy, but we are excited that we got permission to share this. Read more…
We have been living in the age of “big data” for some time now. This is an age where incredible things can be accomplished through the effective application of statistics and machine learning at large scale (for example, see “The Unreasonable Effectiveness of Data,” Alon Halevy, Peter Norvig, Fernando Pereira, IEEE Intelligent Systems (2009)). But I have gotten to thinking about the period before this: the period before we had easy access to so much data, before most computation was aggregation and before we accepted numerical-analysis-style convergence as “efficient.” A small problem I needed to solve (as part of a bigger project) reminded me what theoretical computer scientists did then: we worried about provable worst-case efficiency.
Read more…
Categories: Computer Science, Expository Writing, Mathematics Tags: Age of Big Data, Big Data, Mathematical Bedside Reading, Mean, Mean of Medians, Median, Median of Means, Theorist, Winsorized mean
We extend the ideas from Automatic Differentiation with Scala to include reverse accumulation. Reverse accumulation is a non-obvious improvement to automatic differentiation that can in many cases vastly speed up calculations of gradients. Read more…
Categories: Applications, Coding, Exciting Techniques, Mathematics, Programming, Tutorials Tags: Automatic Differentiation, Conjugate Gradient, Gradient, Mathematical Bedside Reading, Optimization, Reverse Accumulation, Scala
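To give a feel for the reverse accumulation idea mentioned in the teaser above, here is a minimal sketch. It is written in Python, not the Scala of the post, and it is not the post's code: the names `Var`, `backward`, `sin` and `f` are hypothetical and chosen only for illustration. The point it tries to show is that one forward pass records the computation and one backward pass then yields the partial derivatives with respect to every input at once.

```python
import math


class Var:
    """A node in the recorded computation: a value plus links to its inputs."""

    def __init__(self, value, parents=()):
        self.value = value
        # Each parent entry is (input_node, local partial derivative w.r.t. that input).
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__


def _lift(x):
    """Wrap plain numbers as constant nodes."""
    return x if isinstance(x, Var) else Var(float(x))


def sin(x):
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


def backward(output):
    """Propagate d(output)/d(node) from the output back to every input."""
    order, seen = [], set()

    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)  # inputs are appended before the nodes that use them

    visit(output)
    for node in order:
        node.grad = 0.0
    output.grad = 1.0
    # Walk from the output toward the inputs, applying the chain rule.
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local


def f(x, y):
    # Hypothetical example function: f(x, y) = x*y + sin(x)
    return x * y + sin(x)


x, y = Var(2.0), Var(3.0)
z = f(x, y)
backward(z)  # a single backward pass fills in both x.grad and y.grad
```

After the call to `backward`, `x.grad` holds y + cos(x) and `y.grad` holds x. A forward-mode scheme would need one pass per input to get the same gradient, which is where the speedup for functions of many variables comes from.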
This article is a worked-out exercise in applying the Scala type system to solve a small-scale optimization problem. For this article we supply complete Scala source code (under a GPLv3 license) and some design discussion. Read more…
Categories: Applications, Coding, Computer Science, Exciting Techniques, Mathematics, Programming, Tutorials Tags: Automatic Differentiation, Conjugate Gradient, Dual Numbers, Geometric Median, Numeric Methods, Optimization, Scala, Steiner Tree
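The tags above mention dual numbers, the usual device behind forward-mode automatic differentiation. As a rough illustration only, here is a minimal Python sketch (the post itself supplies Scala code, which this does not reproduce). The names `Dual`, `lift`, `derivative` and `cubic` are hypothetical. Each number carries its value together with its derivative, and the arithmetic operators update both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to one chosen input."""
    value: float
    deriv: float = 0.0

    def __add__(self, other):
        other = lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__


def lift(x):
    """Treat plain numbers as constants (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def derivative(f, x):
    """Evaluate f at x with the derivative slot seeded to 1, then read it off."""
    return f(Dual(x, 1.0)).deriv


def cubic(x):
    # Hypothetical example: x^3 + 2x + 1, whose derivative is 3x^2 + 2
    return x * x * x + 2 * x + 1


slope = derivative(cubic, 3.0)  # 3 * 3^2 + 2 = 29
```

Because `cubic` is written with ordinary operators, the same function works on plain floats and on `Dual` values. In a statically typed language the type system can enforce that genericity, which is presumably part of what the article's design discussion covers.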
Having worked with Unix (BSD, HPUX, IRIX, Linux and OSX) and Windows (NT4, 2000, XP, Vista and 7) for quite a while, I have seen a lot of different software tools. I would like to quickly exhibit my “must have” list. These are the packages that I find to be the single “must have offerings” in a number of categories. I have avoided some categories (such as editors, email programs, programming languages, IDEs, photo editors, backup solutions, databases, database tools and web tools) where I have no feeling of having seen a single absolute best offering.
The spirit of the list is to pick items such that: if you disagree with an item in this list then either you are wrong or you know something I would really like to hear about.
Read more…
Categories: Computers, Opinion, Programming, Tutorials Tags: Excel, git, GnuPG, Keynote, Latex, Must Have Software, Papers, R, Software, Tools, TrueCrypt
We would like to share a new algorithmic movie we have created.
Since the mid-90s we have been dabbling off and on with a combination of algorithmic and genetic art (see: What is “Genetic Art?” or try running the Java code directly in your browser). Every once in a while we return to the project and generate something we would like to share.
Read more…
This article is a quick appreciation of some of the statistical, analytic and philosophic techniques of Deming, Wald and Boyd. Many of these techniques have become pillars of modern industry through the sciences of statistics and operations research.
Read more…
Readers returning to our blog will know that Win-Vector LLC is fairly “pro-R.” You can take that to mean “in favor of R” or “professionally using R” (both statements are true). Some days we really don’t feel that way. Read more…