Archive

Archive for the ‘Mathematics’ Category

Prefer = for assignment in R

April 23rd, 2013 20 comments

We share our opinion that = should be preferred to the more standard <- for assignment in R. This is from a draft of the appendix of our upcoming book. This has the risk of becoming an R version of Javascript’s semicolon controversy, but here you have it. Read more…

Categories: Mathematics, Programming Tags: , ,

Yet Another Java Linear Programming Library

November 23rd, 2012 Comments off

From time to time we work on projects that would benefit from a free lightweight pure Java linear programming library. That is a library unencumbered by a bad license, available cheaply, without an infinite amount of file format and interop cruft and available in Java (without binary blobs and JNI linkages). There are a few such libraries, but none have repeatably, efficiently and reliably met our needs. So we have re-packaged an older one of our own for release under the Apache 2.0 license. This code will have its own rough edges (not having been used widely in production), but I still feel fills an important gap. This article is brief introduction to our WVLPSolver Java library. Read more…

Working an example of von Neumann and Morgenstern utility

November 6th, 2012 Comments off

von Neumann and Morgenstern’s “Theory of Games and Economic Behavior” is the famous basis for game theory. One of the central accomplishments is the rigorous proof that comparative “preference methods” over fairly complicated “event spaces” are no more expressive than numeric (real number valued) utilities. That is: for a very wide class of event spaces and comparison functions “>” there is a utility function u() such that:

a > b (“>” representing the arbitrary comparison or preference for the event space) if and only if u(a) > u(b) (this time “>” representing the standard order on the reals).

However, an active reading of sections 1 through 3 and even the 2nd edition’s axiomatic appendix shows that the concept of “events” (what preferences and utilities are defined over) is deliberately left undefined. There is math and objects and spaces, but not all of them are explicitly defined in term of known structures (are they points in R^n, sets, multi-sets, sums over sets or what?). The word “event” is used early in the book and not in the index. Axiomatic treatments often rely on intentionally leaving ground-concepts undefined, but we are going to work a concrete example through von Neumann and Morgenstern to try and illustrate a bit more of the required intuition and deep nature of their formal notions of events and utility. I also will illustrate how, at least in discussion, von Neuman and Morgenstern may have held on to a naive “single outcome” intuition of events and a naive “direct dollars” intuition of utility despite erecting a theory carefully designed to support much more structure. This is possible because they never have to calculate in the general event space: they prove access to the preference allows them to construct the utility funciton u() and then work over the real numbers. Sections 1 through 3 are designed to eliminate the need for a theory of preference or utility and allow von Neuman and Morgenstern to work with real numbers (while achieving full generality). They never need to make the translations explicit, because soon after showing the translations are possible they assume they have already been applied. Read more…

Added worked example to logistic regression project

October 12th, 2012 Comments off

We have added a worked example to the README of our experimental logistic regression code.

The Logistic codebase is designed to support experimentation on variations of logistic regression including:

What we mean by this code being “experimental” is that it has capabilities that many standard implementations do not. In fact most of the items in the above list are not usually made available to the logistic regression user. But our project is also stand-alone and not as well integrated into existing workflows as standard production systems. Before trying our code you may want to try R or Mahout. Read more…

Rudie can’t fail (if majorized)

October 6th, 2012 Comments off

We have been writing for a while about the convergence of Newton steps applied to a logistic regression (See: What does a generalized linear model do?, How robust is logistic regression? and Newton-Raphson can compute an average). This is all based on our principle of working examples for understanding. This eventually progressed to some writing on the nature of problem solving (a nice complement to our earlier writing on calculation). In the course of research we were directed to a very powerful technique called the MM algorithm (see: “The MM Algorithm” Kenneth Lang, 2007; “A Tutorial on MM Algorithms”, David R. Hunter, Kenneth Lange, Amer. Statistician 58:30–37, 2004; and “Monotonicity of Quadratic-Approximation Algorithms”, Dankmar Bohning, Bruce G. Lindsay, Ann. Inst. Statist. Math, Vol. 40, No. 4, pp 641-664, 1988). The MM algorithm introduces an essential idea: majorized functions (not to be confused with the majorized order on R^d). Majorization it is an interesting way to modify Newton methods to be reliable contractions (and therefore converge in a manner similar to EM algorithms).

Here we will work an example of the MM method. We will not work it in its most general form, but in a form that quickly reveals much of the beauty of the method. We also introduce a “collared Newton step” which guarantees convergence without resorting to line-search (essentially resolving the issues in solving a logistic regression by Newton style methods). Read more…

The Mathematician’s Dilemma

September 3rd, 2012 Comments off

A recent run of too many articles on the same topic (exhibits: A, B and C) puts me in a position where I feel the need to explain my motivation. Which itself becomes yet another article related to the original topic. The explanation I offer is: this is the way mathematicians think. To us mathematicians the tension is that there are far too many observable patterns in the world to be attributed to mere chance. So our dilemma is: for which patterns/regularities should we derive some underlying law and which ones are not worth worrying about. Or which conjectures should try to work all the way to proof or counter-example? Read more…

Newton-Raphson can compute an average

August 28th, 2012 1 comment

In our article How robust is logistic regression? we pointed out some basic yet deep limitations of the traditional full-step Newton-Raphson or Iteratively Reweighted Least Squares methods of solving logistic regression problems (such as in R‘s standard glm() implementation). In fact in the comments we exhibit a well posed data fitting problem that can not be fit using the traditional methods starting at the traditional (0,0) start point. And we cited an example where the traditional methods fail to compute the average from a non-zero start. The question remained: can we prove the standard methods always compute the average correctly if started at zero? It turns out they can, and the proof isn’t as messy as I anticipated. Read more…

How robust is logistic regression?

August 23rd, 2012 6 comments

Logistic Regression is a popular and effective technique for modeling categorical outcomes as a function of both continuous and categorical variables. The question is: how robust is it? Or: how robust are the common implementations? (note: we are using robust in a more standard English sense of performs well for all inputs, not in the technical statistical sense of immune to deviations from assumptions or outliers.)

Even a detailed reference such as “Categorical Data Analysis” (Alan Agresti, Wiley, 1990) leaves off with an empirical observation: “the convergence … for the Newton-Raphson method is usually fast” (chapter 4, section 4.7.3, page 117). This is a book that if there is a known proof that the estimation step is a contraction (one very strong guarantee of convergence) you would expect to see the proof reproduced. I always suspected there was some kind of Brouwer fixed-point theorem based folk-theorem proving absolute convergence of the Newton-Raphson method in for the special case of logistic regression. This can not be the case as the Newton-Raphson method can diverge even on trivial full-rank well-posed logistic regression problems. Read more…

A bit more on impact coding

August 2nd, 2012 1 comment

Dr. Nina Zumel recently published an excellent tutorial on a modeling technique she called impact coding. It is a pragmatic machine learning technique that has helped with more than one client project. Impact coding is a bridge from Naive Bayes (where each variable’s impact is added without regard to the known effects of any other variable) to Logistic Regression (where dependencies between variables and levels is completely accounted). A natural question is can pick up more of the positive features of each model? Read more…

How to outrun a crashing alien spaceship

June 11th, 2012 4 comments

Hollywood movies are obsessed with outrunning explosions and outrunning crashing alien spaceships. For explosions the movies give the optimal (but unusable) solution: run straight away. For crashing alien spaceships they give the same advice, but in this case it is wrong. We demonstrate the correct angle to flee.

PrometheusRun

Running from a crashing alien spaceship, Prometheus 2012, copyright 20th Century Fox
Read more…