The Data Enrichment Method

We explore some of the ideas from the seminal paper “The Data-Enrichment Method” (Henry R Lewis, Operations Research (1957) vol. 5 (4) pp. 1-5). The paper explains a technique for improving the quality of statistical inference by increasing the effective size of the data set. This is called “Data-Enrichment.”

Now more than ever we must be familiar with the consequences of these important techniques, especially since we may already be victims of them without knowing it.

What does the Market Think?

What does the market think about IBM’s proposed acquisition of Sun?

It is not all the quants’ fault.

There is plenty of blame to go around for the current global financial crisis. But I would like to point out that it is not “all the quants’ fault.” We are all now, unfortunately, sitting in the middle of a high-quality (and extremely expensive) lesson in financial mathematics. I would hate for some of the truly important points to be lost because we paid too much attention to some of the shiny details.

Volunteers in Large Clubs: The Theorist’s View

I have just posted a new write-up: Volunteers in Large Clubs: The Theorist’s View. This paper describes some interesting issues in organizing volunteers in a large club and tries to show (without math) how a theoretical computer scientist attacks such problems.

Map Reduce: A Good Idea

Some time ago I subscribed to The Database Column because I thought it would be fun to see what these incredible people wanted to discuss. We owe much of our current database technology to Professor Stonebraker, and Vertica sounds like an incredible product. I definitely want to continue to subscribe.

However, the reading experience is marred by some flaw in their RSS system that keeps marking the article “MapReduce: A major step backwards” as a new article. This causes the article to appear in my RSS reader every few weeks as “new.” This wouldn’t bother me too much except that the article runs so counter to experience that it is itself offensive.

Exciting Technique #1: The “R” language.

Our first “exciting technique” article is about a statistical language called “R.”

R is a language for statistical analysis available from http://cran.r-project.org/ . The things you can immediately do with it are incredible. You can import a spreadsheet and immediately spot relationships, trends and anomalies. R gives you instant access to top-notch visualization methods and sophisticated statistical methods.

New “exciting techniques” series of articles.

I am starting a new “exciting techniques” series of articles on the Win-Vector blog. The primary purpose of the Win-Vector blog remains identifying and describing needs, but I am starting a new sub-series of articles about techniques.

The Purpose of this Blog

The purpose of this blog (which is not quite “blog like” in its promise of a once-a-month longish technical article) is to educate, to share the Win-Vector principles, and to learn more about writing (through practice).

I am a big fan of “understanding through writing” (you learn through trying to explain). The difficulty in technical writing is always balancing the competing needs for conciseness, clarity, correctness and utility. There is a next level of writing and understanding (that I am not at, but I am becoming more able to recognize) where these things synergize instead of compete. This post will close with such an example from Edsger Dijkstra (in its entirety):

Elegance is not a dispensable luxury but a factor that decides between success and failure.

This covers so much of what I am trying to say.

(And thank you to Peteris Krumins for blogging on this)

Something I don’t get about business and bailouts

I don’t really know what the right answer to the $700 Billion Dollar Bailout Question is (I have not read the bill, and I wonder if the bill really describes what would happen). But the whole situation does remind me of a related question: is it really the end of the world if the “credit markets freeze?” It is a disaster if the equity markets tank for a period of longer than a year or so (it prevents people from retiring and so on), but I am not sure that all of the consequences we are being told about really follow.

A Quick Appreciation of the Sharpe Ratio

The current state of the global financial markets has gotten more people than usual worrying about the technical aspects of finance. One method for reasoning about investment returns and risk is a tool called the Sharpe Ratio. It is well worth reviewing this measure and seeing how, if used properly, it doesn’t favor any of the mistakes that underlie our current financial crisis.
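
For readers who have not met the measure before, here is a minimal sketch of the usual textbook form of the Sharpe Ratio: the mean return in excess of a risk-free rate, divided by the standard deviation of those excess returns. The function name, parameters and sample numbers are illustrative assumptions and are not taken from the full article.

```python
import math
import statistics


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=1):
    """Textbook Sharpe Ratio: mean excess return over its sample standard deviation.

    returns          -- per-period returns as fractions (0.01 means 1%); hypothetical input
    risk_free_rate   -- per-period risk-free return, in the same units (an assumption)
    periods_per_year -- set to e.g. 12 for monthly data to annualize; 1 leaves it unscaled
    """
    excess = [r - risk_free_rate for r in returns]
    mean_excess = statistics.mean(excess)
    stdev_excess = statistics.stdev(excess)  # sample standard deviation, needs >= 2 points
    if stdev_excess == 0:
        raise ValueError("returns have zero variance; the ratio is undefined")
    # Scaling by sqrt(periods) is the common annualization convention.
    return mean_excess / stdev_excess * math.sqrt(periods_per_year)


if __name__ == "__main__":
    # Made-up per-period returns, purely for illustration.
    sample_returns = [0.01, 0.03, 0.02]
    print(sharpe_ratio(sample_returns))
```

Note that the ratio penalizes volatility directly: two investments with the same average excess return are ranked by how much risk was taken to earn it.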