
Postel’s Law: Not Sure Who To Be Angry With

February 25th, 2010

One of my research interests is finding the principles that underlie the management of information, complexity and uncertainty. When something as simple as a web-form is called “technology” it is time to step back and examine your principles. One principle I am not sure about is Postel’s law. It doesn’t hold often enough to be relied on, and when it fails I am not sure who to be angry with.

Postel’s Law (also called the Robustness Principle) comes from RFC 761, “Transmission Control Protocol” (1980), and reads: “Be conservative in what you do; be liberal in what you accept from others.” (Side note: RFC is the now-ironic acronym used to describe Internet standards; the letters stood for “request for comments.”)

This idea probably worked best where it started: in the TCP/IP world, where many of the hairy details of how computers network are handled. When your goal is the basic establishment of transient communications, success is measured by getting information through without unnecessarily triggering a failure. It may be okay to tolerate mistakes here, because they don’t live long anyway.

Unfortunately, the law works less well in other places where it is applied. The law is a downright hazard when archiving meaningful data (instead of managing transient signaling protocols). Sometimes the cost of obeying the law far outweighs the potential benefit.

A common arena for Postel’s law is now HTML, the markup language used to represent the content we view in web-browsers. In this arena Postel’s law has had two consequences: one good, one bad.

The good: almost anyone can create a working web-page or even a web-site, because modern browsers have been designed to paper over almost every common HTML mistake. It has been pointed out that this ease of creation and “worse is better” (a deep principle due to Richard P. Gabriel; see Wikipedia: worse is better) is one of the reasons HTML out-competed and killed many other ideas. Philip Greenspun’s famous story of a 10-year-old building a web site to get his mother medical attention happened in the sloppy world of HTML and could not have happened in the straitjacket of RDF (Resource Description Framework, the darling of the semantic web). I would not wish having to actually read or adhere to the incredibly long and irrelevant standards from w3.org (where the ratio of value to pedantry goes to zero) on an enemy. The web is only interesting because of its content, and much of that content was only possible because of the low barrier to entry.

The bad: to read HTML you almost have to re-create the entire history of web-browsers. This is a history of many hostile competitors (Microsoft, Netscape, Opera, WebKit, Mozilla, Google) and billions of dollars. Reproducing a significant fraction of this history is a significant (and useless) expense. For the most part I use a permissive parsing library like TagSoup or HTMLTidy, but even these miss some things that browsers accept, and they are far more complicated than the task truly justifies.

Even worse are the cases of XML and RDF. These are often used for archival storage of semantic data; that is, you may need to read and understand (not just display) data in XML for a long time. To be liberal in what you accept you have to again master a long list of useless complications (DTDs, namespaces, incredibly inept character encodings and escapes) and still get burned by improperly encoded XML (that “used to work” because the bugs in the emitted XML matched the bugs in a library that is now out of date).
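To make the burn concrete, here is a minimal sketch using only the JDK’s built-in DOM parser (the malformed document is my own invented example, not taken from any real feed): a conforming parser rejects the whole document for exactly the kind of mistake hand-rolled emitters routinely make.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;

public class StrictXmlDemo {
    public static void main(String[] args) throws Exception {
        // An unescaped ampersand: routine in hand-emitted XML, fatal to a conforming parser
        String bad = "<note>Tom & Jerry</note>";
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(bad.getBytes(StandardCharsets.UTF_8)));
            System.out.println("parsed");
        } catch (org.xml.sax.SAXParseException e) {
            // One bad character, and the entire document is unreadable
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A document that displays fine in every browser is simply unreadable to the archival tool chain, which is the asymmetry the paragraph above complains about.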

It is clear in the cases of HTML and XML that Postel’s law’s cost is too high for what it delivers. Or at least half of the law is too expensive: no amount of being generous in what we accept makes up for the original data not having been captured correctly (the producers not being “conservative in what they do” and not having checked that at the time the data was created). Part of the problem is that the producers of the data have no way of telling they are not being “conservative in what they do,” because the “generous in what they accept” libraries they use to debug don’t tell them they are emitting bad data. And let’s be honest: most systems are not designed for correctness; they are instead debugged until they seem to work. I would say that HTML is in fact not an example of the power of Postel’s law but of the pernicious influence of “worse is better.” Computer science has not risen to the level of “software engineering”; we are still a horrible “fit and finish” industry.

Frankly, for many things we need a simpler “fail early” discipline. Tools need to be better and standards need to be simpler, so that if you write something that is wrong it is easy to see why it is wrong and easy to fix. Postel’s law has helped hide the negative impacts of complicated standards; we need to push the cost of complications back onto standards committees. The need to be “generous in what you accept” overly favors large, rich, entrenched players who have had the time and resources to incrementally invest in papering over every common mistake.

However, I am not sure if we can throw out half of Postel’s law, or even if we want to. When Postel’s law fails, it is not clear who to be mad at.

Sun, to kick somebody who is already down, was famous for making elaborate frameworks that correctly and brutally implement many details of RFCs. Sun’s Java includes huge frameworks for XML, UTF-8 and email that scrupulously implement page after page of useless standards documentation but fail in the wild because they are not “generous in what they accept.” For example, Sun’s GlassFish (which got named as one of the four or five important assets during Sun’s various acquisition talks, much like the fact that the car has cup-holders somehow always gets mentioned in spec sheets) is an “open source production-quality enterprise software application server.” A supposedly major component of GlassFish is its email component, a huge unwieldy framework that implements many of the email-related RFCs and protocols, including IMAP. Unfortunately, for all its hugeness, it cannot reliably read email folder names from one of the biggest IMAP servers: Google Mail. Google Mail includes “against standard” characters in the protocol, and these crash the GlassFish software.

And here is where Postel’s law fails us: under Postel’s law both sides are at fault (Sun for failing to be generous in what they accepted, Google for failing to be conservative in what they did). We can’t assign only one villain; we have no prescription for whom to ask for a fix. Postel’s law seems useful in that if either Google or Sun had followed it, the two systems would work. But the law doesn’t pick one side to assign blame and help us efficiently diagnose and fix the problem. It becomes difficult to find the critical bugs when they are masked by a sea of “acceptable” bugs. Take a contrary example: the simpler law “implement the standard or fix the standard” would clearly assign the blame to GMail.

Similar pain is encountered in Java’s handling of character encodings like UTF-8. It is hard to move up the stack of artificial intelligence (from words, to concepts, to ideas, to reasoning, to consciousness) when you can’t even reliably transcribe characters. When faced with bad character sequences (a common occurrence on the web) there is no practical way to get Java libraries to “mostly parse it.” Java library and framework authors seem to extract a perverse joy in throwing a program-killing exception (it does not matter if you catch it; the library has already stopped doing what you wanted) because they are concerned that a diacritical mark was not properly encoded. Web browsers, on the other hand, lose the mark or show some sort of damage near the mistake and blunder on. And here is where the frustration sets in: how can you make applications that are generous in what they accept when the libraries and frameworks are overly proud and picky? This, at first, seems like an argument for Postel’s law: if everybody else (especially the library authors) were generous in what they accepted, your life could be easy. That is certainly one possibility, but I argue it often becomes a matter of semantics to assign blame where there is no pre-existing specification or performance agreement. In the end you will waste more time dealing with errors that should never have made it to you than the time you save emitting the odd error of your own.
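The two postures can be sketched with nothing but the JDK (the byte sequence below is an invented example of a truncated UTF-8 character): the plain String constructor quietly substitutes U+FFFD for the malformed bytes and blunders on, browser-style, while a CharsetDecoder configured to REPORT throws the kind of program-killing exception described above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class DecodePostures {
    public static void main(String[] args) {
        // 0xC3 starts a two-byte UTF-8 sequence but is followed by plain ASCII 'b'
        byte[] bad = { 'a', (byte) 0xC3, 'b' };

        // Liberal: malformed input becomes U+FFFD and decoding continues
        String lenient = new String(bad, StandardCharsets.UTF_8);
        System.out.println("lenient: " + lenient); // 'a', U+FFFD, 'b'

        // Conservative: the same bytes kill the whole decode
        CharsetDecoder strict = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT);
        try {
            strict.decode(ByteBuffer.wrap(bad));
            System.out.println("strict: decoded");
        } catch (CharacterCodingException e) {
            // In practice this is a MalformedInputException naming the bad offset
            System.out.println("strict: threw " + e);
        }
    }
}
```

The complaint in the paragraph above is that so many libraries hard-wire the second posture and give the application no say in the matter.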

The unit testing people have a somewhat better idea: fail early, fail at the factory where it is cheap to fix. Don’t litter all of your code with indecisive statements like:

  Set<String> matches = computeMatches();
  if (matches != null) {
     for (String match : matches) {
         ...
     }
  }

Instead, write a unit test to document your expectation that the empty set is expressed in a single consistent way:

   Set<String> matches = computeMatches();
   assertNotNull(matches);

And from then on write more confident code:

  for (String match : computeMatches()) {
      ...
  }

This may seem overly optimistic and overly strict, but I have a point. One of the few good principles in computer science (and perhaps one of computer science’s contributions to knowledge; computers are a huge contribution to society, but they were made by engineers) is composition. A plan for getting from A to B followed by (or composed with) a plan for getting from B to C is a plan for getting from A to C. A correct plan for getting from A to B composed with a correct plan for getting from B to C is a correct plan for getting from A to C. But if each plan is merely “mostly right, as long as the piece after it is so nice as to fix up a few mistakes,” you really don’t know what you have. You may have nothing.

That is my complaint: you can’t put an a priori bound on how expensive attempting to honor both sides of Postel’s law will be. You would like others to paper over your mistakes, but it is becoming too expensive to paper over the mistakes of others. In the end Postel’s law is of little help when cleaning up the inevitable mess.

