Nina Zumel recently gave a very clear explanation of logistic regression (The Simpler Derivation of Logistic Regression). In particular she called out the central role of log-odds ratios and demonstrated how the “deviance” (that mysterious quantity reported by fitting packages) is both a term in “the pseudo-R^2” (so it directly measures goodness of fit) and the quantity that is actually optimized during the fitting procedure. One great point of the writeup was how simple everything is once you start thinking in terms of derivatives (and that what is special is not so much the functional form of the sigmoid as its relation to its own derivative).
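To make the remark about the sigmoid concrete: the relation in question is that the sigmoid s(x) = 1/(1 + e^(-x)) satisfies s'(x) = s(x)(1 - s(x)). The short Python sketch below checks this numerically. It is only an illustration; the function names (`sigmoid`, `sigmoid_slope`, `numeric_derivative`) and the sample points are our own choices and do not come from either writeup.

```python
import math


def sigmoid(x: float) -> float:
    """The logistic sigmoid 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_slope(x: float) -> float:
    """Derivative of the sigmoid written in terms of the sigmoid itself."""
    s = sigmoid(x)
    return s * (1.0 - s)


def numeric_derivative(f, x: float, h: float = 1e-6) -> float:
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    # Compare the closed-form slope with a finite-difference estimate.
    for x in (-2.0, 0.0, 1.5):
        exact = sigmoid_slope(x)
        approx = numeric_derivative(sigmoid, x)
        print(f"x={x:+.1f}  s(x)(1-s(x))={exact:.6f}  finite difference={approx:.6f}")
```

The two columns should agree to within finite-difference error, which is the sense in which the derivative is "free" once the sigmoid itself has been computed.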
We adapt these presentation ideas to make explicit the well-known equivalence of logistic regression and maximum entropy models. In our new writeup, The equivalence of logistic regression and maximum entropy models, we move to multi-category modeling and demonstrate how one invents something as remarkable as logistic regression.