The amazing, all-encompassing generalized linear models


Takeaways:

More & special cases: probit models for binary outcomes extend to categorical outcomes (# categories > 2, via multinomial probit) and to ordinal outcomes (ranked categories, via ordered probit).
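To make the ordinal case concrete, here is a minimal sketch of how an ordered probit turns one linear predictor into a probability for each ranked category. It only uses the standard library. The function name `ordered_probit_probs`, the value of `eta`, and the cutpoints are all made up for illustration, and nothing is being fit here. The unordered (multinomial) case is not covered.

```python
from statistics import NormalDist


def ordered_probit_probs(eta, cutpoints):
    """Category probabilities for an ordered probit model.

    eta       : linear predictor x'beta (hypothetical value, not estimated)
    cutpoints : increasing thresholds c_1 < ... < c_{K-1}

    The model assumes P(y <= k) = Phi(c_k - eta), where Phi is the
    standard normal CDF, so each category gets the difference of two
    neighbouring cumulative probabilities.
    """
    phi = NormalDist().cdf
    cumulative = [phi(c - eta) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)
        previous = value
    return probs


# One cutpoint at 0 gives two categories: this is the plain binary probit,
# with P(y = 1) = 1 - Phi(-eta) = Phi(eta).
binary = ordered_probit_probs(eta=0.3, cutpoints=[0.0])

# Two cutpoints give three ranked categories (e.g. low / medium / high).
three = ordered_probit_probs(eta=0.3, cutpoints=[-0.5, 1.0])

print(binary)
print(three)
```

Adding more cutpoints adds more ranked categories while keeping a single set of coefficients, which is what makes the ordinal model a natural extension of the binary one.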

A Gamma family works for a proportional count response variable (like election results): see this [helpful IPython notebook from statsmodels][3] (scroll down).

Funky models: tobit, censored, and truncated regression are for responses that are continuous but whose range is restricted (values are cut off or simply not observed beyond some limit), so the usual normal-error assumption does not hold.
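To see why a restricted range changes the model, here is a minimal standard-library sketch of the tobit log-likelihood for a response censored from below. The function name `tobit_loglik`, the toy data, and the censoring limit of 0 are assumptions for illustration. It covers only the censored (tobit) case; a truncated regression would use a different likelihood.

```python
import math
from statistics import NormalDist


def tobit_loglik(y, mu, sigma, lower=0.0):
    """Log-likelihood of a normal model left-censored at `lower`.

    y     : observed responses (values at `lower` are treated as censored)
    mu    : fitted means x'beta, one per observation (hypothetical values)
    sigma : error standard deviation
    """
    std = NormalDist()
    total = 0.0
    for yi, mi in zip(y, mu):
        if yi <= lower:
            # Censored: we only know the latent value fell at or below the limit.
            total += math.log(std.cdf((lower - mi) / sigma))
        else:
            # Uncensored: ordinary normal density.
            total += math.log(std.pdf((yi - mi) / sigma) / sigma)
    return total


# Toy data: two observations piled up at the limit, two observed normally.
y = [0.0, 0.0, 1.2, 2.5]
mu = [-0.5, 0.2, 1.0, 2.0]

print(tobit_loglik(y, mu, sigma=1.0))
```

The censored observations contribute a probability (the chance of landing at or below the limit) instead of a density, which is exactly the piece that ordinary least squares ignores.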

Handy blog post for doing GzLM in statsmodels: http://slendermeans.org/ml4h-ch2-p2.html

Further reading:

  1. This reading has a good narrative and is relatively easy to follow: glm.pdf (90.2 KB)

  2. This one is denser (though probably the gentlest lecture in the whole series). Lecture notes from Princeton CS's [Introduction to Probabilistic Modeling][1] class: generalized_linear_model.pdf (146.6 KB)

  3. Murphy ([Machine Learning: a Probabilistic Perspective][2]), ch 9 (p283- )