Generalized linear models (GLMs) rest on two foundational
ideas:
(1) exponential family likelihoods, which provide a
unified mathematical structure for a wide range of data-generating
processes, and
(2) link functions, which connect the mean of the
response to a linear predictor.
This section reviews the core concepts behind exponential families, explains why canonical links play such an important role, and highlights the special role of log-concavity in both classical and Bayesian estimation. Standard references for GLM structure include (McCullagh and Nelder 1989; Nelder and Wedderburn 1972); the original formulation of GLMs in S appears in (Hastie and Pregibon 1992). For Bayesian GLMs with normal priors, envelope-based iid posterior sampling in glmbayes builds on (Nygren and Nygren 2006).
Many common statistical models belong to the exponential family, a class of distributions that can be written in the weighted form
\[ f(y \mid \theta, \phi, w) = \exp\left\{ \sum_{i=1}^{n} w_i \left[ \frac{y_i \theta_i - b(\theta_i)}{a(\phi)} + c(y_i, \phi) \right] \right\}. \]
Here:

- \(y_i\) is the response for observation \(i\),
- \(\theta_i\) is the canonical parameter, which depends on the linear predictor,
- \(b(\theta)\) is the cumulant function, whose derivatives give the mean and variance,
- \(a(\phi)\) is the dispersion function,
- \(c(y_i, \phi)\) is a normalizing term, and
- \(w_i\) are known prior weights.
This formulation includes the Gaussian, Poisson, Binomial, Gamma, and
many others.
The exponential-family form is not merely aesthetic: it guarantees several structural properties that GLMs rely on:

- the mean is \(\mu = b'(\theta)\),
- the variance is \(\operatorname{Var}(y) = a(\phi)\, b''(\theta) = a(\phi)\, V(\mu)\),
- the log-likelihood is concave in the canonical parameter \(\theta\), and
- low-dimensional sufficient statistics exist for \(\theta\).
These properties make exponential-family models computationally stable and theoretically elegant, especially when combined with linear predictors.
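As a concrete check of the mean and variance identities, here is a small base-R sketch for the Poisson case, where \(b(\theta) = e^\theta\). The value of \(\theta\), the step size, and the tolerances are illustrative choices, not anything prescribed by the text:

```r
# Poisson family: b(theta) = exp(theta), so b'(theta) = b''(theta) = exp(theta)
theta <- 1.3
b <- function(t) exp(t)

# Central-difference approximations to b'(theta) and b''(theta)
h  <- 1e-4
b1 <- (b(theta + h) - b(theta - h)) / (2 * h)
b2 <- (b(theta + h) - 2 * b(theta) + b(theta - h)) / h^2

mu <- exp(theta)                    # Poisson mean implied by theta
stopifnot(abs(b1 - mu) < 1e-3)      # mean identity:     b'(theta) = mu
stopifnot(abs(b2 - mu) < 1e-2)      # variance identity: b''(theta) = V(mu) = mu
```

The same check works for any of the families in Appendix A by swapping in the appropriate cumulant function.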
A GLM specifies a relationship between the mean \(\mu\) and a linear predictor \(\eta = X\beta\) through a link function:
\[ g(\mu) = \eta. \]
When the link function is chosen so that \(\theta = \eta\), the link is called
canonical.
Canonical links have several advantages:

- the canonical parameter coincides with the linear predictor (\(\theta = \eta\)),
- the score equations simplify to \(X^\top(y - \mu) = 0\) (up to weights),
- \(X^\top y\) is a sufficient statistic for \(\beta\), and
- the log-likelihood is concave in \(\beta\) (a key point for Section 5).

Examples:

- Gaussian: identity,
- Binomial: logit,
- Poisson: log,
- Gamma: inverse.
Although noncanonical links are sometimes preferred for interpretability or domain-specific reasons, canonical links typically yield the most stable estimation behavior.
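These canonical choices can be confirmed directly from base R's family objects; the specific values probed below are arbitrary illustrations:

```r
# Each family object records its link; the defaults are the canonical links
stopifnot(gaussian()$link == "identity")
stopifnot(binomial()$link == "logit")
stopifnot(poisson()$link  == "log")
stopifnot(Gamma()$link    == "inverse")

# For the canonical binomial link, g(mu) = log(mu / (1 - mu))
fam <- binomial()
stopifnot(abs(fam$linkfun(0.5)) < 1e-12)       # logit(0.5) = 0
stopifnot(abs(fam$linkinv(0) - 0.5) < 1e-12)   # inverse logit of 0 is 0.5
```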
A function \(f\) is log-concave if \(\log f\) is concave. Most exponential-family likelihoods with canonical links are log-concave in the linear predictor, and often in the coefficients \(\beta\) as well.
Log-concavity has several important implications:

- the likelihood (and any posterior built from it with a log-concave prior) is unimodal,
- optimization problems have no spurious local maxima, and
- tangent-based upper bounds on the log-density are valid globally.
For concave functions, the gradient exists almost everywhere, and
when it does not, a subgradient always exists.
This is crucial for:

- optimizing log-likelihoods that are not differentiable everywhere, and
- constructing tangent-based envelopes at arbitrary points.
Because GLM likelihoods are log-concave in many common cases, subgradient-based methods remain valid even at points where the log-likelihood is not differentiable.
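A quick numerical illustration: the Bernoulli log-likelihood under the canonical (logit) link, viewed as a function of the linear predictor \(\eta\), is concave, so its second differences on a grid are non-positive. The grid and tolerance here are arbitrary choices:

```r
# Bernoulli log-likelihood in eta under the canonical link:
# l(eta) = y * eta - log(1 + exp(eta))
loglik <- function(eta, y) y * eta - log(1 + exp(eta))

eta <- seq(-4, 4, by = 0.1)
ll  <- loglik(eta, y = 1)

# Second differences of a concave function are non-positive
d2 <- diff(ll, differences = 2)
stopifnot(all(d2 <= 1e-12))
```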
Concavity implies:

- every local maximum is a global maximum,
- the set of maximizers is convex (and strictly concave log-likelihoods have a unique mode), and
- simple ascent methods cannot get trapped at spurious stationary points.
This is one of the reasons GLMs are so widely used: the optimization landscape is benign.
The envelope construction approach of (Nygren and Nygren 2006) relies on:

- log-concavity of the likelihood (and hence of the posterior under a normal prior),
- the availability of (sub)gradients of the log-likelihood at chosen tangent points, and
- tangent upper bounds that dominate the log-posterior everywhere.
For GLMs with canonical links, these conditions are naturally
satisfied.
This makes GLMs an ideal setting for likelihood-subgradient densities,
mixture envelopes, and accept–reject sampling strategies.
Log-concave likelihoods interact especially well with:

- log-concave priors such as the multivariate normal, since products of log-concave densities are log-concave, and
- envelope-based accept–reject samplers of the kind used in glmbayes.
Posterior modes are unique, posterior tails behave predictably, and envelope-based samplers remain efficient even in moderate dimensions.
This conceptual foundation sets the stage for the rest of the chapter.

Section 1 introduced the exponential-family structure underlying generalized linear models.
This section explains how that structure is implemented in practice
through R’s glm function.
The goal is to show how formulas, families, and link functions work
together inside glm, and how these components are passed to the
model‑fitting machinery.
A GLM in R is defined by three user-supplied components:

- a formula, which defines the linear predictor \(\eta = X\beta\),
- a family, which identifies the response distribution, and
- a link, which connects \(\mu\) to \(\eta\) (supplied inside the family call).
Together, these components translate the theoretical framework of Section 1 into a practical modeling interface.
In R, the linear predictor is defined using a formula such as:

```r
y ~ x1 + x2 + x3
```
This corresponds to the linear predictor
\[ \eta = X \beta \]
where X is the model matrix constructed from the formula.
The formula interface automatically handles:

- the intercept term,
- dummy coding of factors,
- interaction terms (for example, x1:x2 or x1*x2), and
- inline transformations (for example, log(x1) or poly(x1, 2)).
Thus, the formula defines the systematic component of the GLM.
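The model matrix that a formula generates can be inspected directly with `model.matrix()`; the tiny data frame below is an invented example:

```r
# The formula builds X, including the intercept and dummy coding for factors
d <- data.frame(
  y  = c(2, 3, 5, 7),
  x1 = c(0.1, 0.4, 0.8, 1.2),
  x2 = factor(c("a", "b", "a", "b"))
)
X <- model.matrix(y ~ x1 + x2, data = d)

# The factor x2 is expanded into a single dummy column "x2b"
stopifnot(identical(colnames(X), c("(Intercept)", "x1", "x2b")))
stopifnot(nrow(X) == 4)
```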
The family argument in glm is not a character string. It is a function call that constructs a family object. For example:

```r
family = poisson(link = "log")
```
When glm is called, this expression is evaluated first.
The result is a structured object containing:

- the link function and its inverse (linkfun, linkinv),
- the variance function (variance),
- the deviance-residuals function (dev.resids), and
- initialization code used by the fitting routine.

glm then uses this object to:

- map the linear predictor to the mean (\(\mu = g^{-1}(\eta)\)),
- compute the IRLS working weights and working response, and
- evaluate the deviance and residuals of the fitted model.
The link is modified by passing an argument to the family
constructor.
Examples include:

```r
family = binomial(link = "probit")
family = poisson(link = "identity")
family = Gamma(link = "log")
```
In all cases, glm receives a family object with the appropriate link embedded inside it.
The binomial and quasibinomial families accept responses in three forms:

- a factor (or a 0/1 vector), where the first level denotes failure,
- a numeric vector of proportions of successes, with the number of trials supplied via weights, or
- a two-column matrix giving the counts of successes and failures.
This flexibility is unique to the binomial family and reflects its exponential‑family structure.
The quasi, quasibinomial, and quasipoisson families:

- specify only a mean–variance relationship rather than a full likelihood,
- estimate the dispersion parameter from the data, and
- therefore do not support likelihood-based quantities such as AIC.
They behave like GLMs but without a fixed likelihood.
The link function \(g(\mu) = \eta\)
determines how the mean of the response relates to the linear
predictor.
Each family has a canonical link, which ensures that:

- the canonical parameter equals the linear predictor (\(\theta = \eta\)),
- \(X^\top y\) is a sufficient statistic for \(\beta\), and
- the log-likelihood is concave in \(\beta\).

Examples of canonical links include:

- identity for gaussian,
- logit for binomial,
- log for poisson,
- inverse for Gamma, and
- \(1/\mu^2\) for inverse.gaussian.
Noncanonical links (for example, probit, cloglog, identity for
binomial) are also available.
They use the same family‑object machinery but may lose some of the
computational advantages of canonical links.
The glm function brings together the components above in a way that mirrors the theory of Section 1:

- the formula supplies the systematic component \(\eta = X\beta\),
- the family supplies the random component (the exponential-family distribution), and
- the link connects the two through \(g(\mu) = \eta\).
A single call captures the entire GLM specification in R:
```r
glm(counts ~ outcome + treatment,
    family = poisson(link = "log"))
```

This call shows all three components working together:

- **Formula**: `counts ~ outcome + treatment` tells glm how to build the model matrix X, and therefore the linear predictor \(\eta = X \beta\).
- **Family**: `poisson(...)` constructs a family object describing the Poisson distribution and its variance function \(V(\mu) = \mu\).
- **Link**: `link = "log"` is passed inside the family call, telling glm to use the log link \(g(\mu) = \log(\mu)\), with inverse \(\mu = \exp(\eta)\).
glm evaluates the family call first, constructs the family object, builds the model matrix from the formula, and then fits the model using IRLS under the Poisson log‑link specification.
This short call illustrates exactly how R combines the formula, the family, and the link to define and fit a GLM.
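The call can be run end to end using the counts data from the examples in `?glm` (Dobson 1990); the same formula reappears in the glmb.D93 example later in the chapter:

```r
# Counts data from the ?glm examples (Dobson 1990)
counts    <- c(18, 17, 15, 20, 10, 20, 25, 13, 12)
outcome   <- gl(3, 1, 9)   # factor with 3 levels, repeated
treatment <- gl(3, 3)      # factor with 3 levels, in blocks

fit <- glm(counts ~ outcome + treatment,
           family = poisson(link = "log"))

stopifnot(fit$converged)
stopifnot(length(coef(fit)) == 5)   # intercept + 2 outcome + 2 treatment dummies
stopifnot(fit$family$link == "log")
```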
The concepts above describe how GLMs are specified and fitted in
practice.
Section 3 now provides a complete reference of all families and link
functions implemented in glm, expressed in the exponential‑family form
introduced in Section 1.
Section 2 explained how formulas, families, and link functions are passed to glm() and how they define the structure of a generalized linear model.
This section provides a concise overview of the families and link
functions available in base R, and explains how they relate to the
exponential‑family framework introduced in Section 1.
A complete, fully expanded reference of all family–link combinations—including canonical parameters, cumulant functions, and variance functions—is provided in Appendix A.
The table below summarizes the families available in base R’s
glm() function and the link functions supported for each
family. Canonical links are marked with “(canonical)”.
| Family | Description | Supported Link Functions |
|---|---|---|
| `gaussian()` | Continuous outcomes with constant variance | identity (canonical), log, inverse |
| `binomial()` | Binary data or proportions | logit (canonical), probit, cauchit, cloglog, identity |
| `poisson()` | Count data | log (canonical), identity, sqrt |
| `Gamma()` | Positive continuous, skewed | inverse (canonical), identity, log |
| `inverse.gaussian()` | Positive continuous with heavy tails | \(1/\mu^2\) (canonical), identity, log, inverse |
| `quasi()` | User-specified mean-variance relationship | Depends on user specification |
| `quasibinomial()` | Overdispersed binomial | Same as `binomial()` |
| `quasipoisson()` | Overdispersed Poisson | Same as `poisson()` |

The quasi families (quasi, quasibinomial, quasipoisson) do not correspond to true exponential-family distributions but retain the GLM mean-variance structure.

Each family in glm() corresponds to a specific exponential-family distribution of the form
\[ f(y \mid \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\}. \]
The choice of family determines:

- the cumulant function \(b(\theta)\),
- the variance function \(V(\mu)\), and
- the normalizing term \(c(y, \phi)\).

The choice of link function determines:

- the mapping \(g(\mu) = \eta\) between the mean and the linear predictor,
- the inverse mapping \(\mu = g^{-1}(\eta)\), and
- whether the canonical parameter coincides with \(\eta\).

Canonical links (e.g., logit for binomial, log for Poisson) often yield:

- simpler score equations,
- log-likelihoods that are concave in \(\beta\), and
- more stable IRLS iterations.
Noncanonical links remain valid but may alter curvature and convergence properties.
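The practical difference can be seen by fitting the same simulated binary data with a canonical and a noncanonical link (the data-generating values here are invented for illustration):

```r
# Same binary model under the canonical (logit) and noncanonical (probit) links
set.seed(42)
x <- rnorm(200)
y <- rbinom(200, 1, plogis(0.5 + 1.2 * x))

fit_logit  <- glm(y ~ x, family = binomial(link = "logit"))
fit_probit <- glm(y ~ x, family = binomial(link = "probit"))

stopifnot(fit_logit$converged, fit_probit$converged)

# Both links are valid, but coefficients live on different scales:
# logit coefficients are roughly 1.6-1.8 times the probit ones
ratio <- coef(fit_logit)["x"] / coef(fit_probit)["x"]
stopifnot(ratio > 1.2, ratio < 2.2)
```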
The full exponential-family expansions for every family–link combination, including:

- the canonical parameter \(\theta(\mu)\),
- the cumulant function \(b(\theta)\),
- the variance function \(V(\mu)\), and
- the link and inverse-link pair,
are provided in Appendix A.
This keeps the main text focused on concepts while still giving advanced users access to the complete mathematical reference.
The Bayesian extensions used in glmb() and Prior_Setup() rely on the same structure:

- the same family objects and links specify the likelihood,
- log-concavity of the log-likelihood carries over to the posterior under a normal prior, and
- the exponential-family form makes envelope-based sampling tractable.
In short, the classical GLM framework provides the mathematical and computational foundation on which the Bayesian methods in later chapters are built.
The function glmb() is a Bayesian extension of the classical glm() function.
Its interface mirrors glm() as closely as possible: users
specify a model using a formula, choose a likelihood family, and then
supply a prior distribution through the pfamily
argument.
This design preserves the familiar GLM workflow while enabling full
Bayesian inference.
The setup for glmb() follows the same structure as glm():

- a formula defines the linear predictor,
- a family defines the likelihood and link, and
- the additional pfamily argument defines the prior.

Fitted model objects inherit from the classes "glm" and "lm". This compatibility ensures that standard generics—summary(), predict(), residuals(), extractAIC(), and others—work naturally with glmb objects.
## The pfamily Argument: Specifying Priors

The key addition in glmb() is the required pfamily argument, which specifies the prior distribution for the regression coefficients.

The pfamily system parallels how glm() uses family:

- family → likelihood and link
- pfamily → prior distribution and hyperparameters

The default prior is a multivariate normal:

```r
pfamily = dNormal(mu, Sigma)
```
The helper function Prior_Setup() constructs sensible
defaults for mu and Sigma, using a
reparameterized form of Zellner’s g-prior.
Users may also fully customize the prior.
Supported prior families include the multivariate normal, dNormal(mu, Sigma); see the pfamily documentation for the complete list.

glmb() currently supports the most commonly used glm() likelihood families, including the gaussian, binomial, poisson, and Gamma families.
## glmb() with Formulas, Families, and Priors

Just as a classical GLM is defined by a formula and a likelihood family:

```r
glm(counts ~ outcome + treatment,
    family = poisson(link = "log"))
```
a Bayesian GLM adds one additional component: the prior family.
A typical workflow uses Prior_Setup() to construct prior parameters:

```r
ps <- Prior_Setup(counts ~ outcome + treatment,
                  family = poisson(link = "log"))
mu <- ps$mu
V  <- ps$Sigma
```
The corresponding Bayesian call mirrors the classical one:

```r
glmb.D93 <- glmb(counts ~ outcome + treatment,
                 family = poisson(link = "log"),
                 pfamily = dNormal(mu = mu, Sigma = V))
```
This single call shows how glmb() receives:

- the formula, which defines the model structure,
- the family, which defines the likelihood and link, and
- the pfamily, which defines the prior distribution.
The result is a set of independent posterior draws for coefficients, fitted values, linear predictors, and deviance, along with posterior summaries such as the posterior mode and DIC.
For any supported combination of likelihood family, link, and prior
family, glmb() generates independent draws
from the posterior distribution—no MCMC chains are required.
For non-Gaussian models, glmb() uses an accept–reject sampler based on likelihood-subgradient envelopes.

By default, n = 1000 posterior draws are generated.

A glmb object contains:

- a matrix of independent posterior draws for the coefficients, together with draws of fitted values, linear predictors, and deviance,
- posterior deviance summaries (pD, Dbar, Dthetabar, DIC),
- the components of a classical "glm" fit, and
- the number of candidate draws generated per accepted draw (iters).

Because glmb inherits from "glm" and "lm", most classical methods apply directly.
The Bayesian methods implemented in glmb() rely on the
exponential‑family structure described in Sections 1–3.
This section explains, at a high level, why the posterior distribution
is well‑behaved for generalized linear models and how
glmb() exploits this structure to generate independent
posterior draws.
For the likelihood families supported by glmb(), the
log‑likelihood is concave in the canonical
parameter.
When combined with a log‑concave prior (such as the multivariate normal
used by default), the posterior density is also log‑concave.
This ensures:

- a unique posterior mode,
- predictable, well-behaved posterior tails, and
- valid global envelope constructions for sampling.
Canonical links (e.g., logit for binomial, log for Poisson) preserve concavity and are therefore especially convenient.
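A one-dimensional numerical sketch illustrates why: the Poisson log-likelihood under the canonical log link is concave in \(\beta\), and adding a concave normal log-prior preserves concavity. The data and the N(0, 4) prior below are invented for illustration:

```r
# Log-posterior for a 1-D Poisson regression with canonical log link
# plus a hypothetical N(0, 4) prior on beta
y <- c(2, 1, 4, 3)
x <- c(0.5, -0.2, 1.0, 0.3)

log_post <- function(beta) {
  eta <- beta * x
  sum(y * eta - exp(eta)) - beta^2 / (2 * 4)
}

# Second differences on a grid are non-positive: the log-posterior is concave
b  <- seq(-2, 2, by = 0.05)
lp <- sapply(b, log_post)
stopifnot(all(diff(lp, differences = 2) <= 1e-10))
```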
For non‑Gaussian models, glmb() uses an accept–reject
sampler based on likelihood‑subgradient
envelopes.
The idea is to build a tight, convex upper bound on the negative
log‑likelihood using tangent points.
This envelope:

- dominates the posterior density everywhere,
- can be sampled from directly, and
- yields exact, independent posterior draws via accept–reject.
The Gridtype and n_envopt arguments control
how many tangent points are used, trading off envelope tightness against
construction cost.
The component iters in the returned object reports how many
candidate draws were generated before acceptance.
Posterior sampling in glmb() proceeds in two stages:

1. Mode finding: a classical GLM fit is used as the starting point, and the posterior mode is obtained by optimizing the sum of the log-likelihood and log-prior.
2. Independent sampling: an envelope built around the mode generates candidate draws, which are accepted or rejected until the requested number of posterior draws is obtained.

Because the sampler produces independent draws, there are no chains, no burn-in, and no convergence diagnostics. This makes posterior summaries straightforward and computationally efficient.
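A minimal accept–reject sketch in base R shows the general pattern of the sampling stage: independent draws, no chains, no burn-in. The standard normal target and Laplace envelope are stand-ins for illustration, not the densities glmb() actually uses:

```r
# Accept-reject for N(0,1) using a Laplace(0,1) envelope.
# The ratio of densities is bounded, and the acceptance probability
# exp(-x^2/2 + |x| - 1/2) = exp(-(|x|-1)^2/2) is always <= 1.
set.seed(1)
n_draws <- 2000
draws <- numeric(0)
while (length(draws) < n_draws) {
  u <- runif(n_draws)
  # propose from Laplace(0, 1): random sign times an exponential
  x <- ifelse(runif(n_draws) < 0.5, 1, -1) * rexp(n_draws)
  keep  <- u < exp(-x^2 / 2 + abs(x) - 0.5)
  draws <- c(draws, x[keep])
}
draws <- draws[seq_len(n_draws)]   # keep exactly n_draws accepted values

# Independent draws from the target: no burn-in or convergence checks needed
stopifnot(abs(mean(draws)) < 0.1, abs(sd(draws) - 1) < 0.1)
```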
The computational methods in glmb() leverage the
structure of exponential‑family likelihoods and log‑concave priors to
produce fast, reliable Bayesian inference for generalized linear
models.
These methods ensure that the Bayesian extension behaves predictably
across all supported families and links, while maintaining a workflow
that closely parallels the classical glm() function.
## Appendix A: glm() Families and Links

This appendix provides a complete reference for the exponential-family forms associated with every family–link combination implemented in base R's glm() function.
These tables expand the conceptual overview in Section 3 by listing:

- the link \(g(\mu)=\eta\) and inverse link \(\mu=g^{-1}(\eta)\),
- the canonical parameter \(\theta\),
- the cumulant function \(b(\theta)\), and
- the variance function \(V(\mu)\).
All expressions follow the exponential‑family form introduced in Section 1:
\[ f(y \mid \theta, \phi) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y,\phi) \right\}. \]
| Family | Link | \(g(\mu)=\eta\) | \(\mu=g^{-1}(\eta)\) | \(\theta\) | \(b(\theta)\) | \(V(\mu)\) |
|---|---|---|---|---|---|---|
| Gaussian | identity (canonical) | \(\eta=\mu\) | \(\mu=\eta\) | \(\theta=\mu\) | \(\theta^2/2\) | \(1\) |
| Gaussian | log | \(\eta=\log(\mu)\) | \(\mu=e^\eta\) | \(\theta=\mu\) | \(\theta^2/2\) | \(1\) |
| Gaussian | inverse | \(\eta=1/\mu\) | \(\mu=1/\eta\) | \(\theta=\mu\) | \(\theta^2/2\) | \(1\) |
Exponential-family core (binomial):
\[ b(\theta)=\log(1+e^\theta), \quad \mu=\frac{e^\theta}{1+e^\theta}, \quad V(\mu)=\mu(1-\mu). \]
| Family | Link | \(g(\mu)=\eta\) | \(\mu=g^{-1}(\eta)\) | \(\theta\) | \(b(\theta)\) | \(V(\mu)\) |
|---|---|---|---|---|---|---|
| Binomial | logit (canonical) | \(\eta=\log(\mu/(1-\mu))\) | \(\mu=\frac{e^\eta}{1+e^\eta}\) | \(\theta=\eta\) | \(\log(1+e^\theta)\) | \(\mu(1-\mu)\) |
| Binomial | probit | \(\eta=\Phi^{-1}(\mu)\) | \(\mu=\Phi(\eta)\) | \(\theta=\log(\mu/(1-\mu))\) | same | same |
| Binomial | cloglog | \(\eta=\log[-\log(1-\mu)]\) | \(\mu=1-e^{-e^\eta}\) | \(\theta=\log(\mu/(1-\mu))\) | same | same |
| Binomial | cauchit | \(\eta=\tan\{\pi(\mu-1/2)\}\) | \(\mu=\frac{1}{2}+\frac{1}{\pi}\arctan(\eta)\) | \(\theta=\log(\mu/(1-\mu))\) | same | same |
| Binomial | identity | \(\eta=\mu\) | \(\mu=\eta\) | \(\theta=\log(\mu/(1-\mu))\) | same | same |
Exponential-family core (Poisson):

\[ b(\theta)=e^\theta, \quad \mu=e^\theta, \quad V(\mu)=\mu. \]
| Family | Link | \(g(\mu)=\eta\) | \(\mu=g^{-1}(\eta)\) | \(\theta\) | \(b(\theta)\) | \(V(\mu)\) |
|---|---|---|---|---|---|---|
| Poisson | log (canonical) | \(\eta=\log(\mu)\) | \(\mu=e^\eta\) | \(\theta=\eta\) | \(e^\theta\) | \(\mu\) |
| Poisson | identity | \(\eta=\mu\) | \(\mu=\eta\) | \(\theta=\log(\mu)\) | same | same |
| Poisson | sqrt | \(\eta=\sqrt{\mu}\) | \(\mu=\eta^2\) | \(\theta=\log(\mu)\) | same | same |
Exponential-family core (Gamma):

\[ b(\theta)=-\log(-\theta), \quad \mu=-1/\theta, \quad V(\mu)=\mu^2. \]
| Family | Link | \(g(\mu)=\eta\) | \(\mu=g^{-1}(\eta)\) | \(\theta\) | \(b(\theta)\) | \(V(\mu)\) |
|---|---|---|---|---|---|---|
| Gamma | inverse (canonical) | \(\eta=1/\mu\) | \(\mu=1/\eta\) | \(\theta=-1/\mu\) | \(-\log(-\theta)\) | \(\mu^2\) |
| Gamma | identity | \(\eta=\mu\) | \(\mu=\eta\) | \(\theta=-1/\mu\) | same | same |
| Gamma | log | \(\eta=\log(\mu)\) | \(\mu=e^\eta\) | \(\theta=-1/\mu\) | same | same |
Exponential-family core (inverse Gaussian):

\[ b(\theta)=\sqrt{-2\theta}, \quad \mu=\frac{1}{\sqrt{-2\theta}}, \quad V(\mu)=\mu^3. \]
| Family | Link | \(g(\mu)=\eta\) | \(\mu=g^{-1}(\eta)\) | \(\theta\) | \(b(\theta)\) | \(V(\mu)\) |
|---|---|---|---|---|---|---|
| Inverse Gaussian | \(1/\mu^2\) (canonical) | \(\eta=1/\mu^2\) | \(\mu=1/\sqrt{\eta}\) | \(\theta=-1/(2\mu^2)\) | \(\sqrt{-2\theta}\) | \(\mu^3\) |
| Inverse Gaussian | identity | \(\eta=\mu\) | \(\mu=\eta\) | \(\theta=-1/(2\mu^2)\) | same | same |
| Inverse Gaussian | log | \(\eta=\log(\mu)\) | \(\mu=e^\eta\) | \(\theta=-1/(2\mu^2)\) | same | same |
| Inverse Gaussian | inverse | \(\eta=1/\mu\) | \(\mu=1/\eta\) | \(\theta=-1/(2\mu^2)\) | same | same |
The quasi families are not true exponential families, but glm() supports them:
| Family | Link Functions | Variance Function |
|---|---|---|
| quasi | any link | user‑specified |
| quasibinomial | same as binomial | \(\mu(1-\mu)\) up to scale |
| quasipoisson | same as poisson | \(\mu\) up to scale |