Poisson regression is the canonical GLM for modeling count data (McCullagh and Nelder 1989).
The Poisson likelihood is log‑concave under the log link, making it an
ideal setting for envelope‑based Bayesian sampling (Nygren and Nygren 2006).
This chapter mirrors the structure used for the Binomial vignette: we
begin with the classical glm() fit, then introduce the Bayesian glmb()
version, discuss prior specification, and compare posterior summaries to
their classical counterparts.
We use the well‑known randomized controlled trial dataset of Dobson (1990), which appears in many GLM
textbooks and is included in the base R documentation for glm().
The dataset contains counts of subjects classified by treatment group
and outcome category.
Poisson regression models count data of the form
\[ Y_i \in \{0,1,2,\ldots\}, \qquad Y_i \sim \text{Poisson}(\mu_i), \]
where \(\mu_i > 0\) is the
expected count.
Observation weights \(w_i\) (e.g.,
exposure times) enter the likelihood through
\[ Y_i \sim \text{Poisson}(w_i \mu_i). \]
The mean is linked to a linear predictor via
\[ \eta_i = x_i^\top \beta, \qquad \mu_i = g^{-1}(\eta_i). \]
Up to constants, the weighted log‑likelihood is
\[ \ell(\beta) = \sum_{i=1}^n \left[ w_i y_i \log(\mu_i) - w_i \mu_i \right]. \]
Under the log link, the mean is
\[ \mu_i = e^{\eta_i}, \]
and the log‑likelihood simplifies to
\[ \ell(\beta) = \sum_{i=1}^n \left[ w_i y_i \eta_i - w_i e^{\eta_i} \right]. \]
This is the form used by both glm() and the Bayesian
functions glmb() and rglmb().
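The simplification above can be checked numerically in base R: the kernel \(\sum_i w_i (y_i \eta_i - e^{\eta_i})\) differs from the exact weighted Poisson log‑likelihood only by a term that is constant in \(\beta\). The values below are illustrative, not from the Dobson data:

```r
## Verify that the simplified log-link kernel matches the exact
## Poisson log-likelihood up to an additive constant in beta.
y   <- c(2, 0, 5)          # illustrative counts
w   <- c(1, 1, 2)          # observation weights (e.g. exposures)
eta <- c(0.1, -0.3, 0.8)   # linear predictor x_i' beta
mu  <- exp(eta)            # log link: mu = exp(eta)

ll_kernel <- sum(w * (y * eta - mu))                   # simplified form
ll_exact  <- sum(w * dpois(y, lambda = mu, log = TRUE)) # full log-likelihood

## The two differ only by sum(w * lfactorial(y)), which does not involve beta:
all.equal(ll_kernel - ll_exact, sum(w * lfactorial(y)))
```

Because the difference does not depend on \(\beta\), both forms yield the same maximizer and the same posterior shape.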
The Poisson distribution belongs to the exponential family with canonical parameter
\[ \theta_i = \log(\mu_i), \]
and cumulant function
\[ b(\theta_i) = e^{\theta_i}. \]
The variance is
\[ \mathrm{Var}(Y_i) = \mu_i. \]
Under the log link, the linear predictor equals the canonical parameter:
\[ \eta_i = \theta_i. \]
This identity ensures that the Poisson log‑likelihood is globally log‑concave in the linear predictor, which is ideal for envelope construction and accept–reject sampling in glmbayes.
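The concavity claim follows directly from the derivatives of the log‑likelihood with respect to the linear predictor:
\[ \frac{\partial \ell}{\partial \eta_i} = w_i \left( y_i - e^{\eta_i} \right), \qquad \frac{\partial^2 \ell}{\partial \eta_i^2} = -\, w_i e^{\eta_i} < 0, \]
so each term of \(\ell\) is strictly concave in \(\eta_i\), and \(\ell(\beta)\) is concave in \(\beta\) because \(\eta_i = x_i^\top \beta\) is linear.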
The canonical link for the Poisson family is the log link:
\[ g(\mu_i) = \log(\mu_i), \qquad \mu_i = e^{\eta_i}. \]
Two properties make the log link attractive: it guarantees \(\mu_i > 0\) for any real \(\eta_i\), and it makes the log‑likelihood concave in \(\beta\). Noncanonical links (identity, square‑root) do not
preserve log‑concavity and are not supported in glmb().
With the log link, the weighted log‑likelihood is
\[ \ell(\beta) = \sum_{i=1}^n \left[ w_i y_i \eta_i - w_i e^{\eta_i} \right]. \]
A multivariate normal prior is specified as
\[ \log p(\beta) = -\tfrac{1}{2}(\beta - \mu_0)^\top \Sigma_0^{-1} (\beta - \mu_0) + \text{const}. \]
Combining likelihood and prior yields the log‑posterior:
\[ \log p(\beta \mid y) = \sum_{i=1}^n \left[ w_i y_i \eta_i - w_i e^{\eta_i} \right] - \tfrac{1}{2}(\beta - \mu_0)^\top \Sigma_0^{-1} (\beta - \mu_0) + \text{const}. \]
Because both terms are concave in \(\beta\), the posterior is log‑concave, enabling efficient iid sampling via the envelope‑based accept–reject sampler implemented in glmbayes.
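In terms of the design matrix, the Hessian of the log‑posterior makes this concavity explicit:
\[ \nabla^2_\beta \log p(\beta \mid y) = -\sum_{i=1}^n w_i e^{\eta_i} \, x_i x_i^\top \;-\; \Sigma_0^{-1}, \]
which is negative definite at every \(\beta\) because \(\Sigma_0^{-1}\) is positive definite. The posterior therefore has a unique mode and is globally log‑concave.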
We reproduce Dobson’s example exactly.
The data consist of 9 observations: three outcome levels crossed with
three treatment groups.
```r
## Dobson (1990) Page 93: Randomized Controlled Trial :
set.seed(333)
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
print(d.AD <- data.frame(treatment, outcome, counts))
#>   treatment outcome counts
#> 1         1       1     18
#> 2         1       2     17
#> 3         1       3     15
#> 4         2       1     20
#> 5         2       2     10
#> 6         2       3     20
#> 7         3       1     25
#> 8         3       2     13
#> 9         3       3     12
```

The table shows the observed counts for each treatment–outcome combination.
The classical glm() call uses the Poisson family with the canonical log link:
```r
glm.D93 <- glm(counts ~ outcome + treatment,
               family = poisson(link = "log"))
summary(glm.D93)
#> 
#> Call:
#> glm(formula = counts ~ outcome + treatment, family = poisson(link = "log"))
#> 
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)  3.045e+00  1.709e-01  17.815   <2e-16 ***
#> outcome2    -4.543e-01  2.022e-01  -2.247   0.0246 *  
#> outcome3    -2.930e-01  1.927e-01  -1.520   0.1285    
#> treatment2   1.217e-15  2.000e-01   0.000   1.0000    
#> treatment3   8.438e-16  2.000e-01   0.000   1.0000    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for poisson family taken to be 1)
#> 
#>     Null deviance: 10.5814  on 8  degrees of freedom
#> Residual deviance:  5.1291  on 4  degrees of freedom
#> AIC: 56.761
#> 
#> Number of Fisher Scoring iterations: 4
```

Several features of this summary are worth noting.
The coefficients represent multiplicative effects on the expected
count.
For example, \(\exp(\beta)\) gives the
rate ratio associated with a one-unit change in a predictor.
The residual deviance (approximately 5.13) indicates a good fit relative to the null deviance (approximately 10.58), consistent with Dobson’s original analysis.
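As a concrete illustration of the rate‑ratio interpretation, the fitted coefficients can be exponentiated. The sketch below refits the model from the data above so it is self‑contained; confint.default() gives the Wald (normal‑approximation) intervals:

```r
# Refit Dobson's model and convert coefficients to rate ratios
counts    <- c(18,17,15,20,10,20,25,13,12)
outcome   <- gl(3,1,9)
treatment <- gl(3,3)
fit <- glm(counts ~ outcome + treatment, family = poisson(link = "log"))

# exp(coef) are multiplicative effects on the expected count;
# e.g. outcome2: exp(-0.4543) is about 0.635, a ~36% lower expected count
round(exp(cbind(RateRatio = coef(fit), confint.default(fit))), 3)
```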
To fit the Bayesian analogue, we must specify a prior distribution
for the regression coefficients.
The recommended workflow uses Prior_Setup(), which constructs a
Zellner‑type g‑prior aligned with the model matrix.
```r
ps <- Prior_Setup(counts ~ outcome + treatment,
                  family = poisson())
#> Using default pwt = 0.01 (low-d default).
mu <- ps$mu
V  <- ps$Sigma
```

With the default pwt = 0.01, the resulting prior is weakly informative.
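The fitted object glmb.D93 examined next is produced by a glmb() call that mirrors the classical one, with the multivariate normal prior supplied through the pfamily argument. The call below is taken verbatim from the Call field of the printed output:

```r
# Bayesian analogue of glm.D93: same formula and family,
# plus a normal prior with mean mu and covariance V
glmb.D93 <- glmb(counts ~ outcome + treatment,
                 family = poisson(),
                 pfamily = dNormal(mu = mu, Sigma = V))
```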
```r
print(glmb.D93)
#> 
#> Call:  glmb(formula = counts ~ outcome + treatment, family = poisson(), 
#>     pfamily = dNormal(mu = mu, Sigma = V))
#> 
#> Posterior Mean Coefficients:
#> (Intercept)     outcome2     outcome3   treatment2   treatment3  
#>    3.026184    -0.447186    -0.296234     0.005697    -0.003414  
#> 
#> Effective Number of Parameters: 4.943835
#> Expected Residual Deviance: 10.11906
#> DIC: 56.69507
```
```r
summary(glmb.D93)
#> Call
#> glmb(formula = counts ~ outcome + treatment, family = poisson(), 
#>     pfamily = dNormal(mu = mu, Sigma = V))
#> 
#> Expected Residuals:
#>        Min         1Q     Median         3Q        Max  
#> -0.9604284 -0.6108847 -0.1084108  0.9280288  1.0914552  
#> 
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#> 
#>             Null Mode  Prior Mean  Prior.sd  Max Like.  Like.sd
#> (Intercept) 2.813e+00   2.813e+00 1.700e+00  3.045e+00    0.171
#> outcome2    0.000e+00   0.000e+00 2.012e+00 -4.543e-01    0.202
#> outcome3    0.000e+00   0.000e+00 1.918e+00 -2.930e-01    0.193
#> treatment2  0.000e+00   0.000e+00 1.990e+00  1.217e-15    0.200
#> treatment3  0.000e+00   0.000e+00 1.990e+00  8.438e-16    0.200
#> 
#> Bayesian Estimates Based on 1000 iid draws
#> 
#>              Post.Mode  Post.Mean   Post.Sd  MC Error Pr(Null_tail) SE(tail)
#> (Intercept)  3.042e+00  3.026e+00 1.679e-01 5.309e-03     9.200e-02    0.009
#> outcome2    -4.497e-01 -4.472e-01 1.989e-01 6.290e-03     1.400e-02    0.004
#> outcome3    -2.901e-01 -2.962e-01 1.878e-01 5.938e-03     5.900e-02    0.007
#> treatment2  -1.395e-05  5.697e-03 2.098e-01 6.634e-03     4.850e-01    0.016
#> treatment3   1.209e-06 -3.414e-03 1.989e-01 6.290e-03     4.940e-01    0.016
#>             Pr(Prior_tail)    
#> (Intercept)          0.092 .  
#> outcome2             0.014 *  
#> outcome3             0.059 .  
#> treatment2           0.485    
#> treatment3           0.494    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Directional Tail Summaries:
#> 
#>                Metric vs Null vs Prior
#>  Mahalanobis Distance 2.3308  2.3308
#>      Tail Probability 0.0120  0.0120
#> [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#> 
#> 
#> Distribution Percentiles
#> 
#>                  1.0%      2.5%      5.0%    Median     95.0%     97.5%  99.0%
#> (Intercept)  2.647232  2.697381  2.752295  3.020564  3.306623  3.358232  3.411
#> outcome2    -0.877269 -0.827067 -0.765187 -0.452620 -0.108355 -0.049461  0.022
#> outcome3    -0.703049 -0.654677 -0.601365 -0.291129  0.017370  0.069558  0.117
#> treatment2  -0.475433 -0.401161 -0.346093  0.010560  0.330793  0.393848  0.489
#> treatment3  -0.465001 -0.411936 -0.332202  0.003298  0.314890  0.371910  0.473
#> 
#> Effective Number of Parameters: 4.943835
#> Expected Residual Deviance: 10.11906
#> DIC: 56.69507
#> 
#> Expected Mean dispersion: 1
#> Sq.root of Expected Mean dispersion: 1
#> 
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.772
```

Posterior means are close to the classical MLEs because the prior is weakly informative, so the likelihood dominates the posterior.
The pD value is typically slightly below the nominal number of coefficients, reflecting mild shrinkage from the prior.
DIC is approximately 56.70, very close to the classical AIC value of
56.76.
This agreement is expected when the prior is weak and the posterior
distribution is close to Gaussian.
| Quantity | Classical glm | Bayesian glmb |
|---|---|---|
| Point estimate | MLE | Posterior mean |
| Uncertainty | SE | Posterior SD |
| Model fit | Deviance, AIC | \(\bar{D}\), \(p_D\), DIC |
| Interpretation | log-rate ratios | same, with shrinkage |
The Bayesian model provides richer diagnostics (tail probabilities, posterior quantiles) while preserving the familiar GLM structure.
Posterior percentiles (2.5%, 50%, 97.5%) provide Bayesian credible
intervals for each coefficient.
These intervals closely track the classical Wald intervals because the
posterior is nearly normal.
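For example, the Wald interval for outcome2 can be reproduced from the estimate and standard error reported in the glm() summary above:

```r
# Wald 95% interval for outcome2: estimate -0.4543, SE 0.2022
# (values copied from summary(glm.D93))
est <- -0.4543
se  <- 0.2022
wald <- est + c(-1, 1) * qnorm(0.975) * se
round(wald, 3)
# close to the posterior 2.5%/97.5% quantiles (-0.827, -0.049)
# from the glmb summary
```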
This vignette demonstrates that Bayesian Poisson regression via glmb() with a weakly informative prior closely reproduces the classical glm() analysis, while adding iid posterior samples, credible intervals, and tail-probability diagnostics.
In later chapters, we extend these ideas to other GLM families and to more informative prior specifications.