Chapter 07: Models for the Binomial Family

library(glmbayes)

1. Introductory Discussion

Binomial generalized linear models (GLMs) are used when the response represents binary outcomes (success/failure) or proportions (successes out of trials). They are among the most widely used GLMs in applied statistics.

Binomial regression is a standard generalized linear model (Nelder and Wedderburn 1972; McCullagh and Nelder 1989; Agresti 2015).

In classical statistics, these models are fit using:

glm(..., family = binomial(link = ...))

In glmbayes, the Bayesian analogue is:

glmb(..., family = binomial(link = ...), pfamily = dNormal(mu, Sigma))

This chapter introduces:

  1. the structure of binomial GLMs
  2. the available link functions (logit, probit, cloglog)
  3. how to specify these models in glmbayes
  4. worked examples for each link function

We build on the foundations from Chapters 05 and 06, especially the role of link functions, log‑concavity, and prior specification.

2. Binomial Likelihood and Weighted Formulation

Binomial data arise in several equivalent representations:

  1. a 0/1 (Bernoulli) response, with one row per trial
  2. a proportion of successes, with the number of trials supplied as weights
  3. a two-column matrix of successes and failures

In all cases, the underlying sampling model is

\[ Y_i \sim \text{Binomial}(n_i, \mu_i), \qquad 0 < \mu_i < 1, \]

where:

  - \(n_i\) is the number of trials,
  - \(\mu_i\) is the per-trial success probability.
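
To make the equivalence of these representations concrete, the following base-R sketch (simulated data and variable names of our own) fits the same logistic regression three ways with glm(); all three yield identical coefficient estimates:

set.seed(1)

## Grouped binomial data: n trials at each covariate value
x    <- seq(-2, 2, length.out = 20)
n    <- rep(30, 20)
prob <- plogis(-0.5 + 1.2 * x)
y    <- rbinom(20, size = n, prob = prob)   # successes out of n trials
prop <- y / n

## (1) one row per trial, 0/1 response
long <- data.frame(
  x = rep(x, times = n),
  z = unlist(mapply(function(yi, ni) rep(c(1, 0), c(yi, ni - yi)), y, n,
                    SIMPLIFY = FALSE))
)
fit_bern <- glm(z ~ x, family = binomial, data = long)

## (2) proportions with trial counts as weights
fit_prop <- glm(prop ~ x, family = binomial, weights = n)

## (3) two-column matrix of (successes, failures)
fit_mat  <- glm(cbind(y, n - y) ~ x, family = binomial)

## identical coefficient estimates from all three representations
cbind(coef(fit_bern), coef(fit_prop), coef(fit_mat))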

2.1 Linear predictor and mean structure

A binomial GLM links the mean \(\mu_i\) to a linear predictor through

\[ \eta_i = x_i^\top \beta, \qquad \mu_i = g^{-1}(\eta_i), \]

where \(g(\cdot)\) is the chosen link function (logit, probit, cloglog, etc.).
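
For reference, the inverse links for the three binomial links used in this chapter can be evaluated directly in base R (a small illustration, not part of glmbayes):

eta <- c(-2, 0, 2)

plogis(eta)               # logit inverse link:   mu = exp(eta) / (1 + exp(eta))
pnorm(eta)                # probit inverse link:  mu = Phi(eta)
1 - exp(-exp(eta))        # cloglog inverse link: mu = 1 - exp(-exp(eta))

binomial(link = "cloglog")$linkinv(eta)   # same values via the family object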

2.2 Weighted binomial log‑likelihood

Using prior weights \(w_i = n_i\) and writing \(y_i\) for the observed proportion of successes, the log-likelihood (up to additive constants) becomes

\[ \ell(\beta) = \sum_{i=1}^n w_i\Big[ y_i \log(\mu_i) + (1-y_i)\log(1-\mu_i) \Big]. \]

This form is used by both glm() and the Bayesian functions glmb() and rglmb().
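
For concreteness, this weighted log-likelihood can be coded directly; the helper below is our own illustrative sketch (not a package function), with y interpreted as observed proportions and w as trial counts:

binom_loglik <- function(beta, X, y, w, linkinv = plogis) {
  ## X: model matrix, y: observed proportions of successes, w: trial counts
  eta <- drop(X %*% beta)
  mu  <- linkinv(eta)
  sum(w * (y * log(mu) + (1 - y) * log(1 - mu)))
}

## Example: two covariate settings with 10 trials each
X <- cbind(1, c(-1, 1))
binom_loglik(beta = c(0, 0.5), X = X, y = c(0.3, 0.7), w = c(10, 10))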

2.3 Exponential‑family representation

The binomial likelihood belongs to the exponential family (McCullagh and Nelder 1989; Agresti 2015).
For a model with linear predictor \[ \eta_i = x_i^\top \beta, \] and mean \[ \mu_i = g^{-1}(\eta_i), \] the contribution of observation \(i\) to the log‑likelihood can be written as \[ \ell_i(\beta) = w_i\Big[ y_i \log(\mu_i) + (1-y_i)\log(1-\mu_i) \Big], \] where \(w_i\) is the number of trials (or a user‑supplied weight).

This representation does not require the link to be canonical.
The binomial variance function is \[ V(\mu) = \mu(1-\mu), \] so that \(\mathrm{Var}(Y_i) = n_i\,\mu_i(1-\mu_i)\) in the count representation and \(\mathrm{Var}(Y_i) = \mu_i(1-\mu_i)/w_i\) in the weighted (proportion) formulation, regardless of the link function.
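
This variance function is exactly what R's binomial family object carries, so the claim is easy to check in base R:

mu <- c(0.1, 0.5, 0.9)
binomial(link = "logit")$variance(mu)     # mu * (1 - mu)
binomial(link = "cloglog")$variance(mu)   # identical: the variance function ignores the link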

3. Specifying Binomial Models in glmbayes

The general Bayesian call is:

glmb(
  formula,
  family   = binomial(link = "logit"),    # or "probit" or "cloglog"
  pfamily  = dNormal(mu = mu, Sigma = V),
  data     = ...
)

3.1 Prior Specification

As in earlier chapters, the recommended workflow is:

ps <- Prior_Setup(formula, family = binomial(link = "logit"), data = ...)
mu <- ps$mu
V  <- ps$Sigma

This produces a default prior mean vector (ps$mu) and a default prior covariance matrix (ps$Sigma), with dimensions matching the coefficient vector of the model.

You may override these defaults for more informative priors (see Chapter 10).
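
Putting these pieces together, here is a minimal end-to-end sketch on simulated data. The simulated dataset, the variable names, and the use of summary() and coef() on the fitted objects are our own illustrative assumptions; the Prior_Setup(), glmb(), and dNormal() calls follow the signatures shown above.

## Simulated 0/1 data (a toy example of our own)
set.seed(42)
dat   <- data.frame(x = seq(-2, 2, length.out = 50))
dat$y <- rbinom(50, size = 1, prob = plogis(-0.3 + 0.8 * dat$x))

## Default prior via Prior_Setup, as recommended above
ps <- Prior_Setup(y ~ x, family = binomial(link = "logit"), data = dat)
mu <- ps$mu
V  <- ps$Sigma

## Bayesian fit
fit_b <- glmb(y ~ x,
              family  = binomial(link = "logit"),
              pfamily = dNormal(mu = mu, Sigma = V),
              data    = dat)

## Classical fit for comparison
fit_c <- glm(y ~ x, family = binomial(link = "logit"), data = dat)

summary(fit_b)   # posterior summaries (assuming the usual summary method for glmb fits)
coef(fit_c)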


8. Concluding Discussion

Binomial GLMs are a core component of the glmbayes package. Their log‑concave likelihoods make them ideal for the envelope‑based accept‑reject sampler, and the familiar link functions allow analysts to choose models that match the scientific context (McCullagh and Nelder 1989; Gelman et al. 2013).

This chapter demonstrated the structure of binomial GLMs, the available link functions (logit, probit, and cloglog), and how to specify and fit these models with glmb().

In the next chapter, we extend these ideas to Poisson models, which share many structural similarities but introduce new considerations for count data.

References

Agresti, Alan. 2015. Foundations of Linear and Generalized Linear Models. Cambridge University Press.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. 3rd ed. CRC Press.
Griffin, Jim E., and Philip J. Brown. 2010. “Inference with Normal-Gamma Prior Distributions in Regression Problems.” Bayesian Analysis 5 (1): 171–88. https://doi.org/10.1214/10-BA507.
McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. 2nd ed. Chapman & Hall.
Nelder, J. A., and R. W. M. Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society. Series A (General) 135 (3): 370–84. https://doi.org/10.2307/2344614.
Spiegelhalter, David J., Nicky G. Best, Bradley P. Carlin, and Angelika van der Linde. 2002. “Bayesian Measures of Model Complexity and Fit.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64 (4): 583–639. https://doi.org/10.1111/1467-9868.00353.