Gamma regression models are used when the response variable is
strictly positive and right‑skewed.
They form a standard class of GLMs for such responses (Nelder and Wedderburn 1972; McCullagh and Nelder 1989; Agresti 2015); typical applications include insurance claim severities and other positive cost or duration outcomes.
In classical statistics, these models are fit using
glm(…, family = Gamma(link = …)).
In glmbayes, the Bayesian analogue is
glmb(…, family = Gamma(link = …), pfamily = dNormal(…))
or, when dispersion is unknown and modeled,
glmb(…, family = Gamma(link = …), pfamily = dNormal_Gamma(…))
or related pfamilies.
This chapter presents the fixed-dispersion Gamma model with a log link, its Bayesian formulation in glmbayes, and a worked example using the carinsca data.
Gamma regression models strictly positive, right-skewed
responses.
In a Gamma GLM with fixed dispersion, the response satisfies
\[ Y_i > 0, \qquad Y_i \sim \text{Gamma}(\text{mean}=\mu_i,\ \text{dispersion}=\phi), \]
where the variance is
\[ \mathrm{Var}(Y_i) = \phi\,\mu_i^2. \]
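To make this parameterization concrete, here is a small simulation sketch (an illustration only, using base R's shape–scale form with shape \(= 1/\phi\) and scale \(= \phi\mu\), which gives mean \(\mu\) and variance \(\phi\mu^2\)):

```r
set.seed(1)
phi <- 0.5   # dispersion
mu  <- 3     # mean
y   <- rgamma(1e5, shape = 1 / phi, scale = phi * mu)
c(sample_mean = mean(y), sample_var = var(y))
## Theoretical values: mean = 3, variance = phi * mu^2 = 4.5
```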
The mean is linked to a linear predictor through
\[ \eta_i = x_i^\top \beta, \qquad \mu_i = g^{-1}(\eta_i). \]
In this chapter we focus exclusively on the log
link, which is the standard choice for Gamma GLMs and the only
link supported by glmb() when dispersion is fixed.
Let \(w_i\) denote observation
weights (e.g., claim counts or exposures).
Under the log link,
\[ \mu_i = e^{\eta_i}, \]
and the Gamma log-likelihood (up to constants) is
\[ \ell(\beta) = \sum_{i=1}^n w_i\left[ -\frac{1}{\phi}\,\eta_i - \frac{1}{\phi}\,y_i e^{-\eta_i} \right]. \]
Terms not involving \(\beta\) have
been absorbed into the constant.
This form is used by both glm() and the Bayesian functions
glmb() and rglmb() when dispersion is
fixed.
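As a concrete illustration (a minimal sketch, not the package's internal implementation), this weighted log-likelihood can be coded directly:

```r
## Weighted Gamma log-likelihood (up to constants) under the log link,
## with fixed dispersion phi; X is the design matrix, w the weights.
gamma_loglik <- function(beta, X, y, w, phi) {
  eta <- drop(X %*% beta)
  sum(w * (-eta / phi - y * exp(-eta) / phi))
}
```

Maximizing this function over \(\beta\) reproduces the coefficient estimates from glm(), since the dispersion enters only as a constant scale factor.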
The Gamma distribution belongs to the exponential family (McCullagh and Nelder 1989; Agresti 2015) with canonical parameter
\[ \theta_i = -\frac{1}{\mu_i}, \]
and cumulant function
\[ b(\theta_i) = -\log(-\theta_i). \]
The variance function is \(V(\mu_i) = \mu_i^2\), so that
\[ \mathrm{Var}(Y_i) = \phi\,\mu_i^2. \]
Under the log link, the canonical parameter is a
monotone transformation of the linear predictor, and the resulting
log-likelihood is log-concave in \(\eta_i\).
This property is essential for stable envelope construction and
accept–reject sampling in glmbayes.
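To see this concretely, differentiate a single observation's contribution twice with respect to \(\eta_i\):
\[ \frac{\partial^2}{\partial \eta_i^2}\left[-\frac{1}{\phi}\,\eta_i - \frac{1}{\phi}\,y_i e^{-\eta_i}\right] = -\frac{y_i}{\phi}\,e^{-\eta_i} < 0, \]
since \(y_i > 0\) and \(\phi > 0\). Each term is therefore strictly concave in \(\eta_i\), and concavity is preserved under the linear map \(\eta = X\beta\).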
The log link is defined by
\[ g(\mu_i) = \log(\mu_i), \qquad \mu_i = e^{\eta_i}. \]
It is the preferred link for Gamma GLMs because it guarantees \(\mu_i > 0\) for any \(\eta_i\), yields multiplicative covariate effects, and preserves log-concavity of the likelihood; it is also the only link supported by glmb() when dispersion is fixed. Other links (inverse, identity) do not preserve log-concavity and are
therefore not used in this chapter.
Models with estimated dispersion or non-log
links are covered in Chapter 11.
With the log link, the weighted log-likelihood is
\[ \ell(\beta) = \sum_{i=1}^n w_i\left[ -\frac{1}{\phi}\,\eta_i - \frac{1}{\phi}\,y_i e^{-\eta_i} \right]. \]
A multivariate normal prior is specified as
\[ \log p(\beta) = -\tfrac{1}{2}(\beta - \mu_0)^\top \Sigma_0^{-1}(\beta - \mu_0) + \text{const}. \]
Combining likelihood and prior yields the log-posterior:
\[ \log p(\beta \mid y) = \sum_{i=1}^n w_i\left[ -\frac{1}{\phi}\,\eta_i - \frac{1}{\phi}\,y_i e^{-\eta_i} \right] - \tfrac{1}{2}(\beta - \mu_0)^\top \Sigma_0^{-1}(\beta - \mu_0) + \text{const}. \]
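As an illustration (a minimal sketch of this log-posterior, not the sampler's internal code), assuming fixed \(\phi\) and prior parameters mu0 and Sigma0:

```r
## Log-posterior (up to a constant) for the Gamma log-link model
## with a fixed-dispersion normal prior N(mu0, Sigma0) on beta.
log_post <- function(beta, X, y, w, phi, mu0, Sigma0) {
  eta      <- drop(X %*% beta)
  loglik   <- sum(w * (-eta / phi - y * exp(-eta) / phi))
  d        <- beta - mu0
  logprior <- -0.5 * drop(t(d) %*% solve(Sigma0, d))
  loglik + logprior
}
```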
Because both terms are concave in \(\beta\), the posterior is log-concave, enabling efficient iid sampling via the envelope-based accept–reject sampler implemented in glmbayes.

## 3. Link Functions for Gamma GLMs
The link function relates the mean \(\mu\) to the linear predictor \(\eta = X\beta\):
\[ g(\mu) = \eta. \]
The Gamma family in base R supports:
| Link | Formula | Notes |
|---|---|---|
| inverse | \(\eta = 1/\mu\) | canonical link |
| identity | \(\eta = \mu\) | must ensure \(\mu > 0\) |
| log | \(\eta = \log(\mu)\) | most common; ensures \(\mu > 0\) |
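These links can be inspected directly from the family object in base R:

```r
fam <- Gamma(link = "log")
fam$linkfun(2)    # g(mu)       = log(mu)
fam$linkinv(0.7)  # g^{-1}(eta) = exp(eta)
```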
In practice, the log link is usually preferred because it keeps the mean positive and yields multiplicative covariate effects. At present, due to the need for log-concavity, only the log link is implemented in glmb().
We now consider a Gamma regression example based on the carinsca
dataset.
The goal is to model average claim cost as a function of rating
variables Merit and Class, using claim counts as weights.
We begin by preparing the data and setting appropriate factor levels and contrasts.
data(carinsca)
carinsca$Merit <- ordered(carinsca$Merit)
carinsca$Class <- factor(carinsca$Class)
oldopt <- options(contrasts = c("contr.treatment", "contr.treatment"))
Claims <- carinsca$Claims
Insured <- carinsca$Insured
Merit <- carinsca$Merit
Class <- carinsca$Class
Cost <- carinsca$Cost

The response Cost/Claims represents the average claim
cost (severity) per claim.
The weights Claims reflect the number of claims on which each average is
based, improving efficiency and aligning with standard GLM practice for
rate or average models.
We fit a classical Gamma GLM with a log link. As glm()
only crudely estimates the dispersion, we use
MASS::gamma.dispersion() (Venables
and Ripley 2002) and pass the estimate to
summary().
out <- glm(Cost/Claims ~ Merit + Class,
family = Gamma(link = "log"),
weights = Claims,
x = TRUE)
## Estimate the dispersion using MLE
disp <- MASS::gamma.dispersion(out)
summary(out, dispersion = disp)
#>
#> Call:
#> glm(formula = Cost/Claims ~ Merit + Class, family = Gamma(link = "log"),
#> weights = Claims, x = TRUE)
#>
#> Coefficients:
#> Estimate Std. Error z value Pr(>|z|)
#> (Intercept) -1.17456 0.01195 -98.286 < 2e-16 ***
#> Merit1 -0.06867 0.02008 -3.420 0.000626 ***
#> Merit2 -0.07025 0.02238 -3.138 0.001700 **
#> Merit3 -0.05673 0.01254 -4.523 6.08e-06 ***
#> Class2 0.08274 0.02031 4.073 4.64e-05 ***
#> Class3 0.01583 0.01410 1.123 0.261529
#> Class4 0.15981 0.01494 10.698 < 2e-16 ***
#> Class5 -0.08142 0.03005 -2.709 0.006744 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for Gamma family taken to be 7.840919)
#>
#> Null deviance: 1556.0 on 19 degrees of freedom
#> Residual deviance: 156.9 on 12 degrees of freedom
#> AIC: -2999165
#>
#> Number of Fisher Scoring iterations: 4

The summary shows that all Merit levels and most Class levels have statistically significant effects (Class3 does not), with the dispersion estimated at about 7.84.
Under the log link, exp(\(\beta_{j}\)) can be interpreted as a multiplicative factor on the expected average claim cost for a one‑unit change in the corresponding covariate (or a change in factor level).
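For example, the implied multiplicative factors from the classical fit can be computed directly:

```r
exp(coef(out))  # multiplicative effects on expected average claim cost
```

Here exp(0.15981) ≈ 1.17 for Class4, i.e., roughly 17% higher expected average claim cost than the baseline class.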
To fit the Bayesian analogue, we proceed in two steps: we first fix the dispersion at its maximum likelihood estimate, and then construct a prior for the coefficients. The gamma.dispersion() helper estimates the dispersion parameter from a fitted classical Gamma GLM:
## better than the crude estimate from the summary function
disp <- MASS::gamma.dispersion(out)
disp
#> [1] 7.840919

This value will be treated as known in our Bayesian model, so that the Bayesian and classical fits are directly comparable in terms of how they treat \(\phi\).
Next we use Prior_Setup() to construct a Zellner‑type
g‑prior (Griffin and Brown 2010) for the
coefficients, aligned with the Gamma log‑link model and weighting
structure:
ps <- Prior_Setup(Cost/Claims ~ Merit + Class,
family = Gamma(link = "log"),
weights = Claims)
#> Using default pwt = 0.01 (low-d default).
mu <- ps$mu
V <- ps$Sigma

The setup object supplies a prior mean vector (mu) and a prior covariance matrix (Sigma), which we extract for use in the prior family below.
With the default pwt = 0.01, the prior is weakly informative, and the posterior estimates typically remain close to the MLEs.
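The implied prior standard deviations can be inspected directly, and should match the Prior.sd column in the Bayesian summary shown below:

```r
sqrt(diag(V))  # prior standard deviations implied by Sigma
```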
We now call glmb(), specifying the same model and likelihood as in the classical fit, but adding a dNormal prior family that fixes dispersion at the value estimated above:
out_glmb <- glmb(Cost/Claims ~ Merit + Class,
family = Gamma(link = "log"),
pfamily = dNormal(mu = mu, Sigma = V, dispersion = disp),
weights = Claims)

The arguments mirror the glm() call, with pfamily providing the prior: dNormal(mu, Sigma, dispersion) places a multivariate normal prior on the coefficients and holds the dispersion fixed at disp.
Bayesian summary:
summary(out_glmb)
#> Call
#> glmb(formula = Cost/Claims ~ Merit + Class, family = Gamma(link = "log"),
#> pfamily = dNormal(mu = mu, Sigma = V, dispersion = disp),
#> weights = Claims)
#>
#> Expected Residuals:
#> Min 1Q Median 3Q Max
#> -5.9889656 -2.1503982 -0.3464031 2.2958381 6.3430236
#>
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#>
#> Null Mode Prior Mean Prior.sd Max Like. Like.sd
#> (Intercept) -1.18283 -1.20215 0.15462 -1.17456 0.016
#> Merit1 0.00000 0.00000 0.25980 -0.06867 0.026
#> Merit2 0.00000 0.00000 0.28961 -0.07025 0.029
#> Merit3 0.00000 0.00000 0.16226 -0.05673 0.016
#> Class2 0.00000 0.00000 0.26281 0.08274 0.026
#> Class3 0.00000 0.00000 0.18245 0.01583 0.018
#> Class4 0.00000 0.00000 0.19328 0.15981 0.019
#> Class5 0.00000 0.00000 0.38885 -0.08142 0.039
#>
#> Bayesian Estimates Based on 1000 iid draws
#>
#> Post.Mode Post.Mean Post.Sd MC Error Pr(Null_tail) SE(tail)
#> (Intercept) -1.1747274 -1.1746675 0.0117659 0.0003721 0.2420000 0.003
#> Merit1 -0.0682525 -0.0678516 0.0206808 0.0006540 0.0000000 0.000
#> Merit2 -0.0698241 -0.0698158 0.0225002 0.0007115 0.0010000 0.001
#> Merit3 -0.0563841 -0.0561201 0.0126068 0.0003987 0.0000000 0.000
#> Class2 0.0822404 0.0820272 0.0198569 0.0006279 0.0000000 0.000
#> Class3 0.0157394 0.0155515 0.0137728 0.0004355 0.1340000 0.011
#> Class4 0.1588636 0.1585185 0.0147610 0.0004668 0.0000000 0.000
#> Class5 -0.0809389 -0.0809062 0.0299447 0.0009469 0.0050000 0.002
#> Pr(Prior_tail)
#> (Intercept) 0.009 **
#> Merit1 <2e-16 ***
#> Merit2 0.001 **
#> Merit3 <2e-16 ***
#> Class2 <2e-16 ***
#> Class3 0.134
#> Class4 <2e-16 ***
#> Class5 0.005 **
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Directional Tail Summaries:
#>
#> Metric vs Null vs Prior
#> Mahalanobis Distance 16.6510 15.8459
#> Tail Probability 0.0000 0.0000
#> [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#>
#>
#> Distribution Percentiles
#>
#> 1.0% 2.5% 5.0% Median 95.0% 97.5% 99.0%
#> (Intercept) -1.201518 -1.196129 -1.193432 -1.174957 -1.155021 -1.149703 -1.146
#> Merit1 -0.121085 -0.109264 -0.102227 -0.067554 -0.035026 -0.029120 -0.023
#> Merit2 -0.118013 -0.113105 -0.106580 -0.070020 -0.033930 -0.026077 -0.019
#> Merit3 -0.085389 -0.082037 -0.078022 -0.055310 -0.035998 -0.032365 -0.028
#> Class2 0.037135 0.043259 0.048983 0.082879 0.114347 0.120577 0.128
#> Class3 -0.015756 -0.011230 -0.007122 0.015346 0.037787 0.041287 0.049
#> Class4 0.124130 0.129555 0.135374 0.158407 0.183404 0.187520 0.193
#> Class5 -0.147714 -0.137897 -0.130838 -0.081565 -0.029834 -0.019777 -0.011
#>
#> Effective Number of Parameters: 8.057135
#> Expected Residual Deviance: 220.2001
#> DIC: -107.1082
#>
#> Expected Mean dispersion: 7.840919
#> Sq.root of Expected Mean dispersion: 2.800164
#>
#> Mean Likelihood Subgradient Candidates Per iid sample: 2.562
options(oldopt)

The Bayesian summary reports posterior modes, means, and standard deviations alongside the prior and maximum likelihood estimates, together with Monte Carlo errors, tail probabilities, distribution percentiles, DIC, and sampler diagnostics.
Because dispersion is fixed at the same value in both models, differences between out and out_glmb reflect primarily the influence of the prior on \(\beta\), not differences in \(\phi\).
Under the log link, coefficient interpretations remain multiplicative, just as in the classical fit: exponentiating a posterior summary of \(\beta_j\) gives the factor by which the expected average claim cost changes.
With a weak prior, posterior means in out_glmb will be close to the MLEs in out, but with slightly more regularization and smaller posterior standard deviations.
By fixing dispersion at disp from the classical model, the Bayesian and classical fits share the same assumed level of variability.
This allows a direct comparison of the classical and Bayesian coefficient estimates on the same dispersion scale, as sketched below.
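For instance, a side-by-side sketch of the estimates (assuming coef() extracts posterior means from a glmb fit; if the extractor differs, the posterior draws can be averaged instead):

```r
## Illustrative comparison of MLEs and Bayesian posterior means
cbind(MLE = coef(out), Bayes = coef(out_glmb))
```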
If desired, a more advanced model can estimate dispersion by switching to a pfamily such as dNormal_Gamma, but that is covered in more detail in the dispersion‑focused vignettes.
The Bayesian summary of out_glmb also reports the effective number of parameters, the expected residual deviance, and the DIC.
These can be compared to AIC and deviance from the classical model, especially when considering alternative specifications (for example, reduced models, different covariate sets, or alternative link functions).
Gamma regression provides a flexible framework for modeling positive,
skewed response variables (McCullagh and Nelder
1989; Gelman et al. 2013).
In this chapter, we have formulated the fixed-dispersion Gamma GLM with a log link, fit classical and Bayesian versions to the carinsca data, and compared their estimates and fit summaries. In later vignettes, we extend these ideas to models with estimated dispersion (for example, via the dNormal_Gamma pfamily) and to alternative link functions.