Chapter 09: Models for the Gamma Family

    library(glmbayes)

1. Introductory Discussion

Gamma regression models are used when the response variable is strictly positive and right‑skewed.
Typical applications include insurance claim severity (average cost per claim), hospital costs and lengths of stay, and rainfall amounts.

Gamma regression is a standard GLM for positive, right‑skewed responses (Nelder and Wedderburn 1972; McCullagh and Nelder 1989; Agresti 2015).

In classical statistics, these models are fit using

glm(…, family = Gamma(link = …)).

In glmbayes, the Bayesian analogue is

glmb(…, family = Gamma(link = …), pfamily = dNormal(…))

or, when dispersion is unknown and modeled,

glmb(…, family = Gamma(link = …), pfamily = dNormal_Gamma(…))

or related pfamilies.

This chapter:

  1. reviews the Gamma GLM structure and link functions
  2. shows how to fit classical and Bayesian Gamma regression models
  3. uses a realistic insurance example (carinsca)
  4. demonstrates how to estimate dispersion using gamma.dispersion()
  5. compares classical and Bayesian summaries, including DIC

4. Gamma Regression Example: Insurance Claims

We now consider a Gamma regression example based on the carinsca dataset.
The goal is to model average claim cost as a function of rating variables Merit and Class, using claim counts as weights.

4.1 Data Setup

We begin by preparing the data and setting appropriate factor levels and contrasts.


    data(carinsca)
    carinsca$Merit <- ordered(carinsca$Merit)
    carinsca$Class <- factor(carinsca$Class)
    oldopt <- options(contrasts = c("contr.treatment", "contr.treatment"))

    Claims  <- carinsca$Claims
    Insured <- carinsca$Insured
    Merit   <- carinsca$Merit
    Class   <- carinsca$Class
    Cost    <- carinsca$Cost

The response Cost/Claims represents the average claim cost (severity) per claim.
The weights Claims reflect the number of claims on which each average is based, improving efficiency and aligning with standard GLM practice for rate or average models.
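In symbols, if the individual claims in cell \(i\) have mean \(\mu_i\) and dispersion \(\phi\), then the average of \(n_i\) claims (here \(n_i\) = Claims) satisfies

\[
\mathbb{E}\left[\bar{Y}_i\right] = \mu_i = \exp\left(x_i^\top \beta\right),
\qquad
\operatorname{Var}\left(\bar{Y}_i\right) = \frac{\phi\,\mu_i^2}{n_i},
\]

so supplying weights = Claims gives each cell average its correct relative precision in the Gamma GLM.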

4.2 Classical Gamma Regression Using glm()

We fit a classical Gamma GLM with a log link. As glm() only crudely estimates the dispersion, we use MASS::gamma.dispersion() (Venables and Ripley 2002) and pass the estimate to summary().


    out <- glm(Cost/Claims ~ Merit + Class,
               family  = Gamma(link = "log"),
               weights = Claims,
               x       = TRUE)

    ## Estimate the dispersion using MLE
    disp <- gamma.dispersion(out)
    
    summary(out, dispersion = disp)
#> 
#> Call:
#> glm(formula = Cost/Claims ~ Merit + Class, family = Gamma(link = "log"), 
#>     weights = Claims, x = TRUE)
#> 
#> Coefficients:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) -1.17456    0.01195 -98.286  < 2e-16 ***
#> Merit1      -0.06867    0.02008  -3.420 0.000626 ***
#> Merit2      -0.07025    0.02238  -3.138 0.001700 ** 
#> Merit3      -0.05673    0.01254  -4.523 6.08e-06 ***
#> Class2       0.08274    0.02031   4.073 4.64e-05 ***
#> Class3       0.01583    0.01410   1.123 0.261529    
#> Class4       0.15981    0.01494  10.698  < 2e-16 ***
#> Class5      -0.08142    0.03005  -2.709 0.006744 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for Gamma family taken to be 7.840919)
#> 
#>     Null deviance: 1556.0  on 19  degrees of freedom
#> Residual deviance:  156.9  on 12  degrees of freedom
#> AIC: -2999165
#> 
#> Number of Fisher Scoring iterations: 4

The summary shows that most Merit and Class effects are statistically significant, with the dispersion parameter fixed at the MLE of 7.84 rather than the cruder moment-based default.

Under the log link, exp(\(\beta_{j}\)) can be interpreted as a multiplicative factor on the expected average claim cost for a one‑unit change in the corresponding covariate (or a change in factor level relative to the baseline).
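As a small illustration (using the fitted object out from above), these multiplicative factors can be computed directly:

```r
## Multiplicative effects on the expected average claim cost (log link)
round(exp(coef(out)), 3)

## For example, the Class4 coefficient is 0.15981, so
exp(0.15981)  # roughly 1.17, i.e. about 17% higher expected severity
```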

5. Bayesian Gamma Regression (fixed dispersion) with glmb()

To fit the Bayesian analogue, we proceed in three steps:

  1. estimate the dispersion parameter \(\phi\) from the classical Gamma GLM,
  2. construct a prior for the regression coefficients using Prior_Setup(),
  3. call glmb() with a prior that incorporates the fixed dispersion \(\phi\).

5.1 Estimating Dispersion via gamma.dispersion()

The gamma.dispersion() helper estimates the dispersion parameter from a fitted classical Gamma GLM:


    ## better than the crude estimate from the summary function
    disp <- gamma.dispersion(out)
    disp
#> [1] 7.840919

This value will be treated as known in our Bayesian model, so that the Bayesian and classical fits are directly comparable in their treatment of \(\phi\).
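As a quick check (using the classical fit out from above), the MLE can be compared with the crude moment-based estimate stored in the default summary():

```r
## Crude moment-based estimate vs. the maximum likelihood estimate
c(crude = summary(out)$dispersion,
  mle   = gamma.dispersion(out))
```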

5.2 Prior Setup Using Prior_Setup()

Next we use Prior_Setup() to construct a Zellner‑type g‑prior (Griffin and Brown 2010) for the coefficients, aligned with the Gamma log‑link model and weighting structure:

    ps <- Prior_Setup(Cost/Claims ~ Merit + Class,
                      family  = Gamma(link = "log"),
                      weights = Claims)
#> Using default pwt = 0.01 (low-d default).

    mu <- ps$mu
    V  <- ps$Sigma

Printing ps shows the prior mean vector (mu) and prior covariance matrix (Sigma), which we extract above for use in the prior.

With the default pwt = 0.01, the prior is weakly informative, and the posterior estimates typically remain close to the MLEs.

5.3 Fitting the Bayesian Gamma Model

We now call glmb(), specifying the same model and likelihood as in the classical fit, but adding a dNormal prior family that fixes dispersion at the value estimated above:

    out_glmb <- glmb(Cost/Claims ~ Merit + Class,
                 family  = Gamma(link = "log"),
                 pfamily = dNormal(mu = mu, Sigma = V, dispersion = disp),
                 weights = Claims)

The arguments mirror the glm() call, with pfamily supplying the prior for the coefficients together with the fixed dispersion.

5.4 Summarizing the Bayesian Model

Bayesian summary:


    summary(out_glmb)
#> Call
#> glmb(formula = Cost/Claims ~ Merit + Class, family = Gamma(link = "log"), 
#>     pfamily = dNormal(mu = mu, Sigma = V, dispersion = disp), 
#>     weights = Claims)
#> 
#> Expected Residuals:
#>        Min         1Q     Median         3Q        Max 
#> -5.9889656 -2.1503982 -0.3464031  2.2958381  6.3430236 
#> 
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#> 
#>             Null Mode Prior Mean Prior.sd Max Like. Like.sd
#> (Intercept)  -1.18283   -1.20215  0.15462  -1.17456   0.016
#> Merit1        0.00000    0.00000  0.25980  -0.06867   0.026
#> Merit2        0.00000    0.00000  0.28961  -0.07025   0.029
#> Merit3        0.00000    0.00000  0.16226  -0.05673   0.016
#> Class2        0.00000    0.00000  0.26281   0.08274   0.026
#> Class3        0.00000    0.00000  0.18245   0.01583   0.018
#> Class4        0.00000    0.00000  0.19328   0.15981   0.019
#> Class5        0.00000    0.00000  0.38885  -0.08142   0.039
#> 
#> Bayesian Estimates Based on 1000 iid draws
#> 
#>              Post.Mode  Post.Mean    Post.Sd   MC Error Pr(Null_tail) SE(tail)
#> (Intercept) -1.1747274 -1.1746675  0.0117659  0.0003721     0.2420000    0.003
#> Merit1      -0.0682525 -0.0678516  0.0206808  0.0006540     0.0000000    0.000
#> Merit2      -0.0698241 -0.0698158  0.0225002  0.0007115     0.0010000    0.001
#> Merit3      -0.0563841 -0.0561201  0.0126068  0.0003987     0.0000000    0.000
#> Class2       0.0822404  0.0820272  0.0198569  0.0006279     0.0000000    0.000
#> Class3       0.0157394  0.0155515  0.0137728  0.0004355     0.1340000    0.011
#> Class4       0.1588636  0.1585185  0.0147610  0.0004668     0.0000000    0.000
#> Class5      -0.0809389 -0.0809062  0.0299447  0.0009469     0.0050000    0.002
#>             Pr(Prior_tail)    
#> (Intercept)          0.009 ** 
#> Merit1              <2e-16 ***
#> Merit2               0.001 ** 
#> Merit3              <2e-16 ***
#> Class2              <2e-16 ***
#> Class3               0.134    
#> Class4              <2e-16 ***
#> Class5               0.005 ** 
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>   Directional Tail Summaries:
#> 
#>                Metric vs Null vs Prior
#>  Mahalanobis Distance 16.6510  15.8459
#>      Tail Probability  0.0000   0.0000
#>   [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#> 
#> 
#> Distribution Percentiles
#> 
#>                  1.0%      2.5%      5.0%    Median     95.0%     97.5%  99.0%
#> (Intercept) -1.201518 -1.196129 -1.193432 -1.174957 -1.155021 -1.149703 -1.146
#> Merit1      -0.121085 -0.109264 -0.102227 -0.067554 -0.035026 -0.029120 -0.023
#> Merit2      -0.118013 -0.113105 -0.106580 -0.070020 -0.033930 -0.026077 -0.019
#> Merit3      -0.085389 -0.082037 -0.078022 -0.055310 -0.035998 -0.032365 -0.028
#> Class2       0.037135  0.043259  0.048983  0.082879  0.114347  0.120577  0.128
#> Class3      -0.015756 -0.011230 -0.007122  0.015346  0.037787  0.041287  0.049
#> Class4       0.124130  0.129555  0.135374  0.158407  0.183404  0.187520  0.193
#> Class5      -0.147714 -0.137897 -0.130838 -0.081565 -0.029834 -0.019777 -0.011
#> 
#> Effective Number of Parameters: 8.057135 
#> Expected Residual Deviance: 220.2001 
#> DIC: -107.1082 
#> 
#> Expected Mean dispersion: 7.840919 
#> Sq.root of Expected Mean dispersion: 2.800164 
#> 
#> Mean Likelihood Subgradient Candidates Per iid sample: 2.562
    ## Restore the original contrast options
    options(oldopt)

The Bayesian summary reports prior and maximum‑likelihood estimates, posterior modes, means, and standard deviations with Monte Carlo errors, tail probabilities against the null and the prior, distribution percentiles, and model‑fit measures (effective number of parameters, expected residual deviance, and DIC).

Because dispersion is fixed at the same value in both models, differences between out and out_glmb reflect primarily the influence of the prior on \(\beta\), not differences in \(\phi\).

6. Interpretation and Model Diagnostics

6.1 Coefficients

Under the log link, coefficient interpretations are multiplicative: exp(\(\beta_{j}\)) scales the expected average claim cost.

With a weak prior, posterior means in out_glmb will be close to the MLEs in out, with the prior contributing a mild degree of regularization.
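One way to inspect this, assuming the glmb fit supports a coef() method returning posterior means (a convenience assumed here; otherwise the values can be read from the two summaries above):

```r
## Side-by-side comparison of classical MLEs and Bayesian posterior means
## (coef(out_glmb) is assumed to return posterior means)
cbind(MLE = coef(out), Bayes = coef(out_glmb))
```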

6.2 Dispersion

By fixing dispersion at disp from the classical model, the Bayesian and classical fits share the same assumed level of variability.

This allows a clean comparison: any differences between the classical and Bayesian coefficient estimates are attributable to the prior rather than to differing treatments of \(\phi\).

If desired, a more advanced model can estimate dispersion by switching to a pfamily such as dNormal_Gamma, but that is covered in more detail in the dispersion‑focused vignettes.

6.3 DIC and Model Fit

The Bayesian summary of out_glmb reports the effective number of parameters, the expected residual deviance, and the DIC (Spiegelhalter et al. 2002).

These can be compared to AIC and deviance from the classical model, especially when considering alternative specifications (for example, reduced models, different covariate sets, or alternative link functions).

7. Concluding Discussion

Gamma regression provides a flexible framework for modeling positive, skewed response variables (McCullagh and Nelder 1989; Gelman et al. 2013).
In this chapter, we have reviewed the Gamma GLM structure, fit classical and Bayesian Gamma regressions to the carinsca insurance data, estimated the dispersion with gamma.dispersion(), and compared the classical and Bayesian summaries, including DIC.

In later vignettes, we extend these ideas to models in which the dispersion is treated as unknown and estimated, for example via the dNormal_Gamma pfamily.

References

Agresti, Alan. 2015. Foundations of Linear and Generalized Linear Models. Cambridge University Press.
Gelman, Andrew, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. 2013. Bayesian Data Analysis. 3rd ed. CRC Press.
Griffin, Jim E., and Philip J. Brown. 2010. “Inference with Normal-Gamma Prior Distributions in Regression Problems.” Bayesian Analysis 5 (1): 171–88. https://doi.org/10.1214/10-BA507.
McCullagh, P., and J. A. Nelder. 1989. Generalized Linear Models. Chapman & Hall.
Nelder, J. A., and R. W. M. Wedderburn. 1972. “Generalized Linear Models.” Journal of the Royal Statistical Society. Series A (General) 135 (3): 370–84. https://doi.org/10.2307/2344614.
Spiegelhalter, David J., Nicky G. Best, Bradley P. Carlin, and Angelika van der Linde. 2002. “Bayesian Measures of Model Complexity and Fit.” Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64 (4): 583–639. https://doi.org/10.1111/1467-9868.00353.
Venables, W. N., and B. D. Ripley. 2002. Modern Applied Statistics with S. Springer.