Chapter 13 focused on hierarchical linear models: the Eight Schools example with normally distributed group-level effects and known sampling variances. This chapter extends to hierarchical generalized linear models (Gelman and Hill 2007; Gelman et al. 2013), specifically Poisson regression with observation-level random effects.
We use the BikeSharing dataset and
rglmb to implement a two-block Gibbs
sampler. The Poisson likelihood is non-conjugate with the normal prior
on the linear predictor, so each observation-level update requires an
accept–reject step (Robert and Casella
2004) via rglmb. For a full demo,
see demo("Ex_09_BikeSharingPoisson").
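The demo can be launched directly from an interactive session (expect a long runtime); the package argument is included only to avoid ambiguity with demos from other attached packages:

demo("Ex_09_BikeSharingPoisson", package = "glmbayes")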
Let \(y_i\) denote the count (hourly bike rentals) for observation \(i\), and \(x_i\) the covariate vector. The hierarchical model is:
\[
y_i \mid \theta_i \sim \mathrm{Poisson}\bigl(\exp(\theta_i)\bigr), \qquad
\theta_i \mid \beta, \sigma_\theta^2 \sim \mathrm{N}\bigl(x_i^\top \beta,\; \sigma_\theta^2\bigr),
\]
with a conjugate Normal–Gamma prior (dNormal_Gamma) on the population parameters \((\beta, \sigma_\theta^2)\).
The two-block Gibbs sampler alternates:

1. Block 1 (population): draw \((\beta, \sigma_\theta^2)\) given the current \(\theta\) with a single call to rglmb (family gaussian()).
2. Block 2 (observations): draw each \(\theta_i\) given \((\beta, \sigma_\theta^2)\) with rglmb (family poisson()).

With \(n\) observations, Block 2
requires \(n\) calls to
rglmb per iteration. To keep run time
manageable for the vignette, we use a 1% random subset of the data for
training and reserve the rest for out-of-sample prediction. The chain
used in Sections 4–5 is precomputed with the same
settings as
demo("Ex_09_BikeSharingPoisson")
(n_burn = 200, n_sim = 1000) and stored as
inst/extdata/BikeSharing_ch14_gibbs.rds.
Section 3 still shows the full Gibbs code for reference; it is not
re-run when this vignette is built.
data("BikeSharing")
# Center continuous predictors
cont_vars <- c("temp", "atemp", "hum", "windspeed", "hr_sin", "hr_cos", "mon_sin", "mon_cos")
BikeSharing_c <- BikeSharing
BikeSharing_c[cont_vars] <- scale(BikeSharing[cont_vars], center = TRUE, scale = FALSE)
# Formula (all variable model)
form <- cnt ~ part_of_day + quarter + holiday + workingday + weathersit +
hr_sin + hr_cos + mon_sin + mon_cos + temp + atemp + hum + windspeed
# Formula (Limited variable model)
form2 <- cnt ~ part_of_day + quarter + holiday + workingday + weathersit +
hr_sin + hr_cos + mon_sin + mon_cos
# Train/test split: indices bundled with precomputed Gibbs output (matches demo, set.seed(42))
pct_train <- 0.01
n <- nrow(BikeSharing_c)
ch14_path <- system.file("extdata", "BikeSharing_ch14_gibbs.rds", package = "glmbayes")
stopifnot(nzchar(ch14_path), file.exists(ch14_path))
ch14_saved <- readRDS(ch14_path)
stopifnot(length(ch14_saved$idx_train) == round(pct_train * n))
idx_train <- ch14_saved$idx_train
idx_test <- setdiff(seq_len(n), idx_train)
Bike_train <- BikeSharing_c[idx_train, ]
Bike_test <- BikeSharing_c[idx_test, ]
X_train <- model.matrix(form2, data = Bike_train)
X_test <- model.matrix(form2, data = Bike_test)
y_train <- Bike_train$cnt
y_test <- Bike_test$cnt
n_train <- length(y_train)
n_test <- length(y_test)
p <- ncol(X_train)
# Initial theta and prior for population block
theta <- log(y_train + 0.5)
data_pop <- data.frame(theta = theta, Bike_train)
form_pop <- theta ~ part_of_day + quarter + holiday + workingday + weathersit +
hr_sin + hr_cos + mon_sin + mon_cos
ps_pop <- Prior_Setup(form_pop, family = gaussian(), data = data_pop)
#> Using default pwt = 0.05 (high-d default).

The following chunk is not evaluated when the
vignette is built (eval = FALSE). For a live run, execute
it in your session or use
demo("Ex_09_BikeSharingPoisson") (expect a
long runtime).
n_burn <- 200
n_sim <- 1000
beta_out <- matrix(0, nrow = n_sim, ncol = p)
sigma_out <- numeric(n_sim)
theta_out <- matrix(0, nrow = n_sim, ncol = n_train)
set.seed(123)
# Burn-in
burn_time <- system.time({
for (k in seq_len(n_burn)) {
out_pop <- rglmb(1, theta, X_train, family = gaussian(),
pfamily = dNormal_Gamma(ps_pop$mu, Sigma_0 = ps_pop$Sigma_0,
ps_pop$shape, ps_pop$rate))
beta <- as.vector(out_pop$coefficients[1, ])
sigma_theta_sq <- out_pop$dispersion[1]
mu_all <- as.vector(X_train %*% beta)
for (i in seq_len(n_train)) {
theta[i] <- rglmb(1, y_train[i], matrix(1, 1, 1), family = poisson(),
pfamily = dNormal(mu = mu_all[i], Sigma = sigma_theta_sq))$coefficients[1, 1]
}
}
})
burn_time
# Main simulation
sim_time <- system.time({
for (k in seq_len(n_sim)) {
out_pop <- rglmb(1, theta, X_train, family = gaussian(),
pfamily = dNormal_Gamma(ps_pop$mu, Sigma_0 = ps_pop$Sigma_0,
ps_pop$shape, ps_pop$rate))
beta <- as.vector(out_pop$coefficients[1, ])
sigma_theta_sq <- out_pop$dispersion[1]
mu_all <- as.vector(X_train %*% beta)
for (i in seq_len(n_train)) {
theta[i] <- rglmb(1, y_train[i], matrix(1, 1, 1), family = poisson(),
pfamily = dNormal(mu = mu_all[i], Sigma = sigma_theta_sq))$coefficients[1, 1]
}
beta_out[k, ] <- beta
sigma_out[k] <- sqrt(sigma_theta_sq)
theta_out[k, ] <- theta
}
})
sim_time

Load the precomputed draws (regenerate with
demo("Ex_09_BikeSharingPoisson") and save
with saveRDS to
inst/extdata/BikeSharing_ch14_gibbs.rds).
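A rough sketch of that save step after a live run; the list element names (idx_train, beta_out, sigma_out, theta_out) are assumptions chosen to match how the draws are used below, and the demo may organize the file differently:

saveRDS(
  list(idx_train = idx_train, beta_out = beta_out,
       sigma_out = sigma_out, theta_out = theta_out),
  file = "inst/extdata/BikeSharing_ch14_gibbs.rds"
)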
We summarize the main parameters (\(\beta\) and \(\sigma_\theta\)) using the
coda package (Plummer et al.
2006). Random effects \(\theta\)
are excluded from the summary. Object mcmc_main comes from
the saved chain (coefficient columns match X_train, plus
sigma_theta).
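As a minimal sketch (assuming the saved list stores the draws under beta_out and sigma_out, which is not shown explicitly here), mcmc_main can be assembled like this:

beta_out <- ch14_saved$beta_out    # n_sim x p matrix of population coefficients
sigma_out <- ch14_saved$sigma_out  # n_sim draws of sigma_theta
mcmc_main <- coda::mcmc(cbind(beta_out, sigma_theta = sigma_out))
colnames(mcmc_main) <- c(colnames(X_train), "sigma_theta")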
summary(mcmc_main)
#>
#> Iterations = 1:1000
#> Thinning interval = 1
#> Number of chains = 1
#> Sample size per chain = 1000
#>
#> 1. Empirical mean and standard deviation for each variable,
#> plus standard error of the mean:
#>
#> Mean SD Naive SE Time-series SE
#> (Intercept) 3.51912 0.26805 0.008477 0.008445
#> part_of_dayMorning 2.32069 0.27789 0.008788 0.009502
#> part_of_dayAfternoon 1.79354 0.36705 0.011607 0.012202
#> part_of_dayEvening 1.38643 0.28336 0.008961 0.009674
#> quarterQ2 -0.10774 0.30643 0.009690 0.010874
#> quarterQ3 -0.07867 0.36867 0.011658 0.012734
#> quarterQ4 0.45117 0.26950 0.008522 0.009739
#> holiday -0.05947 0.40337 0.012756 0.012170
#> workingday -0.10572 0.11632 0.003678 0.003678
#> weathersit -0.16806 0.08173 0.002584 0.002923
#> hr_sin -0.71063 0.18303 0.005788 0.006263
#> hr_cos -0.06914 0.18005 0.005694 0.005848
#> mon_sin -0.18727 0.18224 0.005763 0.006030
#> mon_cos -0.51362 0.19698 0.006229 0.006721
#> sigma_theta 0.67173 0.03751 0.001186 0.001338
#>
#> 2. Quantiles for each variable:
#>
#> 2.5% 25% 50% 75% 97.5%
#> (Intercept) 2.98100 3.3490 3.51721 3.69824 4.048539
#> part_of_dayMorning 1.79388 2.1413 2.31747 2.50564 2.864538
#> part_of_dayAfternoon 1.09263 1.5512 1.79171 2.03654 2.535714
#> part_of_dayEvening 0.83131 1.1894 1.38755 1.57537 1.920033
#> quarterQ2 -0.68878 -0.3203 -0.10473 0.11176 0.474540
#> quarterQ3 -0.86117 -0.3138 -0.07295 0.15504 0.646679
#> quarterQ4 -0.06511 0.2687 0.45146 0.62428 0.979505
#> holiday -0.87759 -0.3393 -0.07459 0.22403 0.729777
#> workingday -0.34146 -0.1813 -0.10715 -0.02728 0.120492
#> weathersit -0.31819 -0.2261 -0.16709 -0.11521 -0.005618
#> hr_sin -1.06890 -0.8303 -0.71183 -0.58576 -0.364450
#> hr_cos -0.43660 -0.1952 -0.06709 0.05361 0.274436
#> mon_sin -0.55833 -0.3092 -0.17928 -0.06931 0.167212
#> mon_cos -0.88295 -0.6536 -0.51927 -0.38338 -0.126460
#> sigma_theta 0.60361 0.6465 0.66966 0.69657 0.748600
es <- coda::effectiveSize(mcmc_main)
knitr::kable(
data.frame(parameter = names(es), effective_size = as.numeric(es)),
row.names = FALSE,
digits = 4,
caption = "Effective sample size (coda::effectiveSize)"
)

| parameter | effective_size |
|---|---|
| (Intercept) | 1007.6068 |
| part_of_dayMorning | 855.2566 |
| part_of_dayAfternoon | 904.8935 |
| part_of_dayEvening | 857.9380 |
| quarterQ2 | 794.0825 |
| quarterQ3 | 838.1824 |
| quarterQ4 | 765.8225 |
| holiday | 1098.5595 |
| workingday | 1000.0000 |
| weathersit | 781.9330 |
| hr_sin | 854.0864 |
| hr_cos | 947.7249 |
| mon_sin | 913.3215 |
| mon_cos | 859.0090 |
| sigma_theta | 785.7880 |
ac1 <- coda::autocorr(mcmc_main, lag = 1)
ac1_mat <- drop(ac1)
own_ac1 <- diag(ac1_mat)
names(own_ac1) <- colnames(mcmc_main)
knitr::kable(
data.frame(parameter = names(own_ac1), lag_1_autocorr = as.numeric(own_ac1)),
row.names = FALSE,
digits = 4,
caption = "Lag-1 autocorrelation (diagonal of coda::autocorr, lag = 1)"
)

| parameter | lag_1_autocorr |
|---|---|
| (Intercept) | 0.0697 |
| part_of_dayMorning | 0.0775 |
| part_of_dayAfternoon | 0.0494 |
| part_of_dayEvening | 0.0760 |
| quarterQ2 | 0.1143 |
| quarterQ3 | 0.0875 |
| quarterQ4 | 0.0491 |
| holiday | 0.0590 |
| workingday | 0.0134 |
| weathersit | 0.0754 |
| hr_sin | 0.0782 |
| hr_cos | 0.0872 |
| mon_sin | 0.0448 |
| mon_cos | 0.0753 |
| sigma_theta | 0.1195 |
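Trace and density plots from coda complement these tables as a visual check of mixing; a minimal sketch for two representative parameters:

plot(coda::as.mcmc(mcmc_main[, c("(Intercept)", "sigma_theta")]))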
We compare two prediction strategies on the held-out test set: Option A plugs the posterior mean of \(\beta\) into \(\exp(x_i^\top \beta)\), ignoring the observation-level effects, while Option B simulates the full posterior predictive distribution, drawing \(\theta_i\) and then a Poisson count for each sampled posterior draw, and uses the predictive mean.
beta_mean <- colMeans(beta_out)
sigma_mean <- mean(sigma_out)
# Option A: conditional mean
y_pred_cond <- exp(X_test %*% beta_mean)
mae_cond <- mean(abs(y_test - y_pred_cond))
rmse_cond <- sqrt(mean((y_test - y_pred_cond)^2))
# Option B: posterior predictive mean
n_pred <- 500
y_pred_samples <- matrix(0, nrow = n_pred, ncol = n_test)
for (s in seq_len(n_pred)) {
idx_s <- sample(n_sim, 1)
beta_s <- beta_out[idx_s, ]
sigma_s <- sigma_out[idx_s]
theta_test <- rnorm(n_test, mean = X_test %*% beta_s, sd = sigma_s)
y_pred_samples[s, ] <- rpois(n_test, lambda = exp(theta_test))
}
y_pred_mean <- colMeans(y_pred_samples)
mae_pp <- mean(abs(y_test - y_pred_mean))
rmse_pp <- sqrt(mean((y_test - y_pred_mean)^2))
cat("Option A (conditional): MAE =", round(mae_cond, 2), " RMSE =", round(rmse_cond, 2), "\n")
#> Option A (conditional): MAE = 87.27 RMSE = 136.55
cat("Option B (post. pred.): MAE =", round(mae_pp, 2), " RMSE =", round(rmse_pp, 2), "\n")
#> Option B (post. pred.): MAE = 96.16 RMSE = 137.24
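As a follow-up sketch, the posterior predictive draws from Option B also yield interval forecasts; 95% predictive intervals and their empirical coverage on the test set can be computed as follows:

pp_lower <- apply(y_pred_samples, 2, quantile, probs = 0.025)
pp_upper <- apply(y_pred_samples, 2, quantile, probs = 0.975)
mean(y_test >= pp_lower & y_test <= pp_upper)  # empirical coverage of the 95% intervals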
| Component | Implementation |
|---|---|
| Block 1 (population) | rglmb(1, theta, X, family = gaussian(), pfamily = dNormal_Gamma(...)) |
| Block 2 (observations) | For each \(i\): rglmb(1, y[i], matrix(1,1,1), family = poisson(), pfamily = dNormal(mu_i, sigma^2)) |
| Prior for \(\beta, \sigma^2\) | From Prior_Setup on \(\theta \sim X\beta\) (Gaussian) |
| Data | BikeSharing (1% train, 99% test for vignette) |
| Vignette chain | Precomputed BikeSharing_ch14_gibbs.rds (200 burn-in, 1000 stored iterations) |
For longer runs and full diagnostics, see
demo("Ex_09_BikeSharingPoisson"). For
hierarchical linear models, see Chapter 13.