In this section, we continue the discussion of the Bayesian model fit to the menarche data from the previous chapter. Recall that our main model (with the carefully specified prior) had the summary output below. The focus of this chapter will be to dig deeper and to examine model predictions/fit, residuals, outliers, and the DIC statistic for model comparisons (Spiegelhalter et al. 2002). GLM background appears in (McCullagh and Nelder 1989); the sampling approach follows (Nygren and Nygren 2006).
summary(glmb.out1)
#> Call
#> glmb(formula = cbind(Menarche, Total - Menarche) ~ Age2, family = binomial(logit),
#> pfamily = dNormal(mu = mu1, Sigma = V1), n = 1000, data = Menarche_Model_Data)
#>
#> Expected Residuals:
#> Min 1Q Median 3Q Max
#> -2.0736194 -0.9755338 -0.4294312 0.7481039 1.3641200
#>
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#>
#> Null Mode Prior Mean Prior.sd Max Like. Like.sd
#> (Intercept) 0.36015 0.00000 0.43229 -0.01081 0.063
#> Age2 0.00000 1.09861 0.32521 1.63197 0.059
#>
#> Bayesian Estimates Based on 1000 iid draws
#>
#> Post.Mode Post.Mean Post.Sd MC Error Pr(Null_tail) SE(tail)
#> (Intercept) -0.013688 -0.019693 0.064491 0.002039 0.000000 0.015
#> Age2 1.614377 1.618144 0.055033 0.001740 0.000000 0.000
#> Pr(Prior_tail)
#> (Intercept) 0.381
#> Age2 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Directional Tail Summaries:
#>
#> Metric vs Null vs Prior
#> Mahalanobis Distance 28.7851 9.0930
#> Tail Probability 0.0000 0.0000
#> [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#>
#>
#> Distribution Percentiles
#>
#> 1.0% 2.5% 5.0% Median 95.0% 97.5% 99.0%
#> (Intercept) -0.17473 -0.14778 -0.12127 -0.02201 0.08704 0.10776 0.125
#> Age2 1.49675 1.51001 1.52813 1.61678 1.70743 1.72662 1.748
#>
#> Effective Number of Parameters: 1.943787
#> Expected Residual Deviance: 28.72451
#> DIC: 114.7201
#>
#> Expected Mean dispersion: 1
#> Sq.root of Expected Mean dispersion: 1
#>
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.302
We start with a look at model predictions, both in and out of sample. The predict function can be used for both purposes, as we illustrate below.
Generating in-sample predictions requires just a simple call to the predict function to extract the required information from the glmb class object (here we extract the predictions on the "response" scale). Once extracted, the randomized draws for the predictions can be analyzed further. Here we simply look at the mean predictions (we examine other outputs for the out-of-sample predictions below) and then plot them for comparison against the underlying data.
As we can see from this plot, the predictions generally seem to match the underlying data relatively well, although they may somewhat overstate the percentage of girls who have had their first period in the lower age range. It is also hard to judge from this plot whether some of the points are outliers or whether they fall within the range where we would expect the observations to fall. We explore both of these issues further later in this chapter.
## Predicting for original data
pred1=predict(glmb.out1,type="response")
pred1_m=colMeans(pred1)
plot(Menarche/Total ~ Age, data=menarche,main="% of girls Who have had their first period")
lines(menarche$Age, pred1_m,lty=1)
To understand the remaining uncertainty tied to our model and to predictions for "new" observations, we use simulated values for the coefficients to simulate new data over a range of ages, which lets us place "credible sets" around the model. We use the median sample size of the underlying observations in the model. Let us briefly touch on each of the steps below.
To perform the simulation, we first set up a new dataset for the independent variables by creating a new age variable and setting the number of observations to simulate. We store the new independent variables in a new data frame with the same variable names as in the original dataset.
## Pull Model Info
mod_Object=glmb.out1
n=nrow(mod_Object$coefficients)
obs_size=median(menarche$Total) ## Median of Total counts
# Generate New Data for The Independent Variable
Age_New <- seq(8, 20, 0.25)
Age2_New=Age_New-13
# olddata - Should contain all variables used in original model formula
olddata=data.frame(Age=menarche$Age,
Menarche=menarche$Menarche,Total=menarche$Total,Age2=Age2)
# Should contain all variables used on RHS of model formula
newdata=data.frame(Age=Age_New,Age2=Age2_New)
Next, we pass the glmb output, the newdata, the olddata, and the type to the predict function. We then set up a matrix to hold the newly simulated values for the dependent variable and fill it using the rbinom function, with the predicted response values serving as the success probabilities.
## Predict and Simulate for newdata
pred_menarche=predict(mod_Object,newdata=newdata,olddata=olddata,type="response")
pred_y=matrix(0,nrow=n,ncol=length(Age_New))
for(i in 1:n){
pred_y[i,1:length(Age_New)]=rbinom(length(Age_New),size=obs_size,
prob=pred_menarche[i,1:length(Age_New)])
}
The output is summarized by producing both means and credible-set information for the proportions, based on the simulated coefficients and on the simulated data (the intervals for the latter will be wider). The latter intervals are the more appropriate ones when examining whether the actual data appear to contain outliers.
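Incidentally, the explicit loop above is not required: rbinom is vectorized, so the whole matrix of simulated counts can be drawn in a single call. The helper below is a hypothetical equivalent sketch (pm plays the role of pred_menarche):

```r
## Vectorized version of the simulation loop (sketch)
sim_counts <- function(pm, size) {
  ## as.vector(t(pm)) walks pm row by row, so byrow = TRUE puts each
  ## posterior draw's simulated counts back on its own row
  matrix(rbinom(length(pm), size = size, prob = as.vector(t(pm))),
         nrow = nrow(pm), byrow = TRUE)
}
pm <- matrix(runif(6), nrow = 2)  # 2 draws x 3 ages (made-up values)
sim_counts(pm, size = 92)
```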
## Produce mean and quantile information
pred_m=colMeans(pred_menarche)
pred_y_m=colMeans(pred_y/obs_size)
quant1_m=apply(pred_menarche,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
quant2_m=apply(pred_menarche,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
quant1_m_y=apply(pred_y/obs_size,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
quant2_m_y=apply(pred_y/obs_size,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
We now plot most of these summary measures together with the original data to visualize the various predictions. The plot suggests that the individual observations in the original data generally tend to fall between the lower and upper bounds of the credible sets based on the simulated data, although this is hard to verify at the lower end of the age scale from this chart. We explore the latter issue a bit more later in this chapter.
plot(Menarche/Total ~ Age, data=menarche,main="Percentage of girls Who have had their first period")
lines(Age_New, pred_m,lty=1)
lines(Age_New, quant1_m,lty=2)
lines(Age_New, quant2_m,lty=2)
lines(Age_New, quant1_m_y,lty=2)
lines(Age_New, quant2_m_y,lty=2)
We close this section with a comparison of our prior and posterior credible intervals. We store the posterior simulated credible intervals in a data frame and then compare them to the prior credible intervals. What we see is that the prior credible interval at age 13 is wider than the posterior credible interval and in fact contains it. At age 15, the posterior credible set is still contained within the prior credible set but skews heavily towards its upper end.
outdata=data.frame(Age=Age_New,Age2=Age2_New,pred_m=pred_m, quant1_m, quant2_m,quant1_m_y,quant2_m_y)
print("Prior credible set at age 13 was")
#> [1] "Prior credible set at age 13 was"
cbind(m1_lower,m1_upper)
#> m1_lower m1_upper
#> [1,] 0.3 0.7
print("Prior credible set at age 15 was")
#> [1] "Prior credible set at age 15 was"
cbind(m2_lower,m2_upper)
#> m2_lower m2_upper
#> [1,] 0.7 0.972
print("Posterior Predictions between ages 13 and 15 are")
#> [1] "Posterior Predictions between ages 13 and 15 are"
outdata[which(outdata$Age >= 13&outdata$Age <= 15 ),]
#> Age Age2 pred_m quant1_m quant2_m quant1_m_y quant2_m_y
#> 21 13.00 0.00 0.4950818 0.4631213 0.5269138 0.3997619 0.6000000
#> 22 13.25 0.25 0.5949412 0.5650843 0.6267817 0.5045238 0.6857143
#> 23 13.50 0.50 0.6875006 0.6593774 0.7176085 0.5904762 0.7809524
#> 24 13.75 0.75 0.7671624 0.7417636 0.7939361 0.6857143 0.8476190
#> 25 14.00 1.00 0.8314705 0.8096203 0.8547607 0.7523810 0.9047619
#> 26 14.25 1.25 0.8807580 0.8620487 0.9001824 0.8095238 0.9428571
#> 27 14.50 1.50 0.9170546 0.9016905 0.9321522 0.8571429 0.9714286
#> 28 14.75 1.75 0.9430077 0.9306520 0.9547214 0.8952381 0.9809524
#> 29 15.00 2.00 0.9611774 0.9516589 0.9700030 0.9142857 0.9904762
We now turn to a more careful examination of the deviance residuals. We first extract the residuals from the glmb object and then produce the means and credible intervals so we can plot them later in this section.
## Get Residuals for model, their means, and credible bounds
res_out=residuals(glmb.out1)
res_mean=colMeans(res_out)
sqrt(mean(res_mean^2))
#> [1] 1.036411
res_low1=apply(res_out,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
res_high1=apply(res_out,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
To understand whether any of the above residuals might be unexpected outliers, we now use the simulated predictions to simulate new data for each observation in the original dataset and construct a credible set for the residuals.
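For reference, the deviance residuals extracted above can be reproduced by hand for a single binomial observation. The helper below is a hypothetical illustration of the standard formula (y successes in n trials with fitted probability p), not part of the glmb package:

```r
## Binomial deviance residual for one observation (sketch)
dev_resid <- function(y, n, p) {
  mu <- n * p
  ## the convention 0 * log(0) = 0 handles the y = 0 and y = n cases
  t1 <- ifelse(y == 0, 0, y * log(y / mu))
  t2 <- ifelse(y == n, 0, (n - y) * log((n - y) / (n - mu)))
  sign(y - mu) * sqrt(2 * (t1 + t2))
}
dev_resid(y = 5, n = 20, p = 0.5)  # negative: fewer successes than expected
```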
## Simulate new data and get residuals for simulated data
ysim1=simulate(glmb.out1,nsim=1,seed=10401L,pred=pred1,family="binomial",
prior.weights=weights(glmb.out1))
res_ysim_out1=residuals(glmb.out1,ysim=ysim1)
res_low=apply(res_ysim_out1,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
res_high=apply(res_ysim_out1,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
Using the simulated credible sets, we now plot everything together. This reveals that the expected values of the actual residuals all fall within the credible sets of the simulated residuals for the new data. However, the credible sets for some of the residuals at the lower ages fall outside of those bounds. We will look at how alternate model specifications can avoid this later in this chapter.
# Plot Credible Interval bounds for Deviance Residuals
plot(res_mean~Age,ylim=c(-2.5,2.5),
main="Credible Interval Bounds for Logit Model Deviance Residuals",
xlab = "Age", ylab = "Dev. Residuals")
lines(Age, 0*res_mean,lty=1)
lines(Age, res_low,lty=1)
lines(Age, res_high,lty=1)
lines(Age, res_low1,lty=2)
lines(Age, res_high1,lty=2)
The DIC statistic is mainly useful when comparing the predictions from several models with the same likelihood function but with different specifications (say, different link functions or different model variables). Here, we explore alternative link functions for the Menarche data by fitting Probit and Clog-Log models and then compare the models using the DIC statistic and related outputs.
The Probit model uses an alternative link function based on the cumulative distribution function of the normal distribution instead of that of the logistic distribution. Below, we use a similar approach to the prior specification as in the earlier chapter for the Logit model and then produce the corresponding output. We will not comment on these results here and instead hold off until the subsection comparing the DIC statistics from the various models.
## ----Probit: set up link function info and initialize prior matrices-----------
## Set up link function and initialize prior mean and Variance-Covariance matrices
bi_probit <- binomial(link="probit")
mu2<-matrix(0,nrow=nvars,ncol=1)
rownames(mu2)=c("Intercept","Age2")
colnames(mu2)=c("Prior Mean")
V2<-1*diag(nvars)
rownames(V2)=c("Intercept","Age2")
colnames(V2)=c("Intercept","Age2")
## ----Probit:set prior means----------------------------------------------------
## Prior mean for intercept is set to point estimate
## at reference age1 (on probit scale)
mu2[1,1]=bi_probit$linkfun(m1)
## Prior mean for slope is set to difference in point estimates
## on probit scale divided by Age_Diff
mu2[2,1]=(bi_probit$linkfun(m2) -bi_probit$linkfun(m1))/Age_Diff
print(mu2)
#> Prior Mean
#> Intercept 0.0000000
#> Age2 0.6407758
## ----Probit:set prior Variance Covariance matrix-------------------------------
## Implied standard deviations for point estimates on probit scale
sd_m1= (bi_probit$linkfun(m1) -bi_probit$linkfun(m1_lower))/1.96
sd_m2= (bi_probit$linkfun(m2) -bi_probit$linkfun(m2_lower))/1.96
## Implied Standard deviation for slope (using variance formula for difference between two variables)
a=(1/Age_Diff)
sd_slope=sqrt((a*sd_m1)^2+(a*sd_m2)^2-2*a*a*(sd_m1*sd_m2*m_corr))
#Cov(m1,slope)=cov(m1, a*(m2-m1)) =a*E[(m1-E[m1])((m2-m1)-E[m2-m1])]
# =a*E[(m1-E[m1])(m2-E[m2])]- a* E[(m1-E[m1])(m1-E[m1])]
## =a*Cov[m1,m2] - a*Var[m1]
## =a*sd_m1*sd_m2*m_corr-a* sd_m1*sd_m1
cov_V2=a*sd_m1*sd_m2*m_corr-a* sd_m1*sd_m1
# Set covariance matrix
V2[1,1]=sd_m1^2
V2[2,2]=sd_slope^2
V2[1,2]=cov_V2
V2[2,1]=V2[1,2]
print(V2)
#> Intercept Age2
#> Intercept 0.07158369 -0.01512075
#> Age2 -0.01512075 0.03453205
## ----Run Probit,results = "hide"-----------------------------------------------
Menarche_Model_Data=data.frame(Age=menarche$Age,Total=menarche$Total,
Menarche=menarche$Menarche,Age2)
glmb.out2<-glmb(n=1000,cbind(Menarche, Total-Menarche) ~Age2,family=binomial(probit),
pfamily=dNormal(mu=mu2,Sigma=V2),data=Menarche_Model_Data)
print(glmb.out2)
#>
#> Call: glmb(formula = cbind(Menarche, Total - Menarche) ~ Age2, family = binomial(probit),
#> pfamily = dNormal(mu = mu2, Sigma = V2), n = 1000, data = Menarche_Model_Data)
#>
#> Posterior Mean Coefficients:
#> (Intercept) Age2
#> -0.0198 0.9030
#>
#> Effective Number of Parameters: 1.943412
#> Expected Residual Deviance: 24.86448
#> DIC: 110.8597
summary(glmb.out2)
#> Call
#> glmb(formula = cbind(Menarche, Total - Menarche) ~ Age2, family = binomial(probit),
#> pfamily = dNormal(mu = mu2, Sigma = V2), n = 1000, data = Menarche_Model_Data)
#>
#> Expected Residuals:
#> Min 1Q Median 3Q Max
#> -1.6097129 -0.9189120 -0.4755863 0.4770514 1.7951986
#>
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#>
#> Null Mode Prior Mean Prior.sd Max Like. Like.sd
#> (Intercept) 0.22517 0.00000 0.26755 -0.01724 0.035
#> Age2 0.00000 0.64078 0.18583 0.90782 0.030
#>
#> Bayesian Estimates Based on 1000 iid draws
#>
#> Post.Mode Post.Mean Post.Sd MC Error Pr(Null_tail) SE(tail)
#> (Intercept) -0.0186254 -0.0198020 0.0335817 0.0010619 0.0000000 0.014
#> Age2 0.9007094 0.9029892 0.0296440 0.0009374 0.0000000 0.000
#> Pr(Prior_tail)
#> (Intercept) 0.276
#> Age2 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Directional Tail Summaries:
#>
#> Metric vs Null vs Prior
#> Mahalanobis Distance 33.1971 9.6099
#> Tail Probability 0.0000 0.0000
#> [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#>
#>
#> Distribution Percentiles
#>
#> 1.0% 2.5% 5.0% Median 95.0% 97.5% 99.0%
#> (Intercept) -0.09196 -0.08367 -0.07457 -0.02171 0.03719 0.04469 0.059
#> Age2 0.83583 0.84606 0.85482 0.90388 0.95229 0.95959 0.971
#>
#> Effective Number of Parameters: 1.943412
#> Expected Residual Deviance: 24.86448
#> DIC: 110.8597
#>
#> Expected Mean dispersion: 1
#> Sq.root of Expected Mean dispersion: 1
#>
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.249
The Clog-Log model offers an additional alternative for binomial data, and here we set up a model using this link function. Unlike the Logit and Probit links, the Clog-Log link function is not symmetric, so we also run an alternative model below with the outcome reversed. Again, we hold off on interpretation until later in the chapter.
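The asymmetry of the Clog-Log link is easy to verify numerically: flipping the outcome merely changes the sign of the linear predictor under the logit link, but not under the cloglog link. A small sketch:

```r
## Logit is symmetric under p -> 1 - p; cloglog is not (sketch)
cloglog <- function(p) log(-log(1 - p))  # matches binomial(link="cloglog")$linkfun
p <- 0.3
qlogis(1 - p) + qlogis(p)    # 0: logit(1 - p) = -logit(p)
cloglog(1 - p) + cloglog(p)  # nonzero: reversing the outcome changes the model
```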
## ----cloglog: set up link function info and initialize prior matrices-----------
## Set up link function and initialize prior mean and Variance-Covariance matrices
bi_cloglog <- binomial(link="cloglog")
mu3<-matrix(0,nrow=nvars,ncol=1)
rownames(mu3)=c("Intercept","Age2")
colnames(mu3)=c("Prior Mean")
V3<-1*diag(nvars)
rownames(V3)=c("Intercept","Age2")
colnames(V3)=c("Intercept","Age2")
## ----cloglog:set prior means----------------------------------------------------
## Prior mean for intercept is set to point estimate
## at reference age1 (on cloglog scale)
mu3[1,1]=bi_cloglog$linkfun(m1)
## Prior mean for slope is set to difference in point estimates
## on cloglog scale divided by Age_Diff
mu3[2,1]=(bi_cloglog$linkfun(m2) -bi_cloglog$linkfun(m1))/Age_Diff
print(mu3)
#> Prior Mean
#> Intercept -0.3665129
#> Age2 0.6002727
## ----Cloglog:set prior Variance Covariance matrix-------------------------------
## Implied standard deviations for point estimates on cloglog scale
sd_m1= (bi_cloglog$linkfun(m1) -bi_cloglog$linkfun(m1_lower))/1.96
sd_m2= (bi_cloglog$linkfun(m2) -bi_cloglog$linkfun(m2_lower))/1.96
## Implied Standard deviation for slope (using variance formula for difference between two variables)
a=(1/Age_Diff)
sd_slope=sqrt((a*sd_m1)^2+(a*sd_m2)^2-2*a*a*(sd_m1*sd_m2*m_corr))
#Cov(m1,slope)=cov(m1, a*(m2-m1)) =a*E[(m1-E[m1])((m2-m1)-E[m2-m1])]
# =a*E[(m1-E[m1])(m2-E[m2])]- a* E[(m1-E[m1])(m1-E[m1])]
## =a*Cov[m1,m2] - a*Var[m1]
## =a*sd_m1*sd_m2*m_corr-a* sd_m1*sd_m1
cov_V3=a*sd_m1*sd_m2*m_corr-a* sd_m1*sd_m1
# Set covariance matrix
V3[1,1]=sd_m1^2
V3[2,2]=sd_slope^2
V3[1,2]=cov_V3
V3[2,1]=V3[1,2]
print(V3)
#> Intercept Age2
#> Intercept 0.11491322 -0.03502783
#> Age2 -0.03502783 0.03365986
## ----Run cloglog,results = "hide"-----------------------------------------------
Menarche_Model_Data=data.frame(Age=menarche$Age,Total=menarche$Total,
Menarche=menarche$Menarche,Age2)
glmb.out3<-glmb(n=1000,cbind(Menarche, Total-Menarche) ~Age2,family=binomial(cloglog),
pfamily=dNormal(mu=mu3,Sigma=V3),data=Menarche_Model_Data)
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
print(glmb.out3)
#>
#> Call: glmb(formula = cbind(Menarche, Total - Menarche) ~ Age2, family = binomial(cloglog),
#> pfamily = dNormal(mu = mu3, Sigma = V3), n = 1000, data = Menarche_Model_Data)
#>
#> Posterior Mean Coefficients:
#> (Intercept) Age2
#> -0.5933 0.9464
#>
#> Effective Number of Parameters: 1.980864
#> Expected Residual Deviance: 120.8574
#> DIC: 206.8901
summary(glmb.out3)
#> Call
#> glmb(formula = cbind(Menarche, Total - Menarche) ~ Age2, family = binomial(cloglog),
#> pfamily = dNormal(mu = mu3, Sigma = V3), n = 1000, data = Menarche_Model_Data)
#>
#> Expected Residuals:
#> Min 1Q Median 3Q Max
#> -3.976013 -2.540986 -1.121601 1.196521 3.400770
#>
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#>
#> Null Mode Prior Mean Prior.sd Max Like. Like.sd
#> (Intercept) -0.1173 -0.3665 0.3390 -0.5960 0.041
#> Age2 0.0000 0.6003 0.1835 0.9530 0.031
#>
#> Bayesian Estimates Based on 1000 iid draws
#>
#> Post.Mode Post.Mean Post.Sd MC Error Pr(Null_tail) SE(tail)
#> (Intercept) -0.5922412 -0.5932565 0.0421467 0.0013328 0.0000000 0
#> Age2 0.9441886 0.9463981 0.0282187 0.0008924 0.0000000 0
#> Pr(Prior_tail)
#> (Intercept) <2e-16 ***
#> Age2 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Directional Tail Summaries:
#>
#> Metric vs Null vs Prior
#> Mahalanobis Distance 33.4992 12.2175
#> Tail Probability 0.0000 0.0000
#> [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#>
#>
#> Distribution Percentiles
#>
#> 1.0% 2.5% 5.0% Median 95.0% 97.5% 99.0%
#> (Intercept) -0.6930 -0.6756 -0.6643 -0.5928 -0.5266 -0.5133 -0.500
#> Age2 0.8801 0.8897 0.8983 0.9468 0.9932 1.0024 1.013
#>
#> Effective Number of Parameters: 1.980864
#> Expected Residual Deviance: 120.8574
#> DIC: 206.8901
#>
#> Expected Mean dispersion: 1
#> Sq.root of Expected Mean dispersion: 1
#>
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.271
Here we set up and run the reverse Clog-Log model. It models the opposite probability and, because of the asymmetry of the link function, is not equivalent to the earlier models.
## ----cloglog: set up link function info and initialize prior matrices-----------
## Set up link function and initialize prior mean and Variance-Covariance matrices
bi_cloglog <- binomial(link="cloglog")
mu4<-matrix(0,nrow=nvars,ncol=1)
rownames(mu4)=c("Intercept","Age2")
colnames(mu4)=c("Prior Mean")
V4<-1*diag(nvars)
rownames(V4)=c("Intercept","Age2")
colnames(V4)=c("Intercept","Age2")
## ----cloglog:set prior means----------------------------------------------------
## Prior mean for intercept is set to point estimate
## at reference age1 (on cloglog scale)
mu4[1,1]=bi_cloglog$linkfun((1-m1))
## Prior mean for slope is set to difference in point estimates
## on cloglog scale divided by Age_Diff
mu4[2,1]=(bi_cloglog$linkfun((1-m2))-bi_cloglog$linkfun((1-m1)))/Age_Diff
print(mu4)
#> Prior Mean
#> Intercept -0.3665129
#> Age2 -0.9419272
## ----Reverse cloglog:set prior Variance Covariance matrix-----------------------
## Implied standard deviations for point estimates on cloglog scale
sd_m1= (bi_cloglog$linkfun((1-m1_lower))-bi_cloglog$linkfun((1-m1)))/1.96
sd_m2= (bi_cloglog$linkfun((1-m2_lower))-bi_cloglog$linkfun((1-m2)))/1.96
## Implied Standard deviation for slope (using variance formula for difference between two variables)
a=(1/Age_Diff)
sd_slope=sqrt((a*sd_m1)^2+(a*sd_m2)^2-2*a*a*(sd_m1*sd_m2*m_corr))
#Cov(m1,slope)=cov(m1, a*(m2-m1)) =a*E[(m1-E[m1])((m2-m1)-E[m2-m1])]
# =a*E[(m1-E[m1])(m2-E[m2])]- a* E[(m1-E[m1])(m1-E[m1])]
## =a*Cov[m1,m2] - a*Var[m1]
## =a*sd_m1*sd_m2*m_corr-a* sd_m1*sd_m1
cov_V4=a*sd_m1*sd_m2*m_corr-a* sd_m1*sd_m1
# Set covariance matrix
V4[1,1]=sd_m1^2
V4[2,2]=sd_slope^2
V4[1,2]=cov_V4
V4[2,1]=V4[1,2]
print(V4)
#> Intercept Age2
#> Intercept 0.079357097 -0.004625472
#> Age2 -0.004625472 0.081557487
## ----Run cloglog,results = "hide"-----------------------------------------------
Menarche_Model_Data=data.frame(Age=menarche$Age,Total=menarche$Total,
Menarche=menarche$Menarche,Age2)
glmb.out4<-glmb(n=1000,cbind(Total-Menarche,Menarche) ~Age2,family=binomial(cloglog),
pfamily=dNormal(mu=mu4,Sigma=V4),data=Menarche_Model_Data)
print(glmb.out4)
#>
#> Call: glmb(formula = cbind(Total - Menarche, Menarche) ~ Age2, family = binomial(cloglog),
#> pfamily = dNormal(mu = mu4, Sigma = V4), n = 1000, data = Menarche_Model_Data)
#>
#> Posterior Mean Coefficients:
#> (Intercept) Age2
#> -0.5791 -1.0780
#>
#> Effective Number of Parameters: 2.023384
#> Expected Residual Deviance: 36.67336
#> DIC: 122.7485
summary(glmb.out4)
#> Call
#> glmb(formula = cbind(Total - Menarche, Menarche) ~ Age2, family = binomial(cloglog),
#> pfamily = dNormal(mu = mu4, Sigma = V4), n = 1000, data = Menarche_Model_Data)
#>
#> Expected Residuals:
#> Min 1Q Median 3Q Max
#> -2.9157034 -0.8127396 -0.4476041 0.4919149 2.4723620
#>
#> Prior and Maximum Likelihood Estimates with Standard Deviations
#>
#> Null Mode Prior Mean Prior.sd Max Like. Like.sd
#> (Intercept) -0.6364 -0.3665 0.2817 -0.5836 0.043
#> Age2 0.0000 -0.9419 0.2856 -1.0790 0.036
#>
#> Bayesian Estimates Based on 1000 iid draws
#>
#> Post.Mode Post.Mean Post.Sd MC Error Pr(Null_tail) SE(tail)
#> (Intercept) -0.577562 -0.579122 0.042663 0.001349 0.088000 0
#> Age2 -1.075289 -1.078039 0.034946 0.001105 0.000000 0
#> Pr(Prior_tail)
#> (Intercept) <2e-16 ***
#> Age2 <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> Directional Tail Summaries:
#>
#> Metric vs Null vs Prior
#> Mahalanobis Distance 32.7030 5.0037
#> Tail Probability 0.0000 0.0000
#> [Tail probabilities are P(delta^T * Z <= 0) in whitened space]
#>
#>
#> Distribution Percentiles
#>
#> 1.0% 2.5% 5.0% Median 95.0% 97.5% 99.0%
#> (Intercept) -0.6723 -0.6616 -0.6505 -0.5785 -0.5088 -0.4957 -0.481
#> Age2 -1.1545 -1.1427 -1.1332 -1.0784 -1.0180 -1.0105 -0.992
#>
#> Effective Number of Parameters: 2.023384
#> Expected Residual Deviance: 36.67336
#> DIC: 122.7485
#>
#> Expected Mean dispersion: 1
#> Sq.root of Expected Mean dispersion: 1
#>
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.276
The Deviance Information Criterion (DIC) statistic is a Bayesian analogue of the AIC, useful for comparing different model specifications for the same likelihood function. Generally, smaller DIC statistics are preferred. Like other information criteria, the DIC favors models with a better fit but includes a penalty for model complexity. The DIC is computed from the Monte Carlo simulation output. In addition to the DIC itself, the extractAIC function provides an "effective number of parameters" output (which tends to be more relevant for hierarchical models).
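The computation behind these numbers can be sketched as follows; the function is a hypothetical illustration of the usual definition (DIC equals the expected deviance plus the effective number of parameters), not the actual glmb internals:

```r
## DIC from posterior deviance draws (sketch)
## D_draws: deviance at each posterior draw
## D_hat:   deviance at the posterior mean of the coefficients
dic_stats <- function(D_draws, D_hat) {
  D_bar <- mean(D_draws)        # expected (residual) deviance
  pD    <- D_bar - D_hat        # effective number of parameters
  c(pD = pD, DIC = D_bar + pD)  # equivalently D_hat + 2 * pD
}
dic_stats(D_draws = c(29.1, 30.4, 28.9, 30.0), D_hat = 28.7)
```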
As we can see from the output below, the DIC values for the two Clog-Log models are much worse than for the other two models, suggesting they do a worse job of approximating the relationship between age and the percentage of girls who have had their first period. The difference in DIC between the Logit and Probit models is smaller, although the Probit model appears to have the lower DIC. Below the DIC table, we also provide the same kind of output from the classical models and the associated AIC, which shows a similar pattern.
DIC_Out=rbind(
extractAIC(glmb.out1),
extractAIC(glmb.out2),
extractAIC(glmb.out3),
extractAIC(glmb.out4))
rownames(DIC_Out)=c("Logit","Probit","Clog-Log","Reverse Clog-Log")
DIC_Out
#> pD DIC
#> Logit 1.943787 114.7201
#> Probit 1.943412 110.8597
#> Clog-Log 1.980864 206.8901
#> Reverse Clog-Log 2.023384 122.7485
glm.out1=glm(cbind(Menarche, Total-Menarche) ~Age2,family=binomial(logit),data=Menarche_Model_Data)
glm.out2=glm(cbind(Menarche, Total-Menarche) ~Age2,family=binomial(probit),data=Menarche_Model_Data)
glm.out3=glm(cbind(Menarche, Total-Menarche) ~Age2,family=binomial(cloglog),data=Menarche_Model_Data)
#> Warning: glm.fit: fitted probabilities numerically 0 or 1 occurred
glm.out4=glm(cbind(Total-Menarche,Menarche ) ~Age2,family=binomial(cloglog),data=Menarche_Model_Data)
AIC_Out=rbind(extractAIC(glm.out1),
extractAIC(glm.out2),
extractAIC(glm.out3),
extractAIC(glm.out4))
rownames(AIC_Out)=c("Logit","Probit","Clog-Log","Reverse Clog-Log")
colnames(AIC_Out)=c("edf","AIC")
AIC_Out
#> edf AIC
#> Logit 2 114.7553
#> Probit 2 110.9392
#> Clog-Log 2 206.8726
#> Reverse Clog-Log 2 122.6905
Predictions (Probit model)
## Predict and Simulate for newdata
mod_Object=glmb.out2
pred1=predict(glmb.out2,type="response")
pred1_m=colMeans(pred1)
n=nrow(mod_Object$coefficients)
pred_menarche=predict(mod_Object,newdata=newdata,olddata=olddata,type="response")
pred_y=matrix(0,nrow=n,ncol=length(Age_New))
for(i in 1:n){
pred_y[i,1:length(Age_New)]=rbinom(length(Age_New),size=obs_size,
prob=pred_menarche[i,1:length(Age_New)])
}
## Produce mean and quantile information
pred_m=colMeans(pred_menarche)
pred_y_m=colMeans(pred_y/obs_size)
quant1_m=apply(pred_menarche,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
quant2_m=apply(pred_menarche,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
quant1_m_y=apply(pred_y/obs_size,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
quant2_m_y=apply(pred_y/obs_size,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
plot(Menarche/Total ~ Age, data=menarche,main="Percentage of girls Who have had their first period")
lines(Age_New, pred_m,lty=1)
lines(Age_New, quant1_m,lty=2)
lines(Age_New, quant2_m,lty=2)
lines(Age_New, quant1_m_y,lty=2)
lines(Age_New, quant2_m_y,lty=2)
Residuals (Probit model)
## Get Residuals for model, their means, and credible bounds
res_out=residuals(glmb.out2)
res_mean=colMeans(res_out)
sqrt(mean(res_mean^2))
#> [1] 0.9584681
res_low1=apply(res_out,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
res_high1=apply(res_out,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
## Simulate new data and get residuals for simulated data
ysim1=simulate(glmb.out2,nsim=1,seed=10402L,pred=pred1,family="binomial",
prior.weights=weights(glmb.out2))
res_ysim_out1=residuals(glmb.out2,ysim=ysim1)
res_low=apply(res_ysim_out1,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
res_high=apply(res_ysim_out1,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
# Plot Credible Interval bounds for Deviance Residuals
plot(res_mean~Age,ylim=c(-2.5,2.5),
main="Credible Interval Bounds for Probit Model Deviance Residuals",
xlab = "Age", ylab = "Dev. Residuals")
lines(Age, 0*res_mean,lty=1)
lines(Age, res_low,lty=1)
lines(Age, res_high,lty=1)
lines(Age, res_low1,lty=2)
lines(Age, res_high1,lty=2)
Predictions (Clog-Log model)
## Predict and Simulate for newdata
mod_Object=glmb.out3
pred1=predict(glmb.out3,type="response")
pred1_m=colMeans(pred1)
n=nrow(mod_Object$coefficients)
pred_menarche=predict(mod_Object,newdata=newdata,olddata=olddata,type="response")
pred_y=matrix(0,nrow=n,ncol=length(Age_New))
for(i in 1:n){
pred_y[i,1:length(Age_New)]=rbinom(length(Age_New),size=obs_size,
prob=pred_menarche[i,1:length(Age_New)])
}
## Produce mean and quantile information
pred_m=colMeans(pred_menarche)
pred_y_m=colMeans(pred_y/obs_size)
quant1_m=apply(pred_menarche,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
quant2_m=apply(pred_menarche,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
quant1_m_y=apply(pred_y/obs_size,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
quant2_m_y=apply(pred_y/obs_size,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
plot(Menarche/Total ~ Age, data=menarche,main="Percentage of girls Who have had their first period")
lines(Age_New, pred_m,lty=1)
lines(Age_New, quant1_m,lty=2)
lines(Age_New, quant2_m,lty=2)
lines(Age_New, quant1_m_y,lty=2)
lines(Age_New, quant2_m_y,lty=2)
Residuals (Clog-Log model)
## Get Residuals for model, their means, and credible bounds
res_out=residuals(glmb.out3)
res_mean=colMeans(res_out)
sqrt(mean(res_mean^2))
#> [1] 2.181503
res_low1=apply(res_out,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
res_high1=apply(res_out,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
## Simulate new data and get residuals for simulated data
ysim1=simulate(glmb.out3,nsim=1,seed=10403L,pred=pred1,family="binomial",
prior.weights=weights(glmb.out3))
res_ysim_out1=residuals(glmb.out3,ysim=ysim1)
res_low=apply(res_ysim_out1,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
res_high=apply(res_ysim_out1,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
# Plot Credible Interval bounds for Deviance Residuals
plot(res_mean~Age,ylim=c(-4.0,4.0),
main="Credible Interval Bounds for cloglog Model Deviance Residuals",
xlab = "Age", ylab = "Dev. Residuals")
lines(Age, 0*res_mean,lty=1)
lines(Age, res_low,lty=1)
lines(Age, res_high,lty=1)
lines(Age, res_low1,lty=2)
lines(Age, res_high1,lty=2)
Predictions (Reverse Clog-Log model)
## Predict and Simulate for newdata
mod_Object=glmb.out4
pred1=predict(glmb.out4,type="response")
pred1_m=colMeans(pred1)
n=nrow(mod_Object$coefficients)
pred_menarche=predict(mod_Object,newdata=newdata,olddata=olddata,type="response")
pred_y=matrix(0,nrow=n,ncol=length(Age_New))
for(i in 1:n){
pred_y[i,1:length(Age_New)]=rbinom(length(Age_New),size=obs_size,
prob=pred_menarche[i,1:length(Age_New)])
}
## Produce mean and quantile information
pred_m=colMeans(pred_menarche)
pred_y_m=colMeans(pred_y/obs_size)
quant1_m=apply(pred_menarche,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
quant2_m=apply(pred_menarche,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
quant1_m_y=apply(pred_y/obs_size,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
quant2_m_y=apply(pred_y/obs_size,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
plot(Menarche/Total ~ Age, data=menarche,main="Percentage of girls Who have had their first period")
lines(Age_New, 1-pred_m,lty=1)
lines(Age_New, 1-quant1_m,lty=2)
lines(Age_New, 1-quant2_m,lty=2)
lines(Age_New, 1-quant1_m_y,lty=2)
lines(Age_New, 1-quant2_m_y,lty=2)
Residuals (Reverse Clog-Log model)
## Get Residuals for model, their means, and credible bounds
res_out=residuals(glmb.out4)
res_mean=colMeans(res_out)
sqrt(mean(res_mean^2))
#> [1] 1.178371
res_low1=apply(res_out,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
res_high1=apply(res_out,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
## Simulate new data and get residuals for simulated data
ysim1=simulate(glmb.out4,nsim=1,seed=10404L,pred=pred1,family="binomial",
prior.weights=weights(glmb.out4))
res_ysim_out1=residuals(glmb.out4,ysim=ysim1)
res_low=apply(res_ysim_out1,2,FUN=quantile,probs=c(0.025),na.rm=TRUE)
res_high=apply(res_ysim_out1,2,FUN=quantile,probs=c(0.975),na.rm=TRUE)
# Plot Credible Interval bounds for Deviance Residuals
plot(-res_mean~Age,ylim=c(-3.0,3.0),
main="Credible Interval Bounds - Rev. clog-log Model Deviance Residuals",
xlab = "Age", ylab = "Dev. Residuals")
lines(Age, 0*res_mean,lty=1)
lines(Age, -res_low,lty=1)
lines(Age, -res_high,lty=1)
lines(Age, -res_low1,lty=2)
lines(Age, -res_high1,lty=2)