Chapter A01: A detailed overview of the glmbayes package

Kjell Nygren

2026-04-30

Introduction

The glmbayes package produces iid samples for Bayesian Linear Models and Bayesian Generalized Linear Models by providing Bayesian versions of the lm and glm functions for classical models. Extensive efforts have been made to make the new functions, lmb and glmb, as similar as possible to their classical cousins. To that end, the functions add a single required argument (pfamily) and one optional argument (n) to those typically used by the classical functions. The Bayesian functions also generally inherit and/or provide implemented methods for nearly all of the methods available for the lm and glm functions, so the same types of output as from the classical functions can be readily produced.

This vignette is designed to give users a detailed overview of the package and the various functionalities available. In the rest of this introduction, we briefly touch on the basics of loading the package and accessing various help resources and demos (which we recommend as a next step for users after they read this vignette). The next section then gives an overview of all the core functions that most users are likely to use. This includes the two aforementioned functions, functions used to specify and check prior distributions, and methods available for the two functions. Most users can likely stop after that section and can then turn to either the help pages or the next set of vignettes, which go into more detailed comparisons of how the methods are used for these Bayesian functions versus their classical cousins.

For advanced users who are interested in implementing Block-Gibbs sampling procedures or in doing simulation for other, more complex models, we recommend continuing on to the third section in this vignette, which covers the two functions rlmb and rglmb, designed for implementation of precisely that type of sampling. The third section discusses how those functions can be called, references some examples/demos where they are used, and describes how to use the output from the functions to access output similar to that produced by the lmb and glmb functions.

For users who are interested in the details of how the simulation is performed, the final section contains a discussion of the various simulation functions that are used to sample from the posterior distributions. This includes a discussion of the three core functions that simulate for the currently implemented prior families and a couple of functions related to what we call the Central Normal distribution (the normal distribution between a lower and an upper bound). It also includes a discussion of a set of functions used to build the enveloping functions needed to get iid samples for some of the posterior distributions. That material is rather technical; the likelihood-subgradient foundation is given by (Nygren and Nygren 2006), with estimation architecture summarized in (Nygren 2025a) and envelope details in (Nygren 2025b).

The table below maps the function types in the package to the vignettes where they are discussed:

| Function Type | Functions | Vignettes |
|---|---|---|
| Prior Setup function | Prior_Setup | Chapter 01 — Getting Started with glmbayes |
| Main Model Interfaces (GLM / LM) | glmb | Chapter 01 — Getting Started with glmbayes |
| Methods | print.glmb | Chapter 01 — Getting Started with glmbayes |
| pfamilies | dNormal | Chapter 02 — Estimating Bayesian Linear Models |
| Main Model Interfaces (GLM / LM) | lmb | Chapter 02 — Estimating Bayesian Linear Models |
| Methods | summary.lmb | Chapter 02 — Estimating Bayesian Linear Models |
| Prior Setup function | Prior_Setup | Chapter 03 — Tailoring Priors - Leveraging the Prior_Setup Function; Chapter A12 — Technical Derivations for Priors Returned by Prior_Setup |
| Methods | print.PriorSetup | Chapter 03 — Tailoring Priors - Leveraging the Prior_Setup Function |
| Methods | predict.glmb, residuals.glmb, extractAIC.glmb | Chapter 04 — Predictions, Deviance Residuals, Model Statistics |
| Main Model Interfaces (GLM / LM) | glmb | Chapter 06 — Estimating Bayesian Generalized Linear Models |
| Methods | summary.glmb | Chapter 06 — Estimating Bayesian Generalized Linear Models |
| Main Model Interfaces (GLM / LM) | glmb | Chapter 07 — Models for the Binomial Family |
| Main Model Interfaces (GLM / LM) | glmb | Chapter 08 — Models for the Poisson Family |
| Main Model Interfaces (GLM / LM) | glmb | Chapter 09 — Models for the Gamma Family |
| Lower-Level Interfaces (GLM / LM) | rlmb | Chapter 13 — Hierarchical Linear Models |
| Lower-Level Interfaces (GLM / LM) | rglmb | Chapter 14 — Hierarchical Generalized Linear Models |
| Model-Specific Simulation Functions | rNormal_reg, rGamma_reg, rNormal_Gamma_reg, rindependent_norm_gamma_reg | Chapter A02 — Overview of Estimation Procedures |
| pfamilies | dGamma, dNormal_Gamma, dIndependent_Normal_Gamma | Chapter 11 — Estimating Models with Unknown Dispersion Parameters |
| Methods | Various method functions for glmbayes | Chapter A03 — Methods Available in glmbayes |
| Envelope Related Functions | Various functions | Chapter A08 — Overview of Envelope Related Functions |
| Parallel Simulation Related Functions | Various functions | Chapter A09 — Parallel Sampling Implementation using RcppParallel |
| Accelerated EnvelopeBuild Related Functions | Various functions | Chapter A10 — Accelerated EnvelopeBuild Implementation using OpenCL |
| Utilities | add_to_libpath_linux, add_to_path, add_to_path_linux, add_to_path_windows | Chapter 00 — Introduction; Chapter 01 — Getting Started |
| Datasets | AMI, carinsca, Cleveland | Chapter 01 — Getting Started |

Loading the package

The package currently requires the MASS package, so MASS also needs to be installed; it is loaded as part of loading the glmbayes package (see below).

library(glmbayes)

Accessing package information, general documentation, and demos

To access package information beyond what is contained in these vignettes, we suggest users review three sets of help pages. A call to help(glmbayes) yields a page with a detailed discussion of the package and an example at the bottom. Calling help(package="glmbayes") brings the user to a list of help pages for specific functions as well as links to a general description file, code demos, and these vignettes. Part of the objective of this vignette is to provide a more organized layout of the functions than the general list of help pages provided on the documentation page.

A call to demo(package="glmbayes"), finally, brings up a list of available demos for the package. A useful next step for users is to go through these demos, as they are set up to illustrate the functionality of the package.

Using the lmb and glmb functions and their methods

As noted above, the lmb and glmb functions are the two core functions in this package. More detailed side-by-side comparisons of the use of methods for these functions and their classical cousins are provided in the next two vignettes. Here we provide a brief overview of supporting functions available to facilitate the use of the core functions. In the first subsection, we show how the core functions are called (and how this compares to calls for the classical functions). We then briefly discuss how priors are specified, along with some related functions designed to assist with that task as well as with the task of viewing prior specifications after the modeling is completed.

The lmb and glmb functions

The code snippets below illustrate how model specification for these models can be done by simply providing an additional argument, a pfamily, that gives the type of prior distribution selected and the desired parameter values. A separate vignette gives guidance to users on how to select appropriate prior parameters (and some standardization steps that may be useful as part of that).

[NEED TO MODIFY Prior_Setup PRINT FUNCTION AND HOW IT HANDLES CASE WHERE X IS MISSING] [MAY ALSO WANT TO HAVE IT INITIALIZE SHAPE AND RATE PARAMETERS]

Calling lm vs. calling lmb

Here is the call for the lmb function and how it compares to the call to the classical lm function. The recommended prior here is the dNormal_Gamma pfamily, which requires four parameters (two for the Normal component and two for the Gamma component).

# Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
##Classical model-call
lm.D9 <- lm(weight ~ group)

# Bayesian Prior Setup and Call-Prior Should Generally be Customized
p_info=Prior_Setup(weight~group, family=gaussian())
#> Using default pwt = 0.01 (low-d default).
mu1=p_info$mu    
Sigma1=p_info$Sigma

#lmb.D9 <- lmb(weight ~ group,pfamily=dNormal_Gamma(mu1,Sigma1,shape=4,rate=0.1))

Calling glm vs. calling glmb

The call for the glmb function is quite similar and is here compared to the corresponding call for the glm function. For the binomial and Poisson families, the recommended prior is the dNormal pfamily, which requires two parameters (a prior mean vector and a prior variance-covariance matrix).

## Dobson (1990) Page 93: Randomized Controlled Trial :
counts <- c(18,17,15,20,10,20,25,13,12)
outcome <- gl(3,1,9)
treatment <- gl(3,3)
##Classical model-call
glm.D93 <- glm(counts ~ outcome + treatment, family = poisson())

# Bayesian Prior Setup and Call-Prior Should Generally be Customized
p_info2=Prior_Setup(counts ~ outcome + treatment,family=poisson())
#> Using default pwt = 0.01 (low-d default).
mu2=p_info2$mu    
Sigma2=p_info2$Sigma

glmb.D93 <- glmb(counts ~ outcome + treatment, family = poisson(),
                 pfamily = dNormal(mu = mu2, Sigma = Sigma2))

Specifying Priors

Prior Family Functions

The pfamily arguments described above are actually functions that provide needed information to the lmb and glmb functions. Since they are functions, they can also be called directly, as seen below, where they return lists with items related to the prior (see the documentation of these pfamily functions for details). These functions are also discussed in more detail in a later chapter.

While the current package only provides three implemented pfamilies, this is not a severe limitation, as users can easily write their own prior families and pass them to the lmb and glmb functions. One of our examples includes an illustration of this, and the approach is discussed more extensively in another vignette. Below, we simply illustrate how the functions can be called directly.

NOTE: IMPLEMENT THE EXAMPLE DISCUSSED ABOVE AND WRITE A SEPARATE VIGNETTE TO ACCOMPANY THE EXAMPLE.

Calling the dNormal prior family

This prior is typically used as a prior for the Poisson and Binomial families when calling either the glmb or rglmb functions. Calling it directly yields a list containing, among other items, the prior specification. The documentation for the pfamilies provides additional information on what the returned list contains and on the required and optional arguments.

dNormal(mu=mu2,Sigma=Sigma2,dispersion=NULL)

Calling the dNormal_Gamma prior family

This prior is typically used as a prior when calling either the lmb or rlmb functions, or when using the gaussian() family in the glmb and rglmb functions. Calling it directly again returns a list containing the prior specification.

dNormal_Gamma(mu=mu1,Sigma_0=Sigma1,shape=4,rate=0.1)

Calling the dGamma prior family

This prior is typically used when giving a prior for the dispersion parameter in the lmb, rlmb, glmb, and rglmb functions (in the latter two cases for the gaussian() and Gamma() families). Typically this would be done as part of Block-Gibbs sampling where the regression coefficients are updated in a separate block.

b <- p_info$coefficients
dGamma(shape=4,rate=0.1,beta=b)

Method functions

The lm and glm functions come with an extensive set of method functions that can be used to conduct supplemental analyses (typically during the post-modeling phase). Because the lmb and glmb functions are structured to return many of the same items as the classical functions, and because they inherit methods from glm and lm, many of the methods available for the classical functions are directly available for the glmb and lmb functions. As some of the output does tend to be different (in particular, coefficients are returned as random draws as opposed to a single maximum likelihood estimate), we have implemented specific methods for the "glmb" and "lmb" classes of objects. The current sets of methods for all four classes of objects are displayed below.

Two separate vignettes illustrate how calls to these methods can be used to replicate the output from the classical models. It is worth noting that a small subset of the classical methods is not yet implemented (although plans are in place to replicate some of those as well). Those are discussed in the separate vignettes as well. Here we provide lists of the methods available for the classical lm and glm functions (many of which are inherited by lmb and glmb) as well as methods that have been developed specifically for glmb and lmb.

[NOTE: AS NEEDED, ADD ADDITIONAL METHODS FOR lmb. THE LIST OF METHODS FOR glmb SHOULD BE FAIRLY COMPLETE]

Methods for class="lm"

methods(class="lm")
#>  [1] add1           addterm        alias          anova          boxcox        
#>  [6] case.names     coerce         confint        cooks.distance deviance      
#> [11] dfbeta         dfbetas        dffits         drop1          dropterm      
#> [16] dummy.coef     effects        extractAIC     family         formula       
#> [21] hatvalues      influence      initialize     kappa          labels        
#> [26] logLik         logtrans       model.frame    model.matrix   nobs          
#> [31] plot           predict        print          proj           qr            
#> [36] residuals      rstandard      rstudent       show           simulate      
#> [41] slotsFromS3    summary        variable.names vcov          
#> see '?methods' for accessing help and source code

Methods for class="glm"

methods(class="glm")
#>  [1] add1           addterm        anova          coerce         confint       
#>  [6] cooks.distance deviance       dfbetas        dffits         drop1         
#> [11] dropterm       effects        extractAIC     family         formula       
#> [16] gamma.shape    influence      initialize     logLik         model.frame   
#> [21] nobs           predict        print          profile        residuals     
#> [26] rstandard      rstudent       show           sigma          slotsFromS3   
#> [31] summary        vcov           weights       
#> see '?methods' for accessing help and source code

Methods for class="glmb"

While most of the methods for the glm class (when it has its own method) or the lm class (when it does not) work properly for objects of class glmb, some methods do not. For that reason, we have implemented the methods below specifically for the glmb class of objects. Future enhancements to this package may extend these methods. There are also additional methods for lm and/or glm that do not currently have functioning counterparts for the glmb class. This includes the add1, drop1, and dropterm functions (which should be implementable fairly easily) and the various influence.measures methods (influence, rstandard, rstudent, dfbeta, dfbetas, cooks.distance, and hatvalues), which may require more work. Plans do exist to add the former and perhaps the latter (which likely would look at influence on posterior modes and not on posterior means).

methods(class="glmb")
#>  [1] anova          case.names     confint        cooks.distance dfbetas       
#>  [6] dummy.coef     extractAIC     influence      logLik         plot          
#> [11] predict        print          residuals      rstandard      rstudent      
#> [16] simulate       summary        variable.names vcov          
#> see '?methods' for accessing help and source code

NOTE: CHECK WHICH METHODS FOR lmb ARE INHERITED AND FUNCTIONING VS. THOSE THAT MAY REQUIRE ADDITIONAL DEVELOPMENT.

methods(class="lmb")
#> [1] print     residuals
#> see '?methods' for accessing help and source code

The rlmb and rglmb functions and their summary functions

Because the functions lmb and glmb are set up to mirror their classical cousins, they include a great deal of overhead for pre- and post-processing that is unsuitable if the functions need to be called repetitively (say, during Block-Gibbs sampling). To facilitate Block-Gibbs sampling and other simulation, we provide more minimalistic interfaces to the simulation procedures in the form of the corresponding functions rlmb and rglmb.

These functions are called internally by lmb and glmb in much the same way that lm.fit and glm.fit are called by the lm and glm functions, so in a way they are Bayesian versions of those functions. We chose the r prefix for these functions to signal that they also represent random draws from probability distributions, in much the same way that rnorm, rgamma, and other functions in the stats package do.
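
As an illustration of the type of two-block scheme these functions are meant to support, a Block-Gibbs sampler for a Bayesian linear model can be sketched in base R alone. This is a self-contained textbook sketch under an independent normal prior on the coefficients and a gamma prior on the precision; none of the names below are glmbayes functions, and in this conjugate setting the package would instead sample the joint posterior directly.

```r
## Two-block Gibbs sampler for a Bayesian linear model (base-R sketch).
## Simulated data: all names here (n_obs, beta_true, ...) are local.
set.seed(42)
n_obs <- 50
x <- cbind(1, rnorm(n_obs))
beta_true <- c(1, 2)
y <- drop(x %*% beta_true) + rnorm(n_obs, sd = 0.5)

mu0 <- c(0, 0)                 # normal prior mean for beta
P0  <- 0.1 * diag(2)           # normal prior precision for beta
a0  <- 2; b0 <- 0.5            # gamma prior on the precision tau = 1/sigma^2

n_iter <- 2000
beta <- c(0, 0); tau <- 1
draws <- matrix(NA_real_, n_iter, 3,
                dimnames = list(NULL, c("beta1", "beta2", "sigma2")))
xtx <- crossprod(x)
xty <- crossprod(x, y)
for (i in seq_len(n_iter)) {
  ## Block 1: beta | tau, y ~ Normal(m, V)
  V <- solve(P0 + tau * xtx)
  m <- V %*% (P0 %*% mu0 + tau * xty)
  beta <- drop(m + t(chol(V)) %*% rnorm(2))
  ## Block 2: tau | beta, y ~ Gamma(a0 + n/2, b0 + SS/2)
  ss  <- sum((y - drop(x %*% beta))^2)
  tau <- rgamma(1, shape = a0 + n_obs / 2, rate = b0 + ss / 2)
  draws[i, ] <- c(beta, 1 / tau)
}
colMeans(draws)   # should be near beta_true = (1, 2) and sigma^2 = 0.25
```

In a real application, the first block would be replaced by a call to rglmb or rlmb (passing the current dispersion value), which is exactly the repetitive-call pattern these minimalistic interfaces are designed for.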

Calling the rlmb and rglmb functions

Instead of passing a formula to these functions, the user passes the dependent and independent variables directly: a dependent variable vector y and an independent variable design matrix x, respectively. A useful approach that can help the user derive the design matrix involves sequential calls to model.frame and model.matrix, as seen below.
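
As a base-R sketch of that approach, the y vector and x matrix for a formula can be derived with stats functions only (here reusing the plant-weight variables from the earlier example):

```r
## Derive the y vector and x design matrix from a formula via
## model.frame and model.matrix (base R only)
group  <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14,
            4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
f  <- weight ~ group
mf <- model.frame(f)
y  <- model.response(mf)   # dependent variable vector (length 20)
x  <- model.matrix(f, mf)  # 20 x 2 design matrix: (Intercept), groupTrt
```

The resulting y and x can then be passed directly as the y and x arguments of rlmb or rglmb.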

NOTE: THE SUMMARY FUNCTION IS CURRENTLY NOT WORKING FOR rlmb

Calling the rlmb function

## Annette Dobson (1990) "An Introduction to Generalized Linear Models".
## Page 9: Plant Weight Data.
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)

## Classical call (keep x and y on the fit for reuse below)
lm.D9 <- lm(weight ~ group, x = TRUE, y = TRUE)
lm_summary <- summary(lm.D9)

## Bayesian call
y <- lm.D9$y
n1 <- length(y)                  # 20 observations

## Gamma prior for the precision (1/dispersion): shape is prior
## observations/2 and rate is essentially a prior sum of squares,
## so that rate/shape matches the prior variance point estimate
n0 <- 2                          # prior observations
v_prior <- lm_summary$sigma^2    # prior point estimate for the variance
shape <- (n0 / n1) * (n1 / 2)
rate <- shape * v_prior
rate / shape                     # should match v_prior

## Normal prior for the regression coefficients
mu <- lm.D9$coefficients         # for testing purposes, set the prior mean to the mle
P <- 0.1 * diag(2)               # prior precision

# Bayesian Prior Setup and Call - Prior Should Generally be Customized
p_info <- Prior_Setup(weight ~ group)
#> Using default pwt = 0.01 (low-d default).
x <- p_info$x
outtemp4 <- rlmb(n = 1000, y = y, x = x,
                 pfamily = dNormal_Gamma(mu = mu, Sigma_0 = solve(P),
                                         shape = shape, rate = rate))

#summary(outtemp4)   # not yet working for rlmb objects

Calling the rglmb function

data(menarche, package = "MASS")
Age2 <- menarche$Age - 13

y <- menarche$Menarche / menarche$Total
wt <- menarche$Total

## Set up the design matrix directly; model.frame and model.matrix
## could alternatively be used to derive x from a formula
x <- matrix(1.0, nrow = length(Age2), ncol = 2)
x[, 2] <- Age2

## TODO: Modify Prior_Setup so it can take a model matrix as well as a model object
## Prior point estimates: probability of menarche of 0.5 at age 13
## and of 0.9 three years later
mu <- matrix(0.0, nrow = 2, ncol = 1)
mu[2, 1] <- (log(0.9/0.1) - log(0.5/0.5)) / 3

## Uncertainty around the point estimates: 2 standard deviations for the
## prior estimate at age 13 between 0.1 and 0.9, and a slope allowed to
## be up to 3 times as large as its point estimate
V1 <- diag(2)
V1[1, 1] <- ((log(0.9/0.1) - log(0.5/0.5)) / 2)^2
V1[2, 2] <- (3 * mu[2, 1] / 2)^2

out<-rglmb(n = 1000, y=y, x=x, pfamily=dNormal(mu=mu,Sigma=V1), weights = wt, 
           family = binomial(logit)) 

summary(out)
#> Call
#> rglmb(n = 1000, y = y, x = x, family = binomial(logit), pfamily = dNormal(mu = mu, 
#>     Sigma = V1), weights = wt)
#> 
#> Expected Deviance Residuals:
#>        Min         1Q     Median         3Q        Max 
#> -2.0502951 -1.0082616 -0.4821518  0.7618818  1.3543865 
#> 
#> Prior Estimates with Standard Deviations
#> 
#>    Prior Mean Prior.sd Max Like. Like.sd
#> V1    0.00000  1.09861  -0.01081   0.063
#> V2    0.73241  1.09861   1.63197   0.059
#> 
#> Bayesian Estimates Based on 1000 iid draws
#> 
#>    Post.Mode Post.Mean   Post.Sd MC Error Pr(tail)    
#> V1 -0.010696 -0.009177  0.065664    0.002 0.446553    
#> V2  1.629390  1.630505  0.059787    0.002 0.000999 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Distribution Percentiles
#> 
#>         1.0%      2.5%      5.0%    Median     95.0%     97.5% 99.0%
#> V1 -0.167143 -0.140983 -0.114516 -0.008605  0.097706  0.115446 0.131
#> V2  1.495827  1.516834  1.535443  1.629163  1.728268  1.745358 1.766
#> 
#> Effective Number of Parameters: 2.111456 
#> Expected Residual Deviance: 28.81616 
#> DIC: 114.9794 
#> 
#> Expected Mean dispersion: 1 
#> Sq.root of Expected Mean dispersion: 1 
#> 
#> Mean Likelihood Subgradient Candidates Per iid sample: 1.293

Methods for the rlmb, rglmb, and their summary functions

To allow the output from the rglmb and rlmb functions to benefit from the wide range of methods available for their lmb and glmb cousins, we implement a few additional methods that enable most of those methods to also work for these objects post modeling.

methods(class="rglmb")
#> [1] deviance   extractAIC print      residuals  summary   
#> see '?methods' for accessing help and source code
methods(class="summary.rglmb")
#> [1] formula print  
#> see '?methods' for accessing help and source code
methods(class="rlmb")
#> [1] print
#> see '?methods' for accessing help and source code

Advanced Topics

In this section, we cover additional functions included in the package that most users likely would not call directly but which provide visibility into the simulation methods utilized as part of the package. These functions can be broadly broken down into the following distinct categories.

Simulation Functions

One of the key items returned by the pfamily functions is an assigned simulation function to be called during the estimation process (one for each pfamily). These functions can also be called directly (though we don't recommend it). The documentation for each of these underlying functions contains information on the type of algorithm used to generate the sample. We refer the reader to that documentation but note the correspondence between pfamilies and simulation functions: dNormal maps to rNormal_reg, dGamma to rGamma_reg, dNormal_Gamma to rNormal_Gamma_reg, and dIndependent_Normal_Gamma to rindependent_norm_gamma_reg.

Normal_ct Functions

For estimation of Bayesian generalized linear models with Normal priors but non-gaussian families, accept-reject procedures are used in which candidates are generated from mixtures of restricted multivariate normal distributions. Two functions for univariate restricted normal distributions (rnorm_ct and pnorm_ct) play a key role during this process and can be viewed as generalizations of the rnorm and pnorm functions.
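
The basic idea behind sampling from a normal distribution restricted to an interval can be illustrated with the inverse-CDF construction below. This is a generic textbook sketch in base R, not the package's rnorm_ct implementation; consult that function's documentation for the actual algorithm and arguments.

```r
## Draw from a normal distribution restricted to [lower, upper] by
## mapping uniform draws on [pnorm(lower), pnorm(upper)] through qnorm
rnorm_restricted <- function(n, mean = 0, sd = 1, lower = -Inf, upper = Inf) {
  p_lo <- pnorm(lower, mean, sd)
  p_hi <- pnorm(upper, mean, sd)
  qnorm(runif(n, p_lo, p_hi), mean, sd)
}

set.seed(1)
draws <- rnorm_restricted(1000, mean = 0, sd = 1, lower = 0, upper = 1.5)
range(draws)   # all draws lie inside [0, 1.5]
```

Note that for bounds far out in the tails this naive construction suffers from floating-point loss of accuracy, which is one reason dedicated implementations of restricted-normal samplers exist.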

Utility Functions

Finally, the package contains a small set of functions that are used by various method functions. While these are currently exported, we may in the future choose not to export them, as they are unlikely to be useful to users of the package as standalone functions.

References

Nygren, K. N., and L. M. Nygren. 2006. "Likelihood Subgradient Densities." Journal of the American Statistical Association 101 (475): 1144–56. https://doi.org/10.1198/016214506000000357.
Nygren, Kjell. 2025a. Chapter A02: Overview of Estimation Procedures. Vignette in the glmbayes R package.
Nygren, Kjell. 2025b. Chapter A08: Overview of Envelope Related Functions. Vignette in the glmbayes R package.