This vignette describes the computational architecture used by the
glmbayes package to generate independent and identically distributed
(iid) samples from Bayesian generalized linear models (GLMs). The
central idea is that posterior simulation is handled by a small set of
low-level simulation functions (“simfunctions”) that implement either
conjugate sampling formulas or envelope-based accept-reject methods.
These simfunctions are called by higher-level modeling functions such as
glmb() and lmb(), which provide formula
interfaces, model frames, and S3 methods.
The simfunctions implement both classical conjugate Bayesian updating and the likelihood-subgradient envelope methods of (Nygren and Nygren 2006). Companion vignettes document coefficient envelopes (Nygren 2025a), Gamma-family dispersion envelopes (Nygren 2025b), and joint envelopes for Gaussian regression with independent Normal–Gamma priors (Nygren 2025c).
This chapter begins by describing the role of rglmb(),
rlmb()and the pfamily system, which together
determine which simfunction is used for a given model. Later sections
describe the simfunctions themselves, the mapping between likelihood
families and prior families, and the C++ routines used for envelope
construction and sampling.
The functions rglmb() and rlmb()are the
minimalistic engines for Bayesian GLM simulation. They are called
internally by glmb() and lmb() respectively,
which provides formula-based interfaces similar to glm()
and lm() respectively. In contrast, rglmb()
and rlmb()works directly with numeric inputs and avoids the
overhead of model frames, formulas, and method dispatches. Their main
purpose is to generate iid posterior draws given:
yxgaussian(),
poisson(), binomial())pfamily)Unlike glmb()and
lmb(),rglmb()andrlmb()` do not compute fitted
values, residuals, or diagnostics. They simply dispatche to the correct
simfunctions and return the resulting posterior draws. The lower
overhead associated with these functions make them a good choice when
calling the estimation procedures repeatedly (say as part of Gibbs
sampling procedures).
The functions glmb() and lmb() are the
high-level modeling interface. They:
Once the model frames are constructed, glmb() and
lmb()call rglmb() and
rlmb()respectively to generate the posterior samples.
Thus:
glmb() –> rglmb() –> simfunction –> posterior draws
and
lmb() –> rlmb() –> simfunction –> posterior draws
respectively.
This separation mirrors the classical distinction between
glm()/ lm()and the underlying IRLS
routines.
In classical GLMs, the family object determines the
likelihood and link. In glmbayes, the pfamily object plays
the same role for priors. Each pfamily object:
pfamily$pfamily)pfamily$prior_list)pfamily$okfamilies)pfamily$plinks)pfamily$simfun)This design mirrors the structure of glm():
family determines the likelihood, and pfamily
determines the prior and the sampling method. The combination
(family, pfamily) uniquely determines which simfunction is
used.
To utilize pfamilies when calling the higher level
glmb() and lmb() or the lower level
rglmb() and rlmb() functions, users must
provide the hyperparameters needed to populate the
pfamily$prior_list. These hyperparameters vary across
pfamilies largely based on whether the prior specified sets a prior for
regression coefficients, dispersion parameters, or both jointly.
The pfamilies currently implemented in glmbayes differ primarily in whether they place priors on regression coefficients, dispersion parameters, or both. The table below summarizes the hyperparameters associated with each pfamily.
| pfamily name Prior components in prior_list |
dNormal mu: prior mean vector for coefficients Sigma: prior covariance matrix dispersion: fixed dispersion (if applicable)
dGamma shape: shape parameter for Gamma prior on precision rate: rate parameter for Gamma prior on precision beta: regression coefficients (used when updating dispersion separately) max_disp_perc: percentile used to truncate the posterior dispersion distribution disp_lower, disp_upper: optional user-specified truncation bounds
dNormal_Gamma mu: prior mean vector for coefficients Sigma_0: prior
covariance on precision-weighted scale shape: shape parameter for Gamma
prior on precision rate: rate parameter for Gamma prior on precision
(prior_list still uses component name
Sigma.)
dIndependent_Normal_Gamma mu: prior mean vector for coefficients Sigma: prior covariance matrix shape: shape parameter for Gamma prior on precision rate: rate parameter for Gamma prior on precision max_disp_perc: percentile used to truncate the posterior dispersion distribution disp_lower, disp_upper: optional user-specified truncation bounds
These pfamilies determine which simfunction is used. For example:
Because each pfamily embeds its own simulation function,
rglmb() and rlmb() do not need to contain any
special-case logic. Once the user supplies a pfamily object, the correct
simfunction is already encoded inside it. This design mirrors the
classical GLM architecture, where the family object determines the
likelihood and link, but extends it to the Bayesian setting by allowing
the prior to determine the sampling algorithm.
As setting the prior parameters required for each pfamily can be a
bit involved, the package provides the Prior_Setup()
function which be used to extract default prior settings for these
parameters while also allowing tailoring of the prior specification.
Some of our earlier vignettes already covered this in details so it
won’t be discussed further here.
All simfunctions in simfunctions.R share the same
signature:
simfun(n, y, x, prior_list, offset = NULL, weights = 1, family = gaussian(), Gridtype = 2, n_envopt = NULL, use_parallel = TRUE, use_opencl = FALSE, verbose = FALSE)
This uniform interface allows rglmb() and
rlmb() to dispatch to any simfunction without special-case
logic. It also allows users to change priors or likelihoods without
altering their workflow, and it makes the simfunctions suitable for use
inside Gibbs samplers or other custom algorithms.
The internal logic of rglmb() is:
family and a prior
pfamily.Prior_Setup() constructs a prior_list
containing the hyperparameters.rglmb() extracts from the pfamily object:
okfamilies)plinks)prior_list)simfun)The simfunction performs the actual sampling, either by conjugate formulas or by envelope-based accept-reject methods.
The pfamily system dispatches to one of four simfunctions:
rNormal_regrNormalGamma_regrindepNormalGamma_regrGamma_regThese functions implement the sampling algorithms described in Appendices A2, A5, A6, and A7. The next section provides tables showing how likelihood families and pfamilies map to these simfunctions.
simfunction() GenericThe glmbayes package provides an S3 interface for inspecting which
low-level simulation function (“simfunction”) was used during a call to
rglmb() or rlmb(). This interface is accessed
through the exported generic:
simfunction(object, ...)
When applied to an rglmb or rlmb result,
this function returns an object of class "simfunction"
containing:
rNormal_reg)prior_list
and family objectsThis provides a transparent view of the internal mechanics of the simulation step, which is useful for debugging, reproducibility, and understanding how pfamilies dispatch to specific simfunctions.
For example:
fit <- rglmb(n = 10, y = y, x = X,
family = gaussian(),
pfamily = dNormal(mu, Sigma))
simfunction(fit)
prints the simfunction name, the call, and all arguments used during sampling.
The print.simfunction() method formats this information
in a readable way, including a recursive display of the prior
hyperparameters stored in prior_list. This interface is
intentionally lightweight: it does not re-run the simfunction or modify
the model object. It simply reports the simulation metadata recorded
during the original call.
The combination of glmb()/lmb(),
rglmb()/rlmb(), the pfamily system, and the
simfunctions forms a flexible and extensible framework for Bayesian GLM
simulation. The high-level modeling functions handle formula parsing and
model construction, while the low-level simfunctions implement efficient
posterior sampling using either conjugate formulas or envelope-based
methods.
The next section provides detailed mapping tables showing how families and pfamilies determine the appropriate simfunction and sampling method.
The table below summarizes how likelihood families, link functions,
and pfamilies map to the simulation functions in
simfunctions.R. The tables also indicates whether the
sampling is conjugate or envelope-based.
| Likelihood Family | Link | Pfamily | Simfunction | Method | |
| Gaussian | identity | dNormal | rNormal_reg | Conjugate MVN [C++] | |
| Gaussian | identity | dGamma | rGamma_reg | Conjugate Gamma [R] | |
| Gaussian | identity | dNormal_Gamma | rNormalGamma_reg | Conjugate Normal-Gamma [R] | |
| Gaussian | identity | dIndependent_Normal_Gamma | rindepNormalGamma_reg | Envelope-based joint sampler [C++] | |
| Poisson | log | dNormal | rNormal_reg | Envelope-based [C++] | |
| Quasi-Poisson | log | dNormal | rNormal_reg | Envelope-based [C++] | |
| Binomial | logit, probit, cloglog | dNormal | rNormal_reg | Envelope-based [C++] | |
| Quasi-Binomial | logit, probit, cloglog | dNormal | rNormal_reg | Envelope-based [C++] | |
| Gamma | log | dNormal | rNormal_reg | Envelope-based [C++] | |
| Gamma | log | dGamma | rGamma_reg | Envelope-based [C++] | |
The table below lists the simulation functions that call C++ code, the likelihood families for which the C++ path is used, and the specific C++ source files and function names invoked. Only the primary simulation routines are shown here; envelope construction routines are documented separately in Appendices A5, A6, and A7.
| Simfunction | Family | C++ File | C++ Function(s) Called | |
| rNormal_reg | Gaussian | rNormalReg.cpp | rNormalReg_cpp_export | |
| rNormal_reg | All non-Gaussian | rNormalGLM.cpp | rNormalGLM_cpp_export | |
| rindepNormalGamma_reg | Gaussian | rIndepNormalGammaReg.cpp | rIndepNormalGammaReg_std_cpp_export | | rIndepNormalGammaReg_std_parallel_cpp_export | ||
| rGamma_reg | Gaussian | (none) | (uses R-level rgamma or ctrgamma) | |
| rGamma_reg | Gamma | (none) | (uses R-level ctrgamma) | |
Notes: - rNormal_reg uses different C++ backends
depending on whether the likelihood is Gaussian or non-Gaussian. -
rindepNormalGamma_reg always uses C++ for the joint
sampler. - rGamma_reg does not call a dedicated C++
simulation routine, but it does use the R-level ctrgamma
function for truncated Gamma proposals. - Envelope construction routines
(EnvelopeBuild, EnvelopeDispersionBuild_cpp,
etc.) are documented separately and are not included in this table.
Several combinations of likelihood family and prior lead to closed-form posterior distributions:
In these cases, the simfunctions use direct sampling from multivariate normal or gamma distributions. No envelope construction is required.
When the posterior is not available in closed form, the simfunctions use the likelihood subgradient approach of Nygren and Nygren (2006). These cases include:
The envelope-based samplers rely on:
These methods guarantee iid sampling even in high dimensions and for non-conjugate models.
The simulation functions are grounded in the following theoretical results:
(Nygren and Nygren 2006): likelihood-subgradient densities and bounds on acceptance rates.
(Nygren 2025a):
implementation-oriented companion (this series, Chapter A05) for
likelihood-subgradient samplers and the .rNormalGLM_cpp
pipeline.
(Nygren 2025b): accept–reject
sampling for dispersion in Gamma regression
(rGamma_reg).
(Nygren 2025c): joint sampling for
Gaussian regression with independent Normal–Gamma priors
(rindepNormalGamma_reg).
Together these provide the mathematical and implementation justification for the envelope-based methods in the simfunctions.
As noted above, the function signatures for the simulation functions are the same. However, the functions differ in terms of the specific familes/links supported and in terms of the set of prior parameters required as part of the prior_list argument. Here we provide a summary view of the families/links supported by each function and then a more detailed overview of each function.
| Simfunction Supported Families and Links |
rNormal_reg gaussian(identity) poisson(log) quasipoisson(log) binomial(logit, probit, cloglog) quasibinomial(logit, probit, cloglog) Gamma(log)
rNormalGamma_reg gaussian(identity)
rindepNormalGamma_reg gaussian(identity)
rGamma_reg gaussian(identity) Gamma(log)
Purpose:
Generates iid samples of regression coefficients beta for GLMs with
Normal priors.
Function signature: rNormal_reg(n, y, x, prior_list, offset = NULL, weights = 1, family = gaussian(), Gridtype = 2, n_envopt = NULL, use_parallel = TRUE, use_opencl = FALSE, verbose = FALSE)
Required prior_list components (from dNormal): - mu: prior mean vector for coefficients - Sigma: prior covariance matrix - dispersion: fixed dispersion (if applicable)
Conjugate case:
Gaussian likelihood with Normal prior.
Sampling uses direct multivariate normal draws via
.rNormalReg_cpp.
Envelope-based case:
All non-Gaussian GLMs.
Uses .rNormalGLM_cpp and EnvelopeBuild to
construct a likelihood-subgradient envelope.
Implementation sketch:
1. Compute posterior mode using optim (non-Gaussian) or
lm.fit (Gaussian).
2. Standardize model using glmb_Standardize_Model.
3. Build envelope using EnvelopeBuild.
4. Generate iid samples using accept-reject sampling.
References:
(Nygren and Nygren 2006); (Nygren 2025a).
Purpose:
Conjugate Normal-Gamma regression for Gaussian likelihoods.
Function signature: rNormalGamma_reg(n, y, x, prior_list, offset = NULL, weights = 1, family = gaussian(), Gridtype = 2, n_envopt = NULL, use_parallel = TRUE, use_opencl = FALSE, verbose = FALSE)
Required prior_list components (from dNormal_Gamma): - mu: prior mean vector for coefficients - Sigma: prior covariance matrix - shape: shape parameter for Gamma prior on precision - rate: rate parameter for Gamma prior on precision
Method:
Closed-form posterior for (beta, tau).
Draw tau from Gamma, then draw beta from conditional Normal.
Implementation sketch:
1. Fit weighted least squares to obtain Btilde, IR, and S.
2. Compute posterior Gamma parameters for tau.
3. Draw tau and then beta = Btilde + IR %% rnorm
sqrt(dispersion).
References:
Conjugate Normal–Gamma regression (Raiffa and
Schlaifer 1961; O’Hagan and Forster 2004); see also (Lindley and Smith 1972).
Purpose:
Joint sampling of (beta, phi) for Gaussian regression with independent
Normal-Gamma priors.
Function signature: rindepNormalGamma_reg(n, y, x, prior_list, offset = NULL, weights = 1, family = gaussian(), Gridtype = 2, n_envopt = NULL, use_parallel = TRUE, use_opencl = FALSE, verbose = FALSE)
Required prior_list components (from dIndependent_Normal_Gamma): - mu: prior mean vector for coefficients - Sigma: prior covariance matrix - shape: shape parameter for Gamma prior on precision - rate: rate parameter for Gamma prior on precision - max_disp_perc: percentile used to truncate dispersion support - disp_lower, disp_upper: optional user-specified truncation bounds
Method:
Envelope-based joint sampler using a shared envelope over beta and
dispersion bounds.
Backend:
- .rIndepNormalGammaReg_std_cpp
- EnvelopeDispersionBuild_cpp
- EnvelopeBuild
Implementation sketch:
1. Iteratively anchor dispersion using repeated Gaussian fits.
2. Standardize model and compute posterior mode.
3. Build coefficient envelope (EnvelopeBuild).
4. Build dispersion envelope (EnvelopeDispersionBuild_cpp).
5. Sample jointly from the standardized model.
6. Transform back to original scale.
References:
(Nygren 2025c); (Griffin and Brown 2010).
Purpose:
Simulates dispersion parameters for Gaussian and Gamma families.
Function signature: rGamma_reg(n, y, x, prior_list, offset = NULL, weights = 1, family = gaussian(), Gridtype = 2, n_envopt = NULL, use_parallel = TRUE, use_opencl = FALSE, verbose = FALSE)
Required prior_list components (from dGamma): - shape: shape parameter for Gamma prior on precision - rate: rate parameter for Gamma prior on precision - beta: regression coefficients (used when updating dispersion separately) - max_disp_perc: percentile used to truncate dispersion support - disp_lower, disp_upper: optional user-specified truncation bounds
Gaussian case:
Conjugate Gamma posterior for precision tau.
Uses direct sampling or truncated sampling via
ctrgamma.
Gamma family case:
Non-conjugate.
Uses the A6 envelope-based sampler with a tangent envelope for the
precision.
Backend:
- ctrgamma
- A6 routines
- log-space utilities logdiffexp
Implementation sketch:
1. Compute posterior mode for precision.
2. Build Gamma surrogate and tangency point.
3. Construct bounding function log_h.
4. Accept-reject sampling for precision.
5. Transform to dispersion.
References:
(Nygren 2025b).
The simfunctions are designed to be lightweight and fast. They avoid the overhead of:
This makes them ideal for:
They return only the essential components needed for further computation: coefficients, dispersion draws, posterior modes, envelopes, and iteration counts.
The core simulation routines are implemented in:
simfunctions.RThis file contains the four primary simfunctions
(rNormal_reg, rNormalGamma_reg,
rindepNormalGamma_reg, rGamma_reg) and the
R-level logic that interfaces with the C/C++ backends.
Several simfunctions rely on optimized C/C++ routines for envelope construction, posterior mode evaluation, and sampling. These routines are located in the following source files:
Coefficient samplers (Normal priors): -
rNormalReg.cpp
Backend for Gaussian–Normal conjugate sampling. -
rNormalGLM.cpp
Backend for non-Gaussian Normal-prior envelope sampling.
Joint Normal–Gamma samplers: -
rIndepNormalGammaReg.cpp
Standard joint sampler for independent Normal–Gamma priors. -
rIndepNormalGammaReg.cpp
Parallelized version of the joint sampler.
Envelope construction routines: -
EnvelopeBuild_c.cpp
Coefficient envelopes (Appendix A5). -
EnvelopeDispersionBuild_cpp.cpp
Dispersion envelopes for Gamma-family models (Appendix A7).
Model standardization: -
glmb_Standardize_Model.R
R-level preprocessing used by all envelope-based samplers.
The simfunctions also rely on several numerically robust utility routines:
ctrgammarGamma_reg.logdiffexpp_inv_gamma, q_inv_gamma,
r_invgammaThese utilities support the envelope-based samplers, truncated distributions, and dispersion updates used throughout the package.
The simulation functions described in this chapter form the computational backbone of the glmbayes package. They implement both classical conjugate sampling and modern envelope-based iid sampling. They are the recommended interface for custom samplers, Gibbs sampling, and research applications requiring direct access to the underlying algorithms.
For further details, see the JASA paper and Appendices A5, A6, and A7.