Chapter 12: Large Models: GPU Acceleration using OpenCL

Introduction

This chapter describes how to enable GPU acceleration in glmbayes using OpenCL. GPU acceleration can dramatically reduce computation time for large envelope models, especially when working with high‑dimensional predictors or repeated model fits. OpenCL is used because it is vendor‑neutral and works across NVIDIA, AMD, and Intel hardware. CPU‑only execution remains fully supported, but may be significantly slower for large models.

Installing and Using glmbayes with OpenCL Support

Follow these steps to install and verify glmbayes with OpenCL:


1. Download and install build tools

1.2 Linux: Install compiler toolchain and R development headers

  • Ubuntu/Debian: sudo apt-get install build-essential r-base-dev

    (If you need the latest R, add the CRAN repo first; see “Installing the latest R on Ubuntu/Debian” below.)

  • Fedora: sudo dnf groupinstall "Development Tools" && sudo dnf install R-devel

  • Arch Linux: sudo pacman -S base-devel r

Installing the latest R on Ubuntu/Debian (optional but recommended)

Ubuntu’s default repositories often contain older R versions. To install the current CRAN release, run the commands below. They target Ubuntu 22.04 (jammy); replace jammy with your release codename on other Ubuntu versions.

sudo apt-get install --no-install-recommends dirmngr gnupg ca-certificates software-properties-common
sudo gpg --keyserver keyserver.ubuntu.com --recv-key E298A3A825C0D65DFD57CBB651716619E084DAB9
sudo gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | sudo tee /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
sudo add-apt-repository "deb https://CRAN.R-project.org/bin/linux/ubuntu/ jammy-cran40/"
sudo apt-get update
sudo apt-get install build-essential r-base-dev
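Before moving on, it can help to confirm that the build tools are actually on the PATH. The following is a minimal sketch, not part of glmbayes: check_tools is a hypothetical helper name, and the tool list (gcc, g++, make, R) is an assumption about what a source build of an R package with compiled code typically needs.

```shell
#!/bin/sh
# check_tools TOOL...: report whether each named tool is on PATH.
# Returns the number of missing tools (0 means everything was found).
# Hypothetical helper for illustration; not provided by glmbayes.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "[OK] $tool -> $(command -v "$tool")"
        else
            echo "[MISSING] $tool"
            missing=$((missing + 1))
        fi
    done
    return "$missing"
}

# Assumed tool list for a source build; adjust to your platform.
check_tools gcc g++ make R || echo "Install the missing tools before building from source."
```

A [MISSING] line for R usually means R itself is not installed; the development headers from r-base-dev or R-devel are not covered by this check.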

1.3 macOS: Install Xcode Command Line Tools and GCC

macOS requires both the Xcode Command Line Tools and GCC when installing glmbayes from source. This is because the package’s configure script uses GCC to detect system include and library paths for OpenCL.

Install Xcode Command Line Tools:

xcode-select --install

Install GCC via Homebrew:

brew install gcc

Binary installs (e.g., CRAN or R‑Universe macOS binaries) do not require GCC.

2. Install OpenCL Components

To use GPU acceleration, glmbayes must currently be installed from source, because neither CRAN nor R‑universe builds packages with OpenCL GPU support. Their build systems do not provide OpenCL headers or development libraries, so any precompiled binary from those repositories will have OpenCL disabled. The source build must take place on a system where a complete OpenCL development environment is available.

A full OpenCL development environment includes:

  1. OpenCL header files (needed at compile time)
  2. The OpenCL runtime / ICD loader (needed at run time)
  3. The OpenCL development library providing the unversioned linker symlink libOpenCL.so (needed for linking)

Most GPU drivers provide only the vendor‑specific OpenCL runtime, not the headers or the development symlink. The following subsections describe how to install the required OpenCL components on Windows, Linux, and macOS.
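On Linux, the three components listed above correspond to three files that can be checked directly. The sketch below assumes the include and library directories of a 64-bit Debian/Ubuntu system; check_opencl_dev is a hypothetical helper, not a glmbayes function, and other distributions may use different paths (for example /usr/lib64).

```shell
#!/bin/sh
# check_opencl_dev INCLUDE_DIR LIB_DIR
# Prints one status line per OpenCL component and returns non-zero
# if any of the three files is missing.
# Hypothetical helper for illustration; not provided by glmbayes.
check_opencl_dev() {
    inc_dir=$1
    lib_dir=$2
    status=0
    for item in "headers:$inc_dir/CL/cl.h" \
                "runtime:$lib_dir/libOpenCL.so.1" \
                "dev symlink:$lib_dir/libOpenCL.so"; do
        label=${item%%:*}   # text before the first colon
        path=${item#*:}     # text after the first colon
        if [ -e "$path" ]; then
            echo "[OK] $label: $path"
        else
            echo "[MISSING] $label: $path"
            status=1
        fi
    done
    return "$status"
}

# Assumed locations for 64-bit Debian/Ubuntu; adjust for your distribution.
check_opencl_dev /usr/include /usr/lib/x86_64-linux-gnu \
    || echo "OpenCL development environment is incomplete."
```

If only the dev symlink is reported missing, the runtime is installed but the development package is not; on Debian/Ubuntu that symlink is typically shipped by ocl-icd-opencl-dev.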

2.1 Windows

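Choose one of the following, depending on your hardware:

  • NVIDIA CUDA Toolkit
  • Intel OpenCL SDK
  • Khronos OpenCL headers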

Note: When installing the CUDA Toolkit, Intel OpenCL SDK, or Khronos OpenCL headers,
you can accept all default installation options. The default settings include the
OpenCL header files (such as CL/cl.h) required to compile glmbayes from source.

On Windows, installing the CUDA Toolkit or Intel OpenCL SDK is sufficient; no additional OpenCL runtime or development packages are required.

AMD’s Windows driver package includes the OpenCL runtime and ICD components automatically. No additional installation or PATH configuration is required. If OpenCL is not detected, updating to the latest AMD Software (Adrenalin Edition) typically resolves the issue. The diagnose_glmbayes() function will report whether the AMD OpenCL components are correctly installed.

2.2 Linux

To compile and run glmbayes with OpenCL support on Linux, you must install:

  1. An OpenCL implementation (vendor runtime)
  2. OpenCL header files (compile‑time)
  3. The OpenCL ICD loader (runtime)
  4. The OpenCL development library providing libOpenCL.so (linking)

The correct OpenCL implementation depends on your GPU vendor.
For NVIDIA and Intel, the OpenCL runtime is installed automatically with the GPU driver.
For AMD, you must choose the correct OpenCL stack; see Appendix A.


2.2.1 OpenCL Implementation (Vendor Runtime)

An OpenCL implementation must be present for your GPU vendor:

  • NVIDIA: Installed automatically with the proprietary driver.
  • Intel: Installed automatically with the Intel GPU driver.
  • AMD: Requires installing ROCm OpenCL. See Appendix A for details.

The remaining subsections describe the generic OpenCL development components required to compile glmbayes from source.


2.2.2 OpenCL Header Files (for compilation)

These provide CL/cl.h, CL/cl_platform.h, and related headers.

Install:

  • Ubuntu / Debian

      sudo apt-get install opencl-headers
  • Fedora

      sudo dnf install opencl-headers
  • Arch Linux

      sudo pacman -S opencl-headers

If these headers are missing, glmbayes cannot compile GPU support.
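One way to confirm that the compiler can see the headers, independent of where the package manager placed them, is to run the C preprocessor on a one-line include. This is a minimal sketch: has_header is a hypothetical helper, and it assumes a C compiler is available under the name cc. The glmbayes configure script may detect headers differently.

```shell
#!/bin/sh
# has_header HEADER: succeed if the C preprocessor can locate HEADER
# on the default include path. Returns 2 if no compiler named cc exists.
# Hypothetical helper for illustration; not provided by glmbayes.
has_header() {
    command -v cc >/dev/null 2>&1 || return 2
    printf '#include <%s>\n' "$1" | cc -E -x c - >/dev/null 2>&1
}

if has_header CL/cl.h; then
    echo "CL/cl.h found on the default include path"
else
    echo "CL/cl.h not found (or no C compiler named cc is installed)"
fi
```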


2.2.3 OpenCL Runtime (ICD Loader)

This provides the shared library:

libOpenCL.so.1

which is required for OpenCL to function at runtime.

Install:

  • Ubuntu / Debian

      sudo apt-get install ocl-icd-libopencl1
  • Fedora

      sudo dnf install ocl-icd
  • Arch Linux

      sudo pacman -S opencl-icd-loader

Without the runtime, OpenCL calls (e.g., clGetPlatformIDs()) will fail.
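The ICD loader on its own exposes no devices; it discovers vendor implementations through small .icd files, conventionally stored in /etc/OpenCL/vendors. The sketch below lists those files and the library each one names. list_icds is a hypothetical helper, and the default directory is the conventional location rather than something glmbayes guarantees.

```shell
#!/bin/sh
# list_icds [DIR]: print each *.icd file in DIR with the library it names,
# then a total. Succeeds only if at least one vendor ICD file exists.
# Hypothetical helper for illustration; not provided by glmbayes.
list_icds() {
    dir=${1:-/etc/OpenCL/vendors}
    found=0
    for f in "$dir"/*.icd; do
        [ -e "$f" ] || continue      # the glob matched nothing
        echo "$(basename "$f"): $(head -n 1 "$f")"
        found=$((found + 1))
    done
    echo "vendor ICD files found: $found"
    [ "$found" -gt 0 ]
}

list_icds || echo "No vendor ICD registered; the loader will likely report zero platforms."
```

A count of zero with libOpenCL.so.1 installed usually points to a missing or incomplete GPU driver rather than a missing loader.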


2.3 macOS

macOS includes the OpenCL headers and the system OpenCL runtime as part of the Xcode Command Line Tools, so no additional installation is required for compilation. However, Apple has deprecated OpenCL in favor of Metal, and OpenCL support on Apple Silicon systems is limited. As a result, glmbayes will compile successfully on macOS, but GPU acceleration is not guaranteed and may fall back to CPU execution.

3. Install glmbayes from source via R‑universe

Because CRAN and R‑universe do not build packages with OpenCL support, installing from source is required to enable GPU acceleration.

install.packages(
  "glmbayes",
  repos = c("https://cloud.r-project.org", "https://knygren.r-universe.dev"),
  type = "source"
)

4. Load the package

library(glmbayes)

5. Check for OpenCL availability

On Linux, if you have shell access (including Jupyter Terminal on a cloud instance), running clinfo before opening R is recommended: confirm Number of platforms >= 1 (see §2.2.5). That catches missing vendor ICDs early, independent of the R session.

has_opencl()
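The clinfo pre-check recommended above can be scripted so that it also works in non-interactive setups such as cloud instance start-up scripts. This is a sketch under the assumption that clinfo prints a line beginning with "Number of platforms" followed by the count, which is its usual output format; platform_count is a hypothetical helper name.

```shell
#!/bin/sh
# platform_count: read clinfo output on stdin and print the platform
# count, or 0 if no "Number of platforms" line is present.
# Hypothetical helper for illustration; not provided by glmbayes.
platform_count() {
    awk '/^Number of platforms/ { n = $NF } END { print n + 0 }'
}

if command -v clinfo >/dev/null 2>&1; then
    n=$(clinfo 2>/dev/null | platform_count) || n=0
    echo "OpenCL platforms: $n"
else
    echo "clinfo is not installed; install it with your distribution's package manager"
fi
```

A result of 0 means no vendor ICD was found, and has_opencl() can be expected to return FALSE until that is fixed.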

5.1 If FALSE, run the diagnostic helper

If has_opencl() returns FALSE, run:

diagnose_glmbayes()

This function performs a full OpenCL environment check and reports whether your system is correctly configured for GPU acceleration.

A clean diagnostic on Linux looks like:

=== glmbayes Diagnostic Report ===
Environment: linux

GPU: NVIDIA
  [OK] Driver installed
  [OK] OpenCL headers found (CL/cl.h)
  [OK] OpenCL runtime found (OpenCL.dll / ICD)
  [OK] OpenCL fully available (headers + runtime)
  [OK] Required PATH and library dirs present
  [OK] OpenCL runtime probe succeeded (platform available)

[OK] glmbayes was compiled with OpenCL support.

Explanation of each line:

  • Environment: linux
    Confirms the operating system detected by glmbayes.

  • GPU: NVIDIA
    Identifies the GPU vendor detected by the OpenCL runtime.
    (May show Intel, AMD, or “CPU” depending on the system.)

  • [OK] Driver installed
    The system GPU driver is present and exposes an OpenCL platform.

  • [OK] OpenCL headers found (CL/cl.h)
    The OpenCL development headers required for compilation are installed.

  • [OK] OpenCL runtime found (OpenCL.dll / ICD)
    The OpenCL ICD loader (libOpenCL.so.1 on Linux) is available at runtime.

  • [OK] OpenCL fully available (headers + runtime)
    Confirms that both compile‑time and run‑time OpenCL components are present.

  • [OK] Required PATH and library dirs present
    The library search paths include the directories where OpenCL is installed.

  • [OK] OpenCL runtime probe succeeded (platform available)
    A live OpenCL call (clGetPlatformIDs()) succeeded, meaning the system can enumerate OpenCL devices.

  • [OK] glmbayes was compiled with OpenCL support.
    Confirms that the package was built with OpenCL enabled (i.e., the linker found libOpenCL.so).

If any line shows a warning or error, the diagnostic output will indicate which component is missing and how to correct it.
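For scripted setups, the report can be filtered down to the checks that did not pass. The sketch below rests on two assumptions that the text above does not confirm: that diagnose_glmbayes() writes its report to standard output, and that failing checks carry a bracketed tag other than [OK]. The [FAIL] tag in the sample input is hypothetical, and failed_checks is not a glmbayes function.

```shell
#!/bin/sh
# failed_checks: read a diagnostic report on stdin and print every
# bracket-tagged line whose tag is not [OK]. Exits non-zero if any
# such line was printed.
# Hypothetical helper for illustration; not provided by glmbayes.
failed_checks() {
    awk '/^[ \t]*\[[^]]*\]/ && !/^[ \t]*\[OK\]/ { print; bad = 1 }
         END { exit bad }'
}

# Sample input; the [FAIL] tag is a made-up example of a non-OK line.
failed_checks <<'EOF' || echo "Some checks did not pass."
GPU: NVIDIA
  [OK] Driver installed
  [FAIL] OpenCL headers found (CL/cl.h)
EOF

# Possible usage with a live report (assumes output goes to stdout):
#   Rscript -e 'glmbayes::diagnose_glmbayes()' | failed_checks
```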

6. Run an example to test OpenCL functionality

example(Cleveland)

Cleveland heart data: CPU vs OpenCL (illustrative)

The chunks below are not executed during vignette builds because of their runtime. They mirror a large glmb CPU versus OpenCL comparison; for an interactive GPU check, use example(Cleveland) as shown at the start of this section.

# Cleveland GPU-accelerated example (same pattern as example(Cleveland))
# (Not executed in the vignette build.)

library(glmbayes)

# Load the dataset
data("Cleveland")

# ------------------------------------------------------------------
# OpenCL-accelerated Bayesian logistic regression example
# The OpenCL fit below requires OpenCL, so stop early if it is missing.
# ------------------------------------------------------------------

stopifnot(has_opencl())

# Prior setup for the full model
ps <- Prior_Setup(
  hd ~ age + sex + cp + trestbps + chol +
    fbs + restecg + thalach + exang + oldpeak + slope + ca + thal,
  family = binomial(logit),
  data = Cleveland
)

t_non_opencl <- system.time({
  fit_non_opencl <- glmb(
    hd ~ age + sex + cp + trestbps + chol +
      fbs + restecg + thalach + exang + oldpeak + slope + ca + thal,
    family       = binomial(link = "logit"),
    pfamily      = dNormal(mu = ps$mu, Sigma = ps$Sigma),
    data         = Cleveland,
    n            = 20000,
    Gridtype     = 2,
    use_parallel = TRUE,
    use_opencl   = FALSE,
    verbose      = FALSE
  )
})

t_non_opencl

t_opencl <- system.time({
  fit_opencl <- glmb(
    hd ~ age + sex + cp + trestbps + chol +
      fbs + restecg + thalach + exang + oldpeak + slope + ca + thal,
    family       = binomial(link = "logit"),
    pfamily      = dNormal(mu = ps$mu, Sigma = ps$Sigma),
    data         = Cleveland,
    n            = 20000,
    Gridtype     = 2,
    use_parallel = TRUE,
    use_opencl   = TRUE,
    verbose      = FALSE
  )
})

t_opencl


summary(fit_opencl)