---
title: "Getting Started with clmstan"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting Started with clmstan}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

## Introduction

**clmstan** is an R package for fitting Cumulative Link Models (CLMs) using
Bayesian inference via CmdStan. It provides:
- **11 link functions**: 5 standard + 6 flexible parametric links
- **3 threshold structures**: flexible, equidistant, symmetric
- **Bayesian estimation**: Full posterior inference with MCMC
- **Pre-compiled Stan models**: Fast execution via the instantiate package

### Comparison with ordinal::clm()
| Feature | ordinal::clm() | clmstan |
|---------|----------------|---------|
| Inference | Maximum likelihood | Bayesian (MCMC) |
| Uncertainty | Asymptotic SE | Posterior distribution |
| Link functions | 5 standard | 11 (5 standard + 6 flexible) |
| Estimation of link parameters | No | Yes |

## Installation

Before using clmstan, you need to install CmdStan:

```{r install}
# Step 1: Install cmdstanr
install.packages("cmdstanr",
                 repos = c("https://stan-dev.r-universe.dev",
                           getOption("repos")))

# Step 2: Install CmdStan (only needed once)
library(cmdstanr)
install_cmdstan()

# Step 3: Install clmstan
# devtools::install_github("t-momozaki/clmstan")
```

## Quick Start

Fit a cumulative link model in just a few lines:

```{r quickstart}
library(clmstan)
library(ordinal)
data(wine)

# Fit a model with logit link
fit <- clm_stan(rating ~ temp + contact, data = wine,
                chains = 4, iter = 2000, warmup = 1000)

# View results
print(fit)
summary(fit)
```

The `wine` dataset contains ratings (1-5) of 72 wine samples with two
predictors: temperature (cold/warm) and contact (yes/no).

## For ordinal::clm() Users

If you're familiar with the ordinal package, the transition to clmstan is
straightforward:

### API Correspondence
| ordinal::clm() | clmstan::clm_stan() |
|----------------|---------------------|
| `clm(y ~ x, data)` | `clm_stan(y ~ x, data)` |
| `link = "logit"` | `link = "logit"` |
| `threshold = "flexible"` | `threshold = "flexible"` |
| `summary(fit)` | `summary(fit)` |
| `coef(fit)` | `coef(fit)` |

### Example Comparison

```{r comparison}
library(ordinal)
library(clmstan)
data(wine)

# ordinal::clm (frequentist)
fit_freq <- clm(rating ~ temp + contact, data = wine, link = "logit")

# clmstan::clm_stan (Bayesian)
fit_bayes <- clm_stan(rating ~ temp + contact, data = wine, link = "logit",
                      chains = 4, iter = 2000, warmup = 1000)

# Compare coefficients
coef(fit_freq)
coef(fit_bayes)
```

### Key Differences
1. **Posterior distributions**: clmstan provides full posterior samples,
   enabling Bayesian inference (credible intervals, posterior predictive checks)
2. **Reproducibility**: Set `set.seed()` before `clm_stan()` for reproducible
   results
3. **Computation time**: MCMC sampling is slower than MLE, but provides richer
   inference

## Link Functions

clmstan supports 11 link functions, more than any other CLM package.

### Standard Links (5)
```{r standard-links}
# View available link functions
supported_links("standard")

# Logit (default) - proportional odds model
fit_logit <- clm_stan(rating ~ temp, data = wine, link = "logit",
                      chains = 2, iter = 1000, warmup = 500)

# Probit - normal distribution
fit_probit <- clm_stan(rating ~ temp, data = wine, link = "probit",
                       chains = 2, iter = 1000, warmup = 500)

# Complementary log-log - asymmetric (right-skewed)
fit_cloglog <- clm_stan(rating ~ temp, data = wine, link = "cloglog",
                        chains = 2, iter = 1000, warmup = 500)

# Log-log - asymmetric (left-skewed)
fit_loglog <- clm_stan(rating ~ temp, data = wine, link = "loglog",
                       chains = 2, iter = 1000, warmup = 500)

# Cauchit - heavy tails
fit_cauchit <- clm_stan(rating ~ temp, data = wine, link = "cauchit",
                        chains = 2, iter = 1000, warmup = 500)
```

### Flexible Links (6)
Flexible links have parameters that can be fixed or estimated. For theoretical
background, see Wang & Dey (2011) for GEV, Jiang & Dey (2015) for SP, and
Naranjo et al. (2015) for AEP.

```{r flexible-links}
# View flexible link functions
supported_links("flexible")

# t-link with fixed df (heavier tails than logit)
fit_t8 <- clm_stan(rating ~ temp, data = wine, link = "tlink",
                   link_param = list(df = 8),
                   chains = 2, iter = 1000, warmup = 500)

# t-link with estimated df (data-driven tail behavior)
fit_t_est <- clm_stan(rating ~ temp, data = wine, link = "tlink",
                      link_param = list(df = "estimate"),
                      chains = 2, iter = 1000, warmup = 500)

# GEV link with estimated shape parameter
fit_gev <- clm_stan(rating ~ temp, data = wine, link = "gev",
                    link_param = list(xi = "estimate"),
                    chains = 2, iter = 1000, warmup = 500)

# Aranda-Ordaz link (asymmetric)
fit_ao <- clm_stan(rating ~ temp, data = wine, link = "aranda_ordaz",
                   link_param = list(lambda = "estimate"),
                   chains = 2, iter = 1000, warmup = 500)

# Symmetric Power link (skewness adjustment)
fit_sp <- clm_stan(rating ~ temp, data = wine, link = "sp",
                   base = "logit",
                   link_param = list(r = "estimate"),
                   chains = 2, iter = 1000, warmup = 500)

# Log-gamma link
fit_lg <- clm_stan(rating ~ temp, data = wine, link = "log_gamma",
                   link_param = list(lambda = "estimate"),
                   chains = 2, iter = 1000, warmup = 500)

# AEP link (asymmetric exponential power)
fit_aep <- clm_stan(rating ~ temp, data = wine, link = "aep",
                    link_param = list(theta1 = "estimate", theta2 = "estimate"),
                    chains = 2, iter = 1000, warmup = 500)
```

### Link Function Summary

| Link | Parameters | Special Cases |
|------|------------|---------------|
| tlink | df > 0 | df = Inf -> probit |
| aranda_ordaz | lambda > 0 | lambda = 1 -> logit |
| gev | xi (real) | xi = 0 -> loglog |
| sp | r > 0, base | r = 1 -> base distribution |
| log_gamma | lambda (real) | lambda = 0 -> probit |
| aep | theta1, theta2 > 0 | Both = 2 -> similar to probit |

## Threshold Structures

clmstan supports three threshold parameterizations:

```{r thresholds}
# View available threshold structures
supported_thresholds()
```

### Flexible (Default)
Each threshold is freely estimated (K-1 parameters for K categories):

```{r threshold-flexible}
fit_flex <- clm_stan(rating ~ temp, data = wine, threshold = "flexible",
                     chains = 2, iter = 1000, warmup = 500)
```

### Equidistant
Thresholds are equally spaced (2 parameters: start + interval):

```{r threshold-equidistant}
# Useful for Likert scales with assumed equal intervals
fit_equi <- clm_stan(rating ~ temp, data = wine, threshold = "equidistant",
                     chains = 2, iter = 1000, warmup = 500)
```

### Symmetric
Thresholds are symmetric around zero (useful for scales with a neutral center):

```{r threshold-symmetric}
fit_sym <- clm_stan(rating ~ temp, data = wine, threshold = "symmetric",
                    chains = 2, iter = 1000, warmup = 500)
```

## Prior Specification

clmstan uses weakly informative default priors, but you can customize them.

### Default Priors

| Parameter | Default Prior |
|-----------|---------------|
| Coefficients (beta) | normal(0, 2.5) |
| Thresholds (c) | normal(0, 10) |
| t-link df | gamma(2, 0.1) |
| GEV xi | normal(0, 2) |
| SP r | gamma(0.5, 0.5) |
| AEP theta1, theta2 | gamma(2, 1) |

### Using prior() Function (Recommended)
The `prior()` function provides a brms-like interface:

```{r priors-modern}
# Single prior
fit <- clm_stan(rating ~ temp, data = wine,
                prior = prior(normal(0, 1), class = "b"),
                chains = 2, iter = 1000, warmup = 500)

# Multiple priors
fit <- clm_stan(rating ~ temp, data = wine,
                prior = c(
                  prior(normal(0, 1), class = "b"),
                  prior(normal(0, 5), class = "Intercept")
                ),
                chains = 2, iter = 1000, warmup = 500)

# Prior for link parameters
fit_gev <- clm_stan(rating ~ temp, data = wine, link = "gev",
                    link_param = list(xi = "estimate"),
                    prior = prior(normal(0, 0.5), class = "xi"),
                    chains = 2, iter = 1000, warmup = 500)
```

### Available Distributions

```{r dist-helpers}
normal(mu, sigma)      # Normal distribution
gamma(alpha, beta)     # Gamma distribution (shape, rate)
student_t(df, mu, sigma)  # Student-t distribution
cauchy(mu, sigma)      # Cauchy distribution
flat()                 # Flat (improper) prior
```

### Prior Classes

| Class | Description | Compatible Distributions |
|-------|-------------|--------------------------|
| `"b"` | Regression coefficients | normal, student_t, cauchy, flat |
| `"Intercept"` | Thresholds (flexible) | normal, student_t, cauchy, flat |
| `"c1"` | First threshold (equidistant) | normal, student_t, cauchy, flat |
| `"d"` | Interval (equidistant) | gamma |
| `"cpos"` | Positive thresholds (symmetric) | normal, student_t, cauchy, flat |
| `"df"` | t-link degrees of freedom | gamma |
| `"xi"` | GEV shape parameter | normal, student_t, cauchy |
| `"r"` | SP skewness parameter | gamma |
| `"theta1"`, `"theta2"` | AEP shape parameters | gamma |

### Legacy API (clm_prior)
For backward compatibility:

```{r priors-legacy}
fit <- clm_stan(rating ~ temp, data = wine,
                prior = clm_prior(beta_sd = 1, c_sd = 5),
                chains = 2, iter = 1000, warmup = 500)
```

## Interpreting Results

### Basic Methods

```{r results-basic}
fit <- clm_stan(rating ~ temp + contact, data = wine,
                chains = 4, iter = 2000, warmup = 1000)

# Quick overview
print(fit)

# Detailed summary with credible intervals
summary(fit)
summary(fit, probs = c(0.025, 0.5, 0.975))

# Extract point estimates
coef(fit)                    # Posterior mean (default)
coef(fit, type = "median")   # Posterior median
```

### Diagnostic Plots

```{r plots}
# Trace plots (check mixing)
plot(fit, type = "trace")

# Posterior density
plot(fit, type = "dens")

# Posterior intervals
plot(fit, type = "intervals")

# Autocorrelation (check for high autocorrelation)
plot(fit, type = "acf")

# Select specific parameters
plot(fit, type = "trace", pars = c("beta[1]", "beta[2]"))
```

### Convergence Diagnostics

```{r diagnostics}
# Quick summary
diagnostics(fit)

# Detailed output
diagnostics(fit, detail = TRUE)
```

Key diagnostics to check:
- **Rhat**: Should be < 1.01 (values > 1.01 indicate poor convergence)
- **ESS (bulk/tail)**: Should be > 400 (low values suggest inefficient sampling)
- **Divergences**: Should be 0 (divergences indicate sampling problems)

## Prediction

### Category Prediction

```{r predict-class}
fit <- clm_stan(rating ~ temp + contact, data = wine,
                chains = 4, iter = 2000, warmup = 1000)

# Predict most likely category
pred_class <- predict(fit, type = "class")
head(pred_class)

# Prediction for new data
newdata <- data.frame(
  temp = factor("warm", levels = levels(wine$temp)),
  contact = factor("yes", levels = levels(wine$contact))
)
predict(fit, newdata = newdata, type = "class")
```

### Probability Prediction

```{r predict-probs}
# Predicted probabilities for each category
pred_probs <- predict(fit, type = "probs")
head(pred_probs)

# Alternative: fitted values
fitted_vals <- fitted(fit)
head(fitted_vals)
```

### Posterior Predictive Distribution

```{r posterior-predict}
# Draw from posterior predictive distribution
y_rep <- posterior_predict(fit)
dim(y_rep)  # draws x observations

# Uncertainty in predictions
pred_draws <- predict(fit, type = "class", summary = FALSE)
```

## Model Comparison

Use LOO-CV (Leave-One-Out Cross-Validation) for model comparison:

```{r model-comparison}
# Fit competing models
fit1 <- clm_stan(rating ~ temp, data = wine, link = "logit",
                 chains = 4, iter = 2000, warmup = 1000)
fit2 <- clm_stan(rating ~ temp + contact, data = wine, link = "logit",
                 chains = 4, iter = 2000, warmup = 1000)
fit3 <- clm_stan(rating ~ temp + contact, data = wine, link = "probit",
                 chains = 4, iter = 2000, warmup = 1000)

# Compute LOO-CV
loo1 <- loo(fit1)
loo2 <- loo(fit2)
loo3 <- loo(fit3)

# View results
print(loo1)

# Pareto k diagnostic plot
plot(loo1)

# Compare models
loo::loo_compare(loo1, loo2, loo3)

# WAIC is also available
waic(fit1)
```

**Interpreting LOO results:**
- Lower `elpd_loo` (expected log pointwise predictive density) indicates better
  fit
- Pareto k values > 0.7 suggest unreliable LOO estimates for those observations

## Practical Workflow

Here's a recommended workflow for ordinal data analysis:

```{r workflow}
library(clmstan)
library(ordinal)
data(wine)

# Step 1: Fit baseline model
set.seed(123)
fit_base <- clm_stan(rating ~ temp + contact, data = wine,
                     link = "logit", threshold = "flexible",
                     chains = 4, iter = 2000, warmup = 1000)

# Step 2: Check convergence
diagnostics(fit_base)
plot(fit_base, type = "trace")

# Step 3: Review results
summary(fit_base)

# Step 4: Try alternative link functions
fit_probit <- clm_stan(rating ~ temp + contact, data = wine,
                       link = "probit",
                       chains = 4, iter = 2000, warmup = 1000)

fit_gev <- clm_stan(rating ~ temp + contact, data = wine,
                    link = "gev", link_param = list(xi = "estimate"),
                    chains = 4, iter = 2000, warmup = 1000)

# Step 5: Compare models
loo_base <- loo(fit_base)
loo_probit <- loo(fit_probit)
loo_gev <- loo(fit_gev)
loo::loo_compare(loo_base, loo_probit, loo_gev)

# Step 6: Final model interpretation
summary(fit_base)
plot(fit_base, type = "intervals")
```

## Troubleshooting

### CmdStan Not Found

```{r troubleshoot-cmdstan}
# Check CmdStan installation
cmdstanr::cmdstan_path()
cmdstanr::cmdstan_version()

# If not found, install CmdStan:
cmdstanr::install_cmdstan()
```

### Convergence Issues

If you see Rhat > 1.01 or low ESS:

```{r troubleshoot-convergence}
# Increase iterations and warmup
fit <- clm_stan(rating ~ temp, data = wine,
                chains = 4,
                iter = 4000,
                warmup = 2000)

# Increase adapt_delta (helps with divergences)
fit <- clm_stan(rating ~ temp, data = wine,
                chains = 4,
                iter = 2000,
                warmup = 1000,
                adapt_delta = 0.95)

# Increase max_treedepth
fit <- clm_stan(rating ~ temp, data = wine,
                chains = 4,
                iter = 2000,
                warmup = 1000,
                max_treedepth = 12)
```

### Divergent Transitions

Divergent transitions indicate that the sampler had difficulty exploring the
posterior:

1. **Increase `adapt_delta`**: Try 0.95, 0.99, or even 0.999
2. **Reparameterize**: Try different threshold structures
3. **Simplify priors**: Use more informative priors
4. **Check data**: Look for separation or outliers

```{r troubleshoot-divergence}
# High adapt_delta
fit <- clm_stan(rating ~ temp, data = wine,
                chains = 4,
                iter = 2000,
                warmup = 1000,
                adapt_delta = 0.99)
```

### Memory Issues

For large datasets:

```{r troubleshoot-memory}
# Reduce number of chains
fit <- clm_stan(rating ~ temp, data = wine,
                chains = 2,
                iter = 2000,
                warmup = 1000)

# Run chains in parallel (default)
fit <- clm_stan(rating ~ temp, data = wine,
                chains = 4,
                parallel_chains = 4)
```

## References and Resources

### Package Resources
- [clmstan GitHub Repository](https://github.com/t-momozaki/clmstan)
- [Stan User's Guide](https://mc-stan.org/users/documentation/)
- [cmdstanr Documentation](https://mc-stan.org/cmdstanr/)

### Related Packages
- [ordinal](https://cran.r-project.org/package=ordinal): Frequentist CLMs
- [brms](https://paulbuerkner.com/brms/): Bayesian regression (rstan-based)

### Statistical References
- Agresti, A. (2010). *Analysis of Ordinal Categorical Data* (2nd ed.)
- Christensen, R.H.B. (2019). *ordinal: Regression Models for Ordinal Data*
- Jiang, X., & Dey, D.K. (2015). Symmetric power link with ordinal response
  model. In S.K. Upadhyay et al. (Eds.), *Current trends in Bayesian
  methodology with applications* (pp. 335-346). CRC Press.
- Naranjo, L., Perez, C.J., & Martin, J. (2015). Bayesian analysis of some
  models that use the asymmetric exponential power distribution. *Statistics
  and Computing*, 25(3), 497-514.
- Wang, X., & Dey, D.K. (2011). Generalized extreme value regression for
  ordinal response data. *Environmental and Ecological Statistics*, 18(4),
  619-634.
