---
title: "Optimization Path Example: Portfolio Selection via Variance Minimization"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Optimization Path Example: Portfolio Selection via Variance Minimization}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>"
)
```

## Overview

ROOT is a **general functional optimization framework**: by supplying a custom
objective function, it can be applied to any problem where the goal is to learn
an interpretable binary weight function $w(X) \in \{0, 1\}$ over a set of
groups by covariates of interest.

In general optimization mode (`generalizability_path = FALSE`), the user
provides a `data.frame` with an optional column `vsq` (a per-unit variance
proxy, or the outcome to minimize) and any covariates to split on. ROOT 
searches over tree-structured weight functions to minimize the supplied 
objective, then returns a Rashomon set of near-optimal trees and a single 
summary tree characterizing the final weight assignments. The decision to vote
on a single summary tree can be by default majority vote, or the user can 
specify their own voting functions.

This vignette demonstrates ROOT in general optimization mode using a
**portfolio selection** problem: given a universe of 100 simulated assets
characterized by their market beta and annualized volatility, ROOT learns
an interpretable rule for which assets to include ($w = 1$) in order to
minimize portfolio return variance.

---

## Problem Setup

### Why portfolio selection?

Constructing a minimum-variance portfolio could be framed as an optimization problem.
The standard approach could produce weights that are continuous and can be hard 
to interpret or communicate. ROOT offers a complementary perspective: it returns a 
**binary inclusion rule** - a sparse decision tree that describes, in plain language, 
which types of assets belong in the low-variance portfolio.

### Mapping to ROOT's framework

The default global objective function ROOT minimizes is as follows: $L(w, D) = \sqrt{\frac{\sum_i w_i \cdot v_i^2}{\left(\sum_i w_i\right)^2}}$

where $v_i^2$ (`vsq`) is a pseudo outcome value, known as the per-unit variance 
proxy in the optimization example. In the portfolio context, we set `vsq` to the 
historical return variance of each asset. ROOT then finds the binary weight 
$w(X) \in \{0, 1\}$, described as a decision tree over asset features such as beta 
and volatility that minimizes this quantity. Users can also supply the global 
objective function they wish to minimize as well.

---

## Simulating the Data

We simulate 100 assets with two features: annualized volatility and market
beta. Returns are generated as a market factor model, and per-asset return
variance is computed from 1,000 simulated return observations.

```{r simulate}
library(ROOT)
set.seed(123)

n_assets <- 100

# Asset features
volatility <- runif(n_assets, 0.05, 0.40)  # annualised volatility
beta       <- runif(n_assets, 0.5,  1.8)   # market beta
sector     <- sample(c("Tech", "Finance", "Energy", "Health"),
                     n_assets, replace = TRUE)

# Simulate returns: r_i = beta_i * r_market + epsilon_i
market      <- rnorm(1000, 0.0005, 0.01)
returns_mat <- sapply(seq_len(n_assets), function(i)
  beta[i] * market + rnorm(1000, 0, volatility[i] / sqrt(252))
)

# Per-asset return variance (the objective proxy ROOT will minimize)
vsq <- apply(returns_mat, 2, var)

dat_portfolio <- data.frame(
  vsq    = vsq,
  vol    = volatility,
  beta   = beta,
  sector = as.integer(factor(sector))
)

head(dat_portfolio)
```

The `vsq` column is recognized by ROOT's default objective function as the
per-unit variance proxy. The columns `vol`, `beta`, and `sector` are the
splitting features available to the tree.

### Distribution of asset risk

```{r risk-plot, fig.width = 6, fig.height = 4, fig.alt = "Scatter plot of asset beta vs volatility"}
plot(
  dat_portfolio$beta, dat_portfolio$vol,
  xlab = "Market beta", ylab = "Annualised volatility",
  pch  = 16, col = "#4E79A7AA",
  main = "Asset universe: volatility vs beta"
)
```

We expect ROOT to identify the high-beta, high-volatility corner of this space
as the region to exclude.

---

## Fitting ROOT

```{r fit, message = FALSE, warning = FALSE}
portfolio_fit <- ROOT(
  data        = dat_portfolio,
  num_trees   = 20,
  top_k_trees = TRUE,
  k           = 10,
  seed        = 42
)
```

ROOT grows 20 trees and selects the 10 with the lowest objective value as the
Rashomon set. Final asset weights $w_{\text{opt}}$ are determined by majority
vote across those 10 trees.

---

## Inspecting the Results

### Print summary

```{r print}
print(portfolio_fit)
```

### Detailed summary

```{r summary}
summary(portfolio_fit)
```

The **Diagnostics** section confirms that 20 trees were grown, 10 were
selected into the Rashomon set, and 96% of assets received $w_{\text{opt}} = 1$
(included in the portfolio). Only the 4 most risk-concentrated assets are
screened out.

---

## Visualizing the Characterized Tree

```{r plot, fig.width = 7, fig.height = 5, fig.alt = "Characterized tree for portfolio selection"}
plot(portfolio_fit)
```

The tree encodes a simple, actionable rule:

- **`beta < 1.7`**: include the asset unconditionally (88 assets).
- **`beta >= 1.7` and `vol < 0.33`**: include the asset despite high market
  sensitivity, because some risk may be controlled (8 assets).
- **`beta >= 1.7` and `vol >= 0.33`**: exclude the asset — high market
  sensitivity combined with high volatility drives up portfolio
  variance (4 assets, $w = 0$).

---

## Examining the Weights

The final weight assignments are stored in `portfolio_fit$D_rash$w_opt`. We
can compare the included and excluded assets directly.

```{r weights}
dat_portfolio$w_opt <- portfolio_fit$D_rash$w_opt

# Summary statistics by inclusion decision
included <- dat_portfolio[dat_portfolio$w_opt == 1, ]
excluded <- dat_portfolio[dat_portfolio$w_opt == 0, ]

cat("Included assets (w = 1):", nrow(included), "\n")
cat("  Mean beta:       ", round(mean(included$beta), 3), "\n")
cat("  Mean volatility: ", round(mean(included$vol),  3), "\n\n")

cat("Excluded assets (w = 0):", nrow(excluded), "\n")
cat("  Mean beta:       ", round(mean(excluded$beta), 3), "\n")
cat("  Mean volatility: ", round(mean(excluded$vol),  3), "\n")
```

The excluded assets have substantially higher beta and volatility than the
included ones, confirming that ROOT correctly targets the risk-concentrated
corner of the asset universe.

### Visualizing the inclusion decision

```{r weights-plot, fig.width = 6, fig.height = 4, fig.alt = "Asset universe with inclusion decisions highlighted"}
plot(
  dat_portfolio$beta, dat_portfolio$vol,
  xlab = "Market beta", ylab = "Annualised volatility",
  pch  = ifelse(dat_portfolio$w_opt == 1, 16, 4),
  col  = ifelse(dat_portfolio$w_opt == 1, "#4E79A7", "#E15759"),
  main = "Portfolio inclusion decisions"
)
legend(
  "topleft",
  legend = c("w = 1 (included)", "w = 0 (excluded)"),
  pch    = c(16, 4),
  col    = c("#4E79A7", "#E15759"),
  bty    = "n"
)
```

---

## Using a Custom Objective Function

ROOT's general optimization mode is not limited to the default variance
objective. You can supply any function of the form `function(D) -> numeric`
where `D` is the working data frame with a column `w` containing the current
weight assignments.

For example, suppose we want to minimize the **interquartile range** of
portfolio returns rather than variance. We can define a custom objective:

```{r custom-obj, message = FALSE, warning = FALSE}
iqr_objective <- function(D) {
  w <- D$w
  if (sum(w) == 0) return(Inf)
  # Weighted IQR: compute quantiles using the included assets only
  included_vsq <- D$vsq[w == 1]
  diff(quantile(included_vsq, probs = c(0.25, 0.75)))
}

portfolio_fit_iqr <- ROOT(
  data                = dat_portfolio,
  global_objective_fn = iqr_objective,
  num_trees           = 20,
  top_k_trees         = TRUE,
  k                   = 10,
  seed                = 112
)
```

The custom objective illustrates ROOT's flexibility: any scalar-valued
function of the weighted dataset can be used as the optimization target,
and the resulting characterized tree would still be interpretable.

---

## Key Parameters

| Parameter            | Role                                                                 | Default        |
|:---------------------|:---------------------------------------------------------------------|:---------------|
| `num_trees`          | Number of trees to grow in the forest                                | `10`           |
| `top_k_trees`        | If `TRUE`, select the top `k` trees by objective value               | `FALSE`        |
| `k`                  | Rashomon set size when `top_k_trees = TRUE`                          | `10`           |
| `cutoff`             | Rashomon threshold when `top_k_trees = FALSE`; `"baseline"` uses objective at $w \equiv 1$ | `"baseline"` |
| `vote_threshold`     | Fraction of Rashomon-set trees that must vote $w = 1$ for inclusion  | `2/3`          |
| `global_objective_fn`| Custom objective `function(D) -> numeric`; if `NULL`, uses default variance objective | `NULL` |
| `seed`               | Random seed for reproducibility                                      | `NULL`         |
| `feature_est`        | Feature importance method for split selection (`"Ridge"`, `"GBM"`, or custom) | `"Ridge"` |
| `leaf_proba`         | Controls tree depth by increasing the probability of stopping at a leaf | `0.25`      |

---

## Reference

Parikh, H., Ross, R. K., Stuart, E., & Rudolph, K. E. (2025). Who Are We
Missing?: A Principled Approach to Characterizing the Underrepresented
Population. *Journal of the American Statistical Association*, 120(551),
1414–1423. <https://doi.org/10.1080/01621459.2025.2495319>