---
title: "Introduction to Quantile-on-Quantile Regression"
author: "Dr. Merwan Roudane"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction to Quantile-on-Quantile Regression}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5
)
```

## Overview

The **QuantileOnQuantile** package implements the Quantile-on-Quantile (QQ) regression methodology developed by Sim and Zhou (2015). This approach estimates the effect that quantiles of one variable have on quantiles of another, capturing the dependence between their distributions.

### Why Quantile-on-Quantile Regression?

Traditional regression methods such as OLS estimate the effect of independent variables on the *conditional mean* of the dependent variable. Quantile regression extends this by estimating effects on *conditional quantiles*. However, neither approach distinguishes where the independent variable sits in its own distribution: each reports one slope (one per quantile of Y, in the case of quantile regression) regardless of whether X takes extreme or moderate values.
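
A small base-R simulation can make this limitation concrete. The sketch below is illustrative only: the variable names (`x_demo`, `y_demo`) and the data-generating process are assumptions, not part of the package. It fits one OLS slope to the full sample and then separate slopes for negative and non-negative values of X.

```{r mean_vs_regime}
set.seed(1)
n_demo <- 2000
x_demo <- rnorm(n_demo)
# Assumed for illustration: true slope is 0.8 when x < 0 and 0.5 otherwise
y_demo <- 0.5 * x_demo + 0.3 * x_demo * (x_demo < 0) + rnorm(n_demo, sd = 0.5)

# One slope for the whole sample versus one slope per region of X
slope_all <- coef(lm(y_demo ~ x_demo))[["x_demo"]]
slope_neg <- coef(lm(y_demo ~ x_demo, subset = x_demo < 0))[["x_demo"]]
slope_pos <- coef(lm(y_demo ~ x_demo, subset = x_demo >= 0))[["x_demo"]]

round(c(all = slope_all, negative_x = slope_neg, positive_x = slope_pos), 3)
```

The full-sample slope averages over the two regimes and so hides the difference between them.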

The QQ approach addresses this limitation by:

1. Modeling the **quantile of Y** (the dependent variable) as the outcome
2. Examining the effect of different **quantiles of X** (the independent variable)
3. Producing coefficients indexed by both $\theta$ (quantile of Y) and $\tau$ (quantile of X)

This allows researchers to ask questions like:

- Do large positive shocks in X affect Y differently than large negative shocks?
- Does the relationship between X and Y depend on market conditions (bull vs. bear)?
- Is the dependence between variables stronger in the tails of their distributions?
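
In practice, $\tau$ refers to a sample quantile of X and $\theta$ to a quantile of Y. As a minimal illustration (the vector `x_shock` is simulated and purely hypothetical), base R's `quantile()` maps quantile levels to values, so a low $\tau$ picks out large negative shocks and a high $\tau$ large positive ones:

```{r quantile_levels}
set.seed(123)
x_shock <- rnorm(1000)          # hypothetical shock series
tau_grid <- c(0.1, 0.5, 0.9)    # low, median, and high quantile levels
x_tau <- quantile(x_shock, probs = tau_grid)
x_tau

# Share of observations at or below each quantile is close to tau
sapply(x_tau, function(q) mean(x_shock <= q))
```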

### The Original Application: Oil Prices and Stock Returns

Sim and Zhou (2015) applied this methodology to examine the relationship between oil price shocks and US stock returns. They found that:

1. Large negative oil price shocks (low $\tau$) can positively affect US equities when the stock market is performing well (high $\theta$)
2. Positive oil price shocks have weak effects regardless of market conditions
3. The relationship is asymmetric and depends on both the nature of oil shocks and market state

## Installation

```{r, eval=FALSE}
# Install from CRAN (when available)
install.packages("QuantileOnQuantile")

# Install from GitHub
# install.packages("devtools")
devtools::install_github("merwanroudane/qq")
```

## Quick Start

### Basic Usage

```{r basic_example}
library(QuantileOnQuantile)

# Generate example data
set.seed(42)
n <- 300
x <- rnorm(n)
y <- 0.5 * x + 0.3 * x * (x < 0) + rnorm(n, sd = 0.5)  # Asymmetric relationship

# Run QQ regression
result <- qq_regression(y, x, verbose = FALSE)

# Print summary
print(result)
```

### Summary Statistics

```{r statistics}
# Get detailed summary
summary(result)

# Get statistics as data frame
stats <- qq_statistics(result)
print(stats)
```

## Visualization

The package provides several interactive visualization options using plotly.

### 3D Surface Plot

The 3D surface plot is the signature visualization of the QQ approach, showing how coefficients vary across both dimensions.

```{r 3d_plot, eval=FALSE}
# Coefficient surface with MATLAB-style Jet colorscale
plot_qq_3d(result, type = "coefficient", colorscale = "Jet")

# R-squared surface with Viridis colorscale
plot_qq_3d(result, type = "rsquared", colorscale = "Viridis")

# P-value surface
plot_qq_3d(result, type = "pvalue", colorscale = "Plasma")
```

### Available Color Scales

The package supports several color scales:

```{r colorscales}
qq_colorscales()
```

- **Jet**: MATLAB-style rainbow (blue -> cyan -> green -> yellow -> red)
- **BlueRed**: Diverging scale, useful for coefficients centered around zero
- **Viridis**: Perceptually uniform, colorblind-friendly
- **Plasma**: High contrast, perceptually uniform

### Heatmaps

Heatmaps provide a 2D view of the results:

```{r heatmap, eval=FALSE}
# Coefficient heatmap
plot_qq_heatmap(result, type = "coefficient", colorscale = "Viridis")

# R-squared heatmap
plot_qq_heatmap(result, type = "rsquared", colorscale = "Plasma")

# P-value heatmap
plot_qq_heatmap(result, type = "pvalue", colorscale = "Jet")
```

### Contour Plot

Contour plots show level curves of the coefficient surface:

```{r contour, eval=FALSE}
plot_qq_contour(result, colorscale = "Jet", show_labels = TRUE)
```

### Quantile Correlation

The correlation heatmap shows the relationship between quantiles of both variables:

```{r correlation, eval=FALSE}
plot_qq_correlation(y, x, quantiles = seq(0.1, 0.9, by = 0.1))
```

## Detailed Example: Simulating Asymmetric Relationships

Let's create data that mimics the oil-stock relationship from the original paper:

```{r detailed_example}
set.seed(2015)
n <- 500

# Generate "oil shocks"
oil_shock <- rnorm(n)

# Thresholds separating large negative, moderate, and large positive shocks
# (computed once rather than on every loop iteration)
q_low <- quantile(oil_shock, 0.3)
q_high <- quantile(oil_shock, 0.7)

# Slope of the stock-return response in each regime
slope <- ifelse(
  oil_shock < q_low, -0.02,            # large negative shocks: positive effect
  ifelse(oil_shock > q_high, -0.005,   # large positive shocks: weak effect
         -0.001)                       # moderate shocks: little effect
)

# Generate "stock returns": base return + asymmetric effect + noise
base_return <- 0.01
stock_return <- base_return + slope * oil_shock + rnorm(n, sd = 0.04)

# Run QQ regression with finer grid
result_oil <- qq_regression(
  y = stock_return, 
  x = oil_shock,
  y_quantiles = seq(0.1, 0.9, by = 0.1),
  x_quantiles = seq(0.1, 0.9, by = 0.1),
  verbose = FALSE
)

# Print summary
print(result_oil)
```

## Working with Results

### Extracting Results

The results are stored in a data frame:

```{r extract_results}
# Access raw results
head(result_oil$results)

# Convert to matrix format
coef_matrix <- qq_to_matrix(result_oil, type = "coefficient")
print(round(coef_matrix, 4))
```
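
The long-to-matrix conversion can be pictured with base R. The sketch below uses a toy data frame whose column names (`y_quantile`, `x_quantile`, `coefficient`) are hypothetical and may not match the columns of `result_oil$results`, and whose values are placeholders. It shows the reshaping idea rather than the package's internals.

```{r reshape_sketch}
# Toy long-format results: one row per (theta, tau) pair, placeholder values
toy_long <- expand.grid(y_quantile = c(0.25, 0.5, 0.75),
                        x_quantile = c(0.25, 0.5, 0.75))
toy_long$coefficient <- seq_len(nrow(toy_long)) / 10

# Rows index theta (quantile of Y), columns index tau (quantile of X)
toy_matrix <- xtabs(coefficient ~ y_quantile + x_quantile, data = toy_long)
toy_matrix
```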

### Exporting Results

```{r export, eval=FALSE}
# Export to CSV
qq_export(result_oil, file.path(tempdir(), "qq_results.csv"))
```

## Customizing Quantile Grids

You can customize the quantile grid for more or less granularity:

```{r custom_grid}
# Coarse grid (faster computation)
result_coarse <- qq_regression(
  y = stock_return,
  x = oil_shock,
  y_quantiles = seq(0.2, 0.8, by = 0.2),
  x_quantiles = seq(0.2, 0.8, by = 0.2),
  verbose = FALSE
)

# Fine grid (more detail, slower)
result_fine <- qq_regression(
  y = stock_return,
  x = oil_shock,
  y_quantiles = seq(0.05, 0.95, by = 0.05),
  x_quantiles = seq(0.05, 0.95, by = 0.05),
  verbose = FALSE
)

cat("Coarse grid combinations:", nrow(result_coarse$results), "\n")
cat("Fine grid combinations:", nrow(result_fine$results), "\n")
```
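
The cost of a finer grid grows with the product of the two grid lengths, since one fit is requested per $(\theta, \tau)$ pair. A quick base-R check of the number of requested pairs for the grids used in this vignette (the labels `coarse`, `medium`, and `fine` are introduced here for illustration):

```{r grid_size}
grids <- list(
  coarse = seq(0.2, 0.8, by = 0.2),
  medium = seq(0.1, 0.9, by = 0.1),
  fine   = seq(0.05, 0.95, by = 0.05)
)

# Same grid on both axes, so the number of pairs is the squared grid length
sapply(grids, function(g) length(g)^2)
```

Halving the step size roughly quadruples the number of quantile regressions to fit.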

## Methodology Details

### The QQ Model

The QQ approach is based on the following model:

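$$r_t = \beta^\theta(Oil_t) + \alpha^\theta r_{t-1} + v_t^\theta$$

where $r_t$ is the stock return, $Oil_t$ is the oil price shock, $v_t^\theta$ is an error term whose $\theta$-quantile is zero, and $\beta^\theta(\cdot)$ is an unknown function. The right-hand side without the error term is therefore the conditional $\theta$-quantile of the return.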

Taking a first-order Taylor expansion of $\beta^\theta(\cdot)$ around the $\tau$-quantile of oil shocks ($Oil^\tau$) gives:

$$\beta^\theta(Oil_t) \approx \beta_0(\theta, \tau) + \beta_1(\theta, \tau)(Oil_t - Oil^\tau)$$

The key insight is that $\beta_0(\theta, \tau)$ and $\beta_1(\theta, \tau)$ are *doubly indexed* by $\theta$ and $\tau$, capturing the dependence between both distributions.
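
Substituting this expansion into the model replaces the unknown function with a term that is linear in the parameters. As a worked step, with $\beta_0(\theta, \tau)$ standing for $\beta^\theta(Oil^\tau)$ and $\beta_1(\theta, \tau)$ for the derivative of $\beta^\theta(\cdot)$ evaluated at $Oil^\tau$, the right-hand side of the model becomes:

$$\beta_0(\theta, \tau) + \beta_1(\theta, \tau)(Oil_t - Oil^\tau) + \alpha^\theta r_{t-1} + v_t^\theta$$

For a fixed pair $(\theta, \tau)$ this is an ordinary linear quantile regression, which is what makes estimation over a grid of quantile pairs feasible.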

### Estimation

The estimation proceeds by:

1. For each $\tau$ (quantile of X): subset the data to observations where $X \le Q_\tau(X)$, the sample $\tau$-quantile of X
2. For each $\theta$ (quantile of Y): perform quantile regression on that subset
3. Extract coefficients and compute pseudo R-squared

The pseudo R-squared is computed as:

$$R^2 = 1 - \frac{\sum \rho_\theta(y - \hat{y})}{\sum \rho_\theta(y - Q_\theta(y))}$$

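where $\rho_\theta(u) = u(\theta - I(u < 0))$ is the check function, $\hat{y}$ is the fitted conditional $\theta$-quantile, and $Q_\theta(y)$ is the unconditional sample $\theta$-quantile of $y$.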
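
The three steps and the pseudo R-squared formula can be sketched in base R for a single $(\theta, \tau)$ pair. This is an illustrative approximation, not the package's implementation: the helper names (`check_loss`, `fit_one_cell`) are hypothetical, the quantile regression is fitted by numerically minimizing the check loss with `optim()` rather than with an exact solver such as `quantreg::rq()`, and the lagged-return term is omitted.

```{r estimation_sketch}
# Check function rho_theta(u) = u * (theta - I(u < 0))
check_loss <- function(u, theta) u * (theta - (u < 0))

# Hypothetical helper: one (theta, tau) cell of the QQ grid
fit_one_cell <- function(y, x, theta, tau) {
  keep <- x <= quantile(x, tau)   # step 1: subset on the tau-quantile of X
  ys <- y[keep]
  xs <- x[keep]

  # Step 2: approximate quantile regression by minimizing the summed check loss
  objective <- function(b) sum(check_loss(ys - b[1] - b[2] * xs, theta))
  start <- coef(lm(ys ~ xs))      # OLS fit as a starting value
  fit <- optim(start, objective, method = "Nelder-Mead",
               control = list(reltol = 1e-10, maxit = 5000))

  # Step 3: pseudo R-squared against the intercept-only quantile model
  loss_null <- sum(check_loss(ys - quantile(ys, theta), theta))
  c(intercept = fit$par[[1]], slope = fit$par[[2]],
    pseudo_r2 = 1 - fit$value / loss_null)
}

# Simulated data with a steeper slope for negative x (assumed for illustration)
set.seed(42)
x_sim <- rnorm(300)
y_sim <- 0.5 * x_sim + 0.3 * x_sim * (x_sim < 0) + rnorm(300, sd = 0.5)
fit_one_cell(y_sim, x_sim, theta = 0.5, tau = 0.5)
```

Looping `fit_one_cell()` over a grid of `theta` and `tau` values would give one row per quantile pair, which is the shape of output that a coefficient surface or heatmap needs.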

## Comparison with Standard Methods

### OLS vs Quantile Regression vs QQ

| Method | What it estimates | Captures heterogeneity in... |
|--------|------------------|------------------------------|
| OLS | $E[Y \mid X]$ | None (constant effect) |
| Quantile Regression | $Q_\theta[Y \mid X]$ | Y distribution |
| QQ Regression | $Q_\theta[Y \mid X_\tau]$ | Both Y and X distributions |

### When to Use QQ Regression

Use QQ regression when you suspect that:

1. The effect of X on Y varies across the distribution of Y (e.g., bull vs bear markets)
2. The effect differs for large vs small values of X (e.g., large vs small shocks)
3. There is asymmetry (e.g., positive vs negative shocks have different effects)
4. You want to understand the complete dependence structure between two variables

## References

Sim, N. and Zhou, H. (2015). Oil Prices, US Stock Return, and the Dependence Between Their Quantiles. *Journal of Banking & Finance*, 55, 1-8. doi:10.1016/j.jbankfin.2015.01.013

Koenker, R. (2005). *Quantile Regression*. Cambridge University Press.

Koenker, R. and Xiao, Z. (2006). Quantile Autoregression. *Journal of the American Statistical Association*, 101, 980-990.

## Session Info

```{r session_info}
sessionInfo()
```
