---
title: "Reporting with tidylearn"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Reporting with tidylearn}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 5,
  message = FALSE,
  warning = FALSE
)
```

## Overview

tidylearn is designed so that analysis results flow directly into reports.
Every model produces tidy tibbles, ggplot2 visualisations, and — with the
`tl_table_*()` functions — polished `gt` tables, all with a consistent
interface. This vignette walks through the reporting tools available.

```{r setup}
library(tidylearn)
library(dplyr)
library(ggplot2)
library(gt)
```

---

## Plots

tidylearn's `plot()` method dispatches to the right visualisation for each
model type. All plots are ggplot2 objects — themeable, composable, and
convertible to plotly.

### Regression

```{r plot-regression}
model_reg <- tl_model(mtcars, mpg ~ wt + hp, method = "linear")

# Actual vs predicted — one call
plot(model_reg, type = "actual_predicted")
```
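As noted above, the returned object is an ordinary ggplot, so the usual ggplot2 layers apply; for example, swapping the theme and adding a title:

```{r plot-regression-themed}
plot(model_reg, type = "actual_predicted") +
  theme_minimal() +
  labs(title = "Actual vs predicted mpg")
```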

### Classification

```{r plot-classification}
split <- tl_split(iris, prop = 0.7, stratify = "Species", seed = 42)
model_clf <- tl_model(split$train, Species ~ ., method = "forest")

plot(model_clf, type = "confusion")
```

### PCA

```{r plot-pca}
pca <- tidy_pca(USArrests, scale = TRUE)

tidy_pca_screeplot(pca)
tidy_pca_biplot(pca, label_obs = TRUE)
```
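Because both PCA plots are ggplot objects, they compose with the patchwork package (not loaded in the setup chunk above); a sketch, assuming patchwork is installed:

```{r plot-pca-combined, eval = FALSE}
library(patchwork)

# Place the scree plot and biplot side by side
tidy_pca_screeplot(pca) + tidy_pca_biplot(pca, label_obs = TRUE)
```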

### Regularisation

```{r plot-lasso}
model_lasso <- tl_model(mtcars, mpg ~ ., method = "lasso")

tl_plot_regularization_path(model_lasso)
tl_plot_regularization_cv(model_lasso)
```
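Any of these plots can be written to disk with ggplot2's `ggsave()`, for instance to embed in a non-R document:

```{r plot-save, eval = FALSE}
# Save the regularisation path as a PNG at the vignette's figure size
ggsave("lasso_path.png", tl_plot_regularization_path(model_lasso),
       width = 7, height = 5)
```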

---

## Tables

The `tl_table()` family mirrors the plot interface but produces formatted
`gt` tables instead of plots. Like `plot()`, `tl_table()` dispatches on the
model type; an optional `type` argument requests a specific table:

```{r table-auto, eval = FALSE}
tl_table(model)                        # auto-selects the best table type
tl_table(model, type = "coefficients") # request a specific table type
```

### Evaluation Metrics

```{r table-metrics}
tl_table_metrics(model_reg)
```
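Since the `tl_table_*()` functions return `gt` objects (as described in the overview), the result can be refined with gt's own functions, for example `tab_header()`:

```{r table-metrics-header}
tl_table_metrics(model_reg) |>
  tab_header(
    title = "Linear model: mpg ~ wt + hp",
    subtitle = "Training-set metrics"
  )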

### Coefficients

For linear and logistic models, the table includes standard errors, test
statistics, and p-values, with significant terms highlighted:

```{r table-coef}
tl_table_coefficients(model_reg)
```

For regularised models, coefficients are sorted by magnitude and zero
coefficients are greyed out:

```{r table-coef-lasso}
tl_table_coefficients(model_lasso)
```

### Confusion Matrix

A formatted confusion matrix with correct predictions highlighted on the
diagonal:

```{r table-confusion}
tl_table_confusion(model_clf, new_data = split$test)
```

### Feature Importance

A ranked importance table with a colour gradient:

```{r table-importance}
tl_table_importance(model_clf)
```

### PCA Variance Explained

Cumulative variance is coloured green to highlight how many components are
needed:

```{r table-variance}
pca_model <- tl_model(USArrests, method = "pca")
tl_table_variance(pca_model)
```

### PCA Loadings

A diverging red–blue colour scale highlights strong positive and negative
loadings:

```{r table-loadings}
tl_table_loadings(pca_model)
```

### Cluster Summary

Cluster sizes and mean feature values:

```{r table-clusters}
km <- tl_model(iris[, 1:4], method = "kmeans", k = 3)
tl_table_clusters(km)
```

### Model Comparison

Compare multiple models side-by-side:

```{r table-comparison}
m1 <- tl_model(split$train, Species ~ ., method = "logistic")
m2 <- tl_model(split$train, Species ~ ., method = "forest")
m3 <- tl_model(split$train, Species ~ ., method = "tree")

tl_table_comparison(
  m1, m2, m3,
  new_data = split$test,
  names = c("Logistic", "Random Forest", "Decision Tree")
)
```

---

## Interactive Reporting with plotly

Because all plot functions return ggplot2 objects, converting to interactive
plotly charts is a one-liner:

```{r plotly, eval = FALSE}
library(plotly)

ggplotly(plot(model_reg, type = "actual_predicted"))
ggplotly(tidy_pca_biplot(pca, label_obs = TRUE))
ggplotly(tl_plot_regularization_path(model_lasso))
```
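The resulting plotly widgets can be saved as self-contained HTML pages with the htmlwidgets package:

```{r plotly-save, eval = FALSE}
library(htmlwidgets)

# Write an interactive version of the actual-vs-predicted plot
saveWidget(
  ggplotly(plot(model_reg, type = "actual_predicted")),
  "actual_predicted.html",
  selfcontained = TRUE
)
```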

---

## Putting It Together

A typical reporting workflow combines plots and tables for the same model.
Because the interface is consistent, the same pattern works regardless of
the algorithm:

```{r workflow}
# Fit
model <- tl_model(split$train, Species ~ ., method = "forest")

# Evaluate
tl_table_metrics(model, new_data = split$test)

# Visualise
plot(model, type = "confusion")

# Drill into feature importance
tl_table_importance(model, top_n = 4)
```

Swap `method = "forest"` for `method = "tree"` or `method = "svm"` and the
reporting code above works without modification.
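Since only the `method` string changes, comparing several algorithms is a short loop. A sketch, assuming each method accepts the same formula interface shown throughout this vignette:

```{r workflow-loop, eval = FALSE}
methods <- c("forest", "tree", "svm")

# Fit one model per method with identical data and formula
models <- lapply(methods, function(m) {
  tl_model(split$train, Species ~ ., method = m)
})
names(models) <- methods

# One test-set metrics table per fitted model
lapply(models, tl_table_metrics, new_data = split$test)
```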
