---
title: "Advanced Survival Models"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Advanced Survival Models}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

## Introduction

`singleEventSurvival()` supports non-parametric, semi-parametric, and parametric
survival estimators through a common interface. This vignette shows how to compare
those models once a survival dataset has already been prepared from Eunomia.

## Example Input

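All examples below assume you have already created a survival dataset with the internal
`addCohortSurvival()` helper and that it includes age and gender columns. The setup chunk
builds a small stand-in data frame with those columns so the later calls have concrete input.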

```{r setup}
library(OdysseusSurvivalModule)

survivalData <- data.frame(
  subject_id = 1:8,
  time = c(15, 21, 40, 55, 60, 74, 90, 120),
  status = c(1, 0, 1, 0, 1, 0, 1, 0),
  age_years = c(44, 51, 67, 39, 73, 58, 62, 47),
  gender = c("Female", "Male", "Female", "Male", "Female", "Male", "Female", "Male")
)
```

## Model Choices

Supported values for `model` are:

- `"km"`
- `"cox"`
- `"weibull"`
- `"exponential"`
- `"lognormal"`
- `"loglogistic"`

Every model returns the same high-level structure: a named list with one entry per
stratum plus an `overall` entry, each holding a `data` and a `summary` component.
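
As a rough illustration of that shape, the sketch below builds a hypothetical placeholder
list (`mockFit`, with intentionally empty tables) and a hypothetical helper,
`curveEntries()`, that picks out the entries carrying both `data` and `summary`. Neither
name is part of the package. A helper like this may be handy because the stratified fits
shown later in this vignette also carry `logrank_test_*` entries, which are not curves.

```{r result-structure-sketch}
# Hypothetical placeholder mimicking the documented shape. The tables are
# intentionally empty; this object is not produced by the package.
emptyEntry <- list(
  data = data.frame(time = numeric(0), survival = numeric(0)),
  summary = data.frame(medianSurvival = numeric(0), meanSurvival = numeric(0))
)
mockFit <- list(
  "gender=Female" = emptyEntry,
  overall = emptyEntry,
  logrank_test_gender = list() # placeholder; the real contents are not shown here
)

# Hypothetical helper: names of the entries that hold both `data` and `summary`.
curveEntries <- function(fit) {
  hasCurve <- vapply(
    fit,
    function(entry) is.list(entry) && all(c("data", "summary") %in% names(entry)),
    logical(1)
  )
  names(fit)[hasCurve]
}

curveEntries(mockFit)
```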

## Cox Model

```{r cox-model}
coxFit <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "cox",
  covariates = c("age_years")
)

coxFit[["overall"]]$summary
head(coxFit[["overall"]]$data)
```

The Cox path uses covariates when fitting the model, but the returned object is still
survival-oriented. It does not expose regression coefficients or hazard ratios.
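
If hazard ratios are needed, one option is to fit a Cox model directly on the same
columns. The sketch below uses `coxph()` from the `survival` package, which is an
assumption on our part: it is a separate fit and is not claimed to reproduce what
`singleEventSurvival()` does internally.

```{r cox-hazard-ratio-sketch}
library(survival)

# Same toy data as the setup chunk, repeated so the sketch is self-contained.
survivalData <- data.frame(
  time = c(15, 21, 40, 55, 60, 74, 90, 120),
  status = c(1, 0, 1, 0, 1, 0, 1, 0),
  age_years = c(44, 51, 67, 39, 73, 58, 62, 47)
)

# Separate Cox fit; independent of singleEventSurvival().
coxModel <- coxph(Surv(time, status) ~ age_years, data = survivalData)

# Hazard ratio per one-year increase in age, with a 95% Wald interval.
hazardRatio <- exp(coef(coxModel))
hazardRatioCi <- exp(confint(coxModel))

hazardRatio
hazardRatioCi
```

With only eight rows the interval will be wide, so treat the numbers as a mechanical
check rather than an estimate worth interpreting.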

## Parametric Models

```{r parametric-models}
weibullFit <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "weibull",
  covariates = c("age_years")
)

lognormalFit <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "lognormal",
  covariates = c("age_years")
)

weibullFit[["overall"]]$summary
lognormalFit[["overall"]]$summary
```

Parametric models also return a `data` table with estimated survival, hazard, and
cumulative hazard evaluated on the observed event-time grid.
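
To make the relationship between those three columns concrete, the sketch below fits an
intercept-only Weibull model with `survreg()` from the `survival` package and evaluates
survival, hazard, and cumulative hazard on the observed event times. Using `survreg()`
and the column names shown is an assumption for illustration; the package's own `data`
table may be computed and named differently.

```{r weibull-curve-sketch}
library(survival)

survivalData <- data.frame(
  time = c(15, 21, 40, 55, 60, 74, 90, 120),
  status = c(1, 0, 1, 0, 1, 0, 1, 0)
)

weibullModel <- survreg(Surv(time, status) ~ 1, data = survivalData, dist = "weibull")

# survreg() uses a log-linear parameterisation; convert to the usual
# Weibull shape and scale (shape = 1 / scale, scale = exp(intercept)).
weibullShape <- 1 / weibullModel$scale
weibullScale <- exp(unname(coef(weibullModel)))

# Grid of observed event times (status == 1).
eventTimes <- sort(unique(survivalData$time[survivalData$status == 1]))

weibullCurve <- data.frame(
  time = eventTimes,
  survival = exp(-(eventTimes / weibullScale)^weibullShape),
  hazard = (weibullShape / weibullScale) *
    (eventTimes / weibullScale)^(weibullShape - 1),
  cumulativeHazard = (eventTimes / weibullScale)^weibullShape
)

weibullCurve
```

The cumulative hazard equals `-log(survival)` at every grid point, which is a quick
consistency check for any table of this kind.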

## Comparing Models

One practical way to compare models is to extract the same summary fields from each fit.

```{r model-comparison}
modelNames <- c("km", "cox", "weibull", "lognormal")

fits <- lapply(modelNames, function(modelName) {
  singleEventSurvival(
    survivalData = survivalData,
    timeScale = "days",
    model = modelName,
    covariates = if (modelName == "km") NULL else c("age_years")
  )
})
names(fits) <- modelNames

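# Pull one field from each fit's overall summary. as.numeric() guards
# against vapply() failing if a field comes back as a logical NA.
getOverallField <- function(fit, field) {
  as.numeric(fit[["overall"]]$summary[[field]])
}

comparison <- data.frame(
  model = names(fits),
  medianSurvival = vapply(fits, getOverallField, numeric(1), field = "medianSurvival"),
  meanSurvival = vapply(fits, getOverallField, numeric(1), field = "meanSurvival"),
  stringsAsFactors = FALSE
)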

comparison
```
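
Summary fields describe each fit but do not say which parametric shape fits best. One
common complement is to compare AIC across distributions. The sketch below does this
with `survreg()` from the `survival` package on intercept-only models; this is an
assumption for illustration and does not use fit statistics from
`singleEventSurvival()`, which this vignette does not describe.

```{r parametric-aic-sketch}
library(survival)

survivalData <- data.frame(
  time = c(15, 21, 40, 55, 60, 74, 90, 120),
  status = c(1, 0, 1, 0, 1, 0, 1, 0)
)

distributions <- c("weibull", "exponential", "lognormal", "loglogistic")

# Fit one intercept-only model per distribution.
parametricFits <- lapply(distributions, function(distName) {
  survreg(Surv(time, status) ~ 1, data = survivalData, dist = distName)
})
names(parametricFits) <- distributions

aicTable <- data.frame(
  distribution = distributions,
  logLik = vapply(parametricFits, function(fit) as.numeric(logLik(fit)), numeric(1)),
  aic = vapply(parametricFits, AIC, numeric(1)),
  row.names = NULL
)

# Lower AIC suggests a better trade-off between fit and complexity.
aicTable[order(aicTable$aic), ]
```

On a dataset this small the differences are unlikely to be meaningful; the point is the
pattern, which carries over to covariate-adjusted models.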

## Stratified Fitting

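`strata` accepts `"gender"` and `"age_group"`. When both are supplied, the package fits
each stratification variable separately rather than forming joint (interaction) strata.
The example below also passes `ageBreaks` to define the age groups.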

```{r stratified-models}
stratifiedFit <- singleEventSurvival(
  survivalData = survivalData,
  timeScale = "days",
  model = "weibull",
  covariates = c("age_years"),
  strata = c("gender", "age_group"),
  ageBreaks = list(c(18, 49), c(50, 64), c(65, Inf))
)

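# Stratum entries are accessed as "<variable>=<level>".
names(stratifiedFit)
stratifiedFit[["gender=Female"]]$summary
stratifiedFit[["age_group=65+"]]$summary

# One log-rank test entry per stratification variable.
stratifiedFit$logrank_test_gender
stratifiedFit$logrank_test_age_group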
```
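
The difference between separate and joint strata can be seen with plain Kaplan-Meier
fits. The sketch below uses `survfit()` and `survdiff()` from the `survival` package and
derives `age_group` with `cut()`; these choices are assumptions for illustration and are
not claimed to match the package's internal implementation.

```{r separate-strata-sketch}
library(survival)

survivalData <- data.frame(
  time = c(15, 21, 40, 55, 60, 74, 90, 120),
  status = c(1, 0, 1, 0, 1, 0, 1, 0),
  age_years = c(44, 51, 67, 39, 73, 58, 62, 47),
  gender = c("Female", "Male", "Female", "Male", "Female", "Male", "Female", "Male")
)

# Derive age groups matching the 18-49, 50-64, and 65+ bands.
survivalData$age_group <- cut(
  survivalData$age_years,
  breaks = c(18, 50, 65, Inf),
  right = FALSE,
  labels = c("18-49", "50-64", "65+")
)

# Separate fits: one model per stratification variable.
strataVars <- c("gender", "age_group")
separateFits <- lapply(strataVars, function(strataVar) {
  survfit(as.formula(paste("Surv(time, status) ~", strataVar)), data = survivalData)
})
names(separateFits) <- strataVars
separateStrata <- unlist(
  lapply(separateFits, function(fit) names(fit$strata)),
  use.names = FALSE
)
separateStrata

# Joint fit, for contrast: every stratum is a gender-by-age combination.
jointFit <- survfit(Surv(time, status) ~ gender + age_group, data = survivalData)
names(jointFit$strata)

# Log-rank test for one stratification variable.
genderLogrank <- survdiff(Surv(time, status) ~ gender, data = survivalData)
genderPValue <- pchisq(
  genderLogrank$chisq,
  df = length(genderLogrank$n) - 1,
  lower.tail = FALSE
)
genderPValue
```

Separate fits keep every stratum as large as possible, which matters when the joint
combinations would leave only one or two subjects per cell.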

## Working with Returned Curves

Each fitted entry can be plotted from its `data` component.

```{r curve-plot}
curveData <- weibullFit[["overall"]]$data

plot(
  curveData$time,
  curveData$survival,
  type = "l",
  xlab = "Time (days)",
  ylab = "Survival probability",
  main = "Weibull survival curve"
)
```

## When to Use Which Model

- Use `"km"` for descriptive, assumption-light summaries.
- Use `"cox"` when covariates matter but you still want a survival-curve summary.
- Use parametric models when you want a fully specified survival shape and smoother
  predicted curves.

## Summary

The advanced usage pattern is mostly about choosing the right `model` value and then
extracting comparable summaries from the returned list structure.

This vignette covered:

1. **Cox models** - Fitting with covariates and reading the survival-oriented output
2. **Parametric models** - Weibull, exponential, lognormal, loglogistic
3. **Model comparison** - Extracting the same summary fields from each fit
4. **Stratification** - Gender and age-group strata, fitted separately
5. **Visualization** - Plotting returned survival curves

For getting started, see the "Getting Started" vignette.
