---
title: "Monitoring and Retrieving Extraction Jobs"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Monitoring and Retrieving Extraction Jobs}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>",
  eval     = FALSE
)
```

## Overview

When `extract_batch()` submits a table-exporter job, it runs asynchronously on the RAP cloud. The `job_*` functions let you monitor progress, inspect job history, and load results once the job completes.

---

## Typical Workflow

```{r workflow}
library(ukbflow)

# 1. Submit extraction job
job_id <- extract_batch(c(31, 53, 21022, 22189), file = "ukb_demographics")

# 2. Wait for completion
job_wait(job_id)

# 3. Load result (RAP only)
df <- job_result(job_id)
```

---

## Monitoring a Job

### Check status

`job_status()` returns the current state of a job:

```{r job-status}
job_status(job_id)
#> job-XXXXXXXXXXXX
#>            done
```

Possible states:

| State | Meaning |
|---|---|
| `idle` | Queued, waiting to be scheduled |
| `runnable` | Resources being allocated |
| `running` | Actively executing |
| `done` | Completed successfully |
| `failed` | Failed — see failure message |
| `terminated` | Manually terminated |

For failed jobs, the error message is accessible via:

```{r job-failed}
s <- job_status(job_id)
if (s == "failed") cli::cli_inform(attr(s, "failure_message"))
```

### Wait for completion

`job_wait()` polls at regular intervals until the job reaches a terminal state:

```{r job-wait}
job_wait(job_id)                    # wait indefinitely (default)
job_wait(job_id, interval = 60)     # poll every 60 seconds
job_wait(job_id, timeout = 7200)    # give up after 2 hours
```

`job_wait()` stops with an error if the job fails or is terminated, so you can safely chain it with `job_result()`:

```{r job-wait-chain}
job_wait(job_id)
df <- job_result(job_id)
```

---

## Retrieving Results

### Get the file path

`job_path()` returns the `/mnt/project/` path of the output CSV on RAP:

```{r job-path}
path <- job_path(job_id)
#> "/mnt/project/results/ukb_demographics.csv"
```

Use this to read the file directly or pass it to other tools:

```{r job-path-read}
df <- data.table::fread(job_path(job_id))
```

### Load into R

`job_result()` combines `job_path()` and `fread()` in one step. Must be run inside the RAP environment:

```{r job-result}
df <- job_result(job_id)
# returns a data.table, e.g. 502353 rows x 5 cols (incl. eid)
```

---

## Browsing Job History

`job_ls()` returns a summary of recent jobs:

```{r job-ls}
job_ls()          # last 20 jobs
job_ls(n = 5)     # last 5 jobs

# Filter by state
job_ls(state = "failed")
job_ls(state = c("done", "failed"))
```

The result is a data.frame with columns:

| Column | Description |
|---|---|
| `job_id` | Job ID, e.g. `job-XXXXXXXXXXXX` |
| `name` | Job name (typically `Table exporter`) |
| `state` | Current state |
| `created` | Job creation time (`POSIXct`) |
| `runtime` | Runtime string, e.g. `0:04:36` (`NA` if still running) |

---

## Getting Help

- `?job_status`, `?job_wait`, `?job_path`, `?job_result`, `?job_ls`
- `vignette("extract")` — submitting extraction jobs
- [GitHub Issues](https://github.com/evanbio/ukbflow/issues)