---
title: "Getting started with rcloner"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Getting started with rcloner}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(rcloner)
has_rclone <- rclone_available()
if (!has_rclone) {
  message("rclone is not installed on this system. ",
          "Code chunks that require rclone are skipped. ",
          "Install with install_rclone().")
}
```

## Overview

`rcloner` provides an R interface to [rclone](https://rclone.org), a
command-line program that supports over 40 cloud storage backends, including:

- **S3-compatible** stores: Amazon S3, MinIO, Ceph, Cloudflare R2, Backblaze B2, …
- **Google Cloud Storage** and Google Drive
- **Azure Blob Storage**
- **Dropbox**, **OneDrive**, **Box**, and many others

All file operations (copy, sync, list, move, delete, …) share a consistent
interface regardless of the storage backend.
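
To illustrate (the remote names `aws` and `minio` below are placeholders for
remotes you have configured yourself, as described later in this vignette),
the same listing call works against any backend or a local path:

```{r overview-ls, eval=FALSE}
rclone_ls("aws:my-bucket")    # Amazon S3
rclone_ls("minio:my-bucket")  # MinIO or another S3-compatible store
rclone_ls("/tmp/data")        # local filesystem
```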

## Installation

Install from CRAN:

```r
install.packages("rcloner")
```

Or the development version from GitHub:

```r
# install.packages("pak")
pak::pak("boettiger-lab/rcloner")
```

### Installing the rclone binary

`rcloner` automatically locates a system-installed rclone binary.  If rclone
is not already on your `PATH`, install it with:

```{r install, eval=FALSE}
install_rclone()
```

This downloads the appropriate pre-built binary for your operating system and
architecture from <https://downloads.rclone.org> and stores it in a
user-writable directory — no system privileges required.

Check the installed version:

```{r version, eval = has_rclone}
rclone_version()
```

## Configuring a remote

`rcloner` manages cloud storage credentials through *remotes*, which are named
configurations stored in rclone's config file.

### Amazon S3

```{r s3-config, eval=FALSE}
rclone_config_create(
  "aws",
  type     = "s3",
  provider = "AWS",
  access_key_id     = Sys.getenv("AWS_ACCESS_KEY_ID"),
  secret_access_key = Sys.getenv("AWS_SECRET_ACCESS_KEY"),
  region            = "us-east-1"
)
```

### MinIO / S3-compatible

```{r minio-config, eval=FALSE}
rclone_config_create(
  "minio",
  type     = "s3",
  provider = "Minio",
  access_key_id     = Sys.getenv("MINIO_ACCESS_KEY"),
  secret_access_key = Sys.getenv("MINIO_SECRET_KEY"),
  endpoint          = "https://minio.example.com"
)
```

### Listing configured remotes

```{r listremotes, eval=FALSE}
rclone_listremotes()
```

## Listing objects

`rclone_ls()` returns a data frame of objects at a given path.

### Local paths (no credentials needed)

```{r ls-local, eval = has_rclone}
# List a local directory
rclone_ls(tempdir(), files_only = TRUE)
```

### Remote paths

```{r ls-remote, eval=FALSE}
# List a bucket on a configured remote
rclone_ls("aws:my-bucket")

# Recursive listing
rclone_ls("aws:my-bucket/data/", recursive = TRUE)

# Directories only
rclone_lsd("aws:my-bucket")
```

`rclone_ls()` parses `rclone lsjson` output and returns a data frame with
columns `Path`, `Name`, `Size`, `MimeType`, `ModTime`, and `IsDir`.
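
Because the result is an ordinary data frame, the usual R tools apply.  A
small sketch using an illustrative stand-in shaped like `rclone_ls()` output
(the paths and sizes here are made up, not real listing results):

```{r ls-dataframe}
# Stand-in for rclone_ls() output; directories report Size = -1
objects <- data.frame(
  Path  = c("data/a.csv", "data/b.csv", "logs"),
  Name  = c("a.csv", "b.csv", "logs"),
  Size  = c(1024, 2048576, -1),
  IsDir = c(FALSE, FALSE, TRUE)
)

# Keep files only, largest first
files <- objects[!objects$IsDir, ]
files[order(-files$Size), "Path"]
```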

## Copying and syncing files

### Copy

`rclone_copy()` copies files from source to destination, skipping identical
files.  It never deletes destination files.

```{r copy-local, eval = has_rclone}
src  <- tempfile()
dest <- tempfile()
dir.create(src)
dir.create(dest)
writeLines("hello from rcloner", file.path(src, "readme.txt"))

rclone_copy(src, dest)
list.files(dest)
```

```{r cleanup-copy, echo = FALSE, eval = has_rclone}
unlink(src,  recursive = TRUE)
unlink(dest, recursive = TRUE)
```

### Copy to/from the cloud

```{r copy-cloud, eval=FALSE}
# Upload a local directory to S3
rclone_copy("/local/data", "aws:my-bucket/data")

# Download a file from S3
rclone_copy("aws:my-bucket/report.csv", "/local/downloads/")

# Copy a URL directly to cloud storage (no local intermediate)
rclone_copyurl(
  "https://raw.githubusercontent.com/tidyverse/readr/main/inst/extdata/mtcars.csv",
  "aws:my-bucket/mtcars.csv"
)
```

### Sync

`rclone_sync()` makes the destination *identical* to the source, deleting
destination files that are not in the source.  Use with care.

```{r sync, eval=FALSE}
rclone_sync("aws:my-bucket/data", "/local/backup")
```

### Move

`rclone_move()` copies files and then deletes the source.

```{r move, eval=FALSE}
rclone_move("aws:staging/file.csv", "aws:archive/2024/file.csv")
```

## Other file operations

```{r ops, eval=FALSE}
# Read a remote file into R
contents <- rclone_cat("aws:my-bucket/config.yaml")

# Get metadata for an object
rclone_stat("aws:my-bucket/data.csv")

# Total size of a path
rclone_size("aws:my-bucket")

# Create a bucket/directory
rclone_mkdir("aws:new-bucket")

# Delete files (keeps directories)
rclone_delete("aws:my-bucket/old-data/")

# Remove a path and all its contents
rclone_purge("aws:my-bucket/scratch")

# Generate a public link (where supported)
rclone_link("aws:my-bucket/report.html")

# Get storage quota info
rclone_about("aws:")
```

## Using the low-level rclone() wrapper

Every rclone subcommand is accessible via the `rclone()` function, which
accepts a character vector of arguments:

```{r lowlevel, eval=FALSE}
# Equivalent to: rclone version
rclone("version")

# Run any rclone command
rclone(c("check", "aws:bucket", "/local/backup", "--one-way"))
```

## Migrating from minioclient

If you are migrating from the `minioclient` package, the function mapping is:

| `minioclient`       | `rcloner`                |
|---------------------|--------------------------|
| `mc_alias_set()`    | `rclone_config_create()` |
| `mc_cp()`           | `rclone_copy()`          |
| `mc_mv()`           | `rclone_move()`          |
| `mc_mirror()`       | `rclone_sync()`          |
| `mc_ls()`           | `rclone_ls()`            |
| `mc_cat()`          | `rclone_cat()`           |
| `mc_mb()`           | `rclone_mkdir()`         |
| `mc_rb()`           | `rclone_purge()`         |
| `mc_rm()`           | `rclone_delete()`        |
| `mc_du()`           | `rclone_size()`          |
| `mc_stat()`         | `rclone_stat()`          |
| `mc()`              | `rclone()`               |

The main difference is that `rcloner` uses *remotes* (e.g. `"aws:bucket"`)
rather than *aliases* (e.g. `"alias/bucket"`).  Remote configuration is done
with `rclone_config_create()` instead of `mc_alias_set()`.
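
As a sketch of the translation (the remote name `play` and the endpoint are
illustrative, and the commented `minioclient` calls are shown for comparison
only):

```{r migration-example, eval=FALSE}
# minioclient workflow (for comparison):
#   mc_alias_set("play", "play.min.io", access_key, secret_key)
#   mc_cp("play/my-bucket/data.csv", "local-dir/")

# rcloner equivalent:
rclone_config_create(
  "play",
  type              = "s3",
  provider          = "Minio",
  access_key_id     = access_key,
  secret_access_key = secret_key,
  endpoint          = "https://play.min.io"
)
rclone_copy("play:my-bucket/data.csv", "local-dir/")
```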
