---
title: "Associated data"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Associated data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{css, echo=FALSE}
.table{
  width: auto;
  font-size: 14px;
}
.table caption {
  font-size: 1em;
}
```

This article describes the datasets bundled with the `expowo` package, which 
help improve performance before and while mining global diversity and 
distribution data.\

The package includes four associated datasets, all of which can be loaded into 
the R environment by calling the `utils` function `data`. Below, we explain how 
to use `POWOcodes`, `angioData`, `angioGenera`, and `botregions`.\

\

## Setup

Install the latest development version of __expowo__ from 
[GitHub](https://github.com/):

``` r
#install.packages("devtools")
devtools::install_github("DBOSlab/expowo")
```
```{r package load, message=FALSE, warning=FALSE}
library(expowo)
```
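
To check which datasets ship with the installed version of the package, the 
base `utils::data` function can be called with only its `package` argument. 
This is a minimal sketch that assumes `expowo` is already installed; the names 
returned depend on the package version.

```r
# List every dataset bundled with the installed expowo package.
available <- utils::data(package = "expowo")

# "results" is a character matrix; its "Item" column holds the dataset names.
available$results[, "Item"]
```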

\

## `POWOcodes` - A complete list of vascular plant families and associated URI addresses

`POWOcodes` is a data frame listing the families of vascular plants available 
in the POWO database. The flowering plant families follow the 
[APG IV](https://doi.org/10.1111/boj.12385) classification. It was produced 
using the functions `apgFamilies` and `get_pow` of the package 
[taxize](https://docs.ropensci.org/taxize/index.html). `POWOcodes` is most 
useful when querying many plant families, or all of them, at once. In addition 
to the names of POWO's accepted plant families, which are required in any search 
with `expowo`'s functions, the dataset holds each family's URI, which the 
functions use internally to open a connection with the family page at Plants of 
the World Online. Plant families are also classified into three groups: 
"angiosperm", "gymnosperm", and "fern and lycophyte".

To load this dataset into the R environment, use the code below:\

```{r, eval = FALSE}
utils::data("POWOcodes")
```
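
Once loaded, the object can be inspected with base R before it is passed to a 
query. The sketch below makes no assumption about column names; it only prints 
the structure, so the columns that hold the family names, URIs, and plant groups 
should be confirmed from that output.

```r
library(expowo)

# Load the bundled data frame into the current environment.
utils::data("POWOcodes")

# Show column names and types, then the first few rows.
str(POWOcodes)
head(POWOcodes)

# Number of rows (one per family, if each family is listed once).
nrow(POWOcodes)
```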

\

## `angioData` - A dataframe of Angiosperm species

`angioData` is a data frame listing the species of selected flowering plant 
families and their associated data, as retrieved from the POWO database using 
`expowo`'s function `powoSpecies`. Because of publication and package size 
constraints, only a few families are included, so that the information is 
readily available. It currently covers species from the five plant families 
listed in Table 1. This dataset was last updated in November 2022.\
\

TABLE 1. Plant families and number of species in `angioData`.
\

| family | species_number | family | species_number |
|:---:|:---:|:---:|:---:|
| Lecythidaceae | 381 | Aristolochiaceae | 727 |
| Martyniaceae | 14 | Cabombaceae | 7 | 
| Begoniaceae | 1993 | | |

\

In addition to all accepted species (excluding hybrids) in the families above, 
`angioData` also contains the publication, authorship, and global geographic 
distribution of each species at the country or botanical division level. The 
classification of botanical divisions follows the 
[World Geographical Scheme for Recording Plant Distributions](https://www.tdwg.org/standards/wgsrpd/),
which is already associated with each taxon's distribution in POWO.\

The example below shows how to load `angioData`:\

```{r, eval = FALSE}
utils::data("angioData")
```
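
As a quick sanity check, the species counts in Table 1 can be recomputed from 
the loaded object. This sketch assumes the data frame stores family names in a 
column called `family`, which is an assumption rather than something stated 
above, so the code checks for that column first.

```r
library(expowo)
utils::data("angioData")

# Assumption: family names are stored in a column called "family".
if ("family" %in% names(angioData)) {
  # Count rows (species) per family, largest family first.
  species_per_family <- sort(table(angioData$family), decreasing = TRUE)
  print(species_per_family)
} else {
  # Otherwise list the available columns to find the right one.
  print(names(angioData))
}
```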

\

## `angioGenera` - A data frame of Angiosperm genera

`angioGenera` comprises the genera of selected flowering plant families and 
their associated data, as mined from the POWO database using `powoGenera`. 
Because of publication and package size constraints, only a few families are 
included, so that the information is readily available. It currently covers six 
plant families (Lecythidaceae, Aristolochiaceae, Begoniaceae, Martyniaceae, 
Dipterocarpaceae, and Fagaceae), as listed in Table 2. This dataset was last 
updated in November 2022.\

\

TABLE 2. The plant families and genera within `angioGenera`.\
```{r, echo = FALSE, warning = FALSE}
utils::data("angioGenera")
df <- angioGenera

knitr::kable(df[-c(2, 3, 7, 8, 9, 10, 11, 12, 13)],
             row.names = FALSE)
```

\

In addition to listing all accepted genera in the families above, the 
`angioGenera` data frame also contains the publication, authorship, species 
number, and global geographic distribution of each genus at the country or 
botanical division level. The classification of botanical divisions follows the 
[World Geographical Scheme for Recording Plant Distributions](https://www.tdwg.org/standards/wgsrpd/),
which is already associated with each taxon's distribution in POWO.\

The example below shows how to load `angioGenera`:\

```{r, eval = FALSE}
utils::data("angioGenera")
```
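
After loading, the genera of a single family can be extracted with ordinary 
data frame subsetting. The sketch below assumes a column named `family` (not 
confirmed by the text above) and uses Lecythidaceae, one of the families listed 
for this dataset, as the example.

```r
library(expowo)
utils::data("angioGenera")

# Assumption: family names are stored in a column called "family".
if ("family" %in% names(angioGenera)) {
  # Keep only the rows that belong to Lecythidaceae.
  lecythidaceae <- angioGenera[angioGenera$family == "Lecythidaceae", ]
  print(nrow(lecythidaceae))
} else {
  # Otherwise list the available columns to find the right one.
  print(names(angioGenera))
}
```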

\

## `botregions` - Countries and associated classification of botanical divisions

`botregions` is a data frame containing all countries of the world and their 
associated classification of botanical divisions according to the 
[World Geographical Scheme for Recording Plant Distributions](https://www.tdwg.org/standards/wgsrpd/). 
It was built using a custom script with specific functions from the packages 
`rgdal`, `ggplot2`, `broom`, `sf`, and `raster` (the original script can be made 
available upon request). This dataset is useful for querying distribution data 
by both countries and botanical divisions of the world, and `expowo`'s functions 
use it to convert country names.\

To load this dataset into your R environment, use the code below:\

```{r, eval = FALSE}
utils::data("botregions")
```
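
One way to get a feel for how countries map onto botanical divisions is to 
count the distinct values in each column. The sketch below makes no assumption 
about column names; columns with few distinct values are likely the broader 
divisions, while those with many are likely country-level entries.

```r
library(expowo)
utils::data("botregions")

# Count the distinct values held by each column of the data frame.
distinct_counts <- vapply(
  botregions,
  function(column) length(unique(column)),
  integer(1)
)

# Named integer vector: one entry per column.
distinct_counts
```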
\

## Reference

POWO (2019). Plants of the World Online. Facilitated by the Royal Botanic Gardens, Kew. Published on the Internet; http://www.plantsoftheworldonline.org/. Retrieved Nov 2022.
