---
title: "Read and manipulate a tabular-data-resource"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Read and manipulate a tabular-data-resource}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r}
library(fr)
```

The {fr} package comes with an example frictionless tabular-data-resource (tdr) named `hamilton_poverty_2020`. On disk, a tdr is a folder containing a data CSV file (the folder and the CSV file are both named after the `name` of the tdr) *and* a `tabular-data-resource.yaml` file, which contains the metadata descriptors:

```{r}
fs::dir_tree(fs::path_package("fr", "hamilton_poverty_2020"), recurse = TRUE)
```

Read the `hamilton_poverty_2020` tdr into R by specifying the path to either the `tabular-data-resource.yaml` file *or* the folder that contains it:

```{r}
d_fr <- read_fr_tdr(fs::path_package("fr", "hamilton_poverty_2020"))
```
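
As a minimal sketch of the first option, the chunk below passes the path to the metadata file itself rather than to its folder. It assumes the file sits directly inside the tdr folder, as the directory tree above shows; `tdr_yaml` and `d_fr_from_yaml` are illustrative names only.

```{r}
# Assumption: `tabular-data-resource.yaml` lives directly inside the tdr folder
tdr_yaml <- fs::path_package("fr", "hamilton_poverty_2020", "tabular-data-resource.yaml")

# Read the same tdr by pointing at the metadata file instead of the folder
d_fr_from_yaml <- fr::read_fr_tdr(tdr_yaml)
```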

Print the returned `fr_tdr` (frictionless tabular-data-resource) object to view all of the table-specific metadata descriptors and the underlying data:

```{r}
d_fr
```

Print the `schema` property to view the table-specific metadata:

```{r}
S7::prop(d_fr, "schema")
```

`fr_tdr` objects can be used almost anywhere the underlying data frame can be used, because most functions coerce their inputs with `as.data.frame()`, which works with `fr_tdr` objects:

```{r}
lm(fraction_poverty ~ year, data = d_fr)
```
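
The coercion can also be done by hand, which is a quick way to check what a function that calls `as.data.frame()` will actually receive. This is only a sketch; `d_plain` is an illustrative name, and the example data is re-read so the chunk runs on its own.

```{r}
# Re-read the example tdr so this chunk is self-contained
d_fr <- fr::read_fr_tdr(fs::path_package("fr", "hamilton_poverty_2020"))

# Coerce explicitly; the result carries the data but not the tdr metadata
d_plain <- as.data.frame(d_fr)
class(d_plain)
head(d_plain)
```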

Accessor functions (`[`, `[[`, `$`) work as they do with data frames and tibbles:

```{r}
head(d_fr$fraction_poverty)
```

In some cases, an `fr_tdr` object must be separated into data and metadata so that the data can be manipulated and the metadata reattached afterwards. For example, calling `dplyr::mutate()` directly on an `fr_tdr` object raises an error:

```{r}
#| error: true
d_fr |>
  dplyr::mutate(high_poverty = fraction_poverty > median(fraction_poverty))
```

In this case, explicitly drop the metadata by converting the `fr_tdr` object to a tibble or data frame with `as_tibble()`, `as_data_frame()`, or `as.data.frame()`. After manipulating the data, convert the result back to an `fr_tdr` object with `as_fr_tdr()`, passing the original `fr_tdr` object as the `.template`:

```{r}
d_fr |>
  tibble::as_tibble() |>
  dplyr::mutate(high_poverty = fraction_poverty > median(fraction_poverty)) |>
  as_fr_tdr(.template = d_fr)
```
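
If this convert, manipulate, and restore pattern recurs, it can be wrapped in a small helper. The function below is a hypothetical sketch and is not part of {fr}; the name `with_fr_metadata()` and its arguments are assumptions made for illustration.

```{r}
# Hypothetical helper (not part of {fr}): apply `.f` to the data as a tibble,
# then restore the metadata using the original object as the template
with_fr_metadata <- function(x, .f, ...) {
  x |>
    tibble::as_tibble() |>
    .f(...) |>
    fr::as_fr_tdr(.template = x)
}

# Re-read the example tdr so this chunk is self-contained
d_fr <- fr::read_fr_tdr(fs::path_package("fr", "hamilton_poverty_2020"))

d_fr_flagged <-
  with_fr_metadata(
    d_fr,
    dplyr::mutate,
    high_poverty = fraction_poverty > median(fraction_poverty)
  )
```

Because the helper only forwards its arguments, any function that takes a data frame as its first argument and returns a data frame should fit the same slot.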

Shortcuts are provided for some functions from {dplyr} (see `dplyr_methods()` for a full list):

```{r}
d_fr |>
  fr_mutate(high_poverty = fraction_poverty > median(fraction_poverty)) |>
  fr_select(-year) |>
  fr_arrange(desc(fraction_poverty))
```

More complicated {dplyr} functions (e.g., `group_by()` and friends), as well as functions from other packages that do not coerce their inputs to data frames, need the convert-and-restore pattern shown above. Below is a simple example for `dplyr::left_join()`:

```{r}
library(dplyr, warn.conflicts = FALSE)

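# add a description to an existing field; it is carried along via `.template` below
d_fr <- update_field(d_fr, "fraction_poverty", description = "the poverty fraction")

# build a plain tibble holding only the join key and a derived `score` column
d_extant <-
  d_fr |>
  fr_mutate(score = 1 + fraction_poverty) |>
  fr_select(-fraction_poverty, -year) |>
  as_tibble()

# drop the metadata, join, then restore the metadata and describe the new field
d_fr_new <-
  left_join(
    as_tibble(d_fr),
    d_extant,
    by = join_by(census_tract_id_2020)
  ) |>
  as_fr_tdr(.template = d_fr) |>
  update_field("score", description = "the score")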

d_fr_new

S7::prop(d_fr_new, "schema")
```
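
The same pattern applies to grouped operations such as `group_by()`. The chunk below is a sketch rather than a recommended workflow: the `poverty_rank` column and the object names are illustrative assumptions, and the example data is re-read under a new name so the chunk runs on its own.

```{r}
# Re-read the example tdr under a new name so this chunk is self-contained
d_poverty <- fr::read_fr_tdr(fs::path_package("fr", "hamilton_poverty_2020"))

d_fr_ranked <-
  d_poverty |>
  tibble::as_tibble() |>                      # drop the metadata
  dplyr::group_by(year) |>
  dplyr::mutate(poverty_rank = dplyr::min_rank(dplyr::desc(fraction_poverty))) |>
  dplyr::ungroup() |>                         # remove grouping before restoring metadata
  fr::as_fr_tdr(.template = d_poverty) |>     # reattach the original metadata
  fr::update_field("poverty_rank", description = "rank of fraction_poverty within year (1 = highest)")

d_fr_ranked
```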