---
title: "Curate File Collections"
output:
  rmarkdown::html_vignette:
    css: "custom.css"
    toc: true
    toc_float: false
    toc_depth: 4
    number_sections: false
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Curate File Collections}
---

```{r, include=FALSE}
knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE
)
```

Ideally, file collections are generated by passing proper file specifications
in a single call to `collate()`. In practice, file collection objects
sometimes need post-processing, so pkglite supports a small set of
operations on them: merging, pruning, and sanitizing.

## Merge file collections

Use `merge()` to combine file collections after they are collated.
The result is the union of their files, so a file that appears in more than
one input is listed only once.

```{r}
library("pkglite")
pkg <- system.file("examples/pkg1/", package = "pkglite")

fc <- merge(
  pkg %>% collate(file_root_core()),
  pkg %>% collate(file_r()),
  pkg %>% collate(file_r(), file_man())
)

fc
```

By design, a file collection object stores file metadata for a single
package only. Therefore, merging file collections from different
packages results in an error.
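
The following sketch illustrates this restriction. It assumes that a second
example package, `examples/pkg2/`, ships with pkglite alongside `pkg1`;
the names `pkg1_dir`, `pkg2_dir`, and `merge_result` are introduced here for
illustration only. The call is wrapped in `tryCatch()` so that the error
message is captured instead of stopping the vignette.

```{r}
library("pkglite")

# Assumption: pkglite ships a second example package under examples/pkg2/
pkg1_dir <- system.file("examples/pkg1/", package = "pkglite")
pkg2_dir <- system.file("examples/pkg2/", package = "pkglite")

fc_pkg1 <- pkg1_dir %>% collate(file_root_core())
fc_pkg2 <- pkg2_dir %>% collate(file_root_core())

# Merging collections from two different packages is expected to fail;
# capture the error message rather than aborting
merge_result <- tryCatch(
  merge(fc_pkg1, fc_pkg2),
  error = function(e) conditionMessage(e)
)

merge_result
```

To handle several packages, keep one file collection per package rather
than trying to combine them into a single object.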

## Prune file collections

To remove files from a file collection, use `prune()`:

```{r}
fc %>% prune(path = c("NEWS.md", "man/figures/logo.png"))
```

Only the files matching the exact relative path(s) will be removed.
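
Because matching is exact, removing a group of files means listing each
path. One possible approach, sketched below, is to build that list from the
collection itself. It assumes that the relative paths are stored in the
`path_rel` column of the `df` element of the file collection object; the
names `pkg_dir`, `fc_all`, `to_drop`, and `fc_no_man` are illustrative.

```{r}
library("pkglite")

pkg_dir <- system.file("examples/pkg1/", package = "pkglite")
fc_all <- pkg_dir %>% collate(file_root_core(), file_r(), file_man())

# Assumption: relative paths live in the `path_rel` column of `fc_all$df`
paths <- fc_all$df$path_rel

# Select every path under man/ with a regular expression,
# then pass the exact paths to prune()
to_drop <- grep("^man/", paths, value = TRUE)
fc_no_man <- fc_all %>% prune(path = to_drop)

fc_no_man
```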

The prune operation is type-stable. If all files in a file collection
are removed, an empty file collection is returned so that it can still
be merged with the other file collections.

```{r}
pkg %>%
  collate(file_data()) %>%
  prune(path = "data/dataset.rda")
```
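
As a minimal sketch of why this matters, the empty file collection from the
previous step can be passed to `merge()` together with a non-empty one. The
names `pkg_dir`, `fc_empty`, and `fc_combined` are introduced here for
illustration.

```{r}
library("pkglite")

pkg_dir <- system.file("examples/pkg1/", package = "pkglite")

# Pruning the only data file leaves an empty file collection
fc_empty <- pkg_dir %>%
  collate(file_data()) %>%
  prune(path = "data/dataset.rda")

# The empty collection is still a file collection, so it can be merged
fc_combined <- merge(fc_empty, pkg_dir %>% collate(file_r()))

fc_combined
```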

## Sanitize file collections

A file collection might contain files that should almost always be excluded,
such as files matching the patterns returned by `pattern_file_sanitize()`:

```{r}
pattern_file_sanitize()
```

You can use `sanitize()` to remove such files (if any) from a file collection:

```{r}
fc %>% sanitize()
```