---
title: "From R to CFF"
subtitle: "Crosswalk"
description: >-
  A comprehensive description of the internal mappings performed by 
  the cffr package.
author: Diego Hernangómez
bibliography: REFERENCES.bib
link-citations: yes
output: 
  rmarkdown::html_vignette:
    toc: true
vignette: >
  %\VignetteIndexEntry{From R to CFF}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(cffr)
```

The goal of this vignette is to provide an explicit map between the metadata
fields used by **cffr** and each one of the valid keys of the [Citation File
Format schema version
1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys).

## Summary {#summary}

We summarize here the fields that **cffr** can coerce and the original source of
information for each one of them. The details on each key are presented in the
next section of the document. The assessment of fields is based on the [Guide
to Citation File Format schema version
1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#valid-keys)
[@druskat_citation_2021].

```{r summary , echo=FALSE}
keys <- cff_schema_keys(sorted = TRUE)
origin <- vector(length = length(keys))
origin[keys == "cff-version"] <- "parameter on function"
origin[keys == "type"] <- "Fixed value: 'software'"
origin[keys == "identifiers"] <- "DESCRIPTION/CITATION files"
origin[keys == "references"] <- "DESCRIPTION/CITATION files"

origin[keys %in% c(
  "message",
  "title",
  "version",
  "authors",
  "abstract",
  "repository",
  "repository-code",
  "url",
  "date-released",
  "contact",
  "keywords",
  "license",
  "commit"
)] <- "DESCRIPTION file"

origin[keys %in% c(
  "doi",
  "preferred-citation"
)] <- "CITATION file"


origin[origin == FALSE] <- "Ignored by cffr"

df <- data.frame(
  key = paste0("<a href='#", keys, "'>", keys, "</a>"),
  source = origin
)


knitr::kable(df, escape = FALSE)
```

## Details

### abstract

This key is extracted from the `"Description"` field of the `DESCRIPTION` file.

<details>

<summary><strong>Example</strong></summary>

```{r abstract}
library(cffr)

# Create cff object for rmarkdown

cff_obj <- cff_create("rmarkdown")

# Get DESCRIPTION of rmarkdown to check

pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))

cat(cff_obj$abstract)

cat(pkg$get("Description"))
```

</details>

[Back to summary](#summary).

### authors

This key is coerced from the `"Authors"` or `"Authors@R"` field of the
`DESCRIPTION` file. By default, persons with the role `"aut"` or `"cre"` are
considered; this can be modified via the `authors_roles` parameter.
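
The role-based selection can be pictured with base R `person` objects. This is
a minimal sketch, not the internal **cffr** code: `keep_roles()` is a
hypothetical helper and the persons are invented for illustration.

```r
# Invented persons, as they could be declared in "Authors@R"
persons <- c(
  person("Ana", "Author", role = c("aut", "cre"), email = "ana@example.com"),
  person("Ben", "Contributor", role = "ctb"),
  person("Acme Corp", role = "cph")
)

# Hypothetical helper: keep persons holding at least one of the given roles
keep_roles <- function(persons, roles = c("aut", "cre")) {
  keep <- vapply(
    seq_along(persons),
    function(i) any(persons[i]$role %in% roles),
    logical(1)
  )
  persons[keep]
}

# Default roles: authors and maintainers
format(keep_roles(persons), include = c("given", "family"))

# Alternative roles: maintainers and copyright holders
format(
  keep_roles(persons, roles = c("cre", "cph")),
  include = c("given", "family")
)
```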

<details>

<summary><strong>Example</strong></summary>

```{r authors}
# An example DESCRIPTION
path <- system.file("examples/DESCRIPTION_many_persons", package = "cffr")
pkg <- desc::desc(path)

# See persons listed
pkg$get_authors()


# Default behaviour, use authors and creators (maintainers)
cff_obj <- cff_create(path)
cff_obj$authors


# Now use copyright holders and maintainers
cff_obj_alt <- cff_create(path, authors_roles = c("cre", "cph"))
cff_obj_alt$authors
```

</details>

[Back to summary](#summary).

### cff-version

This key can be set via the `cff_version` parameter of the
`cff_create()`/`cff_write()` functions:

<details>

<summary><strong>Example</strong></summary>

```{r cffversion}
cff_objv110 <- cff_create("jsonlite", cff_version = "v1.1.0")

cat(cff_objv110$`cff-version`)
```

</details>

[Back to summary](#summary).

### commit

This key is extracted from the `"RemoteSha"` field of the `DESCRIPTION` file.
This field is present on packages installed from the
[r-universe](https://r-universe.dev/search) or with packages such as
**remotes** or **pak**.

<details>

<summary><strong>Example</strong></summary>

```{r commit}
# An example DESCRIPTION
path <- system.file("examples/DESCRIPTION_r_universe", package = "cffr")
pkg <- desc::desc(path)

# See RemoteSha
pkg$get("RemoteSha")


cff_read(path)
```

</details>

[Back to summary](#summary).

### contact

This key is coerced from the `"Authors"` or `"Authors@R"` field of the
`DESCRIPTION` file. Only persons with the role `"cre"` (i.e., the maintainer(s))
are considered.

<details>

<summary><strong>Example</strong></summary>

```{r contact}
cff_obj <- cff_create("rmarkdown")
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))

cff_obj$contact

pkg$get_author()
```

</details>

[Back to summary](#summary).

### date-released

This key is extracted from the `DESCRIPTION` file following this logic:

-   the `"Date"` field or,
-   if not present, `"Date/Publication"`, which is present on packages built
    on **CRAN** and **Bioconductor**, or,
-   if not present, `"Packaged"`, which is present on packages built by the
    [r-universe](https://r-universe.dev/search).
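
The fallback order above can be sketched in a few lines of base R. This is only
an illustration under stated assumptions: `pick_date()` is a hypothetical
helper, the metadata is invented, and trimming the value to its first ten
characters is an assumption of this sketch rather than documented **cffr**
behaviour.

```r
# Hypothetical helper: return the first available date among the candidate
# fields, checked in the order described above
pick_date <- function(fields) {
  for (key in c("Date", "Date/Publication", "Packaged")) {
    value <- fields[[key]]
    if (!is.null(value) && !is.na(value) && nzchar(value)) {
      # Assumption: keep only the leading YYYY-MM-DD part of the field
      return(substr(value, 1, 10))
    }
  }
  NULL
}

# Invented metadata, stored as named lists
cran_like <- list(`Date/Publication` = "2024-01-15 10:20:02 UTC")
runiverse_like <- list(Packaged = "2024-02-03 08:00:00 UTC; root")
with_date <- list(Date = "1999-01-01", Packaged = "2024-02-03 08:00:00 UTC")

pick_date(cran_like)
pick_date(runiverse_like)
pick_date(with_date)

# No candidate field available
pick_date(list())
```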

<details>

<summary><strong>Example</strong></summary>

```{r date-released}
# From an installed package

cff_obj <- cff_create("rmarkdown")
pkg <- desc::desc(file.path(find.package("rmarkdown"), "DESCRIPTION"))


cat(pkg$get("Date/Publication"))


cat(cff_obj$`date-released`)



# A DESCRIPTION file without a Date
nodate <- system.file("examples/DESCRIPTION_basic", package = "cffr")
tmp <- tempfile("DESCRIPTION")

# Create a temporary file
file.copy(nodate, tmp)


pkgnodate <- desc::desc(tmp)
cffnodate <- cff_create(tmp)

# Won't appear
cat(cffnodate$`date-released`)

pkgnodate

# Adding a Date

desc::desc_set("Date", "1999-01-01", file = tmp)

cat(cff_create(tmp)$`date-released`)
```

</details>

[Back to summary](#summary).

### doi {#doi}

This key is coerced from the `"doi"` field of the
[preferred-citation](#preferred-citation) object. If not present and the package
is on **CRAN**, it would be populated with the doi provided by **CRAN** (e.g.
<https://doi.org/10.32614/CRAN.package.cffr>).

<details>

<summary><strong>Example</strong></summary>

```{r doi}
cff_doi <- cff_create("cffr")

cat(cff_doi$doi)

cat(cff_doi$`preferred-citation`$doi)
```

</details>

[Back to summary](#summary).

### identifiers

This key includes all the possible identifiers of the package:

-   From the `DESCRIPTION` file, it includes all the urls not included in
    [url](#url) or [repository-code](#repository-code).

-   From the `CITATION` file, it includes all the dois not included in
    [doi](#doi) and the identifiers (if any) included in the `"identifiers"`
    key of [preferred-citation](#preferred-citation).

-   If the package is on **CRAN** and it has a `CITATION` file providing a doi,
    the doi provided by **CRAN** would be added as well.
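
As a rough illustration of the first rule, the remaining urls can be obtained
with a set difference. The urls below are invented and the snippet is a sketch,
not the **cffr** implementation; the `type`/`value` pairs follow the shape of
the `identifiers` items in the CFF schema.

```r
# Invented urls, as they could be listed in the DESCRIPTION file
all_urls <- c(
  "https://example.org/mypkg/",
  "https://github.com/someuser/mypkg",
  "https://docs.example.org/mypkg/",
  "https://blog.example.org/mypkg-release/"
)

# Values assumed to be already assigned to other keys
url <- "https://example.org/mypkg/"
repository_code <- "https://github.com/someuser/mypkg"

# The remaining urls become identifiers of type "url"
remaining <- setdiff(all_urls, c(url, repository_code))
identifiers <- lapply(remaining, function(x) list(type = "url", value = x))

str(identifiers)
```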

<details>

<summary><strong>Example</strong></summary>

```{r identifiers}
file <- system.file("examples/DESCRIPTION_many_urls", package = "cffr")

pkg <- desc::desc(file)

cat(pkg$get_urls())

cat(cff_create(file)$url)

cat(cff_create(file)$`repository-code`)

cff_create(file)$identifiers
```

</details>

[Back to summary](#summary).

### keywords

This key is extracted from the `DESCRIPTION` file. The keywords should appear in
the `DESCRIPTION` as:

```         
...
X-schema.org-keywords: keyword1, keyword2, keyword3
```
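
A comma-separated field like this one can be turned into a vector of keywords
with base R. The snippet is a minimal sketch of that idea and does not claim to
reproduce how **cffr** parses the field internally.

```r
# A field value as it would appear in the DESCRIPTION file
field <- "keyword1, keyword2, keyword3"

# Split on commas, trim whitespace and drop empty or repeated entries
keywords <- trimws(strsplit(field, ",", fixed = TRUE)[[1]])
keywords <- unique(keywords[nzchar(keywords)])

keywords
```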

<details>

<summary><strong>Example</strong></summary>

```{r keyword}
# A DESCRIPTION file without keywords
nokeywords <- system.file("examples/DESCRIPTION_basic", package = "cffr")
tmp2 <- tempfile("DESCRIPTION")

# Create a temporary file
file.copy(nokeywords, tmp2)


pkgnokeywords <- desc::desc(tmp2)
cffnokeywords <- cff_create(tmp2)

# Won't appear
cat(cffnokeywords$keywords)

pkgnokeywords

# Adding Keywords

desc::desc_set("X-schema.org-keywords", "keyword1, keyword2, keyword3",
  file = tmp2
)

cat(cff_create(tmp2)$keywords)
```

</details>

Additionally, if the source code of the package is hosted on GitHub, **cffr**
can retrieve the topics of your repo via the [GitHub
API](https://docs.github.com/en/rest) and include those topics as keywords. This
option is controlled via the `gh_keywords` parameter:

<details>

<summary><strong>Example</strong></summary>

```{r ghkeyword}
# Get cff object from jsonvalidate

jsonval <- cff_create("jsonvalidate")

# Keywords are retrieved from the GitHub repo

jsonval

# Check keywords
jsonval$keywords

# The repo
jsonval$`repository-code`
```

</details>

[Back to summary](#summary).

### license

This key is extracted from the `"License"` field of the `DESCRIPTION` file.

<details>

<summary><strong>Example</strong></summary>

```{r license}
cff_obj <- cff_create("yaml")

cat(cff_obj$license)

pkg <- desc::desc(file.path(find.package("yaml"), "DESCRIPTION"))

cat(pkg$get("License"))
```

</details>

[Back to summary](#summary).

### license-url

This key is not extracted from the metadata of the package. See the description
on the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#license-url).

> -   **description**: The URL of the license text under which the software or
>     dataset is licensed (only for non-standard licenses not included in the
>     [SPDX License
>     List](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#definitionslicense-enum)).
>
> -   **usage**:
>
>     ``` yaml
>     license-url: "https://obscure-licenses.com?id=1234"
>     ```

[Back to summary](#summary).

### message

This key is built from the package name declared in the `DESCRIPTION` file,
specifically as:

```{r eval=FALSE}
msg <- paste0(
  'To cite package "',
  "NAME_OF_THE_PACKAGE",
  '" in publications use:'
)
```

<details>

<summary><strong>Example</strong></summary>

```{r message}
cat(cff_create("jsonlite")$message)
```

</details>

[Back to summary](#summary).

### preferred-citation {#preferred-citation}

This key is extracted from the `CITATION` file. If several references are
provided, it would select the first citation as the `"preferred-citation"` and
the rest of them as [references](#references).
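
The split between the first citation and the rest can be pictured with
`bibentry` objects from **utils**. The two entries are invented stand-ins for
the content of a `CITATION` file, and the snippet is only a sketch of the
selection rule.

```r
# Two invented entries, standing in for the content of a CITATION file
refs <- c(
  bibentry(
    "Manual",
    title = "mypkg: An Example Package",
    author = person("Ana", "Author"),
    year = "2024",
    note = "R package version 1.0.0"
  ),
  bibentry(
    "Article",
    title = "An Example Paper",
    author = person("Ana", "Author"),
    journal = "Journal of Examples",
    year = "2023"
  )
)

# The first entry plays the role of the preferred citation
preferred <- refs[1]

# Any other entry would end up under references
others <- refs[-1]

preferred$title
others$title
```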

<details>

<summary><strong>Example</strong></summary>

```{r preferred-citation}
cffobj <- cff_create("rmarkdown")

cffobj$`preferred-citation`

citation("rmarkdown")[1]
```

</details>

[Back to summary](#summary).

### references {#references}

This key is extracted from the `CITATION` file if several references are
provided. The first citation is considered as the
[preferred-citation](#preferred-citation) and the rest of them as
`"references"`. It also extracts the package dependencies and adds those to this
field using `citation(auto = TRUE)` on each dependency.

<details>

<summary><strong>Example</strong></summary>

```{r references}
cffobj <- cff_create("rmarkdown")

cffobj$references

citation("rmarkdown")[-1]
```

</details>

[Back to summary](#summary).

### repository

This key is extracted from the `"Repository"` field of the `DESCRIPTION` file.
Usually, this field is auto-populated when a package is hosted on a repo (like
**CRAN** or the [r-universe](https://r-universe.dev/search)). For packages
without this field on the `DESCRIPTION` (that is the typical case for an
in-development package), **cffr** would try to search the package on any of the
default repositories specified on `options("repos")`.

In the case of [Bioconductor](https://bioconductor.org/) packages, those are
identified if a
["biocViews"](https://contributions.bioconductor.org/description.html#biocviews)
is present on the `DESCRIPTION` file.

If **cffr** detects that the package is available on **CRAN**, it would return
the canonical url form of the package (e.g.
<https://CRAN.R-project.org/package=jsonlite>).

<details>

<summary><strong>Example</strong></summary>

```{r repository}
# Installed package

inst <- cff_create("jsonlite")

cat(inst$repository)

# Demo file downloaded from the r-universe

runiv <- system.file("examples/DESCRIPTION_r_universe", package = "cffr")
runiv_cff <- cff_create(runiv)

cat(runiv_cff$repository)

desc::desc(runiv)$get("Repository")

# For in development package

norepo <- system.file("examples/DESCRIPTION_basic", package = "cffr")

# No repo
norepo_cff <- cff_create(norepo)

cat(norepo_cff[["repository"]])

# Change the name to a known package on CRAN: ggplot2

tmp <- tempfile("DESCRIPTION")
file.copy(norepo, tmp)


# Change name
desc::desc_set("Package", "ggplot2", file = tmp)

cat(cff_create(tmp)[["repository"]])
```

</details>

[Back to summary](#summary).

### repository-artifact

This key is not extracted from the metadata of the package. See the description
on the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#repository-artifact).

> -   **description**: The URL of the work in a build artifact/binary repository
>     (when the work is software).
>
> -   **usage**:
>
>     ``` yaml
>     repository-artifact: "https://search.maven.org/artifact/org.corpus-tools/cff-maven-plugin/0.4.0/maven-plugin"
>     ```

[Back to summary](#summary).

### repository-code {#repository-code}

This key is extracted from the `"BugReports"` or `"URL"` fields on the
`DESCRIPTION` file. **cffr** tries to identify the url of the source on the
following repositories:

-   [GitHub](https://github.com/).
-   [GitLab](https://about.gitlab.com/).
-   [R-Forge](https://r-forge.r-project.org/).
-   [Bitbucket](https://bitbucket.org/).
-   [Codeberg](https://codeberg.org/).
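
One way to picture this detection is to compare the host of each url with a
list of known platforms. The helpers `host_of()` and `find_repo_code()` are
hypothetical, the list of hosts is an assumption based on the platforms above,
and dropping a trailing `/issues` is a simplification of this sketch, not a
description of the **cffr** internals.

```r
# Hosts of the platforms listed above (illustrative; self-hosted instances
# are not covered by this sketch)
known_hosts <- c(
  "github.com", "gitlab.com", "r-forge.r-project.org",
  "bitbucket.org", "codeberg.org"
)

# Hypothetical helper: extract the host part of each url
host_of <- function(urls) {
  tolower(sub("^https?://(www\\.)?([^/]+).*$", "\\2", urls))
}

# Hypothetical helper: first url hosted on a known platform, or NULL
find_repo_code <- function(urls, hosts = known_hosts) {
  hit <- urls[host_of(urls) %in% hosts]
  if (length(hit) == 0) {
    return(NULL)
  }
  # Assumption: a trailing "/issues" (typical of "BugReports") is dropped
  sub("/issues/?$", "", hit[1])
}

# Invented urls
find_repo_code(c(
  "https://example.org/mypkg/",
  "https://github.com/someuser/mypkg/issues"
))

# No known platform among the urls
find_repo_code("https://example.org/mypkg/")
```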

<details>

<summary><strong>Example</strong></summary>

```{r repository-code}
# Installed package on GitHub

cat(cff_create("jsonlite")$`repository-code`)



# GitLab

gitlab <- system.file("examples/DESCRIPTION_gitlab", package = "cffr")

cat(cff_create(gitlab)$`repository-code`)


# Codeberg

codeberg <- system.file("examples/DESCRIPTION_codeberg", package = "cffr")

cat(cff_create(codeberg)$`repository-code`)
```

</details>

[Back to summary](#summary).

### title

This key is built from the `"Package"` and `"Title"` fields of the
`DESCRIPTION` file as:

```{r eval=FALSE}
title <- paste0(
  "NAME_OF_THE_PACKAGE",
  ": ",
  "TITLE_OF_THE_PACKAGE"
)
```

<details>

<summary><strong>Example</strong></summary>

```{r title}
# Installed package

cat(cff_create("testthat")$title)
```

</details>

[Back to summary](#summary).

### type

Fixed value equal to `"software"`. The other possible value is `"dataset"`. See
the description on the [Guide to CFF schema
v1.2.0](https://github.com/citation-file-format/citation-file-format/blob/main/schema-guide.md#type).

[Back to summary](#summary).

### url {#url}

This key is extracted from the `"BugReports"` or `"URL"` fields on the
`DESCRIPTION` file. It corresponds to the first url that is different from
[repository-code](#repository-code).

<details>

<summary><strong>Example</strong></summary>

```{r url}
# Many urls
manyurls <- system.file("examples/DESCRIPTION_many_urls", package = "cffr")

cat(cff_create(manyurls)$url)

# Check

desc::desc(manyurls)
```

</details>

[Back to summary](#summary).

### version

This key is extracted from the `"Version"` field on the `DESCRIPTION` file.

<details>

<summary><strong>Example</strong></summary>

```{r version}
# Should be (>= 3.0.0)
cat(cff_create("testthat")$version)
```

</details>

[Back to summary](#summary).

## References