---
title: "Advanced Features"
author: "Koen Hufkens"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Advanced Features}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

opts <- options(keyring_warn_for_env_fallback = FALSE)

# load the library
library(ncdf4)
library(terra)
library(maps)
library(ecmwfr)
```

# Advanced Features

This is a brief overview of some of the more advanced options in the `ecmwfr` package.

## Piped requests

A less obvious feature of `ecmwfr` is that the request is the first argument of the `wf_request()` function. This means that any valid request list can be piped into this function, using either the magrittr pipe `%>%` or the native pipe `|>`.

```{r eval = FALSE}
list(
      product_type = 'reanalysis',
      variable = 'geopotential',
      year = '2024',
      month = '03',
      day = '01',
      time = '13:00',
      pressure_level = '1000',
      data_format = 'grib',
      dataset_short_name = 'reanalysis-era5-pressure-levels',
      target = 'test.grib'
) |>
  wf_request(path = "~")
```
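Because the request is a plain list, it can also be adjusted in the same pipeline before it is submitted. The chunk below is a minimal sketch of that idea using `modifyList()` from base R; the new `target` file name is made up for illustration, and the final `wf_request()` call is left commented out because it needs a registered API token.

```{r eval = FALSE}
# a base request, identical to the one used above
base_request <- list(
  product_type = "reanalysis",
  variable = "geopotential",
  year = "2024",
  month = "03",
  day = "01",
  time = "13:00",
  pressure_level = "1000",
  data_format = "grib",
  dataset_short_name = "reanalysis-era5-pressure-levels",
  target = "test.grib"
)

# modifyList() returns a copy of the list with the named fields replaced,
# the original base_request is left untouched
# (the target name below is an illustrative assumption)
new_request <- base_request |>
  modifyList(list(day = "02", target = "test_day02.grib"))

# the modified list can then be piped on to wf_request()
# new_request |> wf_request(path = "~")
```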

## Dynamic request functions / archetypes

Once a valid request has been created it can be made into a dynamic function using archetypes. Archetype functions are built from a valid `ecmwfr` ECMWF or CDS request and a vector naming the fields which are to be set as dynamic.

The `wf_archetype()` function creates a new function whose parameters are the dynamic fields previously assigned. The example below shows how to use it to generate the custom `dynamic_request()` function. We then use this new function to alter the `day` and `target` fields and pipe (`|>`) the result into the `wf_request()` function to retrieve the data.

```{r eval = FALSE}
# this is an example of a request
dynamic_request <- wf_archetype(
  request = list(
      product_type = 'reanalysis',
      variable = 'geopotential',
      year = '2024',
      month = '03',
      day = '01',
      time = '13:00',
      pressure_level = '1000',
      data_format = 'grib',
      dataset_short_name = 'reanalysis-era5-pressure-levels',
      target = 'test.grib'
  ),
  dynamic_fields = c("day", "target"))

# change the day of the month and the output file
dynamic_request(day = "02", target = "new.grib") |>
  wf_request()
```
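To build some intuition for what an archetype does, the idea can be approximated with a closure in base R. This is a conceptual sketch only, not the `ecmwfr` implementation: `make_archetype()` and `sketch_request()` are hypothetical names, and the request is shortened to a few fields.

```{r eval = FALSE}
# hypothetical helper, NOT part of ecmwfr: returns a function which
# fills the dynamic fields of a fixed request template
make_archetype <- function(request, dynamic_fields) {
  # every dynamic field must exist in the template
  stopifnot(all(dynamic_fields %in% names(request)))

  function(...) {
    args <- list(...)

    # refuse fields which were not declared as dynamic
    unknown <- setdiff(names(args), dynamic_fields)
    if (length(unknown) > 0) {
      stop("not a dynamic field: ", paste(unknown, collapse = ", "))
    }

    # return a copy of the template with the supplied values filled in
    modifyList(request, args)
  }
}

# shortened request template, for illustration only
sketch_request <- make_archetype(
  request = list(year = "2024", month = "03", day = "01", target = "test.grib"),
  dynamic_fields = c("day", "target")
)

# only the dynamic fields change, the rest of the template is kept
sketch_request(day = "02", target = "day02.grib")
```

The fixed fields stay locked inside the generated function, which is what makes an archetype safer than editing the request list by hand for every download.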

## Batch (parallel) requests

As of version `1.4.0` you can submit batch requests in parallel. Archetypes, as discussed above, make it easy to request multiple data products, but such requests are processed sequentially. The ECMWF CDS infrastructure allows up to 20 parallel requests in your queue, so submitting jobs in parallel rather than sequentially can speed up downloads. The `wf_request_batch()` function implements parallel CDS requests, using a list of requests (potentially generated by an archetype, as shown above).

```{r eval = FALSE}
# creating a list of requests using wf_archetype()
# setting the day value
batch_request <- list(
  dynamic_request(day = "01"),
  dynamic_request(day = "02")
)

# submit a batch job using 2 workers
# one for each in the list (the number of workers
# can't exceed 20)
wf_request_batch(
  batch_request,
  workers = 2
  )
```
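For more than a handful of requests the list is easier to generate than to type. The chunk below is a sketch of one way to do this with `lapply()` and `modifyList()` from base R. The file naming scheme is an assumption, chosen so that each request writes to its own file, and the `wf_request_batch()` call is commented out because it needs a registered API token.

```{r eval = FALSE}
# request template, identical to the one used above
base_request <- list(
  product_type = "reanalysis",
  variable = "geopotential",
  year = "2024",
  month = "03",
  day = "01",
  time = "13:00",
  pressure_level = "1000",
  data_format = "grib",
  dataset_short_name = "reanalysis-era5-pressure-levels",
  target = "test.grib"
)

# zero-padded day strings: "01", "02", ..., "05"
days <- sprintf("%02d", 1:5)

# one request per day, each with its own target file
# (the "era5_day_XX.grib" naming is an illustrative assumption)
batch_request <- lapply(days, function(d) {
  modifyList(
    base_request,
    list(day = d, target = paste0("era5_day_", d, ".grib"))
  )
})

# one worker per request, capped at the limit of 20 mentioned above
n_workers <- min(length(batch_request), 20)

# wf_request_batch(batch_request, workers = n_workers)
```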

### Mixing data services in batch requests

You can mix data services in a single batch request, which allows you to formulate complex multi-service requests. The example below uses one batch request to retrieve data from both the CDS and ADS services in one pass.

```{r eval=FALSE}
# CDS
cds_request <-
  list(
      product_type = 'reanalysis',
      variable = 'geopotential',
      year = '2024',
      month = '03',
      day = '01',
      time = '13:00',
      pressure_level = '1000',
      data_format = 'grib',
      dataset_short_name = 'reanalysis-era5-pressure-levels',
      target = 'test.grib'
)

# ADS
ads_request <- list(
  dataset_short_name = "cams-global-radiative-forcings",
  variable = "radiative_forcing_of_carbon_dioxide",
  forcing_type = "instantaneous",
  band = "long_wave",
  sky_type = "all_sky",
  level = "surface",
  version = "2",
  year = "2018",
  month = "06",
  target = "download.grib"
)


combined_request <- list(
  cds_request,
  ads_request
)


files <- wf_request_batch(
  combined_request
  )
```

## Date specification

For those familiar with ECMWF _mars_ syntax: CDS/ADS does not accept `date = "2000-01-01/to/2000-12-31"` specifications at the moment. You can specify a single date via `date = "2000-01-01"`, multiple days via `date = ["2000-01-01","2000-01-02","2000-10-20"]`, or a range via `date = "YYYY-MM-DD/YYYY-MM-DD"`, but not via `".../to/..."`.
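Both accepted forms are easy to construct from R `Date` objects. The chunk below is a small sketch using base R only; the variable names are illustrative, and whether a given dataset accepts a `date` field at all depends on that dataset, so check its request form first.

```{r eval = FALSE}
first_day <- as.Date("2000-01-01")
last_day <- as.Date("2000-01-05")

# range form: "YYYY-MM-DD/YYYY-MM-DD"
date_range <- paste(format(first_day), format(last_day), sep = "/")

# explicit form: a character vector with one entry per day
date_list <- format(seq(first_day, last_day, by = "day"), "%Y-%m-%d")

date_range
date_list
```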

## Environmental variables for API token

As an alternative to `wf_set_key()`, you can set an environment variable containing your API token.

```{r eval=FALSE}
Sys.setenv(ecmwfr_PAT = "abcd1234-foo-bar-98765431-XXXXXXXXXX")
```

This will need to be set at the beginning of each session or added to the user
`.Renviron` file. Overall, this is considered insecure, but might be the only
option on some legacy or HPC systems to get full `ecmwfr` functionality. A good
blog post on why you should not do this is
provided by [Maëlle Salmon](https://blog.r-hub.io/2024/02/28/key-advantages-of-using-keyring/).
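
If you do rely on the environment variable, it can help to check that it is set before submitting requests, for example at the top of a batch script. The chunk below is a minimal sketch using base R only; `has_env_token()` is a hypothetical helper and not part of `ecmwfr`.

```{r eval = FALSE}
# hypothetical helper, NOT part of ecmwfr: TRUE when the
# environment variable holds a non-empty value
has_env_token <- function(var = "ecmwfr_PAT") {
  nzchar(Sys.getenv(var, unset = ""))
}

# warn early instead of failing halfway through a long script
if (!has_env_token()) {
  message("ecmwfr_PAT is not set, see the instructions above")
}
```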