---
title: "Multiple sources of data"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Multiple sources of data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

Price indexes are usually made from several sources of data. An
important benefit of the usual two-step workflow to make price indexes
is that the elemental indexes can be built piecemeal---using different
sources of data and different index-number formulas---and then
aggregated with a consistent structure.

Let's extend the example in `vignette("piar")` by having an alternate
source of data for business `B5` that is always missing in the
`ms_prices` dataset.

```{r}
library(piar)

# Make an aggregation structure.
ms_weights[c("level1", "level2")] <- 
  expand_classification(ms_weights$classification)

pias <- ms_weights[c("level1", "level2", "business", "weight")] |>
  as_aggregation_structure()

# Make elemental index.
elementals <- ms_prices |>
  transform(
    relative = price_relative(price, period = period, product = product)
  ) |>
  elemental_index(relative ~ period + business, na.rm = TRUE)

elementals
```

Instead of using survey-like data for the other businesses, `B5` is made
from scanner-like data with many price and quantity observations at each
point in time.

```{r}
set.seed(12345)

scanner_prices <- data.frame(
  period = rep(c("201904", time(elementals)), each = 200),
  product = 1:200,
  price = round(rlnorm(5 * 200) * 10, 1),
  quantity = round(runif(5 * 200, 100, 1000))
)

head(scanner_prices)
```

These type of data often require the use of a multilateral index like
the GEKS. For the sake of illustration, we'll make a Fisher GEKS index
over a 3 quarter rolling window and use a mean splice to make a single
time series.

```{r}
library(gpindex)

geks_elementals <- with(
  scanner_prices,
  fisher_geks(price, quantity, period, product, window = 3)
) |>
  splice_index() |>
  t() |>
  as_index(chainable = FALSE) |>
  set_levels("B5") |>
  rebase("202001")

geks_elementals
```

These values can now be merged with the other elemental indexes, getting
turned into a period-over-period index in the process, and then
aggregated.

```{r}
merge(elementals, geks_elementals) |>
  aggregate(pias, na.rm = TRUE)
```