---
title: "MultiATSM package - General Guidelines"
author: Rubens Moura
date: "`r Sys.Date()`"
output: 
  bookdown::html_document2:
    base_format: rmarkdown::html_vignette
    number_sections: true
    df_print: paged
    toc: yes
    toc_depth: 2
    fig_caption: yes
vignette: >
  %\VignetteIndexEntry{General Guidelines}
  \usepackage[utf8]{inputenc} 
  %\VignetteEngine{knitr::rmarkdown}
bibliography: references.bib
editor_options: 
  markdown: 
    wrap: 72
---

This document aims at providing general guidance on the use of the
MultiATSM package available at the [CRAN
repository](https://cran.r-project.org/).

```{r}
library(MultiATSM)
```

The *MultiATSM* package offers estimation routines and various outputs
for single and multi-country affine term structure models (ATSMs). All
the frameworks of this package are based on the unspanned economic risk
framework from @JoslinPriebschSingleton2014  **(JPS, 2014)**. In essence, 
these models assume the absence of arbitrage opportunities, consider a 
linear state space representation of the yield curve dynamics and offer 
a tractable approach to simultaneously combine the traditional yield curve 
factors (spanned factors) along with macroeconomic variables
(unspanned factors).

The *MultiATSM* package currently supports the generation of outputs for
eight distinct classes of ATSMs. Specifically, the package accommodates
models estimated either on a *country-by-country* basis, following the
approach of JPS (2014), or *jointly* across all countries within the
economic system, as developed in **JLL (2015)** by
@JotikasthiraLeLundblad2015 and **CM (2024)** by @CandelonMoura2024.

Due to peculiar features of JPS-based specifications, an efficient
estimation of the parameters governing the risk-neutral ($Q$-measure)
and the physical ($P$-measure) probability measures can be carried out
rather independently. The only exception is the variance-covariance
matrix (sigma) term which is a common element to both the $P$ and the
$Q$ density functions. In all cases, the risk factor dynamics under the
$P$-measure follow a VAR(1) model of some sort. The table below
summarizes the general features of each one of the models available at
this package.

```{r ModFea, message=FALSE, echo=FALSE}
ModelLabels <- c("JPS original", "JPS global", "JPS multi", "GVAR single", "GVAR multi", 
                 "JLL original", "JLL No DomUnit", "JLL joint Sigma")

# Rows
Tab <- data.frame(matrix(nrow = length(ModelLabels), ncol = 0)) 
rownames(Tab) <- ModelLabels

# Empty columns
EmptyCol <- c("", "", "", "", "", "", "", "") 
Tab$EmptyCol0 <- EmptyCol
# P-dynamics + 2 empty spaces
Tab$PdynIndUnco <- c("x", "", "", "", "", "", "", "")
Tab$PdynIndCo <- c("", "", "", "", "", "", "", "")
Tab$PdynJointUnco <- c("", "x", "x", "", "", "", "", "")
Tab$PdynJointJLL <- c("", "", "", "", "", "x", "x", "x")
Tab$PdynJointGVAR <- c("", "", "", "x", "x", "", "", "")
Tab$EmptyCol1 <- EmptyCol
Tab$EmptyCol2 <- EmptyCol
# Q-dynamics + 2 empty spaces
Tab$QdynInd <- c("x", "x", "", "x", "", "", "", "")   
Tab$QdynJoint <- c("", "", "x", "", "x", "x", "x", "x") 
Tab$EmptyCol3 <- EmptyCol
Tab$EmptyCol4 <- EmptyCol
# Sigma + 2 empty spaces
Tab$Ponly <-  c("", "", "", "", "", "x", "x", "")
Tab$PandQ <- c("x", "x", "x", "x", "x", "", "", "x")
Tab$EmptyCol5 <- EmptyCol
Tab$EmptyCol6 <- EmptyCol
# Dominant Unit
Tab$DomUnit <- c("", "", "", "", "", "x", "", "x")

# Adjust column names
ColNames <- c("","","","","JLL", "GVAR", "", "", "", "", "", "", "","", "", "","")
colnames(Tab) <- ColNames


# Generate the table
suppressWarnings(library(magrittr))


kableExtra::kbl(Tab, align = "c", caption = "Model Features") %>%
  kableExtra::kable_classic("striped", full_width = F)  %>%
  kableExtra::row_spec(0, font_size = 10) %>%
  kableExtra::add_header_above(c(" "=2, "Unrestricted" = 1, "Restricted" = 1, "Unrestricted" = 1, "Restricted" = 2, " " = 11)) %>%
 kableExtra::add_header_above(c(" "=2, "Individual" = 2, "Joint" = 3, " "=2, "Individual" = 1, "Joint" = 1, " "=2, "P only" = 1, "P and Q" = 1, " " = 3))  %>%
  kableExtra::add_header_above(c( " "=2, "P-dynamics"= 5, " "=2, "Q-dynamics"= 2, " "=2, "Sigma matrix estimation" = 2, " "=2, "Dominant Country"=1), bold = T) %>%
kableExtra::pack_rows("Unrestricted VAR", 1, 3 , label_row_css = "background-color: #666; color: #fff;")  %>%
kableExtra::pack_rows("Restricted VAR (GVAR)", 4, 5, label_row_css = "background-color: #666; color: #fff;") %>%
kableExtra::pack_rows("Restricted VAR (JLL)", 6, 8, label_row_css = "background-color: #666; color: #fff;")
```

Some aspects of the multi-country frameworks are worth highlighting. As
for the models based on the setup of JLL (2015), the version "*JLL
original*" follows closely the seminal work of JLL (2015), i.e., it is
assumed an economic cohort containing a worldwide dominant economy and a
set of smaller countries, in addition to the estimation of the sigma
matrix be performed exclusively under the $P$-measure. The two other
alternative versions assume the absence of a dominant country ("*JLL No
DomUnit"*) and the estimation of sigma under both the $P$ and $Q$
measures ("*JLL joint Sigma*"), as in the standard single-country JPS
(2014) model. As for the remaining multi-country ATSMs, it is considered
that the dynamics of the risk factors under the $P$-measure evolves
according to a GVAR model. The version labeled *"GVAR multi"* is the one
presented in CM (2024).

In what follows, the focus will be devoted to describe the various
pieces that are necessary to implement ATSMs. Specifically, Section
\@ref(S:SectionData) describes the data-set available at this package
and the set of functions that are useful to retrieve data from Excel
files. Section \@ref(S:SectionInputs) exposes the necessary inputs that
have to be specified by the user. In Section \@ref(S:SectionEstimation)
the estimation procedure is detailed. Section
\@ref(S:SectionReplication) shows how to use the *MultiATSM* package to
estimate ATSMs from scratch.

# Data {#S:SectionData}

## Package data-set {#S:SectionDataPackage}

The *MultiATSM* package includes modified datasets compared to those
originally used in @CandelonMoura2023 (CM, 2023) and @CandelonMoura2024
(CM, 2024) due to limited availability of certain data sources. As such, 
in some cases, simulated data replicating the original methodology is 
employed. To simplify the exposition, the database of CM (2024) is used 
throughout the following sections to demonstrate the package's functions.
These datasets can be accessed using the `LoadData()` function with the
argument *CM_2024*, and similarly, the CM (2023) data with *CM_2023*.

```{r}
LoadData("CM_2024")
```

The previous function loads four types of datasets. The first dataset
consists of time series for zero-coupon bond yields from four emerging
markets: China, Brazil, Mexico, and Uruguay. For the purpose of the ATSM
estimation, it is important to note two requirements: *(i)* the bond
yield maturities must be consistent across all countries (although the
`DataForEstimation()` function can take different maturities as inputs,
it outputs bond yields with uniform maturities for all economies); and
*(ii)* the bond yields should be expressed in percentage terms per
annum, not in basis points. The *MultiATSM* package does not include
routines for bootstrapping zero-coupon yields from coupon bonds, so
users must handle this data manipulation themselves if needed.

```{r}
data('CM_Yields')
```

The second dataset focuses on the time series of risk factors. Using the
terminology of JPS (2014), this dataset includes *(i)* country-specific
**spanned factors** (commonly represented by level, slope, and curvature
-- see Section \@ref(S:SpaFac) for an intuitive explanation of these
factors) and *(ii)* a set of country-specific and global **unspanned
factors**, namely macroeconomic variables such as economic growth and 
inflation. Similar to the bond yield data, these economic indicators 
must be constructed by the user.

```{r}
data('CM_Factors')
```

The final two datasets are essential for estimating GVAR-based models.
The trade flows data, sourced from the International Monetary Fund
(IMF), presents the sum of the value of all goods imports and exports
between any two countries of the sample on a yearly basis since 1948.
All the values are free on board and are expressed in U.S. dollars.
These data are used to construct the transition matrix of the GVARs
models.

```{r}
data('CM_Trade')
```

The GVAR factors database casts country-specific lists of all the
factors that are used in the estimation of each country's VARX(1,1,1).
The function titled `DatabasePrep()` provides assistance to structure
the data in a similar fashion to this form. Moreover, a specific
function for computing the star risk factors is detailed in the Section
\@ref(S:SectionGVAR).

```{r}
data('CM_Factors_GVAR')
```

## Import data from Excel files {#S:SectionImport}

This package also offers an automatized procedure to extract data from
Excel files and to, subsequentially, prepare the risk factor databases
that are directly used in the model estimation. The use of the package
functions requires that the databases *(i)* are constructed in separate
Excel files for the unspanned factors, the term structure data, and the
measures of interdependence (for the GVAR-based models); *(ii)* contain,
in each Excel file, one separate tab per country (in addition to the
global variables, in the case of the unspanned factors' database); and
*(iii)* have identical variable labels across all the tabs within each
file. For illustration, see the Excel file available at the package. One
example of list of inputs to be provided is, for instance,

```{r}
Initial_Date <- "2006-09-01" # Format "yyyy-mm-dd"
Final_Date <- "2019-01-01" # Format "yyyy-mm-dd"
DataFrequency <- "Monthly"
GlobalVar <- c("GBC", "VIX") # Global Variables
DomVar <- c("Eco_Act", "Inflation", "Com_Prices", "Exc_Rates") #  Domestic variables
N <-  3 # Number of spanned factors per country
Economies <- c("China", "Mexico", "Uruguay", "Brazil", "Russia")
ModelType <- "JPS original"
```

Using these inputs, the variable *RiskFac_TS* can be constructed,
containing the full time series of risk factors required for the model.

```{r}
FactorLabels <- LabFac(N, DomVar, GlobalVar, Economies, ModelType)
RiskFac_TS <- DataForEstimation(Initial_Date, Final_Date, Economies, N, FactorLabels, ModelType,
                                DataFrequency)
```

# Required user inputs {#S:SectionInputs}

## Fundamental inputs required for estimation

In order to estimate any ATSM of this package, the user needs to specify
several general inputs, namely:

**1. Desired ATSM class**

-   *Model type*: a character vector indicating the model type to be
    estimated, as presented in Table \@ref(tab:ModFea).

**2. Risk factor set**

-   *Economies*: a character vector containing the names of the
    economies included in the system;

-   *Global variables*: a character vector containing the labels of the
    desired global unspanned factors used in the model estimation;

-   *Domestic variables*: a character vector containing the labels of
    the desired domestic unspanned factors used in the model estimation;

-   *Number of spanned factors (N)*: a scalar that represents the number
    of country-specific spanned factors. Note that *N* is equal across
    countries.

**3. Sample span**

-   *Initial sample date*: Start date of the sample period in the format
    "dd-mm-yyyy";

-   *End sample date*: End date of the sample period in the format
    "dd-mm-yyyy".

**4. Frequency of the data**:

-   a character vector that specifies the frequency of the time series
    data. The available options are: *"Annually"*, *"Quarterly"*,
    *"Monthly"*, *"Weekly"*, *"Daily Business Days"*, and *"Daily All
    Days"*.

**5. Stationary constraint under the** $Q$-**dynamics**:

-   A binary variable that equals $1$ if the user wishes to impose the
    constraint that the largest eigenvalue under the $Q$-measure is
    strictly less than $1$ otherwise, it should be set to $0$. While
    enforcing this stationary constraint may prolong model estimation,
    it can improve the convergence of the model.

**6. Output label**

-   Character vector of length one, specifying the name used in the file
    that stores the model outputs.

One possible example of the basic model inputs is

```{r}
# 1) Model type
ModelType <- "JPS original"

# 2) Risk factor set
Economies <- c("China", "Brazil", "Mexico", "Uruguay")
GlobalVar <- c("GBC", "CPI_OECD") # Global variables
DomVar <- c("Eco_Act", "Inflation") # Domestic variables 
N <- 3 # number of spanned factors per country

# 3) Sample span
Initial_Date <- "01-05-2005" # Format: "dd-mm-yyyy"
Final_Date <- "01-12-2019" # Format: "dd-mm-yyyy"

# 4) Frequency of the data
DataFrequency <- "Monthly"

# 5) Risk-neutral stationary constraint
StationarityUnderQ <- 0 # 1 = set stationary condition under the Q; 0 = no stationary condition

# 6) Output label
OutputLabel <- "Model_demo"
```

## Model-specific inputs required for estimation

### GVARlist and JLLlist {#S:SectionGVARJLLinputs}

Additional inputs are required for estimating GVAR-ATSM or JLL-ATSM
models. These inputs should be organized into separate lists, named
*GVARlist* and *JLLlist* for clarity. This section outlines the general
structure of both lists, while Sections \@ref(S:SectionGVAR) and
\@ref(S:SectionJLL) provide detailed explanations and available options.

For GVAR models, the nature of the inputs is twofold. First, the dynamic
structure of each country's VARX model must be specified. For example:

```{r}
VARXtype <- "unconstrained"
```

Second, the estimation of the GVAR requires specifying the necessary
inputs to build the `transition matrix()` . An example of this list of
inputs is

```{r}
W_type <- 'Sample Mean' # Method to compute the transition matrix
t_First_Wgvar <- "2000" # First year of the sample
t_Last_Wgvar <-  "2015" # Last year of the sample
DataConnectedness <- TradeFlows # Measure of connectedness across countries
```

Therefore, a complete *GVARlist* is

```{r}
GVARlist <- list( VARXtype = "unconstrained", W_type = "Sample Mean", t_First_Wgvar = "2000",
                   t_Last_Wgvar = "2015", DataConnectedness = TradeFlows) 
```

Regarding the JLL-ATSM frameworks, it is sufficient to specify the name
of the dominant economy (if applicable) within the economic system:

```{r}
JLLlist <- list(DomUnit =  "China")
```

### BRWlist {#S:SectionBRWinputs}

In an influential paper, @BauerRudebuschWu2012 **(BRW, 2012)** have
shown that the estimates from traditional ATSMs can be severely biased
due to the typical small-sample size used in these studies. As result,
such models may produce unreasonably stable expected future short-term
interest, therefore distorting the term premia estimates for
long-maturity bonds. To circumvent this problem, they propose a
bootstrap-based method that relies on indirect inference estimation that
corrects for the bias present in previous works.

From its version $0.3.1$, the *MultiATSM* package accommodates the
framework proposed by BRW (2012) to each one of its ATSM types. Some
additional inputs must be specified, if the user intends to perform the
model estimation along the lines of BRW (2012). They are:

-   *Mean or median of OLS estimates* *(flag_mean)*: the user must
    decide whether to compute the mean or median of the OLS estimates
    after each bootstrap iteration. To compute the mean (median), the
    user must set *TRUE* (*FALSE*);

-   *Adjustment parameter* *(gamma)*: this parameter controls the degree
    of shrinkage between the difference in the OLS estimates and the BRW
    (2012)'s bootstrap-based one after each iteration. The value of this
    parameter is fixed and must lie in the interval $]0;1[$;

-   *Number of iteration* *(N_iter)* : total amount of iterations used
    in the stochastic approximation algorithm after burn-in;

-   *Number of bootstrap samples (B)*: quantity of simulated samples
    used in each burn-in or actual iteration;

-   *Perform closeness check (checkBRW)*: flag whether the user wishes
    to check the root mean square distance between the model estimates
    produced by the bias-correction method and those generated by OLS.
    Default is set to *TRUE*;

-   *Number of bootstrap samples used in the closeness check (B_check)*:
    default is set to 100,000;

-   *Number of burn-in iteration (N_burn)*: quantity of the iterations
    to be discarded in the first stage of the bias-correction estimation
    process. The recommended number is $15\%$ of the total number of
    iterations.

```{r}
BRWlist <- within(list(flag_mean = TRUE, gamma = 0.2, N_iter = 500, B = 50, 
                       checkBRW = TRUE, B_check = 1000), N_burn <- round(N_iter * 0.15))
```

## The `InputsForOpt()` function

The `InputsForOpt()` function plays a crucial role in the functioning of
the *MultiATSM* package. Specifically, it gathers, transforms, and
generates the necessary inputs for building the likelihood function
based on the user-provided inputs described in the previous sections,
which are then directly utilized in the model optimization process. One
possible use of `InputsForOpt()` is:

```{r, eval=FALSE} 
LoadData("CM_2024") 

ModelType <- "JPS original"
Economies <- "Mexico"
t0 <- "01-05-2007" # Initial Sample Date (Format: "dd-mm-yyyy")
tF <- "01-12-2018" # Final Sample Date (Format: "dd-mm-yyyy")
N <- 3
GlobalVar <- c("Gl_Eco_Act") # Global Variables
DomVar <- c("Eco_Act") # Domestic Variables
FactorLabels <- LabFac(N, DomVar, GlobalVar, Economies, ModelType)

DataFreq <- "Monthly"

ATSMInputs <- InputsForOpt(t0, tF, ModelType, Yields, GlobalMacroVar, DomesticMacroVar,  
                           FactorLabels, Economies, DataFreq, CheckInputs = FALSE)
```

## Additional inputs for numerical and graphical outputs

Once the model parameters from the ATSM have been estimated, the
`NumOutputs()` function enables the compilation of numerical and
graphical of the following additional outputs:

-   Model fit of the bond yields;

-   Impulse response functions (IRFs);

-   Forecast error variance decompositions (FEVDs);

-   Generalized impulse response functions (GIRFs);

-   Generalized forecast error variance decompositions (GFEVDs);

-   Bond yields decomposition into their expected component and term
    premia.

For the numerical computation of the IRFs, GIRFs, FEVDs or GFEVDs, a
horizon of analysis has to be specified, e.g.:

```{r}
Horiz <- 100
```

For the graphical outputs, the user must indicate the desired types of
graphs in a string-based vector:

```{r}
DesiredGraphs <- c("Fit", "GIRF", "GFEVD") # Available options are: "Fit", "IRF", "FEVD", "GIRF", 
                                          # "GFEVD", "TermPremia".
```

Moreover, the user must select the types of variables of interest
(yields, risk factors or both) and, for the JLL type of models, whether
the orthogonalized version should be additionally included

```{r}
WishGraphRiskFac <- 0 #   YES: 1; No = 0.
WishGraphYields <- 1 #    YES: 1; No = 0.
WishOrthoJLLgraphs <- 0 # YES: 1; No = 0.
```

The function `InputsForOutputs()` can provide some guidance for
customizing the features of the wished outputs. Conditional to these
settings, individual folders are created at the user's *temporary
directory* (which can be accessed using `tempdir()` ) to store the
different types of the desired graphical outputs.

### Bond yield decomposition

In its current form, the *MultiATSM* package allows for the calculation
of two risk compensation measures: term premia and forward premia. In
particular, a $n$-maturity bond yield can be decomposed as the sum of
its expected short-rate ($Exp_t^{(n)}$) and term premia 
($TP_t^{(n)}$) components.
Technically: $$
y_t^{(n)} = Exp_t^{(n)} + TP_t^{(n)}
$$ In the package's standard form, the expected short rate term is
computed from time $t$ to $t+n$, which represents the bond's maturity:
$Exp_t^{(n)} = \sum_{h=0}^{n} E_t[y_{t+h}^{(1)}]$. Alternatively, one
could perfom such a decomposition at the forward rate level
($f_t^{(n)}$), specifically
$f_t^{(n)} = \sum_{h=m}^{n} E_t[y_{t+h}^{(1)}] + FP_t^{(n)}$ where
$FP_t^{(n)}$ corresponds to the forward premia. In this case, the user
must specify a binary variable set to $1$ if the computation of forward
premia is desired, or $0$ otherwise. If set to $1$, the user must also
provide a two-element numerical vector containing the maturities
corresponding to the starting and ending dates of the bond maturity.
Example:

```{r}
    WishFPremia <- 1 # Wish to estimate the forward premia: YES: 1, NO:0 
    FPmatLim <- c(60, 120)
```

### Bootstrap settings

An additional list of inputs is required if the user intends to 
generate confidence bands around the point estimates of the numerical 
outputs. These results are achieved using a bootstrap procedure.
Specifically, the list of inputs include:

-   The desired bootstrap procedure (*methodBS*): Available options are
    *(i)* standard residual bootstrap (*bs*); *(ii)* wild bootstrap
    (*wild*), and block bootstrap (*block*). If the latest is selected,
    then the block length (*BlockLength*) must be indicated;
-   The number of bootstrap draws (*ndraws*);
-   The confidence level expressed in percentage points (*pctg*).

```{r}
Bootlist <- list(methodBS = 'block', BlockLength = 4, ndraws =  50, pctg   =  95)
```

### Out-of-sample forecast settings

To perform out-of-sample forecasts of bond yields, the following
list-based features have to be detailed:

-   Number of periods-ahead that the forecasts are to be generated
    (*ForHoriz*);
-   Time-dimension index of the first observations belonging to the
    information set (*t0Sample*);
-   Time-dimension index of the last observation of the information set
    used to perform the first forecast set (*t0Forecast*);
-   Method used in the forecast computation (*ForType*): forecasts can
    be generated using rolling or expanding windows. Rolling window
    forecasting to be done by setting this parameter as *Rolling*. In
    such a case, the forecast length used in the forecasts is the one
    specified as in *t0Sample*. Should the user wishes the forecasts to
    be performed on a expanding window basis, then this input should be
    set as *Expanding*.

```{r}
ForecastList <- list(ForHoriz = 12, t0Sample = 1, t0Forecast = 70, ForType = "Rolling")
```

# Model Estimation {#S:SectionEstimation}

Once bond yields and other (macro)economic time series data are gathered 
and model features are selected, ATSM estimation proceeds in three steps. 
First, the package estimates the country-specific spanned factors for 
inclusion in the global ATSM. Next, it estimates the parameters governing 
risk factor dynamics under the $P-$measure. Finally, it optimizes the 
entire selected ATSM, including the $Q-$measure parameters.

## Spanned Factors {#S:SpaFac}

The spanned factors are yield-related variables that are used to fit the
cross-section dimensions of the term structures. Typically, the spanned
factors of country $i$, $P_{i,t}$, are computed as the *N* first
principal components (PCs) of the set of observed bond yields. Formally,
$P_{i,t}$ is constructed as $P_{i,t} = w_i Y_{i,t}$ where $w_i$ is the
PC weight matrix and $Y_{i,t}$ is a country-specific column-vector of
yields with increasing maturities.

For $N = 3$, the spanned factors are traditionally interpreted as level,
slope, and curvature.  This interpretation arises from the structure of 
the principal component (PC) weight matrix, as shown below :

```{r}
w <- pca_weights_one_country(Yields, Economy = "Uruguay") 
```

In matrix *w*, each row stores the weights used for constructing each
spanned factor. The entries of the first row are linked to the
composition of the *level* factor in that they load rather equally on
all yields. Accordingly, high (low) values of the level factor indicate
an overall high (low) value of yields across all maturities. In the
second row, the weights monotonically increase with the maturities and,
therefore, they capture the degree of steepness (*slope*) of the term
structure. High slope factor values imply a steep yield curve, whereas
low ones entail flat (or, possibly, downward) curves. In the third row,
the weights of the *curvature* factor are presented. The name of this
factor follows from the fact that the weights have a more pronounced
effect on the middle range maturities of the curve. These concepts are
graphically illustrated below.

```{r, fig.cap = "Yield loadings on the spanned factors", echo=FALSE}
LabSpaFac <- c("Level", "Slope", "Curvature")
N <- length(LabSpaFac)
 
w_pca <- data.frame(t(w[1:N,]))
colnames(w_pca) <- LabSpaFac
w_pca$mat <- c(0.25, 0.5, 1, 3, 5, 10) # vector of maturitie

# Prepare plots
colors <- c("Level" = "blue", "Slope" = "green", "Curvature" = "red")

g <-  ggplot2::ggplot(data = w_pca, ggplot2::aes(x=  mat)) +  
    ggplot2::geom_line(ggplot2::aes(y = Level, color = "Level"), linewidth = 0.7) + 
    ggplot2::geom_line(ggplot2::aes(y = Slope, color = "Slope"), linewidth = 0.7) +
    ggplot2::geom_line(ggplot2::aes(y = Curvature, color = "Curvature"),  linewidth = 0.7) +
      ggplot2::labs(color = "Legend") + ggplot2::scale_color_manual(values = colors) + ggplot2::theme_classic() +
    ggplot2::theme(axis.title.y= ggplot2::element_blank(), legend.position="top", legend.title=ggplot2::element_blank(), legend.text= ggplot2::element_text(size=8) ) + 
   ggplot2::xlab("Maturity (Years)") + ggplot2::geom_hline(yintercept=0)

print(g)
```

To directly obtain the time series of the country-specific spanned
factors, the user can use the `Spanned_Factors()` function as
follows:

```{r}
data('CM_Yields')
Economies <- c("China", "Brazil", "Mexico", "Uruguay")
N <- 3
SpaFact <- Spanned_Factors(Yields, Economies, N)
```

## The P-dynamics estimation

As presented in Table \@ref(tab:ModFea), the dynamics of the risk
factors under the $P$-measure evolve according to a VAR(1) model, which
may be fully unrestricted (as in the case of the JPS-related models) or
partially restricted (as in the GVAR and JLL frameworks). While the
focus here is their integration into an ATSM setup, each specification
can also be estimated independently to study dynamic systems. The usage
of these models is illustrated below.

### VAR {#S:SectionVAR}

Using the `VAR()` function of this package requires simply selecting the
set of risk factors for the desired estimated model. For instance, for
the models *JPS global* and *JPS multi*, the estimation of the
$P$-dynamics parameters is obtained as

```{r}
data("CM_Factors")
PdynPara <- VAR(RiskFactors, VARtype= "unconstrained")
```

whereas the estimation of a *JPS original* model for China is

```{r}
FactorsChina <- RiskFactors[1:7, ]
PdynPara <- VAR(FactorsChina, VARtype= "unconstrained")
```

In both cases, the outputs generated are the vector of intercepts and
both the feedback and variance-covariance matrices of an unconstrained
VAR model.

### GVAR {#S:SectionGVAR}

The `GVAR()` function is designed to estimate a first-order GVAR which
is built from country-specific VARX(1,1,1) models. For a comprehensive
discussion of GVAR models, see, for instance, @ChudikPesaran2016.

The `GVAR()` function requires two key inputs: the number of
country-specific spanned factors (N) and a set of supplementary inputs
stored in the *GVARinputs* list. The former is composed by four
elements:

1.  *Economies*: A character vector containing the names of the
    economies included in the system;

2.  *GVAR list of risk factors*: a list of risk factors sorted by
    country in addition to the global variables. See the outputs of the
    `DatabasePrep()` function;

3.  *VARX type*: a string-vector containing the desired estimated form
    of the VARX(1,1,1). Three possibilities are available. The
    *"unconstrained"* form estimates the model without any constrains,
    using standard OLS regressions for each model equation. The
    *"constrained: Spanned Factors"* form prevents
    foreign-spanned-factors to impact any domestic factor in the
    feedback matrix, whereas *"constrained: '* followed by the name of
    the risk factor restricts this same factor to be influenced only by
    its own lagged values and the lagged values of its own star
    variables. In the last two cases, the VARX(1,1,1) is estimated by
    restricted least squares.

```{r}
GVARinputs <- list(Economies = Economies, GVARFactors = FactorsGVAR, VARXtype ="constrained: Inflation")
```

4.  *Transition matrix*: The estimation of a GVAR model requires
    defining a measure of interdependence among the countries of the
    economic system. This information is reported in the transition
    matrix, the entries of which reflect the degree of interconnection
    across two entities of this same economic system. The GVAR setups
    can be estimated using fixed or, starting from package's version
    $0.3.3$, time-varying interdependence weights.

Time-varying interdependence implies that the star factors are
constructed by applying trade flow weights from the specific year to
adjust the risk factors of the corresponding year. To employ this
feature, users need to specify the transition matrix type as
*Time-varying* and select the *same specific date (year)* for both the
initial and final years within the transition matrix. This implies that
the trade weights of this year are used for solving (link matrices) the
GVAR system.

Below the illustration of the transition matrix is computed as the
average of the cross-border trade flow weights for the period spanning
the years from 2006 to 2019. Therefore, the degree of dependence across
countries is assumed to be fixed over the entire sample span. Note that
each row sums up to $1$.

```{r}
data("CM_Trade")
t_First <- "2006"
t_Last <-  "2019"
Economies <- c("China", "Brazil", "Mexico", "Uruguay")
type <- "Sample Mean"
W_gvar <- Transition_Matrix(t_First, t_Last, Economies, type, DataPath = NULL, TradeFlows)
print(W_gvar)
```

Having defined the form of the transition matrix, one can complete the
*GVARinputs* list to estimate a GVAR(1) model. Note that the
*CheckInputs* parameter is set as *TRUE* to perform a prior consistency
check on the inputs provided in *GVARinputs*.

```{r}
data("CM_Factors_GVAR")

GVARinputs <- list(Economies = Economies, GVARFactors = FactorsGVAR, VARXtype = "unconstrained", 
                   Wgvar = W_gvar)
N <- 3

GVARpara <- GVAR(GVARinputs, N, CheckInputs = TRUE)
```

A separate routine is provided for computing the foreign-specific
factors (also commonly referred to as the star variables) used in the
estimation of the VARX models.

```{r}
data('CM_Factors')
StaFac <- StarFactors(RiskFactors, Economies, W_gvar)
```

### JLL {#S:SectionJLL}

Calculating the $P$-dynamics parameters in the form proposed by JLL
(2015) requires the following inputs: *(i)* the time series of the risk
factors in non-orthogonalized form; *(ii)* the number of
country-specific spanned factors and *(iii)* the specification of the
*JLLinputs* list, which includes the following elements:

1.  *Economies*: A character vector containing the names of the
    economies included in the system;

2.  *Dominant Country*: a string-vector containing the name of the
    economy which is assigned as the dominant country (applicable for
    the *JLL original* and *JLL joint Sigma* models) or *None*
    (applicable for the *JLL No DomUnit*);

3.  *Wish the estimation of the sigmas matrices*: this is a binary
    variable which assumes value equal to $1$ if the user wishes the
    estimation of all JLL sigma matrices (i.e. variance-covariance and
    the Cholesky factorization matrices) and, $0$ otherwise. Since this
    estimation is done numerically, its estimation can take several
    minutes;

4.  *Sigma of the non-orthogonalized variance-covariance matrix*: to
    save time, the user may provide the variance-covariance matrix from
    the non-orthogonalized dynamics. Otherwise, this input should be set
    as *NULL*.

5.  *JLL type*: a string-vector containing the label of the JLL model to
    be estimated according to the features described in Table
    \@ref(tab:ModFea).

One possible list of *JLLinputs* is as follows:

```{r}
ModelType <- "JLL original"   
JLLinputs <- list(Economies = Economies, DomUnit = "China", WishSigmas = 1,  SigmaNonOrtho = NULL,
                  JLLModelType = ModelType)
```

As such, the $P-$dynamics dimension of a JLL-based setup can be computed
as:

```{r, eval=FALSE}
data("CM_Factors")
N <- 3
JLLpara <- JLL(RiskFactors, N, JLLinputs, CheckInputs = TRUE)
```

## ATSM estimation

ATSM estimation involves optimizing the log-likelihood function by
optimally selecting the model parameters. The optimization structure in
this package is based on routines from the term structure package by
@LeSingleton2018. The ATSM otimization is carried out using the
`Optimization()` function.

Broadly, the unspanned risk framework from JPS (2014) (and, therefore,
all its multi-country extensions) requires the estimation parameter set
consisting of 6 elements, namely: the risk-neutral long-run mean of the
short rate (*r0*), the risk-neutral feedback matrix (*K1XQ*), the
variance-covariance matrix (*SSZ*) from the VAR processes, the standard
deviation of the errors from the portfolios of yields observed with
error (*se*), in addition to the intercept (*K0Z*) and the feedback
(*K1Z*) matrices of the physical dynamics.

The parameters *K0Z* and *K1Z* are estimated as part of the model's
$P-$dynamics and are known in closed form. Likewise, *r0* and *se* have
analytical solutions and can be factored out of the log-likelihood
function. However, the remaining parameters (*K1XQ* and *SSZ*) require
numerical estimation. @LeSingleton2018 provide standardized routines
(integrated into this package) to set effective initial values,
facilitating the optimization process.

The *MultiATSM* package offers two optimization algorithms: *fminunc*
and *fminsearch*. By default, both algorithms are run sequentially twice
to ensure the model reaches the global optimum. Specifically in the
default bootstrap setup, only *fminunc* is used to save time.

# Examples of full implementation of ATSMs {#S:SectionReplication}

This section presents some examples on how to use the *MultiATSM*
package to fully implement ATSMs. The console displays messages to help
the user track the progress of the code execution at each step. Further
on, the functions of this package are used to replicate some of the
results presented in the original papers of JPS (2014), CM (2023) and CM
(2024). See the [Paper Replication Vignette](Paper-Replications.html) of
this package for a detailed reproduction.

## General template {#S:SectionTemplate}

```{r, eval=FALSE}
########################################################################################################
#################################### USER INPUTS #######################################################
########################################################################################################
# A) Load database data
LoadData("CM_2024")

# B) GENERAL model inputs
ModelType <- "JPS multi" # available options: "JPS original", "JPS global", "GVAR single", "JPS multi",
                          #"GVAR multi", "JLL original", "JLL No DomUnit", "JLL joint Sigma".

Economies <- c("China", "Brazil", "Mexico") # Names of the economies from the economic system
GlobalVar <- c("Gl_Eco_Act")# Global Variables
DomVar <- c("Eco_Act", "Inflation") # Country-specific variables
N <- 2  # Number of spanned factors per country

t0_sample <- "01-05-2005" # Format: "dd-mm-yyyy"
tF_sample <- "01-12-2019" # Format: "dd-mm-yyyy"

OutputLabel <- "Test" # label of the model for saving the file
DataFreq <-"Monthly" # Frequency of the data

StatQ <- 0 # Wish to impose stationary condition for the eigenvalues of each country: YES: 1,NO:0

# B.1) SPECIFIC model inputs
#################################### GVAR-based models ##################################################
GVARlist <- list( VARXtype = "unconstrained", W_type = "Sample Mean", t_First_Wgvar = "2005",
                  t_Last_Wgvar = "2019", DataConnectedness <- TradeFlows ) 
# VARXtype: Available options "unconstrained" or "constrained" (VARX)
# W_type: Method to compute the transition matrix. Options:"Time-varying" or "Sample Mean"
# t_First_Wgvar: First year of the sample (transition matrix)
# t_Last_Wgvar:  Last year of the sample (transition matrix)
# DataConnectedness: measure of connectedness across countries
#################################### JLL-based models ###################################################
JLLlist <- list(DomUnit =  "China")
# DomUnit: name of the economy of the economic system, or "None" for the model "JLL No DomUnit"
###################################### BRW inputs  ######################################################
WishBC <- 0 # Wish to estimate the model with the bias correction method of BRW (2012): #YES: 1, NO:0
BRWlist <- within(list(flag_mean = TRUE, gamma = 0.05, N_iter = 250, B = 50, checkBRW = TRUE,
                       B_check = 1000),  N_burn <- round(N_iter * 0.15))
# flag_mean: TRUE = compute the mean; FALSE = compute the median
# gamma: Adjustment parameter
# N_iter:  Number of iteration to be conserved
# N_burn:  Number of iteration to be discarded
# B: Number of bootstrap samples
# checkBRW: wishes to perform closeness check: TRUE or FALSE
# B_check: If checkBRW is chosen as TRUE, then choose number of bootstrap samples used in the check

# C) Decide on Settings for numerical outputs
WishFPremia <- 1 # Wish to estimate the forward premia: YES: 1, NO:0
FPmatLim <- c(60,120) #  If the forward premia is desired, then choose the Min and max maturities of the
                      # forward premia. Otherwise set NA
Horiz <- 30
DesiredGraphs <- c("Fit", "IRF", "TermPremia") # "Fit", "IRF", "FEVD", "GIRF", "GFEVD", "TermPremia"
WishGraphRiskFac <- 0
WishGraphYields <- 1
WishOrthoJLLgraphs <- 0

# D) Bootstrap settings
WishBootstrap <- 1 #  YES: 1; No = 0.
BootList <- list(methodBS = 'bs', BlockLength = 4, ndraws = 5, pctg =  95)
# methodBS: bootstrap method. Available options: (i) 'bs' ; (ii) 'wild'; (iii) 'block'
# BlockLength: Block length, necessary input for the block bootstrap method
# ndraws: number of bootstrap draws
# pctg: confidence level

# E) Out-of-sample forecast
WishForecast <- 1 #  YES: 1; No = 0.
ForecastList <- list(ForHoriz = 12,  t0Sample = 1, t0Forecast = 162, ForType = "Rolling")
# ForHoriz: forecast horizon
# t0Sample:   initial sample date
# t0Forecast:  last sample date for the first forecast
# ForType: Available options are "Rolling" or "Expanding"

#########################################################################################################
############################### NO NEED TO MAKE CHANGES FROM HERE #######################################
#########################################################################################################

# 2) Minor preliminary work: get the sets of factor labels and  a vector of common maturities
FactorLabels <- LabFac(N, DomVar, GlobalVar, Economies, ModelType)

# 3) Prepare the inputs of the likelihood function
ATSMInputs <- InputsForOpt(t0_sample, tF_sample, ModelType, Yields, GlobalMacroVar, DomesticMacroVar,
                           FactorLabels, Economies, DataFreq, GVARlist, JLLlist, WishBC, BRWlist)

# 4) Optimization of the ATSM (Point Estimates)
ModelParaList <- Optimization(ATSMInputs, StatQ, DataFreq, FactorLabels, Economies, ModelType)

# 5) Numerical and graphical outputs
# a) Prepare list of inputs for graphs and numerical outputs
InputsForOutputs <- InputsForOutputs(ModelType, Horiz, DesiredGraphs, OutputLabel, StatQ, DataFreq,
                                    WishGraphYields, WishGraphRiskFac, WishOrthoJLLgraphs, WishFPremia,
                                    FPmatLim, WishBootstrap, BootList, WishForecast, ForecastList)

# b) Fit, IRF, FEVD, GIRF, GFEVD, and Term Premia
NumericalOutputs <- NumOutputs(ModelType, ModelParaList, InputsForOutputs, FactorLabels, Economies)

# c) Confidence intervals (bootstrap analysis)
BootstrapAnalysis <- Bootstrap(ModelType, ModelParaList, NumericalOutputs, Economies, InputsForOutputs,
                               FactorLabels, JLLlist, GVARlist, WishBC, BRWlist)

# 6) Out-of-sample forecasting
Forecasts <- ForecastYields(ModelType, ModelParaList, InputsForOutputs, FactorLabels, Economies,
                            JLLlist, GVARlist, WishBC, BRWlist)
```

## User-Friendly Inspection of Key Elements in the Selected ATSM Specification 

Starting from version $1.3.0$, the *MultiATSM* package provides additional ways to concisely inspect key elements of the selected ATSM. In particular, users can now gain an alternative perspective on the general model inputs, optimized model parameters, and root mean square error (RMSE) from the forecast analysis. For clarity, the variable names used hereafter align with those presented in Section \@ref(S:SectionTemplate).


### ATSM inputs
The *ATSMInputs* variable is an object of class *ATSMModelInputs* that includes the $S3$ methods `print()` and `summary()`. The `print()` method provides an overview of both the model inputs and the economic system's general features, while the `summary()` method offers statistical summaries of the risk factors and bond yields.

```{r}
data("Out_Example")
print(Out$ATSMInputs)
```

```{r}
summary(Out$ATSMInputs) 
```

### ATSM parameters
The `Optimization()` function returns an object of class *'ATSMModelOutputs'*, which includes the `summary()` $S3$ method. This method offers a parsimonious summary of key model outputs following 
optimization process, including the maximum log-likelihood value, the intercept and slope coefficients from the model’s cross-sectional dimension, and the count of physical-measure eigenvalues greater than one—an indicator of potential explosive behavior in the model’s time series dimension. 

```{r}
summary(Out$ModelParaList) 
```


### Forecasts
The `ForecastYields()` function returns an object of class *'ATSMModelForecast'*, which can be visualized using the `plot()` $S3$ method. This method generates a graphical representation of the RMSEs across different countries and forecast horizons.

```{r}
plot(Out$Forecasts) 
```

# References