---
title: "co2 data example"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{co2}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = TRUE, results = 'hold',
  warning = FALSE, cache = FALSE, message = FALSE,
  #dev = 'pdf',
  fig.width = 5, fig.height = 5,
  tidy.opts = list(width.cutoff = 75), tidy = FALSE
)
old <- options(scipen = 1, digits = 4)
```

## Loading GPFDA package
```{r}
library(GPFDA)
```

## Loading co2 data

In this example, we use a real dataset (the Mauna Loa monthly atmospheric CO$_2$ concentrations available in the `datasets` package) and apply a customised covariance kernel.

```{r}
co2 <- datasets::co2
```
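
`co2` is a monthly `ts` object; we can check its frequency before subsetting:
```{r}
# co2 contains 12 observations per year
frequency(co2)
```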

For visualisation purposes, we will use only the first five years of the sample.
```{r}
y <- co2[1:60]
n <- length(y)
input <- 5*(1:n)/n
```

We want to fit a GPR model using the observed data and then make predictions at
$1000$ equally spaced time points:
```{r}
inputNew <- 5*seq(1/n, 1, length.out=1000) # input test for prediction
```

## Using exponential covariance kernel

We will first fit a GPR model with a linear trend for the mean function by 
setting `meanModel='t'`, and an exponential covariance function by setting 
`Cov='pow.ex'` and `gamma=1`.
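The powered exponential family has the form
\begin{equation}
k(t,t') = v \exp(-w |t-t'|^{\gamma}), \quad v, w > 0,
\end{equation}
and setting $\gamma=1$ gives the exponential kernel.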
```{r, warning=FALSE, results='hide', fig.width=8, fig.height=5}
set.seed(789)
fit1 <- gpr(input=input, response=y, Cov='pow.ex', gamma=1, meanModel='t',
            nInitCandidates=50, trace=2)
```

```{r}
sapply(fit1$hyper, exp)
```

Since the estimated noise variance `vv` is extremely low, the fitted model 
essentially interpolates the data points:
```{r, warning=FALSE, results='hide', fig.width=8, fig.height=5}
plot(fit1, main="exponential kernel - fitted model")
```

Note the large uncertainty in the predictions at the test input points:
```{r, warning=FALSE, results='hide', fig.width=8, fig.height=5}
pred1 <- gprPredict(train = fit1, inputNew=inputNew)
plot(pred1, main="exponential kernel - predictions")
```

These results suggest that the exponential kernel is unlikely to be the best 
choice here. A kernel that accounts for the periodic pattern in the data is 
preferable.

## Using a customised covariance kernel

To capture the periodic pattern, we use the following covariance function:
\begin{equation}
k(t,t')=v \exp(-w (t-t')^2 - u \sin^2(\pi (t-t'))), \quad v,w,u > 0
\label{cus-cov}
\end{equation}
(see Rasmussen, C. E., \& Williams, C. K. I. (2006). "Gaussian Processes for 
Machine Learning", The MIT Press). This kernel is the product of a squared 
exponential term, which captures the smooth long-term trend, and a periodic 
term with period one, which matches the annual cycle since the input is 
measured in years.

This is not a standard covariance kernel and is not included in the package. 
Therefore, we will define it manually. 

### Defining a customised covariance kernel

The first step is to define the covariance kernel itself. The name of the kernel 
function must consist of the prefix `cov.` followed by six letters (e.g., `cov.custom`).

Here the C++ function `distMat` is used to calculate the scaled squared 
distances $\exp(w)(t-t')^2$; note that the hyperparameters are stored and 
optimised on the log scale. The similar `distMatSq` function is used when `t` 
and `t'` are identical. See the package documentation for more details.
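
To see what these helper functions compute, here is a small illustration; with
the scaling matrix `A` set to $1$ and `power=2`, the result should be the plain
matrix of pairwise squared distances:
```{r}
# Pairwise squared distances (t-t')^2 between the inputs 1, 2 and 3
distMatSq(input=as.matrix(1:3), A=as.matrix(1), power=2)
```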

```{r}
cov.custom <- function(hyper, input, inputNew=NULL){
  # Hyperparameters are stored on the log scale; transform them back
  hyper <- lapply(hyper, exp)
  if(is.null(inputNew)){
    inputNew <- as.matrix(input)
  }else{
    inputNew <- as.matrix(inputNew)
  }
  input <- as.matrix(input)
  # Squared exponential part: w*(t-t')^2
  A1 <- distMat(input=input, inputNew=inputNew, A=as.matrix(hyper$custom.w),
                power=2)
  # Periodic part: u*sin^2(pi*(t-t'))
  sept <- outer(input[,1], inputNew[,1], "-")
  A2 <- hyper$custom.u*(sin(pi*sept))^2
  customCov <- hyper$custom.v*exp(-A1-A2)
  return(customCov)
}
```
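
As a quick, purely illustrative check, we can evaluate the kernel at arbitrary
hyperparameter values (all set to $0$ on the log scale, i.e. $v=w=u=1$) and
confirm that it returns a symmetric matrix with $v$ on the diagonal:
```{r}
testHyper <- list(custom.w=0, custom.u=0, custom.v=0)  # values on log scale
K <- cov.custom(testHyper, input=as.matrix(seq(0, 1, length.out=5)))
isSymmetric(K)
diag(K)
```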


### Defining the first derivatives

The second step is to define the first derivative of the log-likelihood with 
respect to each of the hyperparameters of the covariance kernel. 

The functions should be named like `Dloglik.custom.w`, where the middle affix 
is the name of the custom kernel and the last letter (here `w`) identifies the 
hyperparameter being differentiated.

For details about derivatives, see Shi, J. Q., and Choi, T. (2011), "Gaussian Process Regression Analysis for Functional Data", CRC Press.

In this case, only the first derivative of the covariance matrix with respect 
to each hyperparameter needs to be calculated. Denoting this derivative by 
`Dcov`, each function returns half the trace of `AlphaQ %*% Dcov`; since both 
matrices are symmetric, this is the same as `0.5*sum(AlphaQ*Dcov)`. The 
argument `AlphaQ` is calculated in the main function and passed to these 
functions; it equals $(Q^{-1}y)(Q^{-1}y)^\top - Q^{-1}$, where $Q$ is the 
covariance matrix.



```{r}
Dloglik.custom.w <- function(hyper, input, AlphaQ){
  # dK/d(log w) = -K * w*(t-t')^2
  A1 <- distMatSq(input=input, A=as.matrix(exp(hyper$custom.w)), power=2)
  Dcov <- - cov.custom(hyper, input)*A1
  out <- 0.5*sum(diag(AlphaQ%*%Dcov))
  return(out)
}

Dloglik.custom.u <- function(hyper, input, AlphaQ){
  # dK/d(log u) = -K * u*sin^2(pi*(t-t'))
  sept <- outer(input[,1], input[,1], "-")
  A2 <- exp(hyper$custom.u)*(sin(pi*sept))^2
  Dcov <- - cov.custom(hyper, input)*A2
  out <- 0.5*sum(diag(AlphaQ%*%Dcov))
  return(out)
}

Dloglik.custom.v <- function(hyper, input, AlphaQ){
  # dK/d(log v) = K
  Dcov <- cov.custom(hyper, input)
  out <- 0.5*sum(diag(AlphaQ%*%Dcov))
  return(out)
}
```
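
As noted above, the trace term `sum(diag(AlphaQ %*% Dcov))` can be computed
without forming the full matrix product: for symmetric matrices it equals
`sum(AlphaQ * Dcov)`. A minimal numerical check:
```{r}
A <- crossprod(matrix(rnorm(9), 3))  # symmetric matrix
B <- crossprod(matrix(rnorm(9), 3))  # symmetric matrix
all.equal(sum(diag(A %*% B)), sum(A * B))
```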

### Defining the second derivatives

The third step is to define the second derivatives of the log-likelihood.
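For instance, writing $\theta_w = \log w$ for the log-scale hyperparameter, the
chain rule gives
\begin{equation}
\frac{\partial k}{\partial \theta_w} = -k \, w (t-t')^2, \qquad
\frac{\partial^2 k}{\partial \theta_w^2} = -\left(\frac{\partial k}{\partial \theta_w} + k\right) w (t-t')^2,
\end{equation}
which is the `D2cov <- -(D1cov + Cov)*A1` pattern used below; the derivatives
with respect to $\log u$ are analogous.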

```{r}
D2custom.w <- function(hyper, input, inv.Q, Alpha.Q){
  Cov <- cov.custom(hyper, input)
  A1 <- distMatSq(input=input, A=as.matrix(exp(hyper$custom.w)), power=2)
  D1cov <- -Cov*A1            # first derivative w.r.t. log(w)
  D2cov <- -(D1cov + Cov)*A1  # second derivative w.r.t. log(w)
  D2c.w <- D2(D1cov, D2cov, inv.Q, Alpha.Q)
  return(D2c.w)
}

D2custom.u <- function(hyper, input, inv.Q, Alpha.Q){
  Cov <- cov.custom(hyper, input)
  sept <- outer(input[,1], input[,1], "-")
  A2 <- exp(hyper$custom.u)*(sin(pi*sept))^2
  D1cov <- -Cov*A2            # first derivative w.r.t. log(u)
  D2cov <- -(D1cov + Cov)*A2  # second derivative w.r.t. log(u)
  D2c.u <- D2(D1cov, D2cov, inv.Q, Alpha.Q)
  return(D2c.u)
}

D2custom.v <- function(hyper, input, inv.Q, Alpha.Q){
  # For log(v), the first and second derivatives of the covariance both
  # equal the covariance itself
  Cov <- cov.custom(hyper, input)
  D2c.v <- D2(Cov, Cov, inv.Q, Alpha.Q)
  return(D2c.v)
}
```

### Defining a diagonal matrix including the signal variance

Finally, we define a function returning the diagonal elements of the customised 
covariance matrix, that is, the signal variance at each input point. Since 
$(t-t)^2=0$ and $\sin^2(0)=0$, these elements are all equal to $v$. This 
function is used when making predictions at new input points.

```{r}
diag.custom <- function(hyper, input){
  Qstar <- rep(exp(hyper$custom.v), nrow(input))
  return(Qstar)
}
```
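
Using the illustrative `testHyper` from the earlier sanity check, the vector
returned by `diag.custom` should match the diagonal of the covariance matrix
computed there:
```{r}
all.equal(diag.custom(testHyper, input=as.matrix(seq(0, 1, length.out=5))),
          diag(K))
```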


### Fitting the GPR model with the customised kernel and making predictions

```{r, message=FALSE, results='hide', fig.width=8, fig.height=5}
fit2 <- gpr(input=input, response=y, Cov='custom',
            NewHyper=c('custom.w','custom.u','custom.v'), gamma=2, meanModel='t',
            nInitCandidates=50, trace=2, useGradient=FALSE)
```
```{r}
sapply(fit2$hyper, exp)
```
Note that the estimated noise variance `vv` is no longer negligible. Both the 
fitted model and the predictions look much more reasonable for this dataset:
```{r, message=FALSE, fig.width=8, fig.height=5}
plot(fit2, main="customised kernel - fitted model")
```

```{r, message=FALSE, fig.width=8, fig.height=5}
pred2 <- gprPredict(train = fit2, inputNew=inputNew)
plot(pred2, main="customised kernel - predictions")
```



```{r, include = FALSE}
options(old)
```