---
title: "Extracting the Likelihood Function and Changing Optimizer"
author: ""
date: ""
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Extracting the Likelihood Function and Changing Optimizer}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include=FALSE}
library(ctsmTMB)
```

In this document we show how to use the `likelihood` method to obtain function handlers for the objective function and its gradient (and the Hessian, when using a Kalman filter), for instance in order to use an optimization algorithm other than `stats::nlminb`.

# Simulate from the Ornstein-Uhlenbeck process

---

We use the common Ornstein-Uhlenbeck process to showcase the use of `likelihood`. 

$$
\mathrm{d}X_{t} = \theta (\mu - X_{t}) \, \mathrm{d}t \, + \sigma_{X} \, \mathrm{d}B_{t} 
$$


$$
Y_{t_{k}} = X_{t_{k}} + e_{t_{k}}, \qquad e_{t_{k}} \sim \mathcal{N}\left(0,\sigma_{Y}^{2}\right)  
$$
We first create data by simulating the process:

```{r}
# Simulate data using the Euler-Maruyama scheme
set.seed(10)
theta=10; mu=1; sigma_x=1; sigma_y=1e-1

# Simulation time grid and Brownian motion increments
dt.sim = 1e-3
t.sim = seq(0,1,by=dt.sim)
dw = rnorm(length(t.sim)-1,sd=sqrt(dt.sim))

# Euler-Maruyama recursion, starting from the initial state x0 = 3
x = 3
for(i in 1:(length(t.sim)-1)) {
  x[i+1] = x[i] + theta*(mu-x[i])*dt.sim + sigma_x*dw[i]
}

# Extract observations and add noise
dt.obs = 1e-2
t.obs = seq(0,1,by=dt.obs)
y = x[t.sim %in% t.obs] + sigma_y * rnorm(length(t.obs))

# Create data
.data = data.frame(
  t = t.obs,
  y = y
)
```
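
As a quick sanity check (an optional step, not part of the original workflow), one might plot the simulated latent path together with the noisy observations:

```{r, eval=FALSE}
# Plot the simulated latent path and overlay the noisy observations
plot(t.sim, x, type = "l", xlab = "Time", ylab = "Value")
points(t.obs, y, col = "red", pch = 16, cex = 0.5)
legend("topright", legend = c("latent path x", "observations y"),
       col = c("black", "red"), lty = c(1, NA), pch = c(NA, 16))
```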


# Construct model object

---

We now construct the `ctsmTMB` model object

```{r}
# Create model object
obj = ctsmTMB$new()

# Add system equations
obj$addSystem(
  dx ~ theta * (mu-x) * dt + sigma_x*dw
)

# Add observation equations
obj$addObs(
  y ~ x
)

# Set observation equation variances
obj$setVariance(
  y ~ sigma_y^2
)

# Specify algebraic relations
obj$setAlgebraics(
  theta ~ exp(logtheta),
  sigma_x ~ exp(logsigma_x),
  sigma_y ~ exp(logsigma_y)
)

# Specify parameter initial values and lower/upper bounds in estimation
obj$setParameter(
  logtheta   = log(c(initial = 5,    lower = 0,    upper = 20)),
  mu         = c(    initial = 0,    lower = -10,  upper = 10),
  logsigma_x = log(c(initial = 1e-1, lower = 1e-5, upper = 5)),
  logsigma_y = log(c(initial = 1e-1, lower = 1e-5, upper = 5))
)

# Set initial state mean and covariance
obj$setInitialState(list(x[1], 1e-1*diag(1)))
```

# Estimation

---

We are now, in principle, ready to call the `estimate` method, which runs the optimization using the built-in optimizer `stats::nlminb`:

```{r}
fit = obj$estimate(.data)
```

Internally, the package optimises the objective function with respect to the fixed parameters by constructing function handlers with `TMB::MakeADFun` and passing them to `stats::nlminb`, i.e.

```{r, eval=FALSE}
nll = TMB::MakeADFun(...)
opt = stats::nlminb(start=nll$par, objective=nll$fn, gradient=nll$gr, hessian=nll$he)
```

# Extract function handlers

---

The `likelihood` method allows you to retrieve the `nll` object that holds the negative log-likelihood and its derivatives. The method takes arguments similar to those of `estimate`.

```{r}
nll = obj$likelihood(.data)
```

The initial parameters (supplied by the user) are stored here:
```{r}
nll$par
```

The objective function can be evaluated by:
```{r}
nll$fn(nll$par)
```

The gradient can be evaluated by:
```{r}
nll$gr(nll$par)
```

The Hessian can be evaluated by:
```{r}
nll$he(nll$par)
```
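
If one wants to sanity-check the gradient handler, it can be compared against a finite-difference approximation. Below is a minimal sketch using the `numDeriv` package (an extra assumption on our part; `numDeriv` is not required by `ctsmTMB`):

```{r, eval=FALSE}
# Compare the automatic-differentiation gradient to a central
# finite-difference approximation computed with numDeriv::grad
cbind(
  ad      = as.numeric(nll$gr(nll$par)),
  numeric = numDeriv::grad(nll$fn, nll$par)
)
```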

We can now use these handlers to optimize the function with another optimizer, e.g. `stats::optim`.

# Extract parameter lower/upper bounds

---


You can extract the parameter bounds specified in the earlier call to the `setParameter` method by using the `getParameters` method (note that `nll$par` and `pars$initial` are identical).

```{r}
pars = obj$getParameters()
print(pars)
```
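
We can verify the claimed identity between `nll$par` and `pars$initial` directly (comparing values only, ignoring names):

```{r, eval=FALSE}
# The starting values in the 'nll' object equal the user-supplied initial values
all.equal(as.numeric(nll$par), as.numeric(pars$initial))
```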

# Optimize manually using `stats::optim`

---

We supply the initial parameter values, the objective function and gradient handlers, and the parameter bounds to `optim`.
```{r}
opt = stats::optim(par=nll$par, 
                   fn=nll$fn, 
                   gr=nll$gr, 
                   method="L-BFGS-B", 
                   lower=pars$lower, 
                   upper=pars$upper)
```
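
Before comparing results, it is worth checking that the optimizer reports successful convergence; `stats::optim` returns a `convergence` code, where `0` indicates success:

```{r, eval=FALSE}
# A convergence code of 0 indicates that L-BFGS-B converged successfully
opt$convergence
```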

# Compare results between the two optimizers

---

Let's compare the results from `stats::optim` with the extracted function handlers against the internal optimisation using `stats::nlminb` stored in `fit`:

```{r}
# Estimated parameters
data.frame(external=opt$par, internal=fit$par.fixed)

# Neg. Log-Likelihood
data.frame(external=opt$value, internal=fit$nll)

# Gradient components
data.frame(external=t(nll$gr(opt$par)), internal=t(nll$gr(fit$par.fixed)))
```
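
As a final sketch, the Hessian handler can also be used to compute approximate parameter standard errors at the optimum from the inverse observed information (this assumes the Hessian is positive definite at `opt$par`):

```{r, eval=FALSE}
# Approximate standard errors via the inverse Hessian at the optimum
H  = nll$he(opt$par)
se = sqrt(diag(solve(H)))
data.frame(estimate = opt$par, std.error = se)
```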