---
title: "On the usage and estimation of theta in DImulti()"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{On the usage and estimation of theta in DImulti()}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{css styling, echo=FALSE}
span.R {
  font-family: Courier New;
}
```

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

options(crayon.enabled = TRUE)

ansi_aware_handler <- function(x, options)
{
  paste0(
    "<pre class=\"r-output\"><code>",
    fansi::sgr_to_html(x = x, warn = FALSE, term.cap = "256"),
    "</code></pre>"
  )
}

old_hooks <- fansi::set_knit_hooks(knitr::knit_hooks,
                                 which = c("output", "message", "error", "warning"))
knitr::knit_hooks$set(
  output = ansi_aware_handler,
  message = ansi_aware_handler,
  warning = ansi_aware_handler,
  error = ansi_aware_handler
)
```

```{r setup}
library(DImodelsMulti)
library(DImodels)
```

<h2>A quick overview</h2>

The main function of the R package <span class="R">DImodelsMulti</span> is <span class="R">DImulti()</span>. It is designed to fit (1) multivariate Diversity-Interactions (DI) 
models (Dooley *et al.*, 2015), (2) repeated measures DI models (Finn *et al.*, 2013), and (3) multivariate repeated measures DI models.

An example of a multivariate repeated measures DI model can be seen below:

$$ y_{kmt} =  \sum^{S}_{i=1}{\beta_{ikt} p_{im}} +
 \sum^{S}_{\substack{i,j=1 \\ i<j}}{\delta_{ijkt}(p_{im}p_{jm})^{\theta_{k}}} +
 \epsilon_{kmt} $$
 
where $y_{kmt}$ refers to the value of the $k^{th}$ ecosystem function from the
$m^{th}$ experimental unit at a time point $t$. For an experimental unit $m$, $\beta_{ikt}$ scaled 
by $p_{im}$ is the expected contribution of the $i^{th}$  species to the $k^{th}$ response at time 
point $t$ and is referred to as the $i^{th}$ species’ ID effect.
$S$ represents the number of unique species present in the study. Similarly to the ID effect,
the interaction effect, $\delta$, between species is scaled by some combination of the products 
of species proportions, which depends on the interaction structure chosen. The example above shows 
the full pairwise structure, which has a unique interaction term, $\delta_{ij}$, per pair of species $i$ & $j$. 

The nonlinear term $\theta$ (Connolly *et al.*, 2013; Vishwakarma *et al.*, 2023), the focus of this vignette, is applied as a power to each pair of species proportions included in the interaction terms of the model. It is included in the model to allow the shape of the BEF relationship to change. 
The value of the nonlinear parameter $\theta$ is allowed to vary between ecosystem functions, in 
turn allowing the fixed effect structure to change across functions, in recognition that the nature of the species interactions could change between ecosystem functions.


![](Theta.png){width=100%}
Image Credit: Connolly *et al.* 2013


<h2>DImulti()'s estimation process for $\theta$ </h2>

In the function <span class="R">DImulti()</span>, $\theta$ values may be supplied by a user through 
the parameter <span class="R">'theta'</span> or estimated by specifying <span class="R">TRUE</span>
through <span class="R">'estimate_theta'</span>. The non-linear term may also be "left out" by 
setting <span class="R">'theta'</span> equal to 1 and <span class="R">'estimate_theta'</span> to 
FALSE, which is the default for the function.

```{r thetaOptions, eval = FALSE}
DImulti(..., theta = c(0.5, 1, 1.2))
DImulti(..., theta = 1)
```


Setting a value for theta supplied by a user is a straightforward process thanks to the function
<span class="R">DI_data()</span> from the parent package <span class="R">DImodels</span> (Moral *et al.*, 2023). We divide the dataset by ecosystem function, as multivariate data can have differing values of theta for each, then pass each subset into the function, specifying the desired $\theta$ value and base interaction structure. The subsets are then bound together again and the dataset is ready for use with theta. 

The below example would transform the usual format of a full pairwise structure from 
$\delta_{12}p_{1}p_{2} + \delta_{13}p_{1}p_{3} + \delta_{23}p_{2}p_{3}$ to $\delta_{12}(p_{1}p_{2})^{1.2} + \delta_{13}(p_{1}p_{3})^{1.2} + \delta_{23}(p_{2}p_{3})^{1.2}$.

```{r DI_data_value, eval = FALSE}
DI_data(..., theta = 1.2, what = "FULL")
```


When theta is estimated, the process is a bit more involved. We create a function to fit multivariate repeated measures DI models, wrapping <span class="R">DImulti()</span>, which accepts a vector of theta values, one per ecosystem function, and returns the negative log likelihood of the produced model. We then use <span class="R">optim()</span> to minimise the returned value from this function over the space of the allowable values of theta (0.1 to 1.5). 

After fitting a model where theta is estimated. This model should then be compared to a model with the same fixed effects structure using $\theta = 1$ using information criteria (AIC, BIC, AICc, BICc) to test the significance of the estimated values. This may be done all at once, as below:

```{r test_theta, eval = FALSE}
AICc(DImodel)
AICc(DImodel_theta)
```

Or each value of theta should be tested individually while the others are held constant at a value of 1.

```{r testSep_theta, eval = FALSE}
AICc(DImodel)
AICc(DImodel_thetaFunc1)

AICc(DImodel)
AICc(DImodel_thetaFunc2)

AICc(DImodel)
AICc(DImodel_thetaFunc3)
```

The best method for estimating and testing theta values has not yet been thoroughly tested. 

In the event that the preferred model has an estimated value of theta that differs from 1 but causes fitting issues, e.g., predictions are unrealistic, either in the model used for estimation or a final selected model, set the theta value equal to 1 and continue/redo the analysis.

<h2>Implications of this methodology</h2>

The methodology used in this package allows a user to estimate the value of theta for each ecosystem function in the dataset simultaneously using log-likelihood as a measure of model fit quality.

For multivariate and repeated measures DI models, we have allowed the value of theta to vary across ecosystem functions, which in turn allows the change in fixed effects structures to vary across functions. This is not common practice in multivariate regression and so its effects/implications will require further study.

The methodology used in this R package may change in future updates to align with research findings on the topic.

 

<h2>References</h2>

Vishwakarma, R., Byrne, L., Connolly, J., de Andrade Moral, R. and Brophy, C., 2023. 
Estimation of the non-linear parameter in Generalised Diversity-Interactions models is unaffected by change in structure of the interaction terms. 
Environmental and Ecological Statistics, 30(3), pp.555-574.  

Moral, R.A., Vishwakarma, R., Connolly, J., Byrne, L., Hurley, C., Finn, J.A. and Brophy, C.,
2023. 
Going beyond richness: Modelling the BEF relationship using species identity, evenness,
richness and species interactions via the DImodels R package. 
Methods in Ecology and Evolution, 14(9), pp.2250-2258.  

Dooley, Á., Isbell, F., Kirwan, L., Connolly, J., Finn, J.A. and Brophy, C., 2015. 
Testing the effects of diversity on ecosystem multifunctionality using a multivariate model. 
Ecology Letters, 18(11), pp.1242-1251.  

Finn, J.A., Kirwan, L., Connolly, J., Sebastià, M.T., Helgadottir, A., Baadshaug, O.H.,
Bélanger, G., Black, A., Brophy, C., Collins, R.P. and Čop, J., 2013. 
Ecosystem function enhanced by combining four functional types of plant species in intensively
managed grassland mixtures: a 3‐year continental‐scale field experiment.
Journal of Applied Ecology, 50(2), pp.365-375 . 

Connolly, J., Bell, T., Bolger, T., Brophy, C., Carnus, T., Finn, J.A., Kirwan, L., Isbell, F.,
Levine, J., Lüscher, A. and Picasso, V., 2013. 
An improved model to predict the effects of changing biodiversity levels on ecosystem function.

Journal of Ecology, 101(2), pp.344-355.  

Kirwan, L., Connolly, J., Finn, J.A., Brophy, C., Lüscher, A., Nyfeler, D. and Sebastià, M.T.,
2009. 
Diversity-interaction modeling: estimating contributions of species identities and interactions
to ecosystem function. 
Ecology, 90(8), pp.2032-2038.  

Brent, R.P., 1973. 
Some efficient algorithms for solving systems of nonlinear equations. 
SIAM Journal on Numerical Analysis, 10(2), pp.327-344.