---
title: "Multipartite Stochastic Block Models"
subtitle: "An illustration on a mutualistic ecological network"
author: "team großBM"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    toc_depth: 4
bibliography: references.bib
link-citations: yes
vignette: >
  %\VignetteIndexEntry{Multipartite Stochastic Block Models}
  %\VignetteEngine{knitr::rmarkdown} 
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

## Preliminaries

This vignette illustrates the use of the `estimateMultipartiteSBM` function and the methods accompanying the R6 classes `multipartiteSBMfit'.

### Requirements

The only package required for the analysis is **sbm**:

```{r setup, message=FALSE, warning=FALSE}
library(sbm)
```


## Dataset

We apply our methodology to an ecological mutualistic multipartite network.  
The dataset --compiled and conducted by   @Dattilo  at Centro de Investigaciones Costeras La Mancha (CICOLMA), located on the central coast of the Gulf of Mexico, Veracruz, Mexico--
 involves three general types of plant-animal mutualistic interaction:
 pollination,  seed dispersal by frugivorous birds, and   protective mutualisms between ants and plants with extrafloral nectaries.


The dataset --which is one of the largest compiled so far with respect
to species richness, number of interactions and sampling effort--  includes  4 functional groups (FG), namely  plants, pollinator species (referred as floral visitors), ant species and frugivorous bird species. Three binary bipartite networks have been collected representing interactions between 1/ plants and florals visitor,  2/ plants and ants, and 3/ plants and seed dispersal birds, resulting into three bipartite networks.   

The FG are of respective sizes: $n_1 =  141$  plant species,  $n_2 = 173$  pollinator species, $n_3 = 46$ frugivorous bird species and $n_4 = 30$  ant species. 
        
The 3 networks contain   $753$  observed interactions of which $55\%$ are   plant-pollinator interactions, $17\%$ are   plant-birds interactions  and $28\%$ are plant-ant interactions.

```{r loading dataset, eval=TRUE}
data(multipartiteEcologicalNetwork)
str(multipartiteEcologicalNetwork)
names(multipartiteEcologicalNetwork)
```


### Formatting the data
We format the data to be able to use our functions i.e. we transform the matrices into an list containing *the matrix*, *its type* : `inc`   for incidence matrix,  `adj`    for adjacency symmetric, and `diradj` for  non symmetric (oriented) adjacency  matrix, the name of functional group in row and the name of functional group in column.   The three matrices are gathered in a list. 

To do so, we use the function `defineNetwork`. 
```{r transform dataset,  eval=TRUE}
Net <- multipartiteEcologicalNetwork
type = "bipartite"
model = "bernoulli"
directed = FALSE
PlantFlovis <- defineSBM(Net$Inc_plant_flovis, model, type, directed, dimLabels = c("Plants",
    "Flovis"))
PlantAnt <- defineSBM(Net$Inc_plant_ant, model, type, directed, dimLabels = c("Plants",
    "Ants"))
PlantBird <- defineSBM(Net$Inc_plant_bird, model, type, directed, dimLabels = c("Plants",
    "Birds"))
``` 

If one wants to keep a track of the names of the species, they should be used as rownames and colnames in the matrices.



```{r example of dataset, eval=TRUE}
PlantFlovis$netMatrix[1:2, 1:2]
```


A plot of the data can be obtained with following command

```{r plot data}
plotMyMultipartiteMatrix(list(PlantFlovis, PlantAnt, PlantBird))
```



### Mathematical Background

See @multipartite for details.

### Inference 

The model selection and the estimation are performed with the function `estimatemultipartiteBM`.

```{r load result, echo = FALSE, eval = TRUE}
load('resMultipartiteEcological.rda')
``` 

```{r MBM, echo = TRUE, eval = FALSE}
estimOptions = list(initBM = FALSE)
listSBM <- list(PlantFlovis, PlantAnt, PlantBird)
myMSBM <- estimateMultipartiteSBM(listSBM, estimOptions)
``` 


\code{myMSBM} contains the estimated parameters of the models we run through during the search of the better numbers of blocks. 

```{r MBM what}
myMSBM
``` 


The best model has the following numbers of blocks:
```{r MBM v_K }
myMSBM$nbBlocks
```


To see the parameters estimated for the better model we use the following command `myMSBM$connectParam` or `myMSBM$blockProp`:
```{r MBM param }
myMSBM$blockProp
myMSBM$connectParam
```


The clustering  supplied by the better model  are in `myMSBM$memberships$***`.

```{r MBM Z }
table(myMSBM$memberships$Plants)
table(myMSBM$memberships$Ants)      
``` 

```{r storedmodels}
myMSBM$storedModels
``` 

## Plots 
We can either plot the reorganized matrix:
```{r plot, eval = TRUE }
plot(myMSBM) 
``` 

or the mesoscopic view:
```{r plot meso, eval = TRUE}
plotOptions = list(vertex.size = c(12, 6, 4, 4))
plotOptions$vertex.shape = rep("circle", 4)
plotOptions$vertex.color = c("darkolivegreen3", "darkgoldenrod2", "salmon2",
    "cadetblue2")
plotOptions$edge.curved = 0.1
plot(myMSBM, type = "meso", plotOptions = plotOptions)
```

 

## References