---
title: "Centrality indices"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{05 centrality indices}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
  
This vignette describes how to build different centrality indices on the basis of 
indirect relations as described in [this](indirect_relations.html) vignette. Note,
however, that the primary purpose of the netrankr package is **not** to provide a great
variety of indices, but to offer alternative methods for centrality assessment. 
Nevertheless, the package also provides an RStudio addin, `index_builder()`, which 
lets you create and customize more than 20 different indices.

________________________________________________________________________________

## Theoretical Background

A one-mode network can be described as a *dyadic variable* $x\in \mathcal{W}^\mathcal{D}$,
where $\mathcal{W}$ is the value range of the network (in the simple case of 
unweighted networks $\mathcal{W}=\{0,1\}$) and $\mathcal{D}=\mathcal{N}\times\mathcal{N}$ 
describes the dyadic domain of actors $\mathcal{N}$.
\
\
Observed presence or absence of ties (the value range is binary) is usually not 
the relation of interest for network analytic tasks. Instead, mostly implicitly, 
relations are *transformed* into a new set of *indirect* relations on the basis 
of the *observed* relations. As an example, consider (shortest path) distances in the 
underlying graph. While they are fairly easy to derive from an observed network 
of contacts, it is impossible for actors in a network to answer the question 
"How far away are you from others you are not connected with?". We denote generic 
transformed networks from an observed network $x$ as $\tau(x)$. 
\
\
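
The distinction between observed and indirect relations can be made concrete with a small sketch. It uses only `igraph` functions (`make_ring()`, `as_adjacency_matrix()`, `distances()`); the four-node path and the object names `g_path`, `x` and `tau_x` are chosen for illustration and do not appear elsewhere in this vignette.

```{r dyadic_sketch, eval=F}
library(igraph)

# A path on four nodes: 1 - 2 - 3 - 4 (illustrative toy network)
g_path <- make_ring(4, circular = FALSE)

# Observed relation x: binary adjacency (tie present or absent)
x <- as_adjacency_matrix(g_path, sparse = FALSE)

# Indirect relation tau(x): shortest path distances derived from x
tau_x <- distances(g_path)

x[1, 4]      # nodes 1 and 4 share no observed tie
tau_x[1, 4]  # but the derived distance between them is well defined
```

Both objects are dyadic variables on the same domain $\mathcal{N}\times\mathcal{N}$; only the value range differs.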

With this notion of indirect relations, we can express all centrality indices in
a common framework as
$$
c_\tau(i)=\sum\limits_{t \in \mathcal{N}} \tau(x)_{it}
$$
Degree and closeness centrality, for instance, can be obtained by setting $\tau=id$ 
and $\tau=dist$, respectively (for closeness, the sum of distances is additionally 
inverted). Others need several additional specifications which 
can be found in [Brandes (2016)](https://dx.doi.org/10.1177/2059799116630650) or 
[Schoch & Brandes (2016)](https://doi.org/10.1017/S0956792516000401). 
\
With this framework, all centrality indices can be characterized as degree-like 
measures in a suitably transformed network $\tau(x)$. To build specific indices,
we follow the *analytic pipeline* for centrality assessment:
$$
\text{Observed network}\;(x) \longrightarrow 
\text{transformation}\;(\tau(x)) \longrightarrow 
\text{aggregation}\;(\text{e.g. } \sum_t \tau(x)_{it})
$$
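
As a rough illustration of the pipeline, the following sketch runs all three steps by hand with `igraph` and base R only, without any `netrankr` function. The star graph and the names `tau_id`, `tau_dist`, `deg` and `farness` are assumptions made for this example. Note that summing distances yields *farness*, so closeness is obtained as its reciprocal.

```{r pipeline_sketch, eval=F}
library(igraph)

# Observed network x: an undirected star, node 1 is the hub
g_star <- make_star(5, mode = "undirected")

# Transformation step
tau_id   <- as_adjacency_matrix(g_star, sparse = FALSE)  # tau = id
tau_dist <- distances(g_star)                            # tau = dist

# Aggregation step: sum over each row of the transformed network
deg     <- rowSums(tau_id)
farness <- rowSums(tau_dist)

# Both agree with the indices built into igraph
all.equal(deg, as.numeric(degree(g_star)))
all.equal(1 / farness, as.numeric(closeness(g_star)))
```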

________________________________________________________________________________

## Building indices with the `netrankr` package

```{r setup, warning=FALSE,message=FALSE}
library(netrankr)
library(igraph)
library(magrittr)
```

The `netrankr` package does not, by design, explicitly implement any centrality index. It 
does, however, provide a large set of components to create indices. Building an index
based on an indirect relation, computed with `indirect_relations()`, is done with 
the function `aggregate_positions()`.  
\
The usual workflow is as follows:  
`g %>% indirect_relations() %>% aggregate_positions()`   
which is equivalent to
`aggregate_positions(indirect_relations(g))`.  
The former, however, is more readable and mirrors the analytic pipeline 
proposed above.  
\
`aggregate_positions()` has a parameter `type` which is used to choose an appropriate 
aggregation method. Commonly, this is simply the sum operation.

```{r standardcent,eval=F}
data("dbces11")
g <- dbces11

V(g)$name <- 1:11

#Degree
g %>% 
  indirect_relations(type="adjacency") %>% 
  aggregate_positions(type="sum")
#Closeness
g %>% 
  indirect_relations(type="dist_sp") %>% 
  aggregate_positions(type="invsum")
#Betweenness Centrality
g %>% 
  indirect_relations(type="depend_sp") %>% 
  aggregate_positions(type="sum")
#Eigenvector Centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_limit_prop) %>% 
  aggregate_positions(type="sum")
```
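
The aggregation types used in this vignette (`"sum"`, `"invsum"` and `"self"`) can be read as simple operations on the rows or the diagonal of the transformed matrix. The helper `aggregate_sketch()` below is hypothetical and is **not** the implementation used by `aggregate_positions()`; it is only a base R sketch of how we understand these three types, applied to a made-up matrix.

```{r aggregation_sketch, eval=F}
# Hypothetical helper for illustration only (not part of netrankr)
aggregate_sketch <- function(tau_x, type = c("sum", "invsum", "self")) {
  type <- match.arg(type)
  switch(type,
    sum    = rowSums(tau_x),      # sum of each row
    invsum = 1 / rowSums(tau_x),  # reciprocal of each row sum
    self   = diag(tau_x)          # diagonal entries only
  )
}

# A made-up symmetric relation on three nodes
tau_x <- matrix(c(2, 1, 1,
                  1, 3, 0,
                  1, 0, 1), nrow = 3, byrow = TRUE)

aggregate_sketch(tau_x, "sum")     # 4 4 2
aggregate_sketch(tau_x, "invsum")  # 0.25 0.25 0.50
aggregate_sketch(tau_x, "self")    # 2 3 1
```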

For closeness, `type="invsum"` is used because traditional closeness is defined as
$$
c_c(i)=\frac{1}{\sum_t dist(i,t)}.
$$
To obtain a slight variant of closeness, i.e.
$$
c_c(i)=\sum_t \frac{1}{dist(i,t)},
$$
the following code can be used:
```{r closeness_variant, eval=F}
#harmonic closeness
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_inv) %>% 
  aggregate_positions(type="sum")
```
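
To see that the two definitions give different values, they can be computed by hand on a path with three nodes. The sketch uses only `igraph` and base R; the graph and the names `D`, `D_inv`, `classic` and `harmonic` are chosen for this example. Setting the diagonal to zero is an assumption made here to exclude the undefined term $1/dist(i,i)$.

```{r closeness_compare, eval=F}
library(igraph)

# Path 1 - 2 - 3 (illustrative toy network)
g_path <- make_ring(3, circular = FALSE)
D <- distances(g_path)

# Traditional closeness: reciprocal of the summed distances
classic <- 1 / rowSums(D)

# Variant: sum of reciprocal distances, ignoring the diagonal
D_inv <- 1 / D
diag(D_inv) <- 0
harmonic <- rowSums(D_inv)

classic   # 1/3 1/2 1/3
harmonic  # 1.5 2.0 1.5
```

Both variants rank the middle node first here, but they weight distant nodes differently and need not agree on larger networks.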

Indices based on shortest path distances constitute the biggest group of indices in the `netrankr` package.

```{r distance_indices,eval=F}
#residual closeness (Dangalchev,2006)
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_2pow) %>% 
  aggregate_positions(type="sum")

#generalized closeness (Agneessens et al.,2017) (alpha>0)
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_dpow,alpha=2) %>% 
  aggregate_positions(type="sum")

#decay centrality (Jackson, 2010) (alpha in [0,1])
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_powd,alpha=0.7) %>% 
  aggregate_positions(type="sum")

#integration centrality (Valente & Foreman, 1998)
dist_integration <- function(x){
  1 - (x - 1)/max(x)
}
g %>% 
  indirect_relations(type="dist_sp",FUN=dist_integration) %>% 
  aggregate_positions(type="sum")

```
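
For orientation, the four indices above can be written in the common framework as sums over transformed distances. The formulas below follow the usual definitions in the cited literature and the custom function `dist_integration()` defined above; how each helper function treats the diagonal term $t=i$ is not spelled out here and should be checked in the respective help page.

$$
\begin{aligned}
\text{residual closeness:}\quad & c(i)=\sum_t 2^{-dist(i,t)} \\
\text{generalized closeness:}\quad & c(i)=\sum_t dist(i,t)^{-\alpha}, \quad \alpha>0 \\
\text{decay centrality:}\quad & c(i)=\sum_t \alpha^{dist(i,t)}, \quad \alpha\in[0,1] \\
\text{integration centrality:}\quad & c(i)=\sum_t \left(1-\frac{dist(i,t)-1}{\max_{s,u} dist(s,u)}\right)
\end{aligned}
$$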

The package implements several additional distance measures for networks, for which
no index has been defined so far. See the help page of `indirect_relations()` for 
the available options.

Another large group of indices is based on walk counts.

```{r othercent,eval=F}
#subgraph centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_exp) %>% 
  aggregate_positions(type="self")
#communicability centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_exp) %>% 
  aggregate_positions(type="sum")
#odd subgraph centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_exp_odd) %>% 
  aggregate_positions(type="self")
#even subgraph centrality
g %>% 
  indirect_relations(type="walks",FUN=walks_exp_even) %>% 
  aggregate_positions(type="self")
#katz status
g %>% 
  indirect_relations(type="walks",FUN=walks_attenuated) %>% 
  aggregate_positions(type="sum")
```
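
The role of the two aggregation types for walk-based indices can be sketched with `igraph` and base R alone. The example assumes the usual definition of subgraph centrality, in which walks of length $k$ are weighted by $1/k!$ so that the transformed network is the matrix exponential of the adjacency matrix; whether `walks_exp` is computed in exactly this way internally is not verified here. All object names are chosen for illustration.

```{r walks_sketch, eval=F}
library(igraph)

# Path on five nodes (illustrative toy network)
g_path <- make_ring(5, circular = FALSE)
A <- as_adjacency_matrix(g_path, sparse = FALSE)

# Matrix exponential exp(A) = sum_k A^k / k! via the eigendecomposition
# (valid because A is symmetric for an undirected graph)
eig   <- eigen(A, symmetric = TRUE)
exp_A <- eig$vectors %*% diag(exp(eig$values)) %*% t(eig$vectors)

# Diagonal: weighted closed walks (subgraph centrality)
subgraph_c <- diag(exp_A)
# Row sums: weighted walks to all nodes (communicability)
communicability_c <- rowSums(exp_A)

# The diagonal agrees with igraph's built-in subgraph centrality
all.equal(subgraph_c, subgraph_centrality(g_path))
```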


**Note**: The analytic pipeline can of course be wrapped into a function.

```{r index_func}
degree_centrality <- function(g){
  DC <- g %>% 
    indirect_relations(type="adjacency") %>% 
    aggregate_positions(type="sum")
  return(DC)
}
```

Additionally, the RStudio addin `index_builder()` provides a convenient way to produce the code for any desired index.