---
title: "Testing Sums"
author: "Donny Williams"
date: "5/25/2020"
bibliography: ../inst/REFERENCES.bib
output:
rmarkdown::html_vignette:
  toc: yes
vignette: >
  %\VignetteIndexEntry{Testing Sums}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Introduction
This is a follow-up to the vignette ["Three Ways to Test the Same Hypothesis"](https://donaldrwilliams.github.io/BGGM/articles/hyp_3_ways.html). A
new feature, `pcor_sum`, was added to **BGGM** that allows for testing partial correlation sums. 
This differs from the Bayes factor approach ("Approach #3"), in that only the posterior 
distribution is used to determine whether there is a difference in the sums. 

### R package
```{r, eval = FALSE, message=FALSE}
# need the developmental version
if (!requireNamespace("remotes")) { 
  install.packages("remotes")   
}   

# install from github
remotes::install_github("donaldRwilliams/BGGM")
library(BGGM)
```


# One Group
This first example looks at one group, where a sum is tested within the same ptsd network. I focus on the 
relations between the re-experiencing (`B`) and avoidance (`C`) communities. In particular, the sum of relations between the "Intrusion" (5 nodes) community and the "Avoidance" (two nodes) community is tested. 

## Sum to String
For the avoidance symptom "avoidance of thoughts" `C1`, this can be written in `R` code with

```
# ptsd
Y <- ptsd

# paste together sums
paste0(colnames(Y)[1:5],  "--C1", collapse = " + ")

#> "B1--C1 + B2--C1 + B3--C1 + B4--C1 + B5--C1"
```
whereas, for the avoidance symptom "avoidance of reminders" (`C2`), this is written as

```
paste0(colnames(Y)[1:5],  "--C2", collapse = " + ")

#> "B1--C2 + B2--C2 + B3--C2 + B4--C2 + B5--C2"
```
Note that typically this would have to be written out. `paste0` was used in this case to 
avoid typing out all of the relations.

## Fit Model

Here an ordinal GGM is fitted

```
fit <- estimate(Y+1, type = "ordinal", iter = 1000)
```
where the `+1` changes the first category from 0 to 1 (required).

## Test Sums
The next step is to use the `pcor_sum` function. First, I combine the sums into one string separated with `;`.
```
# sum 1
sum1 <- paste0(colnames(Y)[1:5],  "--C1", collapse = " + ")

# sum 2
sum2 <- paste0(colnames(Y)[1:5],  "--C2", collapse = " + ")

# paste together
sums <- paste(sum1, sum2, sep = ";")

# print
sums
#> "B1--C1 + B2--C1 + B3--C1 + B4--C1 + B5--C1;B1--C2 + B2--C2 + B3--C2 + B4--C2 + B5--C2"
```

Next `pcor_sum` is used

```
test_sum <- pcor_sum(fit, relations = sums)

# print
test_sum

# BGGM: Bayesian Gaussian Graphical Models 
# --- 
# Network Stats: Posterior Sum
# Posterior Samples: 1000 
# --- 
# Estimates 
# 
# Sum: 
#                                    Post.mean Post.sd Cred.lb Cred.ub
# B1--C1+B2--C1+B3--C1+B4--C1+B5--C1     0.215   0.096   0.034   0.404
# B1--C2+B2--C2+B3--C2+B4--C2+B5--C2     0.334   0.097   0.145   0.514
# --- 
# 
# Difference:
# B1--C1+B2--C1+B3--C1+B4--C1+B5--C1 - B1--C2+B2--C2+B3--C2+B4--C2+B5--C2 
# 
#  Post.mean Post.sd Cred.lb Cred.ub Prob.greater Prob.less
#     -0.119   0.145  -0.409   0.173        0.205     0.795
# --- 
```

`Prob.greater` is the posterior probability that the first sum is larger than the second sum.

## Plot Results
The object `test_sum` can then be plotted. Note this returns three plots, but only the difference is shown here

```
plot(test_sum)$diff
```

![](../man/figures/test_sum_hist.png)

The histogram is not very smooth in this case because `iter = 1000`, but this of course can be changed.

# Two Groups
This next example is for two groups. The data are called `bfi` and they are in the **BGGM** package. I compare a sum of two relations for questions measuring agreeableness in males and females. The relations tested are as follows


## Sum to String
```r
sums <- c("A3--A4 + A4--A5")
```
where `A1` is "know how to comfort others", `A4` is "love children", and `A5` is "make people feel at ease".

## Fit Models
The next step is to fit the models
```r
# data
Y <- bfi

# males
Y_males <- subset(Y, gender == 1, select = -c(education, gender))[,1:5]

# females
Y_females <- subset(Y, gender == 2, select = -c(education, gender))[,1:5]


fit_female <- estimate(Y_females, seed = 2)

# fit males
fit_male <- estimate(Y_males, seed = 1)
```

## Test Sums
Then test the sum

```r
test_sum <- pcor_sum(fit_female, fit_male, relations = sums)

# print
test_sum

#> BGGM: Bayesian Gaussian Graphical Models 
#> --- 
#> Network Stats: Posterior Sum
#> Posterior Samples: 5000 
#> --- 
#> Estimates 
#> 
#> Sum: 
#>                   Post.mean Post.sd Cred.lb Cred.ub
#> g1: A3--A4+A4--A5     0.292   0.026   0.241   0.342
#> g2: A3--A4+A4--A5     0.305   0.036   0.234   0.375
#> --- 
#> 
#> Difference:
#> g1: A3--A4+A4--A5 - g2: A3--A4+A4--A5 
#> 
#>  Post.mean Post.sd Cred.lb Cred.ub Prob.greater Prob.less
#>     -0.014   0.045    -0.1   0.074        0.386     0.614
#> --- 
```
## Sanity Check
For a kind of sanity check, here is the sum for the male group obtained from the point estimates.

```r
pcor_mat(fit_male)["A3", "A4"] + pcor_mat(fit_male)["A4", "A5"] 

#>  0.305
```

This matches the output.

# Notes
By default, the print function for `pcor_sum` provides 95 % credible intervals. This can be changed by
directly using the print function, for example `print(test_sum, cred = 0.99)`, provides
99 % credible intervals.

Currently, this function only supports sums, due to this being of interest for the psychological network
literature in particular. This can be extended to accommodate multiplication, subtraction,
testing values other than zero, etc. Please make a feature request at either
[github](https://github.com/donaldRwilliams/BGGM/issues) or [BGGM-users group](https://groups.google.com/forum/#!forum/bggm-users).