---
title: "Additional options available in the `jointVIP` package"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Additional options available in the `jointVIP` package}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
devtools::load_all(".")
```

## Setting up to demonstrate `jointVIP`

See the Get started with jointVIP vignette to get started on how to use `jointVIP` package. Using the same data sets, this vignette's main purpose is to demonstrate other options that are available. 

```{r data_cleaning, include=FALSE}
# load jointVIP package
library(jointVIP)

# data to use for example
library(causaldata)

# matching methods shown in example
library(MatchIt)
library(optmatch)
# load data for estimating earnings from 1978
# treatment is the NSW program
pilot_df = cps_mixtape
analysis_df = nsw_mixtape

transform_earn <- function(data, variables){
  data = data.frame(data)
  log_variables = sapply(variables, 
                         function(s){paste0('log_',s)})
  data[,log_variables] =
    apply(data[,variables], 2, 
        function(x){ifelse(x == 0, 
                           log(x + 1), 
                           log(x))})
  return(data)
}

pilot_df <- cps_mixtape
pilot_df <- transform_earn(pilot_df, c('re74', 're75', 're78'))

analysis_df <- nsw_mixtape
analysis_df <- transform_earn(analysis_df, c('re74', 're75', 're78'))

treatment = 'treat'
outcome = 'log_re78'
covariates = c(names(analysis_df)[!names(analysis_df) %in% c(treatment,
                                           outcome, "data_id",
                                           "re74", "re75",
                                           "re78")])

new_jointVIP = create_jointVIP(treatment = treatment,
                               outcome = outcome,
                               covariates = covariates,
                               pilot_df = pilot_df,
                               analysis_df = analysis_df)

m.out <- matchit(
  treat ~ log_re75 + log_re74,
  data = analysis_df,
  method = "optimal",
  distance = "mahalanobis"
)
optmatch_df <- match.data(m.out)[, c(treatment, outcome, covariates)]

post_optmatch_jointVIP <- create_post_jointVIP(new_jointVIP, 
                                               post_analysis_df = optmatch_df)

```

```{r setup, eval=FALSE}
library(jointVIP)

# gentle reminder of how to create a new jointVIP object
new_jointVIP = create_jointVIP(treatment = treatment,
                               outcome = outcome,
                               covariates = covariates,
                               pilot_df = pilot_df,
                               analysis_df = analysis_df)

# gentle reminder of how to create a new post_jointVIP object
post_optmatch_jointVIP = create_post_jointVIP(new_jointVIP,
                                             post_analysis_df = optmatch_df)
```

## Demonstration for additional options in `summary()` and `print()`

```{r sumnprint}
# # simplest usage
# summary(new_jointVIP)

summary(new_jointVIP, 
        smd = 'pooled', 
        use_abs = FALSE, 
        bias_tol = 0.005)

print(new_jointVIP, 
      smd = 'pooled', 
      use_abs = FALSE, 
      bias_tol = 0.005)

# not run
# get_measures(new_jointVIP, smd = 'cross-sample')
```
The `summary()` and `print()` functions have the same additional parameters and uses rounded numbers to the third decimal place.

* The `smd` parameter allows only `pooled` or `cross-sample` options. The cross-sample is based on the analysis sample numerator and pilot sample denominator (equivalent to standardized version of the one-sample omitted variable bias). The cross-sample version is the default. The pooled version of the standardized mean difference (standard) can be specified. The `pooled` option is the standard option used in balance tables and Love plots that uses both treated and control variances from the analysis data set to construct the SMD. Note this `pooled` option applies the same formula to both binary and continuous variables.

* The `use_abs` parameter takes in either TRUE or FALSE, stating if set TRUE (default), the absolute measures are used. Otherwise, signed measured are used if set to be FALSE. 
* The `bias_tol` parameter is to set the absolute bias tolerance that one wishes to examine at a glance. The default is 0.01. 

Under the hood, `get_measures()` function is used to calculate. If the researcher wishes to save the measures calculated, perhaps `get_measures()` would be used; example is shown above. Only signed measures are presented as outputs for that function.

## Demonstration for additional options in `plot()`

```{r plot_ex, dpi=300, fig.asp = 0.75, fig.width = 6, out.width = "80%", fig.align = "center", message=FALSE}
# # simplest usage
# plot(new_jointVIP)

plot(new_jointVIP, 
     smd = 'pooled', 
     use_abs = FALSE, 
     plot_title = 'Signed version of the jointVIP with pooled SMD')

plot(new_jointVIP, 
     bias_curve_cutoffs = c(0.005, 0.05, 0.10),
     text_size = 5, 
     label_cut_std_md = 0.1,
     max.overlaps = 15,
     plot_title = 'Increased text size and bias curve specifications',
     expanded_y_curvelab = 0.002
     #label_cut_outcome_cor = 0.2,
     #label_cut_bias = 0.1
     )

plot(new_jointVIP, 
     bias_curves = FALSE,
     add_var_labs = FALSE,
     plot_title = 'No bias curves or variable labels'
     )
```

There are many parameters for the `plot()` option. The `smd` and `use_abs` options functions the same as above. The other main parameter input is `plot_title`, which allows users to specify the title of the plot. Additional parameters not listed as a main parameter is explained and example usage is shown above.

* `bias_curve_cutoffs`: draws bias curves by the specifications. This is only used when `smd` is specified as `cross-sample`.
* `text_size`: text size of the variable labels can be increased.
* `max.overlaps`: maximum overlap of the variable labels.
* `label_cut_std_md`: standardized mean difference label cutoff, an example would be, if you wish to label all variables with standardized mean difference above 0.1.
* `label_cut_outcome_cor`: outcome correlation label cutoff, an example would be, if you wish to label all variables with outcome correlation above 0.2.
* `label_cut_bias`: bias label cutoff, an example would be, if you wish to label all variables with bias difference above 0.1.
* `bias_curves`: TRUE (default) draws the omitted variable bias curves, FALSE suppresses the bias curves. If `bias_curve_cutoffs` also specified, `bias_curves` takes priority. This is only used when `smd` is specified as `cross-sample`.
* `add_var_labs`: TRUE (default) adds variable labels. This suppresses all `label_cut` inputs if specified FALSE.
* `expanded_y_curvelab`: if one wishes to expand the y-axis, the bias curve labels don't automatically get updated. So, this allows the fine-tuning of the bias-curve labels. Typically this is used under the hood for bootstrap version of the plot. However, user can specify this if they wish. This is only used when `smd` is specified as `cross-sample`.

# Post-analysis parameters examples

The same variables are specified in the Get started with jointVIP vignette; here we choose a matching example to demonstrate the additional parameters. 

## Demonstration for additional options in `summary()` and `print()`

```{r sumnprint_post, dpi=300, fig.asp = 0.75, fig.width = 6, out.width = "80%", fig.align = "center", message=FALSE}
# get_post_measures(post_optmatch_jointVIP, smd = 'cross-sample')

summary(post_optmatch_jointVIP,
        use_abs = FALSE,
        bias_tol = 0.01,
        post_bias_tol = 0.001)

print(post_optmatch_jointVIP,
      bias_tol = 0.001)

plot(post_optmatch_jointVIP, 
     plot_title = "Post-match jointVIP using rcbalance matching",
     smd = 'cross-sample',
     use_abs = FALSE,
     add_post_labs = TRUE,
     post_label_cut_bias = 0.001)
```

All of the options from above can be used; below will only address additional parameters or function outputs.

* `get_post_measures()` takes in a post_jointVIP object and `smd` specifications for measures from the post-matched results. 
* `summary()` function adds text output of post-match results in addition to pre-match results. The `post_bias_tol` specifically is a `summary()` parameter that outputs post_bias tolerance for text comparison rounded to the fourth decimal place.
* `print()` function adds another column to post-match bias.
* `plot()` function includes two new parameters:
  - `add_post_labs` TRUE (default) shows the variable labels post-matching/weighting; FALSE suppresses it.
  - `post_label_cut_bias` numeric number for variable labels; only used when `add_post_labs` is TRUE.