--- 
title: "MicroSynth: A Tutorial" 
author: "Michael Robbins and Steven Davenport" 
date: "`r Sys.Date()`" 
output: rmarkdown::html_vignette 
vignette: >
  %\VignetteIndexEntry{MicroSynth: A Tutorial} 
  %\VignetteEngine{knitr::rmarkdown} 
  %\VignetteEncoding{UTF-8} 
---

```{r load, echo=F, warning=FALSE}
library("microsynth")
library("knitr")
knitr::opts_chunk$set(message = FALSE, warning = FALSE, fig.height=3.1, fig.width=7.2, fig.show = "hold")

# For fast vignette-compiling (needed for CRAN), declare function to save reduced microsynth object.
# (Microsynth chunks with long run-times are set to eval=TRUE to create objects for the first time,
# but these chunks are set to eval=FALSE in the package for fast vignette compiling.)
# This reduces microsynth files to <1MB instead of >10 MB

saveReducedMicrosynth <- function(msobject, filename) {
  
  # Proceed only if object exists (so it is always safe to run the code)
  if(exists(deparse(substitute(msobject)))) {
    
    # Strip data-heavy objects and save
    msobject$w$Weights <- NULL
    msobject$w$Intervention <- NULL
    saveRDS(msobject, filename)
  }
}
```

## Introduction to Synthetic Controls

Synthetic controls are a generalization of the difference-in-difference approach
(Abadie and Gardeazabal, 2003; Abadie et al, 2010). Difference-in-difference
methods often require the researcher to manually identify a control case,
against which the treatment will be compared, on the basis of apparent
similarity before the intervention and the plausibility that identical secular
trends affect both the treatment and control equally after the intervention.
Instead, the synthetic control method offers a formalized and more rigorous
method for identifying comparison cases, by constructing a "synthetic" control
unit that represents a weighted combination of many untreated cases. Weights are
calculated in order to maximize the similarity between the synthetic control and
the treatment unit in terms of specified "matching" variables. By matching on
the observable characteristics between treatment and control, the method may
also do a better job of matching on the unobservable characteristics (though by
nature this cannot be verified).

The advantages over the general difference-in-difference approach are several:
a) the observable similarity of control and treatment cases is maximized, and
perhaps also similarity of unobservables, strengthening the assumptions (e.g.,
equal secular trends) inherent to the difference-in-difference approach; b) the
method is feasible even when there exists no single untreated case adequately
similar to the treatment case; and c) researchers can point to a formal and
objective approach to the selection of controls, rather than having to justify
ad hoc decisions which could potentially create the appearance of the researcher
having his thumb on the scale.

Generally, synthetic controls have been applied in the context of a single
treatment case with a limited number (e.g., several dozens) of untreated cases
for comparison. The `Synth` package has been developed for R and designed for
this type of application. But the relative dearth of treatment and comparison
data in such settings complicates efforts to a) develop a synthetic control that
matches the treatment case, b) precisely estimate the effect of treatment, c)
gauge the significance of that effect, and d) jointly incorporate multiple
outcome variables.

This package is developed to address those limitations, by incorporating
high-dimensional, micro-level data into the synthetic controls framework.
Therefore, in addition to what Synth provides, microsynth offers several
advantages and new tools:

* With the advantage of a large number of smaller-scale observations, microsynth
is often better able to calculate weights that provide exact matches between
treatment and synthetic control units (on all variables passed to `match.out`
[for time-variant variables] and `match.covar` [for time-variant variables]).
This bolsters the conceptual framework behind the synthetic control method.

* To generate an additional measure for significance, microsynth can generate 
hundreds or thousands of placebo treatment units using random permutations of
the control units (e.g., with `perm = 250` and `jack = TRUE`). This allows
estimated effects from the actual treatment unit to be compared to effects for
the placebo treatment units, after standardization (if `use.survey = TRUE`),
generating a new variance estimator and p-value. The sampling distribution of
the effects from placebo treatment units is plotted visually, along with
Synth-style plots comparing observed outcomes in the treatment and synthetic
control units over time.

* An omnibus statistic is calculated to assess the statistical significance
across multiple variables (i.e., those set to `omnibus.var`), as may be desired
in scenarios with limited power where several outcome variables.

* Results may be estimated across multiple follow-up periods (by passing a
vector to `end.post`).

* Matching variables may be specified flexibly. Time-variant variables may be
aggregated across multiple time periods before matching (by passing a list to
`match.out` or passing a value to `period`), helping to reduce variable
sparseness and improve the likelihood of a satisfactory match.

* microsynth provides parameters to assist users in finding feasible models when
a plethora of matching variables and a scarcity of data make the calculation of
satisfactory weights difficult. Users may call `check.feas` or `use.backup` to
call on more computationally-intensive methods to calculate weights.
Alternately, difficult-to-match variables may be passed to
`match.out.min`/`match.covar.min` as to seek weights that deliver the
best-possible but not necessarily exact match on those variables.

* microsynth is also backwards compatible, i.e., it can be deployed on the
Synth-like case of a single treatment with a limited number of untreated cases,
although the relative dearth of data should be expected to decrease matching
performance and limit the usefulness of the features discussed above (see
Example 8).

## An example: Using microsynth to evaluate a Drug Market Intervention

For this example we will use \code{\link{seattledmi}} to evaluate a Drug Market 
Intervention using the "seattledmi" dataset provided with the microsynth
package. The intervention was applied to 39 blocks, which represent the
treatment; the remaining 9,603 Seattle blocks are potential comparison units
from which the synthetic control may be constructed. Data are available for
block-level Census demographics and incidences of crime reported by the Seattle
Police Department.

```{r ex0_load} 
colnames(seattledmi)
set.seed(99199)
```

We would like to detect whether the program was effective at reducing the
incidence of crime in those neighborhoods where the intervention was applied.
Before beginning examples, we will specify the mandatory minimum parameters
pursuant to the dataset and our basic research design.

#### Setting ID columns

The bedrock of the synthetic controls research design (like any 
difference-in-difference method) involves comparing observations between 
treatment (i.e., "intervention") areas versus control areas, with observations 
for each unit over a certain period of time. Therefore microsynth requires we 
identify the `idvar`, `timevar`, and `intvar` columns.

In this case, we are provided with Census block-level observation units (`idvar
= "ID"`) and quarterly observations (`timevar = "time"`), along with a binary
variable with 0 for all untreated groups and the treated groups during the
pre-intervention period and a 1 for treated groups at the time of intervention 
and later (`intvar = "Intervention"`).

#### Setting time parameters

Next, the user can specify parameters relating to the beginning of the 
pre-intervention data (`start.pre`), the last time period of the pre-intervention 
period (`end.pre`), 
and the time(s) through which post-intervention effects ought to be estimated
(`end.post`). For all observations up to and including `end.pre`, outcome
variables and covariates will be used to match treatment and control. (If the
data is formatted such that 0s are assigned to all `end.pre` observations for
the control units *and* treatment units pre-intervention, and 1s assigned only
to treatment units post-intervention, then `end.pre` will by default be
automatically set appropriately, such that `end.pre` will equal the last period
of pre-intervention data.)

In this case, our study period begins at the first quarter of data available in
the dataset; the intervention occurs after 12 quarters of pre-intervention data
(`end.pre = 12`); and our study period continues for four quarters of 
post-intervention data (`end.post = 16`). With this dataset, `end.post` could 
also be left unassigned and would be automatically set to the latest observation
in the data; likewise, we can set `end.pre = NULL`, as we
expect the program's effects not to occur instantaneously, the `intvar` column
is adequately formatted to allow microsynth to detect the intervention time
automatically. Note that `start.pre` will default to the earliest time in the
dataset.

#### Setting outcome variables and covariates and related parameters

The last group of `microsynth`'s mandatory parameters relate to declaring 
outcome variables and covariates. Both outcomes and covariates will be used to 
match treatment units to synthetic controls during the pre-intervention period. 
The key difference between outcomes and covariates is that outcome variables are
required to be time-variant, and covariates to be time-invariant (constant 
overtime).

For this study, we would like to estimate the effect of the DMI on rates of 
crime. Specifically we are interested in the effects on four types of incidences
of crime: felony arrests, misdemeanor arrests, drug arrests, and any criminal
arrest. Passing these variables to `match.out` instructs microsynth to calculate
weights that provide exact matches on these variables; assigning `result.var =
match.out` identifies them as outcome variables for which we would like effects
estimated; `omnibus.var` will include them in the omnibus statistic. After the 
`microsynth` object is created, we can plot results with \code{plot.microsynth} 
with the argument set to `plot.var = match.out` to indicate variables to appear 
on plots.  

Treatment and synthetic control will also be matched on block-level Census 
demographic data: each block's population, black residents, hispanic residents,
males aged 15-21, the number of households, the number of families per
household, the number of female-led households, the number of households that
are renters, and the number of vacant houses. As these variables are all
time-invariant, they will be set to `"match.covar"`.

```{r ex0_variables}
cov.var <- c("TotalPop", "BLACK", "HISPANIC", "Males_1521", "HOUSEHOLDS", 
             "FAMILYHOUS", "FEMALE_HOU", "RENTER_HOU", "VACANT_HOU")
match.out <- c("i_felony", "i_misdemea", "i_drugs", "any_crime")
```

#### An aside: advanced methods for setting matching variables

Exact matches are not always possible, especially for variables that are sparse
(i.e., few non-zero values), containing little variation, or for which the
treatment units have values outside of the range of observations from the
un-treated units. In these cases, variables may be moved from
`match.out`/`match.covar` to `match.out.min`/`match.out.covar` as to minimize
the distance between treatment and synthetic control on those variables rather
than find exact matches. Alternately, a value may be set to `period` to
aggregate all variable names in `match.out`/`match.covar` under the same regular
time duration; or, to set aggregation instructions with more detail,
`match.out`/`match.covar` may receive a list with detailed parameters.

microsynth() provides several different ways to address this problem. A variable
can be treated such that the distance between treatment and synthetic control is
minimized, even if a distance of zero is infeasible, by listing it under 
`match.out.min` (for time-variant outcome variables) or `match.covar.min` (for 
time-invariant variables). In this case, `match.out`, `match.out.min`, 
`match.covar`, and `match.covar.min` may each be vectors of variable names. 
There ought not be any overlap: each variable should appear in only one 
argument.

Another potential response is to aggregate the variable across multiple time 
periods. `match.out`, `match.out.min`, `match.covar`, and `match.covar.min` all 
behave similarly in this manner. Rather than being passed a vector of variable 
names, each may receive a list; each element of the list is a vector 
corresponding to the time units across which each variable should be aggregated 
before matching, with each element named equal to the variable name. In this 
case, the element vectors represent the duration during which the variable 
should be aggregated, counting backwards from the intervention time.

Combining these approaches, if `match.covar.min = list("Y1" = c(1, 3, 3))`, then
the variable "Y1" will be used to match treatment to synthetic control at the 
time of the intervention (*t*), the sum of values of "Y1" across *t-1* to *t-3*,
and the sum across *t-4* to *t-6*.

If the dataset contains both time-variant outcome variables *and* time-variant
predictor variables (i.e., belonging on the RHS of a regression rather than the
LHS), then both 1) `match.out` or `match.out.min` and 2) `result.var` must be
specified. `match.out` or `match.out.min` should include all time-variant
variables used for matching, whether they are true outcomes or predictors;
`result.var` should specify only the subset of those that are outcomes (for
which estimated effects will be calculated).

Note: in some cases, the term "outcome variable" may be a misnomer. Though by
default all time-variant variables assigned to `match.out` and `match.out.min`
will be used to estimate the program effect (`result.var = T`), this doesn't
have to be the case. `result.var` may be set to a vector of variable names
representing a subset of the outcome variables entered into `match.out` and
`match.out.min`; this is useful if the dataset includes time-variant variables
that we'd like to use to match treatment and synthetic control but which we do
not want to use for the purposes of evaluating the program effect.

#### Other parameters that may be set

microsynth allows for extensive configuration, for instance, relating to the 
mechanics of calculating weights, plotting options, and the calculation of 
variance estimators through permutation tests and jackknife replication groups. 
These aspects will be discussed in the later examples below.

## Basic estimation and plotting

### Example 1:  Barebones results

In this minimal example, we will calculate and display results in the simplest 
way possible. This includes:

* calculating weights to match treatment to synthetic control on the variables
specified 
* calculating a variance estimate based on the linearization method
only, and based on that, running a one-sided lower significance test
(`test=lower`) 
* calculating an omnibus statistic to test joint significance
across all outcome variables (`result.var = match.out`) 
* creating a plot and displaying it as output (`plot_microsynth()`); to save to file, 
specify a .csv or .xlsx as the `file` argument in `plot_microsynth()`. 
* estimating results but not saving it to file (`result.file=NULL`); instead, 
results can be viewed by inspecting the `microsynth` object.

As microsynth runs, it will display output relating to the calculation of
weights, the matching of treatment to synthetic control, and the calculation of
survey statistics (e.g., the variance estimator). The first table to display 
summarizes the matching properties after applying the main weights. It shows
three columns: 1) characteristics of the treated areas for the time-variant and
time-invariant variables, 2) characteristics of the synthetic control, and 3)
characteristics of the entire population. Because this example is successful in
creating a matching synthetic control, the first column and the second column
will be nearly equal.

Note that `match.out = match.out`, `result.var = match.out`, and 
`omnibus.var =match.out`. This means that the outcome 
variables that we declared as `match.out` will all be matched on
exactly, will be used to report results, and will feature in the omnibus p-value. 
`match.covar` indicates that the specified covariates will also be matched on 
exactly. (By setting `result.var = match.out`, there is provided one chart per 
time-variant outcome variable for which we calculate results.) 

```{r ex1, eval = TRUE, echo=TRUE} 
sea1 <- microsynth(seattledmi, 
                   idvar="ID", timevar="time", intvar="Intervention", 
                   start.pre=1, end.pre=12, end.post=16, 
                   match.out=match.out, match.covar=cov.var, 
                   result.var=match.out, omnibus.var=match.out,
                   test="lower",
                   n.cores = min(parallel::detectCores(), 2))
sea1
summary(sea1)
```

After the call to microsynth has been made, the \code{print} function displays 
a brief description of the parameters used in the \code{microsynth} call along 
with the results (if available). Also, the \code{summary} function
can be used to display a summary of the matching between treatment, synthetic
control, and the population, and the results table. Below we reproduce the
results that were saved to file in the previous example, with one row for each
of the variables entered to `result.var`, which have each been used to calculate
an omnibus statistic (`omnibus.var = TRUE`), and two columns corresponding to
the confidence interval (`confidence`) resulting from the variance estimator
generated by linearization. The first row of the output (`16`) refers to the
maximum post-intervention time used to compile results (`end.post`).

Note that the p-value of the omnibus statistic is smaller than any of the
individual outcome variables.

```{r ex1b, eval = TRUE}
plot_microsynth(sea1)
```

Above are produced plots under default settings. By default, if no other 
arguments are declared in the  call to `plot_microsynth()`, the plots will include one row 
for each variable passed to `result.var` in the original \code{microsynth} call. 
Likewise, values for the duration of the pre- and post-intervention periods 
(i.e. `start.pre`, `end.pre`, `end.post`) can also be automatically detected from 
the original \code{microsynth} object if not specified manually.

The first plot column compares
the observed outcomes among the treatment, synthetic control, and population
during the pre-intervention and post-intervention periods. Outcomes are scaled
by default (`scale.var = "Intercept"`) to the number of treatment units, to
facilitate comparison. The dotted red line indicates the last time period of the
pre-intervention period (`end.pre`). Because matching was successful, the treatment 
and synthetic control lines track closely during the pre-intervention period; their
divergence during the post-intervention period represents an estimate of the
causal effect of the program (i.e., the red synthetic control line is treated as
the counterfactual to the black treatment line). This difference is charted on
the right plot column.

### Example 2: Adding permutations and jackknife

In addition to using linearization to calculate a variance estimate, microsynth
can approximate the estimator's sampling distribution by generating permuted
placebo groups. When dealing with a large number of treatment and control units,
there is a near infinite number of potential permutations. A default (`perm =
250`) is set as permutations are somewhat computationally intensive.

For each placebo, weights are calculated to match the placebo treatment to a new
synthetic control, and an effect is estimated, generating a sampling
distribution and an corresponding p-value. Because the actual treatment area is
a non-random group of treatment units, while the placebo treatments are random
groups, by default microsynth will standardized the placebo treatment effects to
filter out potential design effects (`use.survey = TRUE`).

We will also generate jackknife replication groups, using as many groups as the 
lesser of the number of cases in the treatment group and the number of cases in 
the control group (`jack = TRUE`).

The output from this call to microsynth will be largely identical to the
previous call, except for the appearance of the right column of plots. Now that
permutation groups have been generated, the estimated effect under each of the
placebo treatments (gray lines) will be shown along with the estimated effect of
the real treatment. This displays the estimated treatment effect in the context
of the estimator's sampling distribution.

```{r ex2, eval = FALSE, echo=TRUE} 
sea2 <- microsynth(seattledmi, 
                   idvar="ID", timevar="time", intvar="Intervention", 
                   start.pre=1, end.pre=12, end.post=c(14, 16),
                   match.out=match.out, match.covar=cov.var, 
                   result.var=match.out, omnibus.var=match.out, 
                   test="lower", 
                   perm=250, jack=TRUE,
                   n.cores = min(parallel::detectCores(), 2))
```

Calling \code{summary} or \code{print} identifies other new changes to the 
results. Columns are added to display the confidence intervals 
(`confidence = 0.9`) and p-values (`test = "lower"`) from the jackknife and 
permutation tests. Note that `end.post=c(14,16)` in the code above, instructing 
results to be calculated for two different follow-up periods, ending at t=14 
and t=16 respectively. One results table will be calculated for each.

```{r, eval = TRUE, echo = FALSE}
saveReducedMicrosynth(sea2, "../inst/extdata/sea2.rds")
sea2 <- readRDS("../inst/extdata/sea2.rds")
sea2
```

```{r}
plot_microsynth(sea2)
```

### Example 3:  Model feasibility while matching on more variables

Now, we will add additional outcome variables and also use them to match the 
treatment area to the synthetic control units. We do this at the risk of 
model feasibility, as each variable introduces another constraint. 

```{r ex3_vars}
match.out <- c("i_robbery", "i_aggassau", "i_burglary", "i_larceny", "i_felony", 
               "i_misdemea", "i_drugsale", "i_drugposs", "any_crime")
```

In the example below, without overriding the default weight parameters,
microsynth will fail to find a feasible model. Weights would not be calculated,
and no results or plots will be generated. But we may still attempt to estimate
the model by setting `check.feas = TRUE` and `use.backup = TRUE`. This will
check for feasibility, and if needed, invoke the computationally intensive
`LowRankQP` package to calculate the weights.

Note that the additional matching variables introduce further constraints to the
calculation of weights, lengthening the output. Moreover, the introduction of
additional time-variant matching variables results in a poorer match on each,
shown in the left column of plots, where red and dashed-black lines no longer
track perfectly in the pre-intervention period.

Also note that we need not specify values for `start.pre`, `end.pre`, and
`end.post`, as the default settings align with our intentions. Likewise, we can
trust the default values for specifying the variables for the omnibus statistic
(`omnibus.var=result.var`) by default. This way we specify the minimum
number of non-default arguments.

```{r ex3, eval = FALSE, echo=TRUE} 
sea3 <- microsynth(seattledmi, 
                   idvar="ID", timevar="time", intvar="Intervention", 
                   end.pre=12,
                   match.out=match.out, match.covar=cov.var, 
                   result.var=match.out, perm=250, jack=0, 
                   test="lower", check.feas=TRUE, use.backup = TRUE,
                   n.cores = min(parallel::detectCores(), 2))
```

```{r, eval = TRUE, echo = FALSE}
saveReducedMicrosynth(sea3, "../inst/extdata/sea3.rds")
sea3 <- readRDS("../inst/extdata/sea3.rds")
summary(sea3)
```

The results file now shows additional rows for the new outcome variables, and 
these are also displayed in plots.

```{r}
plot_microsynth(sea3)
```


### Example 4: Provide match.out as a list (time-aggregating matching variables)

Another potential response is to aggregate sparse variables across multiple time
periods before using them to match to synthetic control. Rather than passing a
vector of variable names to `match.out` and/or `match.out.min`, the user may
pass a list; each element of the list is a vector corresponding to the time
units across which each variable should be aggregated before matching, with each
element named equal to the variable name. In this case, the element vectors
represent the duration during which the variable should be aggregated, counting
backwards from the intervention time.

In our dataset, incidences of drug sale are relatively scarce, and so are
aggregated every four months before matching (`'i_drugsale'=rep(4,3)`);
meanwhile, larceny is relatively common and so is matched un-aggregated
(`'i_larceny'=rep(1, 12)'`).

Each vector indicates the time-durations for aggregation, starting from the
period directly prior to intervention and finishing with the earliest
observations in the dataset. Because our `end.pre = 12`, to use the full
dataset, each vector-element in the list should add to 12. Sums less than 12
would ignore portions of the pre-intervention data; sums more than 12 will throw
an error, calling on more pre-intervention data than are available.

The aggregated variables now appear in the main weights summary table, e.g.,
"i_robbery.11.12", representing the sum of reported robberies in time periods 11
and 12.

```{r ex4, eval = FALSE, echo=TRUE} 
match.out <- list( 'i_robbery'=rep(2, 6), 
                   'i_aggassau'=rep(2, 6), 
                   'i_burglary'=rep(1, 12), 
                   'i_larceny'=rep(1, 12), 
                   'i_felony'=rep(2, 6), 
                   'i_misdemea'=rep(2, 6), 
                   'i_drugsale'=rep(4, 3), 
                   'i_drugposs'=rep(4, 3), 
                   'any_crime'=rep(1, 12))

sea4 <- microsynth(seattledmi, 
                   idvar="ID", timevar="time",intvar="Intervention",
                   match.out=match.out, match.covar=cov.var, 
                   result.var=names(match.out), omnibus.var=names(match.out), 
                   end.pre=12,
                   perm=250, jack = TRUE,
                   test="lower",
                   n.cores = min(parallel::detectCores(), 2))

```

```{r ex4load, eval = TRUE, echo = FALSE}
saveReducedMicrosynth(sea4, "../inst/extdata/sea4.rds")
sea4 <- readRDS("../inst/extdata/sea4.rds")
summary(sea4)
```

The aggregation of the outcome variables over time is not directly reflected in
the results table, though the estimates have changed as a consequence of the
aggregation.

```{r ex4plot}
plot_microsynth(sea4)
```

## Partial calls to microsynth

The following examples will demonstrate that microsynth can be used to calculate
weights and variance estimators, produce results, and display charts separately,
one at a time. This can be useful given the time-intensive nature of calculating
weights and generating permutation groups. It allows for weights to be saved
once calculated and for plots and results to be reproduced iteratively without
repeating the matching process.

### Example 5:  Weights only

This setting represses reporting of results by setting
`result.var` = FALSE. Only weights will be calculated.
Note that settings for permutation groups (`perm`) and jackknife replication
groups (`jack`) are considered when calculating weights, and then will not be
referred to again in calls that only produce plots or display results.

```{r ex5, eval = TRUE, echo=TRUE, results='hide'}
match.out <- c("i_felony", "i_misdemea", "i_drugs", "any_crime")

sea5 <- microsynth(seattledmi, 
                   idvar="ID", timevar="time",intvar="Intervention",
                   end.pre=12,
                   match.out=match.out, match.covar=cov.var,
                   result.var=FALSE, perm=0, jack=FALSE,
                   n.cores = min(parallel::detectCores(), 2))

summary(sea5)
```

Appropriately, the table summarizing the main weights may be viewed, but results
are unavailable.

### Example 6:  Results only

If weights have already been calculated, then microsynth() can also be
configured to only reproduce results. Results are displayed for all outcome
variables used for exact matches (`result.var = match.out`). Further, results 
can now be calculated for any single or group of follow-up periods 
(`end.post=c(14,16)`) without having to re-calculate weights.

```{r ex6, eval = FALSE, echo=TRUE}
sea6 <- microsynth(seattledmi, 
                   idvar="ID", timevar="time", intvar="Intervention",
                   end.pre=12, end.post=c(14, 16),
                   result.var=match.out,
                   test="lower", 
                   w=sea5$w,
                   n.cores = min(parallel::detectCores(), 2))

sea6
```

For each follow-up period, a separate results table is provided. If saving to
file, this requires the file be saved as an XLSX rather than a CSV; each table
will be saved to a different XLSX tab.

```{r, eval = TRUE, echo = FALSE}
saveReducedMicrosynth(sea6, "../inst/extdata/sea6.rds")
sea6 <- readRDS("../inst/extdata/sea6.rds")
sea6
```

### Example 7:  Plots only

If weights have already been calculated, then plot_microsynth() can be used to 
display plots from the original microsynth object. In this case, we limit plots 
to a subset of time-variant variables (`plot.var=match.out[1:2]`).

```{r ex7, eval = TRUE, echo=TRUE}
plot_microsynth(sea6, plot.var=match.out[1:2])
```

## Alternative applications for microsynth

### Example 8: Apply microsynth in the traditional setting of Synth

One of the major differences between Synth and microsynth is that Synth requires
that the treatment is confined to a single unit of observation, and to
estimating the effect on a single outcome variable; in contrast, microsynth
anticipates that treatment has been applied to multiple areas and can estimate
effects with respect to multiple outcomes. But microsynth can also be applied to
this simpler case.

To demonstrate, first we will create a reduced dataset with 1 treatment block
and 100 control blocks.

```{r ex8prep, eval = TRUE, echo=TRUE}
set.seed(86872)
ids.t <- names(table(seattledmi$ID[seattledmi$Intervention==1]))
ids.c <- names(table(seattledmi$ID[seattledmi$Intervention==0]))
ids.synth <- c(sample(ids.t, 1), sample(ids.c, 100))
seattledmi.one <- seattledmi[is.element(seattledmi$ID, as.numeric(ids.synth)), ]
```

Then microsynth can be run on the dataset with just a single variable passed out
`match.out`, so that effect is estimated for only one variable, as with Synth.
Due to the small size of the reduced dataset, model feasibility may be an issue
(so we set `use.backup = TRUE` and `check.feas = TRUE`) and variance estimators
will be less reliable.

```{r ex8, eval = FALSE, echo=TRUE}
sea8 <- microsynth(seattledmi.one, 
                   idvar="ID", timevar="time", intvar="Intervention", 
                   match.out=match.out[4], match.covar=cov.var, 
                   result.var=match.out[4],
                   test="lower", perm=250, jack=FALSE, 
                   check.feas=TRUE, use.backup=TRUE,
                   n.cores = min(parallel::detectCores(), 2))
```

```{r, eval = TRUE, echo = FALSE}
saveReducedMicrosynth(sea8, "../inst/extdata/sea8.rds")
sea8 <- readRDS("../inst/extdata/sea8.rds")
plot_microsynth(sea8)
summary(sea8)
```


### Example 9: Cross-sectional data for propensity score-type weights

microsynth() may also be used to calculate propensity score-type weights. We
will demonstrate this by transforming our panel data into a cross-sectional
dataset with data corresponding to our final observed period. 

```{r ex9, eval = TRUE, echo=TRUE}
seattledmi.cross <- seattledmi[seattledmi$time==16, colnames(seattledmi)!="time"]
```

By setting `match.out = FALSE`, no outcome variables will be used to calculate
weights, only (time-invariant) covariates (`match.covar`). No outcome-reporting 
variables (`result.var = NULL`) need be
reported. Plots are therefore inappropriate, but results (i.e., a summary of
weights only) can be saved to file or viewed using `summary`.

```{r ex9results, eval = FALSE, echo=TRUE}
sea9 <- microsynth(seattledmi.cross, 
                   idvar="ID", intvar="Intervention",
                   match.out=FALSE, match.covar=cov.var,
                   result.var=NULL, 
                   test="lower",
                   perm=250, jack=TRUE,
                   n.cores = min(parallel::detectCores(), 2))
```

```{r, eval = TRUE, echo = FALSE}
saveReducedMicrosynth(sea9, "../inst/extdata/sea9.rds")
sea9 <- readRDS("../inst/extdata/sea9.rds")
sea9
```


## References
Abadie A, Gardeazabal J (2003). “The economic costs of conflict: A case study of
the Basque Country.” \emph{American Economic Review}, pp. 113-132.

Abadie A, Diamond A, Hainmueller J (2010). “Synthetic control methods for
comparative case studies: Estimating the effect of California’s tobacco control
program.” \emph{Journal of the American Statistical Association}, 105(490),
493-505.