---
title: "Introduction to ChangePointTaylor"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Introduction-to-ChangePointTaylor}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

The ChangePointTaylor package is a simple R implementation of the change-in-mean detection [method](https://variation.com/wp-content/uploads/change-point-analyzer/change-point-analysis-a-powerful-new-tool-for-detecting-changes.pdf) developed by Wayne Taylor and used in his [Change Point Analyzer](https://variation.com/product/change-point-analyzer/) software. The package applies the 'MSE' change point calculation recursively to identify candidate change points. The candidates are then re-estimated, and Taylor's backwards elimination process is used to arrive at the final set of change points. Many of the underlying functions are written in C++ for improved performance.
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
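
To make the 'MSE' change point calculation mentioned above more concrete, here is a minimal base R sketch of a single-split estimator: it tries every split of the series and keeps the one that minimises the summed squared deviations from the two segment means. This illustrates the idea only. It is not the package's internal (C++) code, and the function name `mse_change_point()`, the returned element names, and the `toy` series are made up for this example.

```{r}
# Illustrative sketch only: NOT the package's internal implementation.
# Estimate a single change in mean by minimising the pooled squared error.
mse_change_point <- function(x) {
  n <- length(x)
  stopifnot(is.numeric(x), n >= 2)
  # Candidate split m: x[1:m] is the "before" segment, x[(m + 1):n] the "after".
  splits <- seq_len(n - 1)
  sse <- vapply(splits, function(m) {
    before <- x[seq_len(m)]
    after <- x[(m + 1):n]
    sum((before - mean(before))^2) + sum((after - mean(after))^2)
  }, numeric(1))
  m_best <- splits[which.min(sse)]
  # Report the index of the first observation after the change
  # (hypothetical element names, chosen for this sketch).
  list(change_index = m_best + 1, sse = sse[m_best])
}

# Toy series (invented for illustration) whose mean shifts at index 6.
toy <- c(10.1, 9.8, 10.3, 9.9, 10.0, 15.2, 14.9, 15.1, 15.0, 14.8)
mse_change_point(toy)
```

Applying a function like this to the whole series, and then again to the segments on either side of each detected change, is one way to picture the recursive search for candidate change points.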
## Installation

You can install the released version of ChangePointTaylor from [CRAN](https://CRAN.R-project.org) with:

``` r
install.packages("ChangePointTaylor")
```

## Example

Load the package and the other libraries needed for this example.
```{r setup}
library(ChangePointTaylor)
library(dplyr)
library(ggplot2)
```

View the example dataset of US trade deficit data from January 1987 to December 1988.
```{r paged.print=TRUE}
US_Trade_Deficit
```
Plot the data.
```{r fig.height=4, fig.width=7}

trade_deficit_plot <- US_Trade_Deficit %>%
  mutate(date = as.Date(paste(date, "1"), format = "%b '%y %d")) %>%
  ggplot(aes(x = date, y = deficit_billions, group = 1)) +
  geom_line() +
  geom_point() +
  theme_bw() +
  scale_x_date(date_breaks = "1 month", date_labels = "%b '%y") +
  theme(
    axis.text.x = element_text(angle = 45, vjust = 1, hjust =1),
    axis.title.x = element_blank()
  ) +
  ggtitle("US Trade Deficit: 1987-1988")
  
trade_deficit_plot
```


In its simplest form, the `change_point_analyzer()` function takes a numeric vector and returns the identified change points. In this form, however, the output identifies each change only by its index in the original vector.
```{r}
change_point_analyzer(US_Trade_Deficit$deficit_billions)
```

When a vector of labels, the same length as the `x` values, is supplied to the `label` argument, those labels will be displayed in the output dataframe.
```{r}
change_points <- change_point_analyzer(US_Trade_Deficit$deficit_billions, label = US_Trade_Deficit$date)
change_points
```

Plot the change points we identified.
```{r fig.height=4, fig.width=7}
trade_deficit_plot +
  geom_vline(xintercept = as.Date(paste(change_points$label, "1"), format = "%b '%y %d"), color = "steelblue", linetype = "dashed", size = 1.3)
```



The number of bootstraps can be controlled with the `n_bootstraps` argument. Increasing it reduces the stochastic differences between repeated function calls, at the expense of execution speed. The benchmark below compares 1,000 and 10,000 bootstraps.
```{r message=FALSE, warning=FALSE}
# Compare run time for two bootstrap counts. Results differ between runs
# because of the bootstrap, so the equality check on outputs is disabled.
bench::mark(
  change_point_analyzer(US_Trade_Deficit$deficit_billions, label = US_Trade_Deficit$date, n_bootstraps = 1000),
  change_point_analyzer(US_Trade_Deficit$deficit_billions, label = US_Trade_Deficit$date, n_bootstraps = 10000),
  check = FALSE,
  min_iterations = 2,
  max_iterations = 5
) %>%
  mutate(expression = c("1000 Bootstraps", "10000 Bootstraps")) %>%
  select(expression:mem_alloc)
```
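
To see why more bootstraps give more stable results, the sketch below assumes the CUSUM-range bootstrap described in Taylor's paper linked above: the series is randomly reordered many times, and the confidence level is the share of reorderings whose CUSUM range is smaller than the observed one. Whether the package computes confidence in exactly this way is an assumption, and `cusum_range()`, `bootstrap_confidence()`, and the `noisy` series are hypothetical names and data used only for illustration.

```{r}
# Illustrative sketch only: base R, NOT the package's implementation.
# Range of the cumulative sums of deviations from the overall mean.
cusum_range <- function(x) {
  s <- c(0, cumsum(x - mean(x)))
  max(s) - min(s)
}

# Share of random reorderings whose CUSUM range falls below the observed one.
bootstrap_confidence <- function(x, n_boot = 1000) {
  observed <- cusum_range(x)
  # sample(x) returns a random reordering (sampling without replacement).
  shuffled <- replicate(n_boot, cusum_range(sample(x)))
  mean(shuffled < observed)
}

# Invented series with a weak upward drift, so the estimate is not trivially 0 or 1.
noisy <- c(5, 3, 6, 4, 5, 2, 6, 4, 6, 5, 7, 4, 6, 5, 7, 6)

# Repeat the estimate 20 times at each bootstrap count and compare the spread.
few <- replicate(20, bootstrap_confidence(noisy, n_boot = 50))
many <- replicate(20, bootstrap_confidence(noisy, n_boot = 2000))
rbind(n_boot_50 = range(few), n_boot_2000 = range(many))
```

Each estimate is a proportion computed from `n_boot` random draws, so its run-to-run spread should shrink roughly with the square root of the number of bootstraps, while the run time grows in proportion to it.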

The user can also adjust the minimum confidence level a change point must reach to become an initial candidate (`min_candidate_conf`) and the minimum confidence level required for inclusion in the final table of change points (`min_tbl_conf`).
```{r}
change_point_analyzer(US_Trade_Deficit$deficit_billions, label = US_Trade_Deficit$date, min_candidate_conf = 0.66,  min_tbl_conf = 0.95)
```