---
title: "User Guide"
author:
- name: 'Dan Knight'
  email: danknight@mednet.ucla.edu
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: > 
  %\VignetteIndexEntry{User Guide}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

```{r, echo=F, message=F}
library(CancerEvolutionVisualization);
```

# Introduction
CancerEvolutionVisualization (CEV) creates publication-quality phylogenetic tree plots. For simple plots,
this package will handle most settings right out of the box. However, more complex
plots may require some trial and error to achieve the right arrangement of nodes
and branches.

This guide will show best practices for creating plots, as well as examples
of common use cases and tips for refining plot settings.


# Input Data
There are many methods for determining subpopulations within genomic data, and 
you are free to use whichever method you prefer for a given dataset. This
package handles only visualization, not analysis. Therefore, data must be prepared
and formatted before being passed to any CEV functions.

## Tree Dataframe
This is the primary source of data for a plot. It defines the tree's structure
of parent and child nodes. It also provides information about the number of
mutations at each node.


### Ex. 1: Parent Data
```{r echo=F}
load('data/input.examples.Rda');
```

The simplest input format is a column containing the parent node of each
individual node. (A node will only have one parent.) The root node will not have
a parent, so a value of `NA` is used.

```{r echo=F}
parent.only <- data.frame(tree.input[, 'parent', drop = FALSE]);

knitr::kable(
    parent.only,
    col.names = c(colnames(tree.input)[1]),
    row.names = TRUE
    );
```

```{r, fig.show='hide'}
parent.only.tree <- SRCGrob(parent.only);
```

```{r echo=F}
grid.draw(parent.only.tree);
```
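
As a rough sketch of how such a parent-only input could be built by hand, the following uses only base R. The four-node tree is hypothetical, and it assumes (as the row names in the table above suggest) that each row name serves as the node ID that the `parent` column refers to.

```r
# Hypothetical four-node tree: node 1 is the root, nodes 2 and 3 are its
# children, and node 4 is a child of node 2.
example.parents <- data.frame(
    parent = c(NA, 1, 1, 2),
    row.names = c('1', '2', '3', '4')
    );

# Exactly one node (the root) should have no parent
n.roots <- sum(is.na(example.parents$parent));

# Every non-root parent should refer to an existing node ID
known.parents <- all(
    as.character(na.omit(example.parents$parent)) %in% rownames(example.parents)
    );
```

Checking `n.roots` and `known.parents` before plotting can catch a mistyped parent ID early.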

### Ex. 2: Branch Lengths

It's common to associate branch lengths with the values of a particular variable 
(for example, PGA or SNVs). Up to two branch lengths can be specified. Including a 
`length1` and/or `length2` column will enable this branch scaling behaviour
and automatically add y-axis ticks and labels.

Multiple length values can be used together. All columns whose names contain `length`
will be used. (For example, `length1` and `snv.length` are both valid.) Multiple 
columns will result in multiple (distinctly coloured) parallel lines. Any branch 
length conflicts will be resolved automatically. For each branch, the next node
will be placed at the end of the longest line.

```{r echo=F}
branch.lengths <- tree.input[, 1:3];

knitr::kable(
    branch.lengths,
    row.names = TRUE
    );
```

```{r, fig.show='hide'}
branch.lengths.tree <- SRCGrob(branch.lengths);
```

```{r, fig.dim=c(5, 3.5)}
grid.draw(branch.lengths.tree);
```
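
To see which columns of a data.frame would count as branch length columns under the naming rule described above, a simple name match can be used. This is only an illustration in base R with hypothetical values, not the package's own implementation.

```r
# Hypothetical tree with two branch length variables
example.lengths <- data.frame(
    parent = c(NA, 1, 1, 2),
    length1 = c(10, 25, 15, 5),
    snv.length = c(100, 300, 150, 50)
    );

# Columns whose names contain the text 'length'
length.columns <- grep(
    'length',
    colnames(example.lengths),
    fixed = TRUE,
    value = TRUE
    );

print(length.columns);
```

Because the match is on the column name, a column such as `parent` is ignored while both `length1` and `snv.length` are picked up.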

### Ex. 3: Styling the Tree

CEV gives the user control over numerous visual aspects of the tree. By specifying optional columns and values in the tree input data.frame, the user can control the colour, width, and line type of individual nodes, node borders, and edges.

#### Optional Style Columns
| Style | Column |
| --- | --- |
| Node Colour | `node.col` |
| Node Label Colour | `node.label.col` |
| Node Border Colour | `border.col` |
| Node Border Width | `border.width` |
| Node Border Line Type | `border.type` |
| | |
| Edge Colour | `edge.col.1`, `edge.col.2` |
| Edge Width | `edge.width.1`, `edge.width.2` |
| Edge Line Type | `edge.type.1`, `edge.type.2` |

Default values replace missing columns and `NA` values, allowing node-by-node and edge-by-edge control as needed. For sparsely defined values (for example, styling only a single edge), it can be convenient to initialize a column with `NA`s, then manually assign values for specific nodes as needed.
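
A minimal sketch of this sparse approach, using a hypothetical four-node tree and assuming that row names act as node IDs:

```r
# Hypothetical four-node tree
styled.tree <- data.frame(
    parent = c(NA, 1, 1, 2),
    row.names = c('1', '2', '3', '4')
    );

# Start with every value missing so that the defaults apply everywhere
styled.tree$node.col <- NA_character_;
styled.tree$edge.type.1 <- NA_character_;

# Override only the rows of interest
styled.tree['3', 'node.col'] <- 'red';
styled.tree['4', 'edge.type.1'] <- 'dashed';
```

Only the rows for nodes 3 and 4 carry explicit values here. Every other row keeps `NA` and so falls back to the default style.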

#### Line Types
Valid values for line type columns are based on lattice's values (with some additions and differences).

| Line Type |
| --- |
| `NA` |
| `'none'` |
| `'solid'` |
| `'dashed'` |
| `'dotted'` |
| `'dotdash'` |
| `'longdash'` |
| `'twodash'` |

#### Styled Tree

```{r echo=F}
node.style <- tree.input[, c(
    'parent', 'length1', 'length2',
    'node.col', 'node.label.col',
    'border.col', 'border.width', 'border.type',
    'edge.col.1', 'edge.type.1',
    'edge.col.2', 'edge.width.2'
    )];

knitr::kable(
    node.style[, !(colnames(node.style) %in% c('parent', 'length1', 'length2'))],
    row.names = TRUE
    );
```

```{r, fig.show='hide'}
node.style.tree <- SRCGrob(node.style);
```

```{r, fig.dim=c(5, 3.5), echo=F}
grid.draw(node.style.tree);
```

### Ex. 4: Showing Cellular Prevalence

A cellular prevalence column (`CP`) can also be added. These values must lie between 0 and 1, and the summed values of a node's children must not exceed the value of their parent node.

```{r echo=F}
CP <- tree.input[, c('parent', 'length1', 'length2', 'CP')];

knitr::kable(
    CP,
    row.names = TRUE
    );
```

```{r, fig.show='hide'}
CP.tree <- SRCGrob(CP);
```

```{r, echo=F}
grid.draw(CP.tree);
```
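
The two constraints on cellular prevalence can be checked before plotting. The helper below, `check.cp`, is hypothetical and not part of CEV. It is a base R sketch that assumes the `CP` column name used in the example above and that row names act as node IDs.

```r
# Hypothetical tree with cellular prevalence values
example.cp <- data.frame(
    parent = c(NA, 1, 1, 2),
    CP = c(1, 0.6, 0.3, 0.4),
    row.names = c('1', '2', '3', '4')
    );

# Hypothetical helper (not part of CEV): TRUE if both constraints hold
check.cp <- function(tree) {
    # Constraint 1: every value lies between 0 and 1
    in.range <- all(tree$CP >= 0 & tree$CP <= 1);

    # Constraint 2: children must not sum to more than their parent
    has.parent <- !is.na(tree$parent);
    child.totals <- tapply(
        tree$CP[has.parent],
        as.character(tree$parent[has.parent]),
        sum
        );
    parent.cp <- tree[names(child.totals), 'CP'];

    # Small tolerance guards against floating point rounding
    within.parent <- all(child.totals <= parent.cp + 1e-8);

    return(in.range && within.parent);
    }

check.cp(example.cp);
```

In this example the children of node 1 sum to 0.9 and the only child of node 2 has a value of 0.4, so both constraints are met.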

## Text Dataframe
This secondary dataframe can be used to specify additional text corresponding to each 
node.

### Ex. 5: Node Text
Each row must include a node ID for the text. Text will be stacked next to the 
specified node.

```{r echo=F}
simple.text.data <- text.input[, 1:2];

knitr::kable(
    simple.text.data,
    col.names = colnames(text.input)[1:2]
    );
```


```{r, fig.show='hide'}
simple.text.tree <- SRCGrob(parent.only, simple.text.data);
```

```{r}
grid.draw(simple.text.tree);
```

### Ex. 6: Specifying Colour and Style
- An optional `col` column can be included to specify the colour of each text entry.
- An optional `fontface` column can be included to make text bold, italic, etc. These values correspond to the standard R `fontface` values.
- `NA` values in these columns default to `black` and `plain`, respectively.

```{r echo=F}
knitr::kable(
    text.input
    );
```

```{r, fig.show='hide'}
full.text.tree <- SRCGrob(parent.only, text.input);
```

```{r}
grid.draw(full.text.tree);
```
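
A text data.frame with sparse styling might be assembled as in the sketch below. The `col` and `fontface` columns follow the description above. The `node` and `name` column names and all values are assumptions made for illustration. The last two lines only mimic the documented defaults so that the effect of `NA` is visible. They are not required before calling `SRCGrob`.

```r
# Hypothetical text data; 'node' and 'name' are assumed column names
example.text <- data.frame(
    node = c(1, 2, 2),
    name = c('GENE1', 'GENE2', 'GENE3'),
    col = c('red', NA, NA),
    fontface = c(NA, 'bold', NA),
    stringsAsFactors = FALSE
    );

# Mimic the documented defaults by filling in the missing values
resolved.text <- example.text;
resolved.text$col[is.na(resolved.text$col)] <- 'black';
resolved.text$fontface[is.na(resolved.text$fontface)] <- 'plain';
```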

# Plot Parameters
The default settings should produce a reasonable baseline plot, but many users will
want more control over their plot. This section will highlight some of the most
common parameters in `SRCGrob`.

## Plot Size

### Ex. 7: Plot Width with Horizontal Padding
Some plots require more or less horizontal padding between the y-axes and the tree
itself. The `horizontal.padding` parameter scales the default padding proportionally.
For example, `horizontal.padding = -0.2` would reduce the padding by 20%.

```{r, fig.show='hide'}
padding.tree <- SRCGrob(
    branch.lengths,
    horizontal.padding = -0.8
    );
```

```{r, fig.dim=c(3.5, 3.5)}
grid.draw(padding.tree);
```
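
The proportional behaviour can be illustrated with simple arithmetic. The default padding value below is hypothetical, and the formula is an assumption based on the description above rather than the package's internal calculation.

```r
# Hypothetical default padding, in arbitrary units
default.padding <- 1;

# Proportional adjustments, as they would be passed to horizontal.padding
adjustments <- c(-0.8, -0.2, 0, 0.5);

# Assumed relationship: padding changes by the given proportion
resulting.padding <- default.padding * (1 + adjustments);
```

Under this assumption, the value of `-0.8` used in the example above leaves 20% of the default padding.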

### Ex. 8: Branch Scaling
Branches are scaled automatically, but users can further scale each branch with
the `scale1` and `scale2` parameters. These values scale each branch proportionally,
so `scale1 = 1.1` would make the first set of branch lengths 10% longer.

```{r, fig.show='hide'}
scaled.tree <- SRCGrob(
    branch.lengths,
    scale1 = 1.5,
    scale2 = 0.5
    );
```

```{r, fig.height=4}
grid.draw(scaled.tree);
```


### Ex. 9: Plot Title
The main title of the plot is referred to as `main` in plot parameters. `main` sets
the title text, `main.cex` sets the font size, and `main.y` is used to move the main
title up if more space is required for the plot.

```{r, fig.show='hide'}
title.tree <- SRCGrob(
    parent.only,
    main = 'Example Plot'
    );
```

```{r, echo=F}
title.tree$vp$y <- unit(0.8, 'npc');
```

```{r, fig.height=3.5}
grid.draw(title.tree);
```

## Y-Axes
A y-axis will be added automatically for each branch length column. The left
axis corresponds to the first branch length column, and the right axis to the
second.

### Ex. 10: Y-Axis
Ticks are placed automatically based on the plot size and the branch lengths. 

### Ex. 11: Axis Title
Axis titles are specified with the `yaxis1.label` and `yaxis2.label` parameters.

```{r, fig.show='hide'}
axis.title.tree <- SRCGrob(
    parent.only,
    yaxis1.label = 'SNVs',
    horizontal.padding = -0.6
    );
```

```{r, echo=F}
axis.title.tree$vp$x <- unit(0.75, 'npc');
```

```{r}
grid.draw(axis.title.tree);
```

### Ex. 12: Axis Tick Placement
The default axis tick positions can be overridden with the `yat` parameter. This
expects a list of vectors, each corresponding to the ticks on a y-axis.

```{r, fig.show='hide'}
yaxis1.ticks <- c(10, 20, 30, 35, 40);
yaxis2.ticks <- c(100, 250, 400);

yat.tree <- SRCGrob(
    branch.lengths,
    yat = list(
        yaxis1.ticks,
        yaxis2.ticks
        ),
    horizontal.padding = -0.4
    );
```

```{r, fig.width=5}
grid.draw(yat.tree);
```
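
Tick vectors do not need to be typed out by hand. As a sketch, base R helpers such as `pretty()` and `seq()` can generate them. The axis maxima and the variable names below are hypothetical.

```r
# Hypothetical maximum values for each axis
axis1.max <- 40;
axis2.max <- 400;

# pretty() picks round tick positions that cover the given range
example.ticks1 <- pretty(c(0, axis1.max));

# seq() gives evenly spaced ticks at a fixed interval
example.ticks2 <- seq(0, axis2.max, by = 100);

# One vector per axis, in axis order
example.yat <- list(example.ticks1, example.ticks2);
```

The resulting list could then be supplied as `yat = example.yat`.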

### Ex. 13: Normal
```{r, fig.show='hide'}
normal.tree <- SRCGrob(
    parent.only,
    add.normal = TRUE
    );
```

```{r}
grid.draw(normal.tree);
```