---
title: 'Reconstruct intermediate sequences'
author: "Kenneth B. Hoehn"
date: '`r Sys.Date()`'
output:
  pdf_document:
    dev: pdf
    fig_height: 4
    fig_width: 7.5
    highlight: pygments
    toc: yes
  html_document:
    fig_height: 4
    fig_width: 7.5
    highlight: pygments
    theme: readable
    toc: yes
  md_document:
    fig_height: 4
    fig_width: 7.5
    preserve_yaml: no
    toc: yes
geometry: margin=1in
fontsize: 11pt
vignette: >
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{Reconstruct Intermediate Sequences}
  %\VignetteEncoding{UTF-8}
  %\usepackage[utf8]{inputenc}
---

Dowser automatically reconstructs intermediate sequences as part of the `getTrees` function. These are stored in the `nodes` list contained in each `phylo` object.

First, collapse internal nodes with identical sequences using the `collapseNodes` function. This substantially cleans up the visualization. Alternatively, run `getTrees` with `collapse=TRUE`. Then visualize the trees using `plotTrees` with `node_nums=TRUE`, which displays the ID number of each internal node.

To obtain the IMGT-gapped sequence for each reconstructed node, specify the clone ID and node number in the `getNodeSeq` function.

To obtain all observed and reconstructed sequences for all clones, use the `getAllSeqs` function.

You can save the output of `getAllSeqs` as a fasta file using the `dfToFasta` function.

```{r, eval=TRUE, warning=FALSE, message=FALSE}
library(dowser)

data(ExampleClones)

# Collapse nodes with identical sequences to clean up the plots
trees = collapseNodes(ExampleClones[1:2,])

# Plot trees with node ID numbers
plots = plotTrees(trees, tips="c_call", tipsize=2, node_nums=TRUE, labelsize=7)

plots[[1]]

# Get the IMGT-gapped sequence of node 50 in clone 3128
sequence = getNodeSeq(trees, node=50, clone=3128)

print(sequence)

# Get all sequences as a data frame
all_sequences = getAllSeqs(trees)

head(all_sequences)

```

## Saving sequences to a file

The `dfToFasta` function can be used to save a data frame of sequences as a fasta file:

```{r, eval=FALSE, warning=FALSE, message=FALSE}

# Save all sequences as a fasta file
dfToFasta(all_sequences, file="all_sequences.fasta", id="node_id", columns=c("clone_id","locus"))

```
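
To make the fasta layout concrete, the following is a minimal base R sketch of how a data frame of sequences maps onto header and sequence lines. It does not use or reproduce `dfToFasta`. The data frame `toy_sequences`, its column names, and the `|clone_id=` header format are assumptions chosen for illustration, so the headers written by `dfToFasta` may look different.

```r
# Hypothetical stand-in for a data frame of sequences (not real dowser output)
toy_sequences <- data.frame(
  node_id = c("1", "2", "3"),
  clone_id = c("3128", "3128", "3128"),
  sequence = c("ATGCATGC", "ATGCATGA", "ATGTATGA"),
  stringsAsFactors = FALSE
)

# Build one header per row; the "|clone_id=" format is an assumption
headers <- paste0(">", toy_sequences$node_id,
                  "|clone_id=", toy_sequences$clone_id)

# Interleave headers and sequences: header 1, sequence 1, header 2, ...
# rbind() stacks them as two rows; as.vector() reads the matrix column by column
fasta_lines <- as.vector(rbind(headers, toy_sequences$sequence))

# Write to a temporary file and read it back to inspect the layout
fasta_file <- tempfile(fileext = ".fasta")
writeLines(fasta_lines, fasta_file)
read_back <- readLines(fasta_file)
print(read_back)
```

Each record occupies two lines: a header beginning with `>` followed by the sequence itself. Reading the file back with `readLines` is a quick way to confirm that the IDs and metadata ended up in the headers as intended.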