---
title: "Import and plot UTI and EMA data"
author: "Stefano Coretta"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Import and plot UTI and EMA data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, echo=FALSE, include=FALSE}
knitr::opts_chunk$set(out.width = "300px", fig.align = "center", dpi = 300)
library(readr)
library(dplyr)
library(stringr)
library(ggplot2)
theme_set(theme_bw())
library(rticulate)
```

The package `rticulate` facilitates import in R of the following data:

- Deep Lab Cut (DLC) spline data from Articulate Assistant Advancedâ„¢. [Paper](https://www.mdpi.com/1424-8220/22/3/1133)

- `.pos` files from the Carstens AG500 electro-magnetic-articulographer.

- (Legacy) fan-line spline data from [Articulate Assistant Advancedâ„¢](https://www.articulateinstruments.com/aaa/).

To use the package, attach it as usual.

```{r attach, eval=FALSE}
library(rticulate)
```

The following sections illustrate how to import the three types of data.

# Import DLC spline data from AAA

The `read_aaa()` function can read DLC spline data exported from AAA. It assumes that the X/Y coordinates are in the rightmost columns of the exported data (i.e., when exporting they should be selected last). We recommend to include a header when exporting DLC data from AAA for a smooth experience.

The example file we will read can be inspected [here](https://github.com/stefanocoretta/rticulate/blob/main/inst/extdata/it01-dlc.tsv). This file contains a header (see the section on importing fan-line spline data below on how to import legacy fan-line data files without a header).

With `format = "wide"`, the file is read as is and a `frame_id` index column is added to index each tongue contour frame (normally, each row in the exported data is one ultrasound frame).

```{r dlc}
# system.file() is needed here because the example files reside in the package.
# You can just include the file path directly in read_aaa, like 
# read_aaa("~/Desktop/splines.tsv", columns)
file_path <- system.file("extdata", "it01-dlc.tsv", package = "rticulate")

dlc <- read_aaa(file_path, format = "wide")
dlc
```

For plotting and analysis, it is useful to have the data in a long format. By default, `read_aaa()` pivots the data from wide to long. The resulting tibble has X and Y coordinates for each knot of each spline in each row. DLC knots are numbered from 0, as they are in the exported data. A further column, `displ_id`, indexes a knot displacement trajectory in time if multiple time points are exported.

```{r dlc-2}
dlc <- read_aaa(file_path)
dlc
```

You can then proceed to plot the data as you would with any other data.

```{r dlc-plot}
dlc |> 
  ggplot(aes(X, Y)) +
  geom_point(alpha = 0.1, size = 0.5) +
  coord_fixed()
```

To plot tongue contours, use the `frame_id` column to group points from the same contour (we need to filter the data to include only `DLC_Tongue` data).

```{r dlc-plot-2}
dlc |> 
  filter(spline == "DLC_Tongue") |> 
  ggplot(aes(X, Y, group = frame_id)) +
  geom_path(alpha = 0.1, linewidth = 0.1) +
  coord_fixed()
```

You can also plot the X or Y displacement of the knots along time.

```{r dlc-displ, fig.asp=2}
dlc |> 
  filter(
    `Date Time of recording` == "29/11/2016 15:10:52",
    spline == "DLC_Tongue"
  ) |> 
  ggplot(aes(`Time of sample in annot`, Y, group = displ_id)) +
  geom_path() +
  facet_grid(rows = vars(knot), scales = "free_y")
```

Check out `vignette("kinematics", package = "rticulate")` and `vignette("filter-signals", package = "rticulate")`.

# Import `.pos` files from the Carstens AG500 EMA

You can import binary `.pos` files from the Carstens AG500 electro-magnetic articulographer using the `read_ag500_pos()` function. In most contexts, the defaults of the function will suffice. Due to CRAN not accepting binary files, the following code is not run, but you can test it by downloading the `.pos` file provided here: <https://github.com/stefanocoretta/rticulate/blob/vignettes/data-raw/ema/0025.pos>.

```{r ema, eval=FALSE}
ema <- read_ag500_pos("0025.pos")
```

You can plot the antero-posterior and vertical displacement of a sensor (here the tongue tip sensor, channel 5).

```{r ema-plot, eval=FALSE}
ema |> 
  filter(chn == 5) |> 
  ggplot(aes(x, z)) +
  geom_point(size = 0.1)
```

Or, for example, just the vertical displacement along time.

```{r ema-plot-2, eval=FALSE}
ema |> 
  filter(chn == 5) |> 
  ggplot(aes(time, z)) +
  geom_point(size = 0.1)
```



# (LEGACY) Import fan-line spline data from AAA

The function `read_aaa()` can import legacy fan-line spline data from AAA and transform it into a longer format (where each observation is a point on a fan line and the coordinates values are two variables, `X` and `Y`, see `?tongue` for more details).

The following code will show how to import a file without the header that has legacy fan-line spline data. You can inspect the file we will read [here](https://github.com/stefanocoretta/rticulate/blob/main/inst/extdata/it01.tsv). The X/Y coordinates are stored in the rightmost columns.

When the file does not contain a header like in this case, you must create a vector with the column names for all columns except the X/Y coordinates columns.

```{r columns}
columns <- c(
    "speaker",
    "seconds",
    "rec_date",
    "prompt",
    "label",
    "TT_displacement",
    "TT_velocity",
    "TT_abs_velocity",
    "TD_displacement",
    "TD_velocity",
    "TD_abs_velocity"
)
```

Now we can use `read_aaa()` to import the spline data as a tibble. The function requires a string with the file path and name, and a vector with the names of the columns. It is necessary to specify the number of fan lines with the `fan_lines` argument.

```{r read-aaa}
# system.file() is needed here because the example files reside in the package.
# You can just include the file path directly in read_aaa, like 
# read_aaa("~/Desktop/splines.tsv", columns)
file_path <- system.file("extdata", "it01.tsv", package = "rticulate")

tongue <- read_aaa(file_path, fan_lines = 42, column_names = columns)
```

To check the head of the tibble, just do:

```{r tibble}
tongue
```

Sometimes is useful to add extra information for each prompt (like vowel, consonant place, phonation, etc.). We can do so by using functions from the `dplyr` package (`word()` is from the `stringr` package).

```{r join}
stimuli <- read_csv(system.file("extdata", "stimuli.csv", package = "rticulate"))

tongue <- mutate(tongue, word = word(prompt, 2)) %>%
    left_join(y = stimuli) %>%
    mutate_if(is.character, as.factor)
```

Let's check `tongue` again.

```{r tibble-2}
tongue
```

# Import multiple files

To import multiple files with AAA data, simply use a list of paths with `read_aaa`, for example using `list.files`.

```{r read-multiple}
tongue2 <- list.files(
    path = system.file("extdata", package = "rticulate"),
    pattern = "*\\d{2}.tsv",
    full.names = TRUE
    ) %>%
    read_aaa(., fan_lines = 42, column_names = columns)
```