---
title: "Plots"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{a02_plots}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
  
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE,
  message = FALSE,
  fig.width=7.2, 
  fig.height=5
)
options(rmarkdown.html_vignette.check_title = FALSE)
```
  
**visOmopResults** provides plotting tools that simplify visualising data in `<summarised_result>` format while also being compatible with other formats.
  
```{r setup}
library(visOmopResults)
```

## Plotting with a `<summarised_result>`

For this vignette, we will use the `penguins` dataset from the **palmerpenguins** package.  This dataset will be summarised using the `PatientProfiles::summariseResult()` function, which aggregates the data into the `<summarised_result>` format:

```{r}
library(PatientProfiles)
library(palmerpenguins)
library(dplyr)

summariseIsland <- function(island) {
  penguins |>
    filter(.data$island == .env$island) |>
    summariseResult(
      group = "species",
      includeOverallGroup = TRUE,
      strata = list("year", "sex", c("year", "sex")),
      variables = c(
        "bill_length_mm", "bill_depth_mm", "flipper_length_mm", "body_mass_g", 
        "sex"),
      estimates = c(
        "median", "q25", "q75", "min", "max", "count_missing", "count", 
        "percentage", "density")
    ) |>
    suppressMessages() |>
    mutate(cdm_name = island)
}

penguinsSummary <- bind(
  summariseIsland("Torgersen"), 
  summariseIsland("Biscoe"), 
  summariseIsland("Dream")
)

penguinsSummary |> glimpse()
```

### Plotting principles for `<summarised_result>` objects

**1) Tidy Format**  
When working with `<summarised_result>` objects, the data is internally converted into the tidy format before plotting. This is an important distinction because columns such as `strata_name` and `strata_level` from the original `<summarised_result>` cannot be used directly with the plotting functions. Instead, tidy columns should be referenced.

For more information about the tidy format, refer to the **omopgenerics** package vignette on `<summarised_result>` [here](https://darwin-eu.github.io/omopgenerics/articles/summarised_result.html#tidy-a-summarised_result).

To identify the available tidy columns, use the `tidyColumns()` function:

```{r}
tidyColumns(penguinsSummary)
```

**2) Subsetting Variables**  
Before calling the plotting functions, always subset the `<summarised_result>` object to the variable of interest. Avoid combining results from unrelated variables, as this may lead to NA values in the tidy format, which can affect your plots.



### Scatter plot

We can create simple scatter plots using the `plotScatter()` let's see some  examples:
```{r}
penguinsSummary |>
  filter(variable_name == "bill_depth_mm") |>
  filterStrata(year != "overall", sex == "overall") |>
  scatterPlot(
    x = "year", 
    y = "median",
    line = TRUE, 
    point = TRUE,
    ribbon = FALSE,
    facet = "cdm_name",
    colour = "species"
  )
```

Additionally, we can use the function `themeVisOmop()` to change the default `ggplot2` style to our default style. Not only that, but we can use standard ggplot2 functionalities to the returned plot:

```{r}
penguinsSummary |>
  filter(variable_name %in% c("bill_length_mm", "bill_depth_mm"))|>
  filterStrata(year == "overall", sex == "overall") |>
  filterGroup(species != "overall") |>
  scatterPlot(
    x = "density_x", 
    y = "density_y",
    line = TRUE, 
    point = FALSE,
    ribbon = FALSE,
    facet = cdm_name ~ variable_name,
    colour = "species"
  ) +
  themeVisOmop() +
  ggplot2::facet_grid(cdm_name ~ variable_name, scales = "free_x") 
```

```{r}
penguinsSummary |>
  filter(variable_name == "flipper_length_mm") |>
  filterStrata(year != "overall", sex %in% c("female", "male")) |>
  scatterPlot(
    x = c("year", "sex"), 
    y = "median",
    ymin = "q25",
    ymax = "q75",
    line = FALSE, 
    point = TRUE,
    ribbon = FALSE,
    facet = cdm_name ~ species,
    colour = "sex",
    group = c("year", "sex")
  )  +
  themeVisOmop() +
  ggplot2::coord_flip() +
  ggplot2::labs(y = "Flipper length (mm)") + 
  ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5, hjust=1))
```

```{r}
penguinsSummary |>
  filter(
    variable_name %in% c("flipper_length_mm", "bill_length_mm", "bill_depth_mm")
  ) |>
  filterStrata(sex == "overall") |>
  scatterPlot(
    x = "year", 
    y = "median",
    ymin = "min",
    ymax = "max",
    line = FALSE, 
    point = TRUE,
    ribbon = TRUE,
    facet = cdm_name ~ species,
    colour = "variable_name",
    group = c("variable_name")
  ) +
  themeVisOmop(fontsizeRef = 12) + 
  ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5, hjust=1))
```

### Bar plot

Let's create a bar plots:

```{r}
penguinsSummary |>
  filter(variable_name == "number records") |>
  filterGroup(species != "overall") |>
  filterStrata(sex != "overall", year != "overall") |>
  barPlot(
    x = "year",
    y = "count",
    colour = "sex",
    facet = cdm_name ~ species
  ) +
  themeVisOmop(fontsizeRef = 12)
```

### Box plot

Let's create some box plots of their body mass:

```{r}
penguinsSummary |>
  filter(variable_name == "body_mass_g") |>
  boxPlot(x = "year", facet = species ~ cdm_name, colour = "sex") +
  themeVisOmop()
```

```{r}
penguinsSummary |>
  filter(variable_name == "body_mass_g") |>
  filterGroup(species != "overall") |>
  filterStrata(sex %in% c("female", "male"), year != "overall") |>
  boxPlot(x = "cdm_name", facet = c("sex", "species"), colour = "year") +
  themeVisOmop(fontsizeRef = 11)
```

Note that as we didnt specify x there is no levels in the x axis, but box plots
are produced anyway.

## Plotting with a `<data.frame>`

Plotting functions can also be used with the usual `<data.frame>`. In this case we 
will use the tidy format of `penguinsSummary`.

```{r}
penguinsTidy <- penguinsSummary |>
  filter(!estimate_name %in% c("density_x", "density_y")) |> # remove density for simplicity
  tidy()
penguinsTidy |> glimpse()
```

Using this tidy format, we can replicate plots. For instance, we recreate the previous example:
```{r}
penguinsTidy |>
  filter(
    variable_name == "body_mass_g",
    species != "overall",
    sex %in% c("female", "male"),
    year != "overall"
  ) |>
  boxPlot(x = "cdm_name", facet = sex ~ species, colour = "year") +
  themeVisOmop()
```

## Custom plotting

The tidy format is very useful to apply any other custom ggplot2 function that 
we may be interested on:
```{r}
library(ggplot2)
penguinsSummary |>
  filter(variable_name == "number records") |>
  tidy() |>
  ggplot(aes(x = year, y = sex, fill = count, label = count)) +
  themeVisOmop() +
  geom_tile() +
  scale_fill_viridis_c(trans = "log") + 
  geom_text() +
  facet_grid(cdm_name ~ species) + 
  ggplot2::theme(axis.text.x = ggplot2::element_text(angle = 90, vjust = 0.5, hjust=1))
```

## Combine with `ggplot2`

The plotting functions are a wrapper around the ggplot2 package, outputs of 
the plotting functions can be later customised with ggplot2 and similar tools. 
For example we can use `ggplot2::labs()` to change the labels and 
`ggplot2::theme()` to move the location of the legend.

```{r}
penguinsSummary |>
  filter(
    group_level != "overall",
    strata_name == "year &&& sex",
    !grepl("NA", strata_level),
    variable_name == "body_mass_g") |>
  boxPlot(x = "species", facet = cdm_name ~ sex, colour = "year") +
  themeVisOmop(fontsizeRef = 12) +
  ylim(c(0, 6500)) +
  labs(x = "My custom x label")
```

You can also use `ggplot2::ggsave()` to later save one of this plots into '.png'
file.

```{r, eval=FALSE}
ggsave(
  "figure8.png", plot = last_plot(), device = "png", width = 15, height = 12, 
  units = "cm", dpi = 300)
```

## Combine with `plotly`

Although the package currently does not provide any plotly functionality
ggplots can be easily converted to `<plotly>` ones using the function
`plotly::ggplotly()`. This can make the interactivity of some plots better.