---
title: "3 - The Swarm-Verse"
author: "Marina Papadopoulou"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{3 - The Swarm-Verse}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

After having analyzed our data, we can examine the intra- and inter-specific variation present across datasets through dimensionality reduction techniques (create 'swarm spaces').

## 3.1 Load data

Load the metrics of collective motion calculated previously. Include the events you want to compare in a swarm space.

```{r message=FALSE, warning=FALSE}
library(swaRmverse)

# load pacakge data for many species
data("multi_species_metrics")

## A] Create the swarm space for this data only:
all_data <- multi_species_metrics

## B] Or bind with new data if continuing from step2
data("new_species_metrics") ## loads the output of step 2

new_species_tobind <- new_species_metrics[,!colnames(new_species_metrics) %in% c('event_dur', 'N', 'set', 'start_time')] # remove columns not needed for the swarm space
all_data <- rbind(multi_species_metrics, new_species_tobind)

## C] Or to use just the new data (overwrites previous command, comment out to compare with the other species):
all_data <- new_species_metrics

```

## 3.2 Build swarm space

Create new swarm space with PCA analysis:

```{r}
new_pca <- swarm_space(metrics_data = all_data,
                       space_type = "pca"
                       )

ggplot2::ggplot(new_pca$swarm_space,
                 ggplot2::aes(x = PC1, y = PC2, color = species)
                ) +
                ggplot2::geom_point() +
                ggplot2::theme_bw()

```

Check what each principal component represents and get the info of each event:

```{r}
pca_info <- new_pca$pca$rotation[, new_pca$pca$sdev > 1]
print(pca_info)

ref_data <- new_pca$ref
head(ref_data)

```

Or create a new swarm space with tSNE to better study the local structure of the data:

```{r}

new_tsne <- swarm_space(metrics_data = all_data,
                              space_type = "tsne",
                              tsne_rand_seed = 2023,
                              tsne_perplexity = 10
                              )

print("t-SNE was run with the following parameters:")
print(new_tsne$tsne_setup)

ggplot2::ggplot(new_tsne$swarm_space, ggplot2::aes(x = X, y = Y, color = species)) +
  ggplot2::geom_point() +
  ggplot2::theme_bw()

```

## 3.3 Expand existing swarm space

Starting from previously generated PCA swarm space, add new data:

```{r}
data("multi_species_pca")
data("multi_species_pca_data")

new_pca_data <- expand_pca_swarm_space(metrics_data = new_species_metrics,
                                       pca_space = multi_species_pca)

expanded_pca <- rbind(multi_species_pca_data, 
                      new_pca_data)

ggplot2::ggplot(expanded_pca,
                ggplot2::aes(x = PC1, y = PC2, color = species)) +
   ggplot2::geom_point() +
   ggplot2::theme_bw()

```


## 3.4 Your own swarm space

To compare several new datasets, one should run the analysis until the end of step 2 for each one of them. Then simply bind the result datasets together and run the swarm spaces as above:

```{r message=FALSE, warning=FALSE}

data("new_species_metrics") ## loads the output of step 2

## Use another dataset:
data_df <- get(data("tracks", package = "trackdf"))
data_df$set <- as.Date(data_df$t)

another_species <- col_motion_metrics_from_raw(data_df,
                                mov_av_time_window = 10,
                                step2time = 1,
                                geo = TRUE,
                                verbose = FALSE,
                                speed_lim = 0,
                                pol_lim = 0.3,
                                parallelize_all = FALSE
                                )

another_species$species <- "new_species_2"

## Bind all the datasets you want to compare here
all_data <- rbind(another_species, new_species_metrics)

new_pca <- swarm_space(metrics_data = all_data,
                       space_type = "pca"
                       )

ggplot2::ggplot(new_pca$swarm_space,
                ggplot2::aes(x = PC1, y = PC2, color = species)
                ) +
  ggplot2::geom_point() +
  ggplot2::theme_bw()

```