---
title: "Simulated Grouped Hyper Data Frame"
output: rmarkdown::html_vignette
author: Tingting Zhan
vignette: >
  %\VignetteIndexEntry{intro}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
library(knitr)
opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
options(rmarkdown.html_vignette.check_title = FALSE)
```

# Introduction

This vignette of package **`groupedHyperframe.random`** ([Github](https://github.com/tingtingzhan/groupedHyperframe.random), [RPubs](https://rpubs.com/tingtingzhan/groupedHyperframe_random)) documents the simulation of `superimpose`d `ppp.object` and the `groupedHyperframe` object.

## Note to Users

Examples in this vignette require that the `search` path has

```{r setup}
library(groupedHyperframe.random)
```

## Terms and Abbreviations

```{r echo = FALSE, results = 'asis'}
c(
  '', 'Forward pipe operator', '`?base::pipeOp` introduced in `R` 4.1.0', 
  '`CRAN`, `R`', 'The Comprehensive R Archive Network', 'https://cran.r-project.org',
  '`coords`', '$x$- and $y$-coordinates', '`spatstat.geom:::ppp`',
  '`diag`', 'Diagonal matrix', '`base::diag`',
  '`groupedHyperframe`', 'Grouped hyper data frame', '`groupedHyperframe::as.groupedHyperframe`',
  '`hypercolumns`, `hyperframe`', '(Hyper columns of) hyper data frame', '`spatstat.geom::hyperframe`',
  '`marks`, `marked`', '(Having) mark values', '`spatstat.geom::is.marked`',
  '`pmax`', 'Parallel maxima', '`base::pmax`',
  '`ppp`, `ppp.object`', 'Point pattern', '`spatstat.geom::ppp.object`',
  '`recycle`', 'Recycling', 'https://r4ds.had.co.nz/vectors.html#scalars-and-recycling-rules',  
  '`rlnorm`', 'Log normal random variable', '`stats::rlnorm`', 
  '`rMatClust`', 'Matern\'s cluster process', '`spatstat.random::rMatClust`',
  '`rmvnorm_`', 'Multivariate normal random variable', '`groupedHyperframe.random::rmvnorm_`; `MASS::mvrnorm`',
  '`rnbinom`', 'Negative binomial random variable', '`stats::rnbinom`',
  '`rpoispp`', 'Poisson point pattern', '`spatstat.random::rpoispp`',
#  '`rStrauss`', 'Strauss process', '`spatstat.random::rStrauss`',
  '`superimpose`', 'Superimpose', '`spatstat.geom::superimpose`',
  '`var`, `cor`, `cov`', 'Variance, correlation, covariance', '`stats::var`, `stats::cor`, `stats::cov`'
) |>
  matrix(nrow = 3L, dimnames = list(c('Term / Abbreviation', 'Description', 'Reference'), NULL)) |>
  t.default() |>
  as.data.frame.matrix() |> 
  kable()
```

## Acknowledgement

This work is supported by NCI R01CA222847 ([I. Chervoneva](https://orcid.org/0000-0002-9104-4505), [T. Zhan](https://orcid.org/0000-0001-9971-4844), and [H. Rui](https://orcid.org/0000-0002-8778-261X)) and R01CA253977 (H. Rui and I. Chervoneva). 


# Simulated Point Pattern

Function `.rppp()` simulates `superimpose`d `ppp.object`s with *vectorized* parameterization of random point pattern and distribution of `marks`. 

## Simulated un`marked` Point Pattern

Example below simulates a `coords`-only, un`marked`, two `superimpose`d Matern's cluster processes $(\kappa, \mu, s) = (10,8,.15)$ and $(5,4,.06)$.
```{r}
set.seed(125); r = .rppp(rMatClust(kappa = c(10, 5), mu = c(8, 4), scale = c(.15, .06)))
# plot(r) # suppressed for aesthetics
```

## Simulated `marked` Point Pattern

Example below simulates two `superimpose`d `marked` `ppp`s,

* Matern's cluster process $(\kappa,\mu,s) = (10,8,.15)$, attached with a log-normal `mark` $(\mu,\sigma)=(3,.4)$, and a negative-binomial `mark` $(r,p)=(4,.3)$.

* Matern's cluster process $(\kappa,\mu,s) = (5,4,.06)$, attached with a log-normal `mark` $(\mu,\sigma)=(5,.2)$, and a negative-binomial `mark` $(r,p)=(4,.3)$. 

```{r}
set.seed(125); r1 = .rppp(
  rMatClust(kappa = c(10, 5), mu = c(8, 4), scale = c(.15, .06)), 
  rlnorm(meanlog = c(3, 5), sdlog = c(.4, .2)),
  rnbinom(size = 4, prob = .3) # shorter parameter recycled
)
```

Example below simulates two `superimpose`d `marked` `ppp`s,

* Poisson point pattern $\lambda=3$, attached with a log-normal `mark` $(\mu,\sigma)=(3,.4)$, and a negative-binomial `mark` $(r,p)=(4,.3)$.

* Poisson point pattern $\lambda=6$, attached with a log-normal `mark` $(\mu,\sigma)=(5,.2)$, and a negative-binomial `mark` $(r,p)=(6,.1)$.

```{r}
set.seed(62); r2 = .rppp(
  rpoispp(lambda = c(3, 6)),
  rlnorm(meanlog = c(3, 5), sdlog = c(.4, .2)),
  rnbinom(size = c(4, 6), prob = c(.3, .1))
)
```

In the foreseeable future we will not support simulating more than one type of point patterns in a single call to function `.rppp()`.  End user may manually `superimpose` different (`marked`) point patterns after simulating each of them separately.

```{r}
spatstat.geom::superimpose(r1, r2)
```


# Simulated `groupedHyperframe`

Now consider two `superimpose`d Matern's cluster processes attached with a log-normal `mark`. The population parameters are
```{r}
(p = data.frame(kappa = c(3,2), scale = c(.4,.2), mu = c(10,5), 
                meanlog = c(3,5), sdlog = c(.4,.2)))
```

We simulate for 3 subjects (e.g., patients). The subject-specific parameters deviate from the population parameters under a multivariate normal distribution with variance-covariance matrix $\Sigma$.  The matrix $\Sigma$ may be specified by a `numeric` scalar, indicating all-equal `diag`onal `var`iances and zero `cor`relations/`cov`ariances.  We also make sure that all subject-specific parameters satisfy that $\kappa>1$, $\mu>1$, $s>0$ for Matern's cluster processes, and $\sigma>0$ for log-normal distribution.  Each `matrix` of the subject-specific parameters has the subjects on the rows, and the parameters of the `ppp`s to be `superimpose`d on the columns.
```{r}
set.seed(39); (p. = rmvnorm_(n = 3L, mu = p, Sigma = list(
  kappa = .2^2, scale = .05^2, mu = .5^2, 
  meanlog = .1^2, sdlog = .01^2)) |> 
    within.list(expr = {
      kappa = pmax(kappa, 1 + .Machine$double.eps)
      mu = pmax(mu, 1 + .Machine$double.eps)
      scale = pmax(scale, .Machine$double.eps)
      sdlog = pmax(sdlog, .Machine$double.eps)
    }))
```

We simulate one to four `ppp`s (e.g., medical images) per subject.

```{r}
set.seed(37); (n = sample.int(n = 4L, size = 3L, replace = TRUE)) 
```

Function `grouped_rppp()` simulates a `groupedHyperframe` with a `ppp`-hypercolumn, and one-or-more columns of the grouping structure. 
```{r}
set.seed(76); (r = p. |> 
  with.default(expr = {
    grouped_rppp(
      rMatClust(kappa = kappa, scale = scale, mu = mu), 
      rlnorm(meanlog = meanlog, sdlog = sdlog),
      n = n
    )
  }))
```