---
title: "Distributions Provided by mniw"
author: "Martin Lysy"
date: "`r Sys.Date()`"
output:
  rmarkdown::html_vignette:
    toc: true
    keep_md: false
vignette: >
  %\VignetteIndexEntry{Distributions Provided by mniw}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

\newcommand{\bm}[1]{\boldsymbol{#1}}
\newcommand{\tx}[1]{\textrm{#1}}
\newcommand{\tfrac}[2]{\textstyle{\frac{#1}{#2}}}
\newcommand{\rv}[3]{#2_{#1},\ldots,#2_{#3}}
\newcommand{\iid}{\overset{\;\tx{iid}\;}{\sim}}
\newcommand{\ind}{\overset{\:\tx{ind}\:}{\sim}}
\newcommand{\XX}{{\bm{X}}}
\newcommand{\YY}{{\bm{Y}}}
\newcommand{\LL}{{\bm{L}}}
\newcommand{\PPs}{{\bm{\Psi}}}
\newcommand{\ZZ}{{\bm{Z}}}
\newcommand{\LLa}{{\bm{\Lambda}}}
\newcommand{\SSi}{{\bm{\Sigma}}}
\newcommand{\GGa}{{\bm{\Gamma}}}
\newcommand{\UU}{{\bm{U}}}
\newcommand{\VV}{{\bm{V}}}
\newcommand{\aa}{{\bm{a}}}
\newcommand{\bb}{{\bm{b}}}
\newcommand{\xx}{{\bm{x}}}
\newcommand{\yy}{{\bm{y}}}
\newcommand{\bbe}{{\bm{\beta}}}
\newcommand{\mmu}{{\bm{\mu}}}
\newcommand{\kka}{{\bm{\kappa}}}
\newcommand{\lla}{{\bm{\lambda}}}
\newcommand{\TTh}{{\bm{\Theta}}}
\newcommand{\GG}{{\bm{G}}}
\newcommand{\OOm}{{\bm{\Omega}}}
\newcommand{\bz}{{\bm{0}}}
\newcommand{\wish}{\operatorname{Wish}}
\newcommand{\iwish}{\operatorname{InvWish}}
\newcommand{\MN}{\operatorname{MatNorm}}
\newcommand{\MT}{\operatorname{MatT}}
\newcommand{\mniw}{\operatorname{MNIW}}
\newcommand{\re}{\operatorname{RxNorm}}
\newcommand{\N}{\operatorname{Normal}}


## Wishart Distribution

The Wishart distribution on a random positive-definite matrix $\XX_{q\times q}$ is denoted $\XX \sim \wish(\PPs, \nu)$, and defined as $\XX = (\LL \ZZ)(\LL \ZZ)'$ (a base-R sketch of this construction follows the list), where:

- $\PPs_{q\times q} = \LL\LL'$ is the positive-definite matrix scale parameter,
- $\nu > q-1$ is the shape parameter,
- $\ZZ_{q\times q}$ is a random lower-triangular matrix with elements

    $$
    Z_{ij} \begin{cases} 
           \iid \N(0,1) & i > j \\ 
           = 0 & i < j,
           \end{cases}
    \qquad Z_{ii}^2 \ind \chi^2_{(\nu-i+1)}.
    $$
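
For concreteness, here is a minimal base-R sketch of this construction (the name `rwish_sketch` is purely illustrative and is not the sampler used by `mniw`):

```r
# A Wishart draw via X = (L Z)(L Z)', with Psi = L L' and Z as defined above.
rwish_sketch <- function(Psi, nu) {
  q <- nrow(Psi)
  L <- t(chol(Psi))                                # lower-triangular factor of Psi
  Z <- matrix(0, q, q)
  Z[lower.tri(Z)] <- rnorm(q * (q - 1) / 2)        # N(0,1) entries below the diagonal
  diag(Z) <- sqrt(rchisq(q, df = nu - 1:q + 1))    # Z_ii^2 ~ chi-square(nu - i + 1)
  LZ <- L %*% Z
  tcrossprod(LZ)                                   # (L Z)(L Z)'
}

set.seed(1)
Psi <- crossprod(matrix(rnorm(9), 3, 3)) + diag(3) # an arbitrary positive-definite scale
X <- rwish_sketch(Psi, nu = 7)
```

Base R's `stats::rWishart(1, df = nu, Sigma = Psi)[, , 1]` draws from the same distribution and can serve as a sanity check.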

The log-density of the Wishart distribution is

$$
\log p(\XX \mid \PPs, \nu) = -\tfrac{1}{2} \left[\mathrm{tr}(\PPs^{-1} \XX) + (q+1-\nu)\log |\XX| + \nu \log |\PPs|  + \nu q \log(2) + 2 \log \Gamma_q(\tfrac \nu 2)\right],
$$

where $\Gamma_n(x)$ is the multivariate Gamma function, defined as

$$
\Gamma_n(x) = \pi^{n(n-1)/4} \prod_{j=1}^n \Gamma\big(x + \tfrac 1 2 (1-j)\big).
$$
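
The log-density transcribes directly into R; the helper names `lmgamma` and `ldwish_sketch` below are again illustrative only:

```r
# Multivariate log-gamma function, log Gamma_n(x).
lmgamma <- function(x, n) {
  n * (n - 1) / 4 * log(pi) + sum(lgamma(x + (1 - 1:n) / 2))
}

# Wishart(Psi, nu) log-density at X, transcribed from the formula above.
ldwish_sketch <- function(X, Psi, nu) {
  q <- nrow(X)
  ldetX <- c(determinant(X, logarithm = TRUE)$modulus)
  ldetPsi <- c(determinant(Psi, logarithm = TRUE)$modulus)
  -0.5 * (sum(diag(solve(Psi, X))) + (q + 1 - nu) * ldetX +
            nu * ldetPsi + nu * q * log(2) + 2 * lmgamma(nu / 2, q))
}
```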

### Inverse-Wishart Distribution

The Inverse-Wishart distribution $\XX \sim \iwish(\PPs, \nu)$ is defined as $\XX^{-1} \sim \wish(\PPs^{-1}, \nu)$.  Its log-density is given by

$$
\log p(\XX \mid \PPs, \nu) = -\tfrac 1 2 \left[\mathrm{tr}(\PPs \XX^{-1}) + (\nu+q+1) \log |\XX| - \nu \log |\PPs| + \nu q \log(2) + 2 \log \Gamma_q(\tfrac \nu 2)\right].
$$
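
Both a draw and a log-density evaluation then follow from the Wishart versions; a sketch reusing the illustrative `rwish_sketch()` and `lmgamma()` from above:

```r
# X ~ InvWish(Psi, nu) via X^{-1} ~ Wish(Psi^{-1}, nu).
riwish_sketch <- function(Psi, nu) {
  solve(rwish_sketch(solve(Psi), nu))
}

# Inverse-Wishart(Psi, nu) log-density at X.
ldiwish_sketch <- function(X, Psi, nu) {
  q <- nrow(X)
  ldetX <- c(determinant(X, logarithm = TRUE)$modulus)
  ldetPsi <- c(determinant(Psi, logarithm = TRUE)$modulus)
  -0.5 * (sum(diag(Psi %*% solve(X))) + (nu + q + 1) * ldetX -
            nu * ldetPsi + nu * q * log(2) + 2 * lmgamma(nu / 2, q))
}
```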

### Properties

If $\XX_{q\times q} \sim \wish(\PPs,\nu)$, then for a nonzero vector $\aa \in \mathbb R^q$ we have

$$
\frac{\aa'\XX\aa}{\aa'\PPs\aa} \sim \chi^2_{(\nu)}.
$$
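
This property is easy to spot-check by simulation, e.g. with base R's `stats::rWishart()`:

```r
set.seed(2)
q <- 4; nu <- 10
Psi <- crossprod(matrix(rnorm(q * q), q, q)) + diag(q)
a <- rnorm(q)
X <- stats::rWishart(1e4, df = nu, Sigma = Psi)    # q x q x 10000 array of draws
ratio <- apply(X, 3, function(x) c(crossprod(a, x %*% a)) / c(crossprod(a, Psi %*% a)))
# Simulated quantiles of the ratio should roughly match chi-square(nu) quantiles.
round(rbind(simulated = quantile(ratio, c(0.25, 0.5, 0.75)),
            theoretical = qchisq(c(0.25, 0.5, 0.75), df = nu)), 2)
```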

## Matrix-Normal Distribution

The Matrix-Normal distribution on a random matrix $\XX_{p \times q}$ is denoted $\XX \sim \MN(\LLa, \SSi_R, \SSi_C)$, and defined as $\XX = \LL\ZZ \UU + \LLa$, where:

- $\LLa_{p \times q}$ is the mean matrix parameter,
- ${\SSi_R}_{p \times p} = \LL\LL'$ is the row-variance matrix parameter,
- ${\SSi_C}_{q \times q} = \UU'\UU$ is the column-variance matrix parameter,
- $\ZZ_{p\times q}$ is a random matrix with $Z_{ij} \iid \N(0,1)$.
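
A minimal base-R sketch of this construction (the name `rmatnorm_sketch` is illustrative only, not the generator used by `mniw`):

```r
# X ~ MatNorm(Lambda, SigmaR, SigmaC) via X = L Z U + Lambda,
# with SigmaR = L L' (L lower triangular) and SigmaC = U'U (U upper triangular).
rmatnorm_sketch <- function(Lambda, SigmaR, SigmaC) {
  p <- nrow(Lambda)
  q <- ncol(Lambda)
  L <- t(chol(SigmaR))
  U <- chol(SigmaC)
  Z <- matrix(rnorm(p * q), p, q)
  L %*% Z %*% U + Lambda
}
```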

The log-density of the Matrix-Normal distribution is

$$
\log p(\XX \mid \LLa, \SSi_R, \SSi_C) = -\tfrac 1 2 \left[\mathrm{tr}\big(\SSi_C^{-1}(\XX-\LLa)'\SSi_R^{-1}(\XX-\LLa)\big) + pq \log(2\pi) + p \log |\SSi_C| + q \log |\SSi_R|\right].
$$
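
The log-density transcribes directly (illustrative `ldmatnorm_sketch`):

```r
# MatNorm(Lambda, SigmaR, SigmaC) log-density at X, transcribed from the formula above.
ldmatnorm_sketch <- function(X, Lambda, SigmaR, SigmaC) {
  p <- nrow(X)
  q <- ncol(X)
  E <- X - Lambda
  tr <- sum(diag(solve(SigmaC, t(E)) %*% solve(SigmaR, E)))  # tr(SigmaC^{-1} E' SigmaR^{-1} E)
  ldetR <- c(determinant(SigmaR, logarithm = TRUE)$modulus)
  ldetC <- c(determinant(SigmaC, logarithm = TRUE)$modulus)
  -0.5 * (tr + p * q * log(2 * pi) + p * ldetC + q * ldetR)
}
```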

### Properties

If $\XX_{p \times q} \sim \MN(\LLa, \SSi_R, \SSi_C)$, then for nonzero vectors $\aa \in \mathbb R^p$ and $\bb \in \mathbb R^q$ we have

$$
\aa' \XX \bb \sim \N(\aa' \LLa \bb, \aa'\SSi_R\aa \cdot \bb'\SSi_C\bb).
$$
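
A quick simulation check of this property, reusing the illustrative `rmatnorm_sketch()` from above:

```r
set.seed(3)
p <- 3; q <- 2
Lambda <- matrix(rnorm(p * q), p, q)
SigmaR <- crossprod(matrix(rnorm(p * p), p, p)) + diag(p)
SigmaC <- crossprod(matrix(rnorm(q * q), q, q)) + diag(q)
a <- rnorm(p); b <- rnorm(q)
sims <- replicate(1e4, c(crossprod(a, rmatnorm_sketch(Lambda, SigmaR, SigmaC) %*% b)))
# Simulated mean and variance should roughly match the formulas above.
c(sim_mean = mean(sims),
  theory_mean = c(crossprod(a, Lambda %*% b)),
  sim_var = var(sims),
  theory_var = c(crossprod(a, SigmaR %*% a)) * c(crossprod(b, SigmaC %*% b)))
```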

## Matrix-Normal Inverse-Wishart Distribution

The Matrix-Normal Inverse-Wishart distribution on a random matrix $\XX_{p \times q}$ and random positive-definite matrix $\VV_{q\times q}$ is denoted $(\XX,\VV) \sim \mniw(\LLa, \SSi, \PPs, \nu)$, and defined as

$$
\begin{aligned}
\XX \mid \VV & \sim \MN(\LLa, \SSi, \VV) \\
\VV & \sim \iwish(\PPs, \nu).
\end{aligned}
$$
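
The hierarchical definition translates into a two-stage draw; a sketch reusing the illustrative `riwish_sketch()` and `rmatnorm_sketch()` from above:

```r
# One draw (X, V) ~ MNIW(Lambda, Sigma, Psi, nu): V from its marginal, then X given V.
rmniw_sketch <- function(Lambda, Sigma, Psi, nu) {
  V <- riwish_sketch(Psi, nu)             # V ~ InvWish(Psi, nu)
  X <- rmatnorm_sketch(Lambda, Sigma, V)  # X | V ~ MatNorm(Lambda, Sigma, V)
  list(X = X, V = V)
}
```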

### Properties

The MNIW distribution is the conjugate prior for the multivariate-response regression model

$$
\YY_{n \times q} \sim \MN(\XX_{n\times p} \bbe_{p \times q}, \VV_{n \times n}, \SSi_{q \times q}).
$$

That is, if $(\bbe, \SSi) \sim \mniw(\LLa, \OOm^{-1}, \PPs, \nu)$, then

$$
\bbe, \SSi \mid \YY \sim \mniw(\hat \LLa, \hat \OOm^{-1}, \hat \PPs, \hat \nu),
$$

where

$$
\begin{aligned}
\hat \OOm & = \XX'\VV^{-1}\XX + \OOm 
&
\hat \PPs & = \PPs + \YY'\VV^{-1}\YY + \LLa'\OOm\LLa - \hat \LLa'\hat \OOm \hat \LLa
\\
\hat \LLa & = \hat \OOm^{-1}(\XX'\VV^{-1}\YY + \OOm\LLa)
&
\hat \nu & = \nu + n.
\end{aligned}
$$
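
The update is plain matrix algebra; a sketch mapping prior to posterior hyperparameters (the name `mniw_post_sketch` and its argument names are illustrative):

```r
# Map MNIW prior hyperparameters to posterior hyperparameters for
# Y ~ MatNorm(X beta, V, Sigma), (beta, Sigma) ~ MNIW(Lambda, Omega^{-1}, Psi, nu).
mniw_post_sketch <- function(Y, X, V, Lambda, Omega, Psi, nu) {
  n <- nrow(Y)
  XtVi <- t(solve(V, X))                              # X' V^{-1}
  Omega_hat <- XtVi %*% X + Omega
  Lambda_hat <- solve(Omega_hat, XtVi %*% Y + Omega %*% Lambda)
  Psi_hat <- Psi + t(Y) %*% solve(V, Y) + t(Lambda) %*% Omega %*% Lambda -
    t(Lambda_hat) %*% Omega_hat %*% Lambda_hat
  list(Lambda = Lambda_hat, Omega = Omega_hat, Psi = Psi_hat, nu = nu + n)
}
```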

## Matrix-t Distribution

The Matrix-$t$ distribution on a random matrix $\XX_{p \times q}$ is denoted $\XX \sim \MT(\LLa, \SSi_R, \SSi_C, \nu)$, and defined as the marginal distribution of $\XX$ for $(\XX, \VV) \sim \mniw(\LLa, \SSi_R, \SSi_C, \nu)$.  Its log-density is given by

$$
\begin{aligned}
\log p(\XX \mid \LLa, \SSi_R, \SSi_C, \nu) 
& = -\tfrac 1 2 \Big[(\nu+p)\log | I + \SSi_R^{-1}(\XX-\LLa)\SSi_C^{-1}(\XX-\LLa)'| \\
& \phantom{= -\tfrac 1 2 \Big[} + q \log |\SSi_R| + p \log |\SSi_C| \\
& \phantom{= -\tfrac 1 2 \Big[} + pq \log(\pi) - 2\log \Gamma_q(\tfrac{\nu+p}{2}) + 2\log \Gamma_q(\tfrac{\nu}{2})\Big].
\end{aligned}
$$

### Properties

If $\XX_{p\times q} \sim \MT(\LLa, \SSi_R, \SSi_C, \nu)$, then for nonzero vectors $\aa \in \mathbb R^p$ and $\bb \in \mathbb R^q$ we have

$$
\frac{\aa'\XX\bb - \mu}{\sigma} \sim t_{(\nu -q + 1)},
$$

where 
$$
\mu = \aa'\LLa\bb, \qquad \sigma^2 = \frac{\aa'\SSi_R\aa \cdot \bb'\SSi_C\bb}{\nu - q + 1}.
$$
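
Because the Matrix-$t$ is the MNIW marginal, this property can be spot-checked by drawing $(\XX, \VV)$ pairs with the illustrative `rmniw_sketch()` from above and projecting:

```r
set.seed(4)
p <- 3; q <- 2; nu <- 8
Lambda <- matrix(rnorm(p * q), p, q)
SigmaR <- crossprod(matrix(rnorm(p * p), p, p)) + diag(p)
SigmaC <- crossprod(matrix(rnorm(q * q), q, q)) + diag(q)
a <- rnorm(p); b <- rnorm(q)
mu <- c(crossprod(a, Lambda %*% b))
sigma <- sqrt(c(crossprod(a, SigmaR %*% a)) * c(crossprod(b, SigmaC %*% b)) / (nu - q + 1))
tsims <- replicate(1e4, {
  X <- rmniw_sketch(Lambda, SigmaR, SigmaC, nu)$X    # Matrix-t draw via the MNIW marginal
  (c(crossprod(a, X %*% b)) - mu) / sigma
})
# Simulated quantiles should roughly match t with nu - q + 1 degrees of freedom.
round(rbind(simulated = quantile(tsims, c(0.1, 0.5, 0.9)),
            theoretical = qt(c(0.1, 0.5, 0.9), df = nu - q + 1)), 2)
```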

## Random-Effects Normal Distribution

Consider the following hierarchical model for $q$-dimensional multivariate normal vectors $\xx$ and $\mmu$:

$$
\begin{aligned}
\xx \mid \mmu & \sim \N(\mmu, \VV) \\
\mmu & \sim \N(\lla, \SSi).
\end{aligned}
$$

The random-effects normal distribution is defined as the posterior distribution $\mmu \sim p(\mmu \mid \xx)$, which is given by

$$
\mmu \mid \xx \sim \N\big(\GG(\xx-\lla) + \lla, \GG\VV\big), \qquad \GG = \SSi(\VV + \SSi)^{-1}.
$$

The notation for this distribution is $\mmu \sim \re(\xx, \VV, \lla, \SSi)$.
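
A sketch of a single draw from this conditional (the name `rRxNorm_sketch` is illustrative and the implementation is not the one used by `mniw`):

```r
# One draw mu ~ RxNorm(x, V, lambda, Sigma), i.e., from p(mu | x) in the model above.
rRxNorm_sketch <- function(x, V, lambda, Sigma) {
  G <- Sigma %*% solve(V + Sigma)
  mu_mean <- c(G %*% (x - lambda)) + lambda
  mu_var <- G %*% V
  mu_var <- (mu_var + t(mu_var)) / 2        # symmetrize against rounding error
  mu_mean + c(t(chol(mu_var)) %*% rnorm(length(x)))
}
```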

## Hierarchical Normal-Normal Model

The hierarchical Normal-Normal model is defined as

$$
\begin{aligned}
\yy_i \mid \mmu_i, \bbe, \SSi & \ind \N(\mmu_i, \VV_i) \\
\mmu_i \mid \bbe, \SSi & \iid \N(\xx_i'\bbe, \SSi) \\
(\bbe, \SSi) & \sim \mniw(\LLa, \OOm^{-1}, \PPs, \nu),
\end{aligned}
$$

where:

- ${\yy_i}_{q\times 1}$ is the response vector for subject $i$,
- ${\mmu_i}_{q\times 1}$ is the random effect for subject $i$,
- ${\VV_i}_{q\times q}$ is the error variance for subject $i$,
- ${\xx_i}_{p\times 1}$ is the covariate vector for subject $i$,
- $\bbe_{p \times q}$ is the random-effects coefficient matrix,
- $\SSi_{q \times q}$ is the random-effects error variance.

Let $\YY_{n\times q} = (\rv 1 \yy n)'$, $\XX_{n\times p} = (\rv 1 \xx n)'$, and $\TTh_{n \times q} = (\rv 1 \mmu n)'$.  If interest lies in the posterior distribution $p(\TTh, \bbe, \SSi \mid \YY, \XX)$, then a Gibbs sampler can be used to cycle through the following conditional distributions:

$$
\begin{aligned}
\mmu_i \mid \bbe, \SSi, \YY, \XX & \ind \re(\yy_i, \VV_i, \xx_i'\bbe, \SSi) \\
\bbe, \SSi \mid \TTh, \YY, \XX & \sim \mniw(\hat \LLa, \hat \OOm^{-1}, \hat \PPs, \hat \nu),
\end{aligned}
$$

where $\hat \LLa$, $\hat \OOm$, $\hat \PPs$, and $\hat \nu$ are obtained from the MNIW conjugate posterior formulas above with $\YY \gets \TTh$ and $\VV \gets I_n$, since the rows of $\TTh$ are independent given $\bbe$ and $\SSi$.
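
Putting the pieces together, a minimal Gibbs sampler skeleton built from the illustrative sketches above (`rRxNorm_sketch()`, `mniw_post_sketch()`, `rmniw_sketch()`); in practice one would use the `mniw` package's own samplers, but the structure of the two updates is the same:

```r
# Gibbs sampler sketch for the hierarchical Normal-Normal model.
# y: n x q matrix of responses (rows y_i'), x: n x p matrix of covariates (rows x_i'),
# Vlist: list of n q x q error variances V_i.
gibbs_sketch <- function(y, x, Vlist, Lambda, Omega, Psi, nu, n_iter = 100) {
  n <- nrow(y)
  Theta <- y                                  # initialize the random effects at the data
  out <- vector("list", n_iter)
  for (m in seq_len(n_iter)) {
    # Draw (beta, Sigma) | Theta from the MNIW conjugate update with Y <- Theta, V <- I_n.
    hyp <- mniw_post_sketch(Theta, x, diag(n), Lambda, Omega, Psi, nu)
    bs <- rmniw_sketch(hyp$Lambda, solve(hyp$Omega), hyp$Psi, hyp$nu)
    beta <- bs$X
    Sigma <- bs$V
    # Draw each mu_i | beta, Sigma, y_i from the random-effects normal distribution.
    for (i in seq_len(n)) {
      Theta[i, ] <- rRxNorm_sketch(y[i, ], Vlist[[i]], c(crossprod(beta, x[i, ])), Sigma)
    }
    out[[m]] <- list(beta = beta, Sigma = Sigma, Theta = Theta)
  }
  out
}
```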