---
title: "Tweedie Distribution"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Tweedie Distribution}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## Introduction

The Tweedie distribution is a relative newcomer to the generalized linear model framework. It has three parameters, $\mu$, $\sigma^2$, and $\rho$, and a number of distinctive characteristics. Many other distributions are special cases of the Tweedie distribution.

* Special cases
  + $\rho = 0$: Gaussian
  + $\rho = 1$: Poisson
  + $\rho = 2$: gamma
  + $\rho = 3$: inverse Gaussian
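
One way to see why these special cases arise is the mean-variance relationship of the Tweedie family, $\mathrm{Var}(Y) = \sigma^2 \mu^\rho$. The sketch below assumes that $\sigma^2$ acts as the dispersion and $\rho$ as the power index. `tweedie_variance` is a hypothetical helper written for this illustration, not a function from GlmSimulatoR or tweedie.

```{r variance-sketch, echo=TRUE}
# Hypothetical helper: variance implied by Var(Y) = sigma2 * mu^rho
tweedie_variance <- function(mu, sigma2 = 1, rho) {
  sigma2 * mu^rho
}

# Variance at mu = 2 for each of the four special cases
rho_values <- c(gaussian = 0, poisson = 1, gamma = 2, inverse_gaussian = 3)
sapply(rho_values, function(rho) tweedie_variance(mu = 2, sigma2 = 1, rho = rho))
```

With $\rho = 0$ the variance does not depend on the mean, and with $\rho = 1$ it equals the mean, which matches the Gaussian and Poisson cases.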

For many parameter values, the probability density function cannot be evaluated directly. Instead, special algorithms are needed to calculate the density. Interested statisticians should read [Evaluation of Tweedie exponential dispersion model densities by Fourier inversion](https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.449.2161&rep=rep1&type=pdf). The author created the R package tweedie, which GlmSimulatoR relies upon.

Due to the computational demands of the Tweedie distribution and the software available in R, this package focuses on Tweedie distributions with $\rho$ in $[1, 2]$ and $\sigma^2$ equal to 1.

## Building a glm model

To test R's specialized GLM function for the Tweedie distribution, let's generate data and see how close our estimates are. We will use the `cpglm` function from the cplm package.


```{r model, echo=TRUE}
library(GlmSimulatoR)
library(ggplot2)
library(cplm, quietly = TRUE)
set.seed(1)

simdata <- simulate_tweedie(weight = .2, ancillary = 1.15, link = "log")

ggplot(simdata, aes(x = Y)) +
  geom_histogram(bins = 30)
```

The response variable is usually zero, but sometimes it is positive. This mix of exact zeros and positive continuous values is what makes the Tweedie distribution unique.
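
For $1 < \rho < 2$, a Tweedie variable can be written as the sum of a Poisson number of gamma-distributed amounts, which is where the exact zeros come from. The base R sketch below illustrates that representation. The values of `mu`, `phi`, and `p` are arbitrary choices for illustration (they are not the values behind `simdata`), and `phi` is assumed to play the role of $\sigma^2$.

```{r compound-poisson-sketch, echo=TRUE}
# Illustrative (hypothetical) parameter values: mean, dispersion, power index
mu <- 0.5
phi <- 1
p <- 1.15

# Compound Poisson-gamma parameters implied by (mu, phi, p), valid for 1 < p < 2
lambda <- mu^(2 - p) / (phi * (2 - p))    # Poisson rate for the number of gamma terms
alpha <- (2 - p) / (p - 1)                # gamma shape of a single term
gam_scale <- phi * (p - 1) * mu^(p - 1)   # gamma scale of a single term

set.seed(2)
n <- 1e5
counts <- rpois(n, lambda)

# A sum of k gamma(alpha, scale) terms is gamma(k * alpha, scale); zero terms give exactly 0
y <- numeric(n)
positive <- counts > 0
y[positive] <- rgamma(sum(positive), shape = counts[positive] * alpha, scale = gam_scale)

# Share of exact zeros versus the Poisson probability of zero terms, exp(-lambda)
c(simulated = mean(y == 0), theoretical = exp(-lambda))

# The sample mean should be near mu
c(simulated = mean(y), theoretical = mu)
```

In this representation, a smaller mean or a larger dispersion lowers `lambda` and so increases the share of zeros.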

```{r model2, echo=TRUE}
glm_model <- cpglm(Y ~ X1, data = simdata, link = "log")
summary(glm_model)
```

The estimated index parameter is very close to $\rho$ (the `ancillary` argument), and the estimated weights are close to $\beta$ (the `weight` argument). Overall, the function worked well.