---
title: "Compute weighted mean with `stat_weighted_mean()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Compute weighted mean with `stat_weighted_mean()`}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(ggstats)
library(ggplot2)
```


`stat_weighted_mean()` computes the mean value of **y** (taking into account the **weight** aesthetic, if provided) for each value of **x**. More precisely, it returns a new data frame with one row per unique value of **x** and the following new variables:

- **y**: mean value of the original **y** (i.e. **numerator**/**denominator**)
- **numerator**
- **denominator**
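
As a rough sketch of what these variables hold, the same quantities can be computed by hand in base R. This assumes that **numerator** is the sum of `weight * y` and **denominator** is the sum of the weights within each value of **x**. The `toy` data frame and its column names are hypothetical and only serve the illustration.

```{r}
# Hypothetical toy data, for illustration only
toy <- data.frame(
  x = c("a", "a", "b", "b"),
  y = c(1, 3, 2, 6),
  w = c(1, 3, 1, 1)
)

# Assumed definitions: sum of weighted values and sum of weights per group
numerator <- tapply(toy$y * toy$w, toy$x, sum)
denominator <- tapply(toy$w, toy$x, sum)

# One row per unique value of x, with y = numerator / denominator
data.frame(
  x = names(numerator),
  y = as.vector(numerator / denominator),
  numerator = as.vector(numerator),
  denominator = as.vector(denominator)
)
```

For group `a`, the weighted mean is (1 × 1 + 3 × 3) / (1 + 3) = 2.5, whereas the unweighted mean would be 2.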

Let's take an example. The following plot shows all tips received according to the day of the week.

```{r}
data(tips, package = "reshape")
ggplot(tips) +
  aes(x = day, y = tip) +
  geom_point()
```

To plot their mean value per day, simply use `stat_weighted_mean()`.

```{r}
ggplot(tips) +
  aes(x = day, y = tip) +
  stat_weighted_mean()
```

We can specify the geometry we want using the `geom` argument. Note that for lines, we need to specify the **group** aesthetic as well.

```{r}
ggplot(tips) +
  aes(x = day, y = tip, group = 1) +
  stat_weighted_mean(geom = "line")
```

An alternative is to specify the statistic in `ggplot2::geom_line()`.

```{r}
ggplot(tips) +
  aes(x = day, y = tip, group = 1) +
  geom_line(stat = "weighted_mean")
```

Of course, it can be used with other geometries. Here is a bar plot.

```{r}
p <- ggplot(tips) +
  aes(x = day, y = tip, fill = sex) +
  stat_weighted_mean(geom = "bar", position = "dodge") +
  ylab("mean tip")
p
```

Facets are easy to add. In that case, the computation is done separately for each facet.

```{r}
p + facet_grid(rows = vars(smoker))
```

`stat_weighted_mean()` can also be used to compute proportions, since a proportion is technically the mean of binary values (0 or 1).

```{r}
ggplot(tips) +
  aes(x = day, y = as.integer(smoker == "Yes"), fill = sex) +
  stat_weighted_mean(geom = "bar", position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  ylab("proportion of smokers")
```
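
The equivalence between a proportion and a mean of binary values can be checked on a small example. The `smoker_toy` vector below is hypothetical and is not taken from `tips`.

```{r}
# Hypothetical vector of answers, for illustration only
smoker_toy <- c("Yes", "No", "No", "Yes", "No")

# Mean of the 0/1 indicator
mean_of_binary <- mean(as.integer(smoker_toy == "Yes"))

# Proportion obtained from a frequency table
proportion <- prop.table(table(smoker_toy))[["Yes"]]

c(mean_of_binary = mean_of_binary, proportion = proportion)
```

Both approaches give the same value, which is why passing `as.integer(smoker == "Yes")` as **y** yields a proportion.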

Finally, you can use the **weight** aesthetic to specify the weights to take into account when computing means or proportions.

```{r}
d <- as.data.frame(Titanic)
ggplot(d) +
  aes(x = Class, y = as.integer(Survived == "Yes"), weight = Freq, fill = Sex) +
  geom_bar(stat = "weighted_mean", position = "dodge") +
  scale_y_continuous(labels = scales::percent) +
  labs(y = "Proportion who survived")
```
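
To see what the weights do here, the survival proportions can be approximated by hand with base R. This is only a sketch of the arithmetic (weighted sum of the 0/1 indicator divided by the sum of the weights), not the internal code of `stat_weighted_mean()`. The helper columns `survived` and `survivors` are names introduced for this example.

```{r}
d <- as.data.frame(Titanic)

# 0/1 indicator and its weighted version (helper columns for this sketch)
d$survived <- as.integer(d$Survived == "Yes")
d$survivors <- d$Freq * d$survived

# Weighted sum of the indicator and sum of the weights, by Class and Sex
num <- xtabs(survivors ~ Class + Sex, data = d)
den <- xtabs(Freq ~ Class + Sex, data = d)

round(num / den, 3)
```

Each row of `d` describes a group of passengers rather than a single person, so ignoring `Freq` would give every combination of class, sex, age and survival status the same importance regardless of its size.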