---
author: "Joseph Larmarange"
title: "Variable labels and packed columns"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Variable labels and packed columns}
  %\VignetteEngine{knitr::rmarkdown}
  \usepackage[utf8]{inputenc}
---

The **tidyr** package can group several columns of a tibble into a single df-column; see `tidyr::pack()`. Such a df-column is itself a tibble. It is not currently clear why you would want to pack columns, since few functions work with this sort of data.

```{r}
library(tidyr)
d <- iris %>%
  as_tibble() %>%
  pack(
    Sepal = starts_with("Sepal"),
    Petal = starts_with("Petal"),
    .names_sep = "."
  )
str(d)
class(d$Sepal)
```
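As a quick check of what packing does, the sketch below packs only the sepal columns of a small extract of `iris` and then reverses the operation with `tidyr::unpack()`. The object names `packed` and `unpacked` are chosen for this illustration only.

```{r}
library(tidyr)

# Pack the two Sepal columns of a small extract of iris
packed <- head(iris, 3) %>%
  as_tibble() %>%
  pack(Sepal = starts_with("Sepal"), .names_sep = ".")

# The df-column is a data frame; its sub-columns lose the "Sepal." prefix
is.data.frame(packed$Sepal)
names(packed$Sepal)

# unpack() restores ordinary columns; names_sep puts the prefix back
unpacked <- packed %>% unpack(Sepal, names_sep = ".")
names(unpacked)
```

Using `names_sep` in `unpack()` mirrors `.names_sep` in `pack()`, so the round trip recovers the original column names.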

Regarding variable labels, you may want to define a label for a sub-column of a df-column, or a label for the df-column itself.

For a sub-column, you can simply use `var_label()` to define the label.

```{r}
library(labelled)
var_label(d$Sepal$Length) <- "Length of the sepal"
str(d)
```

However, you cannot use `var_label()` directly on the df-column.

```{r}
# This does not label the df-column itself:
# the label is applied to each sub-column of Petal
var_label(d$Petal) <- "wrong label for Petal"
str(d)
```

Because `d$Petal` is itself a tibble, applying `var_label()` to it affects each of its sub-columns. To set a variable label on the df-column itself, use `label_attribute()`.

```{r}
label_attribute(d$Petal) <- "correct label for Petal"
str(d)
```
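To make the difference between the two accessors concrete, here is a minimal sketch on a separate small tibble (the name `ex` and the label text are illustrative assumptions). It sets a label with `label_attribute()` and then reads it back with both functions.

```{r}
library(tidyr)
library(labelled)

# Small example tibble with a single df-column named Petal
ex <- head(iris, 3) %>%
  as_tibble() %>%
  pack(Petal = starts_with("Petal"), .names_sep = ".")

label_attribute(ex$Petal) <- "Petal measurements"

# label_attribute() reads the label stored on the df-column itself
label_attribute(ex$Petal)

# var_label() treats ex$Petal as a tibble and lists the labels of its
# sub-columns, which are still unset here
var_label(ex$Petal)
```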

In contrast, `set_variable_labels()` behaves differently, because this function is primarily intended to work on the columns of a tibble.

```{r}
d <- d %>% set_variable_labels(Sepal = "Label of the Sepal df-column")
str(d)
```

This is equivalent to:

```{r}
var_label(d) <- list(Sepal = "Label of the Sepal df-column")
str(d)
```

To use `set_variable_labels()` on sub-columns, apply it to the df-column and assign the result back:

```{r}
d$Petal <- d$Petal %>%
  set_variable_labels(
    Length = "Petal length",
    Width = "Petal width"
  )
str(d)
```
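Since `var_label()` applied to a df-column acts on its sub-columns, the assignment form with a named list should give the same result. The following is a sketch on a separate tibble (the name `ex_sub` is an assumption made for this example).

```{r}
library(tidyr)
library(labelled)

ex_sub <- head(iris, 3) %>%
  as_tibble() %>%
  pack(Petal = starts_with("Petal"), .names_sep = ".")

# A named list targets individual sub-columns of the df-column
var_label(ex_sub$Petal) <- list(
  Length = "Petal length",
  Width = "Petal width"
)
var_label(ex_sub$Petal)
```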

By default, `var_label()` and `get_variable_labels()` return only the labels of the first level of columns of a tibble.

```{r}
d %>% get_variable_labels()
```

To obtain the variable labels of sub-columns as well, use `recurse = TRUE`:

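```{r}
# Nested list of labels, including those of sub-columns
d %>% get_variable_labels(recurse = TRUE)

# Same, but with missing labels filled by the variable name
# and the result unlisted into a named vector
d %>%
  get_variable_labels(
    recurse = TRUE,
    null_action = "fill",
    unlist = TRUE
  )
```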