---
title: "Format Apply Function"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Format Apply Function}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```
### How to use fapply()

The `fapply()` function applies a format to a vector, factor, or list. 
This function 
may be used independently of the `fdata()` function.  Here is an example:
```{r eval=FALSE, echo=TRUE} 
v1 <- c("A", "B", "C", "B")
v1

# [1] "A" "B" "C" "B"

fmt1 <- value(condition(x == "A", "Label A"),
              condition(x == "B", "Label B"),
              condition(TRUE, "Other"))

fapply(v1, fmt1)

# "Label A" "Label B"   "Other" "Label B" 

```
One advantage of using `fapply()` is that your original data is not
altered. The formatted values are assigned to a new object. If your 
original data changes, the formatting function should be reapplied 
to maintain consistency with the original data.

## What kind of formats are available
Data can be formatted with several different types of objects:

  * A formatting string
  * A named vector
  * A user-defined format
  * A vectorized function
  * A formatting list

You can use the type of formatting object that is most suitable to 
your data and situation.  Each type of formatting object has it's own 
strengths and weaknesses.

#### Formatting String
The formatting functions accept formatting strings such as those associated
with the Base R `format()` and `sprintf()` functions.  If the data type of the 
vector is a date or datetime, `fapply()` will use the format codes associated 
with the `format()` function.  For other data types, `fapply()` will use the format
codes associated with `sprintf`.  Here is an example:
```{r eval=FALSE, echo=TRUE} 
v1 <- c(1.367, 8.356, 4.583, 2.873)

fapply(v1, "%.1f%%")

[1] "1.4%" "8.4%" "4.6%" "2.9%"

```

#### Named vectors
Data may be formatted using a named vector as a lookup.  Simply ensure that 
the names on the formatting vector correspond to the values in the data vector.  
The advantage
of using a named vector for formatting is its simplicity.  The disadvantage
is that it only works with character values. Here is an 
example of formatting using a named vector:
```{r eval=FALSE, echo=TRUE} 
v1 <- c("A", "B", "C", "B")

fmt1 <- c(A = "Label A", B = "Label B", C= "Label C")

fapply(v1, fmt1)

# "Label A" "Label B" "Label C" "Label B" 

```
#### User-defined formats
The **fmtr** package provides custom functions for creating user-defined 
formats, in a manner that is similar to a SAS® user-defined format.
These functions are `value()` and `condition()`.
The `value()` function accepts one or more conditions.  The `condition()`
function accepts an expression/label pair.  A user-defined format has the 
advantage of a clear and flexible syntax.  It is excellent for categorizing
data.  Here is an example of a user-defined 
format:
```{r eval=FALSE, echo=TRUE} 
v1 <- c("A", "B", NA, "C")

fmt2 <- value(condition(is.na(x), "Missing"),
              condition(x == "A", "Label A"),
              condition(x == "B", "Label B"),
              condition(TRUE, "Other"))
              
fapply(v1, fmt2)

# "Label A" "Label B"   "Missing" "Other" 

```
The user-defined format may also be used to format values conditionally.
Conditional formatting is accomplished by using a formatting string
as the label. The following example formats a numeric value two decimal
places, unless it exceeds a specified range.
```{r eval=FALSE, echo=TRUE} 
v2 <- c(18.3987, 15.45852, 8.9835, 11.246246, 25.3858, NA)

fmt3 <- value(condition(is.na(x), "Missing"),
              condition(x < 10, "Low"),
              condition(x > 20, "High"),
              condition(TRUE, "%.2f"))
              
fapply(v2, fmt3)

# [1] "18.40"   "15.46"   "Low"     "11.25"   "High"    "Missing"
```

#### Vectorized functions
Vectorized functions provide the most powerful way of formatting 
data.  Vectorized functions can be user-defined, or wrapping an 
available packaged function.  The vectorized function has the advantage 
of being nearly limitless
in the types of formatting you can perform.  The drawback is that a 
vectorized function can be more complicated to write.  Here is an 
example of formatting 
with a user-defined, vectorized function:
```{r eval=FALSE, echo=TRUE} 
v1 <- c("A", "B", NA, "C")

fmt2 <- Vectorize(function(x) {
    
    if (is.na(x)) 
      ret <- "Missing"
    else if (x %in% c("A", "B"))
      ret <- paste("Label", x)
    else 
      ret <- "Other"
    
    return(ret)
    
  })
  
fapply(v1, fmt2)

# "Label A" "Label B"   "Missing" "Other" 

```

#### A formatting list
Sometimes data needs to be formatted differently for each row.  This 
situation is difficult to deal with in any language.  
But it can be made easy in R with the **fmtr** package and a formatting list.

A formatting list is a list that contains one or more of the four types
of formatting objects described above.  It is defined with the `flist()` 
function. A formatting list can be applied in 
two different ways: in order, or with a lookup.  

By default, the 
list is applied in order.  That means the first format in the list is 
applied to the first item in the vector, the second format in the list is 
applied to the second item in the vector, and so on.  The list is recycled 
if the number of list items is shorter than the number of values in the 
vector.

For the lookup method, the formatting object is specified by a lookup vector. 
The lookup vector should contain names 
associated with the elements in the formatting list. The lookup vector should 
also contain the same number of items as the data vector.  For each item
in the data vector, **fmtr** will look up the appropriate format
from the formatting list, and apply that format to the
corresponding data value.

The following is an example of a lookup style formatting list:
```{r eval=FALSE, echo=TRUE} 
# Set up data
v1 <- c("num", "char", "date", "char", "date", "num")
v2 <- list(1.258, "H", as.Date("2020-06-19"),
           "L", as.Date("2020-04-24"), 2.8865)

df <- data.frame(type = v1, values = I(v2))
df

#    type     values
# 1   num      1.258
# 2  char          H
# 3  date 2020-06-19
# 4  char          L
# 5  date 2020-04-24
# 6   num     2.8865

# Set up formatting list
lst <- flist(type = "row", lookup = v1,
             num = "%.1f",
             char = value(condition(x == "H", "High"),
                          condition(x == "L", "Low"),
                          condition(TRUE, "NA")),
             date = "%y-%m")

# Assign formatting list to values column
attr(df$values, "format") <- lst


# Apply formatting
fdata(df)

#   type values
# 1  num    1.3
# 2 char   High
# 3 date  20-06
# 4 char    Low
# 5 date  20-04
# 6  num    2.9


```

Next: [Format Catalogs](fmtr-fcat.html)