---
title: "Frequently Asked Questions"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{FAQ}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

options(rmarkdown.html_vignette.check_title = FALSE)

```

Below are some frequently asked questions about the **fmtr** package. Click
on the links below to navigate to the full question and answer content.

## Index{#top}

* [How do I format a data frame?](#fdata)
* [How can I add a new, formatted column to my dataframe?](#fapply)
* [How do I use a format as a lookup?](#lookup)
* [Where is the documentation for the formatting codes?](#documentation)
* [How do I assign formats from metadata to my data frame?](#formats)
* [How do I assign labels from metadata to my data frame?](#labels)
* [How do I assign descriptions from metadata to my data frame?](#descriptions)
* [How do I create a format catalog?](#fcat)
* [How do I use formats from a format catalog?](#usefmt)
* [How do I assign formats from a catalog to a data frame?](#assign)
* [How do I create a format catalog from an Excel spreadsheet?](#excel)
* [How do I export my format catalog to a spreadsheet?](#export)
* [Can I create a user-defined format from data?](#asfmt)
* [Can I create an input format?](#input)
* [Can I create a numeric user-defined format?](#numeric)
* [Is there a way to set up a search path for a format catalog?](#path)
* [Is there a way to read in a format catalog from SAS®?](#sas7bcat)
* [What happened to the labels() function?](#labelsmoved)
* [Can I change a User-Defined Format after it has been created?](#udfedit)


## Content

### How do I format a data frame? {#fdata}

**Q:** I have a data frame with different types of variables:
numbers, dates, character values.  How can I format them without messing
up the original data?


**A:** With the **fmtr** package you can assign the formats to the 
"format" attribute of the dataframe, and the apply the formats using the 
`fdata()` function, returning the result to a new data frame.  Here is 
a simple example:
```{r eval=FALSE, echo=TRUE}
# Create sample data
dat <- data.frame(SUBJ = c(1, 2, 3),
                  BDATE = c(as.Date("1945-10-17"), 
                            as.Date("1967-09-04"),
                            as.Date("1998-04-28")),
                  SEX = c("M", "F", "M"),
                  WEIGHT = c(77.1107, 64.2848, 85.9842))
                  
# View data                  
dat
#   SUBJ      BDATE SEX  WEIGHT
# 1    1 1945-10-17   M 77.1107
# 2    2 1967-09-04   F 64.2848
# 3    3 1998-04-28   M 85.9842

# Assign formats
formats(dat) <- list(BDATE = "%Y/%m/%d",
                     SEX = c("M" = "Male", "F" = "Female"),
                     WEIGHT = "%1.1f kg")

# Apply formats to new data frame
dat2 <- fdata(dat)

# View new data frame
dat2
#   SUBJ      BDATE    SEX  WEIGHT
# 1    1 1945/10/17   Male 77.1 kg
# 2    2 1967/09/04 Female 64.3 kg
# 3    3 1998/04/28   Male 86.0 kg
```
[top](#top) 

******


### How can I add a new, formatted column to my dataframe? {#fapply}

**Q:** I want to add a new column to my data frame using a format.  This
column will be a categorization of an existing continuous column.  I want
to add it directly to my existing data frame, not create a new one.  How
can I do that?

**A:** First create a categorization format:  
```{r eval=FALSE, echo=TRUE}
fmt <- value(condition(x < 5, "A"),
             condition(x >= 5, "B"))


```
Then we'll create some sample data:
```{r eval=FALSE, echo=TRUE}
dat <- data.frame(ID = c(1, 2, 3),
                  NUM = c(2, 3, 7))
```
Then you can apply the format to your data frame
using either Base R or **tidyverse**.  
```{r eval=FALSE, echo=TRUE}
# Base R method
dat$CAT <- fapply(dat$NUM, fmt)

# View result
dat
#   ID NUM CAT
# 1  1   2   A
# 2  2   3   A
# 3  3   7   B

# tidyverse method
dat <- dat %>% 
       mutate(CAT = fapply(NUM, fmt))
       
dat
#   ID NUM CAT
# 1  1   2   A
# 2  2   3   A
# 3  3   7   B

```

[top](#top)

******

### How do I use a format as a lookup? {#lookup}

**Q:** I have a decode lookup I want to use on a variable in my dataframe.
How can I do this with the **fmtr** package?

**A:** There are two common ways to apply a lookup decode 
with the **fmtr** package. One is to create a named vector from the decode.
The other is to create a user-defined format.
```{r eval=FALSE, echo=TRUE}
# Create sample data frame
dat <- data.frame(ID = c(1, 2, 3, 4),
                  CODE = c("A", "C", "B", NA))

# Create decode vector
v1 <- c(A = "Value A", B = "Value B", C = "Value C")

# Create user-defined format
fmt1 <- value(condition(x == "A", "Value A"),
              condition(x == "B", "Value B"),
              condition(x == "C", "Value C"),
              condition(TRUE, "Other"))
              
# Apply decode vector
fapply(dat$CODE, v1)
# [1] "Value A" "Value C" "Value B" NA 

# Apply user-defined format
fapply(dat$CODE, fmt1)
# [1] "Value A" "Value C" "Value B" "Other"
```
As you can see, both the named vector and the user-defined format can
decode the data.  The advantage of the user-defined format is that it allows
you to handle NA values and assign defaults in a controlled way. 

The named vector is easy to create.  But there is no way to control what happens
to any data value that is not in the lookup.  Which method to use depends on your
data and the context in which you are applying the decode.  

Note that you may also write a vectorized function to perform the lookup.
See "Example 4" in the documentation on `fapply()` for a 
vectorized function example.

[top](#top) 

******


### Where is the documentation for the formatting codes? {#documentation}

**Q:** I'm trying to create some formats for dates and numbers, but am 
not sure what codes are available.  Where is the documentation for the
possible codes?

**A:** Some commonly used codes are documented as part of the **fmtr**
documentation [here](https://fmtr.r-sassy.org/reference/FormattingStrings.html).
Additional documentation on the possible codes for dates is 
[here](https://rdrr.io/r/base/strptime.html), and for numbers 
[here](https://rdrr.io/r/base/sprintf.html).

[top](#top)

******

### How do I assign formats from metadata to my data frame? {#formats}

**Q:** I have metadata for my datasets that includes the desired format
for each variable.  How can I apply these formats to my data?  

**A:** The metadata must ultimately map the variable name to the desired
format for that variable.  So we can imagine there are at least two columns 
in the metadata: the variable name and the format.
```{r eval=FALSE, echo=TRUE}
# Sample metadata
mdat <- data.frame(var = c("col1", "col2", "col3"),
                   fmt = c("%1.1f", "%m-%d-%Y", "%1.2f%%"))

# View metadata
mdat
#    var      fmt
# 1 col1    %1.1f
# 2 col2 %m-%d-%Y
# 3 col3  %1.2f%%

```
Then imagine another data frame that needs formatting:
```{r eval=FALSE, echo=TRUE}
# Sample data
dat <- data.frame(col1 = c(1.235, 3.3947, 7.2842),
                  col2 = c(as.Date("2021-11-01"),
                           as.Date("2021-11-02"),
                           as.Date("2021-11-03")),
                  col3 = c(23.325, 87.2746, 64.2184))

# View sample data
dat
#     col1       col2    col3
# 1 1.2350 2021-11-01 23.3250
# 2 3.3947 2021-11-02 87.2746
# 3 7.2842 2021-11-03 64.2184
```
Now we can put the format metadata into a list:
```{r eval=FALSE, echo=TRUE}
# Create list out of metadata vectors
lst <- as.list(mdat$fmt)
names(lst) <- mdat$var

# View List
lst
# $col1
# [1] "%1.1f"
#
# $col2
# [1] "%m-%d-%Y"
# 
# $col3
# [1] "%1.2f%%"

```
Now we can assign the list of formats to the dataframe format attributes 
using the `formats()` function:
```{r eval=FALSE, echo=TRUE}
# Assign formats to data
formats(dat) <- lst

# Data not formatted yet
dat
#     col1       col2    col3
# 1 1.2350 2021-11-01 23.3250
# 2 3.3947 2021-11-02 87.2746
# 3 7.2842 2021-11-03 64.2184

```
Then apply the formats using `fdata()`:
```{r eval=FALSE, echo=TRUE}
# Apply the formats to entire data frame
fdata(dat)
#   col1       col2   col3
# 1  1.2 11-01-2021 23.32%
# 2  3.4 11-02-2021 87.27%
# 3  7.3 11-03-2021 64.22%
```
[top](#top) 

******


### How do I assign labels from metadata to my data frame? {#labels}

**Q:** I have metadata for my datasets that includes the desired label
for each variable.  How can I apply these labels to my data?  

**A:** This question is similar to the above question concerning formats in
metadata.  The process for applying labels from metadata will be nearly identical
to the process for formats.  You will create a named list of labels from the 
metadata, then assign it to the dataframe, and apply it with `fdata()`.  The 
difference is that in the case of labels, you will assign them with the 
`labels()` function instead of the `formats()` function.  Like this:
```{r eval=FALSE, echo=TRUE}
# Create sample list of labels
lst <- list(col1 = "My First Column",
            col2 = "My Second Column", 
            col3 = "My Third Column")
            
# Create sample data frame
dat <- data.frame(col1 = c(1.235, 3.3947, 7.2842),
                  col2 = c(as.Date("2021-11-01"),
                           as.Date("2021-11-02"),
                           as.Date("2021-11-03")),
                  col3 = c(23.325, 87.2746, 64.2184))

# Assign labels to data frame
labels(dat) <- lst

# View label attributes
labels(dat)
# $col1
# [1] "My First Column"
# 
# $col2
# [1] "My Second Column"
# 
# $col3
# [1] "My Third Column"

```
Note that starting in **fmtr** 1.5.8, the `labels()` function has 
been moved to the **common** package.  To use the `labels()` function, 
please reference that package.

[top](#top)

******

### How do I assign descriptions from metadata to my data frame? {#descriptions}

**Q:** I have metadata for my datasets that includes the desired descriptions
for each variable.  How can I apply these descriptions to my data?


**A:** This question is similar to the questions on applying formats from
metadata, and applying labels from metadata.  The only difference is that 
you will use the `descriptions()` function.
```{r eval=FALSE, echo=TRUE}
# Assign descriptions
descriptions(dat) <- list(col1 = "Here is my description for col1.",
                          col2 = "Here is my description for col2.",
                          col3 = "Here is my description for col3.")

# View descriptions
descriptions(dat)
# $col1
# [1] "Here is my description for col1."
# 
# $col2
# [1] "Here is my description for col2."
# 
# $col3
# [1] "Here is my description for col3."

```

[top](#top) 

******


### How do I create a format catalog? {#fcat}

**Q:** I want to create a format catalog, save that catalog, and reuse it
later.  How can I do that?

**A:** The **fmtr** package provides all the functions necessary to do what
you want.  Here is an example:
```{r eval=FALSE, echo=TRUE}
library(fmtr)

# Create format catalog
fmts <- fcat(AGECAT = value(condition(x >= 18 & x <= 24, "18 to 24"),
                            condition(x >= 25 & x <= 44, "25 to 44"),
                            condition(x >= 45 & x <= 64, "45 to 64"),
                            condition(x >= 65, ">= 65"),
                            condition(TRUE, "Other")),
             SEX = value(condition(is.na(x), "Missing"),
                         condition(x == "M", "Male"),
                         condition(x == "F", "Female"),
                         condition(TRUE, "Other")),
             VAR = c("AGE" = "Age", 
                     "AGECAT" = "Age Group", 
                     "SEX" = "Sex")) 

# Save format catalog
write.fcat(fmts, "c:/mypath")

# Read format catalog back in
fmts <- read.fcat("c:/mypath/fmts.fcat")

# View format catalog
fmts
# # A format catalog: 3 formats
# - $AGECAT: type U, 5 conditions
# - $SEX: type U, 4 conditions
# - $VAR: type V, 3 elements

# Use restored formats
fapply(c(55, 27, 19), fmts$AGECAT)
# [1] "45 to 64" "25 to 44" "18 to 24"
```

[top](#top)

******

### How do I use formats from a format catalog? {#usefmt}

**Q:** My colleague gave me a format catalog.  How can I use it?

**A:** First read the format catalog into R using `read.fcat()`.  Then
you can begin using the formats in the catalog using dollar sign ($) notation.
Here is an example:
```{r eval=FALSE, echo=TRUE}
# Read format catalog back in
fmts <- read.fcat("c:/mypath/fmts.fcat")

# View format catalog
fmts
# # A format catalog: 3 formats
# - $AGECAT: type U, 5 conditions
# - $SEX: type U, 4 conditions
# - $VAR: type V, 3 elements

# Use restored formats
fapply(c(55, 27, 19), fmts$AGECAT)
# [1] "45 to 64" "25 to 44" "18 to 24"


```

[top](#top) 

******


### How do I assign formats from a catalog to a data frame? {#assign}

**Q:** I have a format catalog that I use to store formats.  The formats are 
shared between several datasets.  How can I assign formats from the 
catalog to one of my datasets?

**A:** Read the format catalog in using `read.fcat()`, then assign the formats
using the `formats()` function.  Like this:
```{r eval=FALSE, echo=TRUE}
# Read format catalog in
fmts <- read.fcat("c:/mypath/fmts.fcat")

# View format catalog
fmts
# # A format catalog: 3 formats
# - $AGECAT: type U, 5 conditions
# - $SEX: type U, 4 conditions
# - $VAR: type V, 3 elements

# Create sample data frame
dat <- read.table(header = TRUE, text = '
SUBJECT  AGECAT SEX
101       35      F
102       19      F
103       57      M
')

# Assign formats from catalog to data frame
formats(dat) <- fmts

# View formatted data
fdata(dat)
#   SUBJECT   AGECAT    SEX
# 1     101 25 to 44 Female
# 2     102 18 to 24 Female
# 3     103 45 to 64   Male

```
Note that this only works when the format names in the catalog correspond
to the column names in the dataframe.  If the names in the catalog do not
correspond to the column names, it is best to manipulate the names of
the format catalog using the `names()` function so that they 
match the column names in the dataframe. Then proceed as above. 
Like this:
```{r eval=FALSE, echo=TRUE}
# Read format catalog in
fmts <- read.fcat("c:/packages/fmts.fcat")

# View format catalog
fmts
# # A format catalog: 3 formats
# - $AGECAT: type U, 5 conditions
# - $SEX: type U, 4 conditions
# - $VAR: type V, 3 elements

# Create sample data frame
dat <- read.table(header = TRUE, text = '
SUBJ     AGE GENDER
101       35      F
102       19      F
103       57      M
')

# Reassign format names in catalog
names(fmts) <- c("AGE", "GENDER", "VAR")

# Assign formats from catalog to data frame
formats(dat) <- fmts

# View formatted data
fdata(dat)
#   SUBJECT   AGECAT    SEX
# 1     101 25 to 44 Female
# 2     102 18 to 24 Female
# 3     103 45 to 64   Male
```
[top](#top)

******

### How do I create a format catalog from an Excel spreadsheet? {#excel}

**Q:** I have format information stored in an Excel spreadsheet.  Can I use
that to create a format catalog and format my data?

**A:** Yes, provided the data is either in the correct structure or 
can be put in the correct structure to create a format catalog.  The correct
structure includes the following columns: Name, Type, Expression, Label and Order.
See the documentation on `as.fcat.data.frame()` for further description of the 
needed column values.

Here is an example showing Excel data that is already in the correct structure:
```{r eval=FALSE, echo=TRUE}
library(fmtr)
library(readxl)

# Read data from Excel
xldat <- read_excel("c:\\packages\\myxlfile.xlsx")

# View data frame
xldat
# # A tibble: 10 x 5
# Name   Type  Expression                                                Label    Order
# <chr>  <chr> <chr>                                                     <chr>    <lgl>
# 1  AGECAT U     "x >= 18 & x <= 24"                                       18 to 24 NA   
# 2  AGECAT U     "x >= 25 & x <= 44"                                       25 to 44 NA   
# 3  AGECAT U     "x >= 45 & x <= 64"                                       45 to 64 NA   
# 4  AGECAT U     "x >= 65"                                                 >= 65    NA   
# 5  AGECAT U     "TRUE"                                                    Other    NA   
# 6  SEX    U     "is.na(x)"                                                Missing  NA   
# 7  SEX    U     "x == \"M\""                                              Male     NA   
# 8  SEX    U     "x == \"F\""                                              Female   NA   
# 9  SEX    U     "TRUE"                                                    Other    NA   
# 10 VAR    V     "c(AGE = \"Age\", AGECAT = \"Age Group\", SEX = \"Sex\")" NA       NA 

# Convert dataframe to format catalog
fmts <- as.fcat(xldat)

# View format catalog                     
fmts
# # A format catalog: 3 formats
# - $AGECAT: type U, 5 conditions
# - $SEX: type U, 4 conditions
# - $VAR: type V, 3 element


# Create sample data frame
dat <- read.table(header = TRUE, text = '
SUBJECT AGECAT  SEX
101       35      F
102       19      F
103       57      M
')

# Assign formats from catalog
formats(dat) <- fmts

# Apply formats
fdata(dat)
#   SUBJECT   AGECAT    SEX
# 1     101 25 to 44 Female
# 2     102 18 to 24 Female
# 3     103 45 to 64   Male
```

[top](#top) 

******


### How do I export my format catalog to a spreadsheet? {#export}

**Q:** I have a format catalog I created in R with the **fmtr** package. 
I want to store that catalog in a spreadsheet for documentation purposes. 
How can I do that?

**A:** There is a Base R function `as.data.frame()` that can be used to 
convert a **fmtr** user-defined format or a format catalog to a data frame.
From there, it is easy to export to Excel or any other file format you like.
Here is an example:
```{r eval=FALSE, echo=TRUE}
library(fmtr)
library(openxlsx)

# Create sample format catlog
fmts <- fcat(AGECAT = value(condition(x >= 18 & x <= 24, "18 to 24"),
                            condition(x >= 25 & x <= 44, "25 to 44"),
                            condition(x >= 45 & x <= 64, "45 to 64"),
                            condition(x >= 65, ">= 65"),
                            condition(TRUE, "Other")),
             SEX = value(condition(is.na(x), "Missing"),
                         condition(x == "M", "Male"),
                         condition(x == "F", "Female"),
                         condition(TRUE, "Other")),
             VAR = c("AGE" = "Age", 
                     "AGECAT" = "Age Group", 
                     "SEX" = "Sex")) 
                     
# View format catalog                     
fmts
# # A format catalog: 3 formats
# - $AGECAT: type U, 5 conditions
# - $SEX: type U, 4 conditions
# - $VAR: type V, 3 element

# Convert format catalog to data frame
dat <- as.data.frame(fmts)

# Write data frame to Excel using openxlsx
write.xlsx(dat, "c:\\mypath\\myxlfile.xlsx")
```

[top](#top)

******

### Can I create a user-defined format from data? {#asfmt}

**Q:** I have a dataset with a code list that I want to create a 
user-defined format from.  Is there a way to do that?


**A:** Yes.  There is a function `as.fmt()` that allows you to convert a 
data frame into a user-defined format.  But the input dataframe needs a 
specific structure.  Here is an example:

```{r eval=FALSE, echo=TRUE}
library(fmtr)

# Create sample input data 
dat <- read.table(header = TRUE, text ='
        Col1  Col2 
        A    "Label A"
        B    "Label B"
        C    "Label C"')

# Create main conditions
df1 <- data.frame(Name = "myfmt", 
                  Type = "U", 
                  Expression = paste0("x == '", dat$Col1, "'"),
                  Label = dat$Col2,
                  Order = NA)

# Create default condition
df2 <- data.frame(Name = "myfmt", 
                  Type = "U", 
                  Expression = "TRUE",
                  Label = "Other",
                  Order = NA)

# Append default condition
df <- rbind(df1, df2)

# View input data
df
#    Name Type Expression   Label Order
# 1 myfmt    U   x == 'A' Label A    NA
# 2 myfmt    U   x == 'B' Label B    NA
# 3 myfmt    U   x == 'C' Label C    NA
# 4 myfmt    U       TRUE   Other    NA

# Convert data frame to user-defined format 
fmt <- as.fmt(df)

# Apply the format
fapply(c("A", "B", "C", NA), fmt)
# [1] "Label A" "Label B" "Label C" "Other" 

```

[top](#top) 

******


### Can I create an input format? {#input}

**Q:** SAS® distinguishes between an input format and an output format.  Is
there a similar distinction in the **fmtr** package?

**A:** No.  All **fmtr** formats are output formats.

[top](#top)

******

### Can I create a numeric user-defined format? {#numeric}

**Q:** I have a column of data with some character values that I want
to convert to a number.  I'd like to create a format to do that.  Can 
I create a user-defined format that returns a number instead of a text string?

**A:** Yes. The second parameter of the `condition()` function accepts
a character, numeric, or logical value.  That means a **fmtr** user-defined format
can be used to translate incoming values, whether character or numeric, 
to a number.  Here is an example:

```{r eval=FALSE, echo=TRUE}
library(fmtr)

nfmt <- value(condition(x == "A", 1),
              condition(x == "B", 2),
              condition(TRUE, 3))

fapply(c("A", "B", "C"), nfmt)
# [1] 1 2 3

```

[top](#top) 

******

### Is there a way to set up a search path for a format catalog? {#path}

**Q:** In SAS® you can set up a search path for a format catalog, so 
you don't need to read it.  You can just reference the format names
and it will work.  Is there a similar functionality in the **fmtr** package?

**A:** Not exactly. What you can do is use the `file.find()` function
from the *common* package to search for the format catalog file, and
then read the catalog into your program using `read.fcat()`. From there you
just use the format catalog as normal.

[top](#top) 

******


### Is there a way to read in a format catalog from SAS®? {#sas7bcat}

**Q:** I have an existing format catalog from SAS® that I want to convert
to R.  Does the **fmtr** package provide a way to read in a SAS® format
catalog? 

**A:** Not directly. What you can do is export the SAS format catalog to 
a dataset, read the dataset into R, and rearrange the data to correspond to the
requirements of `as.fcat.data.frame()`. Then use `as.fcat.data.frame()` 
to convert the data frame to a *fmtr* format catalog.  Note that
SAS® provides a lot of functionality in their formats that cannot be reproduced
in **fmtr**.  So there is no guarantee that all your SAS formats will convert
as desired.

[top](#top)

******

### What happened to the labels() function? {#labelsmoved}

**Q:** I had been using the `labels()` function in a few programs,
and now they are all broken.  It appears the `labels()` function is no longer
part of the **fmtr** package.  What happened? 

**A:** The `labels()` function has been moved to the **common** package.
The reason it was moved is because this function is so generally useful, 
that the **common** package was deemed a more appropriate home.  You can
fix your code by simply adding a reference to the **common** package.

[top](#top)

******

### Can I change a User-Defined Format after it has been created? {#udfedit}

**Q:** I created a User_Define Format, and now I want to add a new item to it.
Is there a way to do that? 

**A:** Yes.  Reminder that, in the end, a User-Defined Format is a list, and can
be manipulated as such.  You can change a User-Defined Format the same way
you change a list.  For example:

```{r eval=FALSE, echo=TRUE}
library(fmtr)

# Create format
fmt <- value(condition(x == "A", "Group A"),
             condition(x == "B", "Group B"))

# Create sample data
dat <- c("A", "B", "C")

# Apply format
fapply(dat, fmt)
# [1] "Group A" "Group B" "C"

# Add "C" condition to format
fmt[[length(fmt) + 1]] <- condition(x == "C", "Group C")

# Apply revised format
fapply(dat, fmt)
# [1] "Group A" "Group B" "Group C"


```


[top](#top)

******