The DrugUtilisation package includes a range of functions that add drug-related information about subjects in OMOP CDM tables and cohort tables. In this vignette, we explore these functions and provide some examples of their usage.
library(DrugUtilisation)
library(CodelistGenerator)
library(CDMConnector)
library(dplyr)
library(PatientProfiles)
cdm <- mockDrugUtilisation(numberIndividual = 200)
We will use acetaminophen as our example drug to construct our drug utilisation cohort. To begin, we will use the getDrugIngredientCodes() function from CodelistGenerator to generate a concept list associated with acetaminophen.
conceptList <- getDrugIngredientCodes(cdm, c("acetaminophen"))
conceptList
#> $acetaminophen
#> [1] 1125315 1125360 2905077 43135274
Next, we create a drug utilisation cohort by passing conceptList to the generateDrugUtilisationCohortSet() function. For a better understanding of the arguments and functionalities of generateDrugUtilisationCohortSet(), please refer to the "Use DrugUtilisation to create a cohort" vignette.
cdm <- generateDrugUtilisationCohortSet(
  cdm = cdm,
  name = "acetaminophen_example1",
  conceptSet = conceptList
)
The addRoute() function uses an internal CSV file containing all possible routes for the drug dose forms supported by the package. It adds route information to your drug table for the supported dose forms. The example below shows how it works.
cdm[["drug_exposure"]] %>%
  addRoute()
#> # Source: SQL [?? x 8]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> drug_exposure_id person_id drug_concept_id drug_exposure_start_date
#> <int> <int> <dbl> <date>
#> 1 1 1 1503328 2022-01-11
#> 2 2 1 2905077 2021-10-26
#> 3 3 1 1539463 2021-08-10
#> 4 4 1 1516980 2021-12-20
#> 5 5 2 1516978 2013-05-09
#> 6 6 2 2905077 2013-01-20
#> 7 7 3 1125360 2013-05-25
#> 8 9 3 1516978 2010-07-16
#> 9 10 3 2905077 2013-01-26
#> 10 11 4 1125360 2003-12-11
#> # ℹ more rows
#> # ℹ 4 more variables: drug_exposure_end_date <date>,
#> # drug_type_concept_id <dbl>, quantity <dbl>, route <chr>
The patternTable() function in the DrugUtilisation package derives patterns from a drug strength table. It extracts the distinct patterns and associates them with a pattern_id and a formula_id. The resulting tibble provides the following data:
number_concepts: the count of distinct concepts in the patterns.
number_ingredients: the count of distinct ingredients involved.
number_records: the overall count of records in the patterns.
Moreover, the tibble includes a column indicating potentially valid and invalid combinations.
patternTable(cdm)
#> # A tibble: 5 × 12
#> pattern_id formula_name validity number_concepts number_ingredients
#> <dbl> <chr> <chr> <dbl> <dbl>
#> 1 9 fixed amount formulati… pattern… 7 4
#> 2 18 concentration formulat… pattern… 1 1
#> 3 24 concentration formulat… pattern… 1 1
#> 4 40 concentration formulat… pattern… 1 1
#> 5 NA <NA> no patt… 4 4
#> # ℹ 7 more variables: number_records <dbl>, amount_numeric <dbl>,
#> # amount_unit_concept_id <dbl>, numerator_numeric <dbl>,
#> # numerator_unit_concept_id <dbl>, denominator_numeric <dbl>,
#> # denominator_unit_concept_id <dbl>
For detailed information about the patterns, their associated formulas, and the combinations of amount_unit, numerator_unit, and denominator_unit, you can refer to the data:
patternsWithFormula
Now that we have all the supported patterns and formulas, daily doses can be computed using the addDailyDose() function. This function extends the data with additional columns, including those for quantity, daily dose, unit, and route.
addDailyDose(
  cdm$drug_exposure,
  cdm = cdm,
  ingredientConceptId = 1125315
)
#> # Source: table<og_091_1717442365> [?? x 9]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> drug_exposure_id person_id drug_concept_id drug_exposure_start_date
#> <int> <int> <dbl> <date>
#> 1 2 1 2905077 2021-10-26
#> 2 6 2 2905077 2013-01-20
#> 3 7 3 1125360 2013-05-25
#> 4 10 3 2905077 2013-01-26
#> 5 11 4 1125360 2003-12-11
#> 6 13 4 1125360 1990-09-04
#> 7 15 4 43135274 1995-04-04
#> 8 21 6 43135274 2020-05-19
#> 9 25 7 2905077 2005-11-18
#> 10 26 7 2905077 2008-10-22
#> # ℹ more rows
#> # ℹ 5 more variables: drug_exposure_end_date <date>,
#> # drug_type_concept_id <dbl>, quantity <dbl>, daily_dose <dbl>, unit <chr>
There is also a function, dailyDoseCoverage(), to check the coverage of the daily dose computation for chosen concept sets and ingredients.
suppressWarnings(dailyDoseCoverage(cdm, 1125315))
#> ℹ The following estimates will be computed:
#> • daily_dose: count_missing, percentage_missing, mean, sd, min, q05, q25,
#> median, q75, q95, max
#> ! Table is collected to memory as not all requested estimates are supported on
#> the database side
#> → Start summary of data, at 2024-06-03 20:19:25
#>
#> ✔ Summary finished, at 2024-06-03 20:19:25
#> # A tibble: 84 × 13
#> result_id cdm_name group_name group_level strata_name strata_level
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 DUS MOCK ingredient_name acetaminophen overall overall
#> 2 1 DUS MOCK ingredient_name acetaminophen overall overall
#> 3 1 DUS MOCK ingredient_name acetaminophen overall overall
#> 4 1 DUS MOCK ingredient_name acetaminophen overall overall
#> 5 1 DUS MOCK ingredient_name acetaminophen overall overall
#> 6 1 DUS MOCK ingredient_name acetaminophen overall overall
#> 7 1 DUS MOCK ingredient_name acetaminophen overall overall
#> 8 1 DUS MOCK ingredient_name acetaminophen overall overall
#> 9 1 DUS MOCK ingredient_name acetaminophen overall overall
#> 10 1 DUS MOCK ingredient_name acetaminophen overall overall
#> # ℹ 74 more rows
#> # ℹ 7 more variables: variable_name <chr>, variable_level <chr>,
#> # estimate_name <chr>, estimate_type <chr>, estimate_value <chr>,
#> # additional_name <chr>, additional_level <chr>
Additional drug usage details, including duration, initial dose, cumulative dose, etc., can be incorporated into a cohort using the addDrugUse() function.
cdm$acetaminophen_example1 |>
  addDrugUse(ingredientConceptId = 1125315)
#> # Source: table<og_098_1717442372> [?? x 13]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> cohort_definition_id subject_id cohort_start_date cohort_end_date duration
#> <int> <int> <date> <date> <dbl>
#> 1 1 31 2000-06-11 2000-10-25 137
#> 2 1 183 2011-05-07 2013-07-10 796
#> 3 1 56 2022-01-27 2022-03-04 37
#> 4 1 66 2018-07-21 2018-10-11 83
#> 5 1 153 2022-03-16 2022-03-28 13
#> 6 1 117 1999-05-22 2009-03-01 3572
#> 7 1 91 2022-09-22 2022-10-10 19
#> 8 1 9 1989-02-04 1992-11-02 1368
#> 9 1 192 2017-07-10 2017-11-13 127
#> 10 1 200 2005-08-29 2005-08-31 3
#> # ℹ more rows
#> # ℹ 8 more variables: number_exposures <dbl>, cumulative_quantity <dbl>,
#> # initial_quantity <dbl>, impute_duration_percentage <dbl>,
#> # number_eras <dbl>, impute_daily_dose_percentage <dbl>,
#> # initial_daily_dose_milligram <dbl>, cumulative_dose_milligram <dbl>
The duration parameter is a boolean (TRUE/FALSE) that determines whether to include the duration-related columns:
duration: calculated as cohort_end_date - cohort_start_date + 1.
impute_duration_percentage: if a drug exposure record does not have a duration, or its duration falls outside the specified duration range, the duration will be imputed. This column records the number of records that have been imputed or that would have been imputed (if we choose not to impute the duration).
To set the imputation method for duration, use the imputeDuration input, which can take the values none (default), median, mode, or a numerical value. Define the durationRange parameter as a numeric vector of length two, where the first value is less than or equal to the second. If set to NULL, no restrictions are applied.
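The duration logic described above can be sketched in base R. This is only an illustration under stated assumptions: impute_duration() is a hypothetical helper (not a DrugUtilisation function), it covers only the none and median methods, and it reports the share of flagged records as a percentage, which is an assumption based on the column name.

```r
# Hypothetical helper for illustration; not part of DrugUtilisation.
impute_duration <- function(duration, duration_range = NULL, method = "median") {
  # Flag records that are missing or fall outside the allowed range
  needs_imputation <- is.na(duration)
  if (!is.null(duration_range)) {
    outside <- !is.na(duration) &
      (duration < duration_range[1] | duration > duration_range[2])
    needs_imputation <- needs_imputation | outside
  }
  imputed <- duration
  if (method == "median") {
    # Replace flagged records with the median of the valid ones
    imputed[needs_imputation] <- median(duration[!needs_imputation])
  } else if (method != "none") {
    stop("unsupported method in this sketch: ", method)
  }
  list(
    duration = imputed,
    # Share of records imputed (or that would have been imputed)
    impute_percentage = 100 * mean(needs_imputation)
  )
}

# Duration of a single record: end date - start date + 1
single_duration <- as.integer(as.Date("2022-03-04") - as.Date("2022-01-27")) + 1L

# One missing value and one value outside the range c(1, 365)
res <- impute_duration(c(10, NA, 400, 30), duration_range = c(1, 365))
res$duration
res$impute_percentage
```

With method = "none" the flagged records are left untouched, but the percentage still reports how many would have been imputed.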
The quantity parameter, another boolean (TRUE/FALSE), controls the inclusion of quantity-related columns. If set to TRUE (default), the following columns are added:
cumulative_quantity: cumulative sum of the column quantity of the drug_exposure table during the drug exposure period.
initial_quantity: quantity at drug_exposure_start_date.
The dose parameter, also a boolean (TRUE/FALSE), governs the addition of daily dose-related columns. When set to TRUE, the following columns are added:
initial_daily_dose_milligram: dose at drug_exposure_start_date.
cumulative_dose_milligram: cumulative sum of the column dose of the drug_exposure table during the drug exposure period.
impute_daily_dose_percentage: if the daily dose is missing, or falls outside the imputation range, records will be imputed. This column shows the number of records that have been imputed or that would have been imputed (if we choose not to impute the daily dose).
Similar to duration imputation, use the imputeDose parameter to set the method for imputing daily dose, with options none (default), median, mean, and mode. Define the imputation range with the dailyDoseRange parameter, a numeric vector of length two, where the first value is less than or equal to the second. If set to NULL, no restrictions are applied.
These parameters offer flexibility in customizing the drug usage details added to the cohort. See the next example, where we use the cohort created at the beginning of this vignette, acetaminophen_example1.
addDrugUse(
cohort = cdm[["acetaminophen_example1"]],
cdm = cdm,
ingredientConceptId = 1125315,
duration = TRUE,
quantity = TRUE,
dose = TRUE
)
#> # Source: table<og_103_1717442377> [?? x 13]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> cohort_definition_id subject_id cohort_start_date cohort_end_date duration
#> <int> <int> <date> <date> <dbl>
#> 1 1 31 2000-06-11 2000-10-25 137
#> 2 1 183 2011-05-07 2013-07-10 796
#> 3 1 56 2022-01-27 2022-03-04 37
#> 4 1 66 2018-07-21 2018-10-11 83
#> 5 1 153 2022-03-16 2022-03-28 13
#> 6 1 117 1999-05-22 2009-03-01 3572
#> 7 1 91 2022-09-22 2022-10-10 19
#> 8 1 9 1989-02-04 1992-11-02 1368
#> 9 1 192 2017-07-10 2017-11-13 127
#> 10 1 200 2005-08-29 2005-08-31 3
#> # ℹ more rows
#> # ℹ 8 more variables: number_exposures <dbl>, cumulative_quantity <dbl>,
#> # initial_quantity <dbl>, impute_duration_percentage <dbl>,
#> # number_eras <dbl>, impute_daily_dose_percentage <dbl>,
#> # initial_daily_dose_milligram <dbl>, cumulative_dose_milligram <dbl>
If all these parameters are set to FALSE, only number_exposures and number_eras will be added.
The way continuous exposures are joined can be configured using different parameters. Let’s have a look at all the options we have.
The gapEra parameter sets the maximum number of days between two continuous exposures for them to be considered part of the same era. If the next exposure’s start date minus the previous exposure’s end date is less than or equal to the specified gapEra, the two exposures will be joined.
Let’s see an illustrative example.
First, let’s create a cohort with gapEra = 0. For a better understanding, we will observe only subject number 56.
cdm <- generateDrugUtilisationCohortSet(
  cdm = cdm,
  name = "acetaminophen_example2",
  conceptSet = conceptList,
  gapEra = 0
)
cdm$drug_exposure %>%
  filter(drug_concept_id %in% !!conceptList$acetaminophen) %>%
  filter(person_id == 56)
#> # Source: SQL [2 x 7]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> drug_exposure_id person_id drug_concept_id drug_exposure_start_date
#> <int> <int> <dbl> <date>
#> 1 166 56 1125360 2022-01-27
#> 2 164 56 43135274 2021-08-08
#> # ℹ 3 more variables: drug_exposure_end_date <date>,
#> # drug_type_concept_id <dbl>, quantity <dbl>
This subject has two different drug exposure periods separated by less than 6 months. Because the cohort was created with gapEra = 0, they remain two different cohort periods:
cdm[["acetaminophen_example2"]] %>%
  addDrugUse(
    ingredientConceptId = 1125315,
    gapEra = 0
  ) %>%
  filter(subject_id == 56)
#> # Source: SQL [2 x 13]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> cohort_definition_id subject_id cohort_start_date cohort_end_date duration
#> <int> <int> <date> <date> <dbl>
#> 1 1 56 2022-01-27 2022-03-04 37
#> 2 1 56 2021-08-08 2021-09-17 41
#> # ℹ 8 more variables: number_exposures <dbl>, cumulative_quantity <dbl>,
#> # initial_quantity <dbl>, impute_duration_percentage <dbl>,
#> # number_eras <dbl>, impute_daily_dose_percentage <dbl>,
#> # initial_daily_dose_milligram <dbl>, cumulative_dose_milligram <dbl>
Now, we merge these two periods by modifying the gapEra input when creating the cohort. For a better understanding of the gapEra argument and its functionality, please see the "Use DrugUtilisation to create a cohort" vignette.
cdm <- generateDrugUtilisationCohortSet(
  cdm = cdm,
  name = "acetaminophen_example3",
  conceptSet = conceptList,
  gapEra = 180
)
cdm$acetaminophen_example3 %>%
  addDrugUse(
    ingredientConceptId = 1125315,
    gapEra = 180,
    duration = TRUE,
    quantity = FALSE,
    dose = FALSE
  ) %>%
  filter(subject_id == 56)
#> # Source: SQL [1 x 8]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> cohort_definition_id subject_id cohort_start_date cohort_end_date duration
#> <int> <int> <date> <date> <dbl>
#> 1 1 56 2021-08-08 2022-03-04 209
#> # ℹ 3 more variables: number_exposures <dbl>, impute_duration_percentage <dbl>,
#> # number_eras <dbl>
We now have only one record, with two exposures, for subject number 56. Note that the number of eras is still 1, as we have used the same gapEra as when the cohort was created. However, it is possible to specify a different gapEra from the one defined when the cohort was created.
cdm$acetaminophen_example3 %>%
  addDrugUse(
    ingredientConceptId = 1125315,
    gapEra = 0
  ) %>%
  filter(subject_id == 56)
#> # Source: SQL [1 x 13]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> cohort_definition_id subject_id cohort_start_date cohort_end_date duration
#> <int> <int> <date> <date> <dbl>
#> 1 1 56 2021-08-08 2022-03-04 209
#> # ℹ 8 more variables: number_exposures <dbl>, cumulative_quantity <dbl>,
#> # initial_quantity <dbl>, impute_duration_percentage <dbl>,
#> # number_eras <dbl>, impute_daily_dose_percentage <dbl>,
#> # initial_daily_dose_milligram <dbl>, cumulative_dose_milligram <dbl>
Notice that number_eras now indicates that we have two eras within the same record.
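The joining rule can be reproduced outside the package with a few lines of base R. The sketch below is illustrative only: collapse_eras() is a hypothetical helper, not a DrugUtilisation function, and it assumes the rule as described in this vignette (an exposure is joined to the current era when its start date minus the era's end date is at most gapEra). The dates are the two periods shown above for subject 56.

```r
# Hypothetical helper for illustration; not part of DrugUtilisation.
collapse_eras <- function(start, end, gap_era) {
  # Process exposures in chronological order
  ord <- order(start)
  start <- start[ord]
  end <- end[ord]
  era_id <- rep(1L, length(start))
  current_end <- end[1]
  for (i in seq_along(start)[-1]) {
    gap <- as.integer(start[i] - current_end)
    if (gap <= gap_era) {
      # Close enough: extend the current era
      era_id[i] <- era_id[i - 1]
      current_end <- max(current_end, end[i])
    } else {
      # Gap too large: open a new era
      era_id[i] <- era_id[i - 1] + 1L
      current_end <- end[i]
    }
  }
  era_start <- unname(do.call(c, lapply(split(start, era_id), min)))
  era_end <- unname(do.call(c, lapply(split(end, era_id), max)))
  data.frame(
    era_start = era_start,
    era_end = era_end,
    duration = as.integer(era_end - era_start) + 1L,
    number_exposures = as.integer(table(era_id))
  )
}

start <- as.Date(c("2022-01-27", "2021-08-08"))
end <- as.Date(c("2022-03-04", "2021-09-17"))

eras_gap0 <- collapse_eras(start, end, gap_era = 0)     # two separate eras
eras_gap180 <- collapse_eras(start, end, gap_era = 180) # one merged era
eras_gap0
eras_gap180
```

The two periods are 132 days apart, so they stay separate with a gap of 0 and merge into a single 209-day period with a gap of 180, in line with the durations reported above.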
The eraJoinMode parameter defines how two different continuous exposures are joined in an era. There are four options:
eraJoinMode = "zero" (default option): Exposures are joined considering that the period between both continuous exposures means the subject is treated with a daily dose of zero. The time between both exposures contributes to the total exposed time.
eraJoinMode = "join": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with a daily dose of zero. The time between both exposures does not contribute to the total exposed time.
eraJoinMode = "previous": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with the daily dose of the previous subexposure. The time between both exposures contributes to the total exposed time.
eraJoinMode = "subsequent": Exposures are joined, considering that the period between both continuous exposures means the subject is treated with the daily dose of the subsequent subexposure. The time between both exposures contributes to the total exposed time.
overlapMode parameter
This parameter defines how the overlap between two exposures that do not start on the same day is resolved inside a subexposure. There are five possible options:
overlapMode = "sum" (default): The considered daily dose is the sum of all the exposures present in the subexposure.
overlapMode = "minimum": The considered daily dose is the minimum of all the exposures in the subexposure.
overlapMode = "maximum": The considered daily dose is the maximum of all the exposures in the subexposure.
overlapMode = "previous": The considered daily dose is that of the earliest exposure.
overlapMode = "subsequent": The considered daily dose is that of the latest exposure.
The sameIndexMode parameter works similarly to overlapMode, but it controls how overlaps between two exposures starting on the same date are resolved. It accepts the options sum (default), minimum, and maximum, as described for overlapMode.
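To make the overlap options concrete, here is a minimal base R sketch. resolve_overlap() is a hypothetical helper written for this illustration (it is not a DrugUtilisation function), and it assumes the daily doses of the exposures active in one subexposure are supplied ordered by exposure start date, earliest first.

```r
# Hypothetical helper for illustration; not part of DrugUtilisation.
# `daily_dose`: doses of the exposures present in one subexposure,
# ordered by exposure start date (earliest first).
resolve_overlap <- function(daily_dose, overlap_mode = "sum") {
  switch(
    overlap_mode,
    sum = sum(daily_dose),
    minimum = min(daily_dose),
    maximum = max(daily_dose),
    previous = daily_dose[1],                    # earliest exposure
    subsequent = daily_dose[length(daily_dose)], # latest exposure
    stop("unknown overlap_mode: ", overlap_mode)
  )
}

doses <- c(500, 1000, 250)
modes <- c("sum", "minimum", "maximum", "previous", "subsequent")
resolved <- sapply(modes, resolve_overlap, daily_dose = doses)
resolved
```

For exposures that start on the same date there is no earliest or latest one, which is consistent with only sum, minimum, and maximum being offered in that case.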
For instance, the following call sets a maximum gap of 30 days for exposures to be joined. It uses the daily dose of the previous subexposure when joining exposures, employs the minimum daily dose for exposures starting on the same day, and considers the minimum daily dose for exposures that overlap.
cdm[["acetaminophen_example1"]] %>%
  addDrugUse(ingredientConceptId = 1125315,
             gapEra = 30,
             eraJoinMode = "previous",
             overlapMode = "minimum",
             sameIndexMode = "minimum")
#> # Source: table<og_122_1717442402> [?? x 13]
#> # Database: DuckDB v0.10.0 [martics@Windows 10 x64:R 4.2.3/:memory:]
#> cohort_definition_id subject_id cohort_start_date cohort_end_date duration
#> <int> <int> <date> <date> <dbl>
#> 1 1 31 2000-06-11 2000-10-25 137
#> 2 1 183 2011-05-07 2013-07-10 796
#> 3 1 56 2022-01-27 2022-03-04 37
#> 4 1 66 2018-07-21 2018-10-11 83
#> 5 1 153 2022-03-16 2022-03-28 13
#> 6 1 117 1999-05-22 2009-03-01 3572
#> 7 1 91 2022-09-22 2022-10-10 19
#> 8 1 9 1989-02-04 1992-11-02 1368
#> 9 1 192 2017-07-10 2017-11-13 127
#> 10 1 200 2005-08-29 2005-08-31 3
#> # ℹ more rows
#> # ℹ 8 more variables: number_exposures <dbl>, cumulative_quantity <dbl>,
#> # initial_quantity <dbl>, impute_duration_percentage <dbl>,
#> # number_eras <dbl>, impute_daily_dose_percentage <dbl>,
#> # initial_daily_dose_milligram <dbl>, cumulative_dose_milligram <dbl>
The summariseDrugUse() function creates a tibble summarising the dose table across multiple cohorts. See an example below:
cdm[["acetaminophen_example1"]] <- cdm[["acetaminophen_example1"]] %>%
  addDrugUse(
    cdm = cdm,
    ingredientConceptId = 1125315
  )
summariseDrugUse(cdm[["acetaminophen_example1"]])
#> # A tibble: 101 × 13
#> result_id cdm_name group_name group_level strata_name strata_level
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 DUS MOCK cohort_name acetaminophen overall overall
#> 2 1 DUS MOCK cohort_name acetaminophen overall overall
#> 3 1 DUS MOCK cohort_name acetaminophen overall overall
#> 4 1 DUS MOCK cohort_name acetaminophen overall overall
#> 5 1 DUS MOCK cohort_name acetaminophen overall overall
#> 6 1 DUS MOCK cohort_name acetaminophen overall overall
#> 7 1 DUS MOCK cohort_name acetaminophen overall overall
#> 8 1 DUS MOCK cohort_name acetaminophen overall overall
#> 9 1 DUS MOCK cohort_name acetaminophen overall overall
#> 10 1 DUS MOCK cohort_name acetaminophen overall overall
#> # ℹ 91 more rows
#> # ℹ 7 more variables: variable_name <chr>, variable_level <chr>,
#> # estimate_name <chr>, estimate_type <chr>, estimate_value <chr>,
#> # additional_name <chr>, additional_level <chr>
We can also stratify our cohort and calculate the estimates within each stratum by using the strata parameter.
cdm[["acetaminophen_example1"]] <- cdm[["acetaminophen_example1"]] %>%
  addSex() # Function from PatientProfiles
summariseDrugUse(cdm[["acetaminophen_example1"]],
strata = list("sex" = "sex"))
#> # A tibble: 303 × 13
#> result_id cdm_name group_name group_level strata_name strata_level
#> <int> <chr> <chr> <chr> <chr> <chr>
#> 1 1 DUS MOCK cohort_name acetaminophen overall overall
#> 2 1 DUS MOCK cohort_name acetaminophen overall overall
#> 3 1 DUS MOCK cohort_name acetaminophen overall overall
#> 4 1 DUS MOCK cohort_name acetaminophen overall overall
#> 5 1 DUS MOCK cohort_name acetaminophen overall overall
#> 6 1 DUS MOCK cohort_name acetaminophen overall overall
#> 7 1 DUS MOCK cohort_name acetaminophen overall overall
#> 8 1 DUS MOCK cohort_name acetaminophen overall overall
#> 9 1 DUS MOCK cohort_name acetaminophen overall overall
#> 10 1 DUS MOCK cohort_name acetaminophen overall overall
#> # ℹ 293 more rows
#> # ℹ 7 more variables: variable_name <chr>, variable_level <chr>,
#> # estimate_name <chr>, estimate_type <chr>, estimate_value <chr>,
#> # additional_name <chr>, additional_level <chr>
Customize the estimates to be calculated by using the drugEstimates parameter. By default, it will compute the minimum value, quantiles (5%, 25%, 50% - median, 75% and 95%), the maximum value, the mean, the standard deviation, and the number of missing values for each column added with addDrugUse().
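For reference, the same set of estimates can be computed on a plain numeric vector with base R. This is only a sketch: summarise_column() is a hypothetical helper, the estimate names are borrowed from the dailyDoseCoverage() output shown earlier, and the quantile method used by the package may differ from the default of quantile().

```r
# Hypothetical helper for illustration; not part of DrugUtilisation.
summarise_column <- function(x) {
  q <- quantile(
    x,
    probs = c(0.05, 0.25, 0.5, 0.75, 0.95),
    na.rm = TRUE,
    names = FALSE
  )
  c(
    count_missing = sum(is.na(x)),
    min = min(x, na.rm = TRUE),
    q05 = q[1], q25 = q[2], median = q[3], q75 = q[4], q95 = q[5],
    max = max(x, na.rm = TRUE),
    mean = mean(x, na.rm = TRUE),
    sd = sd(x, na.rm = TRUE)
  )
}

# Example values taken from the duration column above, plus one missing value
duration <- c(137, 796, 37, 83, 13, NA)
estimates <- summarise_column(duration)
estimates
```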
Specify the minimum number of individuals that a strata group must have in order to appear in the table.