%\VignetteIndexEntry{Time dependent covariates in Lexis objects: addCov & addDrug}
\SweaveOpts{results=verbatim,keep.source=TRUE,include=FALSE,eps=FALSE}
\documentclass[a4paper,twoside,12pt]{report}

\newcommand{\Title}{Time dependent covariates in\\ \texttt{Lexis} objects}
\newcommand{\Tit}{addLex}
\newcommand{\Version}{Version 6}
\newcommand{\Dates}{March 2024}
\newcommand{\Where}{SDCC}
\newcommand{\Homepage}{\url{http://bendixcarstensen.com/Epi}}
\newcommand{\Faculty}{\begin{tabular}{rl}
Bendix Carstensen
  & Steno Diabetes Center Copenhagen, Herlev, Denmark\\
  & {\small \& Department of Biostatistics,
               University of Copenhagen} \\
  & \texttt{b@bxc.dk}\\
  & \url{http://BendixCarstensen.com} \\[1em]
                      \end{tabular}}
\input{topreport}
<<echo=FALSE>>=
options(width = 90,
        SweaveHooks = list(fig = function()
        par(mar = c(3,3,1,1),
            mgp = c(3,1,0) / 1.6,
            las = 1,
           lend = "butt",
            bty = "n")))
library(Epi)
library(popEpi)
library(dplyr)
library(tidyr)
@ %
\renewcommand{\rwpre}{./addLexis}

<<echo = FALSE>>=
anfang <- Sys.time()
@ %

\chapter{Overview and rationale}

This note describes the functions \texttt{addCov.Lexis}, designed to
add values of clinical measurements, and \texttt{addDrug.Lexis}
designed to add drug exposure to time-split \texttt{Lexis} objects.

If time-dependent variables are binary, such as for example
``occurrence of CVD diagnosis'' it may be relevant to define a new
state as, say, \texttt{CVD}; this is the business of the funtions
\texttt{cutLexis} and its cousins. The purposes of the two functions
discussed here are to append quantitative variables that in principle
can take any value.

Both functions are so-called \texttt{S3} methods for \texttt{Lexis}
objects, so in code you can omit the ``\texttt{.Lexis}''.  Note that
neither \texttt{cutLexis}, \texttt{splitLexis} or \texttt{splitMulti}
are \texttt{S3} methods (there is no ``\texttt{.}'' in the names).

\section{\texttt{addCov.Lexis}}

\ldots provides the ability to amend a \texttt{Lexis} object with
clinical measurements taken at different times, and propagate the
values as LOCF (Last Observation Carried Forward) to all subsequent
records. This means that time-splitting of a \texttt{Lexis} object
\emph{after} adding clinical measurements will be meaningful, because
both \texttt{splitLexis} and \texttt{splitMulti} will carry variables
forward across the split records. The follow-up in the resulting
\texttt{Lexis} object will be cut at dates of clinical measurement.

\texttt{addCov.Lexis} will also propagate missing values supplied as
measurements. Therefore, if you want to have LOCF \emph{across}
supplied times of measurement you must explicitly apply
\texttt{tidyr::fill} to the resulting \texttt{Lexis} object, after a
suitable \texttt{group\_by}.

\section{\texttt{addDrug.Lexis}}

As opposed to this, \texttt{addDrug.Lexis} will first use drug
information at each date of recorded drug purchase, and subsequently
\texttt{compute} cumulative exposure measures at the times in the
resulting \texttt{Lexis} object. This is essentially by linear
interpolation, so it will not be meaningful to further split an
object resulting from \texttt{addDrug.Lexis}---LOCF is not meaningful
for continuously time-varying covariates such as cumulative exposure.

If persons have very frequent drug purchases, the intervals may become
very small and the sheer number of records may present an impediment
to analysis. Therefore the function \texttt{coarse.Lexis} is provided
to collapse adjacent follow-up records. Note that
\texttt{coarse.Lexis} is \emph{not} an \texttt{S3} method.

\chapter{\texttt{addCov.Lexis}}

\section{Rationale}
The function has arisen out of a need to attach values measured at
clinical visits to a Lexis object representing follow-up for events
constituting a multistate model. Hence the data frame with
measurements at clinical visits will be called \texttt{clin} for
mnemonic reasons.

\section{Example}
For illustration we devise a small bogus cohort of 3 people, where we
convert the character dates into numerical variables (fractional
years) using \texttt{cal.yr}. Note that we are using a character
variable as \texttt{id}:
<<>>=
xcoh <- structure(list(id = c("A", "B", "C"),
                    birth = c("1952-07-14", "1954-04-01", "1987-06-10"),
                    entry = c("1965-08-04", "1972-09-08", "1991-12-23"),
                     exit = c("1997-06-27", "1995-05-23", "1998-07-24"),
                     fail = c(1, 0, 1) ),
                   .Names = c("id", "birth", "entry", "exit", "fail"),
                row.names = c("1", "2", "3"),
                    class = "data.frame" )
xcoh$dob <- cal.yr(xcoh$birth)
xcoh$doe <- cal.yr(xcoh$entry)
xcoh$dox <- cal.yr(xcoh$exit )
xcoh
@ %

\subsection{A \texttt{Lexis} object}

Define this as a \texttt{Lexis} object with timescales calendar time
(\texttt{per}, period) and age (\texttt{age}):
<<>>=
Lcoh <- Lexis(entry = list(per = doe),
               exit = list(per = dox,
                           age = dox - dob),
                 id = id,
        exit.status = factor(fail, 0:1, c("Alive","Dead")),
               data = xcoh)
str(Lcoh)
(Lx <- Lcoh[,1:6])
@ %

\subsubsection{Factor or character \texttt{lex.id}?}

Note that when the \texttt{id} argument to \texttt{Lexis} is a
character variable then the \texttt{lex.id} will be a factor. Which,
if each person has a lot of records may save time, but if you subset
may be a waste of space. Moreover merging (\ie joining in the language
of \texttt{tidyverse}) may present problems with different
levels. \texttt{merge} from the \texttt{base} \R, will coerce to
factor with union of levels as levels, where as the \texttt{\_join}
functions from \texttt{dplyr} will coerce to character.

Thus the most reasonable strategy thus seems to keep \texttt{lex.id}
as a character variable.
<<>>=
Lx$lex.id <- as.character(Lx$lex.id)
str(Lx)
Lx
@ %

\subsubsection{Clinical measurements}

Then we generate data frame with clinical examination data, that is
date of examination in \texttt{per}, some (bogus) clinical measurements
and also names of the examination rounds:
<<>>=
clin <- data.frame(lex.id = c("A", "A", "C", "B", "C"),
                      per = cal.yr(c("1977-3-17",
                                     "1973-7-29",
                                     "1996-3-1",
                                     "1990-7-14",
                                     "1989-1-31")),
                       bp = c(120, 140, 160, 157, 145),
                     chol = c(NA, 5, 8, 9, 6),
                     xnam = c("X2", "X1", "X1", "X2", "X0"),
         stringsAsFactors = FALSE)
str(clin)
clin
@ %
We set up this data frame with an \texttt{id} variable called
\texttt{lex.id} and a date of examination, \texttt{per}, that has the
same name as one of the time scales in the Lexis object \texttt{Lx}.
Note that we have chosen a measurement for person \texttt{C} from
1989---before the person's entry to the study, and have an \texttt{NA}
for \texttt{chol} for person \texttt{A}.

\subsection{Adding clinical data}

There is a slightly different behaviour according to whether the
variable with the name of the examination is given or not, and whether
the name of the (incomplete) time scale is given or not:
<<>>=
(Cx <- addCov.Lexis(Lx, clin))
@ %
Note that the clinical measurement preceding the entry of person
\textsc{C} is included, and that the \texttt{tfc} (time from clinical
measurement) is correctly rendered, we a non-zero value at date of entry.

We also see that a variable \texttt{exnam} is constructed with
consecutive numbering of examinations within each person, while the
variable \texttt{xnam} is just carried over as any other.

If we explicitly give the name of the variable holding the examination
names we do not get a constructed \texttt{exnam}. We can also
define the name of the (incomplete) timescale to hold the time
since measurement, in this case as \texttt{tfCl}:
<<>>=
(Dx <- addCov.Lexis(Lx, clin, exnam = "xnam", tfc = "tfCl"))
summary(Dx, t=T)
@ %

\section{Exchanging split and add}

As noted in the beginning of this note, \texttt{addCov.Lexis} uses LOCF,
and so it is commutative with \texttt{splitLexis}:
<<>>=
# split BEFORE add
Lb <- addCov.Lexis(splitLexis(Lx,
                      time.scale = "age",
                          breaks = seq(0, 80, 5)),
                   clin,
                   exnam = "xnam" )
Lb
#
# split AFTER add
La <- splitLexis(addCov.Lexis(Lx,
                            clin,
                           exnam = "xnam" ),
                 time.scale = "age",
                     breaks = seq(0, 80, 5))
La
@ %
We see that the results are identical, bar the sequence of variables
and attributes.

We can more explicitly verify that the resulting data frames are the same:
<<>>=
La$tfc == Lb$tfc
La$age == Lb$age
La$per == Lb$per
@ %
The same goes for \texttt{splitMulti}:
<<>>=
## split BEFORE add
Mb <- addCov.Lexis(splitMulti(Lx, age = seq(0, 80, 5)),
                   clin,
                   exnam = "xnam" )
##
## split AFTER add
Ma <- splitMulti(addCov.Lexis(Lx,
                              clin,
                              exnam = "xnam" ),
                 age = seq(0, 80, 5))
La$tfc == Mb$tfc
Ma$tfc == Mb$tfc
@ %
In summary, because both \texttt{addCov.Lexis} and
\texttt{splitLexis}/\texttt{splitMulti} use LOCF for covariates the
order of splitting and adding does not matter.

This is certainly not the case with \textrm{addDrug.Lexis} as we shall see.

\section{Filling the \texttt{NA}s}

As mentioned in the beginning, clinical measurements given as
\texttt{NA} in the \texttt{clin} data frame are carried forward. If
you want to have these replaced by 'older' clinical measurements you
can do that explicitly by \texttt{dplyr::fill} with a construction
like:
<<>>=
cov <- c("bp", "chol")
Lx <- La
Lx <- group_by(Lx, lex.id) %>%
         fill(all_of(cov)) %>%
      ungroup()
class(Lx)
@ %
We see that the \texttt{Lexis} attributes are lost by using the
\texttt{group\_by} function, so we fish out the covariates from the
\texttt{tibble} and stick them back into the \texttt{Lexis} object:
<<>>=
Lx <- La
Lx[,cov] <- as.data.frame(group_by(Lx, lex.id)
                          %>% fill(all_of(cov)))[,cov]
class(Lx)
La
Lx
@ %
The slightly convoluted code where the covariate columns are
explicitly selected, owes to the fact that the \texttt{dplyr}
functions will strip the data frames of the \texttt{Lexis}
attributes. So we needed to use \texttt{fill} to just generate the
covariates and not touch the \texttt{Lexis} object itself.

This should of course be built into \texttt{addCov.Lexis} as a
separate argument, but is not yet.

Note that the \texttt{tfc}, time from clinical measurement, is now not
a valid time scale variable any more; the 5 in \texttt{chol} is
measured at 1973.7 but \texttt{tfc} is reset to 0 at 1977.21, even if
only \texttt{bp} but not \texttt{chol} is measured at that time.  If
you want that remedied you will have to use \texttt{addCov.Lexis}
twice, one with a \texttt{clin} data frame with only \texttt{bp} and
another with a data frame with only \texttt{chol}, each generating a
differently named variable holding the time from clinical measurement.

This is a problem that comes from the structure of the supplied
\emph{data} not from the program features; in the example we had
basically measurements of different clinical variables at different
times, and so necessarily also a need for different times since last
measurement.

\chapter{\texttt{addDrug.Lexis}}

The general purpose of the function is to amend a \texttt{Lexis}
object with drug exposure data. The data base with information on a
specific drug is assumed to be a data frame with one entry per drug
purchase (or prescription), containing the date and the amount
purchased and optionally the prescribed dosage (that is how much is
supposed to be taken per time). We assume that we have such a data
base for each drug of interest, which also includes an id variable,
\texttt{lex.id}, that matches the \texttt{lex.id} variable in the
\texttt{Lexis} object.

For each type of drug the function derives 4 variables:
\begin{description}[noitemsep]
 \item[\hspace*{1em}\texttt{ex}]: logical, is the person currently
   \texttt{ex}posed
 \item[\hspace*{1em}\texttt{tf}]: numeric, \texttt{t}ime since
   \texttt{f}irst purchase
 \item[\hspace*{1em}\texttt{ct}]: numeric, \texttt{c}umulative
   \texttt{t}ime on the drug
 \item[\hspace*{1em}\texttt{cd}]: numeric, \texttt{c}umulative
   \texttt{d}ose of the drug
\end{description}
These names are pre- or suf-fixed by the drug name, so that exposures
to different drugs can be distinguished; see the examples.

The resulting \texttt{Lexis} object has extra records corresponding to
cuts at each drug purchase and at each expiry date of a purchase.  For
each purchase the coverage period is derived (different methods for
this are available), and if the end of this (the expiry date) is
earlier than the next purchase of the person, the person is considered
off the drug from the expiry date, and a cut in the follow-up is
generated with \texttt{ex} set to \texttt{FALSE}.

\section{The help example}

The following is a slight modification of the code from the example
section of the help page for \texttt{addDrug.Lexis}

First we generate follow-up of 2 persons, and split the follow-up in
intervals of length 0.6 years along the calendar time scale,
\texttt{per}:
<<>>=
fu <- data.frame(doe = c(2006, 2008),
                 dox = c(2015, 2018),
                 dob = c(1950, 1951),
                 xst = factor(c("A","D")))
Lx <- Lexis(entry = list(per = doe,
                         age = doe- dob),
             exit = list(per = dox),
      exit.status = xst,
             data = fu)
Lx <- subset(Lx, select = -c(doe, dob, dox, xst))
Sx <- splitLexis(Lx, "per", breaks = seq(1990, 2020, 0.6))
summary(Sx)
str(Sx)
@ %
Note that as opposed to the previous example, the time scales are not
of class \texttt{cal.yr}, they are just numerical.

Then we generate example drug purchases for these two persons, one
data frame for each of the drugs \texttt{F} and \texttt{G}. Note that
we generate \texttt{lex.id}$\in (1,2)$ referring to the values of
\texttt{lex.id} in the lexis object \texttt{Sx}.
<<>>=
set.seed(1952)
rf <- data.frame(per = c(2005 + runif(12, 0, 10)),
                 amt = sample(2:4, 12, replace = TRUE),
              lex.id = sample(1:2, 12, replace = TRUE)) %>%
      arrange(lex.id, per)

rg <- data.frame(per = c(2009 + runif(10, 0, 10)),
                 amt = sample(round(2:4/3,1), 10, replace = TRUE),
              lex.id = sample(1:2, 10, replace = TRUE)) %>%
      arrange(lex.id, per)
@ %
We do not need to sort the drug purchase data frames (it is done
internally by \texttt{addDrug.Lexis}), but it makes it easier to grasp
the structure. Note that we generated the drug purchase files with the
required variable names.

The way purchase data is supplied to the function is in a
\texttt{list} where each element is a data frame of purchase records
for one type of drug. The list must be named, because the names are
used as prefixes of the generated exposure variables. We can show the
resulting data in a list:
<<>>=
pdat <- list(F = rf, G = rg)
pdat
Lx
@ %
Note that we have generated data so that there are drug purchases of
drug \texttt{F} that is \emph{before} start of follow-up for person 2.

We can then expand the time-split \texttt{Lexis} object, \texttt{Sx}
with the drug information. \texttt{addDrug.Lexis} not only adds 8
\emph{variables} (4 from each drug), it also adds \emph{records}
representing cuts at the purchase dates and possible expiry dates.
<<>>=
summary(Sx) ; names(Sx)
ex1 <- addDrug.Lexis(Sx, pdat, method = "ext") # default
summary(ex1) ; names(ex1)
print(ex1, nd = 2)
ex2 <- addDrug.Lexis(Sx, pdat, method = "ext", grace = 0.5)
summary(ex2)
print(ex2, nd = 2)
dos <- addDrug.Lexis(Sx, pdat, method = "dos", dpt = 6)
summary(dos)
print(dos, nd = 2)
fix <- addDrug.Lexis(Sx, pdat, method = "fix", maxt = 1)
summary(fix)
print(fix, nd = 2)
@ %

\section{A more realistic example with run times}

\subsection{Follow-up data: \texttt{DMlate}}

As example data we use rows from the \texttt{DMlate} example data from
the \texttt{Epi} package:
<<>>=
data(DMlate) ; str(DMlate)
Lx <- Lexis(entry = list(per = dodm,
                         age = dodm - dobth,
                         tfd = 0),
             exit = list(per = dox),
      exit.status = factor(!is.na(dodth),
                           labels = c("DM", "Dead")),
             data = DMlate[sample(1:nrow(DMlate), 1000),])
summary(Lx)
@ %
We split the data along the age-scale (omitting the variables we shall
not need):
<<>>=
Sx <- splitLexis(Lx[,1:7], time.scale="age", breaks = 0:120)
summary(Sx)
str(Sx)
@ %

\subsection{Artificial prescription data}

To explore how \texttt{addDrug.Lexis} works, we need drug exposure
data, but these are unfortunately not available, so we simulate three
datasets representing \texttt{pur}chases of three types of drugs:
<<>>=
set.seed(1952)

purA <-
  ( data.frame(lex.id = rep(Lx$lex.id,
                            round(runif(nrow(Lx), 0, 20))))
%>% left_join(Lx[,c("lex.id", "dodm", "dox")])
%>% mutate(per = dodm + runif(length(dodm), -0.1, 0.99) * (dox - dodm),
           amt = sample(4:20*10, length(dodm), replace = TRUE),
           dpt = amt * round(runif(length(dodm), 3, 7)))
%>% select(-dodm, -dox)
%>% arrange(lex.id, per)
  )
addmargins(table(table(purA$lex.id)))
str(purA)

purB <-
  ( data.frame(lex.id = rep(Lx$lex.id,
                            round(pmax(runif(nrow(Lx), -10, 15), 0))))
%>% left_join(Lx[,c("lex.id", "dodm", "dox")])
%>% mutate(per = dodm + runif(length(dodm), -0.1, 0.99) * (dox - dodm),
           amt = sample(4:20*10, length(dodm), replace = TRUE),
           dpt = amt * round(runif(length(dodm), 5, 9)))
%>% select(-dodm, -dox)
%>% arrange(lex.id, per)
  ) -> purB
addmargins(table(table(purB$lex.id)))
str(purB)

purC <-
  ( data.frame(lex.id = rep(Lx$lex.id,
                            round(pmax(runif(nrow(Lx), -5, 12), 0))))
%>% left_join(Lx[,c("lex.id", "dodm", "dox")])
%>% mutate(per = dodm + runif(length(dodm), -0.1, 0.99) * (dox - dodm),
           amt = sample(4:20*10, length(dodm), replace = TRUE),
           dpt = amt * round(runif(length(dodm), 5, 7)))
%>% select(-dodm, -dox)
%>% arrange(lex.id, per)
  )
addmargins(table(table(purC$lex.id)))
str(purC)
head(purC)
@ %
Note that the time scale is in years, so the \texttt{dpt} must be in
amount per year, so that $\mathtt{dpt}/\mathtt{amt}$ is the
approximate number of annual drug purchases.

We now have three artificial drug purchase datasets so we can see how
\texttt{addDrug.Lexis} performs on larger datasets:

\subsection{Using \texttt{addDrug}}

\subsubsection{100 and 500 persons}

We start out with a small sample and a three month grace period to limit
the number of gaps:
<<>>=
Sx1 <- subset(Sx, lex.id < 100)
pur <- list(A = subset(purA, lex.id < 1000),
            B = subset(purB, lex.id < 1000),
            C = subset(purC, lex.id < 1000))
system.time(ad1 <- addDrug.Lexis(Sx1, pur, tnam = "per", grace = 1/4))
summary(Sx1)
summary(ad1)
@ %
We then cut the number of persons in half to assess how run time
depends on no. of persons in the data:
<<>>=
Sx2 <- subset(Sx, lex.id < 500)
pur <- list(A = subset(purA, lex.id < 500),
            B = subset(purB, lex.id < 500),
            C = subset(purC, lex.id < 500))
system.time(ad2 <- addDrug.Lexis(Sx2, pur, tnam = "per", grace = 1/6))
summary(Sx2)
summary(ad2)
@ %
\ldots timing is broadly proportional to the number of persons.

\subsubsection{Fewer prescription records}

We can try to cut the number of purchases in half:
<<>>=
pur <- list(A = subset(purA, lex.id < 100 & runif(nrow(purA)) < 0.5),
            B = subset(purB, lex.id < 100 & runif(nrow(purB)) < 0.5),
            C = subset(purC, lex.id < 100 & runif(nrow(purC)) < 0.5))
sapply(pur, nrow)
system.time(ad3 <- addDrug.Lexis(Sx1, pur, tnam = "per", grace = 1/6))
summary(Sx1)
summary(ad3)
@ %
It also appears that the number of purchases per person is also a
determinant of the run time too; the timing is largely proportional to
the number of drug records.

In any concrete application it is recommended to run the function on
a fairly small sample of persons, say 1000 to get a feel for the run
time. It may also be a good idea to run the function on chunks of the
persons, to make sure that you do not lose all the processed data in a
crash.

\subsubsection{Fewer prescription types}

Finally we try to cut the number of drugs, to assess how this
influences the run time:
<<>>=
pur <- list(B = subset(purB, lex.id < 100),
            C = subset(purC, lex.id < 100))
sapply(pur, nrow)
system.time(ad4 <- addDrug.Lexis(Sx1, pur, tnam = "per", grace = 1/6))
summary(Sx1)
summary(ad4)
@ %
We see that the number of drugs also influence the run time
proportionally.

\subsection{Too many records ---\texttt{coarse.Lexis}}

If we look at the length of the intervals as given in \texttt{lex.dur}
we see that some are are quite small:
<<>>=
summary(ad1$lex.dur)
@ %
Half are smaller then 0.11 years, 40 days. We could without much loss
of precision in the analysis based on the \texttt{Lexis} object merge
adjacent records that have total risk time less than 3 months.

The function \texttt{coarse.Lexis} will collapse records with short
\texttt{lex.dur} with the subsequent record. The collapsing will use
the covariates from the first record, and so the entire follow-up from
the two records will have the characteristics of the first. Therefore
it is wise to choose first records with reasonably short
\texttt{lex.dur}---the approximation will be better than if the first
record was with a larger \texttt{lex.dur}. Therefore there are two
values supplied to \texttt{coarse.Lexis}; the maximal length of the
first record's \texttt{lex.dur} and the maximal length of the
\texttt{lex.dur} in the resulting combined record. The larger these
parameters are, the more the \texttt{Lexis} object is coarsened.
<<>>=
summary(ad1)
summary(adc <- coarse.Lexis(ad1, lim = c(1/6,1/2)))
summary(adc$lex.dur)
@ %
This could cut the number of units for analysis substantially, in this
case from about 27,000 to some 13,000.

\subsubsection{Records to be kept}

When we are dealing with drug exposure data we will be interested
keeping the record that holds the start of a drug exposure. Some may
argue that it does not matter much, though.

The records (\ie beginnings of FU intervals) that should be kept must
be given in logical vector in the argument \texttt{keep}:
<<>>=
summary(Sx2)
system.time(ad4 <- addDrug.Lexis(Sx2,
                                 pur,
                                tnam = "per",
                               grace = 1/6))
summary(ad4)
#
ad5 <- coarse.Lexis(ad4,
                    lim = c(1/4, 1/2))
summary(ad5)
@
We can identify the first date of exposure to drug \texttt{B}, say, by
the exposure (\texttt{B.ex}) being true and the cumulative time on the
drug (\texttt{B.ct}) being 0:
<<>>=
ad4$keep <- with(ad4, (B.ex & B.ct == 0) |
                      (C.ex & C.ct == 0))
ad6 <- coarse.Lexis(ad4,
                    lim = c(1/4, 1/2),
                   keep = ad4$keep)
summary(ad6)
@ %
We see the expected behaviour when we use \texttt{coarse.Lexis}: we get
fewer records, but identical follow-up. And the \texttt{keep} argument
gives the possibility to keep selected records, or more precisely
beginnings. \texttt{keep} prevents a record to be collapsed with a
previous one, but not with a subsequent one.

%% \subsection{The entire example dataset}

%% The entire amount of example data consist of some 10,000 persons and
%% some 200,000 prescriptions:
%% <<eval=FALSE>>=
%% dim(Sx)
%% pur <- list(A = purA,
%%             B = purB,
%%             C = purC)
%% sapply(pur, nrow)
%% system.time(adx <- addDrug.Lexis(Sx, pur, tnam = "per", grace = 1/6))
%% system.time(adc <- coarse.Lexis(adx, lim = c(1/6, 1/2)))
%% summary(Sx)
%% summary(adx)
%% summary(adc)
%% @ %
%% We see hat the number of records is quite large because we have cut at
%% all purchase dates and integer ages. For practical purposes we might
%% therefore want to merge successive records with a total duration
%% \texttt{lex.dur} less than some limit.

<<echo = FALSE>>=
ende <- Sys.time()
cat("  Start time:", format(anfang, "%F, %T"),
  "\n    End time:", format(  ende, "%F, %T"),
  "\nElapsed time:", round(difftime(ende, anfang, units = "mins"), 2), "minutes\n")
@ %

\bibliographystyle{plain}

\begin{thebibliography}{1}

\bibitem{Carstensen.2007a}
B~Carstensen.
\newblock Age-{P}eriod-{C}ohort models for the {L}exis diagram.
\newblock {\em Statistics in Medicine}, 26(15):3018--3045, 2007.

\bibitem{Carstensen.2008c}
B~Carstensen, JK~Kristensen, P~Ottosen, and K~Borch-Johnsen.
\newblock The {D}anish {N}ational {D}iabetes {R}egister: {T}rends in incidence,
  prevalence and mortality.
\newblock {\em Diabetologia}, 51:2187--2196, 2008.

\end{thebibliography}

\addcontentsline{toc}{chapter}{References}

\end{document}