%\VignetteIndexEntry{Meta-Analysis of Diagnostic Accuracy with mada}
%\VignetteKeyword{meta-analysis}
%\VignetteKeyword{diagnostic test}


% \documentclass[a4paper]{article}
% \usepackage{a4wide}
% \usepackage{color,amsthm}
% \usepackage{url}

\documentclass[nojss]{jss}
%\documentclass{jss}


%% need no 
\usepackage{Sweave}

\usepackage{amsmath}
\usepackage[utf8]{inputenc}

\DeclareMathOperator{\logit}{logit}
\newcommand{\T}{\mathsf{T}}
\newcommand{\sg}{\sigma}


\title{Meta-Analysis of Diagnostic Accuracy with  \pkg{mada}}
\author{Philipp Doebler \\
      TU Dortmund University
      \And
      Heinz Holling \\
      WWU M\"unster
      \And
      Bernardo Sousa-Pinto\\ 
      University of Porto}

\Plainauthor{Philipp Doebler, Heinz Holling} %% comma-separated
\Plaintitle{Meta-Analysis of Diagnostic Accuracy with mada} %% without formatting
%%\Shorttitle{\pkg{foo}: A Capitalized Title} %% a short title (if necessary)


\Address{
  Philipp Doebler\\
  Fachbereich Psychologie und Sportwissenschaft\\
  Westf\"alische Wilhelms-Universit\"at M\"unster\\
  D-48149 M\"unster, Germany\\
  E-mail: \email{doebler@uni-muenster.de}\\
  URL: \url{http://wwwpsy.uni-muenster.de/Psychologie.inst4/AEHolling/personen/P_Doebler.html}\\
  
  
  Heinz Holling\\
  Fachbereich Psychologie und Sportwissenschaft\\
  Westf\"alische Wilhelms-Universit\"at M\"unster\\
  D-48149 M\"unster, Germany\\
  E-mail: \email{holling@uni-muenster.de}\\
  URL: \url{http://wwwpsy.uni-muenster.de/Psychologie.inst4/AEHolling/personen/holling.html}
}


%% an abstract and keywords
\Abstract{The \proglang{R}-package \pkg{mada} is a tool for the meta-analysis of diagnostic accuracy. In contrast to univariate meta-analysis, diagnostic meta-analysis requires bivariate models. An additional challenge is to provide a summary receiver operating characteristic curves that seek to integrate receiver operator characteristic curves of primary studies. The package implements the approach of \citet{reitsma2005bivariate}, which in the absence of covariates is equivalent to the HSROC model of \citet{rutter2001hierarchical}. More recent models by \citet{doebler2012mixed} and \citet{boehningphm} are also available, including meta-regression for the first approach. In addition a range of functions for descriptive statistics and graphics are provided.}
\Keywords{diagnostic meta-analysis, multivariate statistics, summary receiver operating characteristic, \proglang{R}}
\Plainkeywords{diagnostic meta-analysis, multivariate statistics, summary receiver operating characteristic, R} %% without formatting


%% publication information
%% NOTE: Typically, this can be left commented and will be filled out by the technical editor
%% \Volume{50}
%% \Issue{9}
%% \Month{June}
%% \Year{2012}
%% \Submitdate{2012-06-04}
%% \Acceptdate{2012-06-04}

                                                                     
                                                                     
                                                                     
                                             
% JSS 1060 Doebler, Holling 
% 
% Meta-Analysis of Diagnostic Accuracy with mada
% 
% For further instruction on JSS style requirements please see the JSS style
% manual(in particular section 2.1 Style Checklist) at 
% http://www.jstatsoft.org/downloads/JSSstyle.zip
% 
% Please see FAQ at: http://www.jstatsoft.org/style
% And for further references please see RECENT JSS papers for detailed
% documentation and examples. 
% 


\begin{document}
\SweaveOpts{concordance=TRUE}



<<echo=FALSE>>=
options(prompt = "R> ", continue = "+  ", width = 70, useFancyQuotes = FALSE)
@

\maketitle

\section{Introduction}
While substantial work has been conducted on methods for diagnostic meta-analysis, it has not become a routine procedure yet. One of the reasons for this is certainly the complexity of bivariate approaches, but another reason is that standard software packages for meta-analysis, for example \pkg{Comprehensive Meta-Analysis} and \pkg{RevMan} \citep{cma,revman}, do not include software to fit models appropriate for diagnostic meta-analysis. For the recommended \citep{leeflang2008systematic} bivariate approach of \citet{rutter2001hierarchical} meta-analysts can use Bayesian approaches (for example in \proglang{WinBUGS} \citep{lunn2000winbugs} or \proglang{OpenBUGS} \citep{lunn2009bugs}), the stata module \pkg{metandi} \citep{harbord2010metandi}, or the \proglang{SAS} macro \pkg{METADAS} \citep{takwoingi2008metadas}. So currently available software is either relatively complex (\proglang{WinBUGS}/\proglang{OpenBUGS}) or proprietary (\proglang{stata}, \proglang{SAS}).

The open source package \pkg{mada} written in \proglang{R} \citep{Rmanual} provides some established and some current approaches to diagnostic meta-analysis, as well as functions to produce descriptive statistics and graphics. It is hopefully complete enough to be the only tool needed for a diagnostic meta-analysis. \pkg{mada} has been developed with an \proglang{R} user in mind that has used standard model fitting functions before, and a lot of the output of \pkg{mada} will look familiar to such a user. While this paper cannot provide an introduction to \proglang{R}, it is hopefully detailed enough to provide a novice \proglang{R} user with enough hints to perform diagnostic meta-analysis along the lines of it. Free introductions to \proglang{R} are for example available on the homepage of the \href{http://www.r-project.org/}{\proglang{R} project}. We assume that the reader is familiar with central concepts of meta-analysis, like fixed and random effects models \citep[for example][]{borenstein2009introductionto} and ideas behind diagnostic accuracy meta-analysis and (S)ROC curves \citep[starting points could be][]{sutton2000methods,walter2002properties,jones2005summary,leeflang2008systematic}.

\section{Obtaining the package}
Once \proglang{R} is installed and an internet connection is available, the package can be installed from CRAN on most systems by typing
<<eval=FALSE>>=
install.packages("mada")
@
Development of \pkg{mada} is hosted at \texttt{http://r-forge.r-project.org/projects/mada/}; the most current version is available there\footnote{For example by typing \texttt{install.packages("mada", repos="http://R-Forge.R-project.org")} at an \proglang{R} prompt.}, while only stable versions are available from CRAN. The package can then be loaded:
<<>>=
library("mada")
@

\section{Entering data}
Primary diagnostic studies observe the result of a \emph{gold standard} procedure which defines the presence or absence of a \emph{condition}, and the result of a  \emph{diagnostic test} (typically some kind of low cost procedure, or at least one that is less invasive than the gold standard). Data from such a primary study could be reported in a $2\times2$ table, see Table~\ref{2times2}.

\begin{table}[h]\caption{Data from the $i$th study in a $2\times2$ table.}\label{2times2}
\begin{center}
\begin{tabular}[tbp]{lcc}\hline
  		&with condition &without condition \\
\hline
Test positive	& $y_i$		& $z_i$ \\
Test negative	& $m_i-y_i$ 	&$n_i-z_i$\\ 
\hline
Total&$m_i$&$n_i$\\ \hline
\end{tabular}
\end{center}
\end{table}

The numbers $y_i$ and $z_i$ are the numbers of true-positives (TP) and false positives (FP), respectively, and $m_i-y_i$ and $n_i-z_i$ are the numbers of false negatives (FN) and true negatives (TN). Often derived measures of diagnostic accuracy are calculated from 2$\times$2 tables. Using the notation in Table~\ref{2times2}, one can calculate

\begin{eqnarray}
p_i = \text{sensitivity of $i$th study}  &=& \frac{y_i}{m_i}\\
u_i = \text{false positive rate of $i$th study} &=& \frac{z_i}{n_i}\\
1- u_i = \text{specificity of $i$th study}  &=& \frac{n_i - z_i}{n_i}.
\end{eqnarray}

Basically all functions in the \pkg{mada} package need data from 2$\times$2 tables. One can use \proglang{R} to calculate the table given specificities or sensitivities if the sample size in each group is known (sometimes there is insufficient data to reconstruct the 2$\times$2 table). The above formulae for the sensitivity for example implies that
\[
y_i = m_ip_i.
\]
If a primary study reports a sensitivity of .944 and that there were 142 people with the condition, we can calculate $y$ by
<<>>=
y <- 142 * .944 
y
@
Since this is not an integer, we need to round it to the nearest integer
<<>>=
round(y)
@
Note that \pkg{mada} is a bit paranoid about the input: it demands that the data and the rounded data are identical to prevent some obvious error.  Hence the use of the \code{round} function should not be omitted.

Let us now assume that the number of TP, FP, FN and TN is known for each primary study. A good way to organise information in \proglang{R} is to use \emph{data frames}, which can hold different variables. In our case each row of the data frame corresponds to one primary study. As an example we enter the data from six studies from a meta-analysis of the  AUDIT-C \citep[a short screening test for alcohol problems,][]{kriston2008meta} into a data frame
<<>>=
AuditC6 <- data.frame(TP = c(47, 126, 19, 36, 130, 84),
                      FN = c(9, 51, 10, 3, 19, 2),
                      FP = c(101, 272, 12, 78, 211, 68),
                      TN = c(738, 1543, 192, 276, 959, 89))
AuditC6
@
Note that many central functions in \pkg{mada} also accept four vectors of frequencies (TP, FN, FP, TN) as input. Nevertheless, it is convenient to store not only the observed frequencies, but also the study names in the same data frame. The following command shows how to do this for our shortened example:
<<>>=
AuditC6$names <- c("Study 1", "Study 2", "Study 4",
                   "Study 4", "Study 5", "Study 6")
@
The full data set with 14 studies is part of \pkg{mada}; 
let's load the data set and have a look at the last six studies:
<<>>=
data("AuditC")
tail(AuditC)
@
In the following we will use the \code{AuditC} data set as a running example.

\subsection{Zero cells}
In the analysis of data in 2$\times$2 tables zero cells often lead to problems or statistical artefacts since certain ratios do not exist. So called \emph{continuity corrections} are added to the observed frequencies; these are small positive numbers. One suggestions in the literature is to use $0.5$ as the continuity correction, which is the default value in \pkg{mada}. All relevant functions in \pkg{mada} allow user specified continuity corrections and the correction can be applied to all studies, or just to those with zero cells.


\section{Descriptive statistics}
Descriptive statistics for a data set include the sensitivity, specificity and false-positive rate of the primary studies and also their positive and negative likelihood ratios ($\mathrm{LR}_+, \mathrm{LR}_-$), and their diagnostic odds ratio \citep[DOR;][]{glas2003diagnostic}. These are defined as
\[
\mathrm{LR}_+ = \frac{p}{u} = \frac{\text{sensitivity}}{\text{false positive rate}},
\]
\[
\mathrm{LR}_- = \frac{1-p}{1-u},
\]
and
\[
\mathrm{DOR} = \frac{\mathrm{LR}_+}{\mathrm{LR}_-} = \frac{\mathrm{TP}\cdot\mathrm{TN}}{\mathrm{FN}\cdot\mathrm{FP}}.
\]
All these are easily computed using the \code{madad} function, together with their confidence intervals. We use the formulae provided by \citet{deeks2001systematic}. \code{madad} also performs $\chi^2$ tests to assess heterogeneity of sensitivities and specificities, the null hypothesis being in both cases, that all are equal. Finally the correlation of sensitivities and false positive rates is calculated to give a hint whether the cut-off value problem is present. The following output is slightly cropped.
<<eval=FALSE>>=
madad(AuditC)
@
\begin{Soutput}
Descriptive summary of AuditC with 14 primary studies.
Confidence level for all calculations set to 95 %
Using a continuity correction of 0.5 if applicable 

Diagnostic accuracies 
       sens  2.5% 97.5%  spec  2.5% 97.5%
 [1,] 0.833 0.716 0.908 0.879 0.855 0.899
 [2,] 0.711 0.640 0.772 0.850 0.833 0.866
...
[14,] 0.748 0.684 0.802 0.749 0.702 0.792

Test for equality of sensitivities: 
X-squared = 272.3603, df = 13, p-value = <2e-16
Test for equality of specificities: 
X-squared = 2204.8, df = 13, p-value = <2e-16

Diagnostic OR and likelihood ratios 
           DOR   2.5%     97.5%  posLR  2.5%  97.5% negLR  2.5% 97.5%
 [1,]   36.379 17.587    75.251  6.897 5.556  8.561 0.190 0.106 0.339
...
[14,]    8.850  5.949    13.165  2.982 2.448  3.632 0.337 0.264 0.430

Correlation of sensitivities and false positive rates: 
   rho  2.5 % 97.5 % 
 0.677  0.228  0.888 
\end{Soutput}
The \code{madad} function has a range of options with respect to computational details; for example one can compute 80\% confidence intervals:
<<eval=FALSE>>=
madad(AuditC, level = 0.80)
@
Also note that all the output of \code{madad} is available for further computations if one assigns the output of \code{madad} to an object. For example the false positive rates with their confidence intervals can be extracted  using the \code{\$} construct (output cropped):
<<results=hide>>=
AuditC.d <- madad(AuditC)
AuditC.d$fpr
@
\begin{Soutput}
$fpr
 [1] 0.12083333 0.15005507 0.06097561 0.22112676 0.18061486 0.43354430
 [7] 0.20988806 0.52006770 0.28906250 0.17008929 0.23068670 0.19131238
[13] 0.27564103 0.25070822

$fpr.ci
            2.5%     97.5%
 [1,] 0.10050071 0.1446182
...
[14,] 0.20834216 0.2984416
\end{Soutput}
\subsection{Descriptive graphics}
For the AUDIT-C data, the $\chi^2$ tests already suggested heterogeneity of sensitivities and specificities. The corresponding \emph{forest plots} confirm this:
<<eval=FALSE>>=
forest(madad(AuditC), type = "sens")
forest(madad(AuditC), type = "spec")
@
These plots are shown in Figure~\ref{forestplots}.
<<echo=FALSE, results = hide>>=
pdf(file = "pairedforest.pdf", width = 12, height = 6)
par(omi = c(0,0,0,0), mai = c(0.9,0.3,0.3,0.3))
plot.new()
par(fig = c(0, 0.5, 0, 1), pty = "s", new = TRUE)
forest(AuditC.d, type = "sens", xlab = "Sensitivity")
par(fig = c(0.5, 1, 0, 1), pty = "s", new = TRUE)
forest(AuditC.d, type = "spec", xlab = "Specificity")
dev.off()
@

\setkeys{Gin}{width=\textwidth}

\begin{figure}
\begin{center}
\includegraphics{pairedforest}
\caption{Paired forest plot for AUDIT-C data.}\label{forestplots}
\end{center}
\end{figure}




Apart from these univariate graphics \pkg{mada} provides a variety of plots to study the data on ROC space. Note that for exploratory purposes it is often useful to employ color and other features of\proglang{R}'s plotting system. Two high level plots are provided by \pkg{mada}: \code{crosshair} to produce crosshair plots \citep{phillips2010cross}, and \code{ROCellipse}. The following is an example of a call of \code{crosshair} that produces (arbitrarily) colored crosshairs and makes the crosshairs wider with increased sample size; also only a portion of ROC space is plotted.
<<>>=
rs <- rowSums(AuditC)
weights <- 4 * rs / max(rs)
crosshair(AuditC, xlim = c(0,0.6), ylim = c(0.4,1), 
          col = 1:14, lwd = weights)
@
Figure~\ref{diagplots} displays this plot and the next descriptive plot:
\code{ROCellipse} plots confidence regions which describe the uncertainty of the pair of sensitivity and false positive rate. These regions are ellipses on logit ROC space, and by back-transforming them to regular ROC space the (sometimes oddly shaped) regions are produced. By default this function will also plot the point estimates. The following example is a bit contrived, but showcases the flexibility of \code{ROCellipse}: here the plotting of the point estimates is suppressed manipulating the \code{pch} argument, but then points are added in the next step.
<<>>=
ROCellipse(AuditC, pch = "")
points(fpr(AuditC), sens(AuditC))
@

<<echo=FALSE, results=hide>>=
pdf(file = "diagplots.pdf", width = 12, height = 6)
par(omi = c(0,0,0,0), mai = c(0.9,0.9,0.3,0.3))
plot.new()
par(fig = c(0, 0.5, 0, 1), pty = "s", new = TRUE)
crosshair(AuditC, xlim = c(0,0.6), ylim = c(0.4,1), col = 1:14, lwd = weights)
par(fig = c(0.5, 1, 0, 1), pty = "s", new = TRUE)
ROCellipse(AuditC, pch = "")
points(fpr(AuditC), sens(AuditC))
dev.off()
@

\setkeys{Gin}{width=\textwidth}

\begin{figure}
\begin{center}
\includegraphics{diagplots}
\caption{A ``weighted'' crosshair plot with (arbitrary) coloring and a plot with confidence regions for primary study estimates.}\label{diagplots}
\end{center}
\end{figure}



\section{Univariate approaches}
Before the advent of the bivariate approaches by \citet{rutter2001hierarchical} and \citet{reitsma2005bivariate}, some univariate approaches to the meta-analysis of diagnostic accuracy were more popular.  Bivariate approaches cannot be recommended if the sample size is too small. The bivariate model of \citet{reitsma2005bivariate} for example has 5 parameters, which would clearly be too much for a handful of studies. Hence \pkg{mada} provides some univariate methods. Since pooling sensitivities or specificities can be misleading \citep{gatsonis2006meta}, options for the univariate meta-analysis of these are not provided. \pkg{mada} does provide approaches for the DOR \citep{glas2003diagnostic}, the positive and negative likelihood ratios, and $\theta$, the accuracy parameter of the proportional hazards model for diagnostic meta-analysis \citep{boehningphm}. In this vignette we explain the details on the DOR methodology and the methods for $\theta$.

\subsection{Diagnostic odds ratio}
In analogy to the meta-analysis of the odds ratio (OR) methods for the meta-analysis of the DOR can be developed \citep{glas2003diagnostic}. For the \emph{fixed effects} case a Mantel-Haenszel \citep[MH; see for example][]{deeks2001systematic} is provided by \pkg{mada}. The underlying fixed effects model has the form
\[
\mathrm{DOR}_i = \mu + \epsilon_i,
\]
where $\mu$ is true underlying DOR and the $\epsilon_i$ are independent errors with mean 0 and study specific variance. The MH estimator is a weighted average of DORs observed in the primary studies and is robust to the presence of zero cells. It takes the form
\[
\hat \mu = \sum_i \frac{\omega_i^{MH} \mathrm{DOR}_i}{\sum_i \omega_i^{MH}},
\]
where $\omega_i^{MH} = \frac{z_i(m_i - y_i)}{m_i + n_i}$ are the Mantel-Haenszel weights.

One obtains an estimator for a \emph{random effects} model following the approach of DerSimonian and Laird \citep[DSL;][]{dersimonian1986meta}. Here the underlying model is in terms of the $\log$ DORs. One assumes
\[
\log\mathrm{DOR}_i = \mu + \epsilon_i + \delta_i,
\]
where $\mu$ is the mean of the $\log$ DORs, $\epsilon_i$ and $\delta_i$ are independent with mean 0; the variance $\sigma_i^2$ of $\epsilon_i$ is estimated as
\[
\hat\sigma_i^2 = \frac{1}{y_i} + \frac{1}{m_i - y_i} +\frac{1}{z_i}  + \frac{1}{n_i -z_i}, 
\]
and the variance $\tau^2$ of $\delta_i$ is to be estimated. The DSL estimator then is a weighted estimator, too:
\[
\hat\mu = \sum_i\frac{\omega_i^{DSL} \mathrm{DOR}_i}{\sum_i \omega_i^{DSL}},
\]
where
\[
\omega_i^{DSL} = \frac{1}{\hat\sigma_i^2 + \tau^2}.
\]
The variance $\tau^2$ is estimated by the Cochran $Q$ statistic trick.

The function \code{madauni} handles the meta-analysis of the DOR (and the negative and positive likelihood ratios). One can use \code{madauni} in the following fashion:
<<>>=
(fit.DOR.DSL <- madauni(AuditC))
(fit.DOR.MH <- madauni(AuditC, method = "MH"))
@
Note that the brackets around \code{fit.DOR.DSL <- madauni(AuditC)} are a compact way to print the fit. The \code{print} method for \code{madauni} objects is not very informative, only the point estimate is returned along with (in the random effects case) an estimate of the $\tau^2$, the variance of the random effects. Note that estimation in the random effects case is performed on log-DOR scale, so that $\tau^2$ of the above DSL fit is substantial. To obtain more information the \code{summary} method can be used:
<<>>=
summary(fit.DOR.DSL)
@
In addition to the confidence intervals, Cochran's $Q$ statistic \citep{cochran1954combination} can be seen and Higgins $I^2$ \citep{higgins2003measuring}. Producing a forest plot of the (log-)DOR values together with the summary estimate is straightforward using the  \code{forest} method for the  \code{madauni} class:
<<>>=
forest(fit.DOR.DSL)
@
The resulting plot is shown in Figure~\ref{DORforest}.
<<echo=FALSE, results=hide>>=
pdf(file = "DORforest.pdf", width = 6, height = 6)
par(omi = c(0,0,0,0), mai = c(0.9,0.9,0.3,0.3))
forest(fit.DOR.DSL)
dev.off()
@
\setkeys{Gin}{width=0.5\textwidth}
\begin{figure}
\begin{center}
\includegraphics{DORforest}
\caption{Forest plot for a univariate random effects meta-analysis of the AUDIT-C data using the diagnostic odds ratio.}\label{DORforest}
\end{center}
\end{figure}


\subsection{Proportional hazards model approach}
The proportional hazards model approach \citep[PHM; see][]{boehningphm} builds on the assumption of a simple form of the ROC curves. The so called \emph{Lehmann model} \citep{le2006solution} is assumed. Let $p_i$ and $u_i$ denote the $i$th study's sensitivity and false positive rate  respectively. The relationship of $p_i$ and $u_i$ is then assumed to be
\[
p_i = u_i^{\theta_i}, 
\]
where $\theta_i > 0$ is a diagnostic accuracy parameter. The smaller $\theta$, the larger the area under the ROC curve and thus the more accurate the diagnostic test. For the meta-analysis of $\theta$ the APMLE estimator is implemented in \pkg{mada} for the case of homogeneity (i.e., fixed effects) and heterogeneity (i.e., random effects). Again the standard output of the \code{phm} function is rather sparse:
<<>>=
(fit.phm.homo <- phm(AuditC, hetero = FALSE))
(fit.phm.het <- phm(AuditC))
@
The \code{summary} method is more informative:
<<>>=
summary(fit.phm.homo)
@
The $\chi^2$ test goodness of fit test rejects the assumption of homogeneity, but the fit of the model for heterogeneity is better:
<<>>=
summary(fit.phm.het)
@
The estimation of $\theta$ results in an SROC curve; plotting this curve together with confidence bands obtained from the confidence interval of $\theta$ in the summary is done with the \code{plot} method. We also add the original data on ROC space with confidence regions and only plot a portion of ROC space.
<<>>=
plot(fit.phm.het, xlim = c(0,0.6), ylim = c(0.4,1))
ROCellipse(AuditC, add = TRUE)
@
The resulting plot is shown in Figure~\ref{phmplot}.
<<echo=FALSE, results=hide>>=
pdf(file = "phmplot.pdf", width = 6, height = 6)
par(omi = c(0,0,0,0), mai = c(0.9,0.9,0.3,0.3))
plot(fit.phm.het, xlim = c(0,0.6), ylim = c(0.4,1))
ROCellipse(AuditC, add = TRUE)
dev.off()
@
\setkeys{Gin}{width=0.5\textwidth}
\begin{figure}
\begin{center}
\includegraphics{phmplot}
\caption{Summary plot for the analysis of the AUDIT-C data with the PHM model.}\label{phmplot}
\end{center}
\end{figure}

Note that the SROC curve is not extrapolated beyond the range of the original data. The area under the SROC curve, the AUC, is also part of the summary above. For the PHM it is calculated by
\[
\mathrm{AUC} = \frac{1}{\theta + 1},
\]
and by the same relation a confidence interval for the AUC can be computed from the confidence interval for $\theta$. The \pkg{mada} package also offers the \code{AUC} function to calculate the AUC of other SROC curves which uses the trapezoidal rule.

\section{A bivariate approach}
Typically the sensitivity and specificity of a diagnostic test depend on each other through a cut-off value: as the cut-off is varied to, say, increase the sensitivity, the specificity often decreases. So in a meta-analytic setting one will often observe (negatively) correlated sensitivities and specificities. This observation can (equivalently) also be stated as a (positive) correlation of sensitivities and false positive rates. Since these two quantities are interrelated, bivariate approaches to the meta-analysis of diagnostic accuracy have been quite successful \citep{rutter2001hierarchical,van2002advanced,reitsma2005bivariate,harbord2007unification,arends2008bivariate}.

One typically assumes a binomial model conditional on a primary studies true sensitivity and false positive rates, and a bivariate normal model for the logit-transformed pairs of sensitivities and false positive rates. There are two ways to cast the final model: as a non-linear mixed model or as linear mixed model \cite[see for example][]{arends2008bivariate}. The latter approach is implemented in \pkg{mada}'s \code{reitsma} function, so we give some more details. We note that more generally the following can be seen as a multivariate meta-regression and so the the package \code{mvmeta} \citep{gasparrini2012} serves as a basis for our implementation. 

Let $p_i$ and $u_i$ denote the $i$th study's true sensitivity and false positive rate respectively, and let $\hat p_i$ and $\hat u_i$ denote their estimates from the observed frequencies. Then, since a binomial model is assumed conditional on the true $p_i$, the variance of $\logit(\hat p_i)$ can be approximated\footnote{This uses the delta method.} by
\[
\frac{1}{m_i \hat p_i(1 - \hat p_i)},
\]
and the variance of $\logit(\hat u_i)$ is then
\[
\frac{1}{n_i \hat u_i(1 - \hat u_i)}.
\]
So on the within study level one assumes, conditional on $p_i$ and $u_i$, that the observed variation is described by these variances and a normal model; let $D_i$ denote a diagonal 2$\times$2 matrix with the two variances on the diagonal. On the study level, one assumes that a global mean
\[
\mu=(\mu_1,\mu_2)^\T 
\]
and covariance matrix 
\[
\Sigma=\left(
\begin{array}{cc}
\sg_1^2   & 		\sg \\
\sg		&  \sg_2^2
\end{array}\right)
\]
describe the heterogeneity of the pairs $(\logit(p_i),\logit(u_i))$. So the model for the $i$th study is then
\[
(\logit(\hat p_i),\logit(\hat u_i))^\T \sim \mathrm{N}(\mu, \Sigma + D_i).
\]
Fitting this model in \pkg{mada} is similar to the other model fitting functions:
<<>>=
(fit.reitsma <- reitsma(AuditC))
@
The \code{print} method for \code{reitsma} objects has a scarce output. More information is offered by the \texttt{summary} method:
<<>>=
summary(fit.reitsma)
@
Note the sensitivity and false positive rate returned in this summary are just the back-transformed $\mu_1$ and $\mu_2$. 
In addition, please note that the methodology for estimating the I2 in the context of diagnostic test accuracy meta-analyses is not yet fully established. The obtained estimates are based on the approach described by \citet{zhou2014i2} for bivariate meta-analysis and based on the approaches described by \citet{holling2019i2}. For the latter, we compute both sample size-unadjusted and adjusted estimated according to all provided formulae for computing within-study variability.
One can then proceed to plot the SROC curve of this model. By default the point estimate of the pair of sensitivity and false positive rate is also plotted together with a confidence region. In the following example the SROC curve is plotted a bit thicker using the \texttt{sroclwd} argument, a caption is added to the plot and also the data and a legend. By default the SROC curve is not extrapolated beyond the range of the original data: 
<<>>=
plot(fit.reitsma, sroclwd = 2,
     main = "SROC curve (bivariate model) for AUDIT-C data")
points(fpr(AuditC), sens(AuditC), pch = 2)
legend("bottomright", c("data", "summary estimate"), pch = c(2,1))
legend("bottomleft", c("SROC", "conf. region"), lwd = c(2,1))
@
The output is shown in Figure~\ref{SROCAuditC}.
<<echo=FALSE, results=hide>>=
pdf(file = "SROCAuditC.pdf", width = 6, height = 6)
par(omi = c(0,0,0,0), mai = c(0.9,0.9,0.3,0.3))
plot(fit.reitsma, sroclwd = 2,
     main = "SROC curve (bivariate model) for AUDIT-C data")
points(fpr(AuditC), sens(AuditC), pch = 2)
legend("bottomright", c("data", "summary estimate"), pch = c(2,1))
legend("bottomleft", c("SROC", "conf. region"), lwd = c(2,1))
dev.off()
@
\setkeys{Gin}{width=0.5\textwidth}
\begin{figure}
\begin{center}
\includegraphics{SROCAuditC}
\caption{SROC curve for the \citet{reitsma2005bivariate} model.}\label{SROCAuditC}
\end{center}
\end{figure}



\subsection{Comparing SROC curves}
We show how to compare SROC curves. \citet{patrick1994validity} conducted a meta-analysis to (among other things) investigate the efficacy of self administered and interviewer administered questionnaires to detect nicotine use. The data sets \texttt{SAQ} and \texttt{IAQ} are the respective subsets of this data. First one fits bivariate models to the data sets:
<<>>=
data("IAQ")
data("SAQ")
# both datasets contain more than one 2x2-table per study
# reduce (somewhat arbitrarily) to one row per study by
# using the first coded table only:
IAQ1 <- subset(IAQ, IAQ$result_id == 1)
SAQ1 <- subset(SAQ, SAQ$result_id == 1)
fit.IAQ <- reitsma(IAQ1)
fit.SAQ <- reitsma(SAQ1)
@
Then one plots the SROC curves of these fits, beginning with the fit of the IAQ and adding the SAQ curve. Note that the \texttt{lty} arguments is used so that the curves can be distinguished.
<<>>=
plot(fit.IAQ, xlim = c(0,.5), ylim = c(.5,1),
     main = "Comparison of IAQ and SAQ")
lines(sroc(fit.SAQ), lty = 2)
ROCellipse(fit.SAQ, lty = 2, pch = 2, add = TRUE)
points(fpr(IAQ1), sens(IAQ1), cex = .5)
points(fpr(SAQ1), sens(SAQ1), pch = 2, cex = 0.5)
legend("bottomright", c("IAQ", "SAQ"), pch = 1:2, lty = 1:2)
@
<<echo=FALSE, results=hide>>=
pdf(file = "SAQIAQ.pdf", width = 6, height = 6)
par(omi = c(0,0,0,0), mai = c(0.9,0.9,0.3,0.3))
plot(fit.IAQ, xlim = c(0,.5), ylim = c(.5,1),
     main = "Comparison of IAQ and SAQ")
lines(sroc(fit.SAQ), lty = 2)
ROCellipse(fit.SAQ, lty = 2, pch = 2, add = TRUE)
points(fpr(IAQ), sens(IAQ), cex = .5)
points(fpr(SAQ), sens(SAQ), pch = 2, cex = 0.5)
legend("bottomright", c("IAQ", "SAQ"), pch = 1:2, lty = 1:2)
dev.off()
@
\setkeys{Gin}{width=0.5\textwidth}
\begin{figure}
\begin{center}
\includegraphics{SAQIAQ}
\caption{Comparison of interviewer and self-adminstered smoking questionaires with SROC curves.}\label{SAQIAQ}
\end{center}
\end{figure}
Figure~\ref{SAQIAQ} contains the resulting plot. The summary estimates are well separated, though the confidence regions slightly overlap. It would nevertheless be safe to conclude that IAQ is a more reliable way to measure smoking than SAQ.

\subsection{Bivariate meta-regression}
We demonstrate diagnostic meta-regression also using the data of \citet{patrick1994validity}. We use the complete data set, which is loaded by
<<>>=
data("smoking")
# again reduce to one result per study:
smoking1 <- subset(smoking, smoking$result_id == 1)
@
The \texttt{data.frame} contains the same variables as the \texttt{SAQ} and \texttt{IAQ} subsets, but the type is coded by the variable \texttt{type}:
<<>>=
summary(smoking1$type)
@
We use the factor \texttt{type} as a covariate in diagnostic meta-regression:
<<>>=
fit.smoking.type <- reitsma(smoking1, 
                            formula = cbind(tsens, tfpr) ~ type)
@
Note that the left hand side of the \texttt{formula} object always has to be of the form \texttt{cbind(tsens, tfpr)}, where \texttt{tsens} and \texttt{tfpr} are for \emph{transformed} sensitivity and false positive rate respectively.
We generate detailed output by:
<<>>=
summary(fit.smoking.type)
@
This output can be interpreted as follows: The $z$~value for the regression coefficient for the false-positive rates is significant, indicating that the interviewer administered questionnaires offer a better false-positive rate (the coefficient for the difference in false-positive rate for SAQ is positive, so the false positive rates are higher for the SAQ and, hence, lower for the IAQ). Interestingly the point estimate for the sensitivities does not indicate any effect.

Note that once meta-regression is used, one cannot reasonably plot SROC curves, since fixed values for the covariates would have to be supplied to do so. Also (global) AUC values do not make sense.

We can also compare the fit of two bivariate meta-regressions with a likelihood-ratio test. For this, we have to refit the models with the maximum likelihood method, as the likelihood-ratio test relies on asymptotic theory that is only valid if this estimation method is employed.

<<>>=
fit.smoking.ml.type <- reitsma(smoking1, 
                          formula = cbind(tsens, tfpr) ~ type, 
                          method = "ml")
fit.smoking.ml.intercept <- reitsma(smoking1, 
                                    formula = cbind(tsens, tfpr) ~ 1,
                                    method = "ml")
anova(fit.smoking.ml.type, fit.smoking.ml.intercept)
@
The meta-regression confirms that type explains some of the heterogeneity between the primary studies.

\subsection{Transformations beyond the logit}
All bivariate approaches explained so far use the conventional logit transformation. The \texttt{reitsma} function offers the parametric $t_\alpha$ family \citep{doebler2012mixed} of transformations as alternatives. The family is defined by
\[
t_\alpha(x) := \alpha\log(x) - (2-\alpha)\log(1-x),\quad x\in (0,1), \alpha\in [0,2].
\]
For $\alpha = 1$, the logit is obtained. In many cases the fit of a bivariate meta-regression can be improved upon by choosing adequate values for $\alpha$. The rational behind this is, that especially sensitivities tend to cluster around values like .95 and the symmetric logit transformation does not necessarily lead to normally distributed transformed proportions. As an example we study the \texttt{smoking} data again using maximum-likelihood estimation:
<<>>=
fit.smoking1 <- reitsma(smoking1, method = "ml")
fit.smoking2 <- reitsma(smoking1, 
                        alphasens = 0, alphafpr = 2, 
                        method = "ml")
AIC(fit.smoking1)
AIC(fit.smoking2)
@
The almost identical AIC values indicates, that the fit of the models is comparable. For purpose of inference, we likelihood-ratio tests are recommended, which are discussed for this type of transformation by \citet{doebler2012mixed}.


\subsection{Estimating Likelihood Ratios and Diagnostic Odds Ratio}
Based on the bivariate model for diagnostic test accuracy, it is possible to obtain pooled likelihood ratios and diagnostic odds ratio. The \texttt{SummaryPts} function applies a sampling based approach (as proposed by \citet{zwinderman2008}) to estimate the aforementioned properties.
As an example, applying the \texttt{SummaryPts} function to the \texttt{AuditC} data, we would get: 

<<>>=
summary_pts_audit <- SummaryPts(reitsma(AuditC))
summary(summary_pts_audit)
@

\subsection{Estimating Pooled Predictive Values}
Predictive values are particularly useful in the clinical practice. The negative predictive value indicates the probability of a patient with a negative test not having a certain disease/condition, while the positive predictive value indicates the probability of a patient with a positive test having that certain disease/condition. The predictive values are dependent not only on the sensitivity and specificity of the diagnostic test, but also on the prevalence of the disease/condition in the setting being studied.
The \texttt{predv\_r} and the \texttt{predv\_d} functions project probability distributions of predictive values based i. on pooled sensitivities and specificities (obtained using the bivariate approach) and ii. on prevalence ranges or distributions. An application of this approach has been used by \citet{sousapinto2021}
As an example, we will use the \texttt{AuditC} data. Let us consider that the prevalence of alcohol problems (the condition being assessed in the AuditC example) ranges between 5\% and 15\%. We can use the \texttt{predv\_r} function to obtain distributions for the negative and positive predictive values of the screening test for each prevalence value within that range:

<<>>=
pred_audit1 <- predv_r(AuditC, prop_min=0.05, prop_max=0.15)
summary(pred_audit1)
@

We observe that the, for the defined prevalence range, the mean estimate for the negative predictive value ranges between 98\% and 99\%, while the mean estimate for the positive predictive value ranges between 18\% and 42\%. 

Now let us consider that we do not want to project predictive values based on a prevalence range, but rather based on a prevalence probability distribution. We know that the prevalence of alcohol problems is given by a distribution, with a mean value of 10\% and a standard-deviation of 5. We can use the \texttt{predv\_d} function to obtain probability distributions for the negative and positive predictive values:

<<>>=
pred_audit2 <- predv_d(AuditC, prop_m=0.10, prop_sd=0.05)
summary(pred_audit2)
@

We obtain a distribution of negative predictive values defined by a mean of 98\% and a standard-deviation of 1\% (95\%CrI=96-100\%), and a distribution of positive predictive values defined by a mean of 30\% and a standard-deviation of 12\% (95\%CrI=10-53\%).

By default, the Zwindermann \& Bossuyt approach is used to generate samples based of sensitivities and false positive rates \citet{zwinderman2008}, based on which distributions of predictive values are obtained. For faster results, such an approach may not be used (\texttt{zb=FALSE}). Expected differences are small, especially if the number of participants of primary studies is sufficiently high.

If prevalence inputs (minimum and maximum, or mean and standard-deviation) are not provided, the \texttt{predv\_r} and the \texttt{predv\_d} functions will estimate those inputs based on data from primary studies. However, this may not be advisable, as studies focusing on diagnostic test accuracy are typically not specifically designed for prevalence assessment (case-control studies are particularly troublesome in this context).

\section{Further development}
In the future \pkg{mada} will support the mixture approach of \citet{holling2011likelihood} and Bayesian approaches.

\section*{Acknowledgements}
This work was funded by the DFG project HO 1286/7-2.

\bibliography{mada}{}
%\bibliographystyle{alpha}

\end{document}