\documentclass[a4paper]{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}% accent handling (for PDF output)
\usepackage{amsthm, amssymb, amstext}
\usepackage{natbib}
\bibliographystyle{plain}
\usepackage{hyperref}
\usepackage{geometry}
\geometry{left = 1in, right = 1in}
\setlength{\parskip}{\medskipamount}
\setlength{\parindent}{2em}
\addtolength{\textheight}{2ex}


\theoremstyle{plain}
\newtheorem{thm}{Theorem}[section]
\newtheorem{prop}[thm]{Proposition}
\newtheorem{cor}[thm]{Corollary}

\theoremstyle{definition}
\newtheorem{defn}{Definition}[section]
\newtheorem{exmp}{Example}[section]

\theoremstyle{remark}
\newtheorem{rem}{Remark}
\newtheorem{note}{Note}

\renewcommand{\theenumi}{\roman{enumi}}
\renewcommand{\labelenumi}{(\theenumi)}

\newcommand{\ud}{\mathrm{d}}
\newcommand{\mb}{\mathbf}
\newcommand{\ms}{\mathscr}
\newcommand{\BMA}{\textsc{BMA}}
\newcommand{\PB}{\textsc{PB}}
\newcommand{\NL}{\textsc{NL}}
\newcommand\email{\begingroup \urlstyle{tt}\Url}


\title{The \textsf{BMAmevt} package:\\
  Bayesian Model Averaging at Work for Multivariate Extremes}
\date{}

\author{Anne Sabourin}

%\VignetteIndexEntry{ Bayesian Model Averaging for Multivariate Extremes}
%\VignetteDepends{stats, utils}
%\VignetteKeyword{Extreme Value Theory, Multivariate Extremes,
%Spectral Measure}
%\VignettePackage{BMAmevt}


\begin{document}
\SweaveOpts{concordance=TRUE}

\maketitle

\begin{center}
  \email{anne.sabourin@telecom-paristech.fr}\\
  \email{annesab1@gmail.com}\\
  \vspace{0.2cm}
  LTCI, Télécom ParisTech\\
  Université Paris-Saclay, France
  \vspace{0.2cm}
\end{center}

\section{Overview}

This package is a `Bayesian Model Averaging' (\BMA) toolkit. Its main
purpose is the estimation of the dependence structure between the
largest values of multivariate data, within the probabilistic
framework of Multivariate Extreme Value Theory. However, the principal
functions implemented here (an MCMC sampler and a simple Monte-Carlo
sampler for the marginal model likelihoods, see below) could be used
in any other statistical context.

This package was developed to implement the methods proposed in
\cite{Sabourin12,Sabourin14}.

The main functions are
\begin{itemize}
\item \verb@posteriorMCMC@: a generic MCMC sampler (Metropolis-Hastings
  algorithm) to estimate the posterior distribution in each individual
  model;
\item \verb@marginal.lkl@: a generic Monte-Carlo sampler to estimate the
  marginal model likelihoods.
\end{itemize}
Plotting tools are also provided for the three-dimensional case
(functions \verb@dgridplot@ and \verb@discretize@) to display contour
lines and level sets of angular measures.
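
For instance, the discretized angular density of one of the
pre-implemented models described below can be displayed as follows (a
minimal sketch; the same calls appear in the tutorial):
\begin{Schunk}
\begin{Sinput}
> dens=dnestlog.grid(par=c(0.7, 0.8), npoints=60, eps=1e-3,
+   equi=TRUE, displ=FALSE)
> dgridplot(dens, npoints=60, eps=1e-3, equi=TRUE)
\end{Sinput}
\end{Schunk}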

Any parametric model may be passed as an argument. To `plug' one's
favorite model into the package, one only needs to define (see the
sketch after this list):
\begin{itemize}
\item the likelihood function of the model (a parametric density on
  the simplex, or on any other finite dimensional sample space);
\item the prior distribution on the parameter, together with a prior
  sampler;
\item the proposal distribution and the corresponding simulation rule
  for the Metropolis-Hastings algorithm.
\end{itemize}
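As an illustration, here is a minimal sketch of such a user-defined
model: a symmetric Dirichlet density on the three-dimensional simplex
with a single log-concentration parameter, a Gaussian prior and a
Gaussian random-walk proposal. The function names (\verb@dsdiri@,
\verb@prior.sdiri@, \verb@proposal.sdiri@) are hypothetical, and the
argument conventions only imitate those of the pre-implemented
Pairwise Beta functions (\verb@dpairbeta@, \verb@prior.pb@,
\verb@proposal.pb@); their help pages, together with that of
\verb@posteriorMCMC@, give the authoritative interface.
\begin{Schunk}
\begin{Sinput}
> ## Illustrative names and argument conventions: see ?posteriorMCMC
> ## for the authoritative interface.
> ## (1) Likelihood: (log-)density of angular points x (rows on the
> ## simplex); the single parameter is a log-concentration.
> dsdiri <- function(x, par, log=FALSE, vectorial=TRUE){
+   a <- exp(par)
+   x <- matrix(x, ncol=3)
+   ld <- lgamma(3*a) - 3*lgamma(a) + (a-1)*rowSums(log(x))
+   if(!vectorial) ld <- sum(ld)
+   if(log) ld else exp(ld)
+ }
> ## (2) Prior: Gaussian on the log-concentration, with
> ## hyper-parameters Hpar; "r" samples, "d" evaluates the density.
> prior.sdiri <- function(type=c("r","d"), n, par, Hpar, log, dimData){
+   if(type[1]=="r") rnorm(n, mean=Hpar$mean, sd=Hpar$sd)
+   else dnorm(par, mean=Hpar$mean, sd=Hpar$sd, log=log)
+ }
> ## (3) Proposal: Gaussian random walk for the Metropolis-Hastings
> ## algorithm; MCpar is the standard deviation of the kernel.
> proposal.sdiri <- function(type=c("r","d"), cur.par, prop.par,
+                            MCpar, log){
+   if(type[1]=="r") rnorm(1, mean=cur.par, sd=MCpar)
+   else dnorm(prop.par, mean=cur.par, sd=MCpar, log=log)
+ }
\end{Sinput}
\end{Schunk}
These three functions would then be passed to \verb@posteriorMCMC@,
\verb@marginal.lkl@ and \verb@posteriorWeights@ in the same way as
their pre-implemented counterparts in the examples below.
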
By way of example, two parametric models are pre-implemented: the
Pairwise Beta (\PB) model, valid in arbitrary ``moderate''
dimension\footnote{It has been tested on five-dimensional data sets in
  \cite{cooley2010pairwise}. Applying our methods in much higher
  dimensions would be computationally unrealistic.}, and a specific
`Nested Asymmetric' extension of the logistic model
\citep{gumbel1960distributions}, only valid for three-dimensional
data, which we refer to as the \NL~model. This nested extension was
cited as an example in \cite{tawn1990modelling} and
\cite{coles1991modeling}. See \cite{Sabourin12} for more details.
\section{Tutorial}
We now give two basic examples showing how to use the functions of the
package.

\subsection{Simulated data}
We consider the following problem: given an angular dataset, how can
one conduct the estimation in the \PB~and \NL~models, and how can the
estimated spectral measures be combined?

 
First, simulate the data:
\begin{Schunk}
\begin{Sinput}
> PBpar=c(0.7,3.1,0.45,2)
> NLpar=c(0.7,0.8)
> set.seed(1)
> mixDat=rbind(rpairbeta(n=50,dimData=3, par=PBpar),
+   rnestlog(n=50,par=NLpar))
\end{Sinput}
\end{Schunk}
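
The result should be a $100 \times 3$ matrix of angular points, each
row lying (up to numerical error) on the two-dimensional unit
simplex. A quick sanity check:
\begin{Schunk}
\begin{Sinput}
> dim(mixDat)
> range(rowSums(mixDat))  ## should be close to c(1, 1)
\end{Sinput}
\end{Schunk}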

Now, check that the two models receive comparable posterior weights:
\begin{Schunk}
\begin{Sinput}
> pWei=posteriorWeights(dat=mixDat,
+                   HparList=list( pb.Hpar, nl.Hpar ),
+                   lklList=list( dpairbeta, dnestlog ),
+                   priorList=list( prior.pb,  prior.nl ),
+                   priorweights=c(0.5,0.5),
+                   Nsim=50e+3,
+                   Nsim.min=10e+3, precision=0.1,
+                   show.progress=c(),
+                   displ=FALSE)
> pWei
\end{Sinput}
\end{Schunk}
You will obtain a matrix whose first column (the posterior weights) is
approximately equal to \verb@c(0.31, 0.69)@.
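
For reference, the weight assigned to model $M_j$ is its prior weight
$p_j$ multiplied by the Monte-Carlo estimate $\hat m_j$ of its
marginal likelihood, renormalized over the candidate models:
\[
\widehat P(M_j \mid \mb{x})
= \frac{p_j\, \hat m_j(\mb{x})}{\sum_{k} p_k\, \hat m_k(\mb{x})},
\qquad
\hat m_j(\mb{x}) = \frac{1}{N} \sum_{i=1}^{N} f_j(\mb{x} \mid \theta_i),
\]
where $f_j(\cdot \mid \theta)$ is the likelihood in model $M_j$ and
the $\theta_i$'s are drawn from the prior, as in \verb@marginal.lkl@
(the arguments \verb@Nsim@, \verb@Nsim.min@ and \verb@precision@ above
control the Monte-Carlo effort and the targeted accuracy of this
estimate).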

Now, conduct the inference in each model separately:
\begin{Schunk}
\begin{Sinput}
> PBpost=posteriorMCMC.pb(dat=mixDat,Nsim=15e+3,Nbin=5e+3,
+   show.progress= c(100, 10e+3))
> NLpost=posteriorMCMC.nl(dat=mixDat,Nsim=15e+3,Nbin=5e+3,
+   show.progress= c(100, 10e+3))
\end{Sinput}
\end{Schunk}

It is recommended to check convergence: 
\begin{Schunk}
\begin{Sinput}
> library(coda)
> heidel.diag(PBpost$stored.vals)
> heidel.diag(NLpost$stored.vals)
\end{Sinput}
\end{Schunk}
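
Trace and density plots provide a complementary visual check (a
minimal sketch, assuming as above that \verb@stored.vals@ contains the
matrix of posterior draws):
\begin{Schunk}
\begin{Sinput}
> plot(as.mcmc(PBpost$stored.vals))
> plot(as.mcmc(NLpost$stored.vals))
\end{Sinput}
\end{Schunk}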

Have a look at the posterior predictive spectral densities:
\begin{itemize}
\item In the \PB ~model only:
\begin{Schunk}
\begin{Sinput}
>   dev.new()
> PBpred=posterior.predictive.pb(post.sample=PBpost,from=NULL, to=NULL,
+   lag=40,npoints=60,eps=1e-3, equi=TRUE, displ=TRUE, main="PB predictive")
\end{Sinput}
\end{Schunk}

\item In the \NL ~model only:
\begin{Schunk}
\begin{Sinput}
> dev.new()
> NLpred=posterior.predictive.nl(post.sample=NLpost,from=NULL, to=NULL,
+   lag=40,npoints=60,eps=1e-3, equi=TRUE, displ=TRUE, main="NL predictive")
\end{Sinput}
\end{Schunk}

\item Finally, the \BMA~predictive is the posterior-weighted mixture of the two:
\begin{Schunk}
\begin{Sinput}
> dev.new()
> dgridplot(0.31*PBpred + 0.69*NLpred, npoints=60, eps=1e-3, equi=TRUE,
+      main="BMA predictive")
\end{Sinput}
\end{Schunk}
\end{itemize}

Compare to the `truth' (an equal-weights mixture of the two densities,
matching the simulation design):
\begin{Schunk}
\begin{Sinput}
> pbdens=dpairbeta.grid(par=PBpar, displ=FALSE, equi=TRUE,
+   npoints=60, eps=1e-3)
> nldens=dnestlog.grid(par=NLpar, displ=FALSE, equi=TRUE,
+   npoints=60, eps=1e-3)
> dev.new()
> dgridplot(0.5*pbdens + 0.5*nldens, equi=TRUE, npoints=60, eps=1e-3,
+   main="Truth")
\end{Sinput}
\end{Schunk}

To obtain a quantitative assessment of the gain brought by the
\BMA~approach, you may compare the Kullback-Leibler (KL) divergence and
the $L^2$ distance between the true density and each estimate,
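namely, writing $f$ for the true angular density and $\hat f$ for an
estimate,
\[
\mathrm{KL}\bigl(f \,\Vert\, \hat f\bigr)
= \int_{S} f \log \frac{f}{\hat f}\; \ud w,
\qquad
\bigl\Vert f - \hat f \bigr\Vert_2^2
= \int_{S} \bigl(f - \hat f\bigr)^2 \, \ud w,
\]
where $S$ denotes the two-dimensional unit simplex. The function
\verb@scores3D@ approximates these integrals by sums over a
discretization grid, whence the preliminary check below that the grid
is fine enough: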

\begin{Schunk}
\begin{Sinput}
> ## choose a grid size
> npoints=100
> eps=1e-3
> ##
> ## compute the true density on this grid
> TRUEdens=0.5*dnestlog.grid(par=NLpar, equi=FALSE, npoints=npoints,
+                            eps=eps, displ=FALSE)+
+   0.5*dpairbeta.grid(par=PBpar, equi=FALSE, npoints=npoints,
+                      eps=eps, displ=FALSE)
> ##
> ## check that the grid is fine enough
> scores3D(true.dens=TRUEdens, est.dens=TRUEdens, npoints=npoints,
+          eps=eps)
> ##
> ## compute the posterior predictive:
> #### in the PB model:
> rectPBpred=posterior.predictive.pb(post.sample=PBpost,from=NULL, to=NULL,
+   lag=40,npoints=npoints,eps=eps,equi=FALSE, displ=FALSE)
> #### in the NL model: 
> rectNLpred=posterior.predictive.nl(post.sample=NLpost,from=NULL, to=NULL,
+   lag=40,npoints=npoints,eps=eps,equi=FALSE, displ=FALSE)
> ##
> ## Finally, compare the performance scores:
> #### PB scores
> scores3D(true.dens=TRUEdens, est.dens=rectPBpred, npoints=npoints,
+          eps=eps)
> #### NL scores
> scores3D(true.dens=TRUEdens, est.dens=rectNLpred, npoints=npoints,
+          eps=eps)
> #### BMA scores
> scores3D(true.dens = TRUEdens,
+          est.dens = 0.31*rectPBpred+
+             0.69*rectNLpred,
+          npoints=npoints,
+          eps=eps)
\end{Sinput}
\end{Schunk}
 
The \BMA~predictive spectral measure outperforms each individual
model's predictive, both for the $L^2$ score and the logarithmic score.

 
\subsection{Tri-variate Leeds dataset}

Let us consider the tri-variate air pollutant concentration data
provided with the package \citep[see][]{Sabourin12,cooley2010pairwise}.
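
A quick look at the dataset first (it is shipped with the package
under the name \verb@Leeds@; the \verb@data@ call may be unnecessary
if the package uses lazy data loading):
\begin{Schunk}
\begin{Sinput}
> data(Leeds)
> dim(Leeds)
> head(Leeds)
\end{Sinput}
\end{Schunk}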

Again, estimate the posterior weights:

\begin{Schunk}
\begin{Sinput}
> pWeiLeeds=posteriorWeights(dat=Leeds,
+                   HparList=list( pb.Hpar, nl.Hpar ),
+                   lklList=list(dpairbeta , dnestlog ),
+                   priorList=list(prior.pb, prior.nl ),
+                   priorweights=c(0.5,0.5),
+                   Nsim=50e+3,
+                   Nsim.min=10e+3, precision=0.1,
+                   displ=TRUE)
> pWeiLeeds
\end{Sinput}
\end{Schunk}
The \NL~model obtains an overwhelming posterior weight: the
\BMA~framework thus selects a single model. It is then unnecessary to
compute the posterior distribution of the parameter in the \PB~model,
since averaging the estimates amounts to selecting the \NL~estimate.

This example is not exceptional: as the sample size increases, the
\BMA\ is bound to become a model selection tool. In the asymptotic
limit, the model minimizing the Kullback-Leibler divergence to the
`truth' is chosen; see \emph{e.g.} \cite{berk1966limiting,
  kleijn2006misspecification}.

\bibliography{biblio}   

\end{document}