\documentclass[11pt]{article} % !Rnw weave = knitr %% \VignetteIndexEntry{Normalization of power spectral density estimates} %% \VignetteEngine{knitr::knitr} \usepackage[T1]{fontenc} \usepackage[utf8]{inputenc} \usepackage{fancyvrb} \usepackage[pdfborder={0 0 0}]{hyperref} \usepackage{url} \usepackage{upquote} \usepackage{graphicx} \usepackage{grffile} \usepackage{amsmath} \usepackage{amssymb} \usepackage{float} \usepackage{natbib} \usepackage{geometry} \geometry{verbose,tmargin=3cm,bmargin=5cm,lmargin=2.5cm,rmargin=2.5cm} \usepackage[font=sf, labelfont={sf,bf}, margin=2cm]{caption} \usepackage{color} \usepackage{txfonts} %% \input{supp_mathsyms} %% \newcommand{\SC}[1]{\textsc{#1}} \newcommand{\SCY}[0]{\SC{Yes}} \newcommand{\SCN}[0]{\SC{No}} \newcommand{\Rcmd}[1]{\texttt{#1}} \newcommand{\psd}[0]{\href{https://github.com/abarbour/psd/}{\color{blue}\Rcmd{psd}}} \newcommand{\naive}[0]{na\"{\i}ve} \newcommand{\bidx}[1]{\index{#1}{\textbf{#1}}} \newcommand{\idx}[1]{\index{#1}{#1}} % \title{Normalization of Power Spectral Density estimates} \author{Andrew J. Barbour and Robert L. Parker} % \begin{document} \maketitle % \begin{abstract} A vast and deep pool of literature exists on the subject of spectral analysis; wading through it can obscure even the most fundamental concepts from the inexperienced practitioner. Appropriate interpretation of spectral analyses depends crucially on the normalization used, which is often the greatest source of confusion. Here we outline the normalization used by \psd{}, namely the \bidx{single-sided} \bidx{power spectral density} (PSD). We briefly outline the background mathematics, present an example from scratch, and compare the results with the normalization used by the spectrum estimator included in the base distribution of R: \Rcmd{stats::spectrum}. \end{abstract} \tableofcontents %% %% %% \section{Introduction} There can often be confusion about the different quantities used in spectral analysis, partly due to myriad nomenclature within the incredibly vast literature on the subject\footnote{ The type of confusion common in spectral analyses is illustrated in this thread on \Rcmd{R-help}: \url{https://r.789695.n4.nabble.com/Re-How-do-I-normalize-a-PSD-td792902.html} }. Commonly one finds similarly sounding phrases, including ``amplitude spectrum", ``energy spectral density", ``power", ``power spectra", and even ``spectra". These all mean \emph{something}, but are rarely equivalent, and are sometimes used improperly. To clarify these terms, we will tread somewhat lightly through the background and definitions used in spectral analysis, without overly complicating the discussion with proofs. \section{Power spectral density} We begin by considering a stationary stochastic process \dXstoch{}, a random function extending throughout all time with time-invariant properties. Our goal is to characterize \dXstoch{} with an ordinary function describing its properties in frequency (as the autocorrelation function does in time). Functions of a real variable (continuous time signals) will have a Fourier transform \begin{equation} \label{eq-FT-0} \stochXfo (f) = \Fo \{ \dXstoch{} \} = \intone \dXstoch{} e^{-2 \pi i f t} \, dt \end{equation} % and for discrete-time signals \dXstochd{} the spectrum is restricted to the finite interval \begin{equation} \label{eq-DFT-0} \stochXfo (f) = \sum_{n=-\infty}^{\infty} \dXstochd{} e^{-2 \pi i f n}, -1/2 \leq f \leq 1/2 \end{equation} which means we can reconstruct the original series from the inverse transform of (\ref{eq-DFT-0}), over frequencies $(\nhalf,\,\half)$: \begin{equation} \label{eq-FT-0d} \dXstochd{} = \Fo^{-1} \{ \stochXfo (f) \} = \int_{\nhalf}^{\half} \dXstoch{} e^{2 \pi i f n} \, df \end{equation} There is a problem with the definitions we have just presented. The integral in (\ref{eq-FT-0}) or the sum in (\ref{eq-DFT-0}) must converge to apply these equations to their corresponding stochastic processes; this would require some kind of decrease in amplitude, or energy, as $t$ or $n$ gets large which does not happen for a stationary process. And, if we put a stochastic process in $f(t)$ in (\ref{eq-FT-0}) we would obtain another random function, which is contrary to our stated goal of obtaining a descriptive function in frequency space. \subsection{An informal definition from a digital filtering analogy} Suppose we wanted to know how much variability the stationary process exhibits at a frequency $f_0$. We could design a narrow bandpass filter $\phi_f$ that only allowed signals through in the frequency band $( f_0 - \half \Delta f , f_0 + \half \Delta f )$ with unit gain. If we convolve a signal $X$ with that filter to obtain the process $X_\phi$, we will have a stochastic process having very limited frequency content. We would expect the variance (a measure of amplitude) to be proportional to the width $\Delta f$ of the bandpass filter, which prevents the condition of infinite energy at infinite time (or length). The variance of the filtered process $X_\phi$ will be some positive value times $\Delta f$, and will vary as the center frequency $f_0$ is varied. At the frequency $f_0$ the following is also true: $$ \Var \{X_\phi\} \propto \Var\{X\} $$ The variance of $X$ in a frequency band is called the \bidx{power} in that band, and so $S_X$ is the power spectrum of $X$, or more grandly its \bidx{power spectral density}: \begin{equation} \label{eq-PSD-informal} S_X ( f_0 ) \Delta f = \Var \{\phi_f \star X\} \end{equation} Equation (\ref{eq-PSD-informal}) is our informal definition of $S_X ( f_0 )$. Notice this definition works equally well for continuous or discrete processes. In the days before computers, analog spectral analyzers were built based on this principle: a large number of narrow bandpass filters followed by rectifiers to measure the variance in each band. We can demonstrate this analogy by plotting the PSDs obtained for a normally distributed process, and a bandpass filtered version of it, along with the filter's amplitude response. In Figure \ref{fig:filter} we see that within the pass-band the variance of the filtered process is equal to that of the infinitely long process. We could imagine the grey curve being comprised of a collection of passband responses. If we could construct a true bandpass filter, there would be zero energy outside the passband; but, in practice the filter weights create side-lobe energy which is unavoidable. Also note the artificial bias from spectral leakage, seen just outside the edges of the pass-band. %and near the zero frequency. % <<eval=TRUE, echo=TRUE, label="Filteredsignalanalogy">>= set.seed(1234) N <- 1028 x <- rnorm(N, mean = 0, sd = 1) # Load signal processing library(signal, warn.conflicts=FALSE) # Construct an FIR filter f <- c(0, 0.2, 0.2, 0.3, 0.3, 0.5)*2 m <- c(0, 0, 1, 1, 0, 0) fw <- signal::fir2(N, f, m) # complex filter response fh <- signal::freqz(fw, Fs=1) f <- c(0, 0.12, 0.12, 0.22, 0.22, 0.5)*2 fwl <- signal::fir2(N, f, m) fhl <- signal::freqz(fwl, Fs=1) f <- c(0, 0.28, 0.28, 0.38, 0.38, 0.5)*2 fwu <- signal::fir2(N, f, m) fhu <- signal::freqz(fwu, Fs=1) # convolution xf <- signal::filter(fw, x) # PSD using stats::spectrum Sx <- spectrum(x, pad=1, plot=FALSE, taper=0.2) Sxf <- spectrum(xf, pad=1, plot=FALSE, taper=0.2) @ % \begin{figure}[htb!] \begin{center} <<eval=TRUE, echo=FALSE, fig.width=6, fig.height=3, label=FILTERS>>= Sx$df <- Sx$bandwidth <- NA par(mar=c(0,0,2,0), oma=rep(0,4)) plot(Sx, col="grey", log="dB", ylim=c(-150,20), xlim=c(0.1,0.4), xlab="", ylab="", ci.col=NA, axes=FALSE, main="PSDs of raw and filtered process") lines(fhu$f, 20*log10(Mod(fhu$h)), col="red", lty=3, lwd=2) lines(fhl$f, 20*log10(Mod(fhl$h)), col="red", lty=3, lwd=2) lines(fh$f, 20*log10(Mod(fh$h)), col="red", lwd=2) plot(Sxf, log="dB", add=TRUE) @ \caption{Demonstration of the ``filter analogy". A normally-distributed process (with PSD in grey) is convolved with a bandpass filter (with a response shown in red); the resulting PSD is in black. In this analogy the PSD is the variance of a stationary process in an infinitesimally narrow band, as if we had passed our signal through a bank of narrow filters (dotted lines) and looked at the output from each one. } \label{fig:filter} \end{center} \end{figure} %fftshift<-function(X){ % Mi <- length(X) % Pi <- ceiling(Mi/2) % indices <- c(((Pi+1):Mi),seq_len(Pi)) % return(X[indices]) %} \subsection{A more formal definition} Let us return to having a single realization of a stochastic process in the time domain, \dXstoch{}, which we assume is sampled over a finite interval of time $(-T/2, \,T/2)$, and denoted by \dXrealiz{}. A realization of \dXstoch{} will have a Fourier transform: % \begin{equation} \label{eq-FT} \Xfo_T (f) = \Fo \{ \dXrealiz{} \} = \intone \dXrealiz{} e^{-2 \pi i ft} \, dt = \int_{-T/2}^{T/2} \dXstoch{} e^{-2 \pi i ft} \, dt \end{equation} % The \bidx{amplitude spectrum} is the modulus of $\Xfo_T$ and the \bidx{phase spectrum} is the argument of $\Xfo_T$, although these are generally not informative for physical applications, if ever. The \bidx{energy spectral density} is found from $\Xfo_T$ by finding the expectation of the squared amplitude spectrum: % \begin{equation} \label{eq-ESD} \dESD{} = \Ex \{ | \Xfo_T (f) | ^ 2 \} \end{equation} % We note the necessity for convergence mentioned previously: as $T$ grows to infinity, so too does \dESD{}. We divide it by the interval length $T$ to curb this growth, which gives us an expression for \bidx{power spectral density}: % \begin{equation} \label{eq-psddef1} \begin{split} \dS{} & = \lim_{T \to \infty} T^{-1} \dESD{} \\ & = \lim_{T \to \infty} \Ex \left\{ \frac{1}{T} \left | \int_{-T/2}^{T/2} \dXstoch{} e^{-2 \pi i ft} \, dt \right | ^ 2 \right\} \end{split} \end{equation} %\label{sy-psd} % which is real, non-negative, and exists for all stationary processes with zero mean and finite variance. Equation (\ref{eq-psddef1}) defines the \bidx{double-sided} PSD, because in the limit of $T$, the limits of integration are $\pm\infty$. If \dXstoch{} is real the power spectrum \dS{} is even; hence, we only need estimates for $f \ge 0$. The \bidx{single-sided} PSD is thus given by $2 S(f)$ for $f \ge 0$. In many cases this sidedness distinction, as we will see, explains errant factors of two in PSD normalizations. \subsection{Connection to the autocovariance function} What is the connection between the PSD, defined in Equation (\ref{eq-psddef1}), and the autocovariance function \dACV{\tau}? From Equation (\ref{eq-psddef1}) we see that \dS{} is obtained from products of \dXstoch{} with itself at any particular $f$, so it is related to the second-order moment of \dXstoch{} only. The \bidx{autocovariance} (ACV) \dACV{\tau} is also related to the second-order moment: % \begin{equation} \begin{split} \dACV{\tau} &= \Cov \left( \dXlag{}, \dXlag{} \right) \\ &= \Ex \left\{ \left( \dXlag{} - \Ex \{ \dXlag{} \} \right)^2 \right\} \end{split} \label{eq-acvdef} \end{equation} % It may be surprising to note, as well, that \dS{} is simply the Fourier transform of \dACV{\tau}: % \begin{equation} \label{eq-psddef2} \dS{} = \Fo \{ \dACV{\tau} \} = \intone \dACV{\tau} e ^ {-2 \pi i f \tau} \, d\tau \end{equation} % So, the functions \dACV{\tau} and \dS{} exist as a transform pair. For real data, \dACV{\tau} is always even, and always real. This implies that \dS{} is also a real and even function in $f$, which, because $S(f) >= 0$, restricts the functions \dACV{t} could possibly represent. Put another way, there are many examples of even-functions having non-positive Fourier transforms (\citet{bracewell2000} shows many). \subsection{Testing normalization} We can use the relationship with the ACV in (\ref{eq-psddef2}), or the informal definition in (\ref{eq-PSD-informal}), to test whether or not a PSD is properly normalized. To see this, we take the inverse Fourier transform of (\ref{eq-psddef2}): % \begin{equation} \label{eq-psdinv} \dACV{t} = \intone \dS{} e ^ {2 \pi i ft} \, df \end{equation} % and find the ACV of a zero-mean process for zero lag. From (\ref{eq-acvdef}) we have: % \begin{equation} \label{eq-acvprop} \dACV{0} = \Ex \{ \dXstoch{}^2 \} = \Var \{ \dXstoch{} \} = \sigma_\mathcal{X} ^ 2 \end{equation} % and by setting $t = 0$ in (\ref{eq-psdinv}) we have the basis of our normalization test: % \begin{equation} \label{eq-psdnorm} \sigma_\mathcal{X} ^ 2 = \intone S (f) \, df \end{equation} % That is, the area under the power spectrum is the variance of the process. So, a straightforward way to test normalization is to compute the PSD for a realization of \dXstoch{} with known variance, and zero mean [e.g. $\mathcal{N}(0,\sigma^2)$]; and then calculate the integrated spectrum. For example, the \idx{single-sided} PSD for a realization of a $\mathcal{N}(0, 1)$ process, sampled at 1 Hz, will be flat at 2 units$^2$ Hz$^{-1}$ across the frequency band $[0, \half]$, and will have an area equal to one. \subsection{Summary of nomenclature} In Table \ref{tbl:norm} we give a summary of some of the quantities we have reviewed. \input{supp_norms} %% %% %% \section{A from-scratch example: White noise.} Without using the tools in \psd{} we will build up an example using R commands, in order to highlight the topic of normalization. First, generate a normally distributed series\footnote{ Although a white noise process is not strictly bandlimited, we will use it to demonstrate differences in normalization. }, and then find its Discrete Fourier Transform (DFT)\footnote{ A proper DFT is normalized by the length of the series; however, most DFT calculators (including \Rcmd{stats::fft}) eschew this normalization for efficiency's sake. }. <<eval=TRUE, echo=TRUE, label="SyntheticwhitenoiseandaDFT">>= # using x from the filter analogy section xv <- var(x) X <- fft(x) class(X) length(X) @ We can easily find the amplitude and phase response followed by equivalent \idx{energy spectral density} estimates\footnote{ Note the equivalence between the complex conjugate based estimates. }: <<eval=TRUE, echo=TRUE, label="Amplitudeandphasespectra">>= Sa <- Mod(X) # Amplitude spectrum Sp <- Arg(X) # Phase spectrum XC <- Conj(X) Se <- Sa**2 Se_2 <- Mod(XC * X) all.equal(Se, Se_2) Se_2R <- Mod(X * XC) all.equal(Se, Se_2R) @ The single-sided \idx{power spectral density} estimates follow once we have the Nyquist frequency, defined as half the sampling rate: <<eval=TRUE, echo=TRUE, label="Nyquistfrequencies">>= fsamp <- 1 # sampling freq, e.g. Hz fNyq <- fsamp/2 # Nyquist freq Nf <- N/2 # number of freqs nyfreqs <- seq.int(from=0, to=fNyq, length.out=Nf) S <- Se[2:(Nf+1)] * 2 / N # Finally, the PSD! @ To approximate the integrated spectrum in the case of a ``flat" spectrum, we need an accurate measure of the first moment of the spectral values. The \Rcmd{median} estimator produces a biased estimate because the distribution of spectral values roughly follows a $\chi^2_\nu$ distribution, where $\nu$ is the number of degrees of freedom and, for this distribution, the expectation of the first moment. To find this value we perform a conjugate gradient based minimization of the best-fitting $\chi^2$ distribution, and compare this with the value returned by \Rcmd{mean}. Our starting point will be the estimated mean value. We visualize the fit with a ``Q-Q" plot, which shows PSD quantiles values as a function of $\chi^2$ quantiles, using the optimized value of the number of degrees of freedom; this is shown in Figure \ref{fig:qqchi}. <<eval=TRUE, echo=TRUE, label=OPTIMEXPECT>>= # 0) Setup optimization function for dof, using conjugate gradients\\ # min L1 |PSD - Chi^2(dof)| Chifit <- function(PSD){optim(list(df=mean(PSD)), function(dof){ sum(log(PSD)) - sum(log(dchisq(PSD, dof))) }, method="CG")} # 1) run optimization Schi <- Chifit(S) # Get 'df', the degrees of freedom print(dof <- Schi$par[[1]]) @ While the optimal degrees of freedom is very nearly the correct value of two (2), the value produced by \Rcmd{mean} is different by merely one percent (this is certainly an acceptable bias for our purposes). <<eval=TRUE, echo=TRUE, label=MEANEXPECT>>= # compare with the mean and median c(mSn <- mean(S), median(S)) @ \begin{figure}[htb!] \begin{center} <<eval=TRUE, echo=FALSE, fig.width=5, fig.height=5, label=QQFIT>>= par(pty="s", las=1) ttl <- expression("Q-Q plot for PSD and" ~~ chi^2 ~~ "fit") qqplot(log(qchisq(ppoints(N), df=dof)), log(S), main = ttl, ylab="Spectrum quantiles", xlab="Distribution quantiles") abline(0,1, col="red") @ \caption{``Q-Q" plot of quantiles of our example PSD against theoretical $\chi^2_\nu$ quantiles. The distribution is calculated with a value for degrees of freedom ($\nu$) obtained from the $L_1$-norm minimization procedure. Such a minimization is not generally required, since we have shown the estimator found with \Rcmd{mean} is reasonably accurate.} \label{fig:qqchi} \end{center} \end{figure} \begin{figure}[htb!] \begin{center} <<eval=TRUE, echo=FALSE, fig.width=6, fig.height=3.5, label=PSD>>= par(las=1, mgp = c(2.2, 1, 0)) ylab <- expression("units"^2 ~~ "/ frequency") plot(nyfreqs, S, type="h", xlab="Nyquist frequency", ylab=ylab, yaxs="i") abline(h=dof, lwd=2, col="red") @ \caption{Power spectral density estimates for a single realization of a $\mathcal{N}(0,1)$ process, in linear units. The horizontal line shows the expectation of the spectral estimates, obtained from the $\chi^2$ fit in Figure \ref{fig:qqchi}; this value is used to test normalization.} \label{fig:psdN} \end{center} \end{figure} An estimate of the integrated spectrum should roughly equal the known variance. Figure \ref{fig:psdN} plots the PSD of our white noise series with the value of $\nu$ from the optimization, with which we can perform a variance--normalization test: <<eval=TRUE, echo=TRUE, label="Testnormalization">>= mSn <- dof test_norm <- function(sval, nyq, xvar){svar <- sval * nyq; return(svar/xvar)} print(xv_1 <- test_norm(mSn, fNyq, xv)) xv_2 <- sum(S)/Nf * fNyq / xv # an alternate test all.equal(xv_1, xv_2) @ But what if the sampling frequency \texttt{fsamp} changes? An obvious change will be the actual Nyquist frequency, which means the variance-normalization test will fail if the PSD estimates are not re-scaled. We simply re-scale the frequencies and PSD with the sampling rate to obtain the properly-normalized spectra. <<eval=TRUE, echo=TRUE, label="Applycorrectnormalization">>= fsamp <- 20 fNyq <- fsamp / 2 freqs <- fsamp * nyfreqs Snew <- S / fsamp # Test variance crudely mSn <- mean(Snew) test_norm(mSn, fNyq, xv) @ In Figure \ref{fig:psdsamp} we plot the PSD with new normalization, and compare it to the previous normalization. Spectral values are shown as decibels (relative to 1 units$^2$ Hz$^{-1}$), using: <<eval=TRUE, echo=TRUE, label="DB">>= # decibel function dB <- function(y) 10*log10(y) @ \begin{figure}[htb!] \begin{center} <<eval=TRUE, echo=FALSE, fig.width=6, fig.height=3.8, label=PSD2>>= par(las=1) plot(freqs, dB(S), type="l", col="dark grey", xlab="Frequency", ylab="dB") lines(freqs, dB(Snew), lwd=1.3) lines(c(0,fNyq), rep(dB(mSn),2), lwd=3, col="red") abline(h=dB(1/fNyq), col="blue") @ \caption{Rescaled PSD estimates for a single realization of a $\mathcal{N}(0,1)$ process with a sampling rate of 20 s$^{-1}$ rather than 1 s$^{-1}$ as from before. The original spectrum (grey) is scaled to the appropriate level (black). The thick red line shows the mean (rescaled) spectral level, and the blue line shows the predicted mean value based on twice the sampling frequency.} \label{fig:psdsamp} \end{center} \end{figure} \section{Normalization used in \Rcmd{stats::spectrum}} The PSD estimator included in the core distribution of R is \Rcmd{stats::spectrum}, which calls either \Rcmd{stats::spec.ar} or \Rcmd{stats::spec.pgram} for cases of parametric and non-parametric estimation, respectively. For this discussion we compare to \Rcmd{spec.pgram}; the user can optionally apply a single cosine taper, and/or a smoothing kernel. By default \Rcmd{spec.pgram} assumes the sampling frequency for the input series is 1, and normalizes accordingly; however, the sampling information may be specified by creating a \Rcmd{ts} object from the series prior to spectrum estimation: <<eval=TRUE, echo=TRUE, label="Changethesamplingfrequency">>= fsamp <- 20 xt <- ts(x, frequency=fsamp) pgram20 <- spec.pgram(xt, pad=1, taper=0, plot=FALSE) pgram01 <- spec.pgram(ts(xt, frequency=1), pad=1, taper=0, plot=FALSE) @ We plot the two PSD estimates on the same scales, in Figure \ref{fig:rawpgram}, utilizing the plot method for \Rcmd{spec} objects: \Rcmd{plot.spec}. We also show horizontal lines corresponding to the inverse of twice the sampling rate, which puts the spectra about a factor of 2 too low: <<eval=TRUE, echo=TRUE>>= mSn/mean(pgram20$spec) @ \begin{figure}[htb!] \begin{center} <<eval=TRUE, echo=FALSE, fig.width=6, fig.height=3.5, label=NORMS>>= par(las=1) plot(pgram01, log="dB", xlim=c(0,10), ylim=36*c(-1,.3), main="", col="dark grey") plot(pgram20, log="dB", add=TRUE) abline(h=-dB(c(1, 20)*2), col=c("dark grey","black")) abline(v=.5*c(1,20), lty=3) #lines(c(0,fNyq), rep(dB(mSn),2), lwd=1.5, col="red") abline(h=dB(1/fNyq), col="blue") @ \caption{Power spectral densities from \Rcmd{spec.pgram} for the same data series. The grey series is the PSD for a sampling rate of 1; whereas, the black series is the PSD for a sampling rate of 20. The horizontal lines show levels corresponding to the inverse of twice the sampling rate (black and grey), and the expected spectral level for the 20 Hz sampling (blue). Vertical lines show the respective Nyquist frequencies.} \label{fig:rawpgram} \end{center} \end{figure} Because the frequencies are clearly correct, this factor of two likely means the spectra will fail our simple variance-normalization test. They do fail, by a factor of two, again too low: <<eval=TRUE, echo=TRUE, label="Testthenormalizationagain">>= test_norm(mean(pgram01$spec), 0.5, xv) test_norm(mean(pgram20$spec), 10, xv) @ But why? This errant factor of two comes from the assumption of a \idx{double-sided} spectrum, which is at odds with our definition of the \idx{single-sided} spectrum by (you guessed it) a factor of two. We can illustrate this with the following example, where we compare the PSDs from \Rcmd{spec.pgram} for a real and complex series: <<eval=TRUE, echo=TRUE, label="DoublesidedPSDfromspectrum">>= psd1 <- spec.pgram(x, plot=FALSE) psd2 <- spec.pgram(xc<-complex(real=x, imag=x), plot=FALSE, demean=TRUE) mx <- mean(Mod(x)) mxc <- mean(Mod(xc)) (mxc/mx)**2 mean(psd2$spec / psd1$spec) @ Again, a factor of two. This means that unless we are interested in analyzing complex timeseries, we need only multiply by two to obtain properly normalized spectra from \Rcmd{spectrum}, assuming the sampling information is included in the series. \section{PSD estimators in R} The suite of extensions having similar functionality to \psd{} is relatively limited; however, there are at least four which can produce sophisticated PSD estimates. We have summarized the available functions in Table \ref{tbl:methods} so far as we know. %\footnote{ %As of this writing (Jun 2020), \Rcmd{sapa} has been removed from CRAN. %}. \input{supp_specprogs} \section{Acknowledgements} We thank Richard Gaal for catching a subtle error in the from-scratch example. %\pagebreak \section*{Session Info} <<eval=TRUE, echo=TRUE, label=SI>>= utils::sessionInfo() @ %% bib and index %\pagebreak \bibliographystyle{apalike} \bibliography{REFS} \end{document}