---
title: "CorrectOverloadedPeaks"
subtitle: "A computational tool for automatic correction of overloaded signals in GC-APCI-MS"
author: "Jan Lisec"
date: "20.08.2016"
output: 
  rmarkdown::html_vignette:
    fig_width: 7
    fig_height: 7
vignette: >
  %\VignetteIndexEntry{CorrectOverloadedPeaks: a computational tool for automatic correction of overloaded signals in GC-APCI-MS}
  %\VignetteEngine{knitr::rmarkdown}
  %\usepackage[utf8]{inputenc}
---

This short vignette shows how to correct overloaded signals in:

- an artificial test case
- a provided real data set

To achieve this we need to load the package functions as well as a small data example 
in `xcmsRaw` format.

```{r preload}
library(CorrectOverloadedPeaks)
data("mzXML_data")
```

Let us model a typical overloaded signal occurring frequently in GC-APCI-MS using the provided 
function `ModelGaussPeak`.

```{r}
pk <- CorrectOverloadedPeaks::ModelGaussPeak(height=10^7, width=3, scan_rate=10, e=0, ds=8*10^6, base_line=10^2)
plot(pk, main="Gaussian peak of true intensity 10^7 but cut off at 8*10^6")
```
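
To make the effect of detector saturation explicit, the same idea can be sketched in base R without the package. This is only an illustration: `clipped_gauss` and all of its arguments are hypothetical names chosen here, and its `sigma` argument is not claimed to correspond to the `width` argument of `ModelGaussPeak`.

```r
# Hypothetical helper, not part of CorrectOverloadedPeaks: a plain Gaussian
# sampled at a fixed scan rate and clipped at an assumed saturation limit.
clipped_gauss <- function(height = 1e7, sigma = 0.5, rt_apex = 0,
                          scan_rate = 10, sat = 8e6, base_line = 1e2) {
  rt <- seq(rt_apex - 4 * sigma, rt_apex + 4 * sigma, by = 1 / scan_rate)
  true_int <- base_line + height * exp(-(rt - rt_apex)^2 / (2 * sigma^2))
  # pmin() caps every scan at the saturation limit
  data.frame(rt = rt, int = pmin(true_int, sat), true_int = true_int)
}

sim <- clipped_gauss()
sum(sim$int == 8e6)  # number of scans sitting at the saturation limit

# Dashed line: true signal; circles: what the detector would record
plot(sim$rt, sim$true_int, type = "l", lty = 2, xlab = "RT", ylab = "Intensity")
points(sim$rt, sim$int)
```

The flat top of the recorded trace is the information that any correction method has to reconstruct from the unsaturated flanks.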

Now we roughly estimate the peak borders before applying the provided function `FitGaussPeak` 
to correct the peak data.

```{r}
idx <- pk[,"int"]>0.005 * max(pk[,"int"])
tmp <- CorrectOverloadedPeaks::FitGaussPeak(x=pk[idx,"rt"], y=pk[idx,"int"], silent=FALSE, xlab="RT", ylab="Intensity")
```
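
Why the unsaturated flanks are sufficient to recover the apex can be illustrated with a simple alternative technique, a parabola fitted to the log-intensities. This is a minimal sketch on simulated, noise-free data without baseline; it is not the algorithm used internally by `FitGaussPeak`, and all variable names and values are assumptions made for this example.

```r
# Simulated clipped peak (assumed values: height 1e7, sigma 0.5, limit 8e6)
rt  <- seq(-2, 2, by = 0.1)
sat <- 8e6
int <- pmin(1e7 * exp(-rt^2 / (2 * 0.5^2)), sat)

# Keep scans above a rough border threshold but below the saturation limit
keep <- int > 0.005 * max(int) & int < sat
d <- data.frame(rt = rt[keep], int = int[keep])

# On the log scale a Gaussian is a parabola: log(y) = a + b*x + c*x^2
fit <- lm(log(int) ~ rt + I(rt^2), data = d)
cf  <- unname(coef(fit))
a <- cf[1]; b <- cf[2]; c2 <- cf[3]

sigma  <- sqrt(-1 / (2 * c2))       # peak width parameter
apex   <- -b / (2 * c2)             # retention time of the apex
height <- exp(a - b^2 / (4 * c2))   # extrapolated, non-clipped peak height

c(height = height, apex = apex, sigma = sigma)
```

On real data, noise, baseline and peak tailing make such a log-linear fit less reliable, which is one reason to prefer a dedicated fitting function.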

The generated QC plot shows the optimal solution found (green line), the substituted 
intensity values (grey circles) and the obtained parameters (blue text), including the probable peak 
height (max_int=9.7*10^6), which is very close to the true peak height (10^7). Now let's extend this 
simplified process to peaks from a real data set. The following function call will generate:

- a PDF in the working directory with QC-plots for 10 peaks from 5 chromatographic regions
- a console output with processing information
- a new file `cor_df_all.RData` in the working directory containing all extracted but non-corrected mass traces

```{r}
tmp <- CorrectOverloadedPeaks::CorrectOverloadedPeaks(data=mzXML_data, method="EMG", testing=TRUE)
```

Let us load these non-corrected mass traces to further illustrate the package's capabilities. 
For instance, we can reprocess peak 2 from region 4 using the isotopic ratio approach:

```{r}
load("cor_df_all.RData")
head(cor_df_all[[1]][[1]])
tmp <- CorrectOverloadedPeaks::FitPeakByIsotopicRatio(cor_df=cor_df_all[[1]][[1]], silent=FALSE)
```

The extracted data contain RT and intensity information for the overloaded mass trace (mz=350.164) 
as well as for the isotopes of this mz, up to the first isotope which is not itself overloaded (M+2, 
green triangles). The ratio of this isotope to M+0 is evaluated in the peak front (15.9%) and is in 
turn used to scale up the overloaded data points of M+0 (grey circles), as indicated by the black 
line. Alternatively, the data could of course be processed using the `Gauss` method, as shown 
previously for the artificial data.

```{r}
tmp <- CorrectOverloadedPeaks::FitGaussPeak(x=cor_df_all[[1]][[1]][,"RT"], y=cor_df_all[[1]][[1]][,"int0"], silent=FALSE, xlab="RT", ylab="Intensity")
```
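
The principle behind the isotopic ratio correction described earlier can also be sketched on simulated data in base R. This is an illustration under stated assumptions, not the implementation of `FitPeakByIsotopicRatio`: the traces are noise-free, the isotope ratio of 0.159 is simply taken from the example above, and the names `int2`, `front` and `corrected` are hypothetical.

```r
# Simulated traces (assumed values): M+0 is clipped at 8e6, M+2 is not
rt    <- seq(-2, 2, by = 0.1)
sat   <- 8e6
true0 <- 1e7 * exp(-rt^2 / (2 * 0.5^2))
int0  <- pmin(true0, sat)   # overloaded M+0 trace
int2  <- 0.159 * true0      # non-overloaded isotope trace

# Peak front: scans before the first overloaded scan, above a rough threshold
overloaded <- int0 >= sat
front <- seq_along(rt) < min(which(overloaded)) & int0 > 0.005 * max(int0)

# Isotope ratio estimated where both traces are trustworthy
ratio <- median(int2[front] / int0[front])

# Replace only the overloaded scans of M+0 by the scaled-up isotope trace
corrected <- int0
corrected[overloaded] <- int2[overloaded] / ratio

c(ratio = ratio, max_int = max(corrected))
```

Using the median of the front ratios is a design choice of this sketch to limit the influence of single noisy scans; the package may estimate the ratio differently.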

```{r, echo=FALSE, prompt=FALSE}
if(file.exists("cor_df_all.RData")) file.remove("cor_df_all.RData")
if(file.exists("S5_35_01_2241_Int+LM.mzXML.pdf")) file.remove("S5_35_01_2241_Int+LM.mzXML.pdf")
if(file.exists("mzXML_data.pdf")) file.remove("mzXML_data.pdf")
```