---
title: "Requirements"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Requirements}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction

This vignette gives an overview about requirements for **mpwR**. The outputs of the following software applications are supported:

* Spectronaut 
* MaxQuant 
* DIA-NN 
* Proteome Discoverer 

## Spectronaut

The following columns are required:

* R.FileName
* PG.ProteinGroups
* PG.Quantity 
* PEP.StrippedSequence
* PEP.NrOfMissedCleavages
* PEP.Quantity
* EG.Identified 
* EG.ModifiedPeptide 
* EG.PrecursorId 
* EG.ApexRT
* FG.Charge 

## MaxQuant

The following files and respective columns are required:

* evidence.txt 
    * Raw file 
    * Proteins 
    * Sequence
    * Missed cleavages 
    * Retention time 
    * Modified sequence 
    * Charge 
    * Protein group IDs
    * Peptide ID 
    * Potential contaminant 
    * Reverse
    * Intensity

* peptides.txt
    * Sequence 
    * Missed cleavages 
    * Last amino acid 
    * Amino acid after 
    * Intensity column(s)
    * LFQ intensity column(s) 
    * Protein group IDs
    * Evidence IDs 
    * Potential contaminant 
    * Reverse
    
* proteinGroups.txt 
    * ProteinIDs 
    * Majority protein IDs
    * Peptide counts (all) 
    * Intensity column(s) 
    * LFQ intensity column(s) 
    * id 
    * Peptide IDs 
    * Evidence IDs
    * Potential contaminant
    * Reverse
    * Only identified by site
    
## DIA-NN

The following columns are required:

* Protein.Group 
* Precursor.Id 
* Run 
* Stripped.Sequence 
* Protein.Ids 
* Modified.Sequence 
* PG.MaxLFQ 
* Precursor.Charge
* RT 
* Precursor.Id
* Precursor.Quantity
* PG.Q.Value
* Q.Value

## Proteome Discoverer

Please enable R-friendly headers for exporting the files. The following files and respective columns are required:

* PSMs.txt
    * Confidence
    * Spectrum File
    * Number of Missed Cleavages
    * Protein Accessions 
    * Annotated Sequence 
    * Modifications 
    * Charge 
    * RT in min

* PeptideGroups.txt
    * Number of Protein Groups
    * Number of Proteins 
    * Number of PSMs
    * Confidence
    * Sequence
    * Modifications
    * Number of Missed Cleavages
    * Found in Sample column(s) (Required: Data Distributions node in the consensus workflow; set Show Found in Samples parameter to True)

* Proteins.txt
    * Proteins Unique Sequence ID
    * Protein FDR Confidence Combined
    * Accession
    * Description
    * Found in Sample column(s) (Required: Data Distributions node in the consensus workflow; set Show Found in Samples parameter to True)

* ProteinGroups.txt
    * Protein Groups Protein Group ID
    * Group Description
    * Number of Proteins
    * Number of Unique Peptides
    * Found in Sample column(s) (Required: Data Distributions node in the consensus workflow; set Show Found in Samples parameter to True)