---
title: "Import"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Import}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction

This vignette gives an overview about importing files for downstream analysis with  **mpwR**. 

## Import

Importing the output files from each software can be easily performed with `prepare_mpwR`. Please put all output files in one folder and follow the guidelines for naming the files. No other files/subfolders are allowed.

```{r import, eval=FALSE}
files <- prepare_mpwR(path = "Path_to_Folder_with_files")
```

## Naming Files
The software output files need to be named according some simple rules.

### Suffix
Please use the following suffixes for the corresponding software:

* Spectronaut
    * *.tsv - _Report

* MaxQuant
    * evidence.txt - _evidence
    * peptides.txt - _peptides
    * proteinGroups.txt - _proteinGroups
    
* DIA-NN
    * *.tsv - _Report
 
* ProteomeDiscoverer
    * PSMs.txt - _PSMs
    * PeptideGroups.txt - _PeptideGroups
    * Proteins.txt - _Proteins
    * ProteinGroups.txt - _ProteinGroups

Note that the suffixes for MaxQuant and ProteomeDiscoverer can be used as is.

### Prefix
Please use the same prefix for your analysis and the respective software output file(s). Each analysis needs a unique identifier.

Examples:

* Spectronaut
    * WorkflowA1_Report.tsv

* MaxQuant
    * WorkflowA2_evidence.txt
    * WorkflowA2_peptide.txt
    * WorkflowA2_proteinGroups.txt
    
* DIA-NN
    * WorkflowA3_Report.tsv
 
* ProteomeDiscoverer
    * WorkflowA4_PSMs.txt
    * WorkflowA4_PeptideGroups.txt 
    * WorkflowA4_Proteins.txt 
    * WorkflowA4_ProteinGroups.txt
    

## Handling Error Messages
Subsequently, common error messages are explained and solutions are pointed out.
<p>&nbsp;</p>
<p>&nbsp;</p>
### Wrong suffix
Error in prepare_mpwR(path = path) : 
  Unknown file suffix detected! Please use: _evidence, _peptides, _proteinGroups, _PSMs, _Proteins, _PeptideGroups, _ProteinGroups, _Report 
<p>&nbsp;</p>
Solution:
Check your suffixes and make sure to use the correct spelling and the correct one for the respective software.
<p>&nbsp;</p>
<p>&nbsp;</p>
### Wrong number of files
Error in prepare_mpwR(path = path) : 
  Wrong number of analyses detected for WorkflowA2! MaxQuant requires 3 input files. Remember: Use unique name for each analysis.
<p>&nbsp;</p>
Solution:
Make sure to include all output files from the respective software. MaxQuant requires 3 files, PD 4 files, DIA-NN and Spectronaut 1 file, respectively. Also use the same name (prefix) for all files.
<p>&nbsp;</p>
<p>&nbsp;</p>
### Subfolder
Error in data.table::fread(input_file, ...) : 
  File 'DIR//subfolder' is a directory. Not yet implemented.
<p>&nbsp;</p>
Solution:
Do not use subfolders in the directory of the output files. Put all files in a folder and use this path for prepare_mpwR.
<p>&nbsp;</p>
<p>&nbsp;</p>
### Same file name
Error in prepare_mpwR(path = path) : Please use a unique filename for each analysis.
<p>&nbsp;</p>
Solution:
Please use a unique name (prefix) for each respective output file(s). E.g. if you used...:

* WorkflowA1_Report.tsv 
* WorkflowA1_evidence.txt
* WorkflowA1_peptide.txt
* WorkflowA1_proteinGroups.txt

... the error message is displayed. As a solution, for example add the used software to the name:

* WorkflowA1_Spectronaut_Report.tsv 
* WorkflowA1_MQ_evidence.txt
* WorkflowA1_MQ_peptide.txt
* WorkflowA1_MQ_proteinGroups.txt

<p>&nbsp;</p>
<p>&nbsp;</p>
### Missing columns
Error in prepare_input(ordered_files[[i]][["data"]][["DIA-NN"]], software = "DIA-NN") : Not all required columns present in submitted data.

Here example for DIA-NN. Similar messages for other software tools. Please make sure to include all required columns for the used software. Details are in the vignette [Requirements](https://okdll.github.io/mpwR/articles/Requirements.html).