---
title: "Upload Functionality"
author: "James P. Gilbert"
date: "`r Sys.Date()`"
output:
  pdf_document:
    toc: yes
  html_document:
    number_sections: yes
    toc: yes
vignette: >
  %\VignetteIndexEntry{Upload Functionality}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
---

# Introduction
This vignette describes the functionality for uploading results to a pre-created database schema.
The examples here use SQLite for simplicity.
However, in principle any platform supported by the `DatabaseConnector` and `SqlRender` packages should work.

# Creating a schema definition file
It is recommended that every analytics package that produces data output include a csv file defining its results schema.
This is a requirement for packages that use the OHDSI `Strategus` library.

Schema definitions should conform to the following column headers:

```
table_name, column_name, data_type, is_required, primary_key
```

Other packages may make use of additional fields such as `optional` or `empty_is_na`; these are not required for
uploading to a schema.

An example csv file may look like this:

```
# File inst/settings/resultsDataModelSpecifications.csv
table_name,column_name,data_type,is_required,primary_key
table_1,database_id,varchar,Yes,Yes
table_1,analysis_id,bigint,Yes,Yes
table_1,analysis_name,varchar,Yes,No
table_1,domain_id,varchar(20),No,No
table_1,start_day,float,No,No
table_1,end_day,float,No,No
table_1,is_binary,varchar(1),Yes,No
table_1,missing_means_zero,varchar(1),No,No
table_2,database_id,varchar,Yes,Yes
table_2,analysis2_id,bigint,Yes,Yes
table_2,concept_id,int,Yes,No
table_2,logic_description,varchar,No,No
table_2,valid_start_date,date,Yes,No
table_2,concept_name,varchar(255),Yes,No
table_2,p_10_value,float,Yes,No
table_3,database_id,varchar,Yes,Yes
table_3,analysis3_id,bigint,Yes,Yes
table_3,concept_id,int,Yes,No
table_3,logic_description,varchar,No,No
table_3,valid_start_date,date,Yes,No
table_3,concept_name,varchar(255),Yes,No
table_3,p_10_value,float,Yes,No
```

Note the use of SQL Server data types in the `data_type` field, as well as `Yes` and `No` in the binary fields
`is_required` and `primary_key`.
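
Because `SqlRender` uses SQL Server as its source dialect, these types can be translated automatically for the target
platform. As a minimal sketch (the DDL fragment below is illustrative, not generated by `RMM`):

```{r eval=FALSE}
# Translate a SQL Server style DDL fragment to the SQLite dialect.
# The table and columns here are made up for illustration.
sql <- "CREATE TABLE main.table_1 (analysis_id BIGINT, valid_start_date DATE);"
cat(SqlRender::translate(sql, targetDialect = "sqlite"))
```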

We should also define a function that loads this file and converts the column headers to camelCase:

```{r eval = FALSE}
#' Get Results Data Model Specifications
getResultsDataModelSpec <- function() {
  # For loading inside an R package
  specPath <- system.file("settings", "resultsDataModelSpecifications.csv", package = utils::packageName())
  spec <- readr::read_csv(specPath, show_col_types = FALSE)
  colnames(spec) <- SqlRender::snakeCaseToCamelCase(colnames(spec))
  return(spec)
}
```
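
When developing outside an installed package, `system.file` will not resolve this path. A small sketch of an
alternative, assuming your working directory is the package root:

```{r eval=FALSE}
# During development, read the specification straight from the source tree.
spec <- readr::read_csv("inst/settings/resultsDataModelSpecifications.csv",
                        show_col_types = FALSE)
colnames(spec) <- SqlRender::snakeCaseToCamelCase(colnames(spec))
head(spec)
```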

# Creating a schema
This section describes how to use `RMM` to create a schema from a specifications file.

```{r eval=FALSE}
connectionDetails <- DatabaseConnector::createConnectionDetails(dbms = "sqlite", server = "MySqliteDb.sqlite")
connection <- DatabaseConnector::connect(connectionDetails)
sql <- ResultModelManager::generateSqlSchema(schemaDefinition = getResultsDataModelSpec())
DatabaseConnector::renderTranslateExecuteSql(connection, sql, database_schema = "main", table_prefix = "pre_")
DatabaseConnector::disconnect(connection)
```
Note that the SQL generated by `generateSqlSchema` contains the required parameter `database_schema` and the optional
parameter `table_prefix`.
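
If you want to review the DDL before executing it, you can fill in these parameters yourself with
`SqlRender::render`, which renders the SQL without running it:

```{r eval=FALSE}
# Render the schema SQL without executing it, to inspect the generated DDL.
renderedSql <- SqlRender::render(sql,
                                 database_schema = "main",
                                 table_prefix = "pre_")
cat(renderedSql)
```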

# Uploading results
This section shows how to upload results conforming to the above specified schema.
It is assumed that a zip file has been created containing csv files whose names correspond to the table names in the
`resultsDataModelSpecifications.csv` file.
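
For illustration, such a zip file might be produced from a folder of csv exports along these lines (the
`table1Results` data frame and folder name are hypothetical; `DatabaseConnector::createZipFile` is one convenient
option):

```{r eval=FALSE}
# Hypothetical export: write one csv per table, then zip them for transport.
# csv file names must match the table names in the specification.
exportFolder <- "export_folder"
readr::write_csv(table1Results, file.path(exportFolder, "table_1.csv"))
DatabaseConnector::createZipFile(
  zipFile = "MyResultsZip.zip",
  files = list.files(exportFolder, pattern = "\\.csv$", full.names = TRUE),
  rootFolder = exportFolder
)
```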

```{r eval=FALSE}
ResultModelManager::unzipResults(zipFile = "MyResultsZip.zip", resultsFolder = "extraction_folder")
ResultModelManager::uploadResults(connectionDetails,
  schema = "main",
  resultsFolder = "extraction_folder",
  tablePrefix = "pre_",
  purgeSiteDataBeforeUploading = FALSE,
  specifications = getResultsDataModelSpec()
)
```
This will extract the zip file and upload the results using `DatabaseConnector`.
Most platforms support some form of bulk loading; the setup is described in the `DatabaseConnector` documentation.
It is strongly advised that you set up and use this functionality on any platform where you will store large data sets.
However, it is not necessary for the SQLite example demonstrated here.
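
For example, at the time of writing `DatabaseConnector` supports toggling bulk uploads globally through an environment
variable; the platform-specific prerequisites (and the exact variable name) are described in its documentation:

```{r eval=FALSE}
# Enable DatabaseConnector's bulk upload path for supported platforms.
# Additional setup (e.g. cloud storage credentials) may be required;
# see the DatabaseConnector documentation.
Sys.setenv("DATABASE_CONNECTOR_BULK_UPLOAD" = TRUE)
```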