---
title: "Databases"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Databases}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---



There are many databases available in AWS services. This package deals with a subset of them, including:

- RDS: MariaDB, MySQL, Postgres
- Redshift

We only handle a subset because `sixtyfour` aims to be an opinionated package providing easy-to-use interfaces. Achieving that goal means supporting only the most well-trodden paths.


```r
library(sixtyfour)
```

All database-related functions in `sixtyfour` should start with `aws_db`.

## Redshift

The `aws_db_redshift_create` function creates a cluster for you (a Redshift instance is called a "cluster").

There's an important distinction between Redshift and RDS. Redshift uses your IAM username/password, whereas with RDS you can either use a username/password set up for each RDS database instance or authenticate through IAM. However, with RDS you can't simply pass your IAM username/password (see notes below).

First, let's create a security group with an ingress rule so you can access your Redshift cluster.


```r
my_security_group <- aws_vpc_sg_with_ingress("redshift")
```

Notes on the parameters used in the example below:

- `id`: this is an ID you come up with; it must be unique across all clusters within your AWS account
- `user`/`pwd`: your IAM username and password
- `security_group_ids`: it's best to first create a security group to handle permissions to your Redshift cluster. See the example above for how to do that. Pass in its identifier here
- `wait`: Since we're using `wait=TRUE`, the call to `aws_db_redshift_create` will likely take about 3 minutes to run. You can set `wait=FALSE` and not wait, but then you'll want to check for yourself when the cluster is available.


```r
aws_db_redshift_create(
  id = "some-id",
  user = "your-username",
  pwd = "your-pwd",
  security_group_ids = list(my_security_group),
  wait = TRUE
)
```
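
If you set `wait = FALSE`, you need some way to find out when the cluster is ready. The sketch below shows one generic way to poll for that in base R. Note the assumptions: `wait_until_available()` and `fake_status()` are hypothetical helpers written for this illustration, not part of `sixtyfour`, and the `"available"` status string is assumed to be what your status check returns. In real use you would replace `fake_status` with a function that asks AWS for the cluster's current status.

```r
# Poll a status function until it reports "available" or the tries run out.
# `get_status` is a hypothetical zero-argument function returning a string.
wait_until_available <- function(get_status, interval = 30, max_tries = 20) {
  for (i in seq_len(max_tries)) {
    status <- get_status()
    message(sprintf("check %d: %s", i, status))
    if (identical(status, "available")) {
      return(invisible(TRUE))
    }
    Sys.sleep(interval)
  }
  invisible(FALSE)
}

# Stand-in for a real status check (assumption for this demo only):
# reports "creating" twice, then "available".
fake_status <- local({
  calls <- 0
  function() {
    calls <<- calls + 1
    if (calls < 3) "creating" else "available"
  }
})

ready <- wait_until_available(fake_status, interval = 0, max_tries = 5)
```

Here `ready` is `TRUE` once the stand-in reports `"available"`; `FALSE` would mean the checks ran out first. With a real cluster you would use a longer interval, such as the default of 30 seconds.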

Connect to the cluster


```r
con <- aws_db_redshift_con(
  user = "your-username",
  pwd = "your-pwd",
  id = "some-id"
)
```

List tables, create a table, etc


```r
library(DBI)
dbListTables(con)
dbWriteTable(con, "mtcars", mtcars)
dbListTables(con)
dbReadTable(con, "mtcars")
```

Use dplyr/et al.


```r
library(dplyr)
tbl(con, "mtcars") %>%
  filter(cyl == 4) # keep only four-cylinder cars
```

Important: Remember to delete your cluster when you're done!

## RDS

The process for MariaDB, MySQL, and Postgres is more or less the same, so we'll only demonstrate MariaDB.

In a future version of this package you'll be able to use IAM to authenticate with RDS, but for now `sixtyfour` does not support IAM for RDS. The current "happy path" (read: easy) process of starting an RDS instance with `aws_db_rds_create` is as follows:

- You supply an ID (identifier) for the instance you're creating
- A random username is created and used for database authentication
- A random password is created (via AWS Secrets Manager) and used for database authentication
- A security group is created
  - An ingress rule is added to the security group with your IP address
- On RDS instance creation, we use the above username, password, and security group
- The username, password, and security group information are stored in AWS Secrets Manager for subsequent use
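
To make the username step concrete, here is a rough sketch of what "a random username" could look like in base R. This is an illustration only: `random_username()` is a hypothetical helper, not the function `sixtyfour` uses, and the character rules in it are assumptions. The password step is deliberately not sketched, since the package delegates that to AWS Secrets Manager.

```r
# Hypothetical helper: build a random database username.
# Assumption: the name should start with a letter and contain only
# lowercase letters and digits; check your engine's rules before relying on this.
random_username <- function(n = 12) {
  stopifnot(n >= 1)
  first <- sample(letters, 1)
  rest <- sample(c(letters, 0:9), n - 1, replace = TRUE)
  paste0(c(first, rest), collapse = "")
}

user <- random_username()
nchar(user)
```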

To connect to your RDS instance, you use `aws_db_rds_con`. The "happy path" for connecting is:

- You pass in only the ID used above
- We fetch secrets from AWS Secrets Manager and ask you which you'd like to use
- We gather all the necessary details to connect to your instance
- We return a DBI connection object, e.g., `MariaDBConnection` for MariaDB
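
For orientation, the last step is conceptually similar to a plain `DBI::dbConnect()` call. The sketch below assumes the `RMariaDB` driver; `connect_mariadb()` is a hypothetical wrapper, the host and credentials are placeholders, and it is not a description of how `aws_db_rds_con` is implemented.

```r
# Hypothetical helper showing the kind of DBI call that produces a
# MariaDBConnection. All values come from the caller; nothing here is
# looked up in AWS. Needs the DBI and RMariaDB packages when called.
connect_mariadb <- function(host, port, user, pwd, dbname = NULL) {
  DBI::dbConnect(
    RMariaDB::MariaDB(),
    host = host,
    port = port,
    username = user,
    password = pwd,
    dbname = dbname
  )
}

# Example call (not run here, since it needs a reachable database):
# con <- connect_mariadb("your-host", 3306, "your-username", "your-pwd")
# DBI::dbDisconnect(con)
```

The value of `aws_db_rds_con` is that you don't have to track down the host, port, and credentials yourself.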

Let's walk through the steps with some code.

First, create an RDS instance - in this case for MariaDB.

Notes on the parameters used in the example below:

- `id`: this is an ID you come up with. See `?aws_db_rds_create` for constraints
- `class`: The compute and memory capacity of the instance; see the [AWS docs](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.DBInstanceClass.html) for instance class options. The default for this parameter (`db.t3.micro`) gives you an instance with relatively little memory and compute, which keeps costs low in case you forget to turn it off when you're not using it!
- `engine`: the default is `mariadb`; other options are `mysql` or `postgres`
- `wait`: Since we're using `wait=TRUE` (the default for this function) the call to `aws_db_rds_create` will likely take about 5 minutes to run. You can set `wait=FALSE` and not wait, but then you'll want to check yourself when the instance is available.


```r
aws_db_rds_create(
  id = "myinstance",
  class = "db.t3.micro",
  engine = "mariadb",
  wait = TRUE
)
```

Connect to the instance


```r
con <- aws_db_rds_con(id = "myinstance")
```

List tables, create a table, etc


```r
library(DBI)
dbListTables(con)
dbWriteTable(con, "mtcars", mtcars)
dbListTables(con)
dbReadTable(con, "mtcars")
```

Use dplyr/et al.


```r
library(dplyr)
tbl(con, "mtcars") %>%
  filter(cyl == 4) # keep only four-cylinder cars
```