---
title: "Creating a data.frame in C"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Creating a data.frame in C}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---


```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
knitr::knit_engines$set(callme = callme:::callme_engine)
library(callme)
```

```{css, echo=FALSE}
.callme         { background-color: #E3F2FD; }
pre.callme span { background-color: #E3F2FD; }
```

## Introduction

`data.frame` objects are mostly `list` objects with some added constraints.

In general a `data.frame` consists of

* a list (`VECSXP`)
* each item in the list is a column
* each element in the list must have the same length
* a character vector of column names (optional, but always included in practice)
* an (optional) character vector of row names
* the class `data.frame`

## Example: Create a data.frame in C

The following code generates a data.frame within C.

The C code is the equivalent of the following R code:

```{r eval=FALSE}
data.frame(
  idx = 1:10,
  x   = (0:9) +  10.0,
  y   = (0:9) + 100.0
)
```


```{callme}
#include <R.h>
#include <Rinternals.h>

//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
// Creating a data.frame within C and returning it to R
//
//  1. Create individual integer/real/whatever vectors
//  2. allocate a VECSXP of the correct size
//  3. assign each member into the data.frame
//  4. create names and assign them to the data.frame
//  5. set the class to "data.frame"
//  6. set rownames on the data.frame
//~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
SEXP create_data_frame_in_c(void) {


  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Each column of the data.frame gets allocated separately
  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  int n = 10; // number of rows

  SEXP idx_ = PROTECT(allocVector(INTSXP , n));
  SEXP x_   = PROTECT(allocVector(REALSXP, n));
  SEXP y_   = PROTECT(allocVector(REALSXP, n));


  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Assign some dummy values into the columns
  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  for (int i = 0; i < n; i++) {
    INTEGER(idx_)[i] = i + 1;
    REAL(x_)[i] = i + 10;
    REAL(y_)[i] = i + 100;
  }

  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Allocate a data.frame
  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  SEXP df_ = PROTECT(allocVector(VECSXP, 3));

  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Add columns to the data.frame
  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  SET_VECTOR_ELT(df_,  0, idx_);
  SET_VECTOR_ELT(df_,  1, x_);
  SET_VECTOR_ELT(df_,  2, y_);

  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Treat the VECSXP as a data.frame rather than a list
  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  setAttrib(df_, R_ClassSymbol, mkString("data.frame"));

  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Set the names on the data.frame
  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  SEXP names = PROTECT(allocVector(STRSXP, 3));
  SET_STRING_ELT(names,  0, mkChar("idx"));
  SET_STRING_ELT(names,  1, mkChar("x"));
  SET_STRING_ELT(names,  2, mkChar("y"));
  setAttrib(df_, R_NamesSymbol, names);

  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  // Set the row.names on the data.frame
  // Use the shortcut as used in .set_row_names() in R
  // i.e. set rownames to c(NA_integer, -len) and it will
  // take care of the rest. This is equivalent to rownames(x) <- NULL
  //~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  SEXP rownames = PROTECT(allocVector(INTSXP, 2));
  SET_INTEGER_ELT(rownames, 0, NA_INTEGER);
  SET_INTEGER_ELT(rownames, 1, -n);
  setAttrib(df_, R_RowNamesSymbol, rownames);

  UNPROTECT(6);
  return df_;
}
```


```{r}
create_data_frame_in_c()
```