---
title: "Doing Research with Parallel LLM API Calls"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Doing Research with Parallel LLM API Calls}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
  %\DontRun
---


```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  eval = FALSE
)
```

When doing research on large language models, we often need to compare a number 
of models; maybe with different arguments, maybe with different prompts, etc. 
In this example we want to call an LLM multiple times with various temperatures
and see the results. We suggest different first names and ask the model to pick one.

**Note: This vignette requires a valid OpenAI API key and will not run during package installation.**

```{r}
library(LLMR) 
library(ggplot2)
```



## Setup parallel processing
```{r}
# necessary step
setup_llm_parallel(workers = 20, verbose = TRUE)
```

## Create Configuration
```{r}
config <- llm_config(
  provider = "openai",
  model = "gpt-4.1-nano",
  api_key = Sys.getenv("OPENAI_API_KEY"),
  max_tokens = 10  # Very few tokens are requested
)
```

## The message

```{r}
messages <- list(
  list(role = "system", content = "You respond to every question with exactly one word.
                                   Nothing more. Nothing less."),
  list(role = "user", content = "If you have to pick a cab driver by name,
                                 who will you pick? D'Shaun, Jared, or JosÃ¨?")
)
```



Define temperature values to test
```{r}
temperatures <- seq(0, 1.5, 0.3)

# Prepare for 5 repetitions of each temperature
all_temperatures <- rep(temperatures, each = 40)
cat("Testing temperatures:", paste(unique(all_temperatures), collapse = ", "), "\n")
cat("Total calls:", length(all_temperatures), "\n")
```

Let us run this now. The `LLMR` package offers 4 parallelizing wrapper. Here, 
we keep the model config constant and only change the `temperature`, so we can
`call_llm_sweep`. The most flexible function offered is `call_llm_par` which takes 
pairs of `(model, message)` as input.

```{r}
# Run the temperature sweep
cat("Starting parallel temperature sweep...\n")
start_time <- Sys.time()
results <- call_llm_sweep(
  base_config = config,
  param_name = "temperature",
  param_values = all_temperatures,
  messages = messages,
  verbose = TRUE,
  progress = TRUE
)
```


```{r}
end_time <- Sys.time()
cat("Sweep completed in:", round(as.numeric(end_time - start_time), 2), "seconds\n")
```


Let us clean the output and visualize this:
```{r fig.width= 8}

results |> head()
  
# remove anything other than a-z, A-Z from response_text
# do not remove accented letter
results$response_text_clean <- gsub("[^a-zA-ZÃ€-Ã¿ ]", "", results$response_text)

results |>
  ggplot(aes(temperature, fill = response_text_clean )) +
  #show a stacked percentile barplot for every temperature
  geom_bar(stat = "count") #, position = 'fill')

```