By Eric Burden | November 26, 2021
‘Tis the season! In this article, I’d like to share how I used a smattering of R and some free online services to overcome a surprisingly tricky holiday speedbump.
The “Problem”
Every year, I, my brother, my sister, and our spouses draw names for Christmas gifts (mostly because we’d all rather buy presents for each others’ kids). This process has just a few requirements:
- Each person should draw the name of another person in the “gift pool” randomly.
- You cannot draw your own name or your spouse’s name.
- No one should know the name drawn by anyone else, until gifts are exchanged.
Our first attempt at this involved writing names on slips of paper, dropping the paper in a hat, then drawing names. The problem with this approach was that we may need to repeat the process in case the last name in the hat was the last person to draw or their spouse. And, in a pool of only six married couples, this happened fairly regularly.
Automate It!
Well, being who I am and doing what I do, it fell to me to sort this whole thing out (plus, I’m the oldest sibling, so if it’s not obviously someone else’s responsibility, then it’s probably mine). Thankfully, this particular problem was well within my wheelhouse. The first, and probably trickiest, part was to randomly assign names without assigning anyone their own or their spouse’s name for gifting. The following bit of R solves this pretty nicely:
# How this works isn't important for now. What is important is to know that
# `names` is a dataframe with two columns:
# - family: Number used to group individuals who should not be given each
# others' names for the gift exchange. Spouses, for example.
# - name: The name of the individual. Each name should be unique. Include
# last and middle names, if necessary.
names <- get_names_data()
# This is not a clever algorithm. We create a copy of `names` for the person
# who will draw a name (the "choosers") and for the person whose name will
# be drawn (the "choices"). Then the rows are randomly joined and checked
# to see if any row contains a `chooser_family` and `choice_family` that
# are the same. If so, do it again.
choosers <- dplyr::rename_all(names, stringr::str_replace, "^", "chooser_")
choices <- dplyr::rename_all(names, stringr::str_replace, "^", "choice_")
try_again <- T
while (try_again) {
a <- dplyr::mutate(choosers, join_key = sample(seq(name_count)))
b <- dplyr::mutate(choices, join_key = sample(seq(name_count)))
selections <- dplyr::full_join(a, b, by = "join_key")
try_again <- with(selections, any(chooser_family == choice_family))
}
selections <- dplyr::mutate(selections, viewed = FALSE)
# Write the selections somewhere (TBD)
write_selections_data(selections)
Note: I’ll include the library name on all library function calls, just to make it clear which library provides which function. This isn’t strictly necessary, but it is helpful for these self-contained examples.
Ignoring where we get the names
data from and where we save the selections
data to for now, let’s walk through the “algorithm” for randomly assigning
gift givers to gift receivers:
- Take the dataframe
names
and make a copy of it, pre-pending “chooser_” to all the column names. - Make another copy of
names
, pre-pending “choice_” to all the column names. - Add a column named
join_key
to bothchoosers
andchoices
containing a random number from 1 to the number of rows. - Join
choosers
andchoices
on thatjoin_key
. - If any record has a
chooser_family
andchoice_family
that are the same, do it again. - When every “chooser” is assigned a “choice” from another family, add a
viewed
column with all values defaulted toFALSE
.
We’ll use that viewed
column later to indicate which people have used our
process to view which name they’ve “drawn”. This is a super inefficient way
to do this, and it would be more efficient the more names/families are included,
but since it only needs to run once a year, it’s efficient enough.
Deploy It?
The next tricky problem is to provide this information to each person in a way that prevents them (and preferably me, too) from checking to see who drew their name before Christmas. Also, because we all live in different places, it would be nice to have a way to do this at any time, not just when we’re all together for Thanksgiving. Since I’m solving the “random selection” problem in R, and because Shiny apps are so darn easy to create (for simple apps) and deploy, I decided to host a super simple app on shinyapps.io.
This also means I need a persistence layer for this application, because apps hosted on the free tier are prone to being “put to sleep”, causing them to potentially lose state (and thus the permutation of names) between accesses. No problem, there’s at least one really well supported way to persistently save and access data from a Shiny app: Google Drive. Now, this isn’t the most sophisticated approach, but for something that only gets used by six people once a year, it’s sophisticated enough.
Step One: Setup
The final project setup looks like this:
root
├── app.R
├── .secrets
│ └── <OAUTH TOKEN>
├── christmas_drawing.Rproj
└── .Renviron
app.R
contains the code for a super simple shiny application. .secrets
is a folder containing an OAuth token for authenticating with Google Drive
(more on that in a second). There’s the RStudio *.Rproj
and a .Renviron
file for environment variables, and that’s pretty much it. Not shown are a
renv
directory used
to manage project dependencies and an
rsconnect
folder auto-generated by
RStudio when deploying the app. I’ve gotten into the habit of using renv
for
even super small projects like this one, mostly because it’s pretty frictionless.
Because I’m hosting this application on shinyapps.io, I’m
using a dedicated Google account for this project, and this app only lives for
a couple of days a year, I feel comfortable including my environment variables
in a .Renviron
file in my project root. .Renviron
contains two entries:
- GMAIL_ADDRESS: The username for the Google Account whose Drive we’re using.
- DRIVE_FOLDER_ID: The ID for the Drive folder used.
In order to access Google Drive, I needed to include the following snippet in
the app.R
file in order to get and cache the OAuth token in the project
directory and upload it to shinyapps.io:
googledrive::drive_auth(
email = Sys.getenv("GMAIL_ADDRESS"),
cache = "./.secrets/"
)
This code needs to be run at least once prior to deploying the app, so that the token will be there. Again, this is a small application being deployed for a limited amount of time, using a Google account that’s made specifically for this. This is not how you should handle credentials “in real life”. For a more responsible take on authenticating with Google in a more permanent application, see this documentation.
Step Two: The “Database”
As shown above, our “data layer” consists of two files: a “names” flat file that lists participants and their groups and a “selections” binary file that holds the random pairings between gift givers and receivers. The names are stored in a “*.csv” file because it’s easy and I can add to it over time as needed. The pairings are stored in a binary file mostly because that makes it harder to just double-click and peek at the contents. This helps with the third requirement listed above. It doesn’t make it impossible to peek, but so far it adds enough overhead that I haven’t been tempted to casually scope out who drew my name so I can drop hints…
To make this happen, I need to be able to read the “names” file and read/write the “selections” file from Google Drive. The following functions help with this:
#' Given a file name, return a Drive ID
#'
#' This function will raise an error if the requested file name does not exist
#' in the folder identified by `DRIVE_FOLDER`.
#'
#' @param file_name the name of the Drive file
#'
#' @return the ID of the Drive file
get_file_as_id <- function(file_name) {
(found_file_id
<- googledrive::drive_ls(DRIVE_FOLDER)
|> dplyr::filter(name == file_name)
|> dplyr::pull(id)
|> googledrive::as_id())
if (length(found_file_id) == 0) stop("No file with that name!")
found_file_id
}
#' Gets the list of names, stored in "names.csv"
#'
#' This file should contain two fields:
#' - family: Number used to group individuals who should not be given each
#' others' names for the gift exchange. Spouses, for example.
#' - name: The name of the individual. Each name should be unique. Include
#' last and middle names, if necessary.
#'
#' @return a dataframe of the "names.csv" data
get_names_data <- function() {
(get_file_as_id("names.csv")
|> googledrive::drive_read_string()
|> readr::read_csv())
}
#' Gets the list of gift-giving pairs, stored in "selections.rds"
#'
#' This file will contain the following fields:
#' - chooser_family: The `family` value of the individual giving a gift
#' - chooser_name: The `name` of the individual giving a gift
#' - choice_family: The `family` value of the individual receiving a gift
#' - choice_name: The `name` of the individual receiving a gift
#' - viewed: Logical indicating if the person identified by
#' `chooser_name` has viewed their choice in the app
#'
#' @return a dataframe of the "selections.rds" data
get_selections_data <- function() {
(get_file_as_id("selections.rds")
|> googledrive::drive_read_raw()
|> base::rawConnection()
|> base::gzcon()
|> base::readRDS())
}
#' Write to "selections.rds"
#'
#' Saves `selections` as a local RDS file, then uploads it to Google Drive
#'
#' @param selections a dataframe to write to "selections.rds"
write_selections_data <- function(selections) {
base::saveRDS(selections, "selections.rds")
googledrive::drive_upload(
"selections.rds",
DRIVE_FOLDER,
name = "selections.rds",
overwrite = TRUE
)
}
I opted to leave the doc comments in for clarity. I found it clearer to break
out the get_file_as_id()
function for error-checking and retrieving files
by name. DRIVE_FOLDER
is a global variable that contains the ID of the Google
Drive folder that contains the “names” and “selections” files. This should also
clear up the mystery of where the code that generates the pairs gets data from
and writes data to.
Step Three: The User Interface
There isn’t much to say here. The final application UI basically consists of a dropdown and a button, and when you choose your name and click the button, both are hidden and the name of the person you “drew” is revealed.
ui <- fluidPage(
shinyjs::useShinyjs(),
shiny::titlePanel("Draw a name for Christmas!"),
shiny::fluidRow(
shiny::column(
width = 6,
offset = 3,
shiny::h2("Who are you?"),
shiny::selectInput("chooser", "Your Name", choices = c("Loading..."))
)
),
shiny::fluidRow(
shiny::column(
width = 6,
offset = 3,
shiny::div(
id = "you-drew",
shiny::h2("You drew..."),
shiny::htmlOutput("drew_name")
),
shiny::div(
id = "draw-button",
shiny::actionButton(
"show_chosen_name",
"Draw a Name!",
class = "btn btn-success btn-block btn-lg"
)
)
)
)
)
This was the
simplest setup I could come up with that helps enforce the “no peeking” rule,
because there is one added twist: Once someone has “drawn” a name, the
viewed
boolean in the “selections” file is switched to TRUE
, and their name
is removed from the dropdown. This means that, if someone peeks to see who
person A drew, when person A tries to draw a name they’ll be missing from the
dropdown list, at which point they will complain and the process can be
restarted.
The most complex part of this is including shinyjs
to allow the app to
show/hide UI elements when the button is clicked.
Step Four: The Server
The server code for this app is similarly straightforward. It contains code to
fetch the “selections” data, populate the dropdown with the names of people who
haven’t drawn yet, and an observer for the button that marks the person who
drew, hides the button, then displays the name drawn. The “selections” data
is fetched in a tryCatch()
function to either fetch the data from Google Drive
or generate it from scratch from the “names” data if there is no “selections”
file. init_selections()
is basically just a wrapper function around the code
shown earlier that generates the pairs.
server <- function(input, output, session) {
# Hide this div on server start
shinyjs::hide("you-drew")
# Get the saved "selections" data, or generate it if it doesn't already exist.
selections <- tryCatch(
get_selections_data(),
error = \(e) init_selections()
)
# Get a list of "choosers" and populate the dropdown in the UI.
(choosers
<- selections
|> dplyr::filter(!viewed)
|> dplyr::pull(chooser_name))
shiny::updateSelectInput(session, "chooser", choices = choosers)
# Observer for the button. When clicked, indicate that the "chooser" has
# already viewed their "choice" and update the saved "selections" data.
# Then, display the "choice".
shiny::observeEvent(input$show_chosen_name, {
# Mark chooser
selections$viewed[selections$chooser_name == input$chooser] <- T
write_selections_data(selections)
# Get choice name
(choice
<- selections
|> dplyr::filter(chooser_name == input$chooser)
|> dplyr::pull(choice_name))
# Populate the display of the drawn name
(output$drew_name
<- choice
|> shiny::h1(class = "display text-danger")
|> shiny::renderUI())
# Hide the button and show the drawn name
shinyjs::show("you-drew")
shinyjs::hide("draw-button")
}
)
}
Step Five: Deploy It!
With all the code and environment variables in place, all that’s left is to deploy the application. I needed to run it at least one time locally to generate the OAuth token (in reality I ran it locally several times for debugging while I made it). Deployment was super straightforward thanks to the shinyapps.io integration with RStudio. I’ll refer you to the official documentation for a step-by-step set of instructions, just know there aren’t actually that many steps. Importantly, usage of this app fits well within the free usage tier.
All Together
Just to wrap this up in a neat bow, here’s the full contents of app.R
all in
one place, for context:
# Library Imports --------------------------------------------------------------
library(dplyr)
library(googledrive)
library(shiny)
library(shinyjs)
library(stringr)
library(tibble)
library(tidyr)
# Global Application Setup -----------------------------------------------------
#' The .Renviron file stores two important environment variables
#' - GMAIL_ADDRESS: The username for the Google Account whose Drive we're using
#' - DRIVE_FOLDER_ID: The ID for the Drive folder used
#' Authenticate with Google Drive and store the OAuth token in a `cache` folder
#' in the project directory. This needs to be run at least once before uploading
#' to shinyapps.io, to create the cached OAuth token.
googledrive::drive_auth(
email = Sys.getenv("GMAIL_ADDRESS"),
cache = "./cache/"
)
#' Using the `DRIVE_FOLDER_ID`, mark `DRIVE_FOLDER` as a Drive folder ID
(DRIVE_FOLDER
<- Sys.getenv("DRIVE_FOLDER_ID")
|> googledrive::as_id())
#' Given a file name, return a Drive ID
#'
#' This function will raise an error if the requested file name does not exist
#' in the folder identified by `DRIVE_FOLDER`.
#'
#' @param file_name the name of the Drive file
#'
#' @return the ID of the Drive file
get_file_as_id <- function(file_name) {
(found_file_id
<- googledrive::drive_ls(DRIVE_FOLDER)
|> dplyr::filter(name == file_name)
|> dplyr::pull(id)
|> googledrive::as_id())
if (length(found_file_id) == 0) stop("No file with that name!")
found_file_id
}
#' Gets the list of names, stored in "names.csv"
#'
#' This file should contain two fields:
#' - family: Number used to group individuals who should not be given each
#' others' names for the gift exchange. Spouses, for example.
#' - name: The name of the individual. Each name should be unique. Include
#' last and middle names, if necessary.
#'
#' @return a dataframe of the "names.csv" data
get_names_data <- function() {
(get_file_as_id("names.csv")
|> googledrive::drive_read_string()
|> readr::read_csv())
}
#' Gets the list of gift-giving pairs, stored in "selections.rds"
#'
#' This file will contain the following fields:
#' - chooser_family: The `family` value of the individual giving a gift
#' - chooser_name: The `name` of the individual giving a gift
#' - choice_family: The `family` value of the individual receiving a gift
#' - choice_name: The `name` of the individual receiving a gift
#' - viewed: Logical indicating if the person identified by
#' `chooser_name` has viewed their choice in the app
#'
#' @return a dataframe of the "selections.rds" data
get_selections_data <- function() {
(get_file_as_id("selections.rds")
|> googledrive::drive_read_raw()
|> base::rawConnection()
|> base::gzcon()
|> base::readRDS())
}
#' Write to "selections.rds"
#'
#' Saves `selections` as a local RDS file, then uploads it to Google Drive
#'
#' @param selections a dataframe to write to "selections.rds"
write_selections_data <- function(selections) {
base::saveRDS(selections, "selections.rds")
googledrive::drive_upload(
"selections.rds",
DRIVE_FOLDER,
name = "selections.rds",
overwrite = TRUE
)
}
#' Generate the data for "selections.rds"
#'
#' @return a dataframe of selections
init_selections <- function() {
# Read data from the "names.csv" file in Google Drive
names <- get_names_data()
name_count <- nrow(names)
# This is not a clever algorithm. We create a copy of `names` for the person
# who will draw a name (the "choosers") and for the person whose name will
# be drawn (the "choices"). Then the rows are randomly joined and checked
# to see if any row contains a `chooser_family` and `choice_family` that
# are the same. If so, do it again.
choosers <- dplyr::rename_all(names, stringr::str_replace, "^", "chooser_")
choices <- dplyr::rename_all(names, stringr::str_replace, "^", "choice_")
try_again <- T
while (try_again) {
a <- dplyr::mutate(choosers, join_key = sample(seq(name_count)))
b <- dplyr::mutate(choices, join_key = sample(seq(name_count)))
selections <- dplyr::full_join(a, b, by = "join_key")
try_again <- with(selections, any(chooser_family == choice_family))
}
selections <- dplyr::mutate(selections, viewed = FALSE)
# Write the selections data to Google Drive and return the dataframe
write_selections_data(selections)
selections
}
# Shiny App UI -----------------------------------------------------------------
ui <- fluidPage(
shinyjs::useShinyjs(),
shiny::titlePanel("Draw a name for Christmas!"),
shiny::fluidRow(
shiny::column(
width = 6,
offset = 3,
shiny::h2("Who are you?"),
shiny::selectInput("chooser", "Your Name", choices = c("Loading..."))
)
),
shiny::fluidRow(
shiny::column(
width = 6,
offset = 3,
shiny::div(
id = "you-drew",
shiny::h2("You drew..."),
shiny::htmlOutput("drew_name")
),
shiny::div(
id = "draw-button",
shiny::actionButton(
"show_chosen_name",
"Draw a Name!",
class = "btn btn-success btn-block btn-lg"
)
)
)
)
)
# Shiny App Server -------------------------------------------------------------
server <- function(input, output, session) {
shinyjs::hide("you-drew")
# Get the saved "selections" data, or generate it if it doesn't already exist.
selections <- tryCatch(
get_selections_data(),
error = \(e) init_selections()
)
# Get a list of "choosers" and populate the dropdown in the UI.
(choosers
<- selections
|> dplyr::filter(!viewed)
|> dplyr::pull(chooser_name))
shiny::updateSelectInput(session, "chooser", choices = choosers)
# Observer for the button. When clicked, indicate that the "chooser" has
# already viewed their "choice" and update the saved "selections" data.
# Then, display the "choice".
shiny::observeEvent(input$show_chosen_name, {
# Mark chooser
selections$viewed[selections$chooser_name == input$chooser] <- T
write_selections_data(selections)
(choice
<- selections
|> dplyr::filter(chooser_name == input$chooser)
|> dplyr::pull(choice_name))
(output$drew_name
<- choice
|> shiny::h1(class = "display text-danger")
|> shiny::renderUI())
shinyjs::show("you-drew")
shinyjs::hide("draw-button")
}
)
}
# Run the application
shiny::shinyApp(ui = ui, server = server)
In total, that’s about 200 lines of code, with a hefty fraction of it being comments so I’ll remember what I did if I need to make any changes to it next year.
Wrap Up
When I talk to folks who are learning to code, I often recommend finding small problems that can be solved either in their workplace or lives outside of work. This project fits squarely into that category, because it represents a small amount of code with a somewhat larger amount of problem-solving that touches a few different areas of application development. Shiny apps in particular are nice learning tools, in part because the tooling around them is so well documented and supported in RStudio.
A final word about code style: I’ve recently adopted the convention shown here:
(choice
<- selections
|> dplyr::filter(chooser_name == input$chooser)
|> dplyr::pull(choice_name))
Using the parentheses this way allows me to put the |>
pipe operator at the
beginning of the line instead of at the end, which is more consistent with
the coding style found in languages like Haskell and Elm, which I find really
helps line up these operations and makes for cleaner looking code. Your
mileage may vary, but I like it.