By Eric Burden | December 17, 2020
In the spirit of the holidays (and programming), I’ll be posting my solutions to the Advent of Code 2020 puzzles here, at least one day after they’re posted (no spoilers!). I’ll be implementing the solutions in R because, well, that’s what I like! What I won’t be doing is posting any of the actual answers, just the reasoning behind them.
Also, as a general convention, whenever the puzzle has downloadable input, I’m
saving it in a file named input.txt
.
Day 16 - Ticket Translation
Find the problem description HERE.
Part One - No Money? No Ticket!
This puzzle got me right in my comfort zone: manipulating tabular data! Granted,
that’s probably not the only (or even necessarily best) way to solve this
puzzle, but at this point I’m frankly incapable of not wanting to use dplyr
for one of these. Now’s my chance!
Also a first for me in these puzzles (I think): a closure! If you’re not
familiar with closures in R, it’s what you get when you have a function that
returns a function. The returned function takes the environment of the
function that produced it along for the ride. In this case, each function
produced by field_test_function()
carries the vectors for range_one
and
range_two
with it in its enclosing environment, which is mighty convenient.
library(stringr)
library(tidyr)
library(dplyr)
library(dequer)
test_input <- c(
"class: 1-3 or 5-7",
"row: 6-11 or 33-44",
"seat: 13-40 or 45-50",
"",
"your ticket:",
"7,1,14",
"",
"nearby tickets:",
"7,3,47",
"40,4,50",
"55,2,20",
"38,6,12"
)
real_input <- readLines('input.txt')
# Given a line from the input for a field test (i.e., in the format of
# "class: 1-3 or 5-7"), returns a closure that accepts a number and
# returns TRUE if the number matches one of the defined ranges from the
# input and FALSE if it does not.
field_test_function <- function(line) {
nums <- as.numeric(str_extract_all(line, '\\d+', simplify = TRUE))
range_one <- nums[1]:nums[2]
range_two <- nums[3]:nums[4]
function(n) {
(n %in% range_one) | (n %in% range_two)
}
}
# Given a list of input lines `input`, returns an environment containing
# a list of functions for testing fields `field_tests`, a numeric vector
# containing the values from your ticket `my_ticket`, and a list containing
# the values from all the other tickets `nearby_tickets`
parse_input <- function(input) {
env <- new.env() # Container environment
env$field_tests <- list() # List of functions for testing fields
env$my_ticket <- numeric(0) # Vector for 'my ticket' numbers
nearby_tickets <- stack() # Stack to hold other ticket numbers
mode <- 'tests' # Indicates what the for loop does with current line
# For each line in the input lines...
for (line in input) {
if (line == "") { next } # Skip blank lines
# Set `mode` to parse my ticket data
if (line == 'your ticket:') { mode <- 'my ticket'; next }
# Set `mode` to parse numbers from other tickets
if (line == 'nearby tickets:') { mode <- 'other tickets'; next }
# The default, for the first set of input lines create a function for
# each that will test a number to see if it falls within the given ranges
if (mode == 'tests') {
name <- str_extract(line, '^[\\w\\s]+(?=:)')
env$field_tests[[name]] <- field_test_function(line)
}
# For 'my ticket', just get the numbers
if (mode == 'my ticket') {
env$my_ticket <- as.numeric(unlist(strsplit(line, ',')))
}
# For 'other tickets', get the numbers and push them onto the
# `nearby_tickets` stack
if (mode == 'other tickets') {
ticket_nums <- as.numeric(unlist(strsplit(line, ',')))
push(nearby_tickets, ticket_nums)
}
}
env$nearby_tickets <- as.list(nearby_tickets) # Stack to list
env # Return the environment
}
# Given a list of vectors containing the field data from nearby tickets
# `nearby_tickets` and a list of tests to determine whether a number satisfies
# the ranges for a given field `field_tests`, builds and returns a data frame
# where each row represents the results of testing single field in a single
# ticket for a match against the rules for a named field. So, one row per test,
# per field, per ticket. Includes columns that represent a unique identifier for
# the ticket `ticket_no`, the order in which that field appears on the ticket
# `field_no`, the name of the field being tested for `field`, whether the
# value of that field matched the test for the named field `match`, and the
# value of the field being examined `field_val`
get_field_matches <- function(nearby_tickets, field_tests) {
# Structure for the resulting data frame
all_results <- data.frame(
ticket_no = numeric(0) , field_no = numeric(0),
field = character(0), match = logical(0),
field_val = numeric(0)
)
# For each vector in `nearby_tickets`...
for (ticket_no in 1:length(nearby_tickets)) {
# Shorthand reference to ticked field values
field_vals <- nearby_tickets[[ticket_no]]
# Check each field value against each field test function, convert the
# results into a data frame
matches <- sapply(field_tests, function(f) { f(field_vals) }) %>%
as.data.frame() %>%
mutate(
ticket_no = ticket_no,
field_no = row_number(),
field_val = field_vals
) %>%
pivot_longer(
-c('ticket_no', 'field_no', 'field_val'),
names_to = 'field',
values_to = 'match'
)
all_results <- bind_rows(all_results, matches) # Append to our data frame
}
all_results # Return the data frame
}
input_environment <- parse_input(real_input) # Unpack the input
# Parse the input data into a data frame indicating matches against the field
# test functions
field_matches <- get_field_matches(
input_environment$nearby_tickets,
input_environment$field_tests
)
# Determine the answer:
# - Group the data frame by `ticket_no`, `field_no`, and `field_val`
# - For each group, sum the number of fields where `match` == TRUE
# - Keep only records where no matches were found
# - Extract the `field_val` column as a vector
# - Sum the contents of the `field_val` column
answer1 <- field_matches %>%
group_by(ticket_no, field_no, field_val) %>%
summarise_at('match', sum) %>%
filter(match == 0) %>%
pull(field_val) %>%
sum()
Ah… Just look at those magrittr
pipes… Lovely. Just in case you haven’t
seen them before (unlikely if you’ve used R for very long), the %>%
s are
‘pipe’ operators. They pass the result from their left-hand side as the first
argument to the function (or expression) on their right-hand side.
Part Two - There Can Be Only One
This one took me a bit to realize that it wasn’t because I’d done something wrong that many of the fields appeared to satisfy more than one validation rule. Once I realized that was on purpose (and that initially one of the fields satisfied only a single validation rule), I was off to the races! So, identify the field(s) that satisfy only a single rule, assign that field the name of the field represented by that validation rule, remove that field name from the options, then do it again.
source('exercise_1b.R')
test_input <- c(
"class: 0-1 or 4-19",
"row: 0-5 or 8-19",
"seat: 0-13 or 16-19",
"",
"your ticket:",
"11,12,13",
"",
"nearby tickets:",
"3,9,18",
"15,1,5",
"5,14,9"
)
input_environment <- parse_input(real_input) # Unpack the input
# Parse the input data into a data frame indicating matches against the field
# test functions
field_matches <- get_field_matches(
input_environment$nearby_tickets,
input_environment$field_tests
)
# Get a data frame containing only valid tickets:
# - Group by `ticket_no`, `field_no`, and `field_val`
# - Mark any fields containing invalid data with `invalid_field` = TRUE
# - Group by `ticket_no`
# - Mark any tickets containing invalid fields with `exclude` = TRUE
# - Ungroup and remove any tickets containing invalid fields
valid_ticket_data <- field_matches %>%
group_by(ticket_no, field_no, field_val) %>%
mutate(invalid_field = !any(match)) %>%
group_by(ticket_no) %>%
mutate(exclude = any(invalid_field)) %>%
ungroup() %>%
filter(!exclude)
# Get a data frame consisting of one row per `field_no`, per `field` name where
# all fields in that number matched a particular field test (for example, all
# values in `field_no` 1 matched the 'arrival_platform' field):
# - Group by `field_no` and `field`
# - Identify groups where the `field_no` matched a particular field test every time
# - Filter, keeping only those groups
identify_fields <- valid_ticket_data %>%
group_by(field_no, field) %>%
summarise(all_matched = all(match)) %>%
filter(all_matched)
# Create an empty data frame with columns for `field_no` and `field`
confirmed_fields <- data.frame(field_no = numeric(0), field = character(0))
# Until all records in `identify_fields` have been transferred to
# `confirmed_fields`
while (nrow(confirmed_fields) < length(unique(identify_fields$field))) {
# Identify fields that only match one set of field rules, sequentially. On the
# first pass, many of the `field_no`s will satisfy more than one field test,
# so we start by transferring the `field_no`s that only match one test to
# `confirmed_fields`:
# - Starting with `identify_fields`
# - Remove any records containing a `field_no` that has already been confirmed
# - Group by `field`
# - Calculate the number of records in the group as `group_size`
# - Filter, keeping only `field`s where the `group_size` is 1, indicating that
# that `field_no` only matches that field's requirements
# - Select the `field_no` and `field` columns
newly_confirmed <- identify_fields %>%
filter(!(field_no %in% confirmed_fields$field_no)) %>%
group_by(field) %>%
mutate(group_size = n()) %>%
filter(group_size == 1) %>%
select(field_no, field)
# Add `newly_confirmed` rows to `confirmed_fields`
confirmed_fields <- bind_rows(confirmed_fields, newly_confirmed)
}
# Starting with the data frame of confirmed field names/numbers, filter the
# data frame to only 'departure' fields and pull out the `field_no` column
# as a vector
departure_field_nos <- confirmed_fields %>%
filter(str_detect(field, 'departure')) %>%
pull(field_no)
# Now that we know which `field_no`s (i.e., ticket vector indices) belong to
# the 'departure' fields, just select those values from `my_ticket` and
# multiply together
answer2 <- prod(input_environment$my_ticket[departure_field_nos])
Wrap-Up
This one was fun! It was a bit more code than some of the others, but it was
so worth it. Plus, I got to use a bit of ‘functional programming’ with my
closure (that’s my story and I’m sticking to it). I normally try not to lean
too heavily on external packages for these, but I happily relied on dplyr
and
magrittr
.
If you found a different solution (or spotted a mistake in one of mine), please drop me a line!