---
editor:
  markdown:
    wrap: 72
---
# Results {#sec-results}
```{r setup, file = "R/chapter_start.R", include = FALSE, cache = FALSE}
# a number of commands need to run at the beginning of each chapter. This
# includes loading libraries that I always use, as well as options for
# displaying numbers and text.
library(janitor)
library(sf)
library(leaflet)
library(ggspatial)
library(wesanderson)
```
In this section we apply the model to scenarios where critical highway
links are removed from the model network. We begin with a detailed
analysis of a single scenario in which I-80 between Salt Lake and
Tooele Counties is severed, comparing the model output to an
alternative method that measures only the change in travel time and
does not allow for mode or destination choice. We then apply the model
to 40 additional link closure scenarios throughout the state.
## Detailed Scenario Analysis
To illustrate how the logsum-based model framework captures the costs of
losing a link, we conducted a scenario where I-80 between Tooele and
Salt Lake Counties west of the Salt Lake City International Airport
becomes unavailable. Tooele is a largely rural county with increasing
numbers of residents who commute to jobs in the Salt Lake Valley. I-80
is the only realistic path between the two counties, as alternate routes
involve mountain canyons and many additional miles of travel.
The costs obtained by the logsum- and travel time-based methods for
this scenario are shown in @tbl-tooeletable. @fig-tooelemap presents the
lost HBW utility in each TAZ associated with removing
the interstate link between Tooele and Salt Lake Counties.
Multiplying these values by the population in each production TAZ, summing
based on whether the TAZ is inside or outside Tooele County, and then multiplying
by the cost coefficient yields the HBW values for the Utility Logsum in
@tbl-tooeletable.
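The calculation behind those table values can be summarized in a short
sketch. The code below is illustrative only: `taz_logsums`,
`logsum_base`, `logsum_scenario`, `population`, `county`, and
`dollars_per_util` are hypothetical stand-ins for the corresponding
objects in the targets pipeline, not the names used in the project.

```{r hbw-cost-sketch, eval = FALSE}
# Hypothetical sketch of the HBW logsum-cost calculation described above.
# `taz_logsums` stands in for a table of per-TAZ destination choice logsums
# in the base and closure scenarios; `dollars_per_util` stands in for the
# cost coefficient that converts utility into dollars.
library(dplyr)

hbw_costs <- taz_logsums %>%
  mutate(
    utility_loss = logsum_base - logsum_scenario,        # lost HBW utility per TAZ
    cost = utility_loss * population * dollars_per_util  # scale by population and cost coefficient
  ) %>%
  group_by(area = if_else(county == "Tooele", "Inside Tooele", "Other Counties")) %>%
  summarise(total_cost = sum(cost, na.rm = TRUE))
```
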
In both the logsum and travel time analyses, we separate the costs
spatially, though the definitions of the two are slightly different. In
the logsum-based method, "Inside Tooele" costs are those associated
with decreased accessibility for trips produced in Tooele County. The
increased costs in this case capture the loss in utility by measuring
increased multi-modal travel impedances to destinations with high
attractiveness. In the travel-time method, by contrast, the "Inside
Tooele" costs are those for trips with an origin in Tooele County and a
destination in Salt Lake County, computed as the increase in travel
time multiplied by a value of time and the number of vehicles making
the trip. The difference in definition is required by the difference in
framework construction.
In general, the logsum-based method arrives at cost estimates that are
less than half of the comparable estimates from the travel time-based
method. This is not unexpected, as the travel time-based method assumes
that all of the preexisting trips would still occur, but on a longer
path. The logsum-based method, on the other hand, attempts to capture
the fact that when a path changes, people may shift their destination
or their mode of travel. To be clear, the framework we have developed
for this exercise does not explicitly model the selection of a new
alternative destination; rather, we calculate the value of the
destination choice set before and after the link is removed. But the
implication is that the availability of alternative destinations still
provides some benefit to the choice maker, a proposition that the
travel time method ignores.
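To make the contrast concrete, the sketch below writes out the two cost
definitions side by side. It is illustrative only; `od_table`,
`tt_base`, `tt_closed`, `trips`, `vot`, and the logsum inputs are
hypothetical names rather than the USTM implementation.

```{r method-comparison-sketch, eval = FALSE}
library(dplyr)

# Travel time method: assume every preexisting Tooele-to-Salt Lake trip still
# happens on the longer path, and price the added minutes at a value of time.
tt_cost <- od_table %>%
  filter(origin_county == "Tooele", dest_county == "Salt Lake") %>%
  summarise(cost = sum((tt_closed - tt_base) / 60 * vot * trips))

# Logsum method: value the full mode and destination choice set before and
# after the closure, so shifts to other destinations or modes are reflected.
ls_cost <- taz_logsums %>%
  summarise(cost = sum((logsum_base - logsum_scenario) * population * dollars_per_util))
```
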
```{r tooeletable, echo = FALSE}
#| label: tbl-tooeletable
#| tbl-cap: Comparison of Logsum-based and Time-based costs for I-80 at Tooele
tar_load(scenario_comparison)
options(knitr.kable.NA = '')

subtotal <- scenario_comparison %>%
  filter(purpose %in% c("HBO", "HBW", "NHB")) %>%
  janitor::adorn_totals(name = "Comparable Total") %>% .[nrow(.),]
total <- scenario_comparison %>%
  janitor::adorn_totals() %>% .[nrow(.),]

bind_rows(scenario_comparison, subtotal, total) %>%
  mutate(
    purpose = gsub("F", " Freight", purpose),
    purpose = gsub("P", " Passenger", purpose),
    purpose = gsub("REC", " Recreation", purpose),
    purpose = gsub("II", "Internal", purpose),
    purpose = gsub("IX", "Inbound / Outbound", purpose),
    purpose = gsub("XX", "Through", purpose),
  ) %>%
  mutate(
    across(where(is.numeric), ~ ifelse(. == 0, NA, .)),
    across(where(is.numeric),
           ~ ifelse(is.na(.), NA,
                    as.character(scales::dollar(., largest_with_cents = 100)))),
  ) %>%
  kbl(
    col.names = c("Purpose",
                  rep(c("Other Counties", "Inside Tooele", "Total"), 2)),
    align = "lrrrrrr", booktabs = TRUE
  ) %>%
  kable_styling(latex_options = "scale_down") %>%
  add_header_above(c(" " = 1, "Utility Logsum" = 3, "Travel Time" = 3)) %>%
  pack_rows(index = c("Passenger" = 3, "External / Freight" = 5),
            hline_after = F)
```
```{r tooelemap, echo = FALSE, cache = FALSE}
#| label: fig-tooelemap
#| fig-cap: "Tooele County HBW utility loss. Basemap from OpenStreetMap via @ggspatial."
#| fig-width: 7
tar_load(zonal_mapdata)

naf0 <- function(x) {ifelse(x == 0, NA, x)}
zonal_mapdata <- zonal_mapdata %>%
  sf::st_transform(4326) %>%
  mutate(across(c(HBW, HBO, NHB), naf0)) %>%
  filter(!is.na(HBW))

# overall map
ggplot(zonal_mapdata, aes(fill = HBW)) +
  ggspatial::annotation_map_tile("cartolight", zoom = 10) +
  coord_sf(crs = sf::st_crs(4326), expand = FALSE) +
  geom_sf(data = zonal_mapdata, inherit.aes = FALSE, aes(fill = HBW),
          alpha = 0.7, lwd = 0) +
  scale_x_continuous(limits = c(-112.7, -111.5)) +
  scale_y_continuous(limits = c(40.2, 41)) +
  viridis::scale_fill_viridis("Total HBW Utility Loss", option = "magma", direction = 1) +
  theme(axis.line = element_line(color = NA),
        axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())
```
Another key feature of @tbl-tooeletable is that the largest single
element of cost in the scenario is associated with through freight,
along with contributions from other purposes for which the logsum model
developed in this study does not include a corresponding methodology.
This omission was due to data limitations and the modeling approach of
the existing USTM, but the lack of a choice-based framework for
examining the costs of through and inbound / outbound traffic is an
important limitation in this scenario and potentially many others.
## Prioritization Scenario Results
We now apply the model to compare 40 additional scenarios where
individual highway facilities are removed from the model highway
network. @fig-traveltimerank presents the logsum and travel time results
for all 41 scenarios, ordered by decreasing logsum costs. In other
words, I-15 at the boundary between Utah and Salt Lake Counties is the
most "critical" link analyzed in this exercise as determined by the
logsum method. Were this link to be cut, the people of Utah would incur
the highest costs per day in lost destination and travel access of any
link analyzed. Each of the highest-ranking roads in this analysis is an
interstate facility in northern Utah, which is more heavily populated
than the remote areas in the south. A map showing the locations of
these scenarios is given in @fig-linksmap. The concentration of the
highest-cost links in the Salt Lake City metropolitan area is obvious,
though the links with the highest cost are not *in* Salt Lake City,
where multiple parallel routes exist. Rather, they are in the mountain
canyons surrounding the main urban area.
```{r links-map}
#| label: fig-linksmap
#| layout-nrow: 2
#| fig-cap: Total cost of link closure for each scenario by the logsum method. Basemap from OpenStreetMap via @ggspatial.
#| fig-height: 4
#| fig-subcap:
#| - Statewide
#| - Wasatch Front
tar_load(mapdata)
my_breaks <- c(-Inf, 0, 5e6, 1e7, 5e7, Inf)
my_labels <- c("\\textless 0", "0 to 5M", "5M to 10M", "10M to 50M", "\\textgreater 50M")
ggplot(data = mapdata, mapping = aes(
    fill = cut(value, breaks = my_breaks, labels = my_labels))) +
  ggspatial::annotation_map_tile("stamenbw", zoom = 8, alpha = 0.7) +
  viridis::scale_fill_viridis("Cost ($)",
                              discrete = TRUE, option = "rocket",
                              direction = -1) +
  geom_sf(size = 3, stroke = 1, alpha = 0.9, pch = 21) +
  theme(axis.line = element_line(color = NA),
        axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())

# make wasatch front inset
wfbb <- tibble(lon = c(-112.2494, -111.6245), lat = c(40.1999, 41.2406)) |>
  sf::st_as_sf(coords = c("lon", "lat"), crs = 4326) %>%
  sf::st_bbox() %>%
  sf::st_as_sfc()

ggplot(data = mapdata |> sf::st_filter(wfbb),
       mapping = aes(fill = cut(value, breaks = my_breaks, labels = my_labels))) +
  ggspatial::annotation_map_tile("stamenbw", zoom = 9, alpha = 0.7) +
  viridis::scale_fill_viridis("Cost ($)",
                              discrete = TRUE, option = "rocket",
                              direction = -1) +
  geom_sf(size = 3, stroke = 1, alpha = 0.9, pch = 21) +
  theme(axis.line = element_line(color = NA),
        axis.title.x = element_blank(),
        axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        axis.title.y = element_blank(),
        axis.text.y = element_blank(),
        axis.ticks.y = element_blank())
```
Perhaps strangely, many scenarios in the analysis show a *benefit* from
the loss of the link. Investigating these scenarios showed that for
many origin-destination pairs, the shortest path by automobile travel
time in the complete network is *not* the shortest path by distance.
When the shortest-time path is disrupted, the new shortest-time path
may be only a few minutes longer but dozens of miles shorter (on a
slower road). Because automobile operating costs are calculated per
mile rather than per minute, the new path can actually produce a net
benefit. This challenge is exacerbated by the apparent agency practice
of placing artificial time penalties on the network links in some
canyons during calibration. This exercise reveals one reason why such a
practice should be discouraged, and also highlights the importance of
using consistent functions for impedance calculation at all stages of
the model.
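A toy calculation illustrates the mechanism. The function and
coefficients below are hypothetical rather than the USTM impedance
parameters; they simply show how a path that is a few minutes slower
but many miles shorter can carry a lower generalized cost when
operating cost is charged per mile.

```{r gencost-sketch, eval = FALSE}
# Hypothetical generalized cost: a per-minute value of time plus a
# per-mile operating cost. Coefficients are illustrative only.
gen_cost <- function(minutes, miles, vot_per_min = 0.30, op_cost_per_mile = 0.20) {
  minutes * vot_per_min + miles * op_cost_per_mile
}

gen_cost(minutes = 45, miles = 60)  # original shortest-time path: $25.50
gen_cost(minutes = 50, miles = 35)  # detour, 5 min longer but 25 mi shorter: $22.00
```
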
```{r traveltimerank, echo = FALSE}
#| label: fig-traveltimerank
#| fig-cap: Scenario results of travel time and logsum methods in comparison.
#| fig-width: 7.5
#| fig-height: 6
linktable <- readxl::read_excel("images/linktable.xlsx") %>%
  mutate(scenario = substr(LINK_ID, 5, 6)) %>%
  select(-c(LINK_ID))
tar_load(costs)
tar_load(timecost_table)

scalf <- function(x) {
  scales::dollar(x, largest_with_cents = 100)
}

df <- costs |>
  arrange(-value) |>
  transmute(
    rownumber = row_number(),
    scenario = gsub("ROAD", "", scenario),
    value = value / 100
  ) |>
  left_join(timecost_table, by = "scenario") |>
  left_join(linktable, by = "scenario") |>
  select(rownumber, ROUTE, LOCATION, value, HBWHBONHB, FXPREC) |>
  set_names(c("rownumber", "Route", "Location", "Logsum", "HBW, HBO, NHB",
              "Freight, External")) |>
  pivot_longer(cols = Logsum:`Freight, External`,
               names_to = "Purpose", values_to = "Value") |>
  mutate(Method = ifelse(Purpose == "Logsum", "Logsum", "Travel Time"),
         Purpose = ifelse(Method == "Logsum", "HBW, HBO, NHB", Purpose),
         Scenario = str_c(Route, Location, sep = ":"))

ggplot(df, aes(x = rownumber, y = Value / 1e6, color = Purpose, lty = Method)) +
  geom_line() + theme_bw() +
  scale_color_manual("Purposes", values = wes_palette("Darjeeling1", 3)) +
  scale_x_continuous(labels = unique(df$Scenario), breaks = 1:max(df$rownumber)) +
  theme(axis.text.x = element_text(angle = 70, hjust = 1)) +
  xlab("Scenario") + ylab("User impacts [$ million]") +
  theme(panel.grid.minor.x = element_blank())
```
@fig-traveltimerank also presents the costs associated with link
closure based on the travel time method, both for comparable purposes
and for freight. Many of the most costly scenarios in the logsum model
also appear to be costly in the comparable elements of the travel time
method. For example, the scenario breaking the interstate link between
Salt Lake and Utah Counties is the most costly scenario in both
methods, which underscores the significance of this link to Utah's
economy and people. But while many of the largest and most impactful
scenarios have similar rankings and scales, there are also drastic
differences between the two methods further down the ranking. To put it
simply, the choice of analysis method would change the priority that
UDOT places on its roads in terms of preparing for incidents and
hardening assets.
## Sensitivity Analysis
A primary limitation of the model framework presented to this point is
that the input parameters used for the mode and destination choice
utilities were gathered from several different sources, including a
statewide trip-based model, a statewide activity-based model, and an
urban model for a small region. These parameters were transferred
without knowledge of the individual parameter estimation statistics or
the overall goodness of fit of the source models. How much would the
findings presented to this point change if the parameters in
@tbl-coeffs were to change modestly?
To examine this possibility, we constructed 25 independent draws of the
parameter coefficients using Latin Hypercube Sampling [@helton2003].
The coefficient of variation for each parameter was assumed to be 0.1;
a value of 0.3 was originally selected based on the uncertainty
research of @zhao2002, but this resulted in an unreasonable range of
implied values of time in many parameter draws.
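As a rough illustration of this sampling step, the sketch below draws
25 parameter sets with the `lhs` package. The base parameter values are
placeholders rather than the coefficients in @tbl-coeffs, and the
project's actual draws are generated inside the targets pipeline.

```{r lhs-sketch, eval = FALSE}
# Hypothetical sketch of 25 Latin Hypercube draws of the choice parameters.
# `base_params` is a placeholder; the real values come from @tbl-coeffs.
library(lhs)

base_params <- c(b_ivtt = -0.025, b_cost = -0.5)  # illustrative base values
cv <- 0.1                                         # coefficient of variation from the text

set.seed(42)
unit_cube <- lhs::randomLHS(n = 25, k = length(base_params))

# Map each uniform LHS column onto a normal distribution centered on the
# base parameter, with standard deviation |base| * cv.
draws <- sapply(seq_along(base_params), function(j) {
  qnorm(unit_cube[, j],
        mean = base_params[j],
        sd   = abs(base_params[j]) * cv)
})
colnames(draws) <- names(base_params)
```
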
Using each of the 25 draws, we ran the base scenario and three
large-impact scenarios and calculated the logsum-based costs.
@fig-saplot presents the estimated monetary costs for each of these
three scenarios under each of the 25 parameter draws. The results are
ordered in the figure by the estimated cost for the highest-impact
scenario. Two observations can be made from this figure. First,
different parameter values do not affect the scenarios uniformly.
Second, despite the within-scenario variation, the overall scale of the
three scenarios is maintained regardless of the drawn parameter values.
Indeed, the three scenarios retain their priority ranking across all 25
draws of the choice model parameters. We therefore do not expect the
selection of parameters to be a major element in the relative ranking
of scenarios.
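A simple tabulation makes this ranking check explicit. The sketch below
assumes the `sensitivity_plot` target holds one row per scenario and
draw, with the `Scenario`, `Cost`, and `iter` columns used in
@fig-saplot; it is a verification sketch rather than part of the
reported analysis.

```{r rank-stability-sketch, eval = FALSE}
# Count how many distinct within-draw ranks each scenario receives; a value
# of 1 for every scenario means the priority ordering never changes.
library(dplyr)
tar_load(sensitivity_plot)

sensitivity_plot |>
  group_by(iter) |>
  mutate(rank = rank(-Cost)) |>   # rank scenarios from highest to lowest cost in each draw
  group_by(Scenario) |>
  summarise(n_distinct_ranks = n_distinct(rank))
```
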
```{r saplot}
#| label: fig-saplot
#| fig-cap: Estimated logsum-based scenario costs in 25 different draws of the choice model parameters.
library(wesanderson)
tar_load(sensitivity_plot)

sensitivity_plot |>
  left_join(linktable, by = c("Scenario" = "scenario")) |>
  transmute(scenario = str_c(ROUTE, LOCATION, sep = " "), Cost, iter) |>
  ggplot(aes(x = iter, y = Cost, color = scenario)) +
  geom_line() +
  xlab("Draw") + ylab("Total Cost [$]") +
  scale_color_manual("Scenario", values = wes_palette("Darjeeling1", 3)) +
  theme_bw()
```