fix: reduce warm up time #72

rogerthatdev · 2023-07-28T20:01:43Z

Description

This PR:

will reduce the provisioning time by replacing the time_sleep resource with a 6 minute timeout using the http data source
change it so that google_compute_global_address waits for the compute api to be enabled (lack of consistently caused failure in local development terraform apply)

The http data source will send a request to the end point every 20 seconds until it gets a valid response. It will stop trying after 17 times, totaling to a 6 minute timeout (equal to the time_sleep it is replacing).

Checklist

Added steps to reproduce the changes in this pull request
Added relevant testing in this pull request
Please merge this PR for me once it is approved.

grayside · 2023-07-28T21:39:06Z

infra/main.tf

@@ -295,7 +298,7 @@ resource "time_sleep" "load_balancer_warm_up_time" {
    google_compute_global_forwarding_rule.http
  ]

-  create_duration = "370s"
+  create_duration = "150s"


I really appreciate the use of data collection to identify a more optimal number; however, I have found these kinds of propagation delays can vary widely based on regional traffic, systems maintenance, and so on.

If this data was gathered during a single work session, it may be a best case scenario. Is this faster than product readiness guarantees?

Instead of hard-coding this number, might we use https://registry.terraform.io/providers/hashicorp/http/latest/docs/data-sources/http with retries enabled (for 5xx responses) to detect when the forwarding rule is working as expected? (I'm not sure if 5xx errors are returned before it's ready)

I hadn't considered that! Thanks, i'll take a look at that suggestion. Converting this PR to a draft because of course the int test failed at resolving the URL 👎🏽

grayside

Approved with comments. I've had this in mind as a theory for a few weeks, thanks for showing an implementation!

grayside · 2023-08-02T18:51:32Z

infra/main.tf

+
+data "http" "load_balancer_warm_up" {
+  url = "http://${google_compute_global_address.default.address}/"
+  # Attempt retry every 20 seconds 17 times, totaling to a 6 minute timeout.


question: why 6 minutes? Depending on the answer, that may be good to add to the inline comment because it would help future maintainers understand how to repeat the reasoning if the number should be changed in the future.

grayside · 2023-08-02T18:59:39Z

infra/main.tf

+  retry {
+    attempts     = 17
+    max_delay_ms = 20000
+    min_delay_ms = 20000
+  }


discussion: Was curious if there were any surprises here, I think these parameters are likely being used with https://github.com/hashicorp/go-retryablehttp/blob/v0.7.4/client.go#L520C23-L520C23. Based on my reading, min_delay_ms would not normally be linear, but setting the max to the same value caps the exponential computation.

No op here, just thought that may be interesting if others haven't delved that far.

grayside · 2023-08-02T19:02:13Z

infra/main.tf

  depends_on = [
+    google_compute_global_address.default,
+    google_compute_url_map.default,
+    google_compute_backend_service.default,


note: google_compute_url_map.default should have an implicit dependency on google_compute_backend_service.default if you want one fewer explicit dependency here, but clarifying we're waiting on all the load balancer resources may be a good thing given how complicated this is.

Alright, I've talked myself out of this being a suggestion and made it a "note".

grayside · 2023-08-02T19:03:32Z

infra/outputs.tf

@@ -16,7 +16,7 @@

 output "frontend_url" {
  description = "IP address to site. Load balancer expected to take ~5 minutes for it to warm up."
-  value       = "http://${google_compute_global_address.default.address}/"
+  value       = data.http.load_balancer_warm_up.url


praise: Nice simplification of the output configuration based on the http data.

reduce load balancer warmup

5d4a0e1

github-actions bot added component: infra type: bug labels Jul 28, 2023

rogerthatdev added 2 commits July 28, 2023 14:11

adjust with 30 sec padding

db0281d

global address wait for services

9ae06e2