retry and fetch an updated firmware url if the first try 401s #194

joshk · 2024-05-24T05:11:32Z

this is a fix for #47

if the download returns a 401, it will retry once (fetching an updated firmware url first) and then exit

I have also trapped Downloader exits and nil'd UpdateManager state where appropriate
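
For readers skimming the thread, the retry-once behaviour described above could look roughly like the sketch below. This is a minimal illustration, not the code in this PR: `RetrySketch`, `download_with_retry/4`, the `download` and `refresh_url` callbacks, and the `{:http_error, status}` error shape are all hypothetical names chosen for the example.

```elixir
defmodule RetrySketch do
  @moduledoc "Hypothetical illustration of retry-once-on-401; not the PR's code."

  # `download` is an assumed 1-arity function: url -> :ok | {:error, {:http_error, status}}
  # `refresh_url` is an assumed 0-arity function returning an updated firmware URL.
  def download_with_retry(url, download, refresh_url, retries_left \\ 1) do
    case download.(url) do
      {:error, {:http_error, 401}} when retries_left > 0 ->
        # The URL was rejected: fetch an updated one and try exactly once more.
        download_with_retry(refresh_url.(), download, refresh_url, retries_left - 1)

      {:error, {:http_error, 401}} ->
        # A second 401 in a row: stop retrying and report it.
        {:error, :download_unauthorized}

      other ->
        # Success, or any other error, is passed through unchanged.
        other
    end
  end
end
```

Passing the downloader and the URL refresher in as functions keeps the sketch self-contained; the real change presumably wires this into the Downloader and UpdateManager processes instead.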

lawik

Some notes on port numbering in tests but no major weirdness :)

test/nerves_hub_link/update_manager_test.exs

test/support/http_unauthorized_error_plug.ex

test/nerves_hub_link/update_manager_test.exs

lib/nerves_hub_link/update_manager.ex

joshk · 2024-05-26T06:06:09Z

@jjcarstens I'm wondering if maybe the better way to approach this issue is to have UpdateManager report back to the Socket on the status of the update (rescheduled, ignored, download error), and then Hub can decide if it sends the update again.

Thoughts?

lawik · 2024-05-26T08:43:13Z

For the case where the device's time is out of whack, one solution is to have the device include what it thinks the time is in the join.

Then the server can tell whether the client is confused or has an expired thing. It can even pass down the current time. This could be an optional mechanism for getting the time (trust the server) which is helpful for a Pi for example. But mostly it lets one tell the difference between bad auth and being out of sync.
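
A rough sketch of that server-side check, assuming the device sends its clock reading with the join. Everything here is hypothetical (`ClockSkewSketch`, `classify_rejection/2`, and the 60 second tolerance are invented for illustration); it only shows how a rejected join could be labelled as clock skew rather than bad credentials.

```elixir
defmodule ClockSkewSketch do
  # Assumed tolerance; the thread does not specify one.
  @tolerance_seconds 60

  # Called after authentication has already failed, to explain *why*.
  def classify_rejection(%DateTime{} = device_time, %DateTime{} = server_time) do
    # Negative skew means the device clock is behind the server clock.
    skew = DateTime.diff(device_time, server_time, :second)

    if abs(skew) > @tolerance_seconds do
      {:clock_skew, skew}
    else
      :bad_credentials
    end
  end
end
```

For example, `ClockSkewSketch.classify_rejection(~U[2024-05-26 07:43:13Z], ~U[2024-05-26 08:43:13Z])` reports a skew of one hour behind.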

jjcarstens · 2024-05-26T14:35:52Z

A noble thought @lawik, but I don't think that will work. Time is a required component of the connect for a couple reasons:

Device certs - the SSL stack on the device may not even get to a connect attempt, because it checks the server cert first and will believe it is expired when it compares it against its terribly off device time
Shared secret - Time is a component of the HMAC to assert validity. Detecting that the time is off and saying "oh, that's okay. Here's the time you should use" would be an anti-pattern that negates the HMAC a bit

However, a decent middle ground is to always include the date header in the rejection, though I am unsure if we could get that early enough in the SSL stack. Frank is also currently working on a time server solution that could be used for conditions like this, which might be the better play to keep the time concern separate
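
To make the shared-secret point concrete, here is a generic sketch of a time-bound HMAC. It is not NervesHub's actual signing scheme, which is not shown in this thread; the payload format, the `TimedHmacSketch` module, and the 90 second `max_age` are assumptions. It only illustrates why a signature made with a badly skewed clock gets rejected.

```elixir
defmodule TimedHmacSketch do
  # Hypothetical scheme: sign "identifier:unix_time" with the shared secret.
  def sign(secret, identifier, unix_time) do
    payload = "#{identifier}:#{unix_time}"
    :crypto.mac(:hmac, :sha256, secret, payload) |> Base.encode16(case: :lower)
  end

  # The server recomputes the signature and also requires the claimed time to be
  # close to its own clock, so an old or skewed timestamp fails validation.
  def valid?(secret, identifier, claimed_time, signature, server_time, max_age \\ 90) do
    fresh? = abs(server_time - claimed_time) <= max_age
    # Illustration only: real code should use a constant-time comparison here.
    fresh? and sign(secret, identifier, claimed_time) == signature
  end
end
```

If the server simply accepted whatever time the device claimed, the freshness check would stop protecting against replayed signatures, which is the concern raised above.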

lawik · 2024-05-26T15:59:13Z

Shared secret has this issue. The SSL is not failing; it gets a 401, and if we knew what the device thinks the time is, we could detect it and give a better "why is NervesHubLink not up"-experience.

My Pis have this and log a ton of 401s on every restart. My guess is that they generate times at UTC until NervesTime syncs. Or something. Time is not 1977-01-01 :) SSL works, it seems

joshk · 2024-05-27T04:14:05Z

@jjcarstens I've reworked the logic of this feature to now send updates from the UpdateManager to the pid that called it. These updates are then sent to Hub, which can then update the registry and decide if there are further actions to take, like sending a new update to the device.

I think this is a better setup, and there are options to improve it further.

joshk · 2024-06-13T22:02:04Z

lib/nerves_hub_link/update_manager.ex

        %State{} = state
      ) do
-    state = maybe_update_firmware(update, fwup_public_keys, state)
+    state = maybe_update_firmware(update, fwup_public_keys, elem(from, 0), state)


@jjcarstens is it ok to use elem(from, 0)? or is this bad practice?
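
For context on the question: GenServer documents `from` as a 2-tuple of the caller's pid and a tag, so one common alternative to `elem(from, 0)` is to destructure it in the function head. The sketch below is a standalone illustration, not the PR's code; `FromSketch`, the `:start_update` call, and the `{:update_status, status}` message shape are made up for the example.

```elixir
defmodule FromSketch do
  use GenServer

  def init(arg), do: {:ok, arg}

  # Destructure `from` in the head instead of calling elem/2 in the body.
  def handle_call(:start_update, {caller_pid, _tag}, state) do
    # Hypothetical status message sent back to the process that made the call.
    send(caller_pid, {:update_status, :started})
    {:reply, :ok, state}
  end
end
```

Both forms yield the same pid; the pattern match just makes the expected shape of `from` visible at the point of use.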

jjcarstens · 2024-06-14T00:39:17Z

lib/nerves_hub_link/socket.ex

+      if is_atom(result) do
+        %{result: result}
+      else
+        %{result: elem(result, 0), reason: elem(result, 1)}


If we're not doing anything with the :error portion of the tuple, why have it? We could just simplify and make it an atom return.

That, or change the success condition to be {:ok, result} so that we always return the same structure of %{result: :ok | :error, reason: reason}

joshk · 2024-06-14

There are six events that the UpdateManager can send to the Socket. The update ...

started

was ignored

was rescheduled

completed successfully

failed due to a 401 error (download_unauthorized)

failed due to a non fatal download error (non_fatal)

I had modelled this as ...

:started

:ignored

{:rescheduled, ms, rescheduled_to(ms)}

:completed

{:error, :download_unauthorized}

and {:error, :non_fatal}

I could change :completed to {:ok, :completed} to match the structure of the error cases, although the :ok is kinda superfluous information. It would also mean that :ignored and :started would need to change.

In Hub, each of these cases would update the registry in slightly different ways.

Thoughts?

I also realized that I hadn't accounted for {:rescheduled, ms, rescheduled_to(ms)}
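
One way to cover every shape listed above, including the three-element rescheduled tuple, is a function clause per shape. This is only a sketch of the idea under discussion: `StatusPayloadSketch`, `to_payload/1`, and the `delay_ms` and `rescheduled_to` keys are hypothetical, and the thread does not settle on a final structure.

```elixir
defmodule StatusPayloadSketch do
  # Bare atoms: the update started, was ignored, or completed.
  def to_payload(status) when status in [:started, :ignored, :completed],
    do: %{result: status}

  # Three-element tuple: keep both the delay and the computed time.
  def to_payload({:rescheduled, ms, rescheduled_to}),
    do: %{result: :rescheduled, delay_ms: ms, rescheduled_to: rescheduled_to}

  # Error tuples such as {:error, :download_unauthorized} or {:error, :non_fatal}.
  def to_payload({:error, reason}) when is_atom(reason),
    do: %{result: :error, reason: reason}
end
```

An unexpected shape raises a `FunctionClauseError` here rather than being silently mapped, which may or may not be the desired behaviour on a device.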

joshk force-pushed the firmware-update-401-retries branch 2 times, most recently from ca5748b to 7b0414b, May 24, 2024 05:21

lawik reviewed May 24, 2024

jjcarstens reviewed May 25, 2024

joshk force-pushed the firmware-update-401-retries branch 3 times, most recently from 63e983b to fcfd50f, May 26, 2024 02:50

joshk requested review from jjcarstens and lawik June 12, 2024 07:47

jjcarstens reviewed Jun 13, 2024

joshk commented Jun 13, 2024

joshk added 4 commits June 14, 2024 10:04

retry and fetch an updated firmware url if the first try 401s

c352be7

send update statuses to hub, including download errors

981b98f

format keywords into maps for jsonification

daf434e

some rebase corrections

39df313

joshk force-pushed the firmware-update-401-retries branch from 0e54572 to 39df313, June 13, 2024 22:07

joshk requested a review from jjcarstens June 13, 2024 22:08

jjcarstens reviewed Jun 14, 2024

joshk added 2 commits June 14, 2024 13:31

testing: update how port numbers are retrieved

75b67f2

a more robust jsonification

aca6574
