-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split Update module into components and add tests #173
base: main
Are you sure you want to change the base?
Conversation
Thanks! Pushing to wrap up some feature work for an app release now, and I have only had the time for a first skim. Some high level comments for now.
Good
Good
Good, I think
Good, I think
I think these are related, and also what I was struggling with from a first look. For just this module, the abstractions seem a bit heavy-handed at the moment. My hunch is that this would benefit from thinking about it in a slightly wider scope, as in how could other modules consume this? Even if then possibly only implementing in the context of this one module at first. |
I think these are related, and also what I was struggling with from a
first look. For just this module, the abstractions seem a bit heavy-handed
at the moment. My hunch is that this would benefit from thinking about it
in a slightly wider scope, as in how could other modules consume this? Even
if then possibly only implementing in the context of this one module at
first.
Agree, I did not really put much thought into this and it has to be refined
w.r.t. the other modules/components. I will attempt this later this week,
but I definitely want to avoid this becoming a full-blown refactoring of
service composition within the codebase.
However, I pushed this out for early review because I am interested in your
thoughts on the overall mechanics (i.e. interfaces for dependencies +
functors + first-class modules) of this approach. It feels a bit
un-ocaml-ish to me, but then again - I cannot think of a much better
alternative? Been reading discussions
<https://discuss.ocaml.org/t/the-shape-design-problem/7810> (also
<https://discuss.ocaml.org/t/functors-for-dependency-injection/736>, also
<https://mcclurmc.wordpress.com/2012/12/18/ocaml-pattern-easy-functor/>) on
different approaches, but there does not seem to be a consensus on the
"best" way.
|
@knuton pushed updates that attempt to tidy-up / generalize things:
Also, answering my own question the mechanics, the Ocaml manual states that "a typical use of first-class modules is to select at run-time among several implementations of a signature", so I feel it kinda validates this approach. P.S. I think I also figured out how to elegantly wrap the test implementations into a mock'ed interface, so will update the test prototype soon too. |
Went all in with splitting the code into modular components and testing all of them independently. Also abandoned the "mock" approach, instead went for stub-implementations of the dependencies. The changes introduce 3 new abstractions/modules:
I also moved all of the runtime / nix-provided system configuration into a single config library as a starting point for a more structured approach towards config management. This is necessary to enable runtime test setup, but also sed-template-replacements-as-configuration is kind of a massive hack, so this at least consolidates it into once location and we can think where to go from here. To run the tests, do
(The env var / path should later be specified in either the dune config for tests, or, more generally as part of nix-shell setup? Something to be discussed.) There are separate tests for I also littered the code with various P.S. Since |
Assuming you are suggesting to just read |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A collection of initial and mostly stylistic comments as I work my way from bottom up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I like the little spec language.
Going to send a lightweight spec list next.
Sat down to write a spec, but instead accidentally refactored the Even with the added abstractions for effects, I think this kind of reorganization should make testing considerably easier. |
5e009c3
to
a74ed8e
Compare
Incorporated these changes, but I went further and made It could be potentially simplified even further by writing higher level helper funs for dealing with semver comparisons RAUC booted/primary slot semantics, but let's leave it for the future. |
Pushed some semi-final (hopefully!) changes:
At this point I see two main remaining things:
|
@knuton pushed updates to the earlier tests and a new set of tests that attempt to exhaustively check all the possible (162) combinations as per the spec you provided, which I distilled into this function. Some cases (well, actually about a third of them) are failing now (see attachment), because the current implementation does not match the spec. Many are irrelevant and are due to the spec not distinguishing between "doing nothing" and producing an error/warning (and then doing nothing), which can be easily resolved. But some are really questionable, e.g. in test case 90 UpdateService would attempt to update the inactive slot, whereas nothing should be done (according to the spec, but also probably a good idea in practice). I am using an intermediate representation of expected outcomes for now to provide more room for interpreting the results. Will add the side-effect / IO / failure tests next. P.S. Alcotest has automagic terminal width detection that leads to truncated output and unreadable test case descriptions. We are running an outdated version of alcotest that does not support the |
This is very interesting
I think you are right, though I admit that I wouldn't have minded calling them "mocks".
Great. I was indeed struggling with this distinction when drafting a spec, and I think it was part of what led me to think about separating data from interpretation and action. You did warm me to the goldfish memory design, so maybe it's enough to keep the action separate. What do you think about an interpreter ("free monad") approach to the effects? This may help check the expected action/effect often being "nothing", while still being able to expect a specific interpretation. This may bridge to your intermediate representations. |
1811312
to
138613e
Compare
Updates:
I consider this complete now, unless there are specific things that you want me to change. Commits could be squashed too if you prefer that. Thoughts for the future: as we discussed, the "absolute worst" scenario is if we release a version of PlayOS that boots and seems "good", but that somehow breaks the auto-update process (and therefore the system never self-updates). Therefore, it would be a good idea to be a bit more cautious before marking the newly installed system as good, e.g. add checks for whether RAUC is functional, the UpdateService loop completes successfully, etc. Now it only checks whether it is able to boot? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running the VM, with ./build vm
and ./result/bin/run-in-vm
, and openning the backdor with socat
(see command given by the output of run-in-vm), I am getting at a high rate the following error message:
[ 107.255679] playos-controller[774]: playos-controller: [ERROR] failed to get version information: E("The name de.pengutronix.rauc was not provided by any .service files")
[ 107.255861] playos-controller[774]: playos-controller: [INFO] update mechanism: (ErrorGettingVersionInfo
[ 107.255974] playos-controller[774]: "E(\"The name de.pengutronix.rauc was not provided by any .service files\")")
I don’t get this message on the main
branch.
RAUC is not available when running PlayOS via |
I am converting this back to draft, because while implementing end-to-end tests (PR incoming today), I realized this code will break when deployed, because the nix template variables are being replaced only in specific modules and they are moved into config.ml in this PR. Will fix it ASAP. |
Yes, but the problem is the super high rate (multiple times each second), making the backdoor when developping useless, and flooding the logs. In |
Ah, I see, strange - I will look into it! |
@guyonvarch should be fixed now, see the last commit. This was quite an ugly bug, thank you for pointing it out! |
During refactoring they were turned into variables/promises that got resolved once and afterwards would no longer run, hence no sleep timeouts were being applied and UpdateService was busy-looping without a break! Added (a rather ugly) test case to cover this. Could be incorporated into prop-tests, but would be easier once the Update state is ADT is refactored. Thanks to @guyonvarch for spotting this one.
Also ensure that all the template variables exist in the target file.
Commit 3cdf833f23977a77a62cfabf737f3ae75275698e introduced a bug by expecting curl to be in PATH.
Without this, logs look like this: playos-controller: [INFO] update mechanism: (UpToDate ((latest <opaque>) (booted <opaque>) (inactive <opaque>))) This also causes e2e tests to fail, which expect to observe the version number in the logs.
5e780c6
to
0035640
Compare
Previously logs were split into multiple lines, like so: [INFO] update mechanism: (Downloading (url http://update-server.local/9.9.9-TEST/playos-9.9.9-TEST.raucb) (version 9.9.9-TEST)) since the Downloading state has been simplified to only include the version and no url, it now gets logged in a single line.
Previously, the base/proxy-and-update e2e test passed, because the controller considered `latest < {booted,inactive}` an error state and triggered an update check "quickly" (in 30 seconds). This was changed with commit d8053bd which changed the interpration of this state to `UpToDate` i.e. a non-error state, therefore entering a long sleep of 1 hour. As a work-around, we restart the controller to trigger an update check. This is ugly and potentially hides bugs, should be eliminated once the runtime configuration option is added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First batch of comments on the non test part, I’ll have a look at the test afterward.
Co-authored-by: Joris Guyonvarch <joris@guyonvarch.me>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mostly read through the tests, but without necessarily getting everything, as there are non trivial parts.
Ocaml doesn’t help in this regard, requiring to put the most general function at the end of the file, you get the big picture in the end.
(* Using string repr is a horrible hack, but avoids having to | ||
pattern match on every variant in the state ADT *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we have to match on every variant in the ADT, could you provide an example?
Also, I don’t understand at this point where the magic pattern is created, is it in Update.sexp_of_state
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we have to match on every variant in the ADT, could you provide an example?
Because we need to define equality for state
, which would require defining equality for all the cases, including e.g. UpToDate of version_info
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I don’t understand at this point where the magic pattern is created, is it in Update.sexp_of_state?
Do you mean it's unclear how it's used? Here are a few places:
playos/controller/tests/update_tests.ml
Line 65 in 1831876
StateReached (Installing (_MAGIC_PAT ^ expected_bundle_name upstream_version)); playos/controller/tests/update_tests.ml
Line 77 in 1831876
StateReached (ErrorInstalling _MAGIC_PAT); playos/controller/tests/update_tests.ml
Lines 12 to 14 in 1831876
let expected_bundle_name vsn = Mock_update_client.test_bundle_name ^ _MAGIC_PAT ^ vsn ^ _MAGIC_PAT in
This is in an unfinished state (esp. the test part), but pushing this out for early review/feedback.
This PR does the following:
Update
module to use abstract interfaces for its external dependencies (RAUC, Curl/Connman, runtime configuration) rather than direct bindings / values.Update.run
function into a separate single transition function (run_step
) and the infinite recursive process (run
).Update
module.Some of the code is shifted around/generalized as a side-effect of the changes, but this PR should (hopefully!) not introduce any logical / semantic changes, it only creates new ways for interacting with the code and the sample tests.
This is WIP since several things need to be worked out in more detail:
update_deps
module/file is rather ad-hocI also left some TODOs as inline comments.
@knuton let me know of what you think about this "direction" and specific things that you think need to be done differently. If the general direction seems reasonable, I would attempt to flesh out the test framework and add more complex tests cases. Any ocaml-foo tips are also very welcome.
Apologies for the huge diff, the process was very unlinear, so did not produce structured git commits.
P.S. Is there a specific ocaml style used at Dividat? Maybe we could adopt one of the standard ocamlformat profiles?
Footnotes
I want to be able to record every mock call and update the mocked responses at run time. I have an idea how to do this with a simple PPX "decorator", but no idea how to do it otherwise. There's ocaml-mock, but it's very primitive / requires lots of boilreplate. ↩