-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Combine SSHConnection._init_connection
and Microvm.wait_for_ssh_up
#4850
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
roypat
changed the title
Combine SSHConnection._init_connection and Microvm.wait_for_ssh_up
Combine Oct 14, 2024
SSHConnection._init_connection
and Microvm.wait_for_ssh_up
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #4850 +/- ##
=======================================
Coverage 84.01% 84.01%
=======================================
Files 251 251
Lines 28019 28019
=======================================
Hits 23541 23541
Misses 4478 4478
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
roypat
force-pushed
the
wait_for_up_fix
branch
from
October 14, 2024 16:31
1483b3f
to
c9a90d1
Compare
bchalios
previously approved these changes
Oct 15, 2024
We use `Microvm.wait_for_ssh_up` to wait for a booted guest's userland to be initialized enough that it can process incoming SSH connections. However, it turns out that we werent waiting the intended 10s, we were waiting 3s. The reason for this is that before the `true` command in `wait_for_ssh_up` is even sent to the guest, we go through the `Microvm.ssh` property, which calls the `Microvm.ssh_face` function, which constructs an `SSHConnection` object, whose construtor calls `SSHConnection._init_connection`. And inside `_init_connection`, we _already_ try to send a `true` command to the guest, to ascertain it is up and running. However, here we do it in a retry loop (needed because if port 22 is still closed, `ssh` exits with non-zero code and "connection refuses") that does 20 retries at 0.15s intervals. Or tries to 3 seconds. Most importantly, the retries in `_init_connection` are done without timeouts, so even _if_ we actually hung during connection establishment, `wait_for_ssh_up` wouldn't abort after 10 seconds, because it would get stuck inside of `_init_connection`, which doesn't know anything about a 10s timeout. Fix all this up by merging `_init_connection` and `wait_for_ssh_up`. Skip sending the `true` command in `wait_for_ssh_up`, and just call `ssh_iface(0)` directly. Increment the retry time in `_init_connection` to actually be 10s. And add a 10s timeout to the commands in `_init_connection`. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
roypat
force-pushed
the
wait_for_up_fix
branch
from
October 15, 2024 09:57
d51998c
to
36d5bed
Compare
bchalios
approved these changes
Oct 15, 2024
roypat
added
the
Status: Awaiting review
Indicates that a pull request is ready to be reviewed
label
Oct 15, 2024
pb8o
approved these changes
Oct 15, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
We use
Microvm.wait_for_ssh_up
to wait for a booted guest's userland to be initialized enough that it can process incoming SSH connections. However, it turns out that we werent waiting the intended 10s, we were waiting 3s. The reason for this is that before thetrue
command inwait_for_ssh_up
is even sent to the guest, we go through theMicrovm.ssh
property, which calls theMicrovm.ssh_face
function, which constructs anSSHConnection
object, whose construtor callsSSHConnection._init_connection
. And inside_init_connection
, we already try to send atrue
command to the guest, to ascertain it is up and running. However, here we do it in a retry loop (needed because if port 22 is still closed,ssh
exits with non-zero code and "connection refuses") that does 20 retries at 0.15s intervals. Or tries to 3 seconds. Most importantly, the retries in_init_connection
are done without timeouts, so even if we actually hung during connection establishment,wait_for_ssh_up
wouldn't abort after 10 seconds, because it would get stuck inside of_init_connection
, which doesn't know anything about a 10s timeout.Fix all this up by merging
_init_connection
andwait_for_ssh_up
. Skip sending thetrue
command inwait_for_ssh_up
, and just callssh_iface(0)
directly. Increment the retry time in_init_connection
to actually be 10s. And add a 10s timeout to the commands in_init_connection
.License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md
.PR Checklist
PR.
CHANGELOG.md
.TODO
s link to an issue.contribution quality standards.
rust-vmm
.