-
Notifications
You must be signed in to change notification settings - Fork 124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: No time-based loss detection before largest_acked
& fixes
#1998
base: main
Are you sure you want to change the base?
Conversation
Failed Interop TestsQUIC Interop Runner, client vs. server neqo-latest as client
neqo-latest as server
All resultsSucceeded Interop TestsQUIC Interop Runner, client vs. server neqo-latest as client
neqo-latest as server
Unsupported Interop TestsQUIC Interop Runner, client vs. server neqo-latest as client
neqo-latest as server
|
Benchmark resultsPerformance differences relative to d6279bf. coalesce_acked_from_zero 1+1 entries: No change in performance detected.time: [99.781 ns 100.06 ns 100.35 ns] change: [-0.0691% +0.3854% +0.8727%] (p = 0.13 > 0.05) coalesce_acked_from_zero 3+1 entries: No change in performance detected.time: [117.84 ns 118.21 ns 118.61 ns] change: [-0.1498% +0.3222% +0.8142%] (p = 0.19 > 0.05) coalesce_acked_from_zero 10+1 entries: No change in performance detected.time: [117.66 ns 118.18 ns 118.78 ns] change: [-0.3301% +0.3180% +0.8978%] (p = 0.32 > 0.05) coalesce_acked_from_zero 1000+1 entries: Change within noise threshold.time: [98.191 ns 98.432 ns 98.741 ns] change: [+0.2702% +1.2029% +2.1014%] (p = 0.01 < 0.05) RxStreamOrderer::inbound_frame(): No change in performance detected.time: [110.98 ms 111.11 ms 111.30 ms] change: [-0.1477% -0.0232% +0.1488%] (p = 0.82 > 0.05) transfer/pacing-false/varying-seeds: No change in performance detected.time: [28.531 ms 29.337 ms 30.147 ms] change: [-0.8630% +4.2908% +9.9250%] (p = 0.12 > 0.05) transfer/pacing-true/varying-seeds: Change within noise threshold.time: [37.814 ms 39.294 ms 40.754 ms] change: [+0.5884% +7.1476% +14.299%] (p = 0.04 < 0.05) transfer/pacing-false/same-seed: 💚 Performance has improved.time: [20.227 ms 21.095 ms 21.965 ms] change: [-23.657% -19.714% -15.468%] (p = 0.00 < 0.05) transfer/pacing-true/same-seed: 💚 Performance has improved.time: [20.852 ms 21.936 ms 23.034 ms] change: [-52.505% -49.129% -45.393%] (p = 0.00 < 0.05) 1-conn/1-100mb-resp (aka. Download)/client: No change in performance detected.time: [113.70 ms 114.11 ms 114.50 ms] thrpt: [873.36 MiB/s 876.33 MiB/s 879.50 MiB/s] change: time: [-0.7008% -0.2614% +0.1863%] (p = 0.26 > 0.05) thrpt: [-0.1860% +0.2621% +0.7057%] 1-conn/10_000-parallel-1b-resp (aka. RPS)/client: No change in performance detected.time: [312.75 ms 316.12 ms 319.49 ms] thrpt: [31.300 Kelem/s 31.633 Kelem/s 31.974 Kelem/s] change: time: [-2.4277% -0.8752% +0.6564%] (p = 0.27 > 0.05) thrpt: [-0.6521% +0.8830% +2.4881%] 1-conn/1-1b-resp (aka. HPS)/client: No change in performance detected.time: [33.867 ms 34.070 ms 34.294 ms] thrpt: [29.160 elem/s 29.351 elem/s 29.527 elem/s] change: time: [-1.5619% -0.6708% +0.2182%] (p = 0.13 > 0.05) thrpt: [-0.2177% +0.6753% +1.5866%] Client/server transfer resultsTransfer of 33554432 bytes over loopback.
|
Firefox builds for this PRThe following builds are available for testing. Crossed-out builds did not succeed.
|
largest_acked
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1998 +/- ##
=======================================
Coverage ? 95.31%
=======================================
Files ? 112
Lines ? 36352
Branches ? 0
=======================================
Hits ? 34648
Misses ? 1704
Partials ? 0 ☔ View full report in Codecov by Sentry. |
Kwik failure: ptrd/kwik#46 |
Quinn failure: quinn-rs/quinn#1929 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! Patch looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems good.
If we pace, we might get the initial server flight before sending sufficient 0-RTT data to pass the QNS check. Broken out of mozilla#1998
This caused some test failures, and @martinthomson discovered this as the reason at the IETF 120 hackathon. Broken out of mozilla#1998
Packets are only declared as lost relative to `largest_acked`. If we hit a PTO while we don't have a largest_acked yet, also do a congestion control reaction (because otherwise none would happen). Broken out of mozilla#1998
This is what RFC9000 says, and it helps with servers that expect ACKs in the Initial packet number space to happen. It also eliminates a difference in behavior we had when a client received an Initial packet containing an ACK and a CRYPTO frame vs. a CRYPTO frame and an ACK, where that CRYPTO frame caused the Initial packet number space to be dropped. Broken out of mozilla#1998
We don't track which packets are coalesced with others, so when we detect a loss in one packet number space, we cannot resend any coalesced packets in other packet number space. This PR tries to approximate this behavior by scheduling un-ACKed Handshake and 0-RTT packets for RTX when packets are lost in the Initial space. Broken out of mozilla#1998
Funtion is only used internally. Broken out of mozilla#1998
Funtion is only used internally. Broken out of #1998
* fix: Don't encode large RTT guesses in tickets Because under lossy conditions (e.g., QNS `handshakeloss` test), the guess can be multiple times the actual RTT, which when encoded in the resumption ticket will cause an extremely slow second handshake, often causing the test to time out. Broken out of #1998 Fixes #2088 * Fixes & tests * Suggestion from @mxinden * Fix
We don't track which packets are coalesced with others, so when we detect a loss in one packet number space, we cannot resend any coalesced packets in other packet number space. This PR tries to approximate this behavior by scheduling un-ACKed Handshake and 0-RTT packets for RTX when packets are lost in the Initial space. Broken out of #1998
This caused some test failures, and @martinthomson discovered this as the reason at the IETF 120 hackathon. Broken out of #1998
Signed-off-by: Lars Eggert <lars@eggert.org>
Rather than having the caller determine for which space a PTO should be calculated for. Broken out of mozilla#1998 I'm ambivalent if we want this change - thoughts?
These should pass once the PR series replacing mozilla#1998 has landed. Broken out of mozilla#1998
We'd previously only mark 1 one or two packets as lost when a PTO fired. That meant that we potentially didn't RTX all data that we could have (i.e., that was in lost packets that we didn't mark lost). This also changes the probing code to suppress redundant keep-alives, i.e., PINGs that we sent for other reasons, which could double as keep-alives but did not. Broken out of mozilla#1998
I will refactor this PR into smaller standalone ones.
#1603 broke this, by falsely enabling time-based loss detection before we have received a valid
largest_acked
.This makes it clear that time-based LD should be disabled before an ACK is received.
The
DetectAndRemoveLostPackets
pseudo code also has theassert
above.When a PTO fires, we now explicitly also do a congestion reaction.
Fixes #1874
Also includes some fixes to hopefully make retransmissions during the handshake more robust, by timing out all packets in all packet number spaces affected by a PTO (making their contents eligible for RTX) and by not applying
ack_delay
to the application packet number space PTO during the handshake.