Reverted: Increase stream-cancellation-delay default to 1000 millis #590

mdedetrich · 2024-09-04T12:54:05Z

References: #422

Still need to update the tests, as per @jrudolph's original comment:

> That broke lots of tests that probably didn't automatically propagate closing, so they now fail while waiting for the now ten times longer cancellation times... (Might still make sense to "just" fix those tests)

pjfanning · 2024-09-04T13:00:01Z

Reading the Akka issue and PR seems to indicate that this will cause test failures. I can help if needed to try to fix up the test issues. I'm hoping to get Pekko HTTP 1.1.0 released but it seems useful to fix this.

mdedetrich · 2024-09-04T13:04:55Z

> Reading the Akka issue and PR seems to indicate that this will cause test failures. I can help if needed to try to fix up the test issues. I'm hoping to get Pekko HTTP 1.1.0 released but it seems useful to fix this.

The test failures are expected because some of the tests expect the stream to be cancelled within 100ms, and now that it has been increased to 1000ms those tests are broken (as per @jrudolph's comment, the change itself is safe).

mdedetrich · 2024-09-04T13:06:12Z

I'll have time to look into this tonight.

JD557 · 2024-09-04T13:11:00Z

Should this actually be marked as closing the issue?
It's better than the current status quo, but the underlying issue is still there.

For reference, we've used 1s for a long time with no issue until today, where we saw this resurface again. So 1s is not bullet proof.

mdedetrich · 2024-09-04T13:11:53Z

> Should this actually be marked as closing the issue? It's better than the current status quo, but the underlying issue is still there.
>
> For reference, we've used 1s for a long time with no issue until today, where we saw this resurface again. So 1s is not bullet proof.

Fair point, I'll change this so it won't close the original issue.

mdedetrich · 2024-09-04T22:19:35Z

PR is ready, I managed to fix the issues with the tests.

mdedetrich · 2024-09-04T22:21:47Z

http-core/src/test/resources/reference.conf

@@ -6,4 +6,5 @@ pekko {
    default-dispatcher.throughput = 1
  }
  stream.materializer.debug.fuzzing-mode = off
+  stream.testkit.all-stages-stopped-timeout = 20 s


This timeout is related to Utils.assertAllStagesStopped, which is designed to check that all stream stages finish when the test does. Since stream-cancellation-delay creates a stage in and of itself, this timeout needs to be increased.

To me it makes sense to do this globally (for tests) rather than on a test-by-test basis, because there is no harm in having it too long: a longer timeout only lengthens the failing test case, not the passing one.

The default value of this setting is 5 s.
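For orientation, the two settings discussed in the comment above might be sketched side by side as follows. This is a sketch only: the `pekko.stream.testkit.all-stages-stopped-timeout` line comes from the diff, while the `pekko.http.server` and `pekko.http.client` key paths for `stream-cancellation-delay` are assumptions not shown in this thread and should be checked against the Pekko HTTP reference configuration.

```hocon
# Assumed key paths for the cancellation delay (not shown in this PR's diff).
# Proposed default in this PR: 1000 millis (previously 100 millis).
pekko.http.server.stream-cancellation-delay = 1000 millis
pekko.http.client.stream-cancellation-delay = 1000 millis

# From the diff above: gives Utils.assertAllStagesStopped more time (default: 5 s).
pekko.stream.testkit.all-stages-stopped-timeout = 20 s
```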

jrudolph · 2024-09-06

I cannot remember the reason to choose 100ms for the delay cancellation limit. Having to increase the all-stages-stopped-timeout to higher values is somewhat of an indication of what kind of real world implications this change might have: cleanup of broken/closed connections might now take longer.

IIRC, the whole reason to introduce the delay was to deal with the fact that stream cancellation could not be well-configured in Akka 2.5. With 2.6, it's not 100% clear if the delay is even still needed or whether it could be solved in a better way (i.e. by resolving the cancellation race more intelligently, taking the cancellation reason into account at the right/more places) to avoid symptoms as shown in the ticket.

mdedetrich · 2024-09-06

> I cannot remember the reason to choose 100ms for the delay cancellation limit. Having to increase the all-stages-stopped-timeout to higher values is somewhat of an indication of what kind of real world implications this change might have: cleanup of broken/closed connections might now take longer.

IIRC only a few tests that use Utils.assertAllStagesStopped started timing out, so I can update those tests individually. My ultimate reasoning for making this change global was to reduce surprises for people writing future tests (i.e. giving more QoL to developers), more specifically avoiding the "I just updated/wrote a test and now it's failing due to a finely tuned timeout, and now I have to figure out that it was because of said timeout" situation.

I think that as long as you handle shutdown of Pekko HTTP gracefully it shouldn't cause issues in prod; however, I admit that many people may not even be handling shutdowns of Pekko HTTP correctly.

> IIRC, the whole reason to introduce the delay was to deal with the fact that stream cancellation could not be well-configured in Akka 2.5. With 2.6, it's not 100% clear if the delay is even still needed or whether it could be solved in a better way (i.e. by resolving the cancellation race more intelligently, taking the cancellation reason into account at the right/more places) to avoid symptoms as shown in the ticket.

If this is the case then it should definitely be looked into. I am not aware of the updates to stream cancellation in Akka 2.6, but judging by the date of the report at #422 (comment), if that stream cancellation update is now the default, it didn't seem to fix the underlying issue.

jrudolph · 2024-09-06

> My ultimate reasoning for making this change global was to reduce surprises for people writing future tests (i.e. giving more QoL to developers), more specifically avoiding the "I just updated/wrote a test and now it's failing due to a finely tuned timeout, and now I have to figure out that it was because of said timeout" situation.

I agree with the goal of helping test writers, but it's not good if tests only succeed or fail after a long timeout. Even the old 2 second timeout might have been somewhat of a test-fixing bankruptcy issue where we didn't have the time to make sure all tests run through in a timely fashion.

mdedetrich · 2024-09-06

> I agree with the goal of helping test writers, but it's not good if tests only succeed or fail after a long timeout. Even the old 2 second timeout might have been somewhat of a test-fixing bankruptcy issue where we didn't have the time to make sure all tests run through in a timely fashion.

Should I go through and undo this global timeout increase and only apply it to the specific tests?

raboof · 2024-09-09

Perhaps a nice middle ground would be to increase the timeout in the configuration for the test classes that are affected? Not as neat as specifying it for an individual test, but more granular than applying it globally.

mdedetrich · 2024-09-09

That's definitely true; however, it's usually only one specific test per test class that requires an override (as an example, in WebSocketIntegrationSpec it's only "send back 100 elements and then terminate without error even when not ordinarily closed" that even needs the override).

Basically it's only tests which explicitly cancel some part of the request/response as part of what is being tested, and given how big test suites normally are in pekko-http, it might not be enough of a difference to be useful.
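As a rough illustration of the per-class middle ground suggested in this thread, the override could live in the configuration of an affected spec rather than in the shared test reference.conf. The fragment below is a sketch under that assumption. How a given spec class supplies its own configuration is not shown in this PR and may differ between specs.

```hocon
# Hypothetical per-spec override, scoped to one affected test class
# instead of http-core/src/test/resources/reference.conf.
pekko.stream.testkit.all-stages-stopped-timeout = 20 s
```

Scoping the override this way keeps the 5 s default for every other spec, so a stage that genuinely fails to stop elsewhere is still reported quickly.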

mdedetrich · 2024-09-04T22:22:52Z

http2-tests/src/test/scala/org/apache/pekko/http/impl/engine/http2/H2SpecIntegrationSpec.scala

@@ -306,6 +306,7 @@ class H2SpecIntegrationSpec extends PekkoFreeSpec(
        executable,
        "-k", "-t",
        "-p", port.toString,
+        "-o", "9",


See https://github.com/summerwind/h2spec?tab=readme-ov-file#usage for reference. The default value is 2 seconds; I arrived at 9 through manual testing, incrementing by 1 each time.

pjfanning

lgtm

mdedetrich · 2024-09-04T22:23:20Z

@pjfanning Feel free to merge when you see fit

mdedetrich marked this pull request as draft September 4, 2024 12:54

mdedetrich mentioned this pull request Sep 4, 2024

Random occasional SubscriptionWithCancelException$NoMoreElementsNeeded after migrating from Akka to Pekko #422

Open

pjfanning added this to the 1.1.0 milestone Sep 4, 2024

mdedetrich force-pushed the increase-stream-cancellation-delay-default branch 2 times, most recently from fe0067b to 3998b45 Compare September 4, 2024 21:51

Increase stream-cancellation-delay default to 1000 millis

5bdce88

mdedetrich force-pushed the increase-stream-cancellation-delay-default branch from 3998b45 to 5bdce88 Compare September 4, 2024 22:05

mdedetrich marked this pull request as ready for review September 4, 2024 22:18

mdedetrich requested review from jrudolph, raboof, gmethvin, nivox, He-Pin, nvollmar, samueleresca, pjfanning and Roiocam September 4, 2024 22:19

mdedetrich commented Sep 4, 2024

pjfanning approved these changes Sep 4, 2024

mdedetrich requested a review from pjfanning September 4, 2024 22:23

pjfanning merged commit 932a22f into apache:main Sep 4, 2024
10 checks passed

mdedetrich deleted the increase-stream-cancellation-delay-default branch September 4, 2024 22:27

This was referenced Sep 9, 2024

Add overridden duration timeout to StreamTestKit apache/pekko#1468

Open

clear warnings in http-core #115

Open

pjfanning added a commit that referenced this pull request Sep 12, 2024

Revert "Increase stream-cancellation-delay default to 1000 millis (#590)"

ef1228e

This reverts commit 932a22f.

pjfanning mentioned this pull request Sep 12, 2024

Revert "Increase stream-cancellation-delay default to 1000 millis" #597

Merged

pjfanning added a commit that referenced this pull request Sep 21, 2024

Revert "Increase stream-cancellation-delay default to 1000 millis (#590)" (#597)

c09961c

This reverts commit 932a22f.

pjfanning removed this from the 1.1.0 milestone Sep 21, 2024

pjfanning changed the title ~~Increase stream-cancellation-delay default to 1000 millis~~ Reverted: Increase stream-cancellation-delay default to 1000 millis Sep 21, 2024
