Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove test flakiness #181

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

Abestanis
Copy link
Collaborator

@Abestanis Abestanis commented Oct 27, 2024

This is an attempt to reduce the test flakiness. I noticed that setUp calls are not executed on the same FakeAsync zone than the tests, which means that they are more likely to leak into other tests and flushing mikro-tasks in the tests doesn't actually flush the ones scheduled in setUp. Maybe this is enough to fix the flakiness in the tests. It definitely helped to reveal two instances where we forgot to dispose a timer or animation controller.

I also removed all runAsync calls. As a general rule of thumb, we don't really want the tests to do any real async work, (network requests, file accesses or even worse, waiting for a timeout should all be mocked), so we should never use runAsync.

Fixes #162.

@Abestanis Abestanis requested a review from nt4f04uNd October 27, 2024 18:22
@Abestanis Abestanis marked this pull request as draft October 27, 2024 18:25
@Abestanis Abestanis force-pushed the feature/test_cleanup branch from c67e075 to fa7b741 Compare October 27, 2024 20:31
@Abestanis Abestanis force-pushed the feature/test_cleanup branch from aa6c5f1 to 334cac1 Compare October 27, 2024 20:39
@Abestanis Abestanis force-pushed the feature/test_cleanup branch from 2d8d846 to 25134ad Compare October 27, 2024 21:00
@Abestanis
Copy link
Collaborator Author

Given the fact that 100 cycles of the tests passed (see CI run for 25134ad) I'd say we can be pretty confident that this fixes the flakiness. 😁

@Abestanis Abestanis force-pushed the feature/test_cleanup branch from 25134ad to f5936a1 Compare October 27, 2024 23:47
@Abestanis Abestanis marked this pull request as ready for review October 27, 2024 23:48
@Abestanis Abestanis changed the title Reduce test flakiness Remove test flakiness Oct 27, 2024
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interestengly, a side effect I see on many screenshots is that the shadow from the drawer seems to have gone

It's OK, just something I noticed reviewing golden changes

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually see any difference in any shadow, the only diff I can see is in the AlbumArtRotating in the lower left because it has a slightly different angle. In which screenshot do you see that difference?

Difference

Pixel diff of this screenshot

AsyncCallback callback, {
AsyncCallback? goldenCaptureCallback,
VoidCallback? initialization,
VoidCallback? postInitialization,
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we align runAppTestWithoutUi with this one and pass this callback as well?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but it would be really useless because we would call it directly before calling the callback. I don't think it's worth it:

Future<void> runAppTestWithoutUi(
    FutureOr<void> Function() callback, {
    VoidCallback? initialization,
    VoidCallback? postInitialization,
  }) async {
    addTearDown(postTest);
    return runTest(
      () => withClock(clock, () async {
        await _setUpAppTest(initialization);
        try {
          postInitialization?.call();
          await callback();
        } finally {
          // [...]
        }
      }),
      () {},
    );
  }

/// 4. Runs the test from the [callback].
/// 5. Optionally, runs [goldenCaptureCallback].
/// 6. Stops and disposes the player.
/// 7. Un-pumps the screen and flushes all micro-tasks and stream events.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also cross-reference runAppTestWithoutUi here as "See also", discussing their differences and when one should use one above the other one

Same for the comment in runAppTestWithoutUi

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

test/test.dart Outdated Show resolved Hide resolved
test/test.dart Outdated
await pumpWidget(const SizedBox());
// Wait for any asynchronous events and stream callbacks to finish.
await pump(const Duration(seconds: 1));
// Wait for ui animations.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine this is referring to "system UI" (but don't remember for sure)

Let's put that into the comment tho

Otherwise pumping out the UI above would already stop animations and we wouldn't need pumpAndSettle

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I expanded the comment.

test/test.dart Outdated
},
() => test(tester),
goldenCaptureCallback: () => tester.screenMatchesGolden(
Invoker.current!.liveTest.test.name.split(' | theme')[0].replaceAll(' ', '.'),
Copy link
Owner

@nt4f04uNd nt4f04uNd Nov 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Filename is already being built here https://github.com/nt4f04uNd/sweyer/blob/master/test/test.dart#L259

I don't completely understand the purpose of this code, but it seems like using the description parameter would already would enrich it with the required parameters, maybe except from the group name

This code just seems to be not very scalable

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows us to automatically deduce the golden filename from the test name, which I think is a big plus for us.

... maybe except from the group name

That's exactly the crux of the issue, we don't have the group information in the description, and the only way I could see to get it is via the test name.

Filename is already being built here https://github.com/nt4f04uNd/sweyer/blob/master/test/test.dart#L259

It's not the same, that function expects as an argument the very string we are creating here.

This code just seems to be not very scalable

Since we create the test description directly above, I changed it to use that description directly so it will still work if we change TestDescription.buildDescription in the future. The only other thing this relies on now is that the group comes before the test description in the test name, an I think it's worth accepting that assumption, especially since this is just used in the tests.

@@ -10,46 +10,46 @@ void main() {
final favoriteSong2 = songWith(id: 4, title: 'Song 4', isFavoriteInMediaStore: true);
final favoriteSong3 = songWith(id: 5, title: 'Song 5', isFavoriteInMediaStore: true);

setUp(() async {
await setUpAppTest(() {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this is a good API change that we not longer can use setUp between tests?

Copy link
Owner

@nt4f04uNd nt4f04uNd Nov 3, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd argue it's not - it reduces the reusability of setup logic

In some cases it might also make that tests utilities are not composable with each other - this is a real case in tests that I had recently, when I made a not very composable API desicion, which put constraints on how tests could be written, so I had to rewrite it at some point after

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the description of the PR I see that you want them to live inside the same zone as other test logic

It this possible to achieve in some other way?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It this possible to achieve in some other way?

Not that I can see. I also don't particularly like this, but the async zone is created by the test internals after the setUp is run, so there is no way for code that is run in setUp to be run in that zone. And the fact that we have futures running in a different zone is the main problem that causes the flakiness we are seeing.

We could try to have our own version of setUp that that somehow magically registers itself via some global and which we could run in _setUpAppTest, but this feels brittle and very complicated.

@Abestanis
Copy link
Collaborator Author

Hey, I hope you're doing good and had a good start into the new year! 🎆

@Abestanis
Copy link
Collaborator Author

The timeout on the CI is very weird, there were only changes in comments, and it runs fine on my fork. Maybe something has changed on the CI. I'm looking into it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Investigate test flakiness
2 participants