Reduce TAPE Unit Test Flakiness #328

Closed
wilsonbb opened this issue Dec 18, 2023 · 4 comments

Comments

@wilsonbb
Collaborator

After the refactor was merged into main in #308, the unit tests seem to have become flaky.

An example part of the error from last week's failed smoke test:

tests/tape_tests/test_ensemble.py::test_parquet_construction[parquet_ensemble_from_source]
  /opt/hostedtoolcache/Python/3.11.7/x64/lib/python3.11/site-packages/tape/ensemble.py:94: RuntimeWarning: coroutine 'wait_for' was never awaited
    self.client.close()

@dougbrn noted this could be an attempt to send messages to a closed client (dask/distributed#2956), caused by changes to conftest. So far I have gone through all of the changes, but nothing has stood out. I will be making another pass.
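For context on the warning text itself: Python emits `RuntimeWarning: coroutine '...' was never awaited` whenever a coroutine object is created and then discarded without being awaited. The stdlib-only sketch below reproduces that message in isolation. `wait_for_reply` and `close_without_awaiting` are hypothetical names, and this is not the code path in `tape/ensemble.py` or in distributed, just an illustration of what the warning means.

```python
import asyncio
import gc
import warnings


async def wait_for_reply():
    # Hypothetical stand-in for an async call made during shutdown.
    await asyncio.sleep(0)


def close_without_awaiting():
    # Calling an async function only creates a coroutine object.
    # Dropping it without awaiting it triggers the RuntimeWarning.
    wait_for_reply()


def collect_unawaited_warnings():
    # Record warnings instead of printing them so they can be inspected.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        close_without_awaiting()
        gc.collect()  # make sure the dropped coroutine has been finalized
    return [
        str(w.message)
        for w in caught
        if issubclass(w.category, RuntimeWarning)
    ]


messages = collect_unawaited_warnings()
print(messages)
```

If the same thing happens inside a synchronous `close()` that schedules async work on an event loop that is already shut down, the warning would surface at the `close()` call site, which may be what the traceback above is showing.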

@wilsonbb wilsonbb added the bug Something isn't working label Dec 18, 2023
@wilsonbb wilsonbb self-assigned this Dec 18, 2023
@dougbrn
Collaborator

dougbrn commented Dec 19, 2023

I was encountering this pretty consistently on #327: the 3.9 build kept failing on the test_parquet_construction case from hipscat. It did eventually run after ~10 restarts...

@wilsonbb
Collaborator Author

An example of the above failure: https://github.com/lincc-frameworks/tape/actions/runs/7225142286/job/19687939027

Of particular relevance seems to be

tests/tape_tests/test_ensemble.py::test_parquet_construction[read_parquet_ensemble1]
  /opt/hostedtoolcache/Python/3.9.18/x64/lib/python3.9/site-packages/tape/ensemble.py:94: RuntimeWarning: coroutine 'wait_for' was never awaited
    self.client.close()

@dougbrn
Collaborator

dougbrn commented Feb 8, 2024

As of #376, this seems to be happening much less, but it's important to emphasize that the underlying issue of how clients are shared/opened/closed has not changed at all. I've just made most things client-free.
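One way to picture the shared/opened/closed problem, as a sketch only: if several objects hold the same client and each one closes it on teardown, whichever goes second ends up talking to a closed client. The example below uses `StubClient` and `StubEnsemble`, both hypothetical stand-ins written for this illustration. It is not the dask API, not how TAPE's ensemble is implemented, and not a proposed fix. It only shows an "only close what you created" ownership rule.

```python
class StubClient:
    """Hypothetical stand-in for a distributed client (not the dask API)."""

    def __init__(self):
        self.closed = False
        self.close_calls = 0

    def close(self):
        self.close_calls += 1
        self.closed = True


class StubEnsemble:
    """Hypothetical ensemble-like object that may or may not own its client."""

    def __init__(self, client=None):
        # Ownership is decided once, at construction time.
        self._owns_client = client is None
        self.client = StubClient() if client is None else client

    def close(self):
        # Only close a client this object created, and only once.
        if self._owns_client and not self.client.closed:
            self.client.close()

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        self.close()
        return False


# Two ensembles sharing one externally created client.
shared = StubClient()
first = StubEnsemble(client=shared)
second = StubEnsemble(client=shared)
first.close()
shared_still_open = not shared.closed  # second can still use the client
second.close()

# An ensemble that created its own client closes it exactly once.
owned = StubEnsemble()
owned.close()
owned.close()
```

Under this rule the code that created the shared client (for example a test fixture) stays responsible for closing it, so teardown order between ensembles no longer matters.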

@wilsonbb
Collaborator Author

Revisiting, unit test flakiness has decreased significantly, but as noted the underlying connection issues we're seeing with client open/close are still present in the multi-ensemble context. Closing this issue as flakiness has not been a concern for a month, but #362 remains open to investigate and resolve these client issues.
