CI hangs indefinitely intermittently after tests #415

Closed
jbr opened this issue Sep 15, 2023 · 11 comments

@jbr
Contributor

jbr commented Sep 15, 2023

Example output before hanging indefinitely:

> interop-test-client@0.0.0 test
> mocha "src/**/*.spec.ts"



  interoperation test
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
    ✔ Prio3Count is compatible with Janus (24706ms)
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
    ✔ Prio3Sum is compatible with Janus (13273ms)
Successful upload
Successful upload
Successful upload
Successful upload
Successful upload
    ✔ Prio3Histogram is compatible with Janus (11218ms)


  3 passing (49s)
@jbr jbr changed the title interop-test-client hangs indefinitely intermittently interop-test-client tests hangs indefinitely intermittently Sep 15, 2023
@jbr jbr changed the title interop-test-client tests hangs indefinitely intermittently interop-test-client tests hang indefinitely intermittently Sep 15, 2023
@divergentdave
Contributor

This seems like it ought to be a problem with mocha, c8, npm, or node, since this is the last thing that's printed before exiting normally.

@jbr jbr changed the title interop-test-client tests hang indefinitely intermittently CI hangs indefinitely intermittently after tests Sep 15, 2023
@jbr
Contributor Author

jbr commented Sep 15, 2023

Updated title. c8 seems like the most likely cause, but I'm not eager to try to debug that.

@jbr
Contributor Author

jbr commented Sep 15, 2023

When I run the interop-test-client tests locally, I get the following output immediately after "3 passing", which is consistent with the "it's c8" assessment. Do we even want to be running coverage for the test-client tests?

----------|---------|----------|---------|---------|--------------------------------------------------------------------------------------------------------
File      | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s
----------|---------|----------|---------|---------|--------------------------------------------------------------------------------------------------------
All files |   72.36 |    27.65 |      60 |   72.36 |
 index.ts |   72.36 |    27.65 |      60 |   72.36 | ...160,162-165,167-168,180-181,183-186,189-192,195-196,206,220-222,229-232,236-237,274,279-283,301-304
----------|---------|----------|---------|---------|--------------------------------------------------------------------------------------------------------
ERROR: Coverage for lines (72.36%) does not meet global threshold (80%)
npm ERR! Lifecycle script `test:coverage` failed with error:
npm ERR! Error: command failed
npm ERR!   in workspace: interop-test-client@0.0.0
npm ERR!   at location: /Users/jbr/code/divviup/divviup-ts/packages/interop-test-client
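
If coverage enforcement turns out to be unwanted for this package, one way to switch it off would be a per-package c8 config that disables the threshold check, e.g. the following in packages/interop-test-client/.c8rc.json (a sketch only; whether the repo reads c8 configuration from a file like this, rather than from command-line flags, is an assumption):

    {
      "check-coverage": false
    }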

@jbr
Contributor Author

jbr commented Sep 15, 2023

Possibly related: bcoe/c8#454

@divergentdave
Contributor

I reproduced it locally after running sudo --preserve-env=DOCKER_HOST strace -f --trace=execve npm run test:coverage a couple times. I saved off a copy of the coverage directory before killing it, for investigation.

@divergentdave
Contributor

The node process that is using an entire core of CPU is the one running mocha, so this could be related to coverage collection, but it's not an issue with the coverage reporters. I took the filters off strace, and the only syscall happening during the problem is getpid(), several times per second.
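
For anyone else reproducing this, attaching an unfiltered strace to the already-spinning mocha process looks roughly like the following (a sketch; the pgrep pattern is an assumption about how the process shows up locally):

    # attach to the hung mocha worker and trace all syscalls
    sudo strace -f -p "$(pgrep -f 'node.*mocha' | head -n 1)"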

@jbr
Contributor Author

jbr commented Sep 15, 2023

That seems possibly consistent with the c8 issue above, which is addressed by #417.

@jbr
Contributor Author

jbr commented Sep 15, 2023

It also looks like the --exit flag is the crude fix, and it's possible there's something in the interop-test-client that doesn't entirely clean up after itself, per the documentation for --exit.
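
For context, mocha's --exit flag forces the process to exit as soon as the test run finishes instead of waiting for the event loop to drain, which is why it is generally described as a workaround for lingering handles. A minimal way to enable it (a sketch; the repo may instead pass the flag on the command line) is in the package's .mocharc.json:

    {
      "exit": true
    }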

@divergentdave
Contributor

I tried reproducing this with wtfnode hacked into mocha's entrypoint and exit function, as described in that issue, but it never hangs when I do so. It does print out stdout and stderr in the list of handles. I could believe that Node.js internals act strangely with stdout/stderr readiness for writing, etc. when the FDs are shared by two node processes due to the use of c8. We haven't changed much of consequence in the interop test package recently (except maybe --experimental-specifier-resolution=node?) so I don't think we've introduced any dangling handles. I think the exit flag workaround makes sense here.
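
For reference, the wtfnode approach can also be wired in without patching mocha itself, e.g. by loading a small helper alongside the tests and asking the hung process for a handle dump on demand (a sketch; the file name, the use of mocha's --require, and the SIGUSR2 trigger are all assumptions, not what was actually patched in):

    // dump-handles.ts (hypothetical helper, loaded e.g. via mocha --require)
    import wtf from "wtfnode";

    // Send SIGUSR2 to the hung process (kill -USR2 <pid>) to print every
    // active handle (sockets, timers, stdio) keeping the event loop alive.
    process.on("SIGUSR2", () => {
      wtf.dump();
    });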

@divergentdave
Contributor

One more data point: I added inspect to mocha's node-option configuration, and attached Chrome's devtools to each invocation. After getting "Waiting for the debugger to disconnect..." and closing the debugger, I got the high CPU usage. Whatever the bug is, it seems likely that it involves something deep in Node.js internals.
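
The debugger setup described above can be reproduced through mocha's node-option config key, which forwards flags (without the leading dashes) to the node process running the tests; a sketch, assuming a .mocharc.json in the package:

    {
      "node-option": ["inspect"]
    }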

@jbr
Contributor Author

jbr commented Sep 27, 2023

Closing this as it has not recurred since #417.

@jbr jbr closed this as completed Sep 27, 2023