test: move mocha to vitest for beacon-node #6028
Conversation
Performance Report: ✔️ no performance regression detected. Full benchmark results
You should not propose such a substantial change with the sole argument
You must prove:
There has been a lot of discussion and analysis shared with @wemeetagain, and long collaborative sessions with @matthewkeil influenced this direction. This PR is still a draft; when we reach the stability level, the details for the questions you asked will be added to the description.
Some straightforward errors popped up after moving the test runner which were never shown while running in mocha.
and
I think the answer to most of the questions is that mocha has been swallowing errors and hanging tests.
There was some very, very weird behavior with how the
One other very simple case is below; this one I agree we should not be using: lodestar/packages/beacon-node/test/unit/util/kzg.test.ts, lines 75 to 78 in e42d6cc
Thanks for the context, that's much more helpful. Can you figure out a way to prevent the type of re-formatting in the example below? From
to
It adds a ton of diff, making this PR very hard to review.
Codecov Report
@@ Coverage Diff @@
## unstable #6028 +/- ##
==========================================
Coverage ? 0
==========================================
Files ? 0
Lines ? 0
Branches ? 0
==========================================
Hits ? 0
Misses ? 0
Partials ? 0
Seems we had zero code coverage before; this will be fixed in this PR as well, and for other packages as we migrate.
Thanks for the added details, but I'm still missing specific, concrete reasons for the points outlined here. I'm not questioning that another test runner is showing better results, but I believe it's very important to try to better understand the source of the problem. Otherwise we are playing blind games, hoping that something else works.
We are not really skipping them right now, see the rationale in the PR that added the workaround. Of course this is not an ideal solution long term.
I am not sure this is related to mocha, it seemed to happen since Node v20 or around that time. There are a bunch of issues around this topic recently, see Debugging hanging processes.
Why? The following two changes answer why our tests were unstable: #6028 (comment)
Why is it mocha's fault? It's not mocha's fault for those issues, but mocha was hiding these errors. And because we just see a test hanging without any error traces, we were unable to identify and fix the bugs. One other point is that resource leaks are a common scenario in the Node.js ecosystem, and all major test runners support features to detect them natively, e.g. Jest has --detectOpenHandles and Vitest has --reporter=hanging-process. Having such features natively in the test runner helps us detect and solve such issues on time.
What is this test runner doing exactly that addresses the proven technical issues above? My understanding is that mocha was designed and developed before the
If a dev team builds an impression that some tests are not stable and therefore not important to check when they fail, it's equivalent to not having such tests. People don't often run tests locally and usually rely on CI to detect if something really is broken and needs to be looked into.
I believe in the case of Lodestar the observation is the same even before Node v20. Secondly, there can be issues introduced by the runtime, but having a stable test runner which helps detect those issues is more helpful. As I mentioned in an earlier comment, Jest and Vitest have this support natively, so it would be a good tool in our kit in case of runtime issues.
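As a hedged illustration of the open-handle point above (the flag names come from the tools' documentation; the test itself is invented), a leaked timer is the typical kind of handle that keeps the process alive after a run, which a forced exit simply hides:

import {afterAll, describe, expect, it} from "vitest";

describe("open handle example", () => {
  // A timer created for the suite and never cleared is a leaked handle.
  const interval = setInterval(() => {}, 1_000);

  it("passes, but leaves the interval running", () => {
    expect(interval).toBeDefined();
  });

  // Without this cleanup, Vitest's --reporter=hanging-process (or Jest's
  // --detectOpenHandles) would point at the interval above.
  afterAll(() => clearInterval(interval));
});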
That's a good point, mocha is very limited in that regard, you can't even trace max event listener warnings as far as I know.
I trust your judgement that vitest is a better and more stable test runner compared to mocha. Even if the exit issue is not directly related to mocha, just the fact that errors might be swallowed by mocha is enough of a selling point for me.
Those are fixes to our code; what we should understand better is why mocha is not displaying those errors, and what other test runners do differently to actually show those errors. As far as I understand, the current position is:
Also, a ping on addressing the formatting issue raised before, to reduce the diff.
I would rather list it like that:
So now we come to the point of what could be wrong in mocha. I am not fully sure, but here is my assumption:
Vitest, on the other hand, runs its binary in an ESM-enabled process, so it does not need to spawn a child process.
I had disabled the prettier formatting on large blocks of code so it can be reviewed easily.
If someone wants to explore the wrapper I created around mocha for debugging, they can look here.
I don't think we need to find the root cause of the mocha issues in order to switch test runners. We've had inconsistent and buggy results with mocha, with reproducible errors. We've seen those inconsistencies go away when using a different test runner. That's good enough for me. Edit: I'm mostly just frustrated that our tests are so unreliable that we've had to basically ignore them.
@nazarhussain thank you so much for the detailed breakdown of your investigation on the issue!
Yeah, definitely not, I'm OK with moving on. Just wanted to ensure this is a very well motivated decision.
expect(chain.seenSyncCommitteeMessages.get(slot, subnet, validatorIndex)).to.be.equal(
  toHexString(syncCommittee.beaconBlockRoot),
  "should add message root to seenSyncCommitteeMessages"
);
expect(chain.seenSyncCommitteeMessages.get(slot, subnet, validatorIndex)).toBe(
  toHexString(syncCommittee.beaconBlockRoot)
);
Aren't we losing information here if the assertion fails?
"should add message root to seenSyncCommitteeMessages"
This message seems helpful to me instead of just seeing value x does not equal value y
Yes, we are losing some information here; it will require some effort to get the tests into the right shape to recover that information.
"validator with doppelganger protection should be running" | ||
); | ||
expect(validator0WithoutDoppelganger[0].isRunning).to.be.equal( | ||
true, | ||
"validator without doppelganger protection should be running before first epoch" | ||
); |
The amount of information we lose here, in terms of reading the test code and the possible errors when assertions fail, is not ideal.
I agree, and the reason is that we didn't have the tests in an ideal state earlier.
These are the best practices:
- Have simple and straightforward expectations.
- All the information related to a test should be captured in the test description, so if it fails you know what's wrong.
- If a test failure needs further explanation, that implies the test description needs to be more informative.
- If a test needs mixed or confusing expectations, it should be divided into multiple test cases.
- If an assertion is complex, it should be extracted as a custom assertion so it has consistent messaging across the tests.
These practices can't be enforced if assertion libraries are as flexible as chai. That is why Jest, Vitest, and similar libraries don't provide a feature for custom messages.
Let's take an example. We need to assert that a is greater than b, and if we do it with the Node.js assert module, we will have:
assert(a > b);
Such an assertion is confusing unless we provide more details, as in:
assert(a > b, "a is not greater than b");
The problem with this approach is that one developer may write one assertion message and another may write a different one.
But if you use frameworks like Jest or Vitest, they provide special assertions, e.g.
expect(a).toBeGreaterThan(b);
With this assertion it's very clear, and it will have consistent error messages across every project. Vitest has an expect.extend function for defining such custom assertions so they can be reused across the project.
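For instance, a hedged sketch of such a reusable matcher (the matcher name and shape are invented here for illustration):

import {expect} from "vitest";

expect.extend({
  // Custom matcher: passes when the received object reports isRunning === true.
  toBeRunning(received: {isRunning: boolean}) {
    return {
      pass: received.isRunning === true,
      message: () => `expected isRunning to be true, received ${received.isRunning}`,
    };
  },
});

// Usage, after registering the matcher in a setup file:
// expect(validator0WithDoppelganger[0]).toBeRunning();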
Now take our example code.
expect(validator0WithDoppelganger[0].isRunning).to.be.equal(
  true,
  "validator with doppelganger protection should be running"
);
Having such assertion messages is clearly a sign of bad practice. It's clearly a test case description, not an assertion message. It should have been like this:
it("validator with doppelganger protection should be running", () => {
expect(validator0WithDoppelganger[0].isRunning).to.be.true;
})
If that test fails, you clearly know what's wrong, and you don't need custom assertion messages to explain it.
The problem with this approach is that one developer may write one assertion message and another may write a different one.
I don't mind different messages as long as they are correct and informative. In my opinion, the assertion itself only tells you what is expected but not why. We could split out each assertion into its own test case, but we would have to see how practical that is.
We definitely have to update the tests-style-guide, as it specifically mentions assertion messages.
We should at least add the previous messages as comments, as was done in other places.
Yes, we need to update that; I never saw it earlier.
Now take our example code.
Just to clarify this example, does it mean that what is currently in the assertion message should be moved up to the it message, and what is in the it message should be moved up to describe, and then possibly have nested describe blocks?
Logically yes.
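Sketching what that restructuring could look like for the doppelganger snippet above (the fixture names are taken from the diff; the declarations below are placeholders for whatever setup the real test performs):

import {describe, expect, it} from "vitest";

// Placeholders standing in for the fixtures created in the real test setup.
declare const validator0WithDoppelganger: Array<{isRunning: boolean}>;
declare const validator0WithoutDoppelganger: Array<{isRunning: boolean}>;

describe("doppelganger protection", () => {
  it("validator with doppelganger protection should be running", () => {
    expect(validator0WithDoppelganger[0].isRunning).toBe(true);
  });

  it("validator without doppelganger protection should be running before first epoch", () => {
    expect(validator0WithoutDoppelganger[0].isRunning).toBe(true);
  });
});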
@@ -9,8 +9,6 @@ import {getDevBeaconNode} from "../../../../../utils/node/beacon.js";
import {BeaconNode} from "../../../../../../src/node/nodejs.js";

describe("beacon state api", function () {
  this.timeout("30s");
What's the default timeout in vitest? Don't we need to increase the timeout of this test case?
The default timeout for vitest is 5s. We didn't see any failures due to this in at least the last 7-8 attempts, so I would not try to increase that timeout now.
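If it ever does become necessary, a hedged sketch of restoring the removed 30s override using the timeout-as-third-argument form the migration notes mention (the test body here is a placeholder):

import {describe, it} from "vitest";

describe("beacon state api", () => {
  it("responds within the extended window", async () => {
    // ...slow setup against a dev beacon node would go here...
  }, 30_000); // per-test timeout in milliseconds; vitest's default is 5_000
});

Vitest also exposes a testTimeout config option to raise the default for a whole project.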
// "expect updateHead to be called" | ||
expect(chainStub.recomputeForkChoiceHead).toHaveBeenCalled(); |
I am not sure why there was "expect updateHead to be called" as an assertion message; maybe the default message of .to.be.called was not great. But I'd say adding this as a comment here is just noise, the line below literally says the same thing.
Note: There are a bunch of places where this is the case. I'd suggest this is cleaned up, as otherwise it would advocate for this kind of commenting style, which is just not a good practice.
const endpoint = `${baseUrl}${remoteServiceRoutes.pending}`;
service = new MonitoringService("beacon", {endpoint, collectSystemStats: false}, {register, logger});
it("should abort pending requests if monitoring service is closed", () =>
  new Promise<void>((done, error) => {
don't we have to await this promise here? I'd assume otherwise this test case just finishes right away without waiting
It's returning the promise, so it will eventually be awaited.
This pattern feels kinda wrong to me but I guess we can keep it 🤷
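For context, a minimal sketch of the pattern in question, with invented timing: the test callback returns a Promise, so the runner awaits it before marking the test done; the (done, error) names in the diff above are simply that Promise's resolve and reject callbacks:

import {expect, it} from "vitest";

it("waits for the returned promise", () =>
  new Promise<void>((resolve, reject) => {
    setTimeout(() => {
      try {
        expect(1 + 1).toBe(2); // assertions run once the async work completes
        resolve(); // marks the test as finished
      } catch (err) {
        reject(err); // propagates assertion failures to the runner
      }
    }, 10);
  }));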
To be considered: should the contribution.md section on Debugging tests be updated here? What's the equivalent we would use now to run a single test suite?
You can also set the number of tests that are allowed to fail, for cases when we know a few specific tests would fail.
Failed test runs on unstable:
https://github.com/ChainSafe/lodestar/actions/runs/6517197158/job/17701469723#step:6:4037
https://github.com/ChainSafe/lodestar/actions/runs/6517197158/job/17701499293#step:6:1483
The second one is likely unrelated, but I haven't seen that assertion error before.
🎉 This PR is included in v1.12.0 🎉
Motivation
Improve developer productivity by reducing the time spent on CI job re-runs, by making the test runs produce more stable results.
Description
We have observed over the months that our test runs are not stable. Especially the E2E tests were really fragile. This introduces friction and delay into the development cycle. We ended up considering the E2E tests not required for CI, which is the same as not having any e2e tests at all, reducing the trust level in the code changes we introduce.
Based on day-to-day experience debugging those tests, I realized that something is not right with our test runner. One clear indication is that we explicitly set the force exit option for mocha. All tests should exit naturally; if we don't set this exit option, the tests do not exit even on local development machines.
lodestar/packages/beacon-node/.mocharc.yml, line 6 in e42d6cc
I also observed one other indicator that mocha is not serving us well. The code below is throwing an error in the tests, but the tests are passing just fine.
lodestar/packages/beacon-node/test/unit/util/kzg.test.ts, lines 75 to 78 in e42d6cc
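For illustration only (this is not the actual kzg.test.ts content), the general shape of how an error thrown inside a test can escape a runner's notice is a rejection that nothing awaits:

import {describe, expect, it} from "vitest";

describe("swallowed error illustration", () => {
  it("can pass even though an error is thrown", () => {
    // The rejection below is neither awaited nor returned, so the test body
    // completes successfully before the error surfaces. Whether the runner
    // later reports the unhandled rejection, and against which test, depends
    // on the runner and the Node version.
    void Promise.reject(new Error("boom"));
    expect(true).toBe(true);
  });
});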
There are a few other bits and pieces that suggest the mocha test runner is not performing well. So, based on that assumption, I tried another runner and found a lot of errors right away: some were logical issues, some were mock issues, and a few were runtime errors. All of those issues were buried by mocha, especially when code is running in a worker thread. Some of those issues are mentioned in the PR comments. Having identified those issues by moving to another test runner, and so that we don't run into the same problem in the future, I suggest migrating to another test runner completely, which makes our tests more stable. This PR will introduce the migration to the other test runner.
As of 29.7, the Jest support for ESM modules is still experimental, while Vitest is built entirely on ESM and supports it natively.
The migration can be divided into steps, starting with the unit and e2e tests for the beacon-node package. There is the jest-codemods tool; used that in this current PR as well.
Here are some key points for migration:
- Use npx jest-codemods to automatically migrate all assertions.
- this.timeout can be moved as the third parameter of describe or it.
- sinon.createStubInstance can be migrated with the vi.mock and vi.mocked methods (see the sketch below).
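A hedged sketch of that last point, using an invented ./chain.js module rather than the real lodestar code:

import {describe, expect, it, vi} from "vitest";
import {Chain} from "./chain.js"; // hypothetical module, for illustration only

// Auto-mock the module: every exported class gets mock-function methods.
vi.mock("./chain.js");

describe("stubbed chain (vi.mock instead of sinon.createStubInstance)", () => {
  it("records calls on the mocked method", () => {
    const chain = new Chain(); // mocked constructor, no real initialization runs
    chain.recomputeHead();
    // vi.mocked() narrows the type so mock matchers type-check.
    expect(vi.mocked(chain.recomputeHead)).toHaveBeenCalledOnce();
  });
});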
Steps to test or reproduce
Run all tests.