Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test: Add checks for the middle page after VM creation #1421

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

yunmingyang
Copy link
Contributor

Check status changes during the VM creation.

@martinpitt martinpitt marked this pull request as draft February 5, 2024 06:38
@martinpitt
Copy link
Member

@yunmingyang FYI: Moving to draft while you are still working on this (half of the tests are red). Please move back to "ready for review" when you're done. Also, what is a "middle page"?

Thanks!

@yunmingyang
Copy link
Contributor Author

@martinpitt Thanks. The "middle page" is that, before the VM creation is success and there is a VM detail page, the VM status is "Creating VM", the page after we clicking the VM name will not be directed the VM detail page, but a page like
Screenshot from 2024-02-05 14-43-59
I called it the "middle page" as it seems that no official name for it. Do we have a official name for this page?

@martinpitt
Copy link
Member

martinpitt commented Feb 5, 2024

Ah, perhaps "Creating VM state"? Note that this is going to be very flaky unless you have a way to control the installation -- i. e. make it take a nontrivial amount of time and then do something to the system to finish it.

@yunmingyang
Copy link
Contributor Author

yunmingyang commented Feb 5, 2024

Ah I think the test will have enough time to check if using "Download an OS" with creating running VM. And for "Create and edit", I have not met the flaky locally, not sure whether the failed job is related with it or not. But actully, "create and edit" is a key point for this PR, as there will be enough time to test manually by using "Create and running", but it is too fast with "create and edit" so that tests could not be done manually in this situation.

@martinpitt
Copy link
Member

There are still lots of failures here:

wait_js_cond(ph_in_text("#vm-subVmTestCreate31-system-state","Creating VM")): Uncaught (in promise) Error: actual text: Shut off

This smells like the race condition discussed above.

@yunmingyang yunmingyang force-pushed the testCheckMidPageForRunningVM branch 4 times, most recently from 5f99419 to 79bf1a9 Compare March 24, 2024 22:21
@yunmingyang
Copy link
Contributor Author

@martinpitt sorry for the late response, I checked the original failed, according the log and screenshot, I think the reason why it failed is that the VM is still left there after deletion when using "create and run". Then, I tried to resolve this by adding some extra steps, but I failed. So I add a sleep for debugging, and the case passed the original failed location as the VM deletion will not leave a shutoff VM. For the new failed case, I think the reason is that there are two deletion operation in checkMiddlePageWhenCreating, the first passed with the debug line, the second didn't pass as there is no sleep before deletion just as the debug line, and it seems that this failure doesn't happen every time as some of checks pass. I think maybe this problem is related with libvirt, I am not sure. I think we could add some time wait before the deletion to resolve that. But I think add a sleep there is not a good way. Do you have some another ways for adding the time wait?

@martinpitt
Copy link
Member

Right, static sleeps are never the right way, but rather waiting for something specific -- like, an UI change, or maybe even polling some virsh command. I don't know why the test is failing, but if it's failing to delete that VM then it could be useful to enable debug logging what's going on under the hood:

b.eval_js('window.debugging = "machines"')

That helps especially if you cannot reproduce the issue locally, so that you can see the libvirt/dbus messages in the test output. However, how hard did you try? Try to run the test several times, and also if you run against a persistent bots/vm-run VM, then do add --no-network (to reproduce what the real CI does with run-test).

@yunmingyang yunmingyang force-pushed the testCheckMidPageForRunningVM branch 2 times, most recently from 3d87834 to bd64228 Compare April 17, 2024 14:43
@yunmingyang yunmingyang force-pushed the testCheckMidPageForRunningVM branch 4 times, most recently from 30e4a48 to a1f35d6 Compare April 29, 2024 11:01
@yunmingyang
Copy link
Contributor Author

Hi, @martinpitt. Thanks for your information, it is very helpful. After some investigation, I am not whether it is correct, but I think maybe the difference between failure and success is there will be doUsagePolling, "UPDATE_VM" action and some other API invoking before the Destroy event. You could find it in https://cockpit-logs.us-east-1.linodeobjects.com/pull-1421-f22c88e1-20240423-115708-arch/log.html(subVmTestCreate31). So I add a refresh before removing the VM, and it seems that the problem disappears even though removing the sleep. But it smells like an issue about race condition. What do you think of it?

@yunmingyang yunmingyang force-pushed the testCheckMidPageForRunningVM branch from a1f35d6 to 3d8fc30 Compare May 20, 2024 17:45
@yunmingyang yunmingyang marked this pull request as ready for review May 21, 2024 01:25
b.wait_in_text(f"#vm-{self.name}-system-state", "Creating VM")

b.click(f"#vm-{self.name}-system-name")
b.wait_visible(f"h1:contains(\"Creating VM {self.name}\")")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering what this PR adds, so far I can see that this is a new test assertion, not covered by tests:

[jelle@t14s][~/projects/cockpit-machines]%git grep "Creating VM" test/
[jelle@t14s][~/projects/cockpit-machines]%

But we can easily move this into an existing test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test check the middle page after clicking "Create and edit" and "Create and run". Include, the page will be redirected to the middle page after clicking "Create and edit"; after click "Create and run", the page will not be redirected, but click the VM name could be in the middle page; Check the content of the middle page; Check the button "Go to VMs list" of the middle page.

I am not sure whether it will be easy if we try to move this into an existing test, according to the debug process above. And, if move it into createTest, there will bring more complexity, such as need to consider that os is RHEL. Also, if we didn't figure out why the deletion needs a reload, the failure may spread to all create test

b.wait_in_text("h2.vm-name", self.name)

# It seems that the difference between removing success and failure is doUsagePolling and reducer "UPDATE_VM" action.
# Thus, try to reload the page to trigger these event manually to see whether it could resolve the problem
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a bug? Or do we simply not have a good way to wait till we can delete the VM? But maybe the bigger question is, can't we re-use runner.createTest which removes a VM for us.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find a way to wait untill we could detele the VM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants