[Bug]: LaunchAsync finishes with TimeoutException on Chrome #32897

Closed
ilonatommy opened this issue Oct 1, 2024 · 2 comments

@ilonatommy
Member

Version

1.47.0

Steps to reproduce

Create a PR on https://github.com/dotnet/runtime that adds a dummy change under a path such as src/mono/wasm/Wasm.Build.Tests.

Expected behavior

Tests pass, it's a dummy change.

Actual behavior

Tests that use Playwright and run on Windows machines will most likely fail with a log similar to this:
https://helixre107v0xdcypoyl9e7f.blob.core.windows.net/dotnet-runtime-refs-pull-107865-merge-8d53b01d113d4e65a8/Workloads-NoWebcil-ST-Wasm.Build.Tests.TestAppScenarios.LibraryInitializerTests/1/console.c50164a6.log?helixlogtype=result
If not, rerun the CI by pushing another dummy commit. The hit rate is rather high: dotnet/runtime#107771

Additional context

We are using Playwright in Mono runtime tests in https://github.com/dotnet/runtime.
On Windows machines, they tend to fail on Chrome with timeouts that are hard to debug.
Playwright version: 1.47.0
Chrome version: 128.0.6613.120

Test logic: the tests run sequentially. Before launching a test, we check for Chrome processes and force-kill them. Then we create a new Playwright instance with await Microsoft.Playwright.Playwright.CreateAsync(); and use it to call LaunchAsync on the browser with a maximum timeout of 15000 ms and the following arguments:

C:\helix\work\correlation\chrome-win\chrome.exe --disable-background-networking --enable-features=NetworkService,NetworkServiceInProcess --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=ImprovedCookieControls,LazyFrameLoading,GlobalMediaControls,DestroyProfileOnBrowserClose,MediaRouter,AcceptCHFrame,AutoExpandDetailsElement,CertificateTransparencyComponentUpdater --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --disable-sync --force-color-profile=srgb --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --headless --hide-scrollbars --mute-audio --blink-settings=primaryHoverType=2,availableHoverTypes=2,primaryPointerType=4,availablePointerTypes=4 --no-sandbox --explicitly-allowed-ports=53956 --ignore-certificate-errors --lang=en-US --user-data-dir=C:\Users\ContainerAdministrator\AppData\Local\Temp\playwright_chromiumdev_profile-qQD4Tr --remote-debugging-pipe --no-startup-window

If the launch fails, we clean up the Playwright and Browser instances and retry up to 2 more times. The tests fail often on CI, with every attempt hitting the timeout. Little information is logged:

Attempt 3 failed with TimeoutException: System.TimeoutException: Timeout 15000ms exceeded.
Call log:
  - <launching> C:\helix\work\correlation\chrome-win\chrome.exe --disable-field-trial-config --disable-background-networking --disable-background-timer-throttling --disable-backgrounding-occluded-windows --disable-back-forward-cache --disable-breakpad --disable-client-side-phishing-detection --disable-component-extensions-with-background-pages --disable-component-update --no-default-browser-check --disable-default-apps --disable-dev-shm-usage --disable-extensions --disable-features=ImprovedCookieControls,LazyFrameLoading,GlobalMediaControls,DestroyProfileOnBrowserClose,MediaRouter,DialMediaRouteProvider,AcceptCHFrame,AutoExpandDetailsElement,CertificateTransparencyComponentUpdater,AvoidUnnecessaryBeforeUnloadCheckSync,Translate,HttpsUpgrades,PaintHolding,PlzDedicatedWorker --allow-pre-commit-input --disable-hang-monitor --disable-ipc-flooding-protection --disable-popup-blocking --disable-prompt-on-repost --disable-renderer-backgrounding --force-color-profile=srgb --metrics-recording-only --no-first-run --enable-automation --password-store=basic --use-mock-keychain --no-service-autorun --export-tagged-pdf --disable-search-engine-choice-screen --unsafely-disable-devtools-self-xss-warnings --headless=old --hide-scrollbars --mute-audio --blink-settings=primaryHoverType=2,availableHoverTypes=2,primaryPointerType=4,availablePointerTypes=4 --no-sandbox --explicitly-allowed-ports=49233 --ignore-certificate-errors --lang=en-US --user-data-dir=C:\Users\ContainerAdministrator\AppData\Local\Temp\playwright_chromiumdev_profile-kFF0DH --remote-debugging-pipe --no-startup-window
  - <launched> pid=4464
   at Microsoft.Playwright.Transport.Connection.InnerSendMessageToServerAsync[T](ChannelOwner object, String method, Dictionary`2 dictionary, Boolean keepNulls) in /_/src/Playwright/Transport/Connection.cs:line 206
   at Microsoft.Playwright.Transport.Connection.WrapApiCallAsync[T](Func`1 action, Boolean isInternal) in /_/src/Playwright/Transport/Connection.cs:line 532
   at Microsoft.Playwright.Core.BrowserType.LaunchAsync(BrowserTypeLaunchOptions options) in /_/src/Playwright/Core/BrowserType.cs:line 56
   at Wasm.Build.Tests.BrowserRunner.SpawnBrowserAsync(String browserUrl, Boolean headless, Int32 timeout, Int32 maxRetries, String language) in /_/src/mono/wasm/Wasm.Build.Tests/BrowserRunner.cs:line 129
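
For readers who want to reason about the launch-and-retry flow described above, here is a minimal, language-neutral sketch of it. It is written in Python for brevity and is not the actual BrowserRunner code, which is C#. The names launch_with_retries, launch, and cleanup are hypothetical; launch stands in for the LaunchAsync call and cleanup for disposing the Playwright and Browser instances.

```python
import asyncio


async def launch_with_retries(launch, cleanup, timeout_s=15.0, max_attempts=3):
    """Await `launch()` up to `max_attempts` times, bounding each attempt by `timeout_s`.

    `launch` and `cleanup` are hypothetical async callables supplied by the caller.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Each attempt gets its own timeout budget.
            return await asyncio.wait_for(launch(), timeout=timeout_s)
        except asyncio.TimeoutError as exc:
            last_error = exc
            print(f"Attempt {attempt} failed: timeout of {timeout_s}s exceeded")
            # Dispose of whatever the failed attempt left behind before retrying.
            await cleanup()
    raise TimeoutError(f"launch failed after {max_attempts} attempts") from last_error
```

One thing the sketch makes visible: if the underlying cause is environmental, every attempt fails the same way, so retries add run time without adding diagnostic information.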

We switched on debug logging with "set DEBUG=pw:browser*", but with logging enabled the problem no longer reproduced, which left us without a real fix. What do you suggest for investigating the issue further?

PR that tries to fix the issue:
dotnet/runtime#107865

cc @mxschmitt

Environment

To be added when this build finishes:
https://dev.azure.com/dnceng-public/public/_build/results?buildId=823464&view=results
@mxschmitt
Member

Summary of an internal discussion: the machine has enough CPU and memory. I suggest the following:

  • Increase timeout back to 30s or null
  • Try our chromium (not specifying executablePath or Channel)
  • Set DEBUG=pw:protocol,pw:browser env var

@mxschmitt
Member

Turns out it's based on Windows containers. Windows containers are not officially supported. If you have any further findings, feel free to post them here, as they might help future users.

IIRC mcr.microsoft.com/windows/server:ltsc2022 was working fine with Playwright for some users.
