[QUIC] [Linux] Segmentation fault in System.Net.Http.Functional.Tests #103703

Closed
matouskozak opened this issue Jun 19, 2024 · 10 comments · Fixed by #105109
Labels
arch-arm32 area-System.Net.Http blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' Known Build Error Use this to report build issues in the .NET Helix tab os-linux Linux OS (any supported distro) test-run-core Test failures in .NET Core test runs tracking-external-issue The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly
Comments

@matouskozak
Member

matouskozak commented Jun 19, 2024

Build Information

Build: https://dev.azure.com/dnceng-public/public/_build/results?buildId=712564
Build error leg or test failing: System.Net.Http.Functional.Tests
Affected CI: linux-arm Release Libraries_Release_CoreCLR (runtime-extra-platforms)

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorPattern": ["arm", "Segmentation fault.*System.Net.Http.Functional.Tests"],
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}
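
A minimal sketch of how a two-element "ErrorPattern" like the one above could be evaluated against a console log. It assumes each entry is a regular expression and that the entries must match in order on separate lines that need not be adjacent. The function name and the sample log lines are hypothetical, and the real known-issues matcher may behave differently.

```python
import re

def matches_known_issue(log_lines, patterns):
    """Return True if every pattern matches, in order, on distinct log lines.

    Assumption: lines between two matches are allowed (in-order, not adjacent).
    """
    next_pattern = 0
    for line in log_lines:
        if next_pattern < len(patterns) and re.search(patterns[next_pattern], line):
            next_pattern += 1
    return next_pattern == len(patterns)

# Patterns copied from the issue body.
patterns = ["arm", r"Segmentation fault.*System.Net.Http.Functional.Tests"]

# Made-up sample lines, for illustration only.
sample_log = [
    "sample: running on linux-arm",
    "sample: unrelated output",
    "sample: Segmentation fault (core dumped) in System.Net.Http.Functional.Tests",
]

print(matches_known_issue(sample_log, patterns))
```

Under this reading, the first entry narrows the match to arm legs and the second identifies the crash in this test assembly.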

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=712564
Error message validated: [arm Segmentation fault.*System.Net.Http.Functional.Tests]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 6/21/2024 6:18:04 AM UTC

Report

Build    Definition      Test                                                 Pull Request
746167   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
746152   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
745393   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
744545   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
744526   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
743391   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
727276   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
721722   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
720898   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
720816   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
719424   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
716252   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution
713848   dotnet/runtime  System.Net.Http.Functional.Tests.WorkItemExecution

Summary

24-Hour Hit Count   7-Day Hit Count   1-Month Count
0                   6                 13
@matouskozak matouskozak added arch-arm32 area-System.Net.Http os-linux Linux OS (any supported distro) blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' untriaged New issue has not been triaged by the area owner Known Build Error Use this to report build issues in the .NET Helix tab labels Jun 19, 2024
Contributor

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@jkotas jkotas changed the title [Linux] CoreCLR Linux Arm failures inside System.Net.Http.Functional due to: Condition not met: "IsChromium" [Linux] CoreCLR Linux Arm failures in System.Net.Http.Functional.Tests.HttpMetricsTest Jun 19, 2024
@jkotas
Member

jkotas commented Jun 19, 2024

Condition(s) not met: "IsChromium"

This is not an error message. This message is in every test log. I have fixed the pattern.

@jkotas jkotas changed the title [Linux] CoreCLR Linux Arm failures in System.Net.Http.Functional.Tests.HttpMetricsTest [Linux] CoreCLR Linux Arm failures due System.Exception: Early EOF Jun 19, 2024
@MihaZupan
Member

exit code 139 means SIGSEGV Illegal memory access. Deref invalid pointer, overrunning buffer, stack overflow etc. Core dumped.

The console output looks like a crash.

The System.Exception: Early EOF does appear a lot, but those tests are passing, XUnit is just reporting the ITestOutputHelper text for all the tests now, not just the failing ones -- see #103445.
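
As a rough illustration of the exit code 139 quoted above: shells conventionally report a process killed by signal N as exit status 128 + N, and 139 - 128 = 11, which is SIGSEGV on Linux. The helper name below is hypothetical.

```python
import signal

def describe_exit_code(code):
    """Translate a shell exit status into a short description.

    Statuses above 128 are interpreted as 128 + signal number.
    """
    if code > 128:
        signum = code - 128
        try:
            return f"killed by {signal.Signals(signum).name}"
        except ValueError:
            return f"killed by unknown signal {signum}"
    return f"exited with status {code}"

print(describe_exit_code(139))
```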

@MihaZupan MihaZupan changed the title [Linux] CoreCLR Linux Arm failures due System.Exception: Early EOF [Linux] Segmentation fault in System.Net.Http.Functional.Tests Jun 19, 2024
@matouskozak
Member Author

> exit code 139 means SIGSEGV Illegal memory access. Deref invalid pointer, overrunning buffer, stack overflow etc. Core dumped.
>
> The console output looks like a crash.
>
> The System.Exception: Early EOF does appear a lot, but those tests are passing, XUnit is just reporting the ITestOutputHelper text for all the tests now, not just the failing ones -- see #103445.

There is another console output without the System.Exception: Early EOF.

@rzikm
Member

rzikm commented Jun 19, 2024

cc @liveans

@jkotas
Member

jkotas commented Jun 20, 2024

Analysis for build 712564:

Crash in libmsquic.so at RecvDataReturn. Null pointer passed to InterlockedDecrement.

(gdb) bt
#0  0xe861799c in InterlockedDecrement (Addend=0x60) at /__w/1/s/src/inc/quic_platform_posix.h:115
#1  RecvDataReturn (RecvDataChain=0x0) at /__w/1/s/src/platform/datapath_epoll.c:2067
#2  0xe860f836 in CxPlatRecvDataReturn (RecvDataChain=<optimized out>) at /__w/1/s/src/platform/datapath_linux.c:338
#3  0xe85e7204 in QuicConnRecvDatagrams (Connection=Connection@entry=0xf686f278, Packets=0x0, Packets@entry=0xe43948c8, PacketChainCount=PacketChainCount@entry=2,
    PacketChainByteCount=PacketChainByteCount@entry=2333, IsDeferred=<optimized out>, IsDeferred@entry=0 '\000') at /__w/1/s/src/core/connection.c:5745
#4  0xe85e76aa in QuicConnFlushRecv (Connection=Connection@entry=0xf686f278) at /__w/1/s/src/core/connection.c:5826
#5  0xe85e9a2a in QuicConnDrainOperations (Connection=Connection@entry=0xf686f278) at /__w/1/s/src/core/connection.c:7575
#6  0xe85d578c in QuicWorkerProcessConnection (Worker=Worker@entry=0xf68b94c0, Connection=0xf686f278, ThreadID=<optimized out>, TimeNow=TimeNow@entry=0xe166fd88) at /__w/1/s/src/core/worker.c:506
#7  0xe85d5d20 in QuicWorkerLoop (Context=0xf68b94c0, State=0xe166fd88) at /__w/1/s/src/core/worker.c:658
#8  0xe860d7bc in CxPlatRunExecutionContexts (State=0xe166fd88, Worker=<optimized out>) at /__w/1/s/src/platform/platform_worker.c:395
#9  CxPlatRunExecutionContexts (Worker=<optimized out>, State=0xe166fd88) at /__w/1/s/src/platform/platform_worker.c:369
#10 0xe860d948 in CxPlatWorkerThread (Context=0xf6863370) at /__w/1/s/src/platform/platform_worker.c:492
#11 0xf7745dd6 in start_thread (arg=0x3d63daff) at pthread_create.c:442
#12 0xf779c8a0 in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:74 from /lib/arm-linux-gnueabihf/libc.so.6
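
In the backtrace above, frame #1 shows RecvDataChain=0x0 while frame #0 shows Addend=0x60. That is consistent with taking the address of a field at offset 0x60 from a null base pointer. The sketch below reproduces only that arithmetic with ctypes. The structure, its padding, and the field name are hypothetical and do not reflect the actual msquic layout.

```python
import ctypes

class HypotheticalRecvBlock(ctypes.Structure):
    # Hypothetical layout: 0x60 bytes of other members, then a counter.
    _fields_ = [
        ("other_members", ctypes.c_ubyte * 0x60),
        ("ref_count", ctypes.c_long),
    ]

def field_address(base, struct_type, field_name):
    """Address of a field given the base address of its containing struct."""
    return base + getattr(struct_type, field_name).offset

# A null base pointer yields a small non-null field address.
addend = field_address(0x0, HypotheticalRecvBlock, "ref_count")
print(hex(addend))
```

This is why a null pointer dereference can show up in a debugger with a small non-zero address rather than 0x0.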

@ManickaP
Member

This is Arm32 again, isn't it? Similar issue: #103404.
cc @nibanks

@ManickaP ManickaP added this to the 9.0.0 milestone Jun 20, 2024
@dotnet-policy-service dotnet-policy-service bot removed the untriaged New issue has not been triaged by the area owner label Jun 20, 2024
@liveans liveans self-assigned this Jun 20, 2024
@liveans liveans added the disabled-test The test is disabled in source code against the issue label Jun 21, 2024
@karelz karelz added test-run-core Test failures in .NET Core test runs tracking-external-issue The issue is caused by external problem (e.g. OS) - nothing we can do to fix it directly labels Jun 25, 2024
@ManickaP ManickaP changed the title [Linux] Segmentation fault in System.Net.Http.Functional.Tests [QUIC] [Linux] Segmentation fault in System.Net.Http.Functional.Tests Jun 25, 2024
@ManickaP ManickaP assigned liveans and unassigned liveans Jun 25, 2024
@jkotas
Member

jkotas commented Jul 2, 2024

Hit in #104264

@liveans liveans removed their assignment Jul 18, 2024
@liveans
Member

liveans commented Jul 18, 2024

This issue, #103404, and #91757 will be fixed with the next release of msquic.

@liveans liveans removed the disabled-test The test is disabled in source code against the issue label Jul 19, 2024
@github-actions github-actions bot locked and limited conversation to collaborators Aug 19, 2024
7 participants