JsonSerializer.Deserialize(ROS<byte>, JsonTypeInfo<T>) hangs with AllowOutOfOrderMetadataProperties on Amazon Linux 2023 with native AoT #107740
Comments
Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis |
Looks like a tricky one. Does the hang repro consistently with the same JSON payload that you shared here? Are you able to insert any tracing that might pinpoint where exactly the hang is happening? What is the smallest possible polymorphic type that reproduces the hang? |
I have a Graviton / Amazon Linux instance I can test this on. Any tips or commands I should run to try and repro? Even if it isn't minimal, I can try to whittle it down from there. |
The hangs come through invoking the Lambda directly through the AWS Lambda APIs (to bypass the need to simulate an Alexa device) and are from these payloads in the test project. Each one seems to have the same issue and precipitate the hang.
The best attempt at tracing I've gotten to are these two The first one outputs to CloudWatch, and the second one isn't ever seen. There's just a Here's an example of the logs I see for a single invocation:
Unfortunately X-Ray/OTel doesn't give me any granularity to get deeper.
I'm just about to try and see if I can put something together using the Amazon Linux 2023 docker image. |
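The bracketing approach described above (one log line before the suspect call, one after) can be made a little more informative with a timeout. The following is only a sketch of that idea: `HangProbe` and `RunWithWatchdog` are hypothetical names, not part of any library or of the project under discussion. It uses only `Console.WriteLine` and `Task.Wait`, so it needs nothing beyond what the Lambda environment already captures.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper: runs a delegate on a worker thread and logs if it does
// not finish within the timeout, so a hang shows up in the logs explicitly
// instead of as a missing "after" message.
public static class HangProbe
{
    public static T RunWithWatchdog<T>(string label, Func<T> action, TimeSpan timeout)
    {
        Console.WriteLine($"[{label}] starting");

        Task<T> work = Task.Run(action);

        // Wait returns false if the timeout elapses first. If the delegate
        // throws, Wait rethrows the failure wrapped in an AggregateException.
        if (!work.Wait(timeout))
        {
            Console.WriteLine($"[{label}] still running after {timeout}");
            throw new TimeoutException($"{label} did not complete within {timeout}.");
        }

        Console.WriteLine($"[{label}] completed");
        return work.Result;
    }
}
```

Because a `ReadOnlySpan<byte>` cannot be captured by a lambda, a call site would need to pass the payload as a `byte[]` and create the span inside the delegate. Keeping the timeout below the Lambda timeout gives the log line a chance to be flushed before the invocation is killed.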
As a starting point I guess run the end to end test project - if it's as simple as "Graviton bad" then maybe the tests will just hang/explode immediately. Failing that, you could try hacking apart FunctionEntrypoint to instead of creating the bootstrap and then calling |
This may be duplicate of #107347 . cc @MichalStrehovsky Can you try to add:
to your csproj and check whether it repros? It will reference the latest nightly build of dotnet/runtime. The SDK is always a few days behind and the SDK that you are testing with does not have the fix that I have linked yet. |
Sure I'll try that now. |
Just testing now (it's looking promising), but for reference I put together a Dockerfile to run the native AoT tests I have on arm64 for Amazon Linux 2023, which @hwoodiwiss kindly ran for me on an arm64 machine as my Windows version of Docker wasn't having a good time of it, and that didn't repro the issue I get when deployed either:
FROM --platform=arm64 amazonlinux:latest
RUN dnf install findutils gzip libicu krb5-libs openssl-libs tar zlib -y
COPY . /source
WORKDIR /source
RUN curl -sSL https://dot.net/v1/dotnet-install.sh | bash /dev/stdin --jsonfile ./global.json --install-dir $HOME/.dotnet
RUN --mount=type=cache,id=nuget,target=/root/.nuget/packages \
    export DOTNET_ROOT=$HOME/.dotnet \
    && export PATH=$PATH:$DOTNET_ROOT:$DOTNET_ROOT/tools \
    && dotnet run --project test/LondonTravel.Skill.NativeAotTests
Yep, thanks @jkotas, adding these two packages resolved the issue:
<PackageReference Include="Microsoft.Dotnet.ILCompiler" Version="9.0.0-rc.2.24461.16" />
<PackageReference Include="runtime.linux-x64.Microsoft.DotNet.ILCompiler" Version="9.0.0-rc.2.24461.16" />
Now the tests pass as expected: martincostello/alexa-london-travel#1434 (comment). Looks like this was a duplicate of #107300. |
Thanks, closing as duplicate of #107300. |
Confirmed the issue is resolved using |
Description
In trying to deploy a native AoT application that uses JSON polymorphism with AllowOutOfOrderMetadataProperties=true to AWS Lambda on Amazon Linux 2023, it appears that the code hangs when trying to deserialize a payload. Requests time out after 10 seconds (the Lambda timeout), but I've increased this as high as 2 minutes and the code still times out. Due to the limited ability to reach into the execution environment for AWS Lambda, most of my observations have been Console.WriteLine()-based, noting which log messages are or aren't reached.
The debugging trial and error can be observed here: martincostello/alexa-london-travel#1434
The use of AllowOutOfOrderMetadataProperties is needed as the payloads I wish to deserialize are AWS-owned events for Alexa devices where the type discriminator is not the first property of the JSON documents, and this is not something that can be changed.
I haven't yet been able to create a minimal repro for this, but these are the points of interest I've deduced through lots of trial and error.
AllowOutOfOrderMetadataProperties=false causes the serializer to throw, as expected, so the issue seems specific to enabling this feature.
9.0.100-rc.2.24462.3 of the SDK).
I also did a bit of digging around whether this was an issue with an exception happening, and then its logging causing a hang when writing to the console (see microsoft/testfx#3485 (comment)), but I think I've ruled that out by messing around with turning logging off and other things (unless it is that, but the fix is incomplete).
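For readers who don't want to dig through the linked commit, the general shape of the scenario described above can be sketched as follows. This is an illustration only, not the code from the repository and not a confirmed repro of the hang: the `Request`, `LaunchRequest`, `IntentRequest`, and `SketchJsonContext` types, the property names, and the payload are all hypothetical. It assumes the .NET 9 SDK, where `AllowOutOfOrderMetadataProperties` is available on `JsonSourceGenerationOptions`.

```csharp
#nullable enable
using System;
using System.Text;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical polymorphic hierarchy; the real Alexa request types differ.
[JsonPolymorphic(TypeDiscriminatorPropertyName = "type")]
[JsonDerivedType(typeof(LaunchRequest), "LaunchRequest")]
[JsonDerivedType(typeof(IntentRequest), "IntentRequest")]
public abstract class Request
{
    [JsonPropertyName("requestId")]
    public string? RequestId { get; set; }
}

public sealed class LaunchRequest : Request
{
}

public sealed class IntentRequest : Request
{
    [JsonPropertyName("intentName")]
    public string? IntentName { get; set; }
}

// Source-generated context, as native AoT requires. The option lets the
// "type" discriminator appear anywhere in the JSON object.
[JsonSourceGenerationOptions(AllowOutOfOrderMetadataProperties = true)]
[JsonSerializable(typeof(Request))]
public partial class SketchJsonContext : JsonSerializerContext
{
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical payload: the discriminator is deliberately last.
        ReadOnlySpan<byte> utf8 = Encoding.UTF8.GetBytes(
            "{\"requestId\":\"abc\",\"intentName\":\"Status\",\"type\":\"IntentRequest\"}");

        // The overload named in the issue title:
        // Deserialize<T>(ReadOnlySpan<byte>, JsonTypeInfo<T>).
        Request? request = JsonSerializer.Deserialize(utf8, SketchJsonContext.Default.Request);

        Console.WriteLine(request?.GetType().Name);
    }
}
```

With the option left at its default of `false`, the same payload would be expected to throw, which matches the observation above that disabling the feature makes the serializer throw rather than hang.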
Reproduction Steps
Currently all I can offer is this commit for inspection: martincostello/alexa-london-travel@d8ced7f
I'll try to distil things down to a more minimal repro, but if it's specific to Amazon Linux 2023 and/or Graviton, that might be tricky.
Here's an example payload:
Expected behavior
The JSON payloads are (de)serialized correctly as they are when native AoT is not used.
Actual behavior
The code hangs. The AWS Lambda runtime doesn't appear to be emitting any logs and telemetry that suggest the runtime is crashing.
Regression?
It was working at some point during the .NET 9 development cycle, as once I got out-of-order metadata polymorphism, I ran into #105034. Some time after that I used testfx's AoT support as a way to get feedback on native AoT issues, so deploying to the real environment to validate things became less necessary.
Known Workarounds
Don't use JSON polymorphism.
Configuration
9.0.100-rc.1.24452.12 and 9.0.100-rc.2.24462.3
Other information
No response