Disable stacktrace collection per EventPipeSession #3696

ezsilmar · 2023-02-28T17:08:00Z

Background

Currently, all events emitted from the managed space by EventSource will do a stackwalk, see eventpipeinternal.cpp:121. The obligatory stackwalk leads to significant performance penalty for frequent events while in some cases the stacktrace is not required.

As an example, I'm working on the association between CPU samples from perf record and spans of System.Diagnostics.Activity. The idea is to bring the profiling data on the application level, showing the cpu usage grouped by activity name. To make it work, I record all the activities with Microsoft-Diagnostics-DiagnosticSource and track TraceSyncrhonousWorkStart and TraceOperationStart events from System.Threading.Tasks.TplEventSource to follow the asynchronous flow.

With the stacktraces, the idea is not viable because the application spends more than half of the cpu time doing stackwalks. By disabling the stacktrace recording (just put false in the file linked above), the overhead drops down to around 3% showing really promising results.

Proposed Feature

I propose to introduce the bool collectStacks parameter when creating an EventPipe session. The change is required both in the runtime implementation and the DiagnosticsClient. For managed events the effect of session.collectStacks would be straightforward as there's no way to control it in EventSource definition. For the native we could check session.collectStacks && event.needStack to respect ClrEtwAllMeta.lst

Why I think introducing this parameter on the session level is a good idea:

For the same event, the importance of the stacktrace depends on the usecase. Thus the user should be able to choose
It seems too complex for me to propagate the flag per provider, even more complex per keyword or per event. On the other hand disabling the stacks with the environment variable (per process) will severely limit other usecases.
If we need the events both with and without the stack at the same time, we could start 2 parallel sessions. Not very convenient but not too much trouble either.
It's relatively easy to support backward compatibility as we don't change the defaults and the parameter could be optional

Usage Examples

Starting the session:

var session = client.StartEventPipeSession(providers, collectStacks = false);

The text was updated successfully, but these errors were encountered:

tommcdon · 2023-03-01T01:42:06Z

@davmason @noahfalk

ezsilmar · 2023-03-02T11:43:09Z

I've just realized that RequestRundown could be the thing I need. I'll test it and close the issue if it's the case.

davmason · 2023-03-02T21:23:40Z

Hi @ezsilmar,

RequestRundown won't have any affect on stack collection. There is not currently a way to disable stack collection and it causes pain for some scenarios, we are in the process of adding options to disable stack collection. PR dotnet/runtime#82470 is about to be merged, and long term I would like to do the approach you describe where we can disable stack collection per session.

ezsilmar · 2023-03-03T11:21:28Z

Hi @davmason, thanks for the clarification!

There's not much info on rundown events on the internet, as far as I understand it it's basically symbol information?

I was confused with RequestRundown because of these lines:

 if (event.NeedStack() && !session.RundownEnabled() && pStack == NULL)
 {
     EventPipe::WalkManagedStackForCurrentThread(s);
     pStack = &s;
 }

I read them incorrectly at first, assuming that we only collect stacks when we emit Rundown which would make sense to me. However I'm a bit lost as the logic is opposite, meaning probably there are two different ways to do a stackwalk? I couldn't find it in the code.

Anyway, I'm interested in contributing to this subject as I'm going to implement something fast for our custom-built runtime in the coming days.

davmason · 2023-03-03T12:02:30Z

Rundown is something we do to make sure everything in a trace can be resolved. Jitted methods are dynamically generated at runtime so their IPs are not contained in the symbol file, and we also can potentially have dynamically generated types, in memory modules, and other things that are not known in the symbol file.

When rundown is triggered we go through all the dynamically generated things and emit the proper information so anyone reading the trace can properly map everything. It is effectively iterating through all the method bodies, types, etc and emitting an event for each one.

We turn off stack information for rundown events as an optimization because it's not relevant, there is no reason for anyone to care what stack is emitting rundown events since it is always the same one.

ezsilmar · 2023-03-03T13:16:54Z

Ah ok I see there's rundown_requested field that is set at session startup but then there's rundown_enabled that is set to true at the end of the session. So Rundown provider doesn't rely on event.need_stack to exclude the stacks but on the separate check in the code.

Client-side of dotnet/runtime#84077 and the implementation of #3696. To simplify the interface I made `EventPipeSessionConfiguration` public and introduced a new method in the DiagnosticsClient: `Task<EventPipeSession> StartEventPipeSessionAsync(EventPipeSessionConfiguration configuration, CancellationToken token)`. This is the only method that supports disabling the stackwalk so no additional overloads with a new bool parameter and no synchronous counterpart. I believe it'd be easier to use and maintain a single async method with the options rather than creating more overloads or default parameters but I may not have all the context here so please correct me if you think it's a bad idea. To deal with the backward compatibility I only use `CollectTracingV3` when necessary i.e. when `RequestStackwalk` option is set to false. I think it's a good compromise between the added complexity and potentially surprising behavior: * when the client is old and the runtime is new everything works because the runtime supports `CollectTracingV2` * when the client is new but the runtime is old everything works until the new option is used. When it's used the session won't start as `CollectTracingV3` doesn't exist server side: there'd be no clear error message but it's documented in the option summary. * when both the client and the runtime are new either `CollectTracingV2` or `CollectTracingV3` may be used transparently for the user * we may use the same trick when we introduce `CollectTracingV4` The alternative is to implement version negotiation of some sort but I'd like to have your opinion before attempting this as handling the errors correctly wouldn't be easy (f.e. in [my current fork](criteo-forks@3946b4a#diff-e8365039cd36eae3dec611784fc7076be7dadeda1007733412aaaa63f40a748fR39) I just hide the exception) The testing turned out to be a bit complex as I needed to convert EventPipe stream to `TraceLog` to be able to read the stacktraces. I couldn't achieve that without writing data to a file. Afaiu the stackwalk may not work correctly without the rundown that only happens at the end of the session so I wonder if looking at the stacktraces with a live session is even possible (though iirc netfw+ETW could do that back in the days) ? Thanks for your time !

ezsilmar · 2024-03-26T14:04:56Z

I've just realized this should be closed as completed as the implementation was merged.

ezsilmar added the enhancement New feature or request label Feb 28, 2023

tommcdon added this to the 8.0.0 milestone Mar 21, 2023

tommcdon added the Microsoft.Diagnostics.NETCore.Client label Mar 21, 2023

ezsilmar mentioned this issue Mar 29, 2023

Ability to request stackwalk per EventPipe session dotnet/runtime#84077

Merged

ezsilmar mentioned this issue Oct 3, 2023

Add RequestStackwalk parameter to EventPipeSession #4290

Merged

ezsilmar closed this as completed Mar 26, 2024

github-actions bot locked and limited conversation to collaborators Apr 27, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disable stacktrace collection per EventPipeSession #3696

Disable stacktrace collection per EventPipeSession #3696

ezsilmar commented Feb 28, 2023 •

edited

Loading

tommcdon commented Mar 1, 2023

ezsilmar commented Mar 2, 2023

davmason commented Mar 2, 2023

ezsilmar commented Mar 3, 2023

davmason commented Mar 3, 2023

ezsilmar commented Mar 3, 2023

ezsilmar commented Mar 26, 2024

Disable stacktrace collection per EventPipeSession #3696

Disable stacktrace collection per EventPipeSession #3696

Comments

ezsilmar commented Feb 28, 2023 • edited Loading

Background

Proposed Feature

Usage Examples

tommcdon commented Mar 1, 2023

ezsilmar commented Mar 2, 2023

davmason commented Mar 2, 2023

ezsilmar commented Mar 3, 2023

davmason commented Mar 3, 2023

ezsilmar commented Mar 3, 2023

ezsilmar commented Mar 26, 2024

ezsilmar commented Feb 28, 2023 •

edited

Loading