diff --git a/docs/fundamentals/networking/telemetry/media/aspire-http-connections-thumb.png b/docs/fundamentals/networking/telemetry/media/aspire-http-connections-thumb.png new file mode 100644 index 0000000000000..669be614508d9 Binary files /dev/null and b/docs/fundamentals/networking/telemetry/media/aspire-http-connections-thumb.png differ diff --git a/docs/fundamentals/networking/telemetry/media/aspire-http-connections.png b/docs/fundamentals/networking/telemetry/media/aspire-http-connections.png new file mode 100644 index 0000000000000..967bcd1f87f75 Binary files /dev/null and b/docs/fundamentals/networking/telemetry/media/aspire-http-connections.png differ diff --git a/docs/fundamentals/networking/telemetry/media/aspire-httpclient-get-thumb.png b/docs/fundamentals/networking/telemetry/media/aspire-httpclient-get-thumb.png new file mode 100644 index 0000000000000..61d44bda516a9 Binary files /dev/null and b/docs/fundamentals/networking/telemetry/media/aspire-httpclient-get-thumb.png differ diff --git a/docs/fundamentals/networking/telemetry/media/aspire-httpclient-get.png b/docs/fundamentals/networking/telemetry/media/aspire-httpclient-get.png new file mode 100644 index 0000000000000..7f41b8bd3f1ed Binary files /dev/null and b/docs/fundamentals/networking/telemetry/media/aspire-httpclient-get.png differ diff --git a/docs/fundamentals/networking/telemetry/metrics.md b/docs/fundamentals/networking/telemetry/metrics.md index 4cf7e0ec49b2e..b2f566aba3000 100644 --- a/docs/fundamentals/networking/telemetry/metrics.md +++ b/docs/fundamentals/networking/telemetry/metrics.md @@ -26,122 +26,48 @@ There are two parts to using metrics in a .NET app: This section demonstrates various methods to collect and view System.Net metrics. -### Example app +### .NET Aspire -For the sake of this tutorial, create a simple app that sends HTTP requests to various endpoints in parallel. 
+The simplest solution for collecting metrics for ASP.NET applications is to use [.NET Aspire](/dotnet/aspire/get-started/aspire-overview) which is a set of extensions to .NET to make it easy to create and work with distributed applications. One of the benefits of using .NET Aspire is that telemetry is built in, using the OpenTelemetry libraries for .NET. The default project templates for .NET Aspire contain a `ServiceDefaults` project, part of which is to setup and configure OTel. The Service Defaults project is referenced and initialized by each service in a .NET Aspire solution. -```dotnetcli -dotnet new console -o HelloBuiltinMetrics -cd ..\HelloBuiltinMetrics -``` - -Replace the contents of `Program.cs` with the following sample code: - -:::code language="csharp" source="snippets/metrics/Program.cs" id="snippet_ExampleApp"::: - -### View metrics with dotnet-counters - -[`dotnet-counters`](../../../core/diagnostics/dotnet-counters.md) is a cross-platform performance monitoring tool for ad-hoc health monitoring and first-level performance investigation. +The Service Defaults project template includes the OTel SDK, ASP.NET, HttpClient and Runtime Instrumentation packages, and those are configured in the [`Extensions.cs`](https://github.com/dotnet/aspire/blob/main/src/Aspire.ProjectTemplates/templates/aspire-servicedefaults/Extensions.cs) file. For exporting telemetry .NET Aspire includes the OTLP exporter by default so that it can provide telemetry visualization using the Aspire Dashboard. -```dotnetcli -dotnet tool install --global dotnet-counters -``` - -When running against a .NET 8+ process, `dotnet-counters` enables the instruments defined by the `--counters` argument and displays the measurements. 
It continuously refreshes the console with the latest numbers: - -```console -dotnet-counters monitor --counters System.Net.Http,System.Net.NameResolution -n HelloBuiltinMetrics -``` +The Aspire Dashboard is designed to bring telemetry observation to the local debug cycle, which enables developers to not only ensure that the applications are producing telemetry, but also use that telemetry to diagnose those applications locally. Being able to observe the calls between services is proving to be just as useful at debug time as in production. The .NET Aspire dashboard is launched automatically when you F5 the `AppHost` Project from Visual Studio or `dotnet run` the `AppHost` project. -### View metrics in Grafana with OpenTelemetry and Prometheus +[![Aspire Dashboard](../../../core/diagnostics/media/aspire-dashboard-metrics-thumb.png)](../../../core/diagnostics/media/aspire-dashboard-metrics.png#lightbox) -#### Overview +For more details on .NET Aspire see: -[OpenTelemetry](https://opentelemetry.io/): +- [Aspire Overview](/dotnet/aspire/get-started/aspire-overview) +- [Telemetry in Aspire](/dotnet/aspire/fundamentals/telemetry) +- [Aspire Dashboard](/dotnet/aspire/fundamentals/dashboard/explore) -- Is a vendor-neutral, open-source project supported by the [Cloud Native Computing Foundation](https://www.cncf.io/). -- Standardizes generating and collecting telemetry for cloud-native software. -- Works with .NET using the .NET metric APIs. -- Is endorsed by [Azure Monitor](/azure/azure-monitor/app/opentelemetry-overview) and many APM vendors. +### Reusing Service Defaults project without .NET Aspire Orchestration -This tutorial shows one of the integrations available for OpenTelemetry metrics using the OSS [Prometheus](https://prometheus.io/) and [Grafana](https://grafana.com/) projects. 
The metrics data flow consists of the following steps: +Probably the easiest way to configure OTel for ASP.NET projects is to use the Aspire Service Defaults project, even if not using the rest of .NET Aspire such as the AppHost for orchestration. The Service Defaults project is available as a project template via Visual Studio or `dotnet new`. It configures OTel and sets up the OTLP exporter. You can then use the [OTel environment variables](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Exporter.OpenTelemetryProtocol#exporter-configuration) to configure the OTLP endpoint to send telemetry to, and provide the resource properties for the application. -1. The .NET metric APIs record measurements from the example app. -1. The OpenTelemetry library running in the app aggregates the measurements. -1. The Prometheus exporter library makes the aggregated data available via an HTTP metrics endpoint. 'Exporter' is what OpenTelemetry calls the libraries that transmit telemetry to vendor-specific backends. -1. A Prometheus server: +The steps to use *ServiceDefaults* outside .NET Aspire are: - - Polls the metrics endpoint. - - Reads the data. - - Stores the data in a database for long-term persistence. Prometheus refers to reading and storing data as *scraping* an endpoint. - - Can run on a different machine. +- Add the *ServiceDefaults* project to the solution using Add New Project in Visual Studio, or use `dotnet new aspire-servicedefaults --output ServiceDefaults`. +- Reference the *ServiceDefaults* project from your ASP.NET application. In Visual Studio, use "Add -> Project Reference" and select the *ServiceDefaults* project. +- Call its OpenTelemetry setup function as part of your application builder initialization. -1. The Grafana server: +``` csharp +var builder = WebApplication.CreateBuilder(args); +builder.ConfigureOpenTelemetry(); - - Queries the data stored in Prometheus and displays it on a web-based monitoring dashboard. 
- - Can run on a different machine. +var app = builder.Build(); -#### Configure the example app to use OpenTelemetry's Prometheus exporter +app.MapGet("/", () => "Hello World!"); -Add a reference to the OpenTelemetry Prometheus exporter to the example app: - -```dotnetcli -dotnet add package OpenTelemetry.Exporter.Prometheus.HttpListener --prerelease +app.Run(); ``` -> [!NOTE] -> This tutorial uses a pre-release build of OpenTelemetry's Prometheus support available at the time of writing. - -Update `Program.cs` with OpenTelemetry configuration: - -:::code language="csharp" source="snippets/metrics/Program.cs" id="snippet_PrometheusExporter" highlight="5-8"::: - -In the preceding code: - -- `AddMeter("System.Net.Http", "System.Net.NameResolution")` configures OpenTelemetry to transmit all the metrics collected by the built-in `System.Net.Http` and `System.Net.NameResolution` meters. -- `AddPrometheusHttpListener` configures OpenTelemetry to expose Prometheus' metrics HTTP endpoint on port `9184`. - -> [!NOTE] -> This configuration differs for ASP.NET Core apps, where metrics are exported with `OpenTelemetry.Exporter.Prometheus.AspNetCore` instead of `HttpListener`. See the [related ASP.NET Core example](/aspnet/core/log-mon/metrics/metrics#create-the-starter-app). - -Run the app and leave it running so measurements can be collected: - -```dotnetcli -dotnet run -``` - -#### Set up and configure Prometheus - -Follow the [Prometheus first steps](https://prometheus.io/docs/introduction/first_steps/) to set up a Prometheus server and confirm it is working. - -Modify the *prometheus.yml* configuration file so that Prometheus scrapes the metrics endpoint that the example app is exposing. Add the following highlighted text in the `scrape_configs` section: - -:::code language="yaml" source="snippets/metrics/prometheus.yml" highlight="31-99"::: - -#### Start prometheus - -1. Reload the configuration or restart the Prometheus server. -1. 
Confirm that OpenTelemetryTest is in the UP state in the **Status** > **Targets** page of the Prometheus web portal. -![Prometheus status](~/docs/core/diagnostics/media/prometheus-status.png) - -1. On the Graph page of the Prometheus web portal, enter `http` in the expression text box and select `http_client_active_requests`. -![http_client_active_requests](~/docs/fundamentals/networking/telemetry/media/prometheus-search.png) - In the graph tab, Prometheus shows the value of the `http.client.active_requests` counter that's emitted by the example app. - ![Prometheus active requests graph](~/docs/fundamentals/networking/telemetry/media/prometheus-active-requests.png) - -#### Show metrics on a Grafana dashboard - -1. Follow the [standard instructions](https://prometheus.io/docs/visualization/grafana/#installing) to install Grafana and connect it to a Prometheus data source. - -1. Create a Grafana dashboard by selecting the **+** icon on the top toolbar then selecting **Dashboard**. In the dashboard editor that appears, enter **Open HTTP/1.1 Connections** in the **Title** box and the following query in the PromQL expression field: - -``` -sum by(http_connection_state) (http_client_open_connections{network_protocol_version="1.1"}) -``` +For a full walkthrough, see [Example: Use OpenTelemetry with OTLP and the standalone Aspire Dashboard](../../../core/diagnostics/observability-otlp-example.md). -![Grafana HTTP/1.1 Connections](~/docs/fundamentals/networking/telemetry/media/grafana-connections.png) +### Collecting metrics manually -1. Select **Apply** to save and view the new dashboard. It displays the number of active vs idle HTTP/1.1 connections in the pool. +For a walkthrough of how to collect metrics, as well as distributed traces without using Aspire Service Defaults, see [Example: Use OpenTelemetry with Prometheus, Grafana, and Jaeger](../../../core/diagnostics/observability-prgrja-example.md). 
## Enrichment diff --git a/docs/fundamentals/networking/telemetry/tracing.md b/docs/fundamentals/networking/telemetry/tracing.md new file mode 100644 index 0000000000000..f4db96c0466be --- /dev/null +++ b/docs/fundamentals/networking/telemetry/tracing.md @@ -0,0 +1,169 @@ +--- +title: Networking tracing +description: Learn how to consume .NET networking Tracing. +author: samsp-msft +ms.author: samsp +ms.date: 10/4/2024 +--- + +# Networking distributed traces in .NET + +[Distributed tracing](../../../core/diagnostics/distributed-tracing.md) is a diagnostic technique that helps engineers localize failures and performance issues within applications, especially those that may be distributed across multiple machines or processes. This technique tracks requests through an application correlating together work done by different application components and separating it from other work the application may be doing for concurrent requests. For example, a request to a typical web service might be first received by a load balancer, then forwarded to a web server process, which then makes several queries to a database. Using distributed tracing allows engineers to distinguish if any of those steps failed, how long each step took, and potentially logging messages produced by each step as it ran. + +The tracing system in .NET is designed to work with OpenTelemetry (OTel), and uses OTel to export the data to monitoring systems. Tracing in .NET is implemented using `System.Diagnostics.Activity` class along with `System.Diagnostics.ActivitySource` for collection, these correspond to `spans` in OTel. OpenTelemetry is defining an industry-wide standard for naming of tracing spans and their attributes, these are known as [semantic conventions](https://opentelemetry.io/docs/concepts/semantic-conventions). .NET telemetry is using the semantic conventions that have already been defined, and are working to add missing ones to the spec(s). 
+ +While the `System.Net` APIs create activities, they rely on [OpenTelemetry instrumentation libraries](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/glossary.md#instrumentation-library) to populate the `Activity` with the trace tags/attributes, primarily [`OpenTelemetry.Instrumentation.Http`](https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/main/src/OpenTelemetry.Instrumentation.Http). + +In .NET 9, we have started to move the functionality to emit the tags/attributes into the networking libraries, starting with the HTTP libraries. + +> [!TIP] +> For a comprehensive list of all built-in activities together with their tags/attributes, see [System.Net Tracing](../../../core/diagnostics/TBD). + +## Collect System.Net traces + +There are several parts to using distributed tracing in a .NET app: + +* **Instrumentation:** Code in .NET libraries creates an `ActivitySource` with a name, and then creates `Activity` objects to track work performed. The `Activity` objects are only created if there are listeners to the `ActivitySource`. +* **OpenTelemetry:** The OTel SDK listens to named `ActivitySource` instances and creates spans to represent the work tracked by each `Activity`. +* **Instrumentation Packages:** Work with the OTel SDK to add attributes to the `Activity` based on the work being performed, implementing the OTel semantic conventions. +* **Exporters:** Integrate the OTel SDK with specific monitoring systems such as OTLP (an OTel standard wire format), open-source monitoring solutions such as Jaeger or Zipkin, or commercial offerings such as Azure Monitor Application Insights. + +This section demonstrates various methods to collect and view System.Net traces. + +### .NET Aspire + +The simplest solution for collecting traces for ASP.NET applications is to use [.NET Aspire](/dotnet/aspire/get-started/aspire-overview) which is a set of extensions to .NET to make it easy to create and work with distributed applications. 
One of the benefits of using .NET Aspire is that telemetry is built in, using the OpenTelemetry libraries for .NET. The default project templates for .NET Aspire contain a `ServiceDefaults` project, part of which is to setup and configure OTel. The Service Defaults project is referenced and initialized by each service in a .NET Aspire solution. + +The Service Defaults project template includes the OTel SDK, ASP.NET, HttpClient and Runtime Instrumentation packages, and those are configured in the [`Extensions.cs`](https://github.com/dotnet/aspire/blob/main/src/Aspire.ProjectTemplates/templates/aspire-servicedefaults/Extensions.cs) file. For exporting telemetry .NET Aspire includes the OTLP exporter by default so that it can provide telemetry visualization using the Aspire Dashboard. + +The Aspire Dashboard is designed to bring telemetry observation to the local debug cycle, which enables developers to not only ensure that the applications are producing telemetry, but also use that telemetry to diagnose those applications locally. Being able to observe the calls between services is proving to be just as useful at debug time as in production. The .NET Aspire dashboard is launched automatically when you F5 the `AppHost` Project from Visual Studio or `dotnet run` the `AppHost` project. + +[![Aspire Dashboard](../../../core/diagnostics/media/aspire-dashboard-thumb.png)](../../../core/diagnostics/media/aspire-dashboard.png#lightbox) + +For more details on .NET Aspire see: + +- [Aspire Overview](/dotnet/aspire/get-started/aspire-overview) +- [Telemetry in Aspire](/dotnet/aspire/fundamentals/telemetry) +- [Aspire Dashboard](/dotnet/aspire/fundamentals/dashboard/explore) + +### Reusing Service Defaults project without .NET Aspire Orchestration + +Probably the easiest way to configure OTel for ASP.NET projects is to use the Aspire Service Defaults project, even if not using the rest of .NET Aspire such as the AppHost for orchestration. 
The Service Defaults project is available as a project template via Visual Studio or `dotnet new`. It configures OTel and sets up the OTLP exporter. You can then use the [OTel environment variables](https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/src/OpenTelemetry.Exporter.OpenTelemetryProtocol#exporter-configuration) to configure the OTLP endpoint to send telemetry to, and provide the resource properties for the application. + +The steps to use *ServiceDefaults* outside .NET Aspire are: + +- Add the *ServiceDefaults* project to the solution using Add New Project in Visual Studio, or use `dotnet new aspire-servicedefaults --output ServiceDefaults`. +- Reference the *ServiceDefaults* project from your ASP.NET application. In Visual Studio, use "Add -> Project Reference" and select the *ServiceDefaults* project. +- Call its OpenTelemetry setup function as part of your application builder initialization. + +``` csharp +var builder = WebApplication.CreateBuilder(args); +builder.ConfigureOpenTelemetry(); + +var app = builder.Build(); + +app.MapGet("/", () => "Hello World!"); + +app.Run(); +``` + +For a full walkthrough, see [Example: Use OpenTelemetry with OTLP and the standalone Aspire Dashboard](../../../core/diagnostics/observability-otlp-example.md). + + +### Collecting traces manually + +For a walkthrough of how to collect distributed traces and metrics without using Aspire Service Defaults, see [Example: Use OpenTelemetry with Prometheus, Grafana, and Jaeger](../../../core/diagnostics/observability-prgrja-example.md). 
+ +## Experimental connection spans in .NET 9 + +.NET 9 adds a handful of new spans for collecting detailed connection information: + +| Activity Source | Description | +| --- | --- | +| Experimental.System.Net.NameResolution | Tracks DNS resolution for anything using the .NET DNS APIs, such as HttpClient | +| Experimental.System.Net.Sockets | Tracks socket connection activity | +| Experimental.System.Net.Security | Tracks TLS handshake for inbound and outbound connections | +| Experimental.System.Net.Http.Connections | Tracks the connection pool for HttpClient | + +These spans are available starting with .NET 9. The `ActivitySource` names start with `Experimental` because these spans are not yet included in the OpenTelemetry semantic conventions, and they may change as we learn more about how well they work in production. + +These spans are probably too verbose for 24x7 use in production scenarios with high workloads. They are somewhat noisy, and this level of data is not normally needed. However, if you are trying to diagnose connection issues or get a deeper understanding of how network and connection latency is affecting your services, then they provide insight that is hard to collect by other means. + +Note: When enabled, the HTTP connection span is linked to from HttpClient request spans. As an HTTP connection can be long lived, many request spans can end up linking to the same connection span. Some APM monitoring tools aggressively walk links between spans to build up their views, so including this span may cause issues when the tools were not designed to account for large numbers of links. + +### Walkthrough: Using the experimental spans in .NET 9 + +This walkthrough uses an Aspire application such as the __.NET Aspire Starter App__. + +1. Modify each of the service projects to use .NET 9 by updating the `TargetFramework` to `net9.0` in each service's __.csproj__ file. 
The AppHost project does not need to be updated as it does not emit telemetry. +2. Similarly modify the `TargetFramework` to `net9.0` in the __ServiceDefaults__ project +3. Add the `ActivitySource` names to the initialization code in __Extensions.cs__ in the Service Defaults project: + +``` csharp + public static IHostApplicationBuilder ConfigureOpenTelemetry(this IHostApplicationBuilder builder) + { + builder.Logging.AddOpenTelemetry(logging => + { + logging.IncludeFormattedMessage = true; + logging.IncludeScopes = true; + }); + + builder.Services.AddOpenTelemetry() + .WithMetrics(metrics => + { + metrics.AddAspNetCoreInstrumentation() + .AddHttpClientInstrumentation() + .AddRuntimeInstrumentation(); + }) + .WithTracing(tracing => + { + tracing.AddAspNetCoreInstrumentation() + // Uncomment the following line to enable gRPC instrumentation (requires the OpenTelemetry.Instrumentation.GrpcNetClient package) + //.AddGrpcClientInstrumentation() + .AddSource("System.Net.Http") + .AddSource("Experimental.System.Net.NameResolution") + .AddSource("Experimental.System.Net.Sockets") + .AddSource("Experimental.System.Net.Security") + .AddSource("Experimental.System.Net.Http.Connections"); + // .AddHttpClientInstrumentation(); + }); + + builder.AddOpenTelemetryExporters(); + + return builder; + } +``` + +In this example, the `AddHttpClientInstrumentation()` has been replaced with the built-in instrumentation to HttpClient in .NET 9 using `.AddSource("System.Net.Http")`. + +When http requests are made with this instrumentation enabled, the HttpClient span will have the following changes: + +[![HttpClient Spans in Aspire Dashboard](media/aspire-httpclient-get-thumb.png)](media/aspire-httpclient-get.png#lightbox) + +- If a connection needs to be established, or waiting for a connection from the connection pool, then an additional __Http wait_for_connection__ span will be shown which represents the delay for waiting for a connection to be made. 
This helps to understand delays between the HttpClient request being made in code, and when the destination server actually receives and processes the request. In the picture above: + - The selected span is the HttpClient request. + - The one below it is the delay waiting for a connection to be established. + - The last span, in yellow, is from the destination processing the request. +- The HttpClient span will have a link to the HTTP connection setup span, which shows the activity to create the HTTP connection used by the request. + +[![Http Connection Spans in Aspire Dashboard](media/aspire-http-connections-thumb.png)](media/aspire-http-connections.png#lightbox) + +The HTTP connection setup span is a separate span with its own TraceId, as its lifetime is independent from each individual HttpClient request. Many HttpClient requests can be made over the same HTTP connection, and if one is already established and available (HTTP/1.1 supports sequential requests over the same connection, HTTP/2 and HTTP/3 enable parallel requests), then the request can reuse that connection. This span will have child spans for the DNS lookup, the TCP socket connection, and the TLS handshake, as applicable. + +## Extending Traces + +There are a couple of approaches that can be taken to augment the existing tracing functionality from System.Net. + +### Adding attributes to the Activity + +The instrumentation libraries augment the built-in support for tracing by adding tags/attributes to the Activity. To add your own tags, you need to do so before the Activity is completed. Techniques for accessing the Activity include: +* Using `Activity.Current` in code that is included in the scope of the Activity +* Using `DiagnosticSource` to get callbacks when networking activity is occurring - this is how the OTel Instrumentation Libraries are implemented. 
For an example, see [HttpHandlerDiagnosticListener.cs](https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/main/src/OpenTelemetry.Instrumentation.Http/Implementation/HttpHandlerDiagnosticListener.cs) in the [Http instrumentation library](https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/main/src/OpenTelemetry.Instrumentation.Http/README.md). + +### HttpClient Instrumentation Enrichment API + +Another approach, when using the Http instrumentation library, is to use the [Enrich HttpClient API](https://github.com/open-telemetry/opentelemetry-dotnet-contrib/blob/main/src/OpenTelemetry.Instrumentation.Http/README.md#enrich-httpclient-api). + +## Need more tracing? + +If you have suggestions for other useful information that could be exposed via tracing, create a [dotnet/runtime issue](https://github.com/dotnet/runtime/issues/new).