From c14b2abe6cb8a6946c7309fee1fdb2730e3b6ddf Mon Sep 17 00:00:00 2001
From: Kim Nylander <104772500+knylander-grafana@users.noreply.github.com>
Date: Tue, 3 Sep 2024 22:53:19 -0400
Subject: [PATCH] [DOC] Release notes for 2.6 (#4048)

---
 docs/sources/tempo/release-notes/v2-6.md | 247 +++++++++++++++++++++++
 docs/sources/tempo/setup/upgrade.md      | 129 +++++++++++-
 2 files changed, 373 insertions(+), 3 deletions(-)
 create mode 100644 docs/sources/tempo/release-notes/v2-6.md

diff --git a/docs/sources/tempo/release-notes/v2-6.md b/docs/sources/tempo/release-notes/v2-6.md
new file mode 100644
index 00000000000..6f82f6eeca9
--- /dev/null
+++ b/docs/sources/tempo/release-notes/v2-6.md
@@ -0,0 +1,247 @@
+---
+title: Version 2.6 release notes
+menuTitle: V2.6
+description: Release notes for Grafana Tempo 2.6
+weight: 30
+---
+
+# Version 2.6 release notes
+
+The Tempo team is pleased to announce the release of Tempo 2.6.
+
+This release gives you:
+
+* Additions to the TraceQL language, including the ability to search by span events, links, and arrays
+* Additions to TraceQL metrics query types, including a `compare()` function and instant queries (which return faster than range queries)
+* Performance and stability enhancements
+
+These release notes highlight the most important features and bugfixes. For a complete list, refer to the [Tempo changelog](https://github.com/grafana/tempo/releases).
+
+## Features and enhancements
+
+The most important features and enhancements in Tempo 2.6 are highlighted below.
+
+### Additional TraceQL metrics (experimental)
+
+Tempo 2.6 adds several capabilities to [TraceQL metrics](https://grafana.com/docs/tempo/latest/operations/traceql-metrics/):
+
+* Exemplars [[PR 3824](https://github.com/grafana/tempo/pull/3824), [documentation](https://grafana.com/docs/tempo/next/traceql/metrics-queries/#exemplars)]
+* Instant metrics queries using `/api/metrics/query` [[PR 3859](https://github.com/grafana/tempo/pull/3859), [documentation](https://grafana.com/docs/tempo/next/api_docs/#traceql-metrics)]
+* A `q` parameter for tag-name filtering in the search v2 API [[PR 3822](https://github.com/grafana/tempo/pull/3822), [documentation](https://grafana.com/docs/tempo/next/api_docs/#traceql-metrics)]
+* A new `compare()` metrics function [[PR 3695](https://github.com/grafana/tempo/pull/3695), [documentation](https://grafana.com/docs/tempo/latest/traceql/metrics-queries/#functions)]
+
+Additionally, we're working on refactoring the replication factor. Refer to the [Operational change for TraceQL metrics](#operational-change-for-traceql-metrics) section for details.
+
+Note that using TraceQL metrics may require additional system resources.
+
+For more information, refer to the [TraceQL metrics queries](https://grafana.com/docs/tempo/latest/traceql/metrics-queries/) and [Configure TraceQL metrics](https://grafana.com/docs/tempo/latest/operations/traceql-metrics/) documentation.
+
+### TraceQL improvements
+
+Unique to Tempo, TraceQL is a query language that lets you perform custom queries into your tracing data. To learn more about the TraceQL syntax, refer to the [TraceQL documentation](https://grafana.com/docs/tempo/latest/traceql/).
+
+We’ve added `event` and `link` scopes. Like spans, both have intrinsics and attributes.
+
+The `event` scope lets you query events that happen within a span. A span event is a unique point in time during the span’s duration. While spans help build the structural hierarchy of your services, span events can provide a deeper level of granularity to help debug your application faster and maintain optimal performance. To learn more about how you can use span events, read the [What are span events?](https://grafana.com/blog/2024/08/15/all-about-span-events-what-they-are-and-how-to-query-them/) blog post. [PRs [3708](https://github.com/grafana/tempo/pull/3708), [3748](https://github.com/grafana/tempo/pull/3748), [3908](https://github.com/grafana/tempo/pull/3908)]
+
+If you've instrumented your traces for span links, you can use the `link` scope to search for an attribute within a span link. A span link associates one span with one or more other spans. [PRs [3814](https://github.com/grafana/tempo/pull/3814), [3741](https://github.com/grafana/tempo/pull/3741)]
+
+For more information on span links, refer to the [Span Links](https://opentelemetry.io/docs/concepts/signals/traces/#span-links) documentation in the OpenTelemetry project.
+
+You can search for an attribute in your link:
+
+```
+{ link.opentracing.ref_type = "child_of" }
+```
+
+![A TraceQL example showing `link` scope](/media/docs/grafana/data-sources/tempo/query-editor/traceql-link-example.png)
+
+We’ve also added autocomplete support for `events` and `links`. [[PR 3846](https://github.com/grafana/tempo/pull/3846)]
+
+Tempo 2.6 improves TraceQL performance with these updates:
+
+* Performance improvement for `rate() by ()` queries [[PR 3719](https://github.com/grafana/tempo/pull/3719)]
+* Add caching to query range queries [[PR 3796](https://github.com/grafana/tempo/pull/3796)]
+* Only stream diffs on metrics queries [[PR 3808](https://github.com/grafana/tempo/pull/3808)]
+* Tag value lookup uses protobuf internally for improved latency [[PR 3731](https://github.com/grafana/tempo/pull/3731)]
+* TraceQL metrics queries use protobuf internally for improved latency [[PR 3745](https://github.com/grafana/tempo/pull/3745)]
+* TraceQL search and other endpoints use protobuf internally for improved latency and resource usage [[PR 3944](https://github.com/grafana/tempo/pull/3944)]
+* Add local disk caching of metrics queries in local-blocks processor [[PR 3799](https://github.com/grafana/tempo/pull/3799)]
+* Performance improvement for queries using trace-level intrinsics [[PR 3920](https://github.com/grafana/tempo/pull/3920)]
+* Use multiple goroutines to unmarshal responses in parallel in the query frontend [[PR 3713](https://github.com/grafana/tempo/pull/3713)]
+
+### Native histogram support
+
+The metrics-generator can produce native histograms for high-resolution data. [[PR 3789](https://github.com/grafana/tempo/pull/3789)]
+
+Native histograms are a data type in Prometheus that can produce, store, and query high-resolution histograms of observations. They usually offer higher resolution and more straightforward instrumentation than classic histograms.
+
+To learn more, refer to the [Native histogram](https://grafana.com/docs/tempo/<TEMPO_VERSION>/metrics-generator/#native-histograms) documentation.
+
+### Performance improvements
+
+One of our major improvements in Tempo 2.6 is the reduction of memory usage due to polling improvements. [PRs [3950](https://github.com/grafana/tempo/pull/3950), [3951](https://github.com/grafana/tempo/pull/3951), [3952](https://github.com/grafana/tempo/pull/3952)]
+
+![Comparison graph showing the reduction of memory usage due to the polling improvements](/media/docs/tempo/tempo-2-6-poll-improvement.png)
+
+This improvement is a result of changes that include:
+
+* Add data quality metric to measure traces without a root [[PR 3812](https://github.com/grafana/tempo/pull/3812)]
+* Reduce memory consumption of query-frontend [[PR 3888](https://github.com/grafana/tempo/pull/3888)]
+* Reduce allocs of caching middleware [[PR 3976](https://github.com/grafana/tempo/pull/3976)]
+* Reduce allocs building queriers sharded requests [[PR 3932](https://github.com/grafana/tempo/pull/3932)]
+* Improve trace id lookup from Tempo Vulture by selecting a date range [[PR 3874](https://github.com/grafana/tempo/pull/3874)]
+
+### Other enhancements and improvements
+
+This release also has these notable updates:
+
+* Bring back OTel receiver metrics. [[PR 3917](https://github.com/grafana/tempo/pull/3917)]
+* Add a `q` parameter to `/api/v2/search/tags` for tag name filtering. [[PR 3822](https://github.com/grafana/tempo/pull/3822)]
+* Add middleware to block matching URLs. [[PR 3963](https://github.com/grafana/tempo/pull/3963)]
+* Add data quality metric to measure traces without a root. [[PR 3812](https://github.com/grafana/tempo/pull/3812)]
+* Implement polling tenants concurrently. [[PR 3647](https://github.com/grafana/tempo/pull/3647)]
+* Add [native histograms](https://grafana.com/docs/tempo/next/metrics-generator/#native-histograms) for internal metrics. [[PR 3870](https://github.com/grafana/tempo/pull/3870)]
+* Add a Tempo CLI command to drop traces by id by rewriting blocks. [[PR 3856](https://github.com/grafana/tempo/pull/3856), [documentation](https://grafana.com/docs/tempo/next/operations/tempo_cli/#drop-trace-by-id)]
+* Add new OTel compatible Traces API V2. [[PR 3912](https://github.com/grafana/tempo/pull/3912), [documentation](https://grafana.com/docs/tempo/next/api_docs/#query-v2)]
+* Rename `Batches` to `ResourceSpans`. [[PR 3895](https://github.com/grafana/tempo/pull/3895)]
+
+## Upgrade considerations
+
+When [upgrading](https://grafana.com/docs/tempo/latest/setup/upgrade/) to Tempo 2.6, be aware of these considerations and breaking changes.
+
+### Operational change for TraceQL metrics
+
+We've changed to an RF1 (Replication Factor 1) pattern for TraceQL metrics as we were unable to hit performance goals for RF3 de-duplication. This requires some operational changes to query TraceQL metrics.
+
+TraceQL metrics are still considered experimental. We hope to mark them GA soon when we productionize a complete RF1 write-read path. [PRs [3628](https://github.com/grafana/tempo/pull/3628), [3691](https://github.com/grafana/tempo/pull/3691), [3723](https://github.com/grafana/tempo/pull/3723), [3995](https://github.com/grafana/tempo/pull/3995)]
+
+**For recent data**
+
+The local-blocks processor must be enabled to start using metrics queries like `{ } | rate()`. If it isn't enabled, metrics queries fail with the error `localblocks processor not found`. You can enable the local-blocks processor either per tenant or for all tenants.
+
+* Per-tenant in the per-tenant overrides:
+
+  ```yaml
+  overrides:
+    'tenantID':
+      metrics_generator_processors:
+        - local-blocks
+  ```
+
+* By default, for all tenants in the main config:
+
+  ```yaml
+  overrides:
+    defaults:
+      metrics_generator:
+        processors: [local-blocks]
+  ```
+
+Add this configuration to run TraceQL metrics queries against all spans (and not just server spans):
+
+```yaml
+metrics_generator:
+  processor:
+    local_blocks:
+      filter_server_spans: false
+```
+
+**For historical data**
+
+To run metrics queries on historical data, you must configure the local-blocks processor to flush RF1 blocks to object storage:
+
+```yaml
+metrics_generator:
+  processor:
+    local_blocks:
+      flush_to_storage: true
+```
+
+### Transition to vParquet4
+
+vParquet4 format is now the default block format.
+It's production ready and we highly recommend switching to it for improved query performance. [PR [3810](https://github.com/grafana/tempo/pull/3810)]
+
+Upgrading to Tempo 2.6 modifies the Parquet block format.
+Although you can use Tempo 2.6 with vParquet2 or vParquet3, you can only use Tempo 2.6 with vParquet3.
+
+You can also use the `tempo-cli analyse blocks` command to query vParquet4 blocks. [[PR 3868](https://github.com/grafana/tempo/pull/3868)]
+Refer to the [Tempo CLI](https://grafana.com/docs/tempo/next/operations/tempo_cli/#analyse-blocks) documentation for more information.
+
+For information on upgrading, refer to [Upgrade to Tempo 2.6](https://grafana.com/docs/tempo/next/setup/upgrade/) and [Choose a different block format](https://grafana.com/docs/tempo/next/configuration/parquet/#choose-a-different-block-format).
+
+### Updated, removed, or renamed configuration parameters
+
+| Parameter | Comments |
+| --- | --- |
+| `storage:` | Removed. Azure v2 is the only and primary Azure backend [PR 3875] |
+| `autocomplete_filtering_enabled` | The feature flag option has been removed. The feature is always enabled. [PR 3729] |
+| `completedfilepath` and `blocksfilepath` | Removed unused WAL configuration options. [PR 3911] |
+| `compaction_disabled` | New. Allow compaction disablement per-tenant. [PR 3965, documentation] |
+| `Storage:` | Boolean flag to activate or deactivate dualstack mode on the Storage block configuration for S3. [PR 3721, documentation] |
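
Before upgrading, it can help to check an existing configuration for the removed parameters listed above. The following is a minimal sketch, not part of Tempo or its CLI: the function name `find_removed_parameters` and the sample fragment are hypothetical, the scan is a plain text match rather than a YAML parse, and the two `storage:` entries are left out because their full key paths are not shown in the table.

```python
import re

# Parameter names taken from the table above. The two `storage:` entries are
# omitted because their full key paths are not shown there.
REMOVED_PARAMETERS = (
    "autocomplete_filtering_enabled",
    "completedfilepath",
    "blocksfilepath",
)


def find_removed_parameters(config_text):
    """Return (line_number, parameter) pairs for removed keys found in config_text."""
    findings = []
    for line_number, line in enumerate(config_text.splitlines(), start=1):
        # Naive comment handling: drop everything after the first '#'.
        # This would also cut a '#' that appears inside a quoted value.
        content = line.split("#", 1)[0]
        for parameter in REMOVED_PARAMETERS:
            # Match the name only when it is used as a key ("name:").
            if re.search(rf"(^|\s){re.escape(parameter)}\s*:", content):
                findings.append((line_number, parameter))
    return findings


if __name__ == "__main__":
    # Hypothetical fragment for illustration only. The section names are
    # placeholders, not real Tempo configuration paths.
    sample = """\
example_section:
  completedfilepath: /var/tempo/completed
another_section:
  autocomplete_filtering_enabled: true
"""
    for line_number, parameter in find_removed_parameters(sample):
        print(f"line {line_number}: '{parameter}' was removed in Tempo 2.6")
```

A text match keeps the sketch dependency-free. For a real deployment, parsing the YAML and walking the key paths would avoid false positives from values or multi-line strings.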