Add support for multi-tenant queries in streaming search #1

Closed
wants to merge 7 commits into from
3 changes: 2 additions & 1 deletion .github/workflows/ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -46,7 +46,8 @@ jobs:
- name: Lint
uses: golangci/golangci-lint-action@v3
with:
version: v1.53.3
version: v1.55.2
only-new-issues: true

unit-tests-pkg:
name: Test packages - pkg
Expand Down
4 changes: 3 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -16,4 +16,6 @@
/tempo-query
/tempo-vulture
/tempodb/encoding/benchmark_block
private-key.key
private-key.key
integration/e2e/e2e_integration_test[0-9]*
integration/e2e/metrics_*_dump.txt
6 changes: 5 additions & 1 deletion CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,12 +1,14 @@
## main / unreleased

* [FEATURE] Add support for multi-tenant queries. [#3087](https://github.com/grafana/tempo/pull/3087) (@electron0zero)
* [BUGFIX] Change exit code if config is successfully verified [#3174](https://github.com/grafana/tempo/pull/3174) (@am3o @agrib-01)
* [BUGFIX] The tempo-cli analyse blocks command no longer fails on compacted blocks [#3183](https://github.com/grafana/tempo/pull/3183) (@stoewer)
* [BUGFIX] Move waitgroup handling for poller error condition [#3224](https://github.com/grafana/tempo/pull/3224) (@zalegrala)
* [ENHANCEMENT] Introduced `AttributePolicyMatch` & `IntrinsicPolicyMatch` structures to match span attributes based on strongly typed values & precompiled regexp [#3025](https://github.com/grafana/tempo/pull/3025) (@andriusluk)
* [CHANGE] TraceQL/Structural operators performance improvement. [#3088](https://github.com/grafana/tempo/pull/3088) (@joe-elliott)
* [CHANGE] Merge the processors overrides set through runtime overrides and user-configurable overrides [#3125](https://github.com/grafana/tempo/pull/3125) (@kvrhdn)
* [CHANGE] Make vParquet3 the default block encoding [#2526](https://github.com/grafana/tempo/pull/3134) (@stoewer)
* [CHANGE] Set `autocomplete_filtering_enabled` to `true` by default [#3178](https://github.com/grafana/tempo/pull/3178) (@mapno)
* [CHANGE] Major cache refactor to allow multiple role based caches to be configured [#3166](https://github.com/grafana/tempo/pull/3166).
**BREAKING CHANGE** Deprecate the following fields. These have all been migrated to a top level "cache:" field.
```
Expand All @@ -26,9 +28,11 @@
* [ENHANCEMENT] Improve TraceQL regex performance in certain queries. [#3139](https://github.com/grafana/tempo/pull/3139) (@joe-elliott)
* [ENHANCEMENT] Improve TraceQL performance in complex queries. [#3113](https://github.com/grafana/tempo/pull/3113) (@joe-elliott)
* [ENHANCEMENT] Added a `frontend-search` cache role for job search caching. [#3225](https://github.com/grafana/tempo/pull/3225) (@joe-elliott)
* [ENHANCEMENT] Added a `parquet-page` cache role for page level caching. [#3196](https://github.com/grafana/tempo/pull/3196) (@joe-elliott)
* [ENHANCEMENT] Update opentelemetry-collector-contrib dependency to the latest version, v0.89.0 [#3148](https://github.com/grafana/tempo/pull/3148) (@gebn)
* [BUGFIX] Prevent building parquet iterators that would loop forever. [#3159](https://github.com/grafana/tempo/pull/3159) (@mapno)
* [BUGFIX] Sanitize name in mapped dimensions in span-metrics processor [#3171](https://github.com/grafana/tempo/pull/3171) (@mapno)
* [ENHANCEMENT] Update opentelemetry-collector-contrib dependency to the latest version, v0.89.0 [#3148](https://github.com/grafana/tempo/pull/3148) (@gebn)
* [BUGFIX] Fixed an issue where cached footers were requested then ignored. [#3196](https://github.com/grafana/tempo/pull/3196) (@joe-elliott)

## v2.3.1 / 2023-11-28

Expand Down
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -164,7 +164,7 @@ lint:

.PHONY: docker-component # Not intended to be used directly
docker-component: check-component exe
docker build -t grafana/$(COMPONENT) --build-arg=TARGETARCH=$(GOARCH) -f ./cmd/$(COMPONENT)/Dockerfile .
docker build -t grafana/$(COMPONENT) --load --build-arg=TARGETARCH=$(GOARCH) -f ./cmd/$(COMPONENT)/Dockerfile .
docker tag grafana/$(COMPONENT) $(COMPONENT)

.PHONY: docker-component-debug
Expand Down
3 changes: 2 additions & 1 deletion cmd/tempo/app/config.go
Original file line number Diff line number Diff line change
Expand Up @@ -75,7 +75,7 @@ func (c *Config) RegisterFlagsAndApplyDefaults(prefix string, f *flag.FlagSet) {
f.StringVar(&c.HTTPAPIPrefix, "http-api-prefix", "", "String prefix for all http api endpoints.")
f.BoolVar(&c.UseOTelTracer, "use-otel-tracer", false, "Set to true to replace the OpenTracing tracer with the OpenTelemetry tracer")
f.BoolVar(&c.EnableGoRuntimeMetrics, "enable-go-runtime-metrics", false, "Set to true to enable all Go runtime metrics")
f.BoolVar(&c.AutocompleteFilteringEnabled, "autocomplete-filtering.enabled", false, "Set to true to enable autocomplete filtering")
f.BoolVar(&c.AutocompleteFilteringEnabled, "autocomplete-filtering.enabled", true, "Set to false to disable autocomplete filtering")

// Server settings
flagext.DefaultValues(&c.Server)
Expand Down Expand Up @@ -130,6 +130,7 @@ func (c *Config) RegisterFlagsAndApplyDefaults(prefix string, f *flag.FlagSet) {
c.Compactor.RegisterFlagsAndApplyDefaults(util.PrefixConfig(prefix, "compactor"), f)
c.StorageConfig.RegisterFlagsAndApplyDefaults(util.PrefixConfig(prefix, "storage"), f)
c.UsageReport.RegisterFlagsAndApplyDefaults(util.PrefixConfig(prefix, "reporting"), f)
c.CacheProvider.RegisterFlagsAndApplyDefaults(util.PrefixConfig(prefix, "cache"), f)
}

// MultitenancyIsEnabled checks if multitenancy is enabled
Expand Down
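The flag change above flips the default for `autocomplete-filtering.enabled` from `false` to `true`, so operators now opt out rather than opt in. A minimal sketch of that behaviour using only the standard `flag` package follows; `parseAutocomplete` and the flag set name `tempo-sketch` are hypothetical and are not part of Tempo's code.

```go
package main

import (
	"flag"
	"fmt"
)

// parseAutocomplete is a hypothetical helper: it registers a boolean flag
// whose default is true and returns the value after parsing args.
func parseAutocomplete(args []string) (bool, error) {
	fs := flag.NewFlagSet("tempo-sketch", flag.ContinueOnError)
	var enabled bool
	fs.BoolVar(&enabled, "autocomplete-filtering.enabled", true,
		"Set to false to disable autocomplete filtering")
	if err := fs.Parse(args); err != nil {
		return false, err
	}
	return enabled, nil
}

func main() {
	// No arguments: the new default applies.
	def, _ := parseAutocomplete(nil)
	// Boolean flags must use the -name=false form to be switched off.
	off, _ := parseAutocomplete([]string{"-autocomplete-filtering.enabled=false"})
	fmt.Println(def, off)
}
```

Deployments that relied on the old default would need to pass the flag explicitly with `=false` to keep filtering off.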
34 changes: 14 additions & 20 deletions cmd/tempo/app/modules.go
Original file line number Diff line number Diff line change
Expand Up @@ -364,38 +364,32 @@ func (t *App) initQueryFrontend() (services.Service, error) {
return nil, err
}

// wrap handlers with auth
middleware := middleware.Merge(
t.HTTPAuthMiddleware,
httpGzipMiddleware(),
)

traceByIDHandler := middleware.Wrap(queryFrontend.TraceByIDHandler)
searchHandler := middleware.Wrap(queryFrontend.SearchHandler)
searchWSHandler := middleware.Wrap(queryFrontend.SearchWSHandler)
spanMetricsSummaryHandler := middleware.Wrap(queryFrontend.SpanMetricsSummaryHandler)
searchTagsHandler := middleware.Wrap(queryFrontend.SearchTagsHandler)

// register grpc server for queriers to connect to
frontend_v1pb.RegisterFrontendServer(t.Server.GRPC, t.frontend)
// we register the streaming querier service on both the http and grpc servers. Grafana expects
// this GRPC service to be available on the HTTP server.
tempopb.RegisterStreamingQuerierServer(t.Server.GRPC, queryFrontend)
tempopb.RegisterStreamingQuerierServer(t.Server.GRPCOnHTTPServer, queryFrontend)

// wrap handlers with auth
base := middleware.Merge(
t.HTTPAuthMiddleware,
httpGzipMiddleware(),
)

// http trace by id endpoint
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathTraces), traceByIDHandler)
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathTraces), base.Wrap(queryFrontend.TraceByIDHandler))

// http search endpoints
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearch), searchHandler)
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathWSSearch), searchWSHandler)
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearchTags), searchTagsHandler)
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearchTagsV2), searchTagsHandler)
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearchTagValues), searchTagsHandler)
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearchTagValuesV2), searchTagsHandler)
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearch), base.Wrap(queryFrontend.SearchHandler))
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathWSSearch), base.Wrap(queryFrontend.SearchWSHandler))
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearchTags), base.Wrap(queryFrontend.SearchTagsHandler))
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearchTagsV2), base.Wrap(queryFrontend.SearchTagsV2Handler))
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearchTagValues), base.Wrap(queryFrontend.SearchTagsValuesHandler))
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSearchTagValuesV2), base.Wrap(queryFrontend.SearchTagsValuesV2Handler))

// http metrics endpoints
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSpanMetricsSummary), spanMetricsSummaryHandler)
t.Server.HTTP.Handle(addHTTPAPIPrefix(&t.cfg, api.PathSpanMetricsSummary), base.Wrap(queryFrontend.SpanMetricsSummaryHandler))

// the query frontend needs to have knowledge of the blocks so it can shard search jobs
t.store.EnablePolling(context.Background(), nil)
Expand Down
15 changes: 12 additions & 3 deletions docs/sources/tempo/configuration/_index.md
Original file line number Diff line number Diff line change
Expand Up @@ -423,6 +423,14 @@ query_frontend:
# (default: 5)
[max_batch_size: <int>]

# Enable multi-tenant queries.
# If enabled, queries can be federated across multiple tenants.
# The tenant IDs involved need to be specified separated by a '|'
# character in the 'X-Scope-OrgID' header.
# Note: this is a no-op unless the cluster has `multitenancy_enabled: true`.
# (default: true)
[multi_tenant_queries_enabled: <bool>]

search:

# The number of concurrent jobs to execute when searching the backend.
Expand Down Expand Up @@ -1499,9 +1507,10 @@ cache:
# every cache must have at least one role.
# Allowed values:
# bloom - Bloom filters for trace id lookup.
# parquet-footer - Parquet footer values. Useful for search and trace by id lookup.
# parquet-column-idx - Parquet column index values. Useful for search and trace by id lookup.
# parquet-offset-idx - Parquet offset index values. Useful for search and trace by id lookup.
# parquet-footer - Parquet footer values. Useful for search and trace by id lookup.
# parquet-column-idx - Parquet column index values. Useful for search and trace by id lookup.
# parquet-offset-idx - Parquet offset index values. Useful for search and trace by id lookup.
# parquet-page - Parquet "pages". WARNING: This will attempt to cache most reads from parquet and, as a result, is very high volume.
# frontend-search - Frontend search job results.

- roles:
Expand Down
30 changes: 30 additions & 0 deletions docs/sources/tempo/operations/cross_tenant_query.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,30 @@
---
title: Cross-tenant query federation
menuTitle: Cross-tenant query
description: Cross-tenant query federation
weight: 70
aliases:
- /docs/tempo/operations/cross-tenant-query
---


# Cross-tenant query federation

{{% admonition type="note" %}}
You need to set `multitenancy_enabled: true` in the cluster for multi-tenant querying to work.
Refer to [Enable multi-tenancy]({{< relref "./multitenancy" >}}) for more details and the implications of `multitenancy_enabled: true`.
{{% /admonition %}}

Tempo supports multi-tenant queries for search, search tags, and trace-by-ID operations.

To perform a multi-tenant query, send the tenant IDs separated by a `|` character in the `X-Scope-OrgID` header, for example `foo|bar`.

Cross-tenant queries are enabled by default. Use the `multi_tenant_queries_enabled` configuration setting to turn them on or off.

```yaml
query_frontend:
multi_tenant_queries_enabled: true
```

Queries made through a data source configured for cross-tenant access, whether in **Explore** or in dashboards,
run across all the tenants specified in the **X-Scope-OrgID** header.
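As a rough illustration of the header format described in this page, the following Go sketch joins tenant IDs with `|` and attaches them to a search request. It is an assumption-laden example rather than Tempo client code: `joinTenants` and `newSearchRequest` are hypothetical helpers, and the base URL `http://localhost:3200`, the `/api/search` path and the `q` parameter are assumed values for a local setup.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// tenantHeader is the header that carries the tenant IDs.
const tenantHeader = "X-Scope-OrgID"

// joinTenants builds the header value for a multi-tenant query by
// joining tenant IDs with '|'.
func joinTenants(tenants ...string) string {
	return strings.Join(tenants, "|")
}

// newSearchRequest is a hypothetical helper: it prepares a GET request
// carrying a TraceQL query and the joined tenant IDs. The path and the
// "q" parameter are assumptions for this sketch.
func newSearchRequest(baseURL, traceQL string, tenants ...string) (*http.Request, error) {
	u, err := url.Parse(baseURL)
	if err != nil {
		return nil, err
	}
	u.Path = "/api/search"
	u.RawQuery = url.Values{"q": {traceQL}}.Encode()

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set(tenantHeader, joinTenants(tenants...))
	return req, nil
}

func main() {
	req, err := newSearchRequest("http://localhost:3200", "{}", "foo", "bar")
	if err != nil {
		panic(err)
	}
	// The request is only built here, not sent.
	fmt.Println(req.URL.String(), req.Header.Get(tenantHeader))
}
```

Passing a single tenant ID produces a header value with no separator, so the same helper covers ordinary single-tenant requests.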
6 changes: 3 additions & 3 deletions example/docker-compose/local/docker-compose.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@ services:
- "9411:9411" # zipkin

k6-tracing:
image: ghcr.io/grafana/xk6-client-tracing:v0.0.2
image: ghcr.io/grafana/xk6-client-tracing:latest
environment:
- ENDPOINT=tempo:4317
restart: always
Expand All @@ -35,13 +35,13 @@ services:
- "9090:9090"

grafana:
image: grafana/grafana:10.1.1
image: grafana/grafana:10.2.2
volumes:
- ../shared/grafana-datasources.yaml:/etc/grafana/provisioning/datasources/datasources.yaml
environment:
- GF_AUTH_ANONYMOUS_ENABLED=true
- GF_AUTH_ANONYMOUS_ORG_ROLE=Admin
- GF_AUTH_DISABLE_LOGIN_FORM=true
- GF_FEATURE_TOGGLES_ENABLE=traceqlEditor
- GF_FEATURE_TOGGLES_ENABLE=traceqlEditor traceQLStreaming metricsSummary
ports:
- "3000:3000"
9 changes: 9 additions & 0 deletions example/docker-compose/local/readme.md
Original file line number Diff line number Diff line change
Expand Up @@ -46,3 +46,12 @@ docker logs local_tempo_1 -f
```console
docker-compose down -v
```

## Search streaming over HTTP

- Set the `traceQLStreaming` feature flag in Grafana.
- Enable streaming in Tempo by setting `stream_over_http_enabled: true` in the config file.

You can use Grafana or tempo-cli to make a query.

tempo-cli: `$ tempo-cli query api search "0.0.0.0:3200" --use-grpc "{}" "2023-12-05T08:11:18Z" "2023-12-05T08:12:18Z" --org-id="test"`
5 changes: 4 additions & 1 deletion example/docker-compose/shared/tempo.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,8 @@
multitenancy_enabled: true
stream_over_http_enabled: true
server:
http_listen_port: 3200
log_level: info

query_frontend:
search:
Expand Down Expand Up @@ -52,4 +55,4 @@ storage:
overrides:
defaults:
metrics_generator:
processors: [service-graphs, span-metrics] # enables metrics generator
processors: [service-graphs, span-metrics] # enables metrics generator
6 changes: 6 additions & 0 deletions integration/e2e/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,4 +16,10 @@ go test -count=1 -v ./integration/e2e/... -run TestMicroservices$

# build and run a particular test "TestMicroservicesWithKVStores"
make docker-tempo && go test -count=1 -v ./integration/e2e/... -run TestMicroservicesWithKVStores$

# run a single e2e test with a timeout
go test -timeout 3m -count=1 -v ./integration/e2e/... -run ^TestMultiTenantSearch$

# follow and watch logs while tests are running (assuming e2e test container is named tempo_e2e-tempo)
docker logs $(docker container ls -f name=tempo_e2e-tempo -q) -f
```
71 changes: 71 additions & 0 deletions integration/e2e/config-multi-tenant-local.yaml
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
target: all
multitenancy_enabled: true
stream_over_http_enabled: true

server:
http_listen_port: 3200
log_level: warn

query_frontend:
search:
query_backend_after: 0 # setting these both to 0 will force all range searches to hit the backend
query_ingesters_until: 0

distributor:
receivers:
jaeger:
protocols:
grpc:
otlp:
protocols:
grpc:
zipkin:
log_received_spans:
enabled: true

ingester:
lifecycler:
address: 127.0.0.1
ring:
kvstore:
store: inmemory
replication_factor: 1
final_sleep: 0s
trace_idle_period: 1s
max_block_bytes: 1
max_block_duration: 2s
complete_block_timeout: 20s
flush_check_period: 1s

metrics_generator:
processor:
service_graphs:
histogram_buckets: [1, 2] # seconds
span_metrics:
histogram_buckets: [1, 2]
registry:
collection_interval: 1s
storage:
path: /var/tempo
remote_write:
- url: http://tempo_e2e-prometheus:9090/api/v1/write
send_exemplars: true


storage:
trace:
backend: local
local:
path: /var/tempo
pool:
max_workers: 10
queue_depth: 100

overrides:
user_configurable_overrides:
enabled: true
poll_interval: 10s
client:
backend: local
local:
path: /var/tempo_overrides
3 changes: 2 additions & 1 deletion integration/e2e/e2e_test.go
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package e2e

import (
"context"
"encoding/json"
"fmt"
"io"
Expand Down Expand Up @@ -127,7 +128,7 @@ func TestAllInOne(t *testing.T) {
util.SearchAndAssertTraceBackend(t, apiClient, info, now.Add(-20*time.Minute).Unix(), now.Unix())

// find the trace with streaming. using the http server b/c that's what Grafana will do
grpcClient, err := util.NewSearchGRPCClient(tempo.Endpoint(3200))
grpcClient, err := util.NewSearchGRPCClient(context.Background(), tempo.Endpoint(3200))
require.NoError(t, err)

util.SearchStreamAndAssertTrace(t, grpcClient, info, now.Add(-20*time.Minute).Unix(), now.Unix())
Expand Down
3 changes: 2 additions & 1 deletion integration/e2e/encodings_test.go
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
package e2e

import (
"context"
"os"
"testing"
"time"
Expand Down Expand Up @@ -106,7 +107,7 @@ func TestEncodings(t *testing.T) {
queryAndAssertTrace(t, apiClient, info)

// create grpc client used for streaming
grpcClient, err := integration.NewSearchGRPCClient(tempo.Endpoint(3200))
grpcClient, err := integration.NewSearchGRPCClient(context.Background(), tempo.Endpoint(3200))
require.NoError(t, err)

if enc.Version() == v2.VersionString {
Expand Down