Releases: cortexproject/cortex
v1.18.1
This release includes security fixes. Thanks to @99-not-out and @jeanlouisboudart for reporting them.
What's Changed
- [BUGFIX] Backporting upgrade to go 1.22.7 to patch CVE-2024-34155, CVE-2024-34156, CVE-2024-34158 #6217 #6264
Full Changelog: v1.18.0...v1.18.1
Cortex v1.18.0
This release contains 230 contributions from 29 contributors. We also have 9 new contributors. Thank you all for the contributions!
Some notable changes in this release are:
- Experimental native histogram ingestion via the -blocks-storage.tsdb.enable-native-histograms flag
- Support for filtering alerts on ListRules API
- Add query rejection mechanism to protect queries
- Introduce token bucket limiter to protect store gateway
- Implement ingester metadata API limits
- Remove the -querier.query-store-for-labels-enabled flag
- Remove the -querier.at-modifier-enabled flag
- Remove the oltp_endpoint config
- ruler.evaluation-delay-duration is marked as deprecated. Use ruler.query-offset instead
- -querier.max-outstanding-requests-per-tenant and -query-scheduler.max-outstanding-requests-per-tenant are marked as deprecated. Use frontend.max-outstanding-requests-per-tenant instead
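The removals and deprecations listed above lend themselves to a quick pre-upgrade check of an existing command line. The following is a minimal sketch in Python, not a Cortex tool: the flag names come from the list above, while the function name check_flags, the message wording, and the assumption that flags are passed as -name=value arguments are illustrative. The oltp_endpoint entry is a config field rather than a flag, so it is left out.

```python
# Flags named in the v1.18.0 notes, stored without the leading dash.
REMOVED = {
    "querier.query-store-for-labels-enabled",
    "querier.at-modifier-enabled",
}
# Deprecated flag -> replacement named in the notes.
DEPRECATED = {
    "ruler.evaluation-delay-duration": "ruler.query-offset",
    "querier.max-outstanding-requests-per-tenant": "frontend.max-outstanding-requests-per-tenant",
    "query-scheduler.max-outstanding-requests-per-tenant": "frontend.max-outstanding-requests-per-tenant",
}


def check_flags(args):
    """Return one finding per removed or deprecated flag found in args.

    Hypothetical helper: it only reports, it does not rewrite anything.
    """
    findings = []
    for arg in args:
        # "-name=value" or "--name=value" -> "name"
        name = arg.lstrip("-").split("=", 1)[0]
        if name in REMOVED:
            findings.append(f"{name}: removed in v1.18.0, delete it")
        elif name in DEPRECATED:
            findings.append(f"{name}: deprecated, use {DEPRECATED[name]}")
    return findings
```

Running check_flags over the argument list of each component before upgrading gives a short list of flags to review; whether a value can be carried over unchanged to the replacement flag still needs to be checked against the Cortex documentation.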
What's Changed
- [CHANGE] Ingester: Remove -querier.query-store-for-labels-enabled flag. Querying long-term store for labels is always enabled. #5984
- [CHANGE] Server: Instrument cortex_request_duration_seconds metric with native histogram. If native-histograms feature is enabled in monitoring Prometheus then the metric name needs to be updated in your dashboards. #6056
- [CHANGE] Distributor/Ingester: Change cortex_distributor_ingester_appends_total, cortex_distributor_ingester_append_failures_total, cortex_distributor_ingester_queries_total, and cortex_distributor_ingester_query_failures_total metrics to use the ingester ID instead of its IP as the label value. #6078
- [CHANGE] OTLP: Set AddMetricSuffixes to true to always enable metric name normalization. #6136
- [CHANGE] Querier: Deprecate and enable by default querier.ingester-metadata-streaming flag. #6147
- [CHANGE] QueryFrontend/QueryScheduler: Deprecate -querier.max-outstanding-requests-per-tenant and -query-scheduler.max-outstanding-requests-per-tenant flags. Use frontend.max-outstanding-requests-per-tenant instead. #6146
- [CHANGE] Ingesters: Enable 'snappy-block' compression on ingester clients by default. #6148
- [CHANGE] Ruler: Scheduling ruler.evaluation-delay-duration to be deprecated. Ruler will use the highest value between ruler.evaluation-delay-duration and ruler.query-offset. #6149
- [CHANGE] Querier: Remove -querier.at-modifier-enabled flag. #6157
- [CHANGE] Tracing: Remove deprecated oltp_endpoint config entirely. #6158
- [CHANGE] Store Gateway: Enable store gateway zone stable shuffle sharding by default. #6161
- [FEATURE] Ingester/Distributor: Experimental: Enable native histogram ingestion via -blocks-storage.tsdb.enable-native-histograms flag. #5986 #6010 #6020
- [FEATURE] Querier: Enable querying native histogram chunks. #5944 #6031
- [FEATURE] Query Frontend: Support native histogram in query frontend response. #5996 #6043
- [FEATURE] Ruler: Support sending native histogram samples to Ingester. #6029
- [FEATURE] Ruler: Add support for filtering out alerts in ListRules API. #6011
- [FEATURE] Query Frontend: Added a query rejection mechanism to block resource-intensive queries. #6005
- [FEATURE] OTLP: Support ingesting OTLP exponential metrics as native histograms. #6071 #6135
- [FEATURE] Ingester: Add ingester.instance-limits.max-inflight-query-requests to allow limiting ingester concurrent queries. #6081
- [FEATURE] Distributor: Add validation.max-native-histogram-buckets to limit max number of bucket count. Distributor will try to automatically reduce histogram resolution until it is within the bucket limit or resolution cannot be reduced anymore. #6104
- [FEATURE] Store Gateway: Introduce token bucket limiter to enhance store gateway throttling. #6016
- [FEATURE] Ruler: Add support for query_offset field on RuleGroup and new ruler_query_offset per-tenant limit. #6085
- [ENHANCEMENT] Ruler: Add support to persist tokens in rulers. #5987
- [ENHANCEMENT] Query Frontend/Querier: Added store gateway postings touched count and touched size in Querier stats and log in Query Frontend. #5892
- [ENHANCEMENT] Query Frontend/Querier: Returns warnings on prometheus query responses. #5916
- [ENHANCEMENT] Ingester: Allowing to configure -blocks-storage.tsdb.head-compaction-interval flag up to 30 min and add a jitter on the first head compaction. #5919 #5928
- [ENHANCEMENT] Distributor: Added max_inflight_push_requests config to ingester client to protect distributor from OOMKilled. #5917
- [ENHANCEMENT] Distributor/Querier: Clean stale per-ingester metrics after ingester restarts. #5930
- [ENHANCEMENT] Distributor/Ring: Allow disabling detailed ring metrics by ring member. #5931
- [ENHANCEMENT] KV: Etcd Added etcd.ping-without-stream-allowed parameter to disable/enable PermitWithoutStream #5933
- [ENHANCEMENT] Ingester: Add a new limits_per_label_set limit. This limit functions similarly to max_series_per_metric, but allows users to define the maximum number of series per LabelSet. #5950 #5993
- [ENHANCEMENT] Store Gateway: Log gRPC requests together with headers configured in http_request_headers_to_log. #5958
- [ENHANCEMENT] Ingester: Add a new experimental -ingester.labels-string-interning-enabled flag to enable string interning for metrics labels. #6057
- [ENHANCEMENT] Ingester: Add link to renew 10% of the ingesters tokens in the admin page. #6063
- [ENHANCEMENT] Ruler: Add support for filtering by state and health field on Rules API. #6040
- [ENHANCEMENT] Ruler: Add support for filtering by match field on Rules API. #6083
- [ENHANCEMENT] Distributor: Reduce memory usage when error volume is high. #6095
- [ENHANCEMENT] Compactor: Centralize metrics used by compactor and add user label to compactor metrics. #6096
- [ENHANCEMENT] Compactor: Add unique execution ID for each compaction cycle in log for easy debugging. #6097
- [ENHANCEMENT] Compactor: Differentiate retry and halt error and retry failed compaction only on retriable error. #6111
- [ENHANCEMENT] Compactor: Split cleaner cycle for active and deleted tenants. #6112
- [ENHANCEMENT] Compactor: Introduce cleaner visit marker. #6113
- [ENHANCEMENT] Query Frontend: Add cortex_query_samples_total metric. #6142
- [ENHANCEMENT] Ingester: Implement metadata API limit. #6128
- [BUGFIX] Configsdb: Fix endline issue in db password. #5920
- [BUGFIX] Ingester: Fix user and type labels for the cortex_ingester_tsdb_head_samples_appended_total TSDB metric. #5952
- [BUGFIX] Querier: Enforce max query length check for /api/v1/series API even though ignoreMaxQueryLength is set to true. #6018
- [BUGFIX] Ingester: Fix issue with the minimize token generator where it was not taking into consideration the current ownership of an instance when generating extra tokens. #6062
- [BUGFIX] Scheduler: Fix user queue in scheduler that was not thread-safe. #6077 #6160
- [BUGFIX] Ingester: Include out-of-order head compaction when compacting TSDB head. #6108
- [BUGFIX] Ingester: Fix cortex_ingester_tsdb_mmap_chunks_total metric. #6134
- [BUGFIX] Query Frontend: Fix query rejection bug for metadata queries. #6143
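The validation.max-native-histogram-buckets entry above describes the distributor lowering histogram resolution until the bucket count fits the limit or no further reduction is possible. The Python sketch below illustrates that loop under simplifying assumptions and is not Cortex's implementation: it handles only one side of the histogram, represents buckets as a plain index-to-count mapping, and assumes the Prometheus exponential schema range bottoms out at -4. The names reduce_resolution and MIN_SCHEMA are hypothetical.

```python
MIN_SCHEMA = -4  # assumption: lowest standard exponential schema


def reduce_resolution(buckets, schema, max_buckets):
    """Merge adjacent buckets until the count fits or the schema bottoms out.

    buckets maps a bucket index to its observation count.
    Returns (buckets, schema, within_limit).
    """
    while len(buckets) > max_buckets and schema > MIN_SCHEMA:
        merged = {}
        for index, count in buckets.items():
            # Lowering the schema by one doubles the bucket width, so old
            # indexes 2j-1 and 2j both fall into new index j.
            new_index = (index + 1) // 2
            merged[new_index] = merged.get(new_index, 0) + count
        buckets, schema = merged, schema - 1
    return buckets, schema, len(buckets) <= max_buckets
```

When the third return value is False the histogram is still over the limit at the coarsest schema, which corresponds to the "resolution cannot be reduced anymore" case in the entry; what the distributor does with such a sample is not stated here.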
New Contributors
- @Yaxhveer made their first contribution in #5920
- @KrisBuytaert made their first contribution in #5933
- @wilguo made their first contribution in #5935
- @rapphil made their first contribution in #5987
- @harshitasao made their first contribution in #6061
- @deradiri made their first contribution in #6099
- @shekeriev made their first contribution in #6125
- @klingerf made their first contribution in #6131
- @SungJin1212 made their first contribution in #6142
Full Changelog: v1.17.1...v1.18.0
Cortex v1.18.0-rc.0
This release contains 230 contributions from 29 contributors. We also have 9 new contributors. Thank you all for the contributions!
Some notable changes in this release are:
- Experimental native histogram ingestion via the -blocks-storage.tsdb.enable-native-histograms flag
- Support for filtering alerts on ListRules API
- Add query rejection mechanism to protect queries
- Introduce token bucket limiter to protect store gateway
- Implement ingester metadata API limits
- Remove the -querier.query-store-for-labels-enabled flag
- Remove the -querier.at-modifier-enabled flag
- Remove the oltp_endpoint config
- ruler.evaluation-delay-duration is marked as deprecated. Use ruler.query-offset instead
- -querier.max-outstanding-requests-per-tenant and -query-scheduler.max-outstanding-requests-per-tenant are marked as deprecated. Use frontend.max-outstanding-requests-per-tenant instead
What's Changed
- [CHANGE] Ingester: Remove -querier.query-store-for-labels-enabled flag. Querying long-term store for labels is always enabled. #5984
- [CHANGE] Server: Instrument cortex_request_duration_seconds metric with native histogram. If native-histograms feature is enabled in monitoring Prometheus then the metric name needs to be updated in your dashboards. #6056
- [CHANGE] Distributor/Ingester: Change cortex_distributor_ingester_appends_total, cortex_distributor_ingester_append_failures_total, cortex_distributor_ingester_queries_total, and cortex_distributor_ingester_query_failures_total metrics to use the ingester ID instead of its IP as the label value. #6078
- [CHANGE] OTLP: Set AddMetricSuffixes to true to always enable metric name normalization. #6136
- [CHANGE] Querier: Deprecate and enable by default querier.ingester-metadata-streaming flag. #6147
- [CHANGE] QueryFrontend/QueryScheduler: Deprecate -querier.max-outstanding-requests-per-tenant and -query-scheduler.max-outstanding-requests-per-tenant flags. Use frontend.max-outstanding-requests-per-tenant instead. #6146
- [CHANGE] Ingesters: Enable 'snappy-block' compression on ingester clients by default. #6148
- [CHANGE] Ruler: Scheduling ruler.evaluation-delay-duration to be deprecated. Ruler will use the highest value between ruler.evaluation-delay-duration and ruler.query-offset. #6149
- [CHANGE] Querier: Remove -querier.at-modifier-enabled flag. #6157
- [CHANGE] Tracing: Remove deprecated oltp_endpoint config entirely. #6158
- [CHANGE] Store Gateway: Enable store gateway zone stable shuffle sharding by default. #6161
- [FEATURE] Ingester/Distributor: Experimental: Enable native histogram ingestion via -blocks-storage.tsdb.enable-native-histograms flag. #5986 #6010 #6020
- [FEATURE] Querier: Enable querying native histogram chunks. #5944 #6031
- [FEATURE] Query Frontend: Support native histogram in query frontend response. #5996 #6043
- [FEATURE] Ruler: Support sending native histogram samples to Ingester. #6029
- [FEATURE] Ruler: Add support for filtering out alerts in ListRules API. #6011
- [FEATURE] Query Frontend: Added a query rejection mechanism to block resource-intensive queries. #6005
- [FEATURE] OTLP: Support ingesting OTLP exponential metrics as native histograms. #6071 #6135
- [FEATURE] Ingester: Add ingester.instance-limits.max-inflight-query-requests to allow limiting ingester concurrent queries. #6081
- [FEATURE] Distributor: Add validation.max-native-histogram-buckets to limit max number of bucket count. Distributor will try to automatically reduce histogram resolution until it is within the bucket limit or resolution cannot be reduced anymore. #6104
- [FEATURE] Store Gateway: Introduce token bucket limiter to enhance store gateway throttling. #6016
- [FEATURE] Ruler: Add support for query_offset field on RuleGroup and new ruler_query_offset per-tenant limit. #6085
- [ENHANCEMENT] Ruler: Add support to persist tokens in rulers. #5987
- [ENHANCEMENT] Query Frontend/Querier: Added store gateway postings touched count and touched size in Querier stats and log in Query Frontend. #5892
- [ENHANCEMENT] Query Frontend/Querier: Returns warnings on prometheus query responses. #5916
- [ENHANCEMENT] Ingester: Allowing to configure -blocks-storage.tsdb.head-compaction-interval flag up to 30 min and add a jitter on the first head compaction. #5919 #5928
- [ENHANCEMENT] Distributor: Added max_inflight_push_requests config to ingester client to protect distributor from OOMKilled. #5917
- [ENHANCEMENT] Distributor/Querier: Clean stale per-ingester metrics after ingester restarts. #5930
- [ENHANCEMENT] Distributor/Ring: Allow disabling detailed ring metrics by ring member. #5931
- [ENHANCEMENT] KV: Etcd Added etcd.ping-without-stream-allowed parameter to disable/enable PermitWithoutStream #5933
- [ENHANCEMENT] Ingester: Add a new limits_per_label_set limit. This limit functions similarly to max_series_per_metric, but allows users to define the maximum number of series per LabelSet. #5950 #5993
- [ENHANCEMENT] Store Gateway: Log gRPC requests together with headers configured in http_request_headers_to_log. #5958
- [ENHANCEMENT] Ingester: Add a new experimental -ingester.labels-string-interning-enabled flag to enable string interning for metrics labels. #6057
- [ENHANCEMENT] Ingester: Add link to renew 10% of the ingesters tokens in the admin page. #6063
- [ENHANCEMENT] Ruler: Add support for filtering by state and health field on Rules API. #6040
- [ENHANCEMENT] Ruler: Add support for filtering by match field on Rules API. #6083
- [ENHANCEMENT] Distributor: Reduce memory usage when error volume is high. #6095
- [ENHANCEMENT] Compactor: Centralize metrics used by compactor and add user label to compactor metrics. #6096
- [ENHANCEMENT] Compactor: Add unique execution ID for each compaction cycle in log for easy debugging. #6097
- [ENHANCEMENT] Compactor: Differentiate retry and halt error and retry failed compaction only on retriable error. #6111
- [ENHANCEMENT] Compactor: Split cleaner cycle for active and deleted tenants. #6112
- [ENHANCEMENT] Compactor: Introduce cleaner visit marker. #6113
- [ENHANCEMENT] Query Frontend: Add cortex_query_samples_total metric. #6142
- [ENHANCEMENT] Ingester: Implement metadata API limit. #6128
- [BUGFIX] Configsdb: Fix endline issue in db password. #5920
- [BUGFIX] Ingester: Fix user and type labels for the cortex_ingester_tsdb_head_samples_appended_total TSDB metric. #5952
- [BUGFIX] Querier: Enforce max query length check for /api/v1/series API even though ignoreMaxQueryLength is set to true. #6018
- [BUGFIX] Ingester: Fix issue with the minimize token generator where it was not taking into consideration the current ownership of an instance when generating extra tokens. #6062
- [BUGFIX] Scheduler: Fix user queue in scheduler that was not thread-safe. #6077 #6160
- [BUGFIX] Ingester: Include out-of-order head compaction when compacting TSDB head. #6108
- [BUGFIX] Ingester: Fix cortex_ingester_tsdb_mmap_chunks_total metric. #6134
- [BUGFIX] Query Frontend: Fix query rejection bug for metadata queries. #6143
New Contributors
- @Yaxhveer made their first contribution in #5920
- @KrisBuytaert made their first contribution in #5933
- @wilguo made their first contribution in #5935
- @rapphil made their first contribution in #5987
- @harshitasao made their first contribution in #6061
- @deradiri made their first contribution in #6099
- @shekeriev made their first contribution in #6125
- @klingerf made their first contribution in #6131
- @SungJin1212 made their first contribution in #6142
Full Changelog: v1.17.1...v1.18.0-rc.0
Cortex v1.17.1
This release includes one bug fix and two changes related to compatibility:
- [CHANGE] Query Frontend/Ruler: Omit empty data, errorType and error fields in API response. #5953 #5954
- [ENHANCEMENT] Ingester: Added upload_compacted_blocks_enabled config to ingester to parameterize uploading compacted blocks. #5959
- [BUGFIX] Querier: Select correct tenant during query federation. #5943
Cortex 1.17.0
This release contains 168 contributions from 29 contributors. We also have 16 new contributors. Thank you all for the contributions!
Some notable changes in this release are:
- Experimental OTLP ingestion
- Experimental minimize spread token generator strategy on Ingester
- Advanced query scheduling with Query Priority
- ListRules API high availability by rule group replication and backup
- Various improvements on Store Gateway Index Cache
- mem-ballast-size-bytes flag has been marked as deprecated and is no longer functional
- -querier.ingester-streaming flag has been marked as deprecated and ingester streaming is always enabled now
- querier.iterators and querier.batch-iterators flags have been marked as deprecated and batch iterator is always enabled in Querier now
Cortex
- [CHANGE] Azure Storage: Upgraded objstore dependency and support Azure Workload Identity Authentication. Added connection_string to support authenticating via SAS token. Marked msi_resource config as deprecating. #5645
- [CHANGE] Store Gateway: Add a new fastcache based inmemory index cache. #5619
- [CHANGE] Index Cache: Multi level cache backfilling operation becomes async. Added -blocks-storage.bucket-store.index-cache.multilevel.max-async-concurrency and -blocks-storage.bucket-store.index-cache.multilevel.max-async-buffer-size configs and metric cortex_store_multilevel_index_cache_backfill_dropped_items_total for number of dropped items. #5661
- [CHANGE] Ingester: Disable uploading compacted blocks and overlapping compaction in ingester. #5735
- [CHANGE] Distributor: Count the number of rate-limited samples in distributor_samples_in_total. #5714
- [CHANGE] Ruler: Remove cortex_ruler_write_requests_total, cortex_ruler_write_requests_failed_total, cortex_ruler_queries_total, cortex_ruler_queries_failed_total, and cortex_ruler_query_seconds_total metrics for the tenant when the ruler deletes the manager for the tenant. #5772
- [CHANGE] Main: Mark mem-ballast-size-bytes flag as deprecated. #5816
- [CHANGE] Querier: Mark -querier.ingester-streaming flag as deprecated. Now query ingester streaming is always enabled. #5817
- [CHANGE] Compactor/Bucket Store: Added -blocks-storage.bucket-store.block-discovery-strategy to configure different block listing strategy. Reverted the current recursive block listing mechanism and use the strategy Concurrent as in 1.15. #5828
- [CHANGE] Compactor: Don't halt compactor when overlapped source blocks detected. #5854
- [CHANGE] S3 Bucket Client: Expose -blocks-storage.s3.send-content-md5 flag and set default checksum algorithm to MD5. #5870
- [CHANGE] Querier: Mark querier.iterators and querier.batch-iterators flags as deprecated. Now querier always use batch iterators. #5868
- [FEATURE] OTLP ingestion experimental. #5813
- [FEATURE] Ingester: Add per-tenant new metric cortex_ingester_tsdb_data_replay_duration_seconds. #5477
- [FEATURE] Query Frontend/Scheduler: Add query priority support. #5605
- [FEATURE] Tracing: Add kuberesolver to resolve endpoints address with kubernetes:// prefix as Kubernetes service. #5731
- [FEATURE] Tracing: Add tracing.otel.round-robin flag to use round_robin gRPC client side LB policy for sending OTLP traces. #5731
- [FEATURE] Ruler: Add ruler.concurrent-evals-enabled flag to enable concurrent evaluation within a single rule group for independent rules. Maximum concurrency can be configured via ruler.max-concurrent-evals. #5766
- [FEATURE] Distributor Queryable: Experimental: Add config zone_results_quorum_metadata. When querying ingesters using metadata APIs such as label names and values, only results from quorum number of zones will be included and merged. #5779
- [FEATURE] Storage Cache Clients: Add config set_async_circuit_breaker_config to utilize the circuit breaker pattern for dynamically thresholding asynchronous set operations. Implemented in both memcached and redis cache clients. #5789
- [FEATURE] Ruler: Add experimental experimental.ruler.api-deduplicate-rules flag to remove duplicate rule groups from the Prometheus compatible rules API endpoint. Add experimental ruler.ring.replication-factor and ruler.ring.zone-awareness-enabled flags to configure rule group replication, but only the first ruler in the replicaset evaluates the rule group, the rest will just hold a copy as backup. Add experimental experimental.ruler.api-enable-rules-backup flag to configure rulers to send the rule group backups stored in the replicaset to handle events when a ruler is down during an API request to list rules. #5782
- [ENHANCEMENT] Store Gateway: Added -store-gateway.enabled-tenants and -store-gateway.disabled-tenants to explicitly enable or disable store-gateway for specific tenants. #5638
- [ENHANCEMENT] Compactor: Add new compactor metric cortex_compactor_start_duration_seconds. #5683
- [ENHANCEMENT] Index Cache: Multi level cache adds config max_backfill_items to cap max items to backfill per async operation. #5686
- [ENHANCEMENT] Query Frontend: Log number of split queries in query stats log. #5703
- [ENHANCEMENT] Logging: Added new options for logging HTTP request headers: -server.log-request-headers enables logging HTTP request headers, -server.log-request-headers-exclude-list allows users to specify headers which should not be logged. #5744
- [ENHANCEMENT] Query Frontend/Scheduler: Time check in query priority now considers overall data select time window (including range selectors, modifiers and lookback delta). #5758
- [ENHANCEMENT] Querier: Added querier.store-gateway-query-stats-enabled to enable or disable store gateway query stats log. #5749
- [ENHANCEMENT] AlertManager: Retrying AlertManager Delete Silence on error #5794
- [ENHANCEMENT] Ingester: Add new ingester metric cortex_ingester_max_inflight_query_requests. #5798
- [ENHANCEMENT] Query: Added query_storage_wall_time to Query Frontend and Ruler query stats log for wall time spent on fetching data from storage. Query evaluation is not included. #5799
- [ENHANCEMENT] Query: Added additional max query length check at Query Frontend and Ruler. Added -querier.ignore-max-query-length flag to disable max query length check at Querier. #5808
- [ENHANCEMENT] Querier: Add context error check when converting Metrics to SeriesSet for GetSeries on distributorQuerier. #5827
- [ENHANCEMENT] Ruler: Improve GetRules response time by refactoring mutexes and introducing a temporary rules cache in ruler/manager.go. #5805
- [ENHANCEMENT] Querier: Add context error check when merging slices from ingesters for GetLabel operations. #5837
- [ENHANCEMENT] Ring: Add experimental -ingester.tokens-generator-strategy=minimize-spread flag to enable the new minimize spread token generator strategy. #5855
- [ENHANCEMENT] Query Frontend: Ensure error response returned by Query Frontend follows Prometheus API error response format. #5811
- [ENHANCEMENT] Ring Status Page: Add Ownership Diff From Expected column in the ring table to indicate the extent to which the ownership of a specific ingester differs from the expected ownership. #5889
- [BUGFIX] Distributor: Do not use label with empty values for sharding #5717
- [BUGFIX] Query Frontend: queries with negative offset should check whether it is cacheable or not. #5719
- [BUGFIX] Redis Cache: pass cache_size config correctly. #5734
- [BUGFIX] Distributor: Shuffle-Sharding with IngestionTenantShardSize == 0, default sharding strategy should be used #5189
- [BUGFIX] Cortex: Fix GRPC stream clients not honoring overrides for call options. #5797
- [BUGFIX] Ring DDB: Fix lifecycle for ring counting unhealthy pods as healthy. #5838
- [BUGFIX] Ring DDB: Fix region assignment. #5842
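The -store-gateway.enabled-tenants and -store-gateway.disabled-tenants entry in the list above implies an allow/deny decision per tenant. A minimal Python sketch of one plausible reading follows. The precedence shown (a disabled tenant is always rejected, and an empty enabled list means every tenant is allowed) is an assumption, not something these notes state, and is_tenant_allowed is a hypothetical helper.

```python
def is_tenant_allowed(tenant, enabled=(), disabled=()):
    """Decide whether a component should serve the given tenant.

    Assumed semantics, for illustration only:
      - a tenant in the disabled list is never served
      - an empty enabled list means no allow-list is in effect
    """
    if tenant in set(disabled):
        return False
    enabled = set(enabled)
    return not enabled or tenant in enabled
```

With both lists empty every tenant is served, which matches the behaviour one would expect when neither flag is set.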
New Contributors
- @testwill made their first contribution in #5644
- @dsabsay made their first contribution in #5684
- @pawarpranav83 made their first contribution in #5719
- @Kramer0x0 made their first contribution in #5743
- @tesla59 made their first contribution in #5746
- @blorby made their first contribution in #5767
- @CharlieTLe made their first contribution in #5784
- @lekaf974 made their first contribution in #5793
- @euniceek made their first contribution in #5794
- @mustafain117 made their first contribution in #5823
- @availhang made their first contribution in #5826
- @erlan-z made their first contribution in #5827
- @yj-yoo made their first contribution in #5775
- @kindknow made their first contribution in #5856
- @momantech made their first contribution in #5863
- @till made their first contribution in #5874
Full Changelog: v1.16.1...v1.17.0
Cortex 1.17.0-rc.1
This release candidate adds one bug fix and one change over v1.17.0-rc.0.
Cortex 1.17.0-rc.0
This release contains 166 contributions from 29 contributors. We also have 16 new contributors. Thank you all for the contributions!
Some notable changes in this release are:
- Experimental OTLP ingestion
- Experimental minimize spread token generator strategy on Ingester
- Advanced query scheduling with Query Priority
- ListRules API high availability by rule group replication and backup
- Various improvements on Store Gateway Index Cache
- mem-ballast-size-bytes flag has been marked as deprecated and is no longer functional
- -querier.ingester-streaming flag has been marked as deprecated and ingester streaming is always enabled now
- querier.iterators and querier.batch-iterators flags have been marked as deprecated and batch iterator is always enabled in Querier now
Cortex
- [CHANGE] Azure Storage: Upgraded objstore dependency and support Azure Workload Identity Authentication. Added connection_string to support authenticating via SAS token. Marked msi_resource config as deprecating. #5645
- [CHANGE] Store Gateway: Add a new fastcache based inmemory index cache. #5619
- [CHANGE] Index Cache: Multi level cache backfilling operation becomes async. Added -blocks-storage.bucket-store.index-cache.multilevel.max-async-concurrency and -blocks-storage.bucket-store.index-cache.multilevel.max-async-buffer-size configs and metric cortex_store_multilevel_index_cache_backfill_dropped_items_total for number of dropped items. #5661
- [CHANGE] Ingester: Disable uploading compacted blocks and overlapping compaction in ingester. #5735
- [CHANGE] Distributor: Count the number of rate-limited samples in distributor_samples_in_total. #5714
- [CHANGE] Ruler: Remove cortex_ruler_write_requests_total, cortex_ruler_write_requests_failed_total, cortex_ruler_queries_total, cortex_ruler_queries_failed_total, and cortex_ruler_query_seconds_total metrics for the tenant when the ruler deletes the manager for the tenant. #5772
- [CHANGE] Main: Mark mem-ballast-size-bytes flag as deprecated. #5816
- [CHANGE] Querier: Mark -querier.ingester-streaming flag as deprecated. Now query ingester streaming is always enabled. #5817
- [CHANGE] Compactor/Bucket Store: Added -blocks-storage.bucket-store.block-discovery-strategy to configure different block listing strategy. Reverted the current recursive block listing mechanism and use the strategy Concurrent as in 1.15. #5828
- [CHANGE] Compactor: Don't halt compactor when overlapped source blocks detected. #5854
- [CHANGE] S3 Bucket Client: Expose -blocks-storage.s3.send-content-md5 flag and set default checksum algorithm to MD5. #5870
- [CHANGE] Querier: Mark querier.iterators and querier.batch-iterators flags as deprecated. Now querier always use batch iterators. #5868
- [CHANGE] Query Frontend: Error response returned by Query Frontend now follows Prometheus API error response format. #5811
- [FEATURE] Experimental: OTLP ingestion. #5813
- [FEATURE] Query Frontend/Scheduler: Add query priority support. #5605
- [FEATURE] Tracing: Use kuberesolver to resolve OTLP endpoints address with kubernetes:// prefix as Kubernetes service. #5731
- [FEATURE] Tracing: Add tracing.otel.round-robin flag to use round_robin gRPC client side LB policy for sending OTLP traces. #5731
- [FEATURE] Ruler: Add ruler.concurrent-evals-enabled flag to enable concurrent evaluation within a single rule group for independent rules. Maximum concurrency can be configured via ruler.max-concurrent-evals. #5766
- [FEATURE] Distributor Queryable: Experimental: Add config zone_results_quorum_metadata. When querying ingesters using metadata APIs such as label names and values, only results from quorum number of zones will be included and merged. #5779
- [FEATURE] Storage Cache Clients: Add config set_async_circuit_breaker_config to utilize the circuit breaker pattern for dynamically thresholding asynchronous set operations. Implemented in both memcached and redis cache clients. #5789
- [FEATURE] Ruler: Add experimental experimental.ruler.api-deduplicate-rules flag to remove duplicate rule groups from the Prometheus compatible rules API endpoint. Add experimental ruler.ring.replication-factor and ruler.ring.zone-awareness-enabled flags to configure rule group replication, but only the first ruler in the replicaset evaluates the rule group, the rest will just hold a copy as backup. Add experimental experimental.ruler.api-enable-rules-backup flag to configure rulers to send the rule group backups stored in the replicaset to handle events when a ruler is down during an API request to list rules. #5782
- [FEATURE] Ring: Add experimental -ingester.tokens-generator-strategy=minimize-spread flag to enable the new minimize spread token generator strategy. #5855
- [FEATURE] Ring Status Page: Add Ownership Diff From Expected column in the ring table to indicate the extent to which the ownership of a specific ingester differs from the expected ownership. #5889
- [ENHANCEMENT] Ingester: Add per-tenant new metric cortex_ingester_tsdb_data_replay_duration_seconds. #5477
- [ENHANCEMENT] Store Gateway: Added -store-gateway.enabled-tenants and -store-gateway.disabled-tenants to explicitly enable or disable store-gateway for specific tenants. #5638
- [ENHANCEMENT] Query Frontend: Write service timing header in response even though there is an error. #5653
- [ENHANCEMENT] Compactor: Add new compactor metric cortex_compactor_start_duration_seconds. #5683
- [ENHANCEMENT] Index Cache: Multi level cache adds config max_backfill_items to cap max items to backfill per async operation. #5686
- [ENHANCEMENT] Query Frontend: Log number of split queries in query stats log. #5703
- [ENHANCEMENT] Compactor: Skip compaction retry when encountering a permission denied error. #5727
- [ENHANCEMENT] Logging: Added new options for logging HTTP request headers: -server.log-request-headers enables logging HTTP request headers, -server.log-request-headers-exclude-list allows users to specify headers which should not be logged. #5744
- [ENHANCEMENT] Query Frontend/Scheduler: Time check in query priority now considers overall data select time window (including range selectors, modifiers and lookback delta). #5758
- [ENHANCEMENT] Querier: Added querier.store-gateway-query-stats-enabled to enable or disable store gateway query stats log. #5749
- [ENHANCEMENT] Querier: Improve labels APIs latency by merging slices using K-way merge and more than 1 core. #5785
- [ENHANCEMENT] AlertManager: Retrying AlertManager Delete Silence on error. #5794
- [ENHANCEMENT] Ingester: Add new ingester metric cortex_ingester_max_inflight_query_requests. #5798
- [ENHANCEMENT] Query: Added query_storage_wall_time to Query Frontend and Ruler query stats log for wall time spent on fetching data from storage. Query evaluation is not included. #5799
- [ENHANCEMENT] Query: Added additional max query length check at Query Frontend and Ruler. Added -querier.ignore-max-query-length flag to disable max query length check at Querier. #5808
- [ENHANCEMENT] Querier: Add context error check when converting Metrics to SeriesSet for GetSeries on distributorQuerier. #5827
- [ENHANCEMENT] Ruler: Improve GetRules response time by reducing lock contention and introducing a temporary rules cache in ruler/manager.go. #5805
- [ENHANCEMENT] Querier: Add context error check when merging slices from ingesters for GetLabel operations. #5837
- [BUGFIX] Distributor: Do not use label with empty values for sharding #5717
- [BUGFIX] Query Frontend: queries with negative offset should check whether it is cacheable or not. #5719
- [BUGFIX] Redis Cache: pass cache_size config correctly. #5734
- [BUGFIX] Distributor: Shuffle-Sharding with ingestion_tenant_shard_size set to 0, default sharding strategy should be used. #5189
- [BUGFIX] Cortex: Fix GRPC stream clients not honoring overrides for call options. #5797
- [BUGFIX] Ruler: Fix support for
keep_firing_for
field in alert rules. #5823 - [BUGFIX] Ring DDB: Fix lifecycle for ring counting unhealthy pods as healthy. #5838
- [BUGFIX] Ring DDB: Fix region assignment. #5842
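The Querier enhancement in #5785 above mentions merging slices with a K-way merge. As a rough illustration of that general technique only (this is a Python sketch, not Cortex's Go implementation, and it is single-threaded, so it does not show the "more than 1 core" part), the function below merges several already-sorted lists of label names into one sorted, de-duplicated list. The name merge_sorted_label_slices and the sample label names are hypothetical.

```python
import heapq
from typing import List, Sequence


def merge_sorted_label_slices(slices: Sequence[List[str]]) -> List[str]:
    """Merge already-sorted lists of strings into one sorted list without duplicates.

    Hypothetical helper for illustration; it assumes every input list is sorted.
    """
    # Drop empty inputs so every remaining list has a first element.
    lists = [s for s in slices if s]

    # Each heap entry is (current value, index of its source list, position in that list).
    heap = [(s[0], i, 0) for i, s in enumerate(lists)]
    heapq.heapify(heap)

    merged: List[str] = []
    while heap:
        value, i, pos = heapq.heappop(heap)
        # The same label can come back from several sources; keep it once.
        if not merged or merged[-1] != value:
            merged.append(value)
        # Advance within the list the popped value came from.
        if pos + 1 < len(lists[i]):
            heapq.heappush(heap, (lists[i][pos + 1], i, pos + 1))
    return merged


if __name__ == "__main__":
    print(merge_sorted_label_slices([["__name__", "job"], ["instance", "job"], []]))
```

With k input lists and n total elements, the heap never holds more than k entries, so the merge costs O(n log k) comparisons instead of the O(n log n) of concatenating and re-sorting.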
New Contributors
- @testwill made their first contribution in #5644
- @dsabsay made their first contribution in #5684
- @pawarpranav83 made their first contribution in #5719
- @Kramer0x0 made their first contribution in #5743
- @tesla59 made their first contribution in #5746
- @blorby made their first contribution in #5767
- @CharlieTLe made their first contribution in #5784
- @lekaf974 made their first contribution in #5793
- @euniceek made their first contribution in #5794
- @mustafain117 made their first contribution in #5823
- @availhang made their first contribution in #5826
- @erlan-z made their first contribution in #5827
- @yj-yoo made their first contribution in #5775
- @kindknow made their first contribution in #5856
- @momantech made their first contribution in #5863
- @till made their first contribution in #5874
Full Changelog: v1.16.1...v1.17.0-rc.0
Cortex 1.16.1
Cortex 1.16.0
This release contains 227 contributions from 27 contributors. We also have 10 new contributors. Thank you all for the contributions!
Some notable changes in this release are:
- Store Gateway multilevel index cache
- Object storage backend for runtime config
- Disable specific rule groups in Ruler
- List rules supports filtering by rule name, rule group and file
- Allow tenant shard size to be a percent of total instances for Querier and Store Gateway
- Various improvements to metrics
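One of the notable changes listed above allows the tenant shard size for Querier and Store Gateway to be a percent of total instances. The release notes do not spell out the exact rule, so the following Python sketch is only an assumption about how such a setting could be resolved: a value strictly between 0 and 1 is treated as a fraction of the instance count and rounded up, and any other value is taken as an absolute count. The function name resolve_shard_size is hypothetical and is not a Cortex API.

```python
import math


def resolve_shard_size(configured: float, total_instances: int) -> int:
    """Turn a configured shard size into a concrete instance count.

    Assumed rule (not confirmed by the release notes):
    - 0 < configured < 1: fraction of total_instances, rounded up
    - otherwise: configured is already an absolute count
    """
    if 0 < configured < 1:
        return math.ceil(configured * total_instances)
    return int(configured)


if __name__ == "__main__":
    # Half of a 10-instance ring, under the assumed rule.
    print(resolve_shard_size(0.5, 10))
    # An absolute shard size passes through unchanged.
    print(resolve_shard_size(3, 10))
```

Rounding up in this sketch means a non-zero fraction never resolves to zero instances as long as at least one instance exists.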
Cortex
- [CHANGE] AlertManager: include reason label in cortex_alertmanager_notifications_failed_total. #5409
- [CHANGE] Ruler: Added user label to cortex_ruler_write_requests_total, cortex_ruler_write_requests_failed_total, cortex_ruler_queries_total, and cortex_ruler_queries_failed_total metrics. #5312
- [CHANGE] Alertmanager: Validating new fields on the PagerDuty AM config. #5290
- [CHANGE] Ingester: Creating label native-histogram-sample on the cortex_discarded_samples_total metric to keep track of discarded native histogram samples. #5289
- [CHANGE] Store Gateway: Rename cortex_bucket_store_cached_postings_compression_time_seconds to cortex_bucket_store_cached_postings_compression_time_seconds_total. #5431
- [CHANGE] Store Gateway: Rename cortex_bucket_store_cached_series_fetch_duration_seconds to cortex_bucket_store_series_fetch_duration_seconds and cortex_bucket_store_cached_postings_fetch_duration_seconds to cortex_bucket_store_postings_fetch_duration_seconds. Add new metric cortex_bucket_store_chunks_fetch_duration_seconds. #5448
- [CHANGE] Store Gateway: Remove idle_timeout, max_conn_age, pool_size, min_idle_conns fields for Redis index cache and caching bucket. #5448
- [CHANGE] Store Gateway: Add flag -store-gateway.sharding-ring.zone-stable-shuffle-sharding to enable store gateway to use zone stable shuffle sharding. #5489
- [CHANGE] Bucket Index: Add series_max_size and chunk_max_size to bucket index. #5489
- [CHANGE] StoreGateway: Rename cortex_bucket_store_chunk_pool_returned_bytes_total and cortex_bucket_store_chunk_pool_requested_bytes_total to cortex_bucket_store_chunk_pool_operation_bytes_total. #5552
- [CHANGE] Query Frontend/Querier: Make build info API disabled by default and add feature flag api.build-info-enabled to enable it. #5533
- [CHANGE] Purger: Do not use S3 tenant kms key when uploading deletion marker. #5575
- [CHANGE] Ingester: Shipper always allows uploading compacted blocks to ship OOO compacted blocks. #5625
- [CHANGE] DDBKV: Change metric name from dynamodb_kv_read_capacity_total to dynamodb_kv_consumed_capacity_total and include Delete, Put, Batch dimension. #5487
- [CHANGE] Compactor: Adding the userId on the compact dir path. #5524
- [CHANGE] Ingester: Remove deprecated ingester metrics. #5472
- [CHANGE] Query Frontend: Expose -querier.max-subquery-steps to configure subquery max steps check. By default, the limit is set to 0, which is disabled. #5656
- [FEATURE] Store Gateway: Implementing multi level index cache. #5451
- [FEATURE] Ruler: Add support for disabling rule groups. #5521
- [FEATURE] Support object storage backends for runtime configuration file. #5292
- [FEATURE] Ruler: Add support for Limit field on RuleGroup. #5528
- [FEATURE] AlertManager: Add support for Webex, Discord and Telegram Receiver. #5493
- [FEATURE] Ingester: added -admin-limit-message to customize the message contained in limit errors. #5460
- [FEATURE] AlertManager: Update version to v0.26.0 and bring in Microsoft Teams receiver. #5543
- [FEATURE] Store Gateway: Support lazy expanded posting optimization. Added new flag blocks-storage.bucket-store.lazy-expanded-postings-enabled and new metrics cortex_bucket_store_lazy_expanded_postings_total, cortex_bucket_store_lazy_expanded_posting_size_bytes_total and cortex_bucket_store_lazy_expanded_posting_series_overfetched_size_bytes_total. #5556
- [FEATURE] Store Gateway: Add max_downloaded_bytes_per_request to limit max bytes to download per store gateway request. #5179
- [FEATURE] Added 2 flags -alertmanager.alertmanager-client.grpc-max-send-msg-size and -alertmanager.alertmanager-client.grpc-max-recv-msg-size to configure alert manager grpc client message size limits. #5338
- [FEATURE] Querier/StoreGateway: Allow the tenant shard sizes to be a percent of total instances. #5393
- [FEATURE] Added the flag -alertmanager.api-concurrency to configure alert manager api concurrency limit. #5412
- [FEATURE] Store Gateway: Add -store-gateway.sharding-ring.keep-instance-in-the-ring-on-shutdown to skip unregistering instance from the ring in shutdown. #5421
- [FEATURE] Ruler: Support for filtering rules in the API. #5417
- [FEATURE] Compactor: Add -compactor.ring.tokens-file-path to store generated tokens locally. #5432
- [FEATURE] Query Frontend: Add -frontend.retry-on-too-many-outstanding-requests to re-enqueue 429 requests if there are multiple query-schedulers available. #5496
- [FEATURE] Store Gateway: Add -blocks-storage.bucket-store.max-inflight-requests for store gateways to reject further series requests upon reaching the limit. #5553
- [FEATURE] Store Gateway: Support filtered index cache. #5587
- [ENHANCEMENT] Update go version to 1.21.3. #5630
- [ENHANCEMENT] Store Gateway: Add cortex_bucket_store_block_load_duration_seconds histogram to track time to load blocks. #5580
- [ENHANCEMENT] Querier: retry chunk pool exhaustion error in querier rather than query frontend. #5569
- [ENHANCEMENT] Alertmanager: Added flag -alertmanager.alerts-gc-interval to configure alerts Garbage collection interval. #5550
- [ENHANCEMENT] Query Frontend: enable vertical sharding on binary expr. #5507
- [ENHANCEMENT] Query Frontend: Include user agent as part of query frontend log. #5450
- [ENHANCEMENT] Query: Set CORS Origin headers for Query API #5388
- [ENHANCEMENT] Query Frontend: Add cortex_rejected_queries_total metric for throttled queries. #5356
- [ENHANCEMENT] Query Frontend: Optimize the decoding of SampleStream. #5349
- [ENHANCEMENT] Compactor: Check ctx done when uploading visit marker. #5333
- [ENHANCEMENT] AlertManager: Add cortex_alertmanager_dispatcher_aggregation_groups and cortex_alertmanager_dispatcher_alert_processing_duration_seconds metrics for dispatcher. #5592
- [ENHANCEMENT] Store Gateway: Added new flag blocks-storage.bucket-store.series-batch-size to control how many series to fetch per batch in Store Gateway. #5582
- [ENHANCEMENT] Querier: Log query stats when querying store gateway. #5376
- [ENHANCEMENT] Ruler: Add cortex_ruler_rule_group_load_duration_seconds and cortex_ruler_rule_group_sync_duration_seconds metrics. #5609
- [ENHANCEMENT] Ruler: Add contextual info and query statistics to log #5604
- [ENHANCEMENT] Distributor/Ingester: Add span on push path #5319
- [ENHANCEMENT] Query Frontend: Reject subquery with too small step size. #5323
- [ENHANCEMENT] Compactor: Exposing Thanos accept-malformed-index to Cortex compactor. #5334
- [ENHANCEMENT] Log: Avoid expensive log.Valuer evaluation for disallowed levels. #5297
- [ENHANCEMENT] Improving Performance on the API Gzip Handler. #5347
- [ENHANCEMENT] Dynamodb: Add puller-sync-time to allow different pull time for ring. #5357
- [ENHANCEMENT] Emit querier max_concurrent as a metric. #5362
- [ENHANCEMENT] Avoid sort tokens on lifecycler autoJoin. #5394
- [ENHANCEMENT] Do not resync blocks in running store gateways during rollout deployment and container restart. #5363
- [ENHANCEMENT] Store Gateway: Add new metrics cortex_bucket_store_sent_chunk_size_bytes, cortex_bucket_store_postings_size_bytes and cortex_bucket_store_empty_postings_total. #5397
- [ENHANCEMENT] Add jitter to lifecycler heartbeat. #5404
- [ENHANCEMENT] Store Gateway: Add config estimated_max_series_size_bytes and estimated_max_chunk_size_bytes to address data overfetch. #5401
- [ENHANCEMENT] Distributor/Ingester: Add experimental -distributor.sign_write_requests flag to sign the write requests. #5430
- [ENHANCEMENT] Store Gateway/Querier/Compactor: Handling CMK Access Denied errors. #5420 #5442 #5446
- [ENHANCEMENT] Alertmanager: Add the alert name in error log when it gets throttled. #5456
- [ENHANCEMENT] Querier: Retry store gateway on different zones when zone awareness is enabled. #5476
- [ENHANCEMENT] Compactor: allow unregister_on_shutdown to be configurable. #5503
- [ENHANCEMENT] Querier: Batch adding series to query limiter to optimize locking. #5505
- [ENHANCEMENT] Store Gateway: add metric cortex_bucket_store_chunk_refetches_total for number of chunk refetches. #5532
- [ENHANCEMENT] BasicLifeCycler: allow final-sleep during shutdown #5517
- [ENHANCEMENT] All: Handling CMK Access Denied errors. #5420 #5542
- [ENHANCEMENT] Querier: Retry store gateway client connection closing gRPC error. #5558
- [ENHANCEMENT] QueryFrontend: Add generic retry for all APIs. #5561
- [ENHANCEMENT] Querier: Check context before notifying scheduler and frontend. #5565
- [ENHANCEMENT] QueryFrontend: Add metric for number of series requests. #5373
- [ENHANCEMENT] Store Gateway: Add histogram metrics for total time spent fetching series and chunks per request. #5573
- [ENHANCEMENT] Store Gateway: Check context in multi level cache. Add cortex_store_multilevel_index_cache_fetch_duration_seconds and cortex_store_multilevel_index_cache_backfill_duration_seconds to measure fetch and backfill latency. #5596
- [ENHANCEMENT] Ingester: Added new ingester TSDB metrics cortex_ingester_tsdb_head_samples_appended_total, cortex_ingester_tsdb_head_out_of_order_samples_appended_total, cortex_ingester_tsdb_snapshot_replay_error_total, cortex_ingester_tsdb_sample_ooo_delta and cortex_ingester_tsdb_mmap_chunks_total. #5624
- [ENHANCEMENT] Query Frontend: Handle context error before decoding and merging responses. #5499
- [ENHANCEMENT] Store-Gateway and AlertM...
Cortex 1.16.0-rc.1
This release candidate updates v1.16.0-rc.0 with one bug fix and one change.