Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enforce max span attribute size #4335

Open
wants to merge 9 commits into
base: main
Choose a base branch
from
Open

Conversation

ie-pham
Copy link
Contributor

@ie-pham ie-pham commented Nov 17, 2024

What this PR does: Add config for max_span_attr_size to enforce the max size a span attribute can be. If a span contains an attribute with either key or value over the limit is sent to Tempo, the attribute key/value will be truncated and marked with "_truncated".

Which issue(s) this PR fixes:
Fixes #3985

benchmarks

4 traces, 300 batches and 5000 spans each run

with adding "_truncated"

                 │ before.out  │        after_add_truncated.out         │
                 │   sec/op    │   sec/op      vs base                  │
TestsByRequestID   402.3m ± 7%   2384.8m ± 10%  +492.83% (p=0.000 n=10)

without adding "_truncated" to truncated attribute values

                 │  before.out  │         after-notruncate.out         │
                 │    sec/op    │    sec/op     vs base                │
TestsByRequestID   357.1m ± 52%   594.0m ± 34%  +66.33% (p=0.001 n=10)

                 │  before.out  │      after-notruncate.out      │
                 │     B/op     │     B/op      vs base          │
TestsByRequestID   58.71Mi ± 0%   58.71Mi ± 0%  ~ (p=1.000 n=10)

                 │ before.out  │      after-notruncate.out       │
                 │  allocs/op  │  allocs/op   vs base            │
TestsByRequestID   5.100k ± 0%   5.100k ± 0%  ~ (p=1.000 n=10) ¹

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@ie-pham ie-pham changed the title Enforce max span attribute size Enforce max span attribute size [wip] Nov 18, 2024

func spanContainsAttributeTooLarge(span *v1.Span, maxAttrSize int) bool {
for _, attr := range span.Attributes {
if len(attr.Key) > maxAttrSize || attr.Value.Size() > maxAttrSize {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call, checking the attribute name. I'm wondering if this calculation should be against the total Key.Size() + Value.Size()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if key is max/2 and value is max/2, does it still cause a problem?


for _, b := range batches {
spansByILS := make(map[uint32]*v1.ScopeSpans)

for _, ils := range b.ScopeSpans {
for _, span := range ils.Spans {
// drop spans with attributes that are too large
if spanContainsAttributeTooLarge(span, maxSpanAttrSize) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a way we could drop only the attribute that seems better than dropping the span (preserve as much information as possible). It doesn't fit as easily within our current metrics/logs though.

@@ -229,6 +229,10 @@ distributor:
# instruct the client how to retry.
[retry_after_on_resource_exhausted: <duration> | default = '0' ]

# Optional
# Configures the max size a span attribute can be. Any span with at least one attribute over this limit would be discarded with reason "attribute_too_large"
[max_span_attr_size: <int> | default = '10000']
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking if this should be max_attr_bytes to follow the pattern of other limits, and should it also apply to resource/event/link attributes?

Copy link
Contributor

@knylander-grafana knylander-grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for adding docs :)

@ie-pham
Copy link
Contributor Author

ie-pham commented Nov 22, 2024

benchmark updated but not looking too good :/

case *v1_common.AnyValue_StringValue:
if len(value.StringValue) > maxAttrSize {
fmt.Printf("size: %d", len(value.StringValue))
value.StringValue = value.StringValue[:maxAttrSize] + "_truncated"
Copy link
Member

@joe-elliott joe-elliott Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is quite expensive as it's going to alloc a completely new string every time.

sane but not as nice option:
just produce a metric that indicates number of truncated attributes

crazy but performant option:
rewrite the byte array underlying the string to contain "_truncated" at the right position

i don't think that would break anything in tempo but not 100% sure

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the benchmark for the scenario where every attribute needs to be truncated, adding the "_truncated" had a ~400+% degradation. And only 66% if we just truncate without adding the "_truncated". I'm in favor of just reporting the metrics without adding "_truncated". Wdyt?

benchmarks

4 traces, 300 batches and 5000 spans each run

with adding "_truncated"

                 │ before.out  │        after_add_truncated.out         │
                 │   sec/op    │   sec/op      vs base                  │
TestsByRequestID   402.3m ± 7%   2384.8m ± 10%  +492.83% (p=0.000 n=10)

without adding "_truncated" to truncated attribute values

                 │  before.out  │         after-notruncate.out         │
                 │    sec/op    │    sec/op     vs base                │
TestsByRequestID   357.1m ± 52%   594.0m ± 34%  +66.33% (p=0.001 n=10)

                 │  before.out  │      after-notruncate.out      │
                 │     B/op     │     B/op      vs base          │
TestsByRequestID   58.71Mi ± 0%   58.71Mi ± 0%  ~ (p=1.000 n=10)

                 │ before.out  │      after-notruncate.out       │
                 │  allocs/op  │  allocs/op   vs base            │
TestsByRequestID   5.100k ± 0%   5.100k ± 0%  ~ (p=1.000 n=10) ¹

@ie-pham ie-pham changed the title Enforce max span attribute size [wip] Enforce max span attribute size Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a limit that caps the maximum allowed size of an attribute
4 participants