Added metrics development docs

getsentry · Dec 4, 2023 · 4247c5d · 4247c5d
1 parent 4590501
commit 4247c5d
Show file tree

Hide file tree

Showing 2 changed files with 325 additions and 0 deletions.
diff --git a/src/docs/sdk/envelopes.mdx b/src/docs/sdk/envelopes.mdx
@@ -440,6 +440,39 @@ details.
 
 *None*
 
+### Statsd
+
+Item type `"statsd"` contains metrics emissions in a superset of the statsd format.
+
+See the <Link to="/sdk/metrics/">metrics</Link> documentation for the payload
+details.
+
+**Constraints:**
+
+- This Item may occur multiple times per Envelope.
+- Ingestion may limit the maximum number of Items per Envelope, see *Ingestion*.
+
+**Additional Item Headers:**
+
+*None*
+
+### Metric Meta
+
+Item type `"metric_meta"` contains per-metric meta data which is persisted alongside
+metrics in the system.
+
+See the <Link to="/sdk/metrics/">metrics</Link> documentation for the payload
+details.
+
+**Constraints:**
+
+- This Item may occur multiple times per Envelope.
+- Ingestion may limit the maximum number of Items per Envelope, see *Ingestion*.
+
+**Additional Item Headers:**
+
+*None*
+
 ### User Feedback
 
 User Feedback was called User Report in the beginning. Therefore the envelope item type is `"user_report"`. 

diff --git a/src/docs/sdk/metrics.mdx b/src/docs/sdk/metrics.mdx
@@ -0,0 +1,292 @@
+---
+title: Metrics
+sidebar_order: 13
+---
+
+Sentry supports the emission of pre-aggregated metrics information via the `statsd`
+envelope item.  This emission bypasses relay-side sampling and assumes a client side
+aggregator.  In the product these are referred to as "custom metrics" and internally
+the system sometimes carries the name "ddm" ([delightful developer
+metrics](/delightful-developer-metrics/)).  The functionality sits on top of the
+generics metrics platform which is also used for span and transaction level
+metrics.
+
+These metrics are considered "trace seeking" which means that they should attempt to
+establish a relationship to the current span or trace.
+
+## Introduction
+
+Custom metrics are sent into the `custom` namespace in the generic metrics platform.
+The features exposed from the generic metrics platform to the SDKs are all
+metric types (**counters**, **distributions**, **gauges** and **sets**).  SDKs
+are required to perform local aggregation unless they perform on-shot emissions.
+
+## Aggregator Behavior
+
+Metrics are aggregated into a process wide aggregator which is required to flush
+in intervals that are reasonable for the SDK.  These aggregations are lossless
+(unless the metric type is already lossy which for instance applies to sets
+and gauges) and are required to support tags.  The following general guidelines
+exist:
+
+* **10 Second Bucketing:** SDKs are required to bucket into 10 second intervals
+   (**rollup in seconds**) which is the current lower bound of metric accuracy.
+* **Flush Shift:** SDKs are required to shift the flush interval by `random() * rollup_in_seconds`.
+  That shift is determined once per startup to create jittering.
+* **Force flush:** an SDK is required to perform force flushing ahead of scheduled
+  time if the memory pressure is too high.  There is no rule for this other than
+  that SDKs should be tracking abstract aggregation complexity (eg: a counter
+  only carries a single float, whereas a distribution is a float per emission).
+* **Background Aggregation:** SDKs are encouraged to defer flushing and aggregation
+  into a background thread.  For SDKs where `fork()` is to be expected, the background
+  thread needs to ensure it restarts after fork (eg: python).
+* **Tagging:** metrics are tagged by key value pairs, where each key can have more
+  than one value.  Both keys and values are limited to strings, but the SDK can
+  expose other types if it has a reasonable way to stringify these.
+
+In abstract terms the aggregator is recommended to look a bit like this:
+
+```python
+class Aggregator:
+    def __init__(self):
+        self.buckets = {}
+    def get_bucket(self, ts):
+        return self.buckets.setdefault(ts // 10, {})
+    def add(self, type, key, value, unit, tags):
+        mri = "%s:%s" % (type, key)
+```
+
+## Normalization
+
+Normalization shall be done with unicode support if possible.
+
+* **Keys and Tag Keys:** regex `[^a-zA-Z0-9_/.-]+"` with replacement character `_`.
+* **Tag Values:** regex `[^\w\d_:/@\.{}\[\]$-]+` with no replacement character.
+
+## Units
+
+SDKs should permit any string unit, but some of them are known to the system and
+will in the future permit recalculations.  They are documented as part of
+relay: [`MetricUnit`](https://getsentry.github.io/relay/relay_metrics/enum.MetricUnit.html).
+
+## Trace Seeking
+
+When a metric is emitted it needs to find the current trace (trace seeking),
+most specifically the current span.  From there two things shall be happening
+unless they are disabled:
+
+* **Common tag attaching:** the tags `transaction`, `release` and `environment`
+  shall be automatically added to the metric.
+* **Span local aggregation:** the metric shall be attached to the current span
+  as a local summary (a form of gauge).  For more information see below.
+
+## API
+
+API wise the SDKs are required to expose static metric functions which are to be
+defined in a `metrics` module.  This also documents the aggregation state for
+each metric type.  They are constructed with an initial value, have an `add`
+method to add another value and a `serialize` method that returns the values for
+the final emission as array.
+
+### Common Attributes
+
+The following attributes are common for all metrics.  These should be emitted
+via keyword arguments, some configuration struct or whatever is reasonable
+in the target language.
+
+* `key`: the name of the metric.  SDKs are required to normalize this name.
+* `value`: the value to be emitted, this is specific by metric type.
+* `unit`: the unit for the metric, see below.
+* `tags`: a dictionary or list of tuples of tags and their values.
+* `stacklevel`: a utility attribute for SDKs where n stack level shall be
+  ignored for code location recording.  In languages that have better
+  facilities this can be removed.
+
+### Span Aggregation
+
+On a span every metric is aggregated as a gauge.  For each metric the following
+attributes are retained per span:
+
+* `min`: the minimum value observed.
+* `max`: the maximum value observed.
+* `count`: the number of emissions.
+* `sum`: the sum of all values.
+* `tags`: the metric tags and their values.
+
+The local aggregator behaves as such:
+
+```python
+class LocalAggregator:
+    def add(self, type, value, value, unit, tags):
+        # Assumes sorted tags
+        bucket_key = ("%s:%s@%s" % (ty, key, unit), tags)
+
+        old = self._measurements.get(bucket_key)
+        if old is not None:
+            v_min, v_max, v_count, v_sum = old
+            v_min = min(v_min, value)
+            v_max = max(v_max, value)
+            v_count += 1
+            v_sum += value
+        else:
+            v_min = v_max = v_sum = value
+            v_count = 1
+        self._measurements[bucket_key] = (v_min, v_max, v_count, v_sum)
+```
+
+### Counters
+
+Counters (`c`) have a single method for emission:
+
+* `incr(key, value: int | float = 1.0, unit = "none", ...)`: increments the
+  given key by one in the bucket.
+
+**Aggregation state:**
+
+```python
+class Counter:
+    def __init__(self, initial):
+        self.value = initial
+    def add(self, value):
+        self.value += value
+    def serialize(self):
+        return [self.value]
+```
+
+**Span aggregation:** the value is added to the span local aggregator.
+
+### Distributions
+
+A distribution (`d`) holds all values observed.  There are multiple methods that
+shall exist on an SDK to support common use cases.  The basic way to emit a
+distribution is the low level `distribution` function, higher level methods
+depend on the facilities in the language.
+
+* `distribution(key, value: int | float, unit = "none", ...)`: emits a single
+  value into a distribution.
+
+For languages with context managers (eg: `with` in Python):
+
+* `timing(key, ...)`: this shall measure the time of a code block and emit
+  the measured time as distribution with a time unit (`second`, `millisecond`,
+  etc.).  The default unit shall be `second` in current SDKs as we are not yet
+  normalizing values on the server.
+
+For languages with lambda callback patterns (eg: `timed(() => { ... })`):
+
+* `timing(key, ..., callback)`: this shall measure the time of the invoked lambda and
+  emit the measured time as distribution with a time unit (`second`, `millisecond`,
+  etc.).  The default unit shall be `second` in current SDKs as we are not yet
+  normalizing values on the server.
+
+For languages with decorators:
+
+* `@timing(key, ...)` or `@timed(key, ...)`: Can be used to annotate a function
+  so that every time it is invoked, a timing is emitted.
+
+**Aggregation state:**
+
+```python
+class Distribution:
+    def __init__(self, initial):
+        self.values = [initial]
+    def add(self, value):
+        self.values.append(value)
+    def serialize(self):
+        return self.values
+```
+
+**Span aggregation:** the value is added to the span local aggregator.
+
+### Gauges
+
+Gauges (`g`) are generally discouraged.  They are often also called "summaries" as they
+are similar in nature to distributions, but somewhat lossy.  They are however emitted
+per span when metric summaries are enabled.  As they are hard to aggregate across
+different emitters they can be tricky to use properly in practice.
+
+They are emitted by a single function called `gauge`:
+
+* `gauge(key, value: int | float, unit = "none", ...)`: emits a single value into the
+  gauge.
+
+**Aggregation state:**
+
+```python
+class Gauge:
+    def __init__(self, initial):
+        self.last = initial
+        self.min = initial
+        self.max = initial
+        self.sum = initial
+        self.count = 1
+    def add(self, value):
+        self.last = value
+        self.min = min(self.min, value)
+        self.max = max(self.max, value)
+        self.sum += value
+        self.count += 1
+    def serialize(self):
+        return (
+            self.last,
+            self.min,
+            self.max,
+            self.sum,
+            self.count,
+        )
+```
+
+**Span aggregation:** the value is added to the span local aggregator.
+
+### Sets
+
+Sets (`s`) are used to count unique members.  SDKs are supposed to accept integers and
+strings at least, but they are encouraged to use CRC32 to fold them into a 32bit integer.
+As such the aggregation state is partially lossy.
+
+They are emitted with a single method:
+
+* `set(key, value: int | string, unit = "none", ...)`: marks a single value to be in the set.
+
+**Aggregation state:**
+
+```python
+class Set:
+    def __init__(self, initial):
+        self.value = {initial}
+    def add(self, value):
+        self.value.add(value)
+    def serialize(self):
+        def _hash(x):
+            if isinstance(x, str):
+                return crc32(x.encode("utf-8")) & 0xFFFFFFFF
+            return int(x)
+        return [_hash(value) for value in self.value]
+```
+
+**Span aggregation:** the value `1` is added to the span local aggregator for new members.
+
+## Serialization
+
+Metrics are emitted in regular flush intervals (every 10 seconds, with a random offset that
+is detemined once per startup) as envelope items.  The item type is `statsd` and the serialization
+format is statsd compatible.  Unlike normal statsd, more than one value can be emitted separated
+by a colon.  These values are the results of calling `serialize` or similar on the aggregation
+state.
+
+The format is documented as part of relay: [relay_metrics](https://getsentry.github.io/relay/relay_metrics/index.html).
+
+Additionally metric summaries retained on the spans shall be emitted as `_metric_summaries`
+on the metrics as documented by this RFC: [RFC-0123](https://github.com/getsentry/rfcs/blob/main/text/0123-metrics-correlation.md).
+
+## Meta Data
+
+Additionally SDKs are encouraged to capture location information once per metric.  At metric
+emission time the system shall be collecting which code location emitted a metric and retain it.
+For that the same format as for the [`frame`](/sdk/event-payloads/stacktrace/) on the stacktrace
+shall be used.  `stacklevel` number of frames shall be ignored.
+
+## Reference Implementation
+
+The Python SDK holds a reference implementation in the `metrics.py` module for the full
+feature set: [metrics.py](https://github.com/getsentry/sentry-python/blob/master/sentry_sdk/metrics.py).