Add container metric fields (from ECS) #282

Merged
merged 15 commits into from
Mar 27, 2024
21 changes: 21 additions & 0 deletions .chloggen/add_new_container_metrics.yaml
@@ -0,0 +1,21 @@
# Use this changelog template to create an entry for release notes.
#
# If your change doesn't affect end users you should instead start
# your pull request title with [chore] or use the "Skip Changelog" label.

# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix'
change_type: "enhancement"

# The name of the area of concern in the attributes-registry, (e.g. http, cloud, db)
component: "container"

# A brief description of the change. Surround your text with quotes ("") if it needs to start with a backtick (`).
note: "Add new container metrics for `cpu`, `memory`, `disk` and `network`"

# Mandatory: One or more tracking issues related to the change. You can use the PR number here if no issue exists.
issues: [282, 72]

# (Optional) One or more lines of additional information to render under the primary note.
# These lines will be padded with 2 spaces and then inserted directly into the document.
# Use pipe (|) for multiline entries.
subtext:
9 changes: 9 additions & 0 deletions docs/attributes-registry/container.md
@@ -11,6 +11,7 @@
| `container.command` | string | The command used to run the container (i.e. the command name). [1] | `otelcontribcol` |
| `container.command_args` | string[] | All the command arguments (including the command/executable itself) run by the container. [2] | `[otelcontribcol, --config, config.yaml]` |
| `container.command_line` | string | The full command run by the container as a single string representing the full command. [2] | `otelcontribcol --config config.yaml` |
| `container.cpu.state` | string | The CPU state for this data point. | `user`; `kernel` |
| `container.id` | string | Container ID. Usually a UUID, as for example used to [identify Docker containers](https://docs.docker.com/engine/reference/run/#container-identification). The UUID might be abbreviated. | `a3bf90e006b2` |
| `container.image.id` | string | Runtime specific image identifier. Usually a hash algorithm followed by a UUID. [2] | `sha256:19c92d0a00d1b66d897bceaa7319bee0dd38a10a851c60bcec9474aa3f01e50f` |
| `container.image.name` | string | Name of the image the container was built on. | `gcr.io/opentelemetry/operator` |
@@ -27,4 +28,12 @@ K8s defines a link to the container registry repository with digest `"imageID":
The ID is assigned by the container runtime and can vary in different environments. Consider using `oci.manifest.digest` if it is important to identify the same image in different environments/runtimes.

**[3]:** [Docker](https://docs.docker.com/engine/api/v1.43/#tag/Image/operation/ImageInspect) and [CRI](https://github.com/kubernetes/cri-api/blob/c75ef5b473bbe2d0a4fc92f82235efd665ea8e9f/pkg/apis/runtime/v1/api.proto#L1237-L1238) report those under the `RepoDigests` field.

`container.cpu.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `user` | When tasks of the cgroup are in user mode (Linux). When all container processes are in user mode (Windows). |
| `system` | When CPU is used by the system (host OS) |
| `kernel` | When tasks of the cgroup are in kernel mode (Linux). When all container processes are in kernel mode (Windows). |
<!-- endsemconv -->
105 changes: 105 additions & 0 deletions docs/system/container-metrics.md
@@ -0,0 +1,105 @@
<!--- Hugo front matter used to generate the website version of this page:
linkTitle: Container
--->

# Semantic Conventions for Container Metrics

**Status**: [Experimental][DocumentStatus]

## Container Metrics

### Metric: `container.cpu.time`

This metric is [opt-in][MetricOptIn].

<!-- semconv metric.container.cpu.time(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `container.cpu.time` | Counter | `s` | Total CPU time consumed [1] |

**[1]:** Total CPU time consumed by the specific container on all available CPU cores
<!-- endsemconv -->

<!-- semconv metric.container.cpu.time(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`container.cpu.state`](../attributes-registry/container.md) | string | The CPU state for this data point. A container SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels. | `user`; `kernel` | Opt-In |

`container.cpu.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used, otherwise a custom value MAY be used.

| Value | Description |
|---|---|
| `user` | When tasks of the cgroup are in user mode (Linux). When all container processes are in user mode (Windows). |
| `system` | When CPU is used by the system (host OS) |
| `kernel` | When tasks of the cgroup are in kernel mode (Linux). When all container processes are in kernel mode (Windows). |
<!-- endsemconv -->
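
A minimal sketch of how a collector might populate `container.cpu.time` with the `container.cpu.state` attribute on Linux, assuming the values are read from a cgroup v2 `cpu.stat` file. The mapping of `system_usec` to the `kernel` state, the function names, and the dict-based data point shape are illustrative assumptions and not part of this change.

```python
"""Sketch: derive `container.cpu.time` data points from cgroup v2 `cpu.stat` text."""

# Assumption: cgroup v2 `system_usec` is reported under the `kernel` state.
CGROUP_KEY_TO_STATE = {"user_usec": "user", "system_usec": "kernel"}


def cpu_time_points(cpu_stat_text):
    """Return one data point per CPU state, with values converted to seconds."""
    points = []
    for line in cpu_stat_text.splitlines():
        parts = line.split()
        if len(parts) != 2 or parts[0] not in CGROUP_KEY_TO_STATE:
            continue  # skip keys such as usage_usec that carry no state
        points.append({
            "name": "container.cpu.time",
            "unit": "s",
            "value": int(parts[1]) / 1_000_000,  # microseconds -> seconds
            "attributes": {"container.cpu.state": CGROUP_KEY_TO_STATE[parts[0]]},
        })
    return points


def states_are_consistent(points):
    """True if either every point carries `container.cpu.state` or none does."""
    with_state = sum("container.cpu.state" in p["attributes"] for p in points)
    return with_state in (0, len(points))


# Hypothetical sample of a cgroup v2 cpu.stat file.
SAMPLE_CPU_STAT = """usage_usec 3500000
user_usec 2500000
system_usec 1000000
"""

points = cpu_time_points(SAMPLE_CPU_STAT)
```

`states_are_consistent` mirrors the SHOULD clause in the attribute description: a container is described either only by data points with a state or only by data points without one.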

### Metric: `container.memory.usage`

This metric is [opt-in][MetricOptIn].

<!-- semconv metric.container.memory.usage(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `container.memory.usage` | Counter | `By` | Memory usage of the container. [1] |

**[1]:** Memory usage of the container.
<!-- endsemconv -->

<!-- semconv metric.container.memory.usage(full) -->
<!-- endsemconv -->

### Metric: `container.disk.io`

This metric is [opt-in][MetricOptIn].

<!-- semconv metric.container.disk.io(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `container.disk.io` | Counter | `By` | Disk bytes for the container. [1] |

**[1]:** The total number of bytes read/written successfully (aggregated from all disks).
<!-- endsemconv -->

<!-- semconv metric.container.disk.io(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`disk.io.direction`](../attributes-registry/disk.md) | string | The disk IO operation direction. | `read` | Recommended |
| `system.device` | string | The device identifier | `(identifier)` | Recommended |

`disk.io.direction` MUST be one of the following:

| Value | Description |
|---|---|
| `read` | read |
| `write` | write |
<!-- endsemconv -->

### Metric: `container.network.io`

This metric is [opt-in][MetricOptIn].

<!-- semconv metric.container.network.io(metric_table) -->
| Name | Instrument Type | Unit (UCUM) | Description |
| -------- | --------------- | ----------- | -------------- |
| `container.network.io` | Counter | `By` | Network bytes for the container. [1] |

**[1]:** The number of bytes sent/received on all network interfaces by the container.
<!-- endsemconv -->

<!-- semconv metric.container.network.io(full) -->
| Attribute | Type | Description | Examples | Requirement Level |
|---|---|---|---|---|
| [`network.io.direction`](../attributes-registry/network.md) | string | The network IO operation direction. | `transmit` | Recommended |
| `system.device` | string | The device identifier | `(identifier)` | Recommended |

`network.io.direction` MUST be one of the following:

| Value | Description |
|---|---|
| `transmit` | transmit |
| `receive` | receive |
<!-- endsemconv -->
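
A minimal sketch of how the direction constraints on `container.disk.io` and `container.network.io` could be enforced when building data points. The allowed values come from the tables above; the helper name `io_point`, the dict-based data point shape, and the device names `sda` and `eth0` are hypothetical.

```python
"""Sketch: build `container.disk.io` / `container.network.io` data points."""

# Direction attribute and its allowed values per metric, taken from the tables above.
DIRECTIONS = {
    "container.disk.io": ("disk.io.direction", {"read", "write"}),
    "container.network.io": ("network.io.direction", {"transmit", "receive"}),
}


def io_point(metric_name, direction, byte_count, device=None):
    """Return a data point dict, rejecting directions the convention does not allow."""
    attribute_key, allowed = DIRECTIONS[metric_name]
    if direction not in allowed:
        raise ValueError(
            f"{attribute_key} must be one of {sorted(allowed)}, got {direction!r}"
        )
    attributes = {attribute_key: direction}
    if device is not None:
        attributes["system.device"] = device  # Recommended attribute, so optional here
    return {"name": metric_name, "unit": "By", "value": byte_count, "attributes": attributes}


disk_read = io_point("container.disk.io", "read", 4096, device="sda")
net_tx = io_point("container.network.io", "transmit", 1500, device="eth0")
```

Passing a direction that belongs to the other metric, for example `transmit` on `container.disk.io`, raises `ValueError` rather than emitting a data point with an invalid attribute value.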

[DocumentStatus]: https://github.com/open-telemetry/opentelemetry-specification/tree/v1.22.0/specification/document-status.md
[MetricOptIn]: https://github.com/open-telemetry/opentelemetry-specification/blob/v1.26.0/specification/metrics/metric-requirement-level.md#opt-in
53 changes: 53 additions & 0 deletions model/metrics/container.yaml
@@ -0,0 +1,53 @@
groups:
  # container.cpu.* metrics and attribute group
  - id: metric.container.cpu.time
    type: metric
    metric_name: container.cpu.time
    brief: "Total CPU time consumed"
    note: >
      Total CPU time consumed by the specific container on all available CPU cores
    instrument: counter
    unit: "s"
    attributes:
      - ref: container.cpu.state
        brief: "The CPU state for this data point. A container SHOULD be characterized _either_ by data points with no `state` labels, _or only_ data points with `state` labels."
        requirement_level: opt_in

  # container.memory.* metrics and attribute group
  - id: metric.container.memory.usage
    type: metric
    metric_name: container.memory.usage
    brief: "Memory usage of the container."
    note: >
      Memory usage of the container.
    instrument: counter
    unit: "By"

  # container.disk.io.* metrics and attribute group
  - id: metric.container.disk.io
    type: metric
    metric_name: container.disk.io
    brief: "Disk bytes for the container."
    note: >
      The total number of bytes read/written
      successfully (aggregated from all disks).
    instrument: counter
    unit: "By"
    attributes:
      - ref: disk.io.direction
      - ref: system.device

  # container.network.io.* metrics and attribute group
  - id: metric.container.network.io
    type: metric
    metric_name: container.network.io
    brief: "Network bytes for the container."
    note: >
      The number of bytes sent/received
      on all network interfaces
      by the container.
    instrument: counter
    unit: "By"
    attributes:
      - ref: network.io.direction
      - ref: system.device
15 changes: 15 additions & 0 deletions model/registry/container.yaml
@@ -95,3 +95,18 @@ groups:
        brief: >
          Container labels, `<key>` being the label name, the value being the label value.
        examples: [ 'container.label.app=nginx' ]
      - id: cpu.state
        brief: "The CPU state for this data point."
        type:
          allow_custom_values: true
          members:
            - id: user
              value: 'user'
              brief: "When tasks of the cgroup are in user mode (Linux). When all container processes are in user mode (Windows)."
            - id: system
              value: 'system'
              brief: "When CPU is used by the system (host OS)"
            - id: kernel
              value: 'kernel'
              brief: "When tasks of the cgroup are in kernel mode (Linux). When all container processes are in kernel mode (Windows)."
        examples: ["user", "kernel"]