diff --git a/.chloggen/1482.yaml b/.chloggen/1482.yaml new file mode 100644 index 0000000000..2f56c01780 --- /dev/null +++ b/.chloggen/1482.yaml @@ -0,0 +1,18 @@ +# Use this changelog template to create an entry for release notes. +# +# If your change doesn't affect end users you should instead start +# your pull request title with [chore] or use the "Skip Changelog" label. + +# One of 'breaking', 'deprecation', 'new_component', 'enhancement', 'bug_fix' +change_type: bug_fix +component: db +note: | + Fix telemetry for complex queries: + + - introduce the `db.query.summary` attribute to provide a concise, low-cardinality + representation of the query text. + - use `db.query.summary` as the span name and as a recommended attribute on metrics. + - avoid capturing `db.operation.name` and `db.collection.name` when the query + involves multiple operations or collections, to prevent ambiguity. + +issues: [521, 805, 1159] diff --git a/docs/attributes-registry/db.md b/docs/attributes-registry/db.md index 1def1b8797..34e4ebc837 100644 --- a/docs/attributes-registry/db.md +++ b/docs/attributes-registry/db.md @@ -17,22 +17,34 @@ This group defines the attributes used to describe telemetry in the context of databases. -| Attribute | Type | Description | Examples | Stability | -| ------------------------------------------------------------------------------------------------------------------ | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------- | ---------------------------------------------------------------- | -| `db.client.connection.pool.name` | string | The name of the connection pool; unique within the instrumented application. In case the connection pool implementation doesn't provide a name, instrumentation SHOULD use a combination of parameters that would make the name unique, for example, combining attributes `server.address`, `server.port`, and `db.namespace`, formatted as `server.address:server.port/db.namespace`. Instrumentations that generate connection pool name following different patterns SHOULD document it. | `myDataSource` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.client.connection.state` | string | The state of a connection in the pool | `idle` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.collection.name` | string | The name of a collection (table, container) within the database. [1] | `public.users`; `customers` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.namespace` | string | The name of the database, fully qualified within the server address and port. [2] | `customers`; `test.users` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.operation.batch.size` | int | The number of queries included in a batch operation. [3] | `2`; `3`; `4` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.operation.name` | string | The name of the operation or command being executed. [4] | `findAndModify`; `HMSET`; `SELECT` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.query.parameter.` | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [5] | `someval`; `55` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.query.text` | string | The database query being executed. [6] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.response.status_code` | string | Database response status code. [7] | `102`; `ORA-17002`; `08P01`; `404` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.system` | string | The database management system (DBMS) product as identified by the client instrumentation. [8] | `other_sql`; `adabas`; `cache` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| Attribute | Type | Description | Examples | Stability | +| ------------------------------------------------------------------------------------------------------------------ | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------- | ---------------------------------------------------------------- | +| `db.client.connection.pool.name` | string | The name of the connection pool; unique within the instrumented application. In case the connection pool implementation doesn't provide a name, instrumentation SHOULD use a combination of parameters that would make the name unique, for example, combining attributes `server.address`, `server.port`, and `db.namespace`, formatted as `server.address:server.port/db.namespace`. Instrumentations that generate connection pool name following different patterns SHOULD document it. | `myDataSource` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.client.connection.state` | string | The state of a connection in the pool | `idle` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.collection.name` | string | The name of a collection (table, container) within the database. [1] | `public.users`; `customers` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.namespace` | string | The name of the database, fully qualified within the server address and port. [2] | `customers`; `test.users` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.operation.batch.size` | int | The number of queries included in a batch operation. [3] | `2`; `3`; `4` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.operation.name` | string | The name of the operation or command being executed. [4] | `findAndModify`; `HMSET`; `SELECT` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.query.parameter.` | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [5] | `someval`; `55` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.query.summary` | string | Low cardinality representation of a database query text. [6] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.query.text` | string | The database query being executed. [7] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.response.status_code` | string | Database response status code. [8] | `102`; `ORA-17002`; `08P01`; `404` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.system` | string | The database management system (DBMS) product as identified by the client instrumentation. [9] | `other_sql`; `adabas`; `cache` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. **[2]:** If a database system has multiple namespace components, they SHOULD be concatenated (potentially using database system specific conventions) from most general to most specific namespace component, and more specific namespaces SHOULD NOT be captured without the more general namespaces, to ensure that "startswith" queries for the more general namespaces will be valid. @@ -43,25 +55,39 @@ This attribute has stability level RELEASE CANDIDATE. **[3]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[4]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query. -For batch operations, if the individual operations are known to have the same operation name then that operation name SHOULD be used prepended by `BATCH `, otherwise `db.operation.name` SHOULD be `BATCH` or some other database system specific term if more applicable. +**[4]:** It is RECOMMENDED to capture the value as provided by the application +without attempting to do any case normalization. + +A single database query may involve multiple operations. If the operation +name is parsed from the query text, it SHOULD only be captured for queries that +contain a single operation or when the operation name describing the +whole query is available by other means. + +For batch operations, if the individual operations are known to have the same operation name +then that operation name SHOULD be used prepended by `BATCH `, +otherwise `db.operation.name` SHOULD be `BATCH` or some other database +system specific term if more applicable. + This attribute has stability level RELEASE CANDIDATE. **[5]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. This attribute has stability level RELEASE CANDIDATE. -**[6]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[6]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[7]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. This attribute has stability level RELEASE CANDIDATE. -**[7]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. +**[8]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. This attribute has stability level RELEASE CANDIDATE. -**[8]:** The actual DBMS may differ from the one identified by the client. For example, when using PostgreSQL client libraries to connect to a CockroachDB, the `db.system` is set to `postgresql` based on the instrumentation's best knowledge. +**[9]:** The actual DBMS may differ from the one identified by the client. For example, when using PostgreSQL client libraries to connect to a CockroachDB, the `db.system` is set to `postgresql` based on the instrumentation's best knowledge. This attribute has stability level RELEASE CANDIDATE. `db.client.connection.state` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. @@ -206,9 +232,9 @@ This group defines attributes for Elasticsearch. | Attribute | Type | Description | Examples | Stability | | --------------------------------------------------------------------------------------------------------------- | ------ | -------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | ---------------------------------------------------------------- | | `db.elasticsearch.node.name` | string | Represents the human-readable identifier of the node/instance to which a request was routed. | `instance-0000000001` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| `db.elasticsearch.path_parts.` | string | A dynamic value in the url path. [9] | `db.elasticsearch.path_parts.index=test-index`; `db.elasticsearch.path_parts.doc_id=123` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| `db.elasticsearch.path_parts.` | string | A dynamic value in the url path. [10] | `db.elasticsearch.path_parts.index=test-index`; `db.elasticsearch.path_parts.doc_id=123` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -**[9]:** Many Elasticsearch url paths allow dynamic values. These SHOULD be recorded in span attributes in the format `db.elasticsearch.path_parts.`, where `` is the url path part name. The implementation SHOULD reference the [elasticsearch schema](https://raw.githubusercontent.com/elastic/elasticsearch-specification/main/output/schema/schema.json) in order to map the path part values to their names. +**[10]:** Many Elasticsearch url paths allow dynamic values. These SHOULD be recorded in span attributes in the format `db.elasticsearch.path_parts.`, where `` is the url path part name. The implementation SHOULD reference the [elasticsearch schema](https://raw.githubusercontent.com/elastic/elasticsearch-specification/main/output/schema/schema.json) in order to map the path part values to their names. ## Deprecated Database Attributes diff --git a/docs/database/cassandra.md b/docs/database/cassandra.md index 505819847b..99d4fd6e08 100644 --- a/docs/database/cassandra.md +++ b/docs/database/cassandra.md @@ -34,30 +34,52 @@ The Semantic Conventions for [Cassandra](https://cassandra.apache.org/) extend a | [`db.cassandra.page_size`](/docs/attributes-registry/db.md) | int | The fetch size used for paging, i.e. how many rows will be returned at once. | `5000` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.cassandra.speculative_execution_count`](/docs/attributes-registry/db.md) | int | The number of times a query was speculatively executed. Not set or `0` if the query was not executed speculatively. | `0`; `2` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [11] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [12] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [13] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [14] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.summary`](/docs/attributes-registry/db.md) | string | Low cardinality representation of a database query text. [12] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | `Recommended` [13] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [14] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | `Recommended` [15] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [16] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` if and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [16] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [17] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [18] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. -**[2]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query. +**[2]:** If readily available and if a database call is performed on a single collection. The collection name MAY be parsed from the query text, in which case it SHOULD be the single collection name in the query. **[3]:** If a database system has multiple namespace components, they SHOULD be concatenated (potentially using database system specific conventions) from most general to most specific namespace component, and more specific namespaces SHOULD NOT be captured without the more general namespaces, to ensure that "startswith" queries for the more general namespaces will be valid. Semantic conventions for individual database systems SHOULD document what `db.namespace` means in the context of that system. It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. This attribute has stability level RELEASE CANDIDATE. -**[4]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query. -For batch operations, if the individual operations are known to have the same operation name then that operation name SHOULD be used prepended by `BATCH `, otherwise `db.operation.name` SHOULD be `BATCH` or some other database system specific term if more applicable. +**[4]:** It is RECOMMENDED to capture the value as provided by the application +without attempting to do any case normalization. + +A single database query may involve multiple operations. If the operation +name is parsed from the query text, it SHOULD only be captured for queries that +contain a single operation or when the operation name describing the +whole query is available by other means. + +For batch operations, if the individual operations are known to have the same operation name +then that operation name SHOULD be used prepended by `BATCH `, +otherwise `db.operation.name` SHOULD be `BATCH` or some other database +system specific term if more applicable. + This attribute has stability level RELEASE CANDIDATE. -**[5]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[5]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[6]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. @@ -76,18 +98,25 @@ Instrumentations SHOULD document how `error.type` is populated. **[11]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[12]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[12]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[13]:** if readily available or if instrumentation supports query summarization. + +**[14]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. This attribute has stability level RELEASE CANDIDATE. -**[13]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[15]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). -**[14]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. +**[16]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. -**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[17]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[16]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[18]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. This attribute has stability level RELEASE CANDIDATE. @@ -97,6 +126,7 @@ and SHOULD be provided **at span creation time** (if provided at all): * [`db.collection.name`](/docs/attributes-registry/db.md) * [`db.namespace`](/docs/attributes-registry/db.md) * [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.summary`](/docs/attributes-registry/db.md) * [`db.query.text`](/docs/attributes-registry/db.md) * [`server.address`](/docs/attributes-registry/server.md) * [`server.port`](/docs/attributes-registry/server.md) diff --git a/docs/database/cosmosdb.md b/docs/database/cosmosdb.md index 2c93330c50..fde6226532 100644 --- a/docs/database/cosmosdb.md +++ b/docs/database/cosmosdb.md @@ -38,10 +38,11 @@ Cosmos DB instrumentation includes call-level (public API) surface spans and net | [`db.cosmosdb.client_id`](/docs/attributes-registry/db.md) | string | Unique Cosmos client instance id. | `3ba4827d-4422-483f-b59f-85b74211c11d` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.cosmosdb.request_content_length`](/docs/attributes-registry/db.md) | int | Request payload size in bytes | | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [9] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [10] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [12] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`user_agent.original`](/docs/attributes-registry/user-agent.md) | string | Full user-agent string is generated by Cosmos DB SDK [13] | `cosmos-netstandard-sdk/3.23.0\|3.23.1\|1\|X64\|Linux 5.4.0-1098-azure 104 18\|.NET Core 3.1.32\|S\|` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.summary`](/docs/attributes-registry/db.md) | string | Low cardinality representation of a database query text. [10] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | `Recommended` [11] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [12] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | `Recommended` [13] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [14] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`user_agent.original`](/docs/attributes-registry/user-agent.md) | string | Full user-agent string is generated by Cosmos DB SDK [15] | `cosmos-netstandard-sdk/3.23.0\|3.23.1\|1\|X64\|Linux 5.4.0-1098-azure 104 18\|.NET Core 3.1.32\|S\|` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [16] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. @@ -171,7 +172,7 @@ If none of them applies, it's RECOMMENDED to use language-agnostic representatio client method name in snake_case. Instrumentations SHOULD document additional values when introducing new operations. -**[4]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[4]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[5]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. @@ -188,20 +189,27 @@ Instrumentations SHOULD document how `error.type` is populated. **[9]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[10]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[10]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[11]:** if readily available or if instrumentation supports query summarization. + +**[12]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. This attribute has stability level RELEASE CANDIDATE. -**[11]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[13]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). -**[12]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[14]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[13]:** The user-agent value is generated by SDK which is a combination of
`sdk_version` : Current version of SDK. e.g. 'cosmos-netstandard-sdk/3.23.0'
`direct_pkg_version` : Direct package version used by Cosmos DB SDK. e.g. '3.23.1'
`number_of_client_instances` : Number of cosmos client instances created by the application. e.g. '1'
`type_of_machine_architecture` : Machine architecture. e.g. 'X64'
`operating_system` : Operating System. e.g. 'Linux 5.4.0-1098-azure 104 18'
`runtime_framework` : Runtime Framework. e.g. '.NET Core 3.1.32'
`failover_information` : Generated key to determine if region failover enabled. +**[15]:** The user-agent value is generated by SDK which is a combination of
`sdk_version` : Current version of SDK. e.g. 'cosmos-netstandard-sdk/3.23.0'
`direct_pkg_version` : Direct package version used by Cosmos DB SDK. e.g. '3.23.1'
`number_of_client_instances` : Number of cosmos client instances created by the application. e.g. '1'
`type_of_machine_architecture` : Machine architecture. e.g. 'X64'
`operating_system` : Operating System. e.g. 'Linux 5.4.0-1098-azure 104 18'
`runtime_framework` : Runtime Framework. e.g. '.NET Core 3.1.32'
`failover_information` : Generated key to determine if region failover enabled. Format Reg-{D (Disabled discovery)}-S(application region)|L(List of preferred regions)|N(None, user did not configure it). Default value is "NS". -**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[16]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. This attribute has stability level RELEASE CANDIDATE. @@ -211,6 +219,7 @@ and SHOULD be provided **at span creation time** (if provided at all): * [`db.collection.name`](/docs/attributes-registry/db.md) * [`db.namespace`](/docs/attributes-registry/db.md) * [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.summary`](/docs/attributes-registry/db.md) * [`db.query.text`](/docs/attributes-registry/db.md) * [`server.address`](/docs/attributes-registry/server.md) * [`server.port`](/docs/attributes-registry/server.md) diff --git a/docs/database/couchdb.md b/docs/database/couchdb.md index 3da577dde2..8e38024767 100644 --- a/docs/database/couchdb.md +++ b/docs/database/couchdb.md @@ -22,35 +22,33 @@ The Semantic Conventions for [CouchDB](https://couchdb.apache.org/) extend and o | Attribute | Type | Description | Examples | [Requirement Level](https://opentelemetry.io/docs/specs/semconv/general/attribute-requirement-level/) | Stability | |---|---|---|---|---|---| | [`db.namespace`](/docs/attributes-registry/db.md) | string | The name of the database, fully qualified within the server address and port. | `customers`; `test.users` | `Conditionally Required` If available. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.operation.name`](/docs/attributes-registry/db.md) | string | The HTTP method + the target REST route. [1] | `GET /{db}/{docid}` | `Conditionally Required` [2] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.response.status_code`](/docs/attributes-registry/db.md) | string | The HTTP response code returned by the Couch DB. [3] | `200`; `201`; `429` | `Conditionally Required` [4] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [5] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [6] | `80`; `8080`; `443` | `Conditionally Required` [7] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [8] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [9] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.name`](/docs/attributes-registry/db.md) | string | The HTTP method + the target REST route. [1] | `GET /{db}/{docid}` | `Conditionally Required` If readily available. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.response.status_code`](/docs/attributes-registry/db.md) | string | The HTTP response code returned by the Couch DB. [2] | `200`; `201`; `429` | `Conditionally Required` [3] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [4] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [5] | `80`; `8080`; `443` | `Conditionally Required` [6] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [7] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [8] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | **[1]:** In **CouchDB**, `db.operation.name` should be set to the HTTP method + the target REST route according to the API reference documentation. For example, when retrieving a document, `db.operation.name` would be set to (literally, i.e., without replacing the placeholders with concrete values): [`GET /{db}/{docid}`](https://docs.couchdb.org/en/stable/api/document/common.html#get--db-docid). -**[2]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. - -**[3]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. +**[2]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. This attribute has stability level RELEASE CANDIDATE. -**[4]:** If response was received and the HTTP response code is available. +**[3]:** If response was received and the HTTP response code is available. -**[5]:** The `error.type` SHOULD match the `db.response.status_code` returned by the database or the client library, or the canonical name of exception that occurred. +**[4]:** The `error.type` SHOULD match the `db.response.status_code` returned by the database or the client library, or the canonical name of exception that occurred. When using canonical exception type name, instrumentation SHOULD do the best effort to report the most relevant type. For example, if the original exception is wrapped into a generic one, the original exception SHOULD be preferred. Instrumentations SHOULD document how `error.type` is populated. -**[6]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[5]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. -**[7]:** If using a port other than the default port for this DBMS and if `server.address` is set. +**[6]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[8]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. +**[7]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[9]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[8]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. The following attributes can be important for making sampling decisions and SHOULD be provided **at span creation time** (if provided at all): diff --git a/docs/database/database-metrics.md b/docs/database/database-metrics.md index 30056e1c13..f27d05bbdf 100644 --- a/docs/database/database-metrics.md +++ b/docs/database/database-metrics.md @@ -88,31 +88,54 @@ of `[ 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5, 10 ]`. | [`db.response.status_code`](/docs/attributes-registry/db.md) | string | Database response status code. [7] | `102`; `ORA-17002`; `08P01`; `404` | `Conditionally Required` [8] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [9] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [10] | `80`; `8080`; `443` | `Conditionally Required` [11] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [12] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this database system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.summary`](/docs/attributes-registry/db.md) | string | Low cardinality representation of a database query text. [12] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | `Recommended` [13] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [14] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this database system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` If and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [16] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** The actual DBMS may differ from the one identified by the client. For example, when using PostgreSQL client libraries to connect to a CockroachDB, the `db.system` is set to `postgresql` based on the instrumentation's best knowledge. This attribute has stability level RELEASE CANDIDATE. **[2]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. -**[3]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name in the query. +**[3]:** If readily available and if a database call is performed on a single collection. The collection name MAY be parsed from the query text, in which case it SHOULD be the single collection name in the query. **[4]:** If a database system has multiple namespace components, they SHOULD be concatenated (potentially using database system specific conventions) from most general to most specific namespace component, and more specific namespaces SHOULD NOT be captured without the more general namespaces, to ensure that "startswith" queries for the more general namespaces will be valid. Semantic conventions for individual database systems SHOULD document what `db.namespace` means in the context of that system. It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. This attribute has stability level RELEASE CANDIDATE. -**[5]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query. -For batch operations, if the individual operations are known to have the same operation name then that operation name SHOULD be used prepended by `BATCH `, otherwise `db.operation.name` SHOULD be `BATCH` or some other database system specific term if more applicable. +**[5]:** It is RECOMMENDED to capture the value as provided by the application +without attempting to do any case normalization. + +A single database query may involve multiple operations. If the operation +name is parsed from the query text, it SHOULD only be captured for queries that +contain a single operation or when the operation name describing the +whole query is available by other means. + +For batch operations, if the individual operations are known to have the same operation name +then that operation name SHOULD be used prepended by `BATCH `, +otherwise `db.operation.name` SHOULD be `BATCH` or some other database +system specific term if more applicable. + This attribute has stability level RELEASE CANDIDATE. -**[6]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[6]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[7]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. @@ -128,10 +151,26 @@ Instrumentations SHOULD document how `error.type` is populated. **[11]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[12]:** Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly. +**[12]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[13]:** if readily available or if instrumentation supports query summarization. + +**[14]:** Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly. If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. -**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. + +**[16]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. +Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. +This attribute has stability level RELEASE CANDIDATE. + +The following attributes can be important for making sampling decisions +and SHOULD be provided **at span creation time** (if provided at all): + +* [`db.collection.name`](/docs/attributes-registry/db.md) `db.system` has the following list of well-known values. If one of them applies, then the respective value MUST be used; otherwise, a custom value MAY be used. diff --git a/docs/database/database-spans.md b/docs/database/database-spans.md index da5b18579d..4e1996d59a 100644 --- a/docs/database/database-spans.md +++ b/docs/database/database-spans.md @@ -14,6 +14,7 @@ linkTitle: Client Calls - [Common attributes](#common-attributes) - [Notes and well-known identifiers for `db.system`](#notes-and-well-known-identifiers-for-dbsystem) - [Sanitization of `db.query.text`](#sanitization-of-dbquerytext) +- [Generating a summary of the query text](#generating-a-summary-of-the-query-text) - [Semantic Conventions for specific database technologies](#semantic-conventions-for-specific-database-technologies) @@ -54,13 +55,16 @@ with all retries. Database spans MUST follow the overall [guidelines for span names](https://github.com/open-telemetry/opentelemetry-specification/tree/v1.37.0/specification/trace/api.md#span). +The **span name** SHOULD be `{db.query.summary}` if a summary is available. + -The **span name** SHOULD be `{db.operation.name} {target}` if there is a -(low-cardinality) `{db.operation.name}` available (see below for the exact definition of the [`{target}`](#target-placeholder) placeholder). +If no summary is available, the span name SHOULD be `{db.operation.name} {target}` +provided that a (low-cardinality) `db.operation.name` is available (see below for +the exact definition of the [`{target}`](#target-placeholder) placeholder). -If there is no (low-cardinality) `db.operation.name` available, database span names -SHOULD be [`{target}`](#target-placeholder). +If a (low-cardinality) `db.operation.name` is not available, database span names +SHOULD default to the [`{target}`](#target-placeholder). If neither `{db.operation.name}` nor `{target}` are available, span name SHOULD be `{db.system}`. @@ -98,33 +102,55 @@ These attributes will usually be the same for all operations performed over the | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [9] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [10] | `80`; `8080`; `443` | `Conditionally Required` [11] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [12] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [13] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [14] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [15] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this database system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.summary`](/docs/attributes-registry/db.md) | string | Low cardinality representation of a database query text. [13] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | `Recommended` [14] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [15] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | `Recommended` [16] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`network.peer.address`](/docs/attributes-registry/network.md) | string | Peer address of the database node where the operation was performed. [17] | `10.1.2.80`; `/tmp/my.sock` | `Recommended` If applicable for this database system. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`network.peer.port`](/docs/attributes-registry/network.md) | int | Peer port number of the network connection. | `65123` | `Recommended` if and only if `network.peer.address` is set. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [16] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [17] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [18] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [19] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** The actual DBMS may differ from the one identified by the client. For example, when using PostgreSQL client libraries to connect to a CockroachDB, the `db.system` is set to `postgresql` based on the instrumentation's best knowledge. This attribute has stability level RELEASE CANDIDATE. **[2]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. -**[3]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query. +**[3]:** If readily available and if a database call is performed on a single collection. The collection name MAY be parsed from the query text, in which case it SHOULD be the single collection name in the query. **[4]:** If a database system has multiple namespace components, they SHOULD be concatenated (potentially using database system specific conventions) from most general to most specific namespace component, and more specific namespaces SHOULD NOT be captured without the more general namespaces, to ensure that "startswith" queries for the more general namespaces will be valid. Semantic conventions for individual database systems SHOULD document what `db.namespace` means in the context of that system. It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. This attribute has stability level RELEASE CANDIDATE. -**[5]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query. -For batch operations, if the individual operations are known to have the same operation name then that operation name SHOULD be used prepended by `BATCH `, otherwise `db.operation.name` SHOULD be `BATCH` or some other database system specific term if more applicable. +**[5]:** It is RECOMMENDED to capture the value as provided by the application +without attempting to do any case normalization. + +A single database query may involve multiple operations. If the operation +name is parsed from the query text, it SHOULD only be captured for queries that +contain a single operation or when the operation name describing the +whole query is available by other means. + +For batch operations, if the individual operations are known to have the same operation name +then that operation name SHOULD be used prepended by `BATCH `, +otherwise `db.operation.name` SHOULD be `BATCH` or some other database +system specific term if more applicable. + This attribute has stability level RELEASE CANDIDATE. -**[6]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[6]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[7]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. @@ -143,19 +169,26 @@ Instrumentations SHOULD document how `error.type` is populated. **[12]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[13]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[13]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[14]:** if readily available or if instrumentation supports query summarization. + +**[15]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. This attribute has stability level RELEASE CANDIDATE. -**[14]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[16]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). -**[15]:** Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly. +**[17]:** Semantic conventions for individual database systems SHOULD document whether `network.peer.*` attributes are applicable. Network peer address and port are useful when the application interacts with individual database nodes directly. If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. -**[16]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[18]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[17]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[19]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. This attribute has stability level RELEASE CANDIDATE. @@ -165,6 +198,7 @@ and SHOULD be provided **at span creation time** (if provided at all): * [`db.collection.name`](/docs/attributes-registry/db.md) * [`db.namespace`](/docs/attributes-registry/db.md) * [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.summary`](/docs/attributes-registry/db.md) * [`db.query.text`](/docs/attributes-registry/db.md) * [`db.system`](/docs/attributes-registry/db.md) * [`server.address`](/docs/attributes-registry/server.md) @@ -268,6 +302,105 @@ Placeholders in a parameterized query SHOULD not be sanitized. E.g. `where id = e.g. from `IN (?, ?, ?, ?)` to `IN (?)`, as this can help with extremely long IN-clauses, and can help control cardinality for users who choose to (optionally) add `db.query.text` to their metric attributes. +## Generating a summary of the query text + +The `db.query.summary` attribute captures a shortened representation of a query text +which SHOULD have low-cardinality and SHOULD NOT contain any dynamic or sensitive data. + +> [!NOTE] +> The `db.query.text` attribute is intended to identify individual queries. Even though +> it is sanitized if captured by default, it could still have high cardinality and +> might reach hundreds of lines. +> +> The `db.query.summary` is intended to provide a less granular grouping key that +> can be used as a span name or a metric attribute in common cases. It SHOULD +> only contain information that has a significant impact on the query, database, +> or application performance. + +Instrumentations that support query parsing SHOULD generate a query summary when +one is not readily available from other sources. + +The summary SHOULD preserve the following parts of query in the order they were provided: + +- operations such as SQL SELECT, INSERT, UPDATE, DELETE, and other commands +- operation targets such as collections and database names + +Instrumentations that support query parsing SHOULD parse the query and extract a +list of operations and targets from the query. It SHOULD set `db.query.summary` +attribute to the value formatted in the following way: + +``` +{operation1} {target1} {operation2} {target2} {target3} ... +```` + +Instrumentations SHOULD capture the values of operations and targets as provided +by the application without attempting to do any case normalization. If the operation +and target value is populated on `db.operation.name`, `db.collection.name`, +or other attributes, it SHOULD match the value used in the `db.query.summary`. + +**Examples**: + +- Query that consist of a single operation: + + ```sql + SELECT * + FROM wuser_table + WHERE username = ? + ``` + + the corresponding `db.query.summary` is `SELECT wuser_table`. + +- Query that performs multiple operations: + + ```sql + INSERT INTO shipping_details + (order_id, + address) + SELECT order_id, + address + FROM orders + WHERE order_id = ? + ``` + + the corresponding `db.query.summary` is `INSERT shipping_details SELECT orders`. + +- Query that performs an operation that's applied to multiple collections: + + ```sql + SELECT * + FROM songs, + artists + WHERE songs.artist_id == artists.id + ``` + + the corresponding `db.query.summary` is `SELECT songs artists`. + +- Query that performs an operation on an anonymous table: + + ```sql + SELECT order_date + FROM (SELECT * + FROM orders o + JOIN customers c + ON o.customer_id = c.customer_id) + ``` + + the corresponding `db.query.summary` is `SELECT SELECT orders customers`. + +- Query that performs an operation on multiple collections with double-quotes or other punctuation: + + ```sql + SELECT * + FROM "song list", + 'artists' + ``` + + the corresponding `db.query.summary` is `SELECT "songs list" 'artists'`. + +Semantic conventions for individual database systems or specialized instrumentations +MAY specify a different `db.query.summary` format as long as produced summary remains +relatively short and its cardinality remains low comparing to the `db.query.text`. + ## Semantic Conventions for specific database technologies More specific Semantic Conventions are defined for the following database technologies: diff --git a/docs/database/hbase.md b/docs/database/hbase.md index ffc09e0651..e7a9f5921a 100644 --- a/docs/database/hbase.md +++ b/docs/database/hbase.md @@ -23,12 +23,12 @@ The Semantic Conventions for [HBase](https://hbase.apache.org/) extend and overr |---|---|---|---|---|---| | [`db.collection.name`](/docs/attributes-registry/db.md) | string | The HBase table name. [1] | `mytable`; `ns:table` | `Conditionally Required` If applicable. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.namespace`](/docs/attributes-registry/db.md) | string | The HBase namespace. [2] | `mynamespace` | `Conditionally Required` If applicable. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.operation.name`](/docs/attributes-registry/db.md) | string | The name of the operation or command being executed. [3] | `findAndModify`; `HMSET`; `SELECT` | `Conditionally Required` [4] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.response.status_code`](/docs/attributes-registry/db.md) | string | Protocol-specific response code recorded as string. [5] | `200`; `409`; `14` | `Conditionally Required` If response was received. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [6] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [7] | `80`; `8080`; `443` | `Conditionally Required` [8] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [9] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [10] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.name`](/docs/attributes-registry/db.md) | string | The name of the operation or command being executed. [3] | `findAndModify`; `HMSET`; `SELECT` | `Conditionally Required` If readily available. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.response.status_code`](/docs/attributes-registry/db.md) | string | Protocol-specific response code recorded as string. [4] | `200`; `409`; `14` | `Conditionally Required` If response was received. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [5] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [6] | `80`; `8080`; `443` | `Conditionally Required` [7] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [8] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [9] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | **[1]:** If table name includes the namespace, the `db.collection.name` SHOULD be set to the full table name. @@ -37,29 +37,37 @@ Semantic conventions for individual database systems SHOULD document what `db.na It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. This attribute has stability level RELEASE CANDIDATE. -**[3]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query. -For batch operations, if the individual operations are known to have the same operation name then that operation name SHOULD be used prepended by `BATCH `, otherwise `db.operation.name` SHOULD be `BATCH` or some other database system specific term if more applicable. -This attribute has stability level RELEASE CANDIDATE. +**[3]:** It is RECOMMENDED to capture the value as provided by the application +without attempting to do any case normalization. + +A single database query may involve multiple operations. If the operation +name is parsed from the query text, it SHOULD only be captured for queries that +contain a single operation or when the operation name describing the +whole query is available by other means. -**[4]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +For batch operations, if the individual operations are known to have the same operation name +then that operation name SHOULD be used prepended by `BATCH `, +otherwise `db.operation.name` SHOULD be `BATCH` or some other database +system specific term if more applicable. + +This attribute has stability level RELEASE CANDIDATE. -**[5]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. +**[4]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. This attribute has stability level RELEASE CANDIDATE. -**[6]:** The `error.type` SHOULD match the `db.response.status_code` returned by the database or the client library, or the canonical name of exception that occurred. +**[5]:** The `error.type` SHOULD match the `db.response.status_code` returned by the database or the client library, or the canonical name of exception that occurred. When using canonical exception type name, instrumentation SHOULD do the best effort to report the most relevant type. For example, if the original exception is wrapped into a generic one, the original exception SHOULD be preferred. Instrumentations SHOULD document how `error.type` is populated. -**[7]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[6]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. -**[8]:** If using a port other than the default port for this DBMS and if `server.address` is set. +**[7]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[9]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. +**[8]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[10]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[9]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. The following attributes can be important for making sampling decisions and SHOULD be provided **at span creation time** (if provided at all): diff --git a/docs/database/mariadb.md b/docs/database/mariadb.md index 894665d15b..fc20d6d57a 100644 --- a/docs/database/mariadb.md +++ b/docs/database/mariadb.md @@ -28,16 +28,28 @@ The Semantic Conventions for *MariaDB* extend and override the [Database Semanti | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.summary`](/docs/attributes-registry/db.md) | string | Low cardinality representation of a database query text. [11] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [13] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | `Recommended` [14] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [16] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. -**[2]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query. +**[2]:** If readily available and if a database call is performed on a single collection. The collection name MAY be parsed from the query text, in which case it SHOULD be the single collection name in the query. **[3]:** A connection's currently associated database may change during its lifetime, e.g. from executing `USE `. @@ -52,7 +64,7 @@ It is RECOMMENDED to capture the value as provided by the application without at **[4]:** This SHOULD be the SQL command such as `SELECT`, `INSERT`, `UPDATE`, `CREATE`, `DROP`. In the case of `EXEC`, this SHOULD be the stored procedure name that is being executed. -**[5]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[5]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[6]:** SQL defines [SQLSTATE](https://wikipedia.org/wiki/SQLSTATE) as a database return code which is adopted by some database systems like PostgreSQL. @@ -101,16 +113,23 @@ Instrumentations SHOULD document how `error.type` is populated. **[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[11]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[12]:** if readily available or if instrumentation supports query summarization. + +**[13]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. This attribute has stability level RELEASE CANDIDATE. -**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[14]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). -**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[16]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. This attribute has stability level RELEASE CANDIDATE. @@ -120,6 +139,7 @@ and SHOULD be provided **at span creation time** (if provided at all): * [`db.collection.name`](/docs/attributes-registry/db.md) * [`db.namespace`](/docs/attributes-registry/db.md) * [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.summary`](/docs/attributes-registry/db.md) * [`db.query.text`](/docs/attributes-registry/db.md) * [`server.address`](/docs/attributes-registry/server.md) * [`server.port`](/docs/attributes-registry/server.md) diff --git a/docs/database/mongodb.md b/docs/database/mongodb.md index 9f25c62276..05939fca5b 100644 --- a/docs/database/mongodb.md +++ b/docs/database/mongodb.md @@ -23,40 +23,49 @@ The Semantic Conventions for [MongoDB](https://www.mongodb.com/) extend and over |---|---|---|---|---|---| | [`db.collection.name`](/docs/attributes-registry/db.md) | string | The MongoDB collection being accessed within the database stated in `db.namespace`. [1] | `public.users`; `customers` | `Required` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | | [`db.namespace`](/docs/attributes-registry/db.md) | string | The MongoDB database name. | `customers`; `test.users` | `Conditionally Required` If available. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.operation.name`](/docs/attributes-registry/db.md) | string | The name of the command being executed. [2] | `findAndModify`; `getMore`; `update` | `Conditionally Required` [3] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.response.status_code`](/docs/attributes-registry/db.md) | string | [MongoDB error code](https://www.mongodb.com/docs/manual/reference/error-codes/) represented as a string. [4] | `36`; `11602` | `Conditionally Required` [5] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [6] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [7] | `80`; `8080`; `443` | `Conditionally Required` [8] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [9] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [10] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.name`](/docs/attributes-registry/db.md) | string | The name of the command being executed. [2] | `findAndModify`; `getMore`; `update` | `Conditionally Required` If readily available. | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.response.status_code`](/docs/attributes-registry/db.md) | string | [MongoDB error code](https://www.mongodb.com/docs/manual/reference/error-codes/) represented as a string. [3] | `36`; `11602` | `Conditionally Required` [4] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [5] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [6] | `80`; `8080`; `443` | `Conditionally Required` [7] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [8] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [9] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. **[2]:** See [MongoDB database commands](https://www.mongodb.com/docs/manual/reference/command/). -**[3]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. - -**[4]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. +**[3]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. This attribute has stability level RELEASE CANDIDATE. -**[5]:** If the operation failed and error code is available. +**[4]:** If the operation failed and error code is available. -**[6]:** The `error.type` SHOULD match the `db.response.status_code` returned by the database or the client library, or the canonical name of exception that occurred. +**[5]:** The `error.type` SHOULD match the `db.response.status_code` returned by the database or the client library, or the canonical name of exception that occurred. When using canonical exception type name, instrumentation SHOULD do the best effort to report the most relevant type. For example, if the original exception is wrapped into a generic one, the original exception SHOULD be preferred. Instrumentations SHOULD document how `error.type` is populated. -**[7]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. +**[6]:** When observed from the client side, and when communicating through an intermediary, `server.port` SHOULD represent the server port behind any intermediaries, for example proxies, if it's available. -**[8]:** If using a port other than the default port for this DBMS and if `server.address` is set. +**[7]:** If using a port other than the default port for this DBMS and if `server.address` is set. -**[9]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. +**[8]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[10]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[9]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. The following attributes can be important for making sampling decisions and SHOULD be provided **at span creation time** (if provided at all): diff --git a/docs/database/mssql.md b/docs/database/mssql.md index ac80b7125b..46d8c09e18 100644 --- a/docs/database/mssql.md +++ b/docs/database/mssql.md @@ -28,16 +28,28 @@ The Semantic Conventions for the *Microsoft SQL Server* extend and override the | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.summary`](/docs/attributes-registry/db.md) | string | Low cardinality representation of a database query text. [11] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [13] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | `Recommended` [14] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [16] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. -**[2]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query. +**[2]:** If readily available and if a database call is performed on a single collection. The collection name MAY be parsed from the query text, in which case it SHOULD be the single collection name in the query. **[3]:** When connected to a default instance, `db.namespace` SHOULD be set to the name of the database. When connected to a [named instance](https://learn.microsoft.com/sql/connect/jdbc/building-the-connection-url#named-and-multiple-sql-server-instances), @@ -56,7 +68,7 @@ It is RECOMMENDED to capture the value as provided by the application without at **[4]:** This SHOULD be the SQL command such as `SELECT`, `INSERT`, `UPDATE`, `CREATE`, `DROP`. In the case of `EXEC`, this SHOULD be the stored procedure name that is being executed. -**[5]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[5]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[6]:** Microsoft SQL Server does not report SQLSTATE. @@ -71,16 +83,23 @@ Instrumentations SHOULD document how `error.type` is populated. **[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[11]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[12]:** if readily available or if instrumentation supports query summarization. + +**[13]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. This attribute has stability level RELEASE CANDIDATE. -**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[14]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). -**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[16]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. This attribute has stability level RELEASE CANDIDATE. @@ -90,6 +109,7 @@ and SHOULD be provided **at span creation time** (if provided at all): * [`db.collection.name`](/docs/attributes-registry/db.md) * [`db.namespace`](/docs/attributes-registry/db.md) * [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.summary`](/docs/attributes-registry/db.md) * [`db.query.text`](/docs/attributes-registry/db.md) * [`server.address`](/docs/attributes-registry/server.md) * [`server.port`](/docs/attributes-registry/server.md) diff --git a/docs/database/mysql.md b/docs/database/mysql.md index a52bd8211c..58550b31a1 100644 --- a/docs/database/mysql.md +++ b/docs/database/mysql.md @@ -28,16 +28,28 @@ The Semantic Conventions for *MySQL* extend and override the [Database Semantic | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.summary`](/docs/attributes-registry/db.md) | string | Low cardinality representation of a database query text. [11] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [13] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | `Recommended` [14] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [16] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. -**[2]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query. +**[2]:** If readily available and if a database call is performed on a single collection. The collection name MAY be parsed from the query text, in which case it SHOULD be the single collection name in the query. **[3]:** A connection's currently associated database may change during its lifetime, e.g. from executing `USE `. @@ -52,7 +64,7 @@ It is RECOMMENDED to capture the value as provided by the application without at **[4]:** This SHOULD be the SQL command such as `SELECT`, `INSERT`, `UPDATE`, `CREATE`, `DROP`. In the case of `EXEC`, this SHOULD be the stored procedure name that is being executed. -**[5]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[5]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[6]:** SQL defines [SQLSTATE](https://wikipedia.org/wiki/SQLSTATE) as a database return code which is adopted by some database systems like PostgreSQL. @@ -101,16 +113,23 @@ Instrumentations SHOULD document how `error.type` is populated. **[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[11]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[12]:** if readily available or if instrumentation supports query summarization. + +**[13]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. This attribute has stability level RELEASE CANDIDATE. -**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[14]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). -**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[16]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. This attribute has stability level RELEASE CANDIDATE. @@ -120,6 +139,7 @@ and SHOULD be provided **at span creation time** (if provided at all): * [`db.collection.name`](/docs/attributes-registry/db.md) * [`db.namespace`](/docs/attributes-registry/db.md) * [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.summary`](/docs/attributes-registry/db.md) * [`db.query.text`](/docs/attributes-registry/db.md) * [`server.address`](/docs/attributes-registry/server.md) * [`server.port`](/docs/attributes-registry/server.md) diff --git a/docs/database/postgresql.md b/docs/database/postgresql.md index cb394a781b..1481e4d302 100644 --- a/docs/database/postgresql.md +++ b/docs/database/postgresql.md @@ -28,16 +28,28 @@ The Semantic Conventions for *PostgreSQL* extend and override the [Database Sema | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.summary`](/docs/attributes-registry/db.md) | string | Low cardinality representation of a database query text. [11] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [13] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | `Recommended` [14] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [16] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. -**[2]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query. +**[2]:** If readily available and if a database call is performed on a single collection. The collection name MAY be parsed from the query text, in which case it SHOULD be the single collection name in the query. **[3]:** `db.namespace` SHOULD be set to the combination of database and schema name following the `{database}.{schema}` pattern. @@ -59,7 +71,7 @@ It is RECOMMENDED to capture the value as provided by the application without at **[4]:** This SHOULD be the SQL command such as `SELECT`, `INSERT`, `UPDATE`, `CREATE`, `DROP`. In the case of `EXEC`, this SHOULD be the stored procedure name that is being executed. -**[5]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[5]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[6]:** SQL defines [SQLSTATE](https://wikipedia.org/wiki/SQLSTATE) as a database return code which is adopted by some database systems like PostgreSQL. @@ -108,16 +120,23 @@ Instrumentations SHOULD document how `error.type` is populated. **[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[11]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[12]:** if readily available or if instrumentation supports query summarization. + +**[13]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. This attribute has stability level RELEASE CANDIDATE. -**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[14]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). -**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[16]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. This attribute has stability level RELEASE CANDIDATE. @@ -127,6 +146,7 @@ and SHOULD be provided **at span creation time** (if provided at all): * [`db.collection.name`](/docs/attributes-registry/db.md) * [`db.namespace`](/docs/attributes-registry/db.md) * [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.summary`](/docs/attributes-registry/db.md) * [`db.query.text`](/docs/attributes-registry/db.md) * [`server.address`](/docs/attributes-registry/server.md) * [`server.port`](/docs/attributes-registry/server.md) diff --git a/docs/database/redis.md b/docs/database/redis.md index f7478e9710..cc09b36948 100644 --- a/docs/database/redis.md +++ b/docs/database/redis.md @@ -41,12 +41,22 @@ then it is RECOMMENDED to fallback and use the database index provided when the Instrumentation SHOULD document if `db.namespace` reflects the database index provided when the connection was established. -**[2]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query. -For batch operations, if the individual operations are known to have the same operation name then that operation name SHOULD be used prepended by `BATCH `, otherwise `db.operation.name` SHOULD be `BATCH` or some other database system specific term if more applicable. +**[2]:** It is RECOMMENDED to capture the value as provided by the application +without attempting to do any case normalization. + +A single database query may involve multiple operations. If the operation +name is parsed from the query text, it SHOULD only be captured for queries that +contain a single operation or when the operation name describing the +whole query is available by other means. + +For batch operations, if the individual operations are known to have the same operation name +then that operation name SHOULD be used prepended by `BATCH `, +otherwise `db.operation.name` SHOULD be `BATCH` or some other database +system specific term if more applicable. + This attribute has stability level RELEASE CANDIDATE. -**[3]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[3]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[4]:** The status code returned by the database. Usually it represents an error code, but may also represent partial success, warning, or differentiate between various types of successful outcomes. Semantic conventions for individual database systems SHOULD document what `db.response.status_code` means in the context of that system. @@ -68,6 +78,7 @@ This attribute has stability level RELEASE CANDIDATE. **[10]:** For **Redis**, the value provided for `db.query.text` SHOULD correspond to the syntax of the Redis CLI. If, for example, the [`HMSET` command](https://redis.io/commands/hmset) is invoked, `"HMSET myhash field1 'Hello' field2 'World'"` would be a suitable value for `db.query.text`. **[11]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. +See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). **[12]:** If a database operation involved multiple network calls (for example retries), the address of the last contacted node SHOULD be used. diff --git a/docs/database/sql.md b/docs/database/sql.md index 1a95d04e92..bb759388fb 100644 --- a/docs/database/sql.md +++ b/docs/database/sql.md @@ -52,16 +52,28 @@ Instrumentations applied to generic SQL drivers SHOULD adhere to SQL semantic co | [`error.type`](/docs/attributes-registry/error.md) | string | Describes a class of error the operation ended with. [7] | `timeout`; `java.net.UnknownHostException`; `server_certificate_invalid`; `500` | `Conditionally Required` If and only if the operation failed. | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`server.port`](/docs/attributes-registry/server.md) | int | Server port number. [8] | `80`; `8080`; `443` | `Conditionally Required` [9] | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | | [`db.operation.batch.size`](/docs/attributes-registry/db.md) | int | The number of queries included in a batch operation. [10] | `2`; `3`; `4` | `Recommended` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [11] | `SELECT * FROM wuser_table where username = ?`; `SET mykey "WuValue"` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | -| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [13] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | -| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [14] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.summary`](/docs/attributes-registry/db.md) | string | Low cardinality representation of a database query text. [11] | `SELECT wuser_table`; `INSERT shipping_details SELECT orders`; `get user by id` | `Recommended` [12] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`db.query.text`](/docs/attributes-registry/db.md) | string | The database query being executed. [13] | `SELECT * FROM wuser_table where username = ?`; `SET mykey ?` | `Recommended` [14] | ![Experimental](https://img.shields.io/badge/-experimental-blue) | +| [`server.address`](/docs/attributes-registry/server.md) | string | Name of the database host. [15] | `example.com`; `10.1.2.80`; `/tmp/my.sock` | `Recommended` | ![Stable](https://img.shields.io/badge/-stable-lightgreen) | +| [`db.query.parameter.`](/docs/attributes-registry/db.md) | string | A query parameter used in `db.query.text`, with `` being the parameter name, and the attribute value being a string representation of the parameter value. [16] | `someval`; `55` | `Opt-In` | ![Experimental](https://img.shields.io/badge/-experimental-blue) | **[1]:** It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. -If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query and it SHOULD match the value provided in the query text including any schema and database name prefix. -For batch operations, if the individual operations are known to have the same collection name then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + +A single database query may involve multiple collections. + +If the collection name is parsed from the query text, it SHOULD only be captured for queries that +contain a single collection and it SHOULD match the value provided in +the query text including any schema and database name prefix. + +For batch operations, if the individual operations are known to have the same collection name +then that collection name SHOULD be used. + +If the operation or query involves multiple collections, `db.collection.name` +SHOULD NOT be captured. + This attribute has stability level RELEASE CANDIDATE. -**[2]:** If readily available. The collection name MAY be parsed from the query text, in which case it SHOULD be the first collection name found in the query. +**[2]:** If readily available and if a database call is performed on a single collection. The collection name MAY be parsed from the query text, in which case it SHOULD be the single collection name in the query. **[3]:** If a database system has multiple namespace components (e.g. schema name and database name), they SHOULD be concatenated (potentially using database system specific conventions) from most general to most @@ -84,7 +96,7 @@ It is RECOMMENDED to capture the value as provided by the application without at **[4]:** This SHOULD be the SQL command such as `SELECT`, `INSERT`, `UPDATE`, `CREATE`, `DROP`. In the case of `EXEC`, this SHOULD be the stored procedure name that is being executed. -**[5]:** If readily available. The operation name MAY be parsed from the query text, in which case it SHOULD be the first operation name found in the query. +**[5]:** If readily available and if there is a single operation name that describes the database call. The operation name MAY be parsed from the query text, in which case it SHOULD be the single operation name found in the query. **[6]:** SQL defines [SQLSTATE](https://wikipedia.org/wiki/SQLSTATE) as a database return code which is adopted by some database systems like PostgreSQL. @@ -133,16 +145,23 @@ Instrumentations SHOULD document how `error.type` is populated. **[10]:** Operations are only considered batches when they contain two or more operations, and so `db.operation.batch.size` SHOULD never be `1`. This attribute has stability level RELEASE CANDIDATE. -**[11]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[11]:** `db.query.summary` provides static summary of the query text. It describes a class of database queries and is useful as a grouping key, especially when analyzing telemetry for database calls involving complex queries. +Summary may be available to the instrumentation through instrumentation hooks or other means. If it is not available, instrumentations that support query parsing SHOULD generate a summary following [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) section. +This attribute has stability level RELEASE CANDIDATE. + +**[12]:** if readily available or if instrumentation supports query summarization. + +**[13]:** For sanitization see [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). For batch operations, if the individual operations are known to have the same query text then that query text SHOULD be used, otherwise all of the individual query texts SHOULD be concatenated with separator `; ` or some other database system specific separator if more applicable. Even though parameterized query text can potentially have sensitive data, by using a parameterized query the user is giving a strong signal that any sensitive data will be passed as parameter values, and the benefit to observability of capturing the static part of the query text by default outweighs the risk. This attribute has stability level RELEASE CANDIDATE. -**[12]:** SHOULD be collected by default only if there is sanitization that excludes sensitive information. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +**[14]:** Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). +Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). -**[13]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. +**[15]:** When observed from the client side, and when communicating through an intermediary, `server.address` SHOULD represent the server address behind any intermediaries, for example proxies, if it's available. -**[14]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. +**[16]:** Query parameters should only be captured when `db.query.text` is parameterized with placeholders. If a parameter has no name and instead is referenced only by index, then `` SHOULD be the 0-based index. This attribute has stability level RELEASE CANDIDATE. @@ -151,6 +170,7 @@ and SHOULD be provided **at span creation time** (if provided at all): * [`db.collection.name`](/docs/attributes-registry/db.md) * [`db.operation.name`](/docs/attributes-registry/db.md) +* [`db.query.summary`](/docs/attributes-registry/db.md) * [`db.query.text`](/docs/attributes-registry/db.md) * [`server.address`](/docs/attributes-registry/server.md) * [`server.port`](/docs/attributes-registry/server.md) diff --git a/model/database/common.yaml b/model/database/common.yaml index 72a6cdac89..ff268158b8 100644 --- a/model/database/common.yaml +++ b/model/database/common.yaml @@ -7,8 +7,9 @@ groups: - ref: db.operation.name requirement_level: conditionally_required: > - If readily available. The operation name MAY be parsed from the query text, - in which case it SHOULD be the first operation name found in the query. + If readily available and if there is a single operation name that describes the + database call. The operation name MAY be parsed from the query text, + in which case it SHOULD be the single operation name found in the query. - ref: server.address brief: > Name of the database host. @@ -30,3 +31,22 @@ groups: generic one, the original exception SHOULD be preferred. Instrumentations SHOULD document how `error.type` is populated. + + - id: attributes.db.client.with_query_and_collection + extends: attributes.db.client.minimal + type: attribute_group + stability: experimental + brief: This group defines the attributes describing database operations that + have query and collection name. + attributes: + - ref: db.query.text + - ref: db.query.summary + requirement_level: + recommended: if readily available or if instrumentation supports query summarization. + - ref: db.collection.name + sampling_relevant: true + requirement_level: + conditionally_required: > + If readily available and if a database call is performed on a single collection. + The collection name MAY be parsed from the query text, + in which case it SHOULD be the single collection name in the query. diff --git a/model/database/metrics.yaml b/model/database/metrics.yaml index bc4985e323..d8e8f3fbc0 100644 --- a/model/database/metrics.yaml +++ b/model/database/metrics.yaml @@ -8,13 +8,8 @@ groups: instrument: histogram unit: "s" stability: experimental # RELEASE CANDIDATE - extends: attributes.db.client.minimal + extends: attributes.db.client.with_query_and_collection attributes: - - ref: db.collection.name - requirement_level: - conditionally_required: > - If readily available. The collection name MAY be parsed from the query text, - in which case it SHOULD be the first collection name in the query. - ref: db.system # TODO: Not adding to the minimal because of https://github.com/open-telemetry/build-tools/issues/192 requirement_level: required @@ -33,6 +28,9 @@ groups: - ref: db.namespace requirement_level: conditionally_required: If available. + - ref: db.query.text + requirement_level: opt_in + - id: metric.db.client.connection.count type: metric metric_name: db.client.connection.count diff --git a/model/database/registry.yaml b/model/database/registry.yaml index f55ab580f0..5924f054a6 100644 --- a/model/database/registry.yaml +++ b/model/database/registry.yaml @@ -9,14 +9,20 @@ groups: type: string stability: experimental # RELEASE CANDIDATE brief: The name of a collection (table, container) within the database. - note: > + note: | It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. - If the collection name is parsed from the query text, it SHOULD be the first collection name found in the query - and it SHOULD match the value provided in the query text including any schema and database name prefix. + A single database query may involve multiple collections. + + If the collection name is parsed from the query text, it SHOULD only be captured for queries that + contain a single collection and it SHOULD match the value provided in + the query text including any schema and database name prefix. For batch operations, if the individual operations are known to have the same collection name - then that collection name SHOULD be used, otherwise `db.collection.name` SHOULD NOT be captured. + then that collection name SHOULD be used. + + If the operation or query involves multiple collections, `db.collection.name` + SHOULD NOT be captured. This attribute has stability level RELEASE CANDIDATE. examples: ["public.users", "customers"] @@ -43,14 +49,19 @@ groups: stability: experimental # RELEASE CANDIDATE brief: > The name of the operation or command being executed. - note: > - It is RECOMMENDED to capture the value as provided by the application without attempting to do any case normalization. + note: | + It is RECOMMENDED to capture the value as provided by the application + without attempting to do any case normalization. - If the operation name is parsed from the query text, it SHOULD be the first operation name found in the query. + A single database query may involve multiple operations. If the operation + name is parsed from the query text, it SHOULD only be captured for queries that + contain a single operation or when the operation name describing the + whole query is available by other means. For batch operations, if the individual operations are known to have the same operation name then that operation name SHOULD be used prepended by `BATCH `, - otherwise `db.operation.name` SHOULD be `BATCH` or some other database system specific term if more applicable. + otherwise `db.operation.name` SHOULD be `BATCH` or some other database + system specific term if more applicable. This attribute has stability level RELEASE CANDIDATE. examples: ["findAndModify", "HMSET", "SELECT"] @@ -74,7 +85,7 @@ groups: examples: [ "SELECT * FROM wuser_table where username = ?", - 'SET mykey "WuValue"', + 'SET mykey ?', ] - id: db.query.parameter type: template[string] @@ -90,6 +101,29 @@ groups: This attribute has stability level RELEASE CANDIDATE. examples: ["someval", "55"] + - id: db.query.summary + type: string + stability: experimental # RELEASE CANDIDATE + brief: > + Low cardinality representation of a database query text. + note: > + `db.query.summary` provides static summary of the query text. It describes + a class of database queries and is useful as a grouping key, + especially when analyzing telemetry for database calls involving complex queries. + + Summary may be available to the instrumentation through + instrumentation hooks or other means. If it is not available, instrumentations + that support query parsing SHOULD generate a summary following + [Generating query summary](../../docs/database/database-spans.md#generating-a-summary-of-the-quey-text) + section. + + This attribute has stability level RELEASE CANDIDATE. + examples: + [ + 'SELECT wuser_table', + 'INSERT shipping_details SELECT orders', + 'get user by id', + ] - id: db.operation.batch.size type: int stability: experimental # RELEASE CANDIDATE diff --git a/model/database/spans.yaml b/model/database/spans.yaml index cdac8e1059..8ed7e57d62 100644 --- a/model/database/spans.yaml +++ b/model/database/spans.yaml @@ -16,45 +16,37 @@ groups: - ref: server.port sampling_relevant: true - - id: trace.db.common.query - extends: trace.db.common.minimal + - id: trace.db.common.query_and_collection + extends: attributes.db.client.with_query_and_collection type: attribute_group stability: experimental brief: This group defines the attributes used to perform database client calls. attributes: + - ref: db.operation.batch.size - ref: db.query.text sampling_relevant: true requirement_level: recommended: > Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes sensitive data, e.g. by redacting all literal values present in the query text. + See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). Parameterized query text SHOULD be collected by default (the query parameter values themselves are opt-in, see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). - - ref: db.query.parameter - requirement_level: opt_in - - - id: trace.db.common.query_and_collection - extends: trace.db.common.minimal - type: attribute_group - stability: experimental - brief: This group defines the attributes used to perform database client calls. - attributes: - - ref: db.query.text + - ref: db.query.summary sampling_relevant: true - requirement_level: - recommended: > - SHOULD be collected by default only if there is sanitization that excludes sensitive information. - See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). - ref: db.query.parameter requirement_level: opt_in - ref: db.collection.name sampling_relevant: true - requirement_level: - conditionally_required: > - If readily available. The collection name MAY be parsed from the query text, - in which case it SHOULD be the first collection name found in the query. + - ref: db.operation.name + sampling_relevant: true + - ref: db.operation.batch.size + - ref: server.address + sampling_relevant: true + - ref: server.port + sampling_relevant: true - id: trace.db.common.full type: attribute_group @@ -263,6 +255,9 @@ groups: note: > If table name includes the namespace, the `db.collection.name` SHOULD be set to the full table name. examples: ['mytable', 'ns:table'] + - ref: db.operation.name + requirement_level: + conditionally_required: If readily available. - ref: db.response.status_code brief: > Protocol-specific response code recorded as string. @@ -287,6 +282,8 @@ groups: For example, when retrieving a document, `db.operation.name` would be set to (literally, i.e., without replacing the placeholders with concrete values): [`GET /{db}/{docid}`](https://docs.couchdb.org/en/stable/api/document/common.html#get--db-docid). + requirement_level: + conditionally_required: If readily available. - ref: db.namespace sampling_relevant: true requirement_level: @@ -302,7 +299,7 @@ groups: - id: db.redis type: span stability: experimental - extends: trace.db.common.query + extends: attributes.db.client.minimal brief: > Attributes for Redis attributes: @@ -321,11 +318,23 @@ groups: Instrumentation SHOULD document if `db.namespace` reflects the database index provided when the connection was established. examples: ["0", "1", "15"] + - ref: db.operation.name + sampling_relevant: true - ref: db.query.text sampling_relevant: true brief: > The full syntax of the Redis CLI command. examples: ["HMSET myhash field1 'Hello' field2 'World'"] + requirement_level: + recommended: > + Non-parameterized query text SHOULD NOT be collected by default unless there is sanitization that excludes + sensitive data, e.g. by redacting all literal values present in the query text. + + See [Sanitization of `db.query.text`](../../docs/database/database-spans.md#sanitization-of-dbquerytext). + + Parameterized query text SHOULD be collected by default + (the query parameter values themselves are opt-in, + see [`db.query.parameter.`](../../docs/attributes-registry/db.md)). note: > For **Redis**, the value provided for `db.query.text` SHOULD correspond to the syntax of the Redis CLI. If, for example, the [`HMSET` command](https://redis.io/commands/hmset) is invoked, `"HMSET myhash field1 'Hello' field2 'World'"` would be a suitable value for `db.query.text`. @@ -342,6 +351,13 @@ groups: brief: > The Redis [simple error](https://redis.io/docs/latest/develop/reference/protocol-spec/#simple-errors) prefix. examples: ["ERR", "WRONGTYPE", "CLUSTERDOWN"] + - ref: db.operation.batch.size + - ref: db.query.parameter + requirement_level: opt_in + - ref: server.address + sampling_relevant: true + - ref: server.port + sampling_relevant: true - id: db.mongodb type: span @@ -357,6 +373,8 @@ groups: note: > See [MongoDB database commands](https://www.mongodb.com/docs/manual/reference/command/). examples: ['findAndModify', 'getMore', 'update'] + requirement_level: + conditionally_required: If readily available. - ref: db.collection.name sampling_relevant: true brief: