[Data set quality] Add all data set types to quality page (#4232)
Co-authored-by: Colleen McGinnis <colleen.j.mcginnis@gmail.com>
(cherry picked from commit 442705e)

# Conflicts:
#	docs/en/serverless/monitor-datasets.mdx
#	docs/en/serverless/serverless-observability.docnav.json
mdbirnstiehl authored and mergify[bot] committed Oct 3, 2024
1 parent 587a626 commit 7a93252
Showing 6 changed files with 737 additions and 12 deletions.
Binary file not shown.
5 changes: 3 additions & 2 deletions docs/en/observability/index.asciidoc
@@ -194,8 +194,6 @@ include::inspect-log-anomalies.asciidoc[leveloffset=+3]

include::configure-logs-sources.asciidoc[leveloffset=+3]

include::logs-monitor-datasets.asciidoc[leveloffset=+2]

include::logs-add-service-name.asciidoc[leveloffset=+2]

include::logs-index-template.asciidoc[leveloffset=+2]
@@ -235,6 +233,9 @@ include::slo-privileges.asciidoc[leveloffset=+3]

include::slo-create.asciidoc[leveloffset=+3]

//Data Set Quality
include::logs-monitor-datasets.asciidoc[leveloffset=+1]

//Observability AI Assistant
include::observability-ai-assistant.asciidoc[leveloffset=+1]

22 changes: 12 additions & 10 deletions docs/en/observability/logs-monitor-datasets.asciidoc
@@ -1,14 +1,16 @@
[[monitor-datasets]]
= Monitor log data set quality
= Data set quality

beta:[]

The **Data Set Quality** page provides an overview of your log data sets.
Use this information to get an idea of your overall log data set quality and find data sets that contain incorrectly parsed documents.
The **Data Set Quality** page provides an overview of your log, metric, trace, and synthetic data sets.
Use this information to get an idea of your overall data set quality and find data sets that contain incorrectly parsed documents.

Access the Data Set Quality page from the main {kib} menu at **Stack Management** → **Data Set Quality**.
By default, the page only shows log data sets. To see other data set types, select them from the **Type** menu.

[role="screenshot"]
image::images/logs-dataset-overview.png[Screen capture of the data set overview]
image::images/data-set-quality-overview.png[Screen capture of the data set overview]

.Requirements
[NOTE]
@@ -36,7 +38,7 @@ Opening the details of a specific data set shows the degraded documents history,

The Data Set Quality page has a couple of different ways to help you find ignored fields and investigate issues.
From the data set table, you can open the data set's details page, and view commonly ignored fields and information about those fields.
You can also open a data set in Logs Explorer to find ignored fields in individual logs.
Open a logs data set in Logs Explorer or other data set types in Discover to find ignored fields in individual documents.

[discrete]
[[find-ignored-fields-in-data-sets]]
@@ -51,19 +53,19 @@ The **Quality issues** section shows fields that were ignored during ingest, the

[discrete]
[[find-ignored-fields-in-individual-logs]]
=== Find ignored fields in individual logs
=== Find ignored fields in individual documents

To use Logs Explorer to find ignored fields in individual logs:
To use Logs Explorer or Discover to find ignored fields in individual documents:

. Find data sets with degraded documents using the **Degraded Docs** column of the data sets table.
. Click the percentage in the **Degraded Docs** column to open the data set in Logs Explorer.
. Click the percentage in the **Degraded Docs** column to open the data set in Logs Explorer or Discover.

The **Documents** table in Logs Explorer is automatically filtered to show documents that were not parsed correctly.
The **Documents** table in Logs Explorer or Discover is automatically filtered to show documents that were not parsed correctly.
Under the **actions** column, you'll find the degraded document icon.

Now that you know which documents contain ignored fields, examine them more closely to find the origin of the issue:

. Under the **actions** column, click image:images/expand-icon.png[expand icon] to open the log details.
. Under the **actions** column, click image:images/expand-icon.png[expand icon] to open the document details.
. Select the **JSON** tab.
. Scroll towards the end of the JSON to find the `ignored_field_values`.

63 changes: 63 additions & 0 deletions docs/en/serverless/monitor-datasets.mdx
@@ -0,0 +1,63 @@
---
id: serverlessObservabilityMonitorDatasets
slug: /serverless/observability/monitor-datasets
title: Data set quality monitoring
description: Monitor data sets to find degraded documents.
tags: [ 'serverless', 'observability', 'how-to' ]
---

<p><DocBadge template="beta" /></p>

The **Data Set Quality** page provides an overview of your log, metric, trace, and synthetic data sets.
Use this information to get an idea of your overall data set quality and find data sets that contain incorrectly parsed documents.

Access the Data Set Quality page from the main menu at **Project settings** → **Management** → **Data Set Quality**.
By default, the page only shows log data sets. To see other data set types, select them from the **Type** menu.

<DocCallOut title="Requirements">
Users with the `viewer` role can view the Data Sets Quality summary. To view the Active Data Sets and Estimated Data summaries, users need the `monitor` [index privilege](((ref))/security-privileges.html#privileges-list-indices) for the `logs-*-*` index.
</DocCallOut>

The quality of your data sets is based on the percentage of degraded documents in each data set.
A degraded document in a data set contains the [`_ignored`](((ref))/mapping-ignored-field.html) property because one or more of its fields were ignored during indexing.
Fields are ignored for a variety of reasons.
For example, when the [`ignore_malformed`](((ref))/mapping-ignored-field.html) parameter is set to true, if a document field contains the wrong data type, the malformed field is ignored and the rest of the document is indexed.
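
As a rough illustration of how degraded documents could be located outside the UI, the following sketch builds a search request body that uses an `exists` query on the `_ignored` field. The helper name `build_degraded_docs_query` is hypothetical and not part of any Elastic client. The sketch only constructs the body, so you would still need to send it to the `_search` API for a pattern such as `logs-*-*` with a client of your choice.

```python
import json


def build_degraded_docs_query(size: int = 10) -> dict:
    """Return a search body that matches documents carrying the `_ignored` field.

    Hypothetical helper for illustration only; it performs no network calls.
    """
    return {
        "size": size,
        # Documents with one or more ignored fields have `_ignored` populated.
        "query": {"exists": {"field": "_ignored"}},
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a search request.
    print(json.dumps(build_degraded_docs_query(), indent=2))
```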

From the data set table, you'll find information for each data set such as its namespace, when the data set was last active, and the percentage of degraded docs.
The percentage of degraded documents determines the data set's quality according to the following scale:

* Good (<DocImage flatImage alt="Good icon" url="images/green-dot-icon.png" />): 0% of the documents in the data set are degraded.
* Degraded (<DocImage flatImage alt="Degraded icon" url="images/yellow-dot-icon.png" />): Greater than 0% and up to 3% of the documents in the data set are degraded.
* Poor (<DocImage flatImage alt="Poor icon" url="images/red-dot-icon.png" />): Greater than 3% of the documents in the data set are degraded.
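
The scale above can be expressed as a small function. This is a minimal sketch of the thresholds as described on this page, not the implementation used by the Data Set Quality page. The name `classify_quality` and its parameters are assumptions made for illustration.

```python
def classify_quality(degraded_docs: int, total_docs: int) -> str:
    """Map a degraded-document count to the Good / Degraded / Poor scale.

    Hypothetical helper: thresholds follow the scale described in the text
    (0% is Good, above 0% and up to 3% is Degraded, above 3% is Poor).
    """
    if total_docs <= 0:
        raise ValueError("total_docs must be positive")
    if not 0 <= degraded_docs <= total_docs:
        raise ValueError("degraded_docs must be between 0 and total_docs")

    if degraded_docs == 0:
        return "Good"
    # Integer comparison avoids floating-point rounding at the 3% boundary.
    if degraded_docs * 100 <= 3 * total_docs:
        return "Degraded"
    return "Poor"


if __name__ == "__main__":
    for degraded in (0, 3, 4):
        print(degraded, classify_quality(degraded, 100))
```

Under these assumptions, a data set with exactly 3% degraded documents is still classified as Degraded, and anything above that is Poor.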

Opening the details of a specific data set shows the degraded documents history, a summary for the data set, and other details that can help you determine if you need to investigate any issues.

## Investigate issues
The Data Set Quality page has a couple of different ways to help you find ignored fields and investigate issues.
From the data set table, you can open the data set's details page, and view commonly ignored fields and information about those fields.
Open a logs data set in Logs Explorer or other data set types in Discover to find ignored fields in individual documents.

### Find ignored fields in data sets
To open the details page for a data set with poor or degraded quality and view ignored fields:

1. From the data set table, click <DocIcon type="expand" title="expand icon" /> next to a data set with poor or degraded quality.
1. From the details, scroll down to **Quality issues**.

The **Quality issues** section shows fields that have been ignored, the number of documents that contain ignored fields, and the timestamp of last occurrence of the field being ignored.

### Find ignored fields in individual logs
To use Logs Explorer or Discover to find ignored fields in individual logs:

1. Find data sets with degraded documents using the **Degraded Docs** column of the data sets table.
1. Click the percentage in the **Degraded Docs** column to open the data set in Logs Explorer or Discover.

The **Documents** table in Logs Explorer or Discover is automatically filtered to show documents that were not parsed correctly.
Under the **actions** column, you'll find the degraded document icon (<DocIcon type="indexClose" title="degraded document icon" />).

Now that you know which documents contain ignored fields, examine them more closely to find the origin of the issue:

1. Under the **actions** column, click <DocIcon type="expand" title="expand icon" /> to open the document details.
1. Select the **JSON** tab.
1. Scroll towards the end of the JSON to find the `ignored_field_values`.

Here, you'll find all of the `_ignored` fields in the document and their values, which should provide some clues as to why the fields were ignored.
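
If you copy a document's JSON out of the **JSON** tab, a short script can list the ignored fields for you. The sketch below assumes the copied JSON has a top-level `ignored_field_values` object that maps field names to lists of values. The sample document and the helper name `extract_ignored_fields` are invented for illustration.

```python
import json


def extract_ignored_fields(hit: dict) -> dict:
    """Return a mapping of ignored field name to its list of ignored values.

    Assumes `ignored_field_values` sits at the top level of the copied JSON.
    Returns an empty dict when the key is absent.
    """
    values = hit.get("ignored_field_values", {})
    return {field: list(vals) for field, vals in values.items()}


if __name__ == "__main__":
    # Hypothetical document, shaped like the assumption described above.
    sample = json.loads(
        """
        {
          "_index": "example-index",
          "_ignored": ["example.field"],
          "ignored_field_values": {"example.field": ["not-a-number"]}
        }
        """
    )
    for field, vals in extract_ignored_fields(sample).items():
        print(f"{field}: {vals}")
```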