
Unresolved Incidents


The Unresolved Incidents Explorer provides a centralized view of all incident records in the current workspace that carry an alert level, giving users the full context of each alert incident so they can understand it quickly. Because each incident is associated with its monitor and alert strategy, the Explorer also helps reduce alert fatigue.

The Explorer queries incident data, aggregates it using df_fault_id as the unique identifier, and displays the most recent result for each identifier. It works as a visualization tool covering the key data points of an incident: level, duration, alert notification, monitor, incident content, and the historical trigger trend chart with its triggered threshold baseline. Together these form a comprehensive view that helps you analyze an incident from different angles and make better-informed response decisions.
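The aggregation described above can be pictured with a minimal sketch. It assumes each event is a dictionary carrying df_fault_id and df_status, plus a hypothetical `date` timestamp field (epoch milliseconds) that is not named in this document; the function name is also illustrative only.

```python
from typing import Iterable


def latest_per_fault(events: Iterable[dict]) -> list[dict]:
    """Keep only the most recent event for each df_fault_id."""
    latest: dict[str, dict] = {}
    for event in events:
        fault_id = event["df_fault_id"]
        # "date" is an assumed timestamp field, not taken from the product docs.
        if fault_id not in latest or event["date"] > latest[fault_id]["date"]:
            latest[fault_id] = event
    # Newest first, as an incident list would typically be ordered.
    return sorted(latest.values(), key=lambda e: e["date"], reverse=True)


sample_events = [
    {"df_fault_id": "a", "df_status": "warning", "date": 1000},
    {"df_fault_id": "a", "df_status": "critical", "date": 2000},
    {"df_fault_id": "b", "df_status": "error", "date": 1500},
]
unresolved = latest_per_fault(sample_events)
```

Here fault `a` appears once in the result, with the status of its latest event.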

Incident Card

Incident Level

Based on the trigger condition configuration of the monitor, the following status statistics are generated: Unresolved (df_status != ok), Critical (critical), Error (error), Warning (warning), and No Data (nodata).

In the Unresolved Incidents Explorer, the level of each incident is defined as the level at which the detection object last triggered the incident.

For more details, refer to Incident Level Description.
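As an illustration of the status statistics, the sketch below counts incidents per level from a list of latest events. The df_status values come from the text above; the function name and the input shape are assumptions made for this example.

```python
from collections import Counter

# Level values named in this section.
LEVELS = ("critical", "error", "warning", "nodata")


def status_statistics(latest_events: list[dict]) -> dict[str, int]:
    """Count incidents per level, plus everything whose df_status is not ok."""
    counts = Counter(e["df_status"] for e in latest_events)
    stats = {level: counts.get(level, 0) for level in LEVELS}
    # "Unresolved" is defined above as df_status != ok.
    stats["unresolved"] = sum(1 for e in latest_events if e["df_status"] != "ok")
    return stats


stats = status_statistics([
    {"df_status": "critical"},
    {"df_status": "warning"},
    {"df_status": "warning"},
    {"df_status": "ok"},
])
```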

Incident Title

The incident title displayed in the Unresolved Incidents Explorer is directly sourced from the title set during the monitor rule configuration. It represents the title used by the detection object when it last triggered the incident.

Duration

The time from when the detection object first triggered an abnormal incident to the end time of the current time widget, for example 5 minutes (08/20 17:53:00 ~ 17:57:38).
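The duration calculation can be sketched as a simple subtraction. The year 2024 is an arbitrary placeholder (the example above gives only month and day), and rounding to the nearest minute is an assumption about how the displayed "5 minutes" is derived, not documented behavior.

```python
from datetime import datetime, timedelta


def incident_duration(first_triggered: datetime, time_widget_end: datetime) -> timedelta:
    """Time from the first abnormal trigger to the end of the time widget."""
    # Clamp to zero in case the widget ends before the first trigger.
    return max(time_widget_end - first_triggered, timedelta(0))


start = datetime(2024, 8, 20, 17, 53, 0)   # year is a placeholder
end = datetime(2024, 8, 20, 17, 57, 38)
duration = incident_duration(start, end)

# Assumed display rule: round to the nearest whole minute.
rounded_minutes = round(duration.total_seconds() / 60)
```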

Alert Notification

The alert notification status of the last incident triggered by the current detection object. It mainly includes the following three statuses:

  • Mute: The current incident is covered by a mute rule, and no alert notification was sent externally;
  • Identifiers of the notification targets that were actually notified: for example DingTalk bots, WeCom bots, and Lark bots;
  • -: No external alert notification was triggered.
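The three statuses can be read as a small decision rule. The sketch below is only one way to express it; the `muted` flag and the `notified_targets` list are hypothetical inputs, not fields documented here.

```python
def notification_label(muted: bool, notified_targets: list[str]) -> str:
    """Return the text shown for the alert notification status."""
    if muted:
        # A mute rule applies, so nothing was sent externally.
        return "Mute"
    if notified_targets:
        # Show the identifiers of the targets that were notified.
        return ", ".join(notified_targets)
    # No external notification was triggered.
    return "-"


muted_label = notification_label(True, [])
sent_label = notification_label(False, ["DingTalk bot", "Lark bot"])
none_label = notification_label(False, [])
```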

Monitor Detection Type

Refers to the monitor type.

Detection Object

If the detection metric query in the monitor rule uses by grouping, the incident card displays the resulting filter conditions, such as source:kodo-servicemap.

Incident Content

The content of the last incident triggered by the current detection object, taken from the content preset in the monitor rule configuration.

Historical Trigger Trend Chart

This trend is displayed using the Window function, showing the historical trend of the detection result values for the last 60 detections.

The chart plots the historical trend of anomalies based on the detection result values of the current unresolved incident, with the trigger threshold configured in the monitor detection rule drawn as a reference line. The detection result of the last incident triggered by the current detection object is marked, and a vertical line in the chart shows the exact time the incident was triggered. The detection interval for that result is also displayed, so you can assess how the incident developed and what impact it had.
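To make the trend data concrete, the sketch below keeps the last 60 detection values and flags those that reach the threshold. Treating a breach as "value >= threshold" is an assumption for illustration; the actual comparison depends on the trigger condition configured in the monitor.

```python
def trend_window(values: list[float], threshold: float, size: int = 60) -> list[tuple[float, bool]]:
    """Return the last `size` detection values, each paired with a breach flag."""
    window = values[-size:]
    # Assumed rule: a value at or above the threshold counts as a breach.
    return [(value, value >= threshold) for value in window]


detections = list(range(100))          # made-up detection result values
trend = trend_window(detections, threshold=90)
breaches = sum(1 for _, breached in trend if breached)
```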

Management Card

Display Items

The Unresolved Incidents list supports the following display styles:

  • Standard: Displays the incident title, detection dimensions, and incident content.
  • Expanded: In addition to the standard information, it also displays the historical trend of the detection result values for unresolved incidents.
  • List: Displays incident data in a list format.

View Only Associated Issue Incidents

After checking this option, the current incident list shows only the incidents that are associated with Issues.

For an individual incident that has an associated Issue, click the icon on the right side of the incident data to jump directly to it.

Issue & Create Issue

Click Create Issue to create an Issue for an unresolved incident and notify the relevant members to handle it promptly. The Create Issue entry is available in List mode, in Standard/Expanded mode, and in the incident details.

Mute Incident

In large-scale monitoring scenarios, handling many similar alerts by hand is tedious, slow, and prone to omissions. To avoid this, you can mute the rule directly on the current page.

  1. Hover over a single incident and click Mute on the right side;
  2. Select the Mute Time Type;
  3. Confirm.

Mute Time Type

Supports customizing the start and end times for muting, or quickly setting it to 1 hour, 6 hours, 12 hours, 1 day, or 1 week.


  1. Select the start time and duration for muting;
  2. Select the repeat cycle for muting, starting from a given moment;
  3. Select the expiration time for muting. You can choose to repeat forever based on the above time or repeat until a specific moment.
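The quick options can be sketched as a mapping from preset to duration. The preset labels follow the text above; the function name and the idea of returning a (start, end) pair are assumptions for this example.

```python
from datetime import datetime, timedelta

# Quick presets listed in this section.
QUICK_MUTE = {
    "1 hour": timedelta(hours=1),
    "6 hours": timedelta(hours=6),
    "12 hours": timedelta(hours=12),
    "1 day": timedelta(days=1),
    "1 week": timedelta(weeks=1),
}


def mute_window(start: datetime, preset: str) -> tuple[datetime, datetime]:
    """Return the (start, end) of a mute period for a quick preset."""
    return start, start + QUICK_MUTE[preset]


mute_start = datetime(2024, 1, 1, 9, 0)      # arbitrary example time
window = mute_window(mute_start, "6 hours")
```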

Recover Incident

When the incident status is normal (df_sub_status = ok), it is considered a recovered incident.

  • To recover a single rule, click the button on the right side of the rule, go to the Monitor settings, or recover it manually.

  • If you click "Recover All", all abnormal incidents in the current list will be recovered, and you can choose whether to associate Issues.

There are four types of recovered incidents:

Name | df_status | Description
Recover | ok | Previously detected "Critical", "Error", and "Warning" incidents, if not triggered again within N detections, are considered recovered.
No Data Recover | ok | Data stops being reported and then resumes, judged as recovered.
No Data Considered as Recover | ok | Detection data is interrupted, considered as normal status.
Manual Recover | ok | User manually clicks to recover, supports single/batch recovery.
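The first recovery type (no new trigger within N detections) can be illustrated with a small check. This is a sketch under assumptions: the input is a hypothetical list of per-detection statuses, oldest first, and "not triggered" is modeled as the status ok.

```python
def is_recovered(recent_statuses: list[str], n: int) -> bool:
    """True if the last n detections did not trigger an abnormal status."""
    if len(recent_statuses) < n:
        # Not enough detections yet to decide.
        return False
    return all(status == "ok" for status in recent_statuses[-n:])


recovered = is_recovered(["critical", "ok", "ok", "ok"], n=3)
still_abnormal = is_recovered(["ok", "warning", "ok"], n=2)
too_few = is_recovered(["ok"], n=2)
```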
