Incident Details Page¶

The Incident Details page is where you view and handle a single incident. From here you can review the incident details, change the incident status, analyze related data, and collaborate with your team.

Top Overview¶

The top of the Incident Details page displays the core information of the incident, including:

Incident Severity: For example, P0 or P1. The severity is set when the monitor triggers and cannot be modified.
Status and Time: The current status (Open/Working/Resolved/Closed), the first trigger time, and the total incident duration.
Title: A brief description of the incident.
Assignee: Displays the current assignee. You can manually assign or change the assignee (member/team) here.

Status Flow¶

Status Switching: Only the current assignee can change the incident status, using the status dropdown. Status changes take effect in real time and are recorded in the operation timeline.
Progress Nodes: Key nodes for status changes are displayed in a timeline format on the right side or top of the page.
Rollback Operation: The assignee can roll back an incident in Working status to Open. After rollback, the assignee is cleared.
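
The rules above can be read as a small state machine. The following is an illustrative sketch only, not the product's implementation or API: the `Incident` class, its method names, and the timeline format are hypothetical, and the sketch models only the rules stated above (assignee-only status changes, and the Working to Open rollback that clears the assignee).

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional, Tuple


class Status(Enum):
    OPEN = "Open"
    WORKING = "Working"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


@dataclass
class Incident:
    title: str
    status: Status = Status.OPEN
    assignee: Optional[str] = None
    # Hypothetical timeline format: (actor, old status, new status).
    timeline: List[Tuple[str, Status, Status]] = field(default_factory=list)

    def change_status(self, actor: str, new_status: Status) -> None:
        # Only the current assignee may switch the status.
        if self.assignee is None or actor != self.assignee:
            raise PermissionError("only the current assignee can change the status")
        # Every change is recorded in the timeline before it is applied.
        self.timeline.append((actor, self.status, new_status))
        self.status = new_status

    def roll_back(self, actor: str) -> None:
        # Rollback is only defined for Working -> Open, and it clears the assignee.
        if self.status is not Status.WORKING:
            raise ValueError("only an incident in Working status can be rolled back")
        self.change_status(actor, Status.OPEN)
        self.assignee = None


incident = Incident(title="auth latency", assignee="alice")
incident.change_status("alice", Status.WORKING)
incident.roll_back("alice")  # status is Open again and the assignee is cleared
```

Because the rollback clears the assignee, nobody can change the status in this model again until the incident is reassigned.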

Vacation Handling Mechanism¶

If you have claimed an incident but need to take a vacation:

Go to User Settings > Status and select "On Vacation".
The system will no longer send you notifications for this incident.
We recommend first handing the incident over to another user, or making sure that the escalation policy has subsequent notifiers configured.

Incident Details¶

When you open the details page, the "Incident Details" tab is displayed by default.

Error Distribution Chart¶

Displays a bar chart of the error distribution over the last hour for this incident's detection dimension. Clicking a bar opens the Log or APM Explorer with the current filter conditions applied, so you can continue the analysis there.

Anomaly Description¶

The Anomaly Description area shows the original information about the incident in one place:

Detection Dimension: Shows the detection dimension associated with the incident, e.g., host:192.168.1.1 or service:auth, to quickly locate the affected object.
Source: Indicates the specific monitor or intelligent monitoring rule that triggered this incident, facilitating traceability of the alert source.
Event Content: Displays the original alert content, usually the specific information recorded when the monitor detected an anomaly, such as the original log text or metric value.
Detection Metric: Shows the DQL query statement of the trigger condition. You can directly refer to this statement to understand the detection logic.
Description: You can manually enter text here to provide supplementary explanations for the incident, facilitating team understanding.
Supplementary Information: Additional context added by the system or users, such as associated change records, ticket links, etc.
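
Detection dimensions such as host:192.168.1.1 follow a key:value form. As a rough illustration of how such a string could be split before using it as a filter elsewhere, here is a minimal sketch. The helper name `parse_dimension` is hypothetical and is not part of the product.

```python
from typing import Tuple


def parse_dimension(dimension: str) -> Tuple[str, str]:
    """Split a 'key:value' detection dimension into its key and value."""
    # Split on the first colon only, so values that contain ':' stay intact.
    key, sep, value = dimension.partition(":")
    if not sep or not key or not value:
        raise ValueError(f"expected 'key:value', got {dimension!r}")
    return key.strip(), value.strip()


print(parse_dimension("host:192.168.1.1"))
print(parse_dimension("service:auth"))
```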

Operation Records¶

In the "Operation Records" section, you can view the complete handling history of this incident. The system lists all key operations in reverse chronological order, including incident triggering, status changes, severity adjustments, assignee handovers, and escalation notification executions. This lets you follow the latest progress and trace the complete handling process.

Collaboration Records¶

You can collaborate with your team using the comment function at the bottom of the details page. Comments can include text, links, and uploaded attachments.

All collaboration content is aggregated in the Collaboration Records section. The system also automatically records the complete operation log, including incident triggering, status changes, assignee adjustments, and escalation notifications, forming a clear audit trail for later tracking and review.

Incident Metrics¶

You can view the per-cycle and cumulative data for this incident in the data section at the bottom of the details page. The section is divided into two blocks:

Current Incident Cycle Metrics: Displays the timeline of the current cycle (Trigger / Assigned / Resolved / Closed) and three core metrics: Current Response Time (MTTA), Current Resolution Time (MTTR), and Current Duration.
Incident Statistics Metrics: Displays the first trigger time, latest trigger time, total reopen count, average MTTA, and average MTTR, all calculated cumulatively since the first trigger.

After an incident is reopened, the current cycle metrics only show the data of the latest cycle, while the statistics metrics continue to be updated cumulatively.
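
To make the difference between the two blocks concrete, the sketch below computes both from a list of cycles. It rests on assumptions that the page does not spell out: response time is measured from trigger to assignment, resolution time from trigger to resolution, and the reopen count is the number of cycles minus one. The `Cycle` class and function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Cycle:
    triggered: datetime
    assigned: Optional[datetime] = None
    resolved: Optional[datetime] = None

    def response_time(self) -> Optional[timedelta]:
        # Assumed definition of the current response time: trigger -> assigned.
        return self.assigned - self.triggered if self.assigned else None

    def resolution_time(self) -> Optional[timedelta]:
        # Assumed definition of the current resolution time: trigger -> resolved.
        return self.resolved - self.triggered if self.resolved else None


def average(deltas: List[timedelta]) -> Optional[timedelta]:
    return sum(deltas, timedelta()) / len(deltas) if deltas else None


def statistics(cycles: List[Cycle]) -> dict:
    """Cumulative metrics over every cycle since the first trigger."""
    responses = [c.response_time() for c in cycles]
    resolutions = [c.resolution_time() for c in cycles]
    return {
        "first_trigger": cycles[0].triggered,
        "latest_trigger": cycles[-1].triggered,
        "reopen_count": len(cycles) - 1,
        "avg_mtta": average([r for r in responses if r is not None]),
        "avg_mttr": average([r for r in resolutions if r is not None]),
    }


cycles = [
    Cycle(datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 5), datetime(2024, 1, 1, 10, 35)),
    Cycle(datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 12, 15), datetime(2024, 1, 1, 13, 0)),
]
current = cycles[-1]          # current cycle metrics use only the latest cycle
stats = statistics(cycles)    # statistics metrics accumulate across all cycles
```

In this example the current cycle's response time is 15 minutes, while the cumulative average MTTA across both cycles is 10 minutes.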

Related Events¶

The "Related Events" tab of the Incident Details page lists all monitoring events related to this incident. Events are associated automatically when they share the incident's detection dimension. By default, the tab shows data from 2 hours before to 2 hours after the incident occurred.

Here you can view:

The occurrence time, source, and specific content of events.
The detection metrics and description information associated with events.
The distribution of events (intuitively presented through a time bar chart).

Clicking any event, or a time interval in the distribution chart, opens the corresponding analysis page with the current filter conditions applied. There you can view detailed logs, metric trends, or APM information to help locate the root cause of the incident or assess the scope of its impact.

Based on the incident's detection dimension (e.g., service, host, app_name), the system automatically loads the corresponding analysis tools without manual navigation:

If the detection dimension includes service: Displays related APM, Service Map, Related Logs, Analysis Dashboards, etc.
If the detection dimension includes host: Displays related built-in views for Metrics, Logs, Processes, Containers, Network, etc.
If the detection dimension includes app_name: Displays related RUM Errors, Analysis Dashboards (depending on the application type).
Other dimensions: Displays the corresponding built-in views based on the actual situation.
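
The selection rules above amount to a lookup from dimension key to a set of views. The sketch below shows that idea. The view names come from the list above, but the mapping structure and the function name `views_for` are assumptions, and the fallback for other dimensions is not modeled.

```python
from typing import Dict, List

# View names are taken from the list above; the data structure is hypothetical.
VIEWS_BY_DIMENSION: Dict[str, List[str]] = {
    "service": ["APM", "Service Map", "Related Logs", "Analysis Dashboards"],
    "host": ["Metrics", "Logs", "Processes", "Containers", "Network"],
    "app_name": ["RUM Errors", "Analysis Dashboards"],
}


def views_for(dimensions: Dict[str, str]) -> List[str]:
    """Collect the views for every recognized dimension key, without duplicates."""
    views: List[str] = []
    for key in dimensions:
        for view in VIEWS_BY_DIMENSION.get(key, []):
            if view not in views:
                views.append(view)
    return views


print(views_for({"service": "auth", "app_name": "web"}))
```

An incident whose detection dimension includes both service and app_name gets the union of the two view sets, with "Analysis Dashboards" listed once.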

All data views default to the window from 2 hours before to 2 hours after the incident occurred. You can use the distribution chart to gauge the impact quickly, then click through to the corresponding page for in-depth analysis.
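
If you need to reproduce the same default window in your own queries, it can be computed as follows. This is a minimal sketch that assumes the window is centered on a single occurrence timestamp; the function name `default_window` is hypothetical.

```python
from datetime import datetime, timedelta
from typing import Tuple

DEFAULT_PADDING = timedelta(hours=2)


def default_window(
    occurred_at: datetime, padding: timedelta = DEFAULT_PADDING
) -> Tuple[datetime, datetime]:
    """Return (start, end) covering `padding` before and after the occurrence."""
    return occurred_at - padding, occurred_at + padding


start, end = default_window(datetime(2024, 1, 1, 12, 0))
print(start.isoformat(), end.isoformat())
```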