Incident List¶
The Incident List is the central page in TrueWatch for managing all incidents. Use it to view incidents, claim them, and track handling progress.
Incident List View¶
By default, the list is sorted by incident trigger time, newest first. Each incident displays the following core information:
- Title: A brief description of the incident, generated by the monitor rule or manually supplemented by the handler.
- Level: The severity of the incident (e.g., P0, P1, P2, Unknown), specified when the monitor is triggered.
- Status: The current handling stage (Open, Working, Resolved, Closed).
- Assignee: The user or team currently claiming this incident.
- Detection Dimensions: Tags or dimension information associated with the incident (e.g., host:web-01, service:auth), helping to quickly determine the impact scope.
- Duration: The length of time since the incident was triggered.
- Footer Information: Displays the associated on-call rule and recent events (e.g., "Status changed to Working 5 minutes ago").
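To make the fields and the default ordering above concrete, here is a minimal Python sketch. The `Incident` class, its field names, and `default_order` are hypothetical illustrations that mirror the list columns; they are not a TrueWatch schema or API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    # Hypothetical record; field names mirror the list columns, not a real schema.
    title: str
    level: str                      # e.g. "P0", "P1", "P2", "Unknown"
    status: str                     # "Open", "Working", "Resolved", or "Closed"
    triggered_at: datetime
    assignee: Optional[str] = None
    dimensions: dict = field(default_factory=dict)  # e.g. {"host": "web-01"}

    def duration(self, now: datetime) -> timedelta:
        """Time elapsed since the incident was triggered."""
        return now - self.triggered_at


def default_order(incidents: list) -> list:
    """Return incidents sorted by trigger time, newest first."""
    return sorted(incidents, key=lambda inc: inc.triggered_at, reverse=True)
```

Under these assumptions, the Duration column is simply the current time minus the trigger time, and the default view is a descending sort on that same trigger time.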
Filtering and Searching Incidents¶
You can quickly find target incidents in various ways:
- Quick Filter Bar: Filter by incident tags, status, level, or on-call rules.
- Global Search: Supports keyword search for incident titles. You can also use syntax for precise matching, for example:
  - status:open: Find unhandled incidents.
  - level:p0: Find P0-level incidents.
  - assignee:张三: Find incidents assigned to Zhang San.
  - tag(service):auth: Find incidents containing the tag service:auth.
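The search examples above follow two shapes, `field:value` and `tag(key):value`. The sketch below shows one way such a query string could be split into structured filters. The grammar is inferred from those four examples only, and combining several terms with spaces is an assumption; `parse_query` is a hypothetical helper, not part of TrueWatch.

```python
import re

# Matches `field:value` or `tag(key):value`. This pattern is inferred from the
# documented examples and is an assumption, not the official search grammar.
_TERM = re.compile(r"^(?:tag\((?P<tag>[^)]+)\)|(?P<field>\w+)):(?P<value>.+)$")


def parse_query(query: str) -> dict:
    """Split a search string into field filters, tag filters, and keywords."""
    parsed = {"fields": {}, "tags": {}, "keywords": []}
    for token in query.split():
        match = _TERM.match(token)
        if match is None:
            parsed["keywords"].append(token)        # plain title keyword
        elif match["tag"] is not None:
            parsed["tags"][match["tag"]] = match["value"]
        else:
            parsed["fields"][match["field"]] = match["value"]
    return parsed
```

For instance, `parse_query("status:open tag(service):auth timeout")` would yield one field filter, one tag filter, and the title keyword `timeout`.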
Incident Claiming and Handling¶
All incidents are automatically created by monitors when anomalies are detected. The initial status of an incident is Open, and the system will immediately notify the corresponding personnel based on your configured on-call rules.
- Claim an Incident: You can actively claim an incident in the incident list or details page, thereby becoming the assignee. The status automatically changes to Working, following the principle of "claiming means handling".
- Assignee Change: For incidents in the Working status, other users can also actively claim them, and the assignee will change accordingly.
- Status Flow:
  - Open → Working: Automatically switches after a user claims it.
  - Working → Resolved/Closed: Manual operation by the handler, indicating the incident is resolved or closed.
  - Rollback: The handler can roll back an incident in Working status to Open. After rollback, the assignee is cleared, and the incident re-enters the pending assignment process.
- Incident Reopening: If an incident in Resolved status is triggered again by the same monitor, the system automatically creates a new incident in Open status.
- Escalation Notification: If an incident goes unclaimed or unhandled for an extended period, the system automatically notifies additional or higher-level personnel according to your escalation strategy, so that alerts are always delivered.
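The claim, reassignment, rollback, and resolve/close rules above can be read as a small state machine. The following Python sketch models only the transitions this page describes. `IncidentState`, `claim`, `roll_back`, and `finish` are hypothetical names, and rejecting every other transition (for example, claiming a Resolved incident) is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncidentState:
    # Hypothetical in-memory model; not a TrueWatch API object.
    status: str = "Open"            # every incident starts as Open
    assignee: Optional[str] = None


def claim(incident: IncidentState, user: str) -> None:
    """Open -> Working, or reassign an incident that is already Working."""
    if incident.status not in ("Open", "Working"):
        # Assumption: Resolved/Closed incidents cannot be claimed.
        raise ValueError(f"cannot claim an incident in {incident.status} status")
    incident.status = "Working"
    incident.assignee = user


def roll_back(incident: IncidentState) -> None:
    """Working -> Open; the assignee is cleared."""
    if incident.status != "Working":
        raise ValueError("only incidents in Working status can be rolled back")
    incident.status = "Open"
    incident.assignee = None


def finish(incident: IncidentState, new_status: str) -> None:
    """Working -> Resolved or Closed, set manually by the handler."""
    if incident.status != "Working" or new_status not in ("Resolved", "Closed"):
        raise ValueError(f"invalid transition {incident.status} -> {new_status}")
    incident.status = new_status
```

Reopening is deliberately absent from the sketch: per the description above, a repeat trigger creates a new incident in Open status instead of changing the resolved one.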