Incident List
The Incident List is a unified page in TrueWatch for centrally managing and displaying all incidents. It is used to view, claim, and track the progress of incident handling.
By default, the list shows only incidents from the current workspace. If the current workspace has been granted access to incident data from other workspaces, a scope selector appears to the left of the search box, allowing you to filter and view incidents by "Current Space" or by specific authorized workspaces.
Incident List View
By default, the list is sorted by incident trigger time in descending order. Each incident displays the following core information:
- Title: A brief description of the incident, generated by the monitor rule or manually supplemented by the handler.
- Level: The severity of the incident (e.g., P0, P1, P2, Unknown), specified when the monitor is triggered.
- Status: The current handling stage (Open, Working, Resolved, Closed).
- Assignee: The user or team currently claiming responsibility for the incident.
- Detection Dimension: Tags or dimension information associated with the incident (e.g., host:web-01, service:auth), helping to quickly determine the scope of impact.
- Duration: The length of time since the incident was triggered.
- Footer Information: Displays the associated on-call rule and the most recent event (e.g., "Status changed to Working 5 minutes ago").
Filtering and Searching Incidents
You can quickly find target incidents in various ways:
- Quick Filter Bar: Filter by incident tags, status, level, and on-call rules.
- Global Search: Supports keyword search for incident titles. You can also use syntax for precise matching, for example:
  - status:open: Find unhandled incidents.
  - level:p0: Find P0-level incidents.
  - assignee:Zhang San: Find incidents assigned to Zhang San.
  - tag(service):auth: Find incidents containing the tag service:auth.
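To make the search syntax concrete, the following is a minimal sketch of how a single filter expression could be interpreted against in-memory incident records. It is not TrueWatch's implementation: the function names `parse_filter` and `matches`, the dictionary layout of an incident, and the case-insensitive comparison are all assumptions made for illustration.

```python
import re

# Hypothetical pattern for one expression, e.g. "level:p0" or "tag(service):auth".
_FILTER = re.compile(
    r"^(?P<field>status|level|assignee|tag\((?P<key>[^)]+)\)):(?P<value>.+)$"
)


def parse_filter(expr):
    """Split an expression into (field, tag_key, value).

    Anything that does not look like field:value is treated as a
    plain keyword to be searched in incident titles.
    """
    expr = expr.strip()
    m = _FILTER.match(expr)
    if m is None:
        return ("title", None, expr)
    if m.group("key") is not None:
        return ("tag", m.group("key"), m.group("value").strip())
    return (m.group("field"), None, m.group("value").strip())


def matches(incident, expr):
    """Return True if the incident (a dict, assumed layout) satisfies expr."""
    field, key, value = parse_filter(expr)
    if field == "title":
        return value.lower() in incident["title"].lower()
    if field == "tag":
        return incident["tags"].get(key) == value
    # Assumption: status, level, and assignee compare case-insensitively.
    return str(incident[field]).lower() == value.lower()


# Sample records with an assumed structure.
incidents = [
    {"title": "Auth latency high", "status": "Open", "level": "P0",
     "assignee": "", "tags": {"service": "auth", "host": "web-01"}},
    {"title": "Disk usage", "status": "Working", "level": "P1",
     "assignee": "Zhang San", "tags": {"host": "db-02"}},
]

p0_titles = [i["title"] for i in incidents if matches(i, "level:p0")]
```

Because the value is taken as everything after the first colon, an assignee name containing a space, such as Zhang San, is matched as a whole.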
Incident Claiming and Handling
All incidents are automatically created by monitors when anomalies are detected. The initial status of an incident is Open, and the system will immediately notify the corresponding personnel according to the on-call rules you have configured.
If the on-call rule has auto-claim enabled and only one on-call member matches the triggered incident's on-call schedule, that member will automatically be designated as the incident handler, and the incident status will be updated to Working accordingly.
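The auto-claim condition can be sketched as a small function. This is only an illustration of the rule as described here; the function name `apply_auto_claim`, its parameters, and the dictionary fields are hypothetical and do not correspond to a TrueWatch API.

```python
def apply_auto_claim(incident, on_call_members, auto_claim_enabled):
    """Return a copy of the incident after applying the auto-claim rule.

    incident: dict with "status" and "assignee" keys (assumed layout).
    on_call_members: members matched by the on-call schedule for this incident.
    """
    updated = dict(incident)
    # Auto-claim applies only when it is enabled AND exactly one member matches.
    if auto_claim_enabled and len(on_call_members) == 1:
        updated["assignee"] = on_call_members[0]
        updated["status"] = "Working"
    return updated


new_incident = {"title": "Auth latency high", "status": "Open", "assignee": None}
claimed = apply_auto_claim(new_incident, ["alice"], auto_claim_enabled=True)
unclaimed = apply_auto_claim(new_incident, ["alice", "bob"], auto_claim_enabled=True)
```

With two matching members the incident stays Open and unassigned, so it still has to be claimed manually.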
- Claim an Incident: You can claim an incident from the incident list or its details page to become its assignee. The status then changes to Working automatically, following the principle that "claiming means handling".
- Assignee Change: For incidents in the Working status, other users can also actively claim them, and the assignee will change accordingly.
- Status Flow:
  - Open → Working: Automatically switches after a user claims it.
  - Working → Resolved/Closed: Manual operation by the handler, indicating the incident is resolved or closed.
  - Rollback: The handler can roll back an incident in the Working status to Open. After rollback, the assignee is cleared, and the incident re-enters the pending assignment process.
- Incident Reopening: If an incident that has already been marked Resolved is triggered again by the same monitor, the system automatically creates a new incident with an Open status.
- Escalation Notification: If an incident remains unclaimed or unhandled for an extended period, the system automatically notifies additional or higher-level personnel according to your configured escalation strategy, so that the alert still reaches someone who can act on it.
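The claiming and status rules above can be summarized as a small state machine. The sketch below encodes only the transitions listed in this section; the `Incident` class, its method names, and the `ALLOWED` table are hypothetical, and any transitions the product supports beyond those listed are not modeled.

```python
from dataclasses import dataclass
from typing import Optional

# Transitions described in this section (assumed to be the complete set).
ALLOWED = {
    ("Open", "Working"),      # claim
    ("Working", "Resolved"),  # manual, by the handler
    ("Working", "Closed"),    # manual, by the handler
    ("Working", "Open"),      # rollback
}


@dataclass
class Incident:
    title: str
    status: str = "Open"
    assignee: Optional[str] = None

    def _move(self, new_status):
        if (self.status, new_status) not in ALLOWED:
            raise ValueError(f"cannot move from {self.status} to {new_status}")
        self.status = new_status

    def claim(self, user):
        # Claiming an Open incident moves it to Working; claiming a
        # Working incident only changes the assignee.
        if self.status == "Open":
            self._move("Working")
        elif self.status != "Working":
            raise ValueError(f"cannot claim an incident in status {self.status}")
        self.assignee = user

    def finish(self, new_status):
        if new_status not in ("Resolved", "Closed"):
            raise ValueError("finish expects Resolved or Closed")
        self._move(new_status)

    def rollback(self):
        # Rollback returns the incident to Open and clears the assignee.
        self._move("Open")
        self.assignee = None


inc = Incident("Auth latency high")
inc.claim("alice")   # Open -> Working, assignee alice
inc.claim("bob")     # stays Working, assignee changes to bob
inc.rollback()       # Working -> Open, assignee cleared
```

Keeping the permitted transitions in one table makes it straightforward to reject moves the section does not describe, such as rolling back an incident that is already Resolved.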