Frequently Asked Questions
Event Viewing and Time Range
The "Unrecovered Events" tab only shows events from the last 48 hours by default. How can I view older unrecovered events?
You can freely adjust the time range using the Time Widget in the upper right corner of the page.
What is the difference between the "All Events" and "Unrecovered Events" tabs? Why can both be used to find unrecovered events?
| Comparison Item | Unrecovered Events | All Events |
|---|---|---|
| Default Time Range | Last 48 hours | Usually longer (configurable) |
| Default Filter Condition | df_status != ok | None |
| Displayed Content | Only events in abnormal status | All statuses (ok/abnormal/recovered/no data) |
| Purpose | Quickly focus on current issues | Complete historical query and analysis |
Key Difference: "Unrecovered Events" is a shortcut that automatically filters for you; "All Events" requires manual addition of filter conditions but offers higher flexibility.
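The default behavior of the "Unrecovered Events" tab can be approximated with a small filter. This is only an illustrative sketch: the event record shape and the `time` field name are assumptions made for the example, not the platform's actual schema.

```python
from datetime import datetime, timedelta, timezone

def unrecovered_view(events, now, window=timedelta(hours=48)):
    """Keep events inside the time window whose df_status is not "ok"."""
    start = now - window
    return [
        e for e in events
        if e["df_status"] != "ok" and start <= e["time"] <= now
    ]

now = datetime(2024, 1, 3, 12, 0, tzinfo=timezone.utc)
events = [
    # "time" is a hypothetical field name used only in this sketch.
    {"df_title": "CPU high", "df_status": "critical", "time": now - timedelta(hours=2)},
    {"df_title": "CPU recovered", "df_status": "ok", "time": now - timedelta(hours=1)},
    {"df_title": "Disk full", "df_status": "warning", "time": now - timedelta(hours=72)},
]

print([e["df_title"] for e in unrecovered_view(events, now)])
# ['CPU high']
```

Widening `window` (the equivalent of adjusting the Time Widget) brings the older "Disk full" event back into view, while the status filter stays unchanged.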
After adjusting the Time Widget, will my previously set filter conditions be reset?
No, they will not be reset. Adjusting the time range is independent. The filter conditions you set (event level, alert strategy, monitor name, etc.) will remain unchanged. The system will apply these filter conditions within the new time range.
Event Content and Variables
Variables in the event content (like {{df_dimension_tags}}, {{Result}}) appear empty or in the wrong format. How can I debug this?
Variable Source: Variables in the event content are defined in the Event Notification configuration area of the monitor. The system replaces them based on actual monitoring data.
Common Reasons for Empty Values:
- The DQL query did not return the corresponding field.
- The variable name is misspelled (names are case-sensitive).
- Detection dimension tags are empty.
Debugging Methods:
- Check the Detection Metrics area in the event details to confirm the original query results.
- Verify the DQL query statement in the monitor's configuration.
Common Variable Reference:
- {{Result}} - Detection value
- {{df_dimension_tags}} - Detection dimension tags (JSON)
- {{df_status}} - Event status
- {{df_monitor_name}} - Monitor name
- {{date}} - Event generation timestamp
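To see why a misspelled or missing variable ends up empty, consider this minimal stand-in for placeholder substitution. It is not the platform's template engine; the `render` function and its treatment of unknown names are assumptions chosen to mirror the behavior described above.

```python
import re

# Matches {{ name }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def render(template, values):
    """Replace {{name}} with values[name]; unknown names become an empty string."""
    return PLACEHOLDER.sub(lambda m: str(values.get(m.group(1), "")), template)

values = {"Result": 93.5, "df_status": "critical"}

print(render("status={{df_status}} value={{Result}}", values))
# status=critical value=93.5

print(render("value={{result}}", values))  # lowercase "result" does not match "Result"
# value=
```

The second call shows the case-sensitivity pitfall: the lookup key must match the field name returned by the query exactly.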
What is the difference between the event title and event content? Which one is sent in notifications?
- Event Title: Displayed in the event list and also serves as the title for alert notifications (email subject, DingTalk message title, etc.).
- Event Content: The content displayed on the event details page, and also the default body of alert notifications.
Custom Notification Content:
When the monitor's "Custom Notification Content" switch is enabled:
- The event content is still saved in the event details.
- However, alert notifications use the custom template.
- Suitable for scenarios where content needs to be customized for different notification channels.
What template functions are supported in event content? Can numbers be formatted as percentages?
Currently supported template functions:
| Function | Purpose | Example |
|---|---|---|
| to_datetime | Convert timestamp to date | {{ date \| to_datetime }} |
| to_status_human | Convert status to readable text | {{ df_status \| to_status_human }} |
| to_fixed(n) | Fixed decimal places | {{ Result \| to_fixed(2) }} |
| to_percent | Convert to percentage | {{ Result \| to_percent }} |
| to_pretty_tags | Beautify tag output | {{ df_dimension_tags \| to_pretty_tags }} |
Event Status and Recovery
Is event recovery automatic or manual? Why do some events remain unrecovered?
Automatic Recovery: When the monitor detects that the metric has returned to normal, it automatically generates a recovery event, and df_status changes to ok.
Manual Recovery: You can manually recover events from the event list or details page.
Common Reasons for Unrecovered Status:
- Monitor Configuration Issue: Recovery conditions are not correctly configured.
- No Data Events: No recovery strategy configured or data has not been reported again.
- Detection Logic Issue: Threshold settings prevent automatic recovery determination.
Will events generated during a monitor's Mute period appear in the event list? What is their status?
Yes, they will appear in the event list, but:
- Alert notifications will not be sent.
- Incidents will not be created, even if synchronized incident creation is configured.
- The event status is recorded normally (critical/warning, etc.).
Muting only pauses notification sending; it does not affect the generation and recording of the events themselves.
No Data Events
What is a No Data Event? In what scenarios should it be enabled?
A No Data Event occurs when the monitor's query returns none of the expected data within a detection cycle.
Configuration Location: The "No Data Events" configuration area of the monitor (supported by some monitor types).
Three Handling Strategies:
- Do Not Trigger Event - Silent handling.
- Trigger Recovery Event - Treat no data as recovery from the abnormal state.
- Trigger No Data Event - Generate a dedicated no data alert (configurable level).
Why is a No Data Event sometimes not triggered even when data is clearly not being reported?
TrueWatch uses an "Edge Triggering" mechanism to determine no data:
If the last query found X, and the current query cannot find X, then X experiences a data gap.
Key Limitations:
- Missing data on the first detection does not raise an alert (the system has no baseline yet for which data should exist).
- It must have been in a "data present" state before a subsequent detection failure triggers a no data judgment.
Troubleshooting Suggestions:
- Confirm the monitor has run normally for at least one detection cycle and has queried data.
- Account for the "Detection Range Drift" mechanism (the actual detection time range is shifted back by 1 minute).
- Check the monitor's execution logs to confirm if the DQL query was successful.
What is the generation logic for No Data Events and Data Recovery Events?
Alternating Generation Mechanism:
- No Data Events and Data Recovery Events always appear alternately.
- Consecutive No Data Events will not be generated.
- Consecutive Data Recovery Events will not be generated.
Judgment Process:
First detection finds no data → No alert
↓
Data detected → Record "data present" state
↓
Subsequent detection finds no data → Trigger No Data Event
↓
Data reporting resumes → Trigger Data Recovery Event
↓
Subsequent detection finds no data → Trigger (new) No Data Event
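The judgment process above can be modeled as a small state machine. The sketch below is a simplified illustration of edge triggering, not TrueWatch's implementation; the function name and the event labels `no_data` and `data_recovery` are made up for the example.

```python
def no_data_events(observations):
    """Yield (index, event) pairs for a sequence of detection results.

    Each observation is True if the detection query returned data, else False.
    """
    seen_data = False  # has a "data present" state ever been recorded?
    in_gap = False     # is a No Data Event currently outstanding?
    for i, has_data in enumerate(observations):
        if has_data:
            if in_gap:
                yield i, "data_recovery"
            seen_data, in_gap = True, False
        elif seen_data and not in_gap:
            # Edge: data was present before and is now missing.
            in_gap = True
            yield i, "no_data"

print(list(no_data_events([False, True, False, False, True, False])))
# [(2, 'no_data'), (4, 'data_recovery'), (5, 'no_data')]
```

The first `False` produces nothing because no baseline exists yet, and the second consecutive gap at index 3 is suppressed, so the two event types strictly alternate.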
Event Association and Troubleshooting
How are "Associated Events" on the event details page associated? Why is it sometimes empty?
Association Logic:
- Based on identical detection dimension tags (e.g., host, service, etc.).
- Based on a time window (related events within the same time period).
Common Reasons for Empty Association:
- The dimension tag does not exist in other events.
- No other related events within the time window.
- The current event type does not support association (some monitor types).
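As a rough sketch of the association logic (identical dimension tags plus a time window), the following filter shows why the list can come back empty. The one-hour window, the record shape, and the `id` and `time` field names are assumptions for illustration; the platform's actual window is not stated here.

```python
from datetime import datetime, timedelta

def associated_events(current, candidates, window=timedelta(hours=1)):
    """Return candidates with identical dimension tags within `window` of `current`."""
    return [
        e for e in candidates
        if e is not current
        and e["df_dimension_tags"] == current["df_dimension_tags"]
        and abs(e["time"] - current["time"]) <= window
    ]

t0 = datetime(2024, 1, 1, 12, 0)
current = {"id": "a", "df_dimension_tags": {"host": "web-1"}, "time": t0}
others = [
    {"id": "b", "df_dimension_tags": {"host": "web-1"}, "time": t0 - timedelta(minutes=10)},
    {"id": "c", "df_dimension_tags": {"host": "web-2"}, "time": t0 - timedelta(minutes=5)},
    {"id": "d", "df_dimension_tags": {"host": "web-1"}, "time": t0 - timedelta(hours=3)},
]

print([e["id"] for e in associated_events(current, others)])
# ['b']
```

Event `c` is excluded because its tags differ, and event `d` because it falls outside the window.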
What does "Associated SLO" displaying 0 mean? When will data appear?
- Displaying 0: Indicates the event is not associated with any SLO task.
- When data appears: Only when the event is triggered by an SLO task will associated SLO information be displayed.
- Events triggered by monitors are not associated with SLO by default.
What does the "Detection Interval" dashed line in the historical trend chart mean? Can the time range be adjusted?
- Detection Interval Dashed Line: Marks the specific detection time window that triggered the alert.
- Chart Display: Shows the trend of the detection metric over a longer time range.
Click the "Get Chart Query" button in the upper right corner of the chart to jump to the Metrics or Log Explorer. In the Explorer, you can flexibly adjust the time range for longer-term trend analysis.
Event Source and Fields
What are the differences in fields for events from different sources (monitor, audit, OpenAPI)?
Different df_source values correspond to different additional fields:
| df_source | Source | Additional Fields |
|---|---|---|
| monitor | Monitor/Intelligent Inspection/SLO | Monitor-related fields (detection metrics, thresholds, etc.) |
| audit | Audit Event | Operator, operation type, change details, etc. |
| user | OpenAPI Write | User-defined fields |
What are the usage differences between custom events written via OpenAPI and system-generated events?
Functional Differences:
- Custom events can set fields such as df_status, df_title, and df_message.
- Custom events can specify df_dimension_tags for association.
- Custom events are not automatically associated with dashboards the way monitor events are.
- You must handle event recovery logic yourself (via API calls to the recovery interface).
Use Cases: External system integration, custom business alerts, batch import of historical events, etc.
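A custom event body might be assembled along these lines. This is a hedged sketch only: the helper name, the JSON-string encoding of df_dimension_tags, and the overall payload shape are assumptions, and the request path and authentication are deliberately omitted. Check the OpenAPI reference for the real schema before sending anything.

```python
import json

def build_custom_event(title, message, status, tags):
    """Build a hypothetical custom-event body using the df_* fields named above."""
    return {
        "df_title": title,
        "df_message": message,
        "df_status": status,
        # Assumption: tags are serialized as a JSON string with stable key order.
        "df_dimension_tags": json.dumps(tags, sort_keys=True),
    }

event = build_custom_event(
    title="Order queue backlog",
    message="Pending orders exceeded 1000",
    status="warning",
    tags={"service": "orders", "env": "prod"},
)
print(json.dumps(event, indent=2))
```

Keeping df_dimension_tags consistent between an alert and its later recovery call is what allows the two to be matched, so a stable serialization is a sensible choice.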
Audit Events
What specific operations do Audit Events record? Where can I view them?
Viewing Location: Management > Basic Settings > Security > Operation Audit
General Recording Scope:
- Adding/Deleting data authorizations (cross-workspace authorization).
- Creation, deletion, and modification of configurations like monitors, SLOs, alert strategies.
- Workspace member permission changes.
Field Characteristics: df_source = audit, contains audit-specific fields like operator, operation time, operation type, etc.
Common Troubleshooting
Why does the time displayed in the event details not match the actual incident occurrence time?
This is normal behavior, for two reasons:
- Planned Trigger Time vs. Actual Execution Time:
    - The time displayed for the event is the monitor's planned trigger time (the regular time given by its Crontab schedule).
    - It is not the time the event was actually generated in the system.
- Detection Range Drift:
    - The actual data detection range runs from (Planned trigger time - Detection range - 1 minute) to (Planned trigger time - 1 minute).
    - Therefore, the timestamp of the faulty data may be earlier than the event display time.
Troubleshooting Suggestion: Check the "Detection Metrics" area in the event details to confirm the actual data detection time range.
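The drifted detection range can be worked out by hand. The sketch below applies the formula from this answer; the function and constant names are illustrative only.

```python
from datetime import datetime, timedelta

DRIFT = timedelta(minutes=1)  # fixed 1-minute drift described above

def detection_window(planned_trigger, detection_range):
    """Return (start, end) of the data range actually examined by a detection."""
    end = planned_trigger - DRIFT
    start = end - detection_range
    return start, end

start, end = detection_window(datetime(2024, 1, 1, 12, 0), timedelta(minutes=5))
print(start.time(), end.time())
# 11:54:00 11:59:00
```

So an event displayed at 12:00 with a 5-minute detection range reflects data from 11:54 to 11:59, which is why the faulty data point can appear earlier than the event time.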
I can see the faulty data when querying directly in the platform, but the monitor didn't generate an event. Why?
Common reasons:
- Data Persistence Delay: The faulty data was not yet queryable when the detection executed. Monitors automatically shift the detection range back by 1 minute to avoid this, but the shift is not enough if the delay exceeds 1 minute.
- DQL Query Failure: The detection process was interrupted due to a query failure.
- Monitor Muted: In a Mute period, notifications are not sent, but events are still generated (check the event list).
- Threshold Configuration: The actual detection value did not reach the trigger threshold.
How long can event data be retained? How can it be exported or archived?
Long-term Storage Solutions:
- Use the Dataway Sink feature to divert event data to external storage.
- Periodically pull event data to local storage via OpenAPI.
- Use the Data Forwarding feature to forward to external systems like Kafka, S3, etc.