
Detection Rules


The system supports multiple types of detection rules, each covering a different data range.

Rule Types

| Rule Name | Data Range | Basic Description |
| --- | --- | --- |
| Threshold Detection | All | Anomaly detection based on set thresholds for metric data. |
| Mutation Detection | Metrics (M) | Anomaly detection based on historical data for sudden abnormal behavior of metrics, suitable for business data and short time windows. |
| Interval Detection | Metrics (M) | Detection of abnormal data points based on dynamic threshold ranges, suitable for time series with stable trends. |
| Interval Detection V2 | Metrics (M) | Detection of abnormal data points based on dynamic threshold ranges, suitable for time series with stable trends. |
| Outlier Detection | Metrics (M) | Detects whether metrics or statistical data under specific groupings contain outliers. |
| Log Detection | Logs (L) | Anomaly detection for business applications based on log data. |
| Process Anomaly Detection | Process Objects (O::host_processes) | Periodically checks process data to detect process anomalies. |
| Infrastructure Survival Detection V2 | Objects (O) | Based on infrastructure object data, sets survival conditions to monitor infrastructure stability. |
| Application Performance Metrics Detection | Traces (T) | Based on APM data, sets threshold rules to detect anomalies. |
| Real User Metrics Detection | Real User Data (R) | Based on RUM data, sets threshold rules to detect anomalies. |
| Composite Detection | All | Combines the results of multiple monitors through expressions into one monitor, and alerts based on the combined result. |
| Security Check Anomaly Detection | Security Checks (S) | Anomaly detection based on data generated by security checks, giving an effective view of host health status. |
| Synthetic Testing Anomaly Detection | Synthetic Testing Data (L::type) | Based on synthetic testing data, sets threshold rules to detect anomalies. |
| Network Data Detection | Network (N) | Based on network data, sets threshold rules to detect network performance stability. |
| Third-party Event Detection | Others | Generates event data from abnormal events or records that third-party systems send via POST requests to an HTTP server at a specified URL. |
| Infrastructure Change Detection | Objects (O) | Tracks the lifecycle of infrastructure and monitors change behaviors to accurately identify configuration drift, illegal operations, and other anomalies. |
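To make the simplest rule type concrete, the following is a minimal Python sketch of threshold-style detection. It is an illustration only: the level names (`critical`, `warning`), the limits (90 and 75), and the "greater than or equal" comparison are assumptions made for the example and do not describe how the product is configured.

```python
from typing import List, Optional, Tuple

# Hypothetical severity thresholds, ordered from most to least severe.
THRESHOLDS: List[Tuple[str, float]] = [("critical", 90.0), ("warning", 75.0)]


def classify(value: float, thresholds: List[Tuple[str, float]] = THRESHOLDS) -> Optional[str]:
    """Return the first level whose limit the value meets or exceeds, else None."""
    for level, limit in thresholds:
        if value >= limit:
            return level
    return None  # value is within the normal range


# Example: evaluate a few sampled metric values.
samples = [42.0, 80.5, 97.3]
levels = [classify(v) for v in samples]
```

Checking the most severe level first ensures that a value exceeding both limits is reported once, at the higher level.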

Rule Configuration

Detection Configuration

Set the detection frequency, detection interval, detection metrics, and other parameters that apply to the selected detection rule.

Event Notification

Event Title

Define the event name for the alert trigger condition; pre-defined template variables can be used.
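As a rough illustration of how template variables in a title are filled in, here is a minimal Python sketch that substitutes `{{ name }}` placeholders. The variable names `status` and `host` are hypothetical and are not the product's pre-defined variables; the actual rendering is performed by the system's own template engine.

```python
import re


def render_title(template: str, variables: dict) -> str:
    """Replace each {{ name }} placeholder with its value.

    Placeholders whose name is not in `variables` are left unchanged.
    """
    def substitute(match):
        name = match.group(1)
        return str(variables.get(name, match.group(0)))

    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


# Hypothetical variables, used only for this example.
title = render_title(
    "[{{ status }}] High CPU usage on {{ host }}",
    {"status": "critical", "host": "web-01"},
)
```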

Note

In the latest version, the monitor name is generated automatically as the event title is entered. In older monitors, the monitor name and the event title may be inconsistent. It is recommended to synchronize them to the latest version.

Event Content

Write the event notification content. When the trigger condition is met, the system will send this content externally. It generally includes the following information:

Note

The @ member configuration takes effect, and the event content is sent to the specified members, only when Related Incident Tracking is enabled.

The monitor automatically generates jump links based on the detection metrics in the detection configuration. After inserting a link, you can adjust its filter conditions and time range. A generated link generally starts with a fixed address prefix that contains the current domain name and workspace ID; you can also define a custom jump link.

To insert a link to a dashboard, follow the same logic and additionally supply the dashboard ID and name, then adjust the view variables and time range as needed.

Custom Advanced Configuration

Through advanced configuration, you can add related logs or error stacks to the event content to view the context data when anomalies occur.

  • Add related logs:

Query:

For example, fetch one log message from the index `default`:

{% set dql_data = DQL("L::RE(`.*`):(`message`) { `index` = 'default' } LIMIT 1") %}

Related log:

{{ dql_data.message | limit_lines(10) }}
  • Add related error stack:

Query:

{% set dql_data = DQL("T::re(`.*`):(`error_message`,`error_stack`){ (`source` NOT IN ['service_map', 'tracing_stat', 'service_list_1m', 'service_list_1d', 'service_list_1h', 'profile']) AND (`error_stack` = exists()) } LIMIT 1") %}

Related error stack:

{{ dql_data.error_message | limit_lines(10) }}

{{ dql_data.error_stack | limit_lines(10) }}
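Both templates above pipe the query result through `limit_lines(10)` so that a long message or stack does not flood the notification. The filter's exact behavior is not described here; the Python sketch below assumes it simply keeps the first N lines and discards the rest.

```python
def limit_lines(text: str, max_lines: int = 10) -> str:
    """Keep at most `max_lines` lines of `text` (assumed behavior of the filter)."""
    lines = text.splitlines()
    return "\n".join(lines[:max_lines])


# Example: a 25-line stack trace is cut down to its first 10 lines.
stack = "\n".join(f"frame {i}" for i in range(25))
shortened = limit_lines(stack, 10)
```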
Custom Notification Content

By default, the system will use the event content as the alert notification content. If you need to customize the actual notification sent externally, you can enable the switch here and fill in the notification information.

Note

Different alert notification targets support different Markdown syntax. For example, WeCom does not support unordered lists.

Data Outage Event

Customize the notification content for data outages. You can configure the title, content, and other information that is sent externally for this type of event.

If nothing is configured here, the official default notification template is used automatically when the event is sent externally.

Related Incident Tracking

After this is enabled, an Issue is created at the same time whenever an abnormal event is generated under this monitor. You can choose to create Issues for different event levels.

  1. Select the event level;
  2. Define the level of the Issue to be created;
  3. Select the responsible person for this type of Issue;
  4. Select the delivery channel;
  5. Optionally, choose whether to close the Issue automatically after the event recovers.

The Issues generated here can be viewed in Incident Tracking > the selected channel.

Alert Configuration

When the monitor's trigger condition is met, the system immediately sends an alert message to the specified notification targets. An alert strategy defines the event levels to be notified, the notification targets, and the alert silence period.

You can select one or more alert strategies. Click a strategy name to expand its details page. To modify the strategy, click Edit Alert Strategy.
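The effect of an alert silence period can be sketched as follows. The exact semantics are not defined in this document; this Python sketch assumes that, once an alert has been sent for an event, repeated alerts for the same event are suppressed until the silence period has elapsed. The class name `AlertSilencer` and the `event_key` parameter are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Dict


class AlertSilencer:
    """Suppress repeated alerts for the same event during a silence period."""

    def __init__(self, silence: timedelta) -> None:
        self.silence = silence
        self._last_sent: Dict[str, datetime] = {}  # event key -> time of last alert

    def should_send(self, event_key: str, now: datetime) -> bool:
        last = self._last_sent.get(event_key)
        if last is not None and now - last < self.silence:
            return False  # still inside the silence period
        self._last_sent[event_key] = now
        return True


silencer = AlertSilencer(timedelta(minutes=30))
start = datetime(2024, 1, 1, 12, 0)
first = silencer.should_send("cpu_high", start)                           # sent
repeat = silencer.should_send("cpu_high", start + timedelta(minutes=10))  # suppressed
later = silencer.should_send("cpu_high", start + timedelta(minutes=31))   # sent again
```

Keying the record by event lets unrelated events alert independently while the same event stays quiet.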

Association

You can associate the monitor with a dashboard to jump to it quickly and view the data visually.

Permissions

Set the operation permissions of the monitor to ensure that different users perform configuration operations according to their roles and permission levels.

  • Do not enable this configuration: Follow the default permissions of "Monitor Configuration Management";
  • Enable this configuration and select custom permission objects: Only the creator and the assigned permission objects can enable/disable, edit, and delete the rules set for this monitor;
  • Enable this configuration but do not select custom permission objects: Only the creator has the permissions to enable/disable, edit, and delete this monitor.

Note

The Owner role of the current workspace is not affected by the operation permission configuration here.
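The three permission cases and the Owner exception can be summarized in a short Python sketch. It is a simplification under stated assumptions: permission objects are modeled as a plain set of user names, and `has_default_permission` stands in for whatever the default "Monitor Configuration Management" permission grants. All function and parameter names are hypothetical.

```python
from typing import Optional, Set


def can_modify_monitor(
    user: str,
    creator: str,
    is_owner: bool,
    has_default_permission: bool,
    custom_enabled: bool,
    permission_objects: Optional[Set[str]] = None,
) -> bool:
    """Decide whether `user` may enable/disable, edit, or delete the monitor."""
    if is_owner:
        return True  # the workspace Owner is not affected by this configuration
    if not custom_enabled:
        return has_default_permission  # follow the default permissions
    # Custom permissions enabled: the creator plus any selected objects.
    # With no objects selected, only the creator remains.
    allowed = {creator} | (permission_objects or set())
    return user in allowed


# Example: custom permissions enabled with one assigned member.
alice_ok = can_modify_monitor("alice", "bob", False, True, True, {"alice"})
carol_ok = can_modify_monitor("carol", "bob", False, True, True, {"alice"})
```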

Recover Monitor

You can view the status, last update time, creation time, and creator of an existing monitor. You can also recover a monitor to a previous version to review its historical configuration, which helps you communicate and collaborate with other team members when updating monitors.

Operation Example:

In Monitoring > Monitors, open an existing monitor for editing. On the monitor configuration page, click the button in the upper right corner to view the monitor's status, last update time, creation time, and creator.

Click the view button to the right of Update Time to open the previous version of the monitor configuration in a new browser window.

Click Recover This Version in the upper right corner of the previous version. In the pop-up dialog, confirm the recovery to restore the previous version of the monitor configuration, which you can then edit and save.