Detection Rules¶

The system provides multiple detection rules for monitors, each covering a different data scope.

Rule Types¶

Rule Name	Data Range	Basic Description
Threshold Detection	All	Anomaly detection based on set thresholds for metric data.
Mutation Detection	Metrics (M)	Detects sudden abnormal changes in metrics by comparing against historical data; suited to business data and short time windows.
Interval Detection	Metrics (M)	Detection of abnormal data points based on dynamic threshold ranges, suitable for stable trend time series.
Interval Detection V2	Metrics (M)	Detection of abnormal data points based on dynamic threshold ranges, suitable for stable trend time series.
Outlier Detection	Metrics (M)	Detects outlier deviations in metrics/statistical data of detected objects under specific groups.
Log Detection	Logs (L)	Anomaly detection for business applications based on log data.
Process Anomaly Detection	Process Objects (O::`host_processes`)	Periodically checks process data to identify process anomalies.
Infrastructure Survival Detection V2	Objects (O)	Monitors infrastructure stability based on infrastructure object data and set survival conditions.
Application Performance Metrics Detection	Traces (T)	Detects anomalies based on APM data by setting threshold rules.
Real User Metrics Detection	RUM Data (R)	Detects anomalies based on RUM data by setting threshold rules.
Composite Detection	All	Combines results from multiple monitors into one monitor using expressions, and alerts based on the combined result.
Synthetic Testing Anomaly Detection	Synthetic Testing Data (L::`type`)	Detects anomalies based on Synthetic Testing data by setting threshold rules.
Network Data Detection	Network (N)	Monitors network performance stability based on network data by setting threshold rules.
Third-party Event Detection	Others	Generates events from abnormal events or records that third-party systems send to an HTTP server at a specified URL via POST requests.
Infrastructure Change Detection	Objects (O)	Tracks infrastructure lifecycle changes to monitor change behaviors and identify configuration drift and unauthorized operations.
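
As a rough illustration of the Third-party Event Detection flow in the table above, the sketch below builds (without sending) a POST request that carries one abnormal event as JSON. The URL, the payload field names, and the `build_event_request` helper are hypothetical placeholders, not a documented schema; substitute the address and fields that apply to your own monitor.

```python
import json
import urllib.request

# Hypothetical endpoint; use the URL shown for your own third-party event monitor.
WEBHOOK_URL = "https://example.com/hypothetical/third-party-event"


def build_event_request(url: str, event: dict) -> urllib.request.Request:
    """Build a JSON POST request for one third-party event (not sent here)."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Field names below are illustrative assumptions, not a documented schema.
request = build_event_request(
    WEBHOOK_URL,
    {"title": "Disk usage high", "status": "critical", "message": "/data at 95%"},
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request, timeout=10)
```

Building the request separately from sending it makes the payload easy to inspect before it reaches the HTTP server.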

Rule Configuration¶

Detection Configuration¶

Set the detection frequency, detection interval, detection metrics, and other parameters appropriate to each detection rule.
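
To make the relationship between these settings concrete, here is a minimal sketch of a threshold-style check. It assumes the detection interval is the look-back window, that the metric is aggregated with a mean, and that the trigger condition is "greater than or equal to"; the function name and these choices are illustrative, not the product's implementation. The detection frequency would then be how often such a check is run.

```python
from statistics import mean


def threshold_breached(points, now, interval_s, threshold):
    """Return True if the mean of values inside the detection interval meets the threshold.

    points: iterable of (unix_timestamp, value) pairs.
    """
    # Keep only the data points that fall inside the look-back window.
    window = [value for ts, value in points if now - interval_s <= ts <= now]
    if not window:
        return False  # no data in the interval; data outages are a separate concern
    return mean(window) >= threshold


samples = [(0, 10.0), (60, 95.0), (120, 97.0)]
print(threshold_breached(samples, now=120, interval_s=60, threshold=90.0))
```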

Event Notification¶

Event Title¶

Define the event name shown when the alert trigger conditions are met; predefined template variables are supported.

Note

In the latest version, the monitor name is generated automatically from the event title once the title is entered. Older monitors may have a monitor name that differs from the event title; updating them to the latest version is recommended.

Event Content¶

Write the event notification content. When the trigger conditions are met, the system sends this content externally. It generally includes:

A Markdown-formatted body;
Related links and template variables;
Related logs or error information, added through advanced configuration;
The members to notify with the event content.

Note

The @ member configuration takes effect, and sends the event content to the specified members, only when Associated Incident is enabled.

Related Links¶

The monitor automatically generates jump links based on the detection metrics in the detection configuration. After inserting a link, you can adjust its filter conditions and time range. The link generally starts with a fixed address prefix containing the current domain and workspace ID; you can also define a custom jump link.

To insert a link to a dashboard, follow the same logic and additionally supply the dashboard ID and name, adjusting view variables and the time range as needed.

Custom Advanced Configuration¶

Through advanced configuration, you can add related logs or error stacks to the event content to view contextual data when anomalies occur.

Add related logs:

Query:

For example, get one log message from the `default` index:

{% set dql_data = DQL("L::RE(`.*`):(`message`) { `index` = 'default' } LIMIT 1") %}

Related logs:

{{ dql_data.message | limit_lines(10) }}

Add related error stacks:

Query:

{% set dql_data = DQL("T::re(`.*`):(`error_message`,`error_stack`){ (`source` NOT IN ['service_map', 'tracing_stat', 'service_list_1m', 'service_list_1d', 'service_list_1h', 'profile']) AND (`error_stack` = exists()) } LIMIT 1") %}

Related error stacks:

{{ dql_data.error_message | limit_lines(10) }}

{{ dql_data.error_stack | limit_lines(10) }}
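
The `limit_lines(10)` filter in the templates above presumably truncates a value to its first lines so that a long log message or stack trace does not flood the notification. The platform's actual implementation is not shown in this document; the following Python function is only a minimal sketch of that assumed behavior.

```python
def limit_lines(text: str, max_lines: int) -> str:
    """Keep at most `max_lines` lines of `text` (assumed behavior of the template filter)."""
    lines = text.splitlines()
    return "\n".join(lines[:max_lines])


# A 25-line stand-in for an error stack; only the first 10 lines are kept.
stack = "\n".join(f"frame {i}" for i in range(25))
print(limit_lines(stack, 10))
```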

Custom Notification Content¶

By default, the system uses the event content as the alert notification content. If you need to customize the actual notification sent externally, you can enable the switch here and fill in the notification information.

Note

Different alert notification objects support different Markdown syntax. For example, WeCom does not support unordered lists.

Data Outage Events¶

Customize the notification sent for data outage events, including the title and content used when such events are sent externally.

If nothing is configured here, the official default notification template is used automatically when sending externally.

Associated Incident¶

When enabled, if an abnormal event is generated under this monitor, an Issue will be created synchronously. You can choose to create Issues for different event levels.

Select the event level;
Define the level of the Issue ultimately generated;
Select the responsible person for this type of Issue;
Select the delivery channel;
Optionally choose whether to close the Issue synchronously after the event recovers.

The Issues generated here can be viewed in Incident > the selected channel.

Alert Configuration¶

When the monitor's trigger conditions are met, alert messages are sent immediately to the specified notification objects. An alert strategy specifies the event levels to notify on, the notification objects, and the alert silence period.

You can select one or more alert strategies. Click a strategy name to expand its details page. To modify the strategy, click Edit Alert Strategy.

Association¶

Monitors can be associated with dashboards so that you can jump quickly to a visual view of the related data.

Permissions¶

Set the operation permissions of the monitor to ensure that different users perform configuration operations according to their roles and permission levels.

Do not enable this configuration: Follow the default permissions of "Monitor Configuration Management";
Enable this configuration and select custom permission objects: Only the creator and the assigned permission objects can enable/disable, edit, and delete the rules set for this monitor;
Enable this configuration but do not select custom permission objects: Only the creator has the permissions to enable/disable, edit, and delete this monitor.

Note

The Owner role of the current workspace is not affected by the operation permission configuration here.
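
The three permission cases and the Owner exception can be summarized as a small decision function. This is a sketch under stated assumptions: the `Monitor` class, the `can_manage` function, and the use of plain user names for "permission objects" are hypothetical simplifications, and `has_default_permission` stands in for whatever "Monitor Configuration Management" grants the user.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    creator: str
    custom_permissions_enabled: bool = False
    permitted_users: set = field(default_factory=set)  # assigned permission objects


def can_manage(user, monitor, is_workspace_owner, has_default_permission):
    """Decide whether `user` may enable/disable, edit, or delete `monitor`."""
    if is_workspace_owner:
        return True  # the Owner role is not affected by this configuration
    if not monitor.custom_permissions_enabled:
        # Configuration not enabled: follow the default permissions.
        return has_default_permission
    # Enabled: the creator is always allowed, plus any assigned objects (possibly none).
    return user == monitor.creator or user in monitor.permitted_users


locked = Monitor(creator="alice", custom_permissions_enabled=True)
print(can_manage("bob", locked, is_workspace_owner=False, has_default_permission=True))
```

With the configuration enabled and no permission objects selected, only the creator (and the workspace Owner) passes the check, even if other users hold the default permission.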

Trigger Detection Immediately¶

After the rule configuration is completed, you can choose to trigger detection immediately to test the overall effect of the current rule configuration.

Recover Monitor¶

You can view the status, last update time, creation time, and creator of an existing monitor. You can also view a monitor's historical configuration and recover it, which helps team members communicate and collaborate when updating monitors.

Operation Example:

In Monitoring > Monitors, open an existing monitor for editing. On the monitor configuration page, click the button in the upper right corner to view the monitor's status, last update time, creation time, and creator.

Click the view button to the right of Update Time to open the previous version of the monitor configuration in a new browser window.

On the previous version's page, click Recover This Version in the upper right corner and confirm the recovery in the pop-up dialog box. The monitor configuration is restored to that version, ready for editing and saving.
