Skip to content

DQL Function Reference


DQL provides a rich set of functions for data aggregation, transformation, and matching. This document details the semantics, parameters, and usage of each function.

Aggregation Functions

Aggregation functions are used to aggregate multiple rows of data into a single value, typically used with time windows (time-expr) and grouping (BY clause).

Basic Aggregation

sum

Calculates the sum of field values.

Syntax:

sum(field)

Parameters:

  • field: A numeric field

Examples:

// Calculate total request count
M::http_requests:(sum(request_count)) [1h]

// Calculate total request count grouped by service
M::http_requests:(sum(request_count)) [1h] BY service

avg

Calculates the average of field values.

Syntax:

avg(field)

Parameters:

  • field: A numeric field

Examples:

// Calculate average response time
M::response_time:(avg(duration)) [1h] BY endpoint

// Calculate average CPU usage
M::cpu:(avg(usage)) [1h] BY host

count

Counts the number of data rows.

Syntax:

count(field)
count(*)

Parameters:

  • field: Any field, counts the number of non-null values
  • *: Counts all rows

Examples:

// Count log entries
L::nginx:(count(*)) [1h]

// Count requests that have a response time
M::response_time:(count(duration)) [1h] BY service

min / max

Calculates the minimum or maximum value of a field.

Syntax:

min(field)
max(field)

Parameters:

  • field: A numeric field

Examples:

// Find the maximum response time
M::response_time:(max(duration)) [1h] BY endpoint

// Find the range of CPU usage
M::cpu:(min(usage), max(usage)) [1h] BY host

first / last

Gets the first or last value (in chronological order).

Syntax:

first(field)
last(field)
last_row(field)

Parameters:

  • field: Any field

Description:

  • first: Returns the earliest value in time
  • last: Returns the latest value in time; if the field is an array type, it will be expanded
  • last_row: Returns the latest value in time; array types are not expanded

Examples:

// Get the latest status value
M::system:(last(status)) [1h] BY host

// Get initial and final values
M::counter:(first(value), last(value)) [1h] BY metric

any

Returns any non-null value. Suitable for obtaining sample data or scenarios where a specific aggregation order is not required.

Syntax:

any(field)

Parameters:

  • field: Any field

Examples:

// Get a sample message
L::logs:(any(message)) [1h] BY service

// Get a sample error stack trace
L::error_logs:(any(stack_trace)) [1h] BY error_type

spread

Calculates the range (difference between maximum and minimum).

Syntax:

spread(field)

Parameters:

  • field: A numeric field

Examples:

// Calculate the fluctuation range of response time
M::response_time:(spread(duration)) [1h] BY endpoint

stddev

Calculates the standard deviation.

Syntax:

stddev(field)

Parameters:

  • field: A numeric field

Examples:

// Calculate the standard deviation of response time
M::response_time:(stddev(duration)) [1h] BY endpoint

mode

Calculates the mode (the most frequently occurring value).

Syntax:

mode(field)

Parameters:

  • field: Any field

Examples:

// Find the most common response status code
M::http:(mode(status)) [1h] BY endpoint

count_series

Counts the number of time series (groups). Returns how many independent time series are within the current query scope.

Syntax:

count_series(field)

Parameters:

  • field: Any field (typically * or any existing field)

Examples:

// Count how many hosts are reporting CPU metrics
M::cpu:(count_series(*)) [1h]

// Count how many instances each service has
M::http_requests:(count_series(*)) [1h] BY service

Statistical Aggregation (Estimation Functions)

The following functions use probabilistic data structures for estimation, suitable for large-scale data scenarios, balancing precision and performance.

count_distinct

Calculates the number of distinct values for a field (estimated value).

Syntax:

count_distinct(field)

Parameters:

  • field: Any field

Algorithm Description:

Uses the HyperLogLog algorithm for cardinality estimation: - Number of registers: 2¹⁶ = 65536 - Uses the LogLog-Beta estimation method - Standard error: approximately 0.4%

Use Cases:

  • Counting unique users (UV)
  • Calculating the number of distinct IP addresses
  • Analyzing the number of unique request IDs

Examples:

// Count unique users
L::access_logs:(count_distinct(user_id)) [1d] BY service

// Count distinct accessing IPs
L::nginx:(count_distinct(client_ip)) [1h] BY endpoint

percentile

Calculates the percentile of a field (estimated value).

Syntax:

percentile(field, n)
pXX(field)  // Short form

Parameters:

  • field: A numeric field
  • n: Percentile, range 0-100

Short Forms:

  • p50(field) is equivalent to percentile(field, 50)
  • p95(field) is equivalent to percentile(field, 95)
  • p99(field) is equivalent to percentile(field, 99)

Algorithm Description:

Uses a log-linear interpolation histogram for estimation: - Bucket range: 10⁻⁹ to 10¹⁸, covering the vast majority of numerical scenarios - Each order of magnitude is divided into 128 buckets - Uses linear interpolation in log space to improve accuracy

Use Cases:

  • Calculating P99, P95 for response times
  • Analyzing tail latency of performance metrics
  • Evaluating Service Level Agreement (SLA) compliance

Examples:

// Calculate P99 response time
M::response_time:(percentile(duration, 99)) [1h] BY service

// Using the short form
M::response_time:(p99(duration)) [1h] BY service

// Calculate multiple percentiles simultaneously
M::response_time:(p50(duration), p95(duration), p99(duration)) [1h] BY service

median

Calculates the median, equivalent to percentile(field, 50).

Syntax:

median(field)

Examples:

// Calculate the median response time
M::response_time:(median(duration)) [1h] BY service

Histogram Functions

DQL provides three histogram-related functions for different data sources and scenarios:

Function Use Case Data Source Type Recommendation
histogram_auto Numerical distribution statistics for detailed data like logs, Trace Detail model (Log/Trace) ⭐⭐⭐ Recommended
histogram Histogram with fixed bucket boundaries Detail model (Log/Trace) ⭐⭐ Deprecated
histogram_quantile Calculate quantiles from Prometheus histogram metrics Prometheus metrics ⭐⭐⭐ Recommended

Automatically generates a distribution histogram, specifically designed for numerical distribution statistics of detailed data like logs and Trace.

Features:

  • No need to specify bucket boundaries; automatically adapts to data distribution
  • Uses a log-linear interpolation histogram algorithm, covering a numerical range from 10⁻⁹ to 10¹⁸
  • Returns both quantile statistics and bucket distribution information

Syntax:

histogram_auto(field)

Parameters:

  • field: A numeric field

Return Values: | Column Name | Description | | ---------------- | ------------------------------- | | lower_bounds | Array of lower bounds for each bucket | | upper_bounds | Array of upper bounds for each bucket | | counts | Array of counts for each bucket | | min | Minimum value | | p50 | Median (50th percentile) | | p75 | 75th percentile | | p90 | 90th percentile | | p95 | 95th percentile | | p99 | 99th percentile | | max | Maximum value |

Algorithm Description: Uses an estimation histogram (log-linear interpolation), dividing each order of magnitude into 128 buckets, suitable for distribution statistics on large datasets.

Use Cases:

  • Analyzing response time distribution in logs
  • Statistics on time consumption distribution in Trace
  • Exploratory data analysis without preset bucket boundaries

Examples:

// Analyze response time distribution in Nginx access logs
L::nginx:(histogram_auto(response_time)) [1h]

// Statistics on request duration distribution by service
L::app_logs:(histogram_auto(duration)) [1h] BY service

// Statistics on time consumption distribution for Trace calls
T::http_client:(histogram_auto(elapsed)) [1h] BY operation

Result Example:

lower_bounds upper_bounds counts min p50 p75 p90 p95 p99 max
[0, 10, 100] [10, 100, 1000] [1000, 500, 100] 0.5 45 120 280 450 850 1200

Note: lower_bounds, upper_bounds, counts are array types, representing the boundaries and counts of each bucket.


histogram (Deprecated)

Generates a histogram with specified bucket boundaries. This function is Deprecated; it is recommended to use histogram_auto instead.

Description: histogram requires manual specification of bucket boundary parameters, making it less flexible. histogram_auto can automatically adapt to data distribution, covers a wider numerical range, and returns richer statistical information.

Syntax:

histogram(field, left_bound, right_bound, bucket_size [, threshold])

Parameters:

  • field: A numeric field
  • left_bound: Left boundary
  • right_bound: Right boundary
  • bucket_size: Bucket size
  • threshold (optional): Minimum count per bucket; buckets below this value are not returned

Return Values: Returns two columns: bucket_le (right boundary of the bucket) and count (count)

Examples:

// Generate a histogram in the range 0-1000ms, with one bucket every 100ms
M::response_time:(histogram(duration, 0, 1000, 100)) [1h]

// Recommended to use histogram_auto instead
M::response_time:(histogram_auto(duration)) [1h]

Result Example:

bucket_le count
100 1500
200 2800
300 3500
... ...
1000 5000

histogram_quantile

Calculates quantiles from Prometheus histogram metrics.

Features:

  • Specifically designed to handle histogram-type metrics reported by Prometheus
  • Relies on the le label (or VictoriaMetrics' vmrange label) to identify bucket boundaries
  • Input data should be cumulative counts

Syntax:

histogram_quantile(field, q)

Parameters:

  • field: Histogram count field (e.g., http_request_duration_bucket)
  • q: Quantile, range 0-1 (e.g., 0.99 represents P99)

Use Case Comparison:

Scenario Recommended Function Description
Analyzing response time distribution in logs histogram_auto Logs are detailed data, no pre-aggregated histogram
Analyzing P99 of Prometheus histogram metrics histogram_quantile Metrics are pre-aggregated by le label
Statistics on time consumption distribution for Trace calls histogram_auto Trace is detailed data

le Label Processing Mechanism:

histogram_quantile relies on the le label (less than or equal) to identify histogram bucket boundaries:

  1. Prometheus format (default): Uses the le label directly to represent the bucket upper boundary
  2. le value is a number (e.g., "0.1", "1", "10") or "+Inf" (infinity)
  3. Data should be cumulative counts

  4. VictoriaMetrics format: Uses the vmrange label to represent a range

  5. Format is "lower...upper" (e.g., "0.1...0.2")
  6. Data is range counts (non-cumulative)
  7. The function automatically converts range counts to cumulative counts

Calculation Process: 1. Sort all buckets by le value 2. If in vmrange format, accumulate counts to convert to cumulative distribution 3. Ensure bucket counts are monotonically increasing (fix possible abnormal data) 4. Use linear interpolation to calculate the target quantile

Differences from PromQL:

Feature DQL PromQL
Function Type Aggregation function Transformation function
Input Data Directly reads metrics with le label Requires sum(rate(...)) by (le)
Usage histogram_quantile(field, 0.99) histogram_quantile(0.99, sum(rate(...)) by (le))
Data Format Supports both le and vmrange labels Only supports le label
Grouping Via DQL's BY clause Via explicit by (le) grouping

Equivalent Examples:

Assume a histogram metric http_request_duration_bucket, containing le labels (e.g., 0.1, 0.5, 1, 5, +Inf) and service label.

Scenario 1: Calculate P99 Latency

DQL:

M::http_request_duration:(histogram_quantile(duration_bucket, 0.99)) [1h] BY service

PromQL equivalent:

histogram_quantile(0.99, sum(rate(http_request_duration_bucket[1h])) by (le, service))

Scenario 2: Calculate P95 Latency for Each Service (Multiple Groups)

DQL:

M::http_request_duration:(
    histogram_quantile(duration_bucket, 0.95)
) [1h] BY service, endpoint

PromQL equivalent:

histogram_quantile(0.95, sum(rate(http_request_duration_bucket[1h])) by (le, service, endpoint))

Scenario 3: Calculate P50 (Median) and P99

DQL:

M::http_request_duration:(
    histogram_quantile(duration_bucket, 0.50) as p50,
    histogram_quantile(duration_bucket, 0.99) as p99
) [1h] BY service

PromQL equivalent:

label_join(
  histogram_quantile(0.50, sum(rate(http_request_duration_bucket[1h])) by (le, service)), "quantile", "", "0.50"
)
or
label_join(
  histogram_quantile(0.99, sum(rate(http_request_duration_bucket[1h])) by (le, service)), "quantile", "", "0.99"
)

Note: PromQL requires label_join or label_replace to distinguish results from different quantiles.

Notes:

  • Input data must contain le or vmrange labels, otherwise calculation is impossible
  • If there is no +Inf bucket, the upper boundary of the last bucket will be used as the maximum value
  • Buckets with count 0 or NaN are skipped
  • Returns -Inf when quantile < 0, and +Inf when > 1

TopN Functions

top

Gets the top N largest values.

Syntax:

top(field, n)

Parameters:

  • field: A numeric field
  • n: Number of values to return

Examples:

// Get the 5 requests with the longest response time
M::response_time:(top(duration, 5)) [1h] BY service

// Get the 10 hosts with the highest traffic
M::network:(top(bytes, 10)) [1h]

Result Example:

service top(duration, 5)
api 1250
api 1180
api 1050
api 980
api 920

Note: Returns multiple rows, each containing one TopN value.


bottom

Gets the top N smallest values.

Syntax:

bottom(field, n)

Parameters:

  • field: A numeric field
  • n: Number of values to return

Examples:

// Get the 5 requests with the shortest response time
M::response_time:(bottom(duration, 5)) [1h] BY service

Result Example:

service bottom(duration, 5)
api 12
api 18
api 25
api 32
api 45

Note: Returns multiple rows, each containing one BottomN value.


Value Collection Functions

distinct

Returns all distinct values of a field.

Syntax:

distinct(field)

Examples:

// Get all distinct status codes
M::http:(distinct(status)) [1h] BY endpoint

Result Example:

endpoint distinct(status)
/api/v1 200
/api/v1 404
/api/v1 500
/health 200

Note: Returns multiple rows, each containing one distinct value.


distinct_by_collapse

Gets distinct values of a field according to a collapse strategy, retaining the last value of other fields during deduplication.

Syntax:

distinct_by_collapse(field, [last_field1, last_field2, ...])

Parameters:

  • field: Field used as the basis for deduplication
  • last_fields (optional): List of fields for which the last value should be retained

Description: Unlike distinct, distinct_by_collapse retains information from other related fields (taking the last value) during deduplication, suitable for scenarios where contextual information needs to be preserved.

Examples:

// Get distinct user IDs, retaining the last access time for each user
L::access_logs:(distinct_by_collapse(user_id, [timestamp])) [1h]

// Get distinct hosts, retaining the last status and message
O::HOST:(distinct_by_collapse(host, [status, message])) [1h]

Result Example:

user_id last(timestamp) last(path)
user001 1704067200000 /checkout
user002 1704067100000 /product
user003 1704067000000 /home

Note: Returns the deduplicated main field value, and the last value of other fields specified by last_fields.


collect

Collects all values (including duplicates).

Syntax:

collect(field [, limit])

Parameters:

  • field: Any field
  • limit (optional): Maximum number of values to collect

Examples:

// Collect all response times
M::response_time:(collect(duration)) [1h] BY service

// Collect up to 100 values
M::response_time:(collect(duration, 100)) [1h] BY service

Result Example:

service collect(duration)
api [120, 135, 98, 142, ...]
web [45, 52, 48, 61, ...]

Note: Returns an array type, containing all collected values (may include duplicates).


collect_distinct

Collects all distinct values.

Syntax:

collect_distinct(field [, limit])

Parameters:

  • field: Any field
  • limit (optional): Maximum number of values to collect

Examples:

// Collect all distinct error types
L::error_logs:(collect_distinct(error_type)) [1h] BY service

Result Example:

service collect_distinct(error_type)
api ["timeout", "connection refused", "404"]
web ["200", "301", "404"]

Note: Returns an array type, containing all deduplicated values.


field_values

Gets all values of a field, returning an array type.

Syntax:

field_values(field)

Examples:

// Get all tag values
M::metrics:(field_values(tags)) [1h] BY metric_name

Result Example:

metric_name field_values(tags)
cpu_usage ["host:A", "env:prod", "team:backend"]
memory_used ["host:B", "env:staging", "team:frontend"]

Note: Returns an array type, containing all values of the field.


Filter Aggregation

count_filter

Counts the number of field values that are in a specified list.

Syntax:

count_filter(field, [value1, value2, ...])

Parameters:

  • field: Any field
  • values: List of values

Examples:

// Count requests with specific status codes
M::http:(count_filter(status, [200, 201, 204])) [1h] BY endpoint

// Count error-level logs
L::logs:(count_filter(level, ["error", "critical"])) [1h] BY service

Helper Functions

default

Sets a default value for a field, returning the default value when the field is empty.

Syntax:

default(field, default_value)

Parameters:

  • field: Any field
  • default_value: Default value (can be a number, string, boolean, or null)

Examples:

// Set a default response time for null values
M::response_time:(default(duration, 0)) [1h] BY service

Time Series Functions

Time series functions are used to process data that changes over time, especially Counter-type metrics.

Rollup Functions

Rollup functions are used to preprocess raw time series data within a time window. For detailed explanation, please refer to Rollup Functions in this article.

Writing Instructions:

  • Rollup is written in the time clause, e.g., [rate], [1h::5m:rate].
  • Writing in the outer layer of the query (e.g., rate(DQL)) belongs to outer functions, not Rollup.

Supported Rollup Functions:

Function Description
rate Calculates growth rate (per second)
irate Calculates instantaneous growth rate
increase Calculates increase amount
deriv Calculates derivative (rate of change)
difference Calculates difference
non_negative_derivative Calculates non-negative derivative
non_negative_difference Calculates non-negative difference
rate_over_sum Calculates average per second
rate_over_count Calculates count per second
sum Sum
avg Average
min Minimum
max Maximum
count Count
first First value
last Last value
stddev Standard deviation
mode Mode
spread Range
any Any value

Examples:

// Calculate request QPS
M::http_requests:(sum(request_count)) [1h::5m:rate] BY service

// Short form
M::cpu:(max(usage)) [rate]

Growth Rate Calculation

rate

Calculates the growth rate of a metric (per second).

Syntax:

rate(field)

Description: rate calculates the average growth rate of a Counter metric within a time window. For monotonically increasing Counter-type metrics, aggregating raw values directly is meaningless; the growth rate needs to be calculated first.

Use Cases:

  • Calculating request QPS
  • Calculating data write rate
  • Analyzing traffic growth trends

Examples:

// Calculate request QPS
M::http_requests:(sum(request_count)) [rate] BY service

// Calculate data write rate
M::data_ingestion:(sum(bytes)) [rate] BY source

irate

Calculates the instantaneous growth rate of a metric.

Syntax:

irate(field)

Description: Unlike rate, irate uses only the last two data points to calculate the growth rate, reflecting the instantaneous rate of change, making it more suitable for alerting scenarios.

Examples:

// Calculate instantaneous request QPS
M::http_requests:(sum(request_count)) [irate] BY service

increase

Calculates the increase amount of a metric.

Syntax:

increase(field)

Description: increase returns the total increase amount within the time window, not the growth rate.

Examples:

// Calculate total request increase
M::http_requests:(sum(request_count)) [increase] BY service

rate_over_sum

Calculates the average per second (sum / time window in seconds).

Syntax:

rate_over_sum(field)

Description: Equivalent to sum(field) / time window (seconds), used to calculate the average per second. Often used in the Rollup stage to convert cumulative values to per-second rates.

Difference from rate:

  • rate: Calculates growth rate for Counter (handles resets)
  • rate_over_sum: Simply divides sum by time window seconds

Examples:

// Calculate average requests per second
M::http_requests:(rate_over_sum(request_count)) [1h] BY service

rate_over_count

Calculates the count per second (count / time window in seconds).

Syntax:

rate_over_count(field)

Description: Equivalent to count(field) / time window (seconds), used to calculate occurrences per second.

Examples:

// Calculate error logs per second
L::error_logs:(rate_over_count(*)) [1h] BY error_type

Difference Calculation

This section describes function semantics. The same function can be used both as a Rollup (e.g., [rate], [increase]) and written in query expressions (e.g., rate(field), increase(field)); they execute at different stages. Choose the position based on business requirements.

rate / deriv

Calculates the rate of change (derivative). rate is used for Counter-type metrics (ignores negative values), deriv is used for Gauge-type metrics (preserves negative values).

Aliases: rate is aliased as non_negative_derivative; deriv is aliased as derivative (PromQL style)

Syntax:

// Counter metrics: Calculate non-negative rate of change (ignores negative values due to resets)
rate(field)

// Gauge metrics: Calculate full rate of change (including negative values)
deriv(field)

Function Selection:

Function Description Use Case
rate Calculates only non-negative rate of change Counter-type metrics (monotonically increasing)
deriv Calculates full rate of change (including negative values) Gauge-type metrics (can increase or decrease)

Examples:

// Counter metrics: Calculate request QPS
M::requests:(rate(count)) [1h::5m] BY service

// Gauge metrics: Calculate memory usage change rate
M::memory:(deriv(used)) [1h::5m] BY host

increase / difference

Calculates the difference between adjacent values. increase is used for Counter-type metrics (ignores negative values), difference is used for Gauge-type metrics (preserves negative values).

Note: increase and difference are two independent functions with different behaviors; they are not aliases.

Syntax:

// Counter metrics: Calculate non-negative difference (ignores negative values due to resets)
increase(field)

// Gauge metrics: Calculate full difference (including negative values)
difference(field)

Function Selection:

Function Description Use Case
increase Calculates only non-negative difference Counter-type metrics (monotonically increasing)
difference Calculates full difference (including negative values) Gauge-type metrics (can increase or decrease)

Examples:

// Counter metrics: Calculate request increase amount
M::requests:(increase(count)) [1h::5m] BY service

// Gauge metrics: Calculate request count change (may increase or decrease)
M::requests:(difference(count)) [1h::5m] BY service

Moving Calculation

moving_average

Calculates the moving average.

Syntax:

moving_average(field, n)

Parameters:

  • field: A numeric field
  • n: Window size (number of data points)

Examples:

// Calculate 5-point moving average
M::cpu:(moving_average(usage, 5)) [1h::1m] BY host

cumsum

Calculates the cumulative sum.

Syntax:

cumsum(DQL_expression)

Examples:

// Calculate cumulative request count
cumsum(M::requests:(sum(count)) [1h::5m] BY service)

Transformation Functions

Transformation functions are used to perform mathematical operations, type conversions, or string processing on field values.

Mathematical Functions

abs

Calculates the absolute value.

Syntax:

abs(field)

Examples:

// Calculate absolute value of temperature deviation
M::temperature:(abs(deviation)) [1h] BY sensor

round / ceil / floor

Rounding functions.

Syntax:

round(field)  // Round to nearest integer
ceil(field)   // Round up
floor(field)  // Round down

Examples:

// Round up response time
M::response_time:(ceil(duration)) [1h]

// Round percentage
M::cpu:(round(usage)) [1h] BY host

log / log2 / log10

Logarithmic functions.

Syntax:

log(field)    // Natural logarithm
log2(field)   // Base-2 logarithm
log10(field)  // Base-10 logarithm

Examples:

// Calculate log-transformed value
M::metrics:(log(value)) [1h] BY metric_name

Type Conversion

int / uint / float / string / bool

Type conversion functions.

Syntax:

int(field)    // Convert to signed integer
uint(field)   // Convert to unsigned integer
float(field)  // Convert to floating-point
string(field) // Convert to string
bool(field)   // Convert to boolean

Examples:

// Convert string to number
L::logs:(int(response_time)) [1h] BY service

// Convert number to string for concatenation
M::metrics:(string(value)) [1h] BY metric_name

String Functions

lower

Converts a string to lowercase.

Syntax:

lower(field)

Parameters:

  • field: A string field

Return Value: Returns the converted lowercase string.


upper

Converts a string to uppercase.

Syntax:

upper(field)

Parameters:

  • field: A string field

Return Value: Returns the converted uppercase string.


trim

Removes whitespace characters from both ends of a string.

Syntax:

trim(field)

Parameters:

  • field: A string field

Return Value: Returns the string with leading and trailing whitespace removed.


ltrim

Removes whitespace characters from the left side of a string.

Syntax:

ltrim(field)

Parameters:

  • field: A string field

Return Value: Returns the string with leading whitespace removed.


rtrim

Removes whitespace characters from the right side of a string.

Syntax:

rtrim(field)

Parameters:

  • field: A string field

Return Value: Returns the string with trailing whitespace removed.


length

Returns the length of a string (counted by characters).

Syntax:

length(field)

Parameters:

  • field: A string field

Return Value: Returns the string length.


substr

Extracts a substring.

Syntax:

substr(field, start)
substr(field, start, length)

Parameters:

  • field: A string field
  • start: Starting position (0-based, negative numbers count from the end)
  • length (optional): Length of the substring

Return Value: Returns the extracted substring.

Examples:

// Extract first 100 characters of a message
L::logs:(substr(message, 0, 100)) [1h]

// Extract last 10 characters
L::logs:(substr(message, -10)) [1h]

Result Example:

message substr(message, 0, 10) substr(message, -5)
"Error: connection timeout" "Error: con" "eout"

regexp_extract

Regular expression extraction.

Syntax:

regexp_extract(field, pattern)
regexp_extract(field, pattern, n)

Parameters:

  • field: A string field
  • pattern: Regular expression
  • n (optional): Extract the nth capture group, defaults to 0 (entire match)

Return Value: Returns a single string, extracting the content of the nth capture group. Returns null if no match.

Examples:

// Extract error code
L::logs:(regexp_extract(message, 'error_code: (\d+)', 1)) [1h]

// Extract IP address
L::nginx:(regexp_extract(message, '(\d+\.\d+\.\d+\.\d+)', 1)) [1h]

Result Example:

service regexp_extract(message, 'error_code: (\d+)', 1)
api "404"
api "500"
web null

regexp_extract_all

Extracts all matching results.

Syntax:

regexp_extract_all(field, pattern)
regexp_extract_all(field, pattern, n)

Return Value: Returns a string array containing all matching substrings.

Examples:

// Extract all numbers
L::logs:(regexp_extract_all(message, '\d+', 0)) [1h]

// Extract all IP addresses
L::logs:(regexp_extract_all(message, '\d+\.\d+\.\d+\.\d+', 0)) [1h]

Result Example:

message regexp_extract_all(message, '\d+.\d+.\d+.\d+', 0)
Request from 192.168.1.1 to 10.0.0.1 ["192.168.1.1", "10.0.0.1"]

md5

Calculates the MD5 hash value.

Syntax:

md5(field)

Return Value: Returns a 32-character hexadecimal string (lowercase).

Examples:

// Calculate MD5 of a message
L::logs:(md5(message)) [1h]

Result Example:

service md5(message)
api 5d41402abc4b2a76b9719d911017c592
web 098f6bcd4621d373cade4e832627b4f6

concat

String concatenation.

Syntax:

concat(field, ...)

Return Value: Returns a single concatenated string.

Examples:

// Concatenate multiple fields
L::logs:(concat(service, ":", level)) [1h]

// Results: "api:error", "web:info", etc.

Result Example:

service level concat(service, ":", level)
api error "api:error"
web info "web:info"

set

Deduplicates and sorts an array field.

Syntax:

set(DQL_expression)

Return Value: Returns a deduplicated and sorted array.

Examples:

// Get all distinct tags
set(M::metrics:(tags) [1h] BY metric_name)

// Deduplicate collect results
set(M::http:(collect(status)) [1h] BY endpoint)

Result Example:

metric_name set(tags)
cpu_usage ["env:prod", "host:A", "team:backend"]
memory_used ["env:staging", "host:B", "team:frontend"]

Log Clustering

drain

Uses the Drain algorithm to cluster logs and generate log templates.

Syntax:

drain(field, similarity_threshold, max_clusters)

Parameters:

  • field: A string field (usually message)
  • similarity_threshold: Similarity threshold, range (0, 1]; larger values mean stricter clustering
  • max_clusters: Maximum number of clusters, range [1, 10000]

Algorithm Description:

Drain is a log clustering algorithm based on a parse tree, capable of automatically identifying constant and variable parts in logs to generate log templates.

Use Cases:

  • Log pattern recognition
  • Anomaly log clustering analysis
  • Log noise reduction

Examples:

// Cluster logs with similarity 0.7, up to 1000 clusters
L::logs:(drain(message, 0.7, 1000)) [1h]

// Stricter clustering
L::logs:(drain(message, 0.9, 500)) [1h]

Result Example:

service drain(message, 0.7, 1000)
api "Request from to took
db "Query executed in

Note: Returns a string-type log template, where <...> represents variable parts.


Matching Functions

Matching functions are used for text matching in WHERE clauses and can also be used as expressions returning boolean values.

Substring Matching

match

Checks if a field contains a specified substring.

Syntax:

match(pattern)
match(field, pattern)

Parameters:

  • pattern: Substring to match
  • field: Field name (optional, can be omitted in WHERE clause)

Examples:

// Use in WHERE
L::logs:(message) {match(message, "error")} [1h]

// Short form
L::logs:(message) {match("error")} [1h]

// As an expression
L::logs:(match(message, "timeout")) [1h] BY match_result

Phrase Matching

Tokenized phrase matching, supports mixed Chinese and English.

Syntax:

search(query)
search(field, query)

Parameters:

  • query: Query phrase
  • field: Field name (optional)

Matching Rules:

  • Chinese: Tokenized by character for matching
  • English: Tokenized by word boundaries (space, punctuation)
  • Case-insensitive

Examples:

// Match logs containing "connection timeout"
L::logs:(message) {search("connection timeout")} [1h]

// Chinese matching
L::logs:(message) {search("连接超时")} [1h]

// Mixed Chinese and English
L::logs:(message) {search("error 错误")} [1h]

Regular Expression Matching

re / regex / regexp

Regular expression matching.

Syntax:

re(pattern)
re(field, pattern)
regex(pattern)
regex(field, pattern)
regexp(pattern)
regexp(field, pattern)

Parameters:

  • pattern: Regular expression (supports PromRegex syntax)
  • field: Field name (optional)

Examples:

// Match logs starting with error
L::logs:(message) {re("error.*")} [1h]

// Match error codes of specific format
L::logs:(message) {regexp(message, "ERR-\d{4}")} [1h]

// Use regex in data source
M::re('cpu.*'):(usage) [1h]

Wildcard Matching

wildcard

Wildcard pattern matching.

Syntax:

wildcard(pattern)
wildcard(field, pattern)

Parameters:

  • pattern: Wildcard pattern
  • *: Matches any number of characters
  • ?: Matches a single character
  • field: Field name (optional)

Examples:

// Match messages starting with error
L::logs:(message) {wildcard("error*")} [1h]

// Match specific format
L::logs:(message) {wildcard(message, "ERR-????")} [1h]

CIDR Matching

cidr

IP address network segment matching.

Syntax:

cidr(cidr)
cidr(field, cidr)

Parameters:

  • cidr: Network segment in CIDR notation, e.g., 192.168.1.0/24
  • field: IP address field (optional)

Examples:

// Match internal IPs
L::nginx:(*) {cidr(client_ip, "10.0.0.0/8")} [1h]

// Match specific network segment
L::nginx:(*) {cidr(client_ip, "192.168.1.0/24")} [1h]

Field Existence Check

exists

Checks if a field exists. exists() is a special placeholder, typically used on the right side of a comparison expression.

Syntax:

exists()

Recommended Usage:

field = exists()   // Field exists (non-null)
field != exists()  // Field does not exist or is null

Examples:

// Find logs with error_type field
L::logs:(message) {error_type = exists()} [1h]

// Find logs without error_type field
L::logs:(message) {error_type != exists()} [1h]

Query String Syntax

query_string

Uses query string syntax for complex matching.

Syntax:

query_string(query)
query_string(field, query)

Parameters:

  • query: Query string
  • field: Field name (optional, defaults to full-text matching)

Query String Syntax:

1. Term Matching
foo              # Match content containing foo
"foo bar"        # Exact phrase match, must appear consecutively
foo\ bar         # Escape space, match "foo bar" as a whole
2. Wildcards
foo*             # Match content starting with foo
foo?bar          # ? matches a single character
"foo*bar"        # Wildcards inside quotes are not parsed, matched literally
3. Regular Expressions
/foo.*bar/       # Regular expression wrapped in slashes
/joh?n(ath[oa]n)/  # Complex regex
4. Boolean Operators
foo AND bar      # Logical AND, both must be contained
foo OR bar       # Logical OR, at least one must be contained
NOT foo          # Logical NOT, does not contain foo

# Short forms
foo && bar       # Equivalent to foo AND bar
foo || bar       # Equivalent to foo OR bar
!foo             # Equivalent to NOT foo
5. Grouping
(foo OR bar) AND baz       # Use parentheses to change precedence
!(status 429 reading)      # Negate the entire expression
6. Default Operator

When multiple terms are separated by spaces, they are connected by OR by default (can be configured to AND):

foo bar          # Equivalent to foo OR bar

Examples:

// Simple term matching
L::logs:(message) {query_string("error timeout")} [1h]

// Boolean combination
L::logs:(message) {query_string("error AND NOT timeout")} [1h]

// Regular expression
L::logs:(message) {query_string("/ERR-\d{4}/")} [1h]

// Complex query
L::logs:(message) {query_string("(error OR warn) AND service")} [1h]

// Specify field
L::logs:(*) {query_string(message, "error AND timeout")} [1h]

// Chinese query
L::logs:(message) {query_string("错误 AND 超时")} [1h]

Outer Functions

Usage Recommendation: Outer functions are legacy design. For scenarios that can be solved using Rollup + aggregation functions (e.g., [rate], [last], [increase], etc.), prefer the Rollup approach. Use outer functions only in scenarios where Rollup is not applicable (e.g., when secondary calculation on aggregated results is needed).

Outer functions operate on the entire DQL query result, used for secondary calculation on the output time series data. Outer functions wrap the entire DQL expression, rather than being written inside the Select clause.

In-Query Functions vs. Outer Functions

  • In-query functions: Used inside DQL expressions, e.g., sum, avg, max
  • Outer functions: Wrap the entire DQL query result, performing post-processing on the output time series

Writing Comparison:

// Rollup (time clause): First calculate growth rate for each time series, then aggregate
M::http_requests:(sum(request_count)) [rate] BY service

// Outer function: First get query result, then perform secondary calculation
rate(M::http_requests:(sum(request_count)) [1h::1m] BY service)

Syntax:

outer_function(DQL_expression)

Examples:

// In-query function: Average raw data
M::cpu:(avg(usage)) [1h::5m] BY host

// Outer function: Calculate moving average on query result
moving_average(M::cpu:(avg(usage)) [1h::5m] BY host, 5)

Cumulative Calculation

cumsum

Calculates the cumulative sum, computing the sum of all previous points for each point in the time series.

Syntax:

cumsum(DQL_expression)

Examples:

// Calculate cumulative request count
cumsum(M::requests:(sum(count)) [1h::5m] BY service)

Difference and Derivative

Recommended Usage: These functions are preferably used as Rollup functions (e.g., [rate], [deriv]). The outer function form is only used for secondary calculation on aggregated results.

The following functions can be used as outer functions:

Function Description
derivative(DQL) Calculates derivative (rate of change)
difference(DQL) Calculates difference from previous value
non_negative_derivative(DQL) Calculates non-negative derivative
non_negative_difference(DQL) Calculates non-negative difference
rate(DQL) Calculates growth rate (per second)
irate(DQL) Calculates instantaneous growth rate

Examples:

// Outer function: Calculate derivative on query result
derivative(M::cpu:(avg(usage)) [1h::5m] BY host)

// Recommended: Use Rollup approach
M::cpu:(deriv(usage)) [1h::5m:last] BY host

Moving Calculation

moving_average

Calculates moving average on query result.

Recommended Usage: Prefer the Rollup approach moving_average(field, n). The outer function form is only used for secondary smoothing on aggregated results.

Syntax:

moving_average(DQL_expression, n)

Parameters:

  • DQL_expression: DQL query expression
  • n: Window size (number of data points)

Examples:

// Outer function: Calculate moving average on query result
moving_average(M::cpu:(avg(usage)) [1h::1m] BY host, 5)

// Recommended: Use Rollup approach
M::cpu:(moving_average(usage, 5)) [1h::1m] BY host

TopN

top / bottom

Recommended Usage: Prefer the Rollup approach top(field, n) or bottom(field, n). The outer function form is only used for secondary filtering on aggregated results.

Gets TopN or BottomN from query result.

Syntax:

top(DQL_expression, n)
bottom(DQL_expression, n)

Examples:

// Outer function: Get TopN from query result
top(M::response_time:(max(duration)) [1h::5m] BY service, 5)

// Recommended: Use Rollup approach
M::response_time:(top(duration, 5)) [1h::5m] BY service

Null Value Filling

fill

Fills null values in query result.

For detailed explanation, please refer to fill function.

Examples:

// Fill nulls with 0
fill(M::cpu:(avg(usage)) [1h::5m] BY host, 0)

// Fill with linear interpolation
fill(M::cpu:(avg(usage)) [1h::5m] BY host, LINEAR)

Other Outer Functions

The following functions can also be used as outer functions:

Function Description
abs(DQL) Absolute value
round(DQL) Round
ceil(DQL) Round up
floor(DQL) Round down
log(DQL) / log2(DQL) / log10(DQL) Logarithmic transformation
set(DQL) Deduplicate and sort
concat(DQL, ...) String concatenation

Combined Usage

Outer functions can be combined:

// Calculate moving average then round
round(moving_average(M::cpu:(avg(usage)) [1h::1m] BY host, 5))

// Calculate moving average of growth rate
moving_average(rate(M::requests:(sum(count)) [1h::5m] BY service), 3)

eval Expression Calculation

eval is a special function that allows expression calculation outside the query, referencing results from multiple subqueries for combined operations.

Syntax

eval(expression, name1=(query1), name2=(query2), ..., alias="result_name")

Parameters:

  • expression: Mathematical expression, using name.field to reference subquery results
  • name=(query): Named subquery
  • alias: Result alias (optional)

Note: eval also supports the historical syntax name="query", but name=(query) is recommended for better type expression and readability.

How It Works

  1. Execute all named subqueries
  2. Align results of each subquery by time
  3. Calculate the expression for each time point
  4. Return the calculation result

Use Cases

  • Calculating ratios of multiple metrics (e.g., error rate, utilization rate)
  • Comparing metrics from different time periods
  • Combining calculation results from multiple data sources

Examples

Calculate Error Rate

// Calculate error rate = error count / total requests * 100
eval(a / b * 100,
     a=(M::http:(sum(error_count)) [1h] BY service),
     b=(M::http:(sum(request_count)) [1h] BY service),
     alias="error_rate")

Calculate CPU Usage

// Usage = used / total * 100
eval(used / total * 100,
     used=(M::memory:(sum(used_bytes)) [1h] BY host),
     total=(M::memory:(sum(total_bytes)) [1h] BY host),
     alias="memory_usage_percent")

Calculate Baseline Comparison Growth Rate

// Calculate growth rate of current value relative to baseline
eval(current / baseline - 1,
     current=(M::sales_current:(sum(amount)) [7d]),
     baseline=(M::sales_baseline:(sum(amount)) [7d]),
     alias="growth_rate")

Reference Subquery Fields

// Reference specific fields of subqueries
eval(a.usage / b.total * 100,
     a=(M::cpu:(avg(usage) as usage) [1h] BY host),
     b=(M::cpu:(avg(total) as total) [1h] BY host),
     alias="cpu_percent")

Notes

  • Time windows of all subqueries must be compatible
  • Grouping dimensions of subqueries should be consistent
  • Field names in expressions use name.field format
  • If there is only one subquery, field names can be used directly

Other Functions

fill

Fills specified values for null values in query result.

Recommended Usage: fill is recommended to be used as an outer function, acting on the entire query result:

fill(M::cpu:(avg(usage)) [1h::5m] BY host, 0)

Although the syntax fill(avg(usage), 0) is also supported in the Select clause, fill actually fills the result after aggregation is complete, so the outer function form better matches its working mechanism.

Syntax (Outer Function):

fill(DQL_expression, value)
fill(DQL_expression, LINEAR)
fill(DQL_expression, PREVIOUS)

Parameters:

  • DQL_expression: DQL query expression
  • value: Fill value, supports multiple modes:
  • Specific value: number, string, null
  • LINEAR: Linear interpolation
  • PREVIOUS: Fill with previous non-null value

Examples:

// Recommended: Use as outer function
fill(M::cpu:(avg(usage)) [1h::5m] BY host, 0)

// Fill with linear interpolation
fill(M::cpu:(avg(usage)) [1h::5m] BY host, LINEAR)

// Fill with previous value
fill(M::cpu:(avg(usage)) [1h::5m] BY host, PREVIOUS)

now

Returns the current timestamp (milliseconds).

Syntax:

now()

Examples:

// Query recently updated data
O::HOST:(*) {last_update_time > now() - 600000}  // Updated within 10 minutes

unwrap

Unwraps the packaging of aggregated results.

Syntax:

unwrap(field)

Examples:

// Unwrap aggregated field
M::cpu:(unwrap(usage)) [1h] BY host

Show Functions

Show functions are used to view metadata (e.g., measurement, tag, field, cardinality and series count), often used for modeling troubleshooting and pre-query exploration.

General Syntax

show_xxx(arg1=..., arg2=...){ where_conditions } [time_window] LIMIT n OFFSET m
  • where, time_window, LIMIT, OFFSET are all optional.
  • LIMIT/OFFSET cannot be negative.

M Namespace Built-in Show Functions

Function Parameters Return Columns Description
show_measurement Optional re('pattern') name List measurements
show_tag_key Optional from=['measurement'] tagKey List tag keys
show_field_key Optional from=['measurement'] fieldKey, fieldType List field keys (current fieldType is float)
show_tag_value keyin=['tagKey'] (required), optional from key, value List tag values
show_measurement_cardinality No mandatory parameters count Number of measurements
show_series_cardinality No mandatory parameters count Series cardinality (estimated)
show_tag_key_cardinality No mandatory parameters count Tag key cardinality (estimated)
show_tag_value_cardinality keyin=['tagKey'] (required) count Value cardinality for specified tag key (estimated)
show_field_key_cardinality No mandatory parameters count Field key cardinality (estimated)
show_series_count_by_field_key from=['measurement'] (recommended) name, count Count series by field key
show_series_count_by_tag_key from=['measurement'] (recommended) name, count, value_count Count series and values by tag key
show_series_count_by_tag_value keyin=['tagKey'] (required), from=['measurement'] (recommended) name, count Count series by value of specified tag key

Cardinality-related functions use HyperLogLog merging at the bottom layer, returning estimated values.

Non-M Namespace Show Functions (Suffix Pattern)

For non-M namespaces, the following suffix patterns are supported:

  • show_<namespace>_source
  • show_<namespace>_class
  • show_<namespace>_type
  • show_<namespace>_field
  • show_<namespace>_label

Where <namespace> is automatically mapped from the middle segment of the function name, for example:

  • show_logging_source -> L
  • show_tracing_field -> T
  • show_object_source -> O

Common examples:

show_logging_source()
show_tracing_field('mysql')
show_logging_field('*')
show_logging_label(name='env')
show_logging_label(names=['env', 'team'])

Parameter and Behavior Description

  • from: List of measurements, supports string or string array.
  • keyin: List of tag keys, supports string or string array.
  • field: List of fields, supports string or string array (for metric show field filtering).
  • For show_*_field:
  • Unnamed parameters (e.g., 'mysql') are usually used as source filtering.
  • '*' is equivalent to not specifying a source.
  • Named parameters are converted to where filtering conditions.
  • For show_*_label:
  • Requires named parameters; names is treated as an alias for name.

Constraints and Notes

  • show_tag_value and show_tag_value_cardinality must provide keyin.
  • show_series_count_by_tag_value must provide keyin.
  • show_series_count_by_* must provide from, or provide equivalent source constraints in where (e.g., @__source__ condition).
  • show_<namespace>_source, show_<namespace>_class, show_<namespace>_type currently share the same execution path, returning deduplicated source list.
  • Currently parser does not support show_<namespace>_index syntax (even though the execution layer has corresponding branches).
  • When Query API does not provide a show time range, some log show queries will fall back to the last 30-minute window for execution.

Return Examples

The following examples only show typical column structure and sample rows; actual results are affected by tenant data, filtering conditions, time range, and LIMIT/OFFSET.

Query Typical Columns Sample Rows (Illustrative)
show_measurement() name cpu, disk, memory
show_tag_value(from=['cpu'], keyin=['host']) key, value host, web-01; host, web-02
show_series_count_by_tag_key(from=['cpu']) name, count, value_count host, 3200, 120; service, 2800, 35
show_tag_value_cardinality(keyin=['host']) count 120
show_logging_field('*') fieldKey, fieldType, fieldIndices service, keyword, ["idx_service"]
show_logging_source() source nginx, mysql, redis

Function Classification Quick Reference

Basic Aggregation

Function Description Exact/Estimated
sum Sum Exact
avg Average Exact
count Count Exact
count_distinct Distinct count Estimated (HyperLogLog, error≈0.4%)
min / max Minimum/Maximum Exact
first / last First/Last value Exact
any Any value Exact

Statistical Aggregation

Function Description Exact/Estimated
percentile / pXX Percentile Estimated (log histogram)
median Median Estimated
stddev Standard deviation Exact
mode Mode Exact
spread Range Exact
count_series Time series count Exact

Filter Aggregation

Function Description Exact/Estimated
top / bottom TopN / BottomN Exact
count_filter Conditional count Exact

Histogram Functions

Function Description Data Source Exact/Estimated
histogram_auto Automatic histogram (Recommended) Log, Trace detail data Estimated
histogram Fixed-boundary histogram (deprecated) Log, Trace detail data Exact
histogram_quantile Calculate quantile from Prometheus histogram Prometheus metrics Estimated

Set Functions

Function Description Exact/Estimated
distinct Distinct value list Exact
distinct_by_collapse Collapse deduplication (retain other fields) Exact
collect Collect all values Exact
collect_distinct Collect distinct values Exact

Helper Functions

Function Description Exact/Estimated
default Set default value Exact

Time Series Functions

Function Description
rate Growth rate (per second)
irate Instantaneous growth rate
increase Increase amount
derivative Derivative
difference Difference
non_negative_derivative Non-negative derivative
non_negative_difference Non-negative difference
moving_average Moving average
cumsum Cumulative sum

Rollup Functions

Function Description
rate Calculate growth rate (per second)
irate Calculate instantaneous growth rate
increase Calculate increase amount
rate_over_sum Calculate average per second
rate_over_count Calculate count per second
deriv Calculate derivative
difference Calculate difference
sum Sum
avg Average
min Minimum
max Maximum
count Count
first First value
last Last value
stddev Standard deviation
mode Mode
spread Range
any Any value

Transformation Functions

Function Description
abs Absolute value
round / ceil / floor Rounding
log / log2 / log10 Logarithm
int / uint / float / string / bool Type conversion
substr Substring
regexp_extract Regex extraction
regexp_extract_all Regex extract all
md5 MD5 hash
concat String concatenation
set Array deduplication and sorting
drain Log clustering

Matching Functions

Function Description
match Substring matching
phrase / search Tokenized phrase matching
re / regex / regexp Regex matching
wildcard Wildcard matching
cidr CIDR network segment matching
query_string Query string syntax
exists Field existence check

Outer Functions (Prefer Rollup)

Function Description
cumsum Cumulative sum
rate / irate Growth rate (non-negative)
deriv Derivative (allows negative)
increase Increase amount (non-negative)
difference Difference (allows negative)
moving_average Moving average
top / bottom TopN
fill Null value filling (recommended as outer)
abs / round / ceil / floor Mathematical operations
set Deduplication and sorting
concat String concatenation

Expression Calculation

Function Description
eval Multi-query expression calculation