
DataKit metrics

· Version-1.10.0


This input collects DataKit runtime metrics, including runtime environment, CPU, memory usage, and core module metrics. The collected data can be used by the DataKit dashboard, Bug Report troubleshooting, and runtime metric archiving.

Configuration

After DataKit starts, it exposes Prometheus metrics. The dk input starts by default and replaces the earlier self input.

Basic features:

  • Collects DataKit CPU, memory, Goroutine, HTTP API, data upload, Pipeline, filter, disk cache, and other runtime metrics.
  • Uses interval to adjust the collection interval and metric_types to limit collected metric types.
  • Supports custom tags through [inputs.dk.tags].
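To illustrate what limiting collection by metric type means, here is a minimal Python sketch that filters Prometheus exposition text by the type declared on each `# TYPE` line. It is not DataKit's implementation: the `filter_by_type` helper and the `demo_*` metric names are hypothetical, and the type strings follow the Prometheus text format, which may differ from the exact values metric_types accepts.

```python
def filter_by_type(exposition, metric_types):
    """Keep sample lines whose metric family type is listed in metric_types.

    An empty metric_types keeps everything, mirroring `metric_types = []`.
    """
    allowed = {t.lower() for t in metric_types}
    family_type = {}  # metric family name -> declared type
    kept = []
    for raw in exposition.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# TYPE "):
            parts = line.split()
            if len(parts) == 4:
                family_type[parts[2]] = parts[3].lower()
            continue
        if line.startswith("#"):
            continue  # HELP lines and other comments
        # The sample name ends at the first "{" (labels) or space (value).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        family = name
        if family not in family_type:
            # Summary and histogram samples carry a suffix on the family name.
            for suffix in ("_sum", "_count", "_bucket"):
                base = name[: -len(suffix)]
                if name.endswith(suffix) and base in family_type:
                    family = base
                    break
        mtype = family_type.get(family, "untyped")
        if not allowed or mtype in allowed:
            kept.append(line)
    return kept


# Hypothetical exposition text; these are not real DataKit metric names.
SAMPLE = """\
# HELP demo_goroutines Number of goroutines.
# TYPE demo_goroutines gauge
demo_goroutines 42
# TYPE demo_uploads_total counter
demo_uploads_total{category="metric"} 128
# TYPE demo_api_latency_seconds summary
demo_api_latency_seconds_sum 3.5
demo_api_latency_seconds_count 7
"""

if __name__ == "__main__":
    for sample in filter_by_type(SAMPLE, ["gauge"]):
        print(sample)
```

Passing an empty list returns every sample line, which matches the "keep empty to collect all types" behavior described in the sample configuration.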

Starting from Version-2.2.0, dk by default collects all DataKit self-metrics except those blocked internally, and metric_name_filter is no longer provided. To keep only a subset of the metrics, filter by metric name in a Pipeline.

To disable DataKit self-metric collection, set the following in dk.conf:

[[inputs.dk]]
  enabled = false

Collector Configuration

To adjust the collection interval, metric types, self profiling, or other settings, go to the conf.d/samples directory under the DataKit installation directory, copy dk.conf.sample, and rename the copy to dk.conf. An example follows:

[[inputs.dk]]

  # set false to disable Datakit self-metrics collection
  enabled = true

  # keep empty to collect all types(count/gauge/summary/...)
  metric_types = []

  # collect frequency
  interval = "30s"

  # Upload Datakit runtime profiles when resource thresholds are matched. Disabled by default.
  [inputs.dk.self_profiling]
    enabled  = false # enable threshold-triggered self profiling
    interval = "10s" # interval for checking process CPU and memory
    cooldown = "5m" # minimum interval between two profile collections

    # Profiles to collect on each trigger. CPU is sampled for duration/emergency_duration;
    # other profile types are collected as snapshots.
    enabled_types      = ["cpu", "heap", "goroutine"] # cpu, heap, goroutine
    duration           = "30s"                        # CPU sample duration for normal threshold triggers
    emergency_duration = "10s"                        # CPU sample duration for emergency threshold triggers

    # Resource bases for percent thresholds. Same unit style as [resource_limit].
    cpu_cores  = 2.0  # CPU cores used as the 100% base
    mem_max_mb = 4096 # memory MiB used as the 100% base

    # Normal thresholds use the average of the latest recent_points samples.
    recent_points     = 5     # number of samples for average thresholds
    cpu_usage_percent = 80    # average CPU percent threshold, based on cpu_cores; set 0 to disable this threshold
    mem_usage_percent = 80    # average memory percent threshold, based on mem_max_mb; set 0 to disable this threshold
    mem_usage_mb      = 3072  # average RSS memory threshold in MiB; set 0 to disable this threshold

    # Emergency thresholds use the current sample.
    mem_usage_percent_emergency = 95    # current memory percent threshold, based on mem_max_mb; set 0 to disable this threshold
    mem_usage_mb_emergency      = 0     # current RSS memory threshold in MiB; set 0 to disable this threshold

    # Local queue and upload settings. Profiles are queued locally before uploading to Dataway.
    cache_path        = "dk_self_profile" # disk queue path; relative path is under DataKit cache dir
    cache_capacity_mb = 1024              # disk queue capacity in MiB
    send_timeout      = "60s"             # timeout for each upload attempt
    send_retry_count  = 4                 # max upload attempts for each queued profile payload

[inputs.dk.tags]
   # tag1 = "val-1"
   # tag2 = "val-2"

After configuration, restart DataKit.

In Kubernetes, the collector can be enabled by injecting the collector configuration through a ConfigMap or by setting ENV_DATAKIT_INPUTS.

Configuration can also be adjusted through environment variables:

  • ENV_INPUT_DK_INTERVAL

    Collect interval

    Type: Duration

    input.conf: interval

    Example: 10s

    Default: 30s

  • ENV_INPUT_DK_ENABLE_SELF_PROFILING

    Enable threshold-triggered DataKit self profiling

    Type: Boolean

    input.conf: self_profiling.enabled

    Example: true

    Default: false
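The mapping from these two environment variables to the configuration fields can be sketched in Python as follows. This is an illustration only: `parse_duration` and `effective_settings` are hypothetical helpers, the duration parser covers just the `ms`, `s`, `m`, and `h` units, and treating only the string "true" (case-insensitive) as enabled is an assumption, not DataKit's documented parsing rule.

```python
import re

_UNITS = {"ms": 0.001, "s": 1.0, "m": 60.0, "h": 3600.0}
_PART = re.compile(r"(\d+(?:\.\d+)?)(ms|s|m|h)")


def parse_duration(text):
    """Parse a duration such as "10s" or "1m30s" into seconds."""
    pos, total = 0, 0.0
    for match in _PART.finditer(text):
        if match.start() != pos:
            break  # unparsed characters between two parts
        total += float(match.group(1)) * _UNITS[match.group(2)]
        pos = match.end()
    if pos == 0 or pos != len(text):
        raise ValueError(f"invalid duration: {text!r}")
    return total


def effective_settings(env):
    """Start from the documented defaults, then apply environment overrides."""
    settings = {"interval": 30.0, "self_profiling.enabled": False}
    if "ENV_INPUT_DK_INTERVAL" in env:
        settings["interval"] = parse_duration(env["ENV_INPUT_DK_INTERVAL"])
    if "ENV_INPUT_DK_ENABLE_SELF_PROFILING" in env:
        raw = env["ENV_INPUT_DK_ENABLE_SELF_PROFILING"]
        # Assumption: only "true" (any case) enables the feature.
        settings["self_profiling.enabled"] = raw.strip().lower() == "true"
    return settings


if __name__ == "__main__":
    import os
    print(effective_settings(os.environ))
```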

Metric

DataKit exports its self-metrics in Prometheus format; see here for the full metric list.

Self Profiling

Starting from Version-2.2.0, dk supports threshold-triggered DataKit self profiling. This feature is disabled by default. It can collect CPU, heap, goroutine, and other profiles when DataKit CPU or memory reaches the configured thresholds.

For host installation, enable it in dk.conf:

[inputs.dk.self_profiling]
  enabled = true

For Kubernetes installation, enable it through an environment variable:

ENV_INPUT_DK_ENABLE_SELF_PROFILING=true

See [inputs.dk.self_profiling] in the sample configuration above for threshold, profile type, cache, and upload settings.
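The interaction between average thresholds, emergency thresholds, and cooldown can be sketched in Python. This is one reading of the comments in the sample configuration, not DataKit's actual code: the `Thresholds` and `TriggerSketch` names are hypothetical, and several details are assumptions (CPU usage is measured in cores, average thresholds wait for a full window of recent_points samples, and a threshold fires when the value is greater than or equal to it).

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Thresholds:
    # Defaults copied from the sample configuration.
    cpu_cores: float = 2.0
    mem_max_mb: float = 4096
    recent_points: int = 5
    cpu_usage_percent: float = 80
    mem_usage_percent: float = 80
    mem_usage_mb: float = 3072
    mem_usage_percent_emergency: float = 95
    mem_usage_mb_emergency: float = 0
    cooldown_s: float = 300  # cooldown = "5m"


def exceeds(value, threshold):
    """A threshold of 0 is disabled; otherwise compare with >= (assumption)."""
    return threshold > 0 and value >= threshold


class TriggerSketch:
    def __init__(self, cfg):
        self.cfg = cfg
        self.cpu = deque(maxlen=cfg.recent_points)  # CPU cores in use
        self.mem = deque(maxlen=cfg.recent_points)  # RSS in MiB
        self.last_fire = None

    def check(self, now_s, cpu_cores_used, rss_mb):
        """Record one sample; return "emergency", "normal", or None."""
        c = self.cfg
        self.cpu.append(cpu_cores_used)
        self.mem.append(rss_mb)
        if self.last_fire is not None and now_s - self.last_fire < c.cooldown_s:
            return None  # still cooling down after the previous collection

        level = None
        mem_pct_now = 100 * rss_mb / c.mem_max_mb
        if exceeds(mem_pct_now, c.mem_usage_percent_emergency) or exceeds(
            rss_mb, c.mem_usage_mb_emergency
        ):
            level = "emergency"  # judged on the current sample only
        elif len(self.cpu) == c.recent_points:
            avg_cpu_pct = 100 * (sum(self.cpu) / len(self.cpu)) / c.cpu_cores
            avg_mem_mb = sum(self.mem) / len(self.mem)
            avg_mem_pct = 100 * avg_mem_mb / c.mem_max_mb
            if (
                exceeds(avg_cpu_pct, c.cpu_usage_percent)
                or exceeds(avg_mem_pct, c.mem_usage_percent)
                or exceeds(avg_mem_mb, c.mem_usage_mb)
            ):
                level = "normal"  # judged on the recent average
        if level is not None:
            self.last_fire = now_s
        return level


if __name__ == "__main__":
    trigger = TriggerSketch(Thresholds())
    for i in range(6):
        # 1.8 of 2.0 cores is 90% CPU, above the 80% average threshold.
        print(i * 10, trigger.check(i * 10, 1.8, 1000))
```

With the default values, five consecutive samples at 90% CPU produce a normal trigger on the fifth sample, and the next sample is suppressed by the cooldown. In this reading, a normal trigger would sample CPU for duration and an emergency trigger for emergency_duration.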