Skip to content

PostgreSQL

·


PostgreSQL collector can collect the running status index from PostgreSQL instance, and collect the index to TrueWatch to help monitor and analyze various abnormal situations of PostgreSQL.

Configuration

Preconditions

  • PostgreSQL version >= 9.0
  • Create user

    -- PostgreSQL >= 10
    create user datakit with password '<PASSWORD>';
    grant pg_monitor to datakit;
    grant SELECT ON pg_stat_database to datakit;
    
    -- PostgreSQL < 10
    create user datakit with password '<PASSWORD>';
    grant SELECT ON pg_stat_database to datakit;
    
  • Enable the pg_stat_statements Extension (Optional)

    When collecting PostgreSQL Object, some key metrics such as qps/tps/avg_query_time rely on the pg_stat_statements extension. The specific steps to enable this extension are as follows:

    • Modify the Configuration File

      Locate and edit the PostgreSQL configuration file (common paths are /var/lib/pgsql/data/postgresql.conf or /etc/postgresql/<version>/main/postgresql.conf), and add or modify the following configuration items:

      # Enable the pg_stat_statements
      shared_preload_libraries = 'pg_stat_statements'
      
      # Optional configuration
      pg_stat_statements.track = 'all'  # collect all SQL statements
      pg_stat_statements.max = 10000  # max number of SQL statements to track
      pg_stat_statements.track_utility = off # ignore utility statements, only track regular SQL statements like `SELECT`, `INSERT`, `UPDATE`, `DELETE`
      
    • Restart the PostgreSQL Service

      After modifying the configuration file, it is necessary to restart the PostgreSQL service for the configurations to take effect.​

    • Create the Extension

      Connect to the target database and execute the following SQL statement to create the extension:

      CREATE EXTENSION pg_stat_statements;
      
      • Verify the Successful Enablement of the Extension:
      SELECT * FROM pg_extension WHERE extname = 'pg_stat_statements';
      SELECT * FROM pg_stat_statements LIMIT 10;
      

Collector Configuration

Go to the conf.d/samples directory under the DataKit installation directory, copy postgresql.conf.sample and name it postgresql.conf. Examples are as follows:

[[inputs.postgresql]]
  ## Server address
  # URI format
  # postgres://[datakit[:PASSWORD]]@localhost[/dbname]?sslmode=[disable|verify-ca|verify-full]
  # or simple string
  # host=localhost user=pqgotest password=... sslmode=... dbname=app_production

  address = "postgres://datakit:PASSWORD@localhost/postgres?sslmode=disable"

  ## Ignore databases which are gathered. Do not use with 'databases' option.
  #
  # ignored_databases = ["db1"]

  ## Specify the list of the databases to be gathered. Do not use with the 'ignored_databases' option.
  #
  # databases = ["db1"]

  ## Specify the name used as the "server" tag.
  #
  # outputaddress = "db01"

  ## Collect interval
  # Time unit: "ns", "us", "ms", "s", "m", "h"
  #
  interval = "10s"

  ## @param connect_timeout - duration - optional - default: "10s"
  ## Timeout for connecting to PostgreSQL and executing collector queries.
  # connect_timeout = "10s"

  ## Set true to enable election
  #
  election = true

  ## v2+ overrides metric-category measurement names to "postgresql_metric". Default: v2
  ## Set to "v1" to keep legacy names: postgresql, postgresql_function, postgresql_lock, postgresql_stat,
  ## postgresql_index, postgresql_size, postgresql_statio, postgresql_replication, postgresql_replication_slot,
  ## postgresql_slru, postgresql_io, postgresql_bgwriter, postgresql_connection, postgresql_conflict,
  ## postgresql_archiver, postgresql_dbm_metric, postgresql_dbm_session, postgresql_dbm_connection
  measurement_version = "v2"

  ## Set true to enable collecting function metrics from pg_stat_user_functions.
  #
  collect_function_metrics = false

  ## Metric name in metric_exclude_list will not be collected.
  #
  metric_exclude_list = [""]

  ## Set dbm to true to collect database activity 
  dbm = false

  ## Config dbm metric 
  [inputs.postgresql.dbm_metric]
    enabled = true
    interval = "60s"

  ## Config dbm sample 
  [inputs.postgresql.dbm_sample]
    enabled = true
    ## @param explain_cache_ttl - duration - optional - default: "10m"
    ## TTL for explain-rate cache.
    explain_cache_ttl = "10m"
    ## @param plan_cache_ttl - duration - optional - default: "1h"
    ## Do not re-emit the same execution plan within this window.
    plan_cache_ttl = "1h"

  ## Config dbm activity
  [inputs.postgresql.dbm_activity]
    enabled = true
    interval = "10s"

  ## Config database discovery. 
  # Discovered databases will be used to for database specific metric collection, 
  # such as object collection and function metrics collection.
  #
  [inputs.postgresql.auto_discovery_database]
    # Set true to enable database discovery
    enabled = false

    # Maximum number of databases to discover
    max_databases = 100

    # Regex pattern to include databases
    include = [".*"]

    # Regex pattern to exclude databases. The pattern in "exclude" takes precedence over "include". 
    exclude = ["model", "msdb", "cloudsqladmin", "rdsadmin"]

    #Discovery interval
    interval = "600s"

  ## collect object
  [inputs.postgresql.object]
    # Set true to enable collecting objects
    enabled = true

    # interval to collect postgresql object which will be greater than collection interval
    interval = "600s"

    [inputs.postgresql.object.collect_schemas]
      # Set true to enable collecting schemas
      enabled = true

      # Maximum number of tables to collect
      max_tables = 300

      # Maximum number of databases to collect
      max_database = 100

  ## Relations config
  #
  # The list of relations/tables can be specified to track per-relation metrics. 
  # To collect relation metrics, you need to specify the databases or database_regex field to indicate which databases to collect.
  # relation_name refer to the name of a relation, either relation_name or relation_regex must be set.
  # relation_regex is a regex rule, only takes effect when relation_name is not set.
  # schemas used for filtering, ignore this field when it is empty
  # relkind can be a list of the following options:
  #   r(ordinary table), i(index), S(sequence), t(TOAST table), p(partitioned table),
  #   m(materialized view), c(composite type), f(foreign table)
  # Size metrics are collected only for ordinary tables. Index metrics are collected only for user indexes.
  # Lock metrics are collected for all relation types. The rest of the metrics are collected only for user tables.
  #
  # [[inputs.postgresql.relations]]
  # databases = ["postgres"]
  # database_regex = "<DATABASE_PATTERN>"
  # relation_name = "<TABLE_NAME>"
  # relation_regex = "<TABLE_PATTERN>"
  # schemas = ["public"]
  # relkind = ["r", "p"]

  ## Run a custom SQL query and collect corresponding metrics.
  #
  # [[inputs.postgresql.custom_queries]]
  #   sql = '''
  #     select datname,numbackends,blks_read
  #     from pg_stat_database
  #     limit 10
  #   '''
  #   metric = "postgresql_custom_stat"
  #   tags = ["datname" ]
  #   fields = ["numbackends", "blks_read"]
  #   interval = "10s"

  ## Log collection
  #
  # [inputs.postgresql.log]
  # files = []
  # pipeline = "postgresql.p"

  ## Custom tags
  #
  [inputs.postgresql.tags]
  # some_tag = "some_value"
  # more_tag = "some_other_value"
  # ...

After configuration, restart DataKit.

The collector can now be turned on by ConfigMap Injection Collector Configuration.

Database Performance Metrics Collection

Version-1.84.0

Database performance metrics are mainly derived from the built-in system views and extension plugins of PostgreSQL, with the most core ones including pg_stat_activity and pg_stat_statements. These tools provide methods to obtain the internal execution status of the database at runtime: pg_stat_activity displays real-time information such as the activity status of current sessions, executed queries, and waiting events; pg_stat_statements records the execution statistics of historical SQL statements, including execution times, time consumption, IO status, etc. Through these views and plugins, DataKit can collect statement metrics, query samples, execution plans, real-time session activities, and the derived session/connection metrics required for PostgreSQL DBM.

To enable this feature, the following steps need to be performed.

  • Modify the configuration file to enable monitoring and collection
[[inputs.postgresql]]

  ## Set dbm to true to collect database activity 
  dbm = true

  ## Config dbm metric 
  [inputs.postgresql.dbm_metric]
    enabled = true
    interval = "60s"

  ## Config dbm sample 
  [inputs.postgresql.dbm_sample]
    enabled = true
    explain_cache_ttl = "10m"
    plan_cache_ttl = "1h"

  ## Config dbm activity
  [inputs.postgresql.dbm_activity]
    enabled = true
    interval = "10s"

...

When DBM is enabled, DataKit also collects postgresql_dbm_session and postgresql_dbm_connection automatically. postgresql_dbm_session is aggregated from DBM activity rows, while postgresql_dbm_connection is collected through a separate grouped query against pg_stat_activity. No extra switch is required for these two metric sets.

  • PostgreSQL Configuration

Modify the configuration file (e.g., postgresql.conf) and configure the relevant parameters:

shared_preload_libraries = 'pg_stat_statements'
track_activity_query_size = 4096 # Required for collection of larger queries.
  • Permission Configuration

Account Authorization

ALTER ROLE datakit INHERIT;

Execute the following SQL in each database:

CREATE SCHEMA datakit;
GRANT USAGE ON SCHEMA datakit TO datakit;
GRANT USAGE ON SCHEMA public TO datakit;
GRANT pg_monitor TO datakit;
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

Execute the following SQL in each database:

CREATE SCHEMA datakit;
GRANT USAGE ON SCHEMA datakit TO datakit;
GRANT USAGE ON SCHEMA public TO datakit;
GRANT pg_monitor TO datakit;
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

Execute the following SQL in each database:

CREATE SCHEMA datakit;
GRANT USAGE ON SCHEMA datakit TO datakit;
GRANT USAGE ON SCHEMA public TO datakit;
GRANT SELECT ON pg_stat_database TO datakit;
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

CREATE OR REPLACE FUNCTION datakit.pg_stat_activity() RETURNS SETOF pg_stat_activity AS
  $$ SELECT * FROM pg_catalog.pg_stat_activity; $$
LANGUAGE sql
SECURITY DEFINER;
CREATE OR REPLACE FUNCTION datakit.pg_stat_statements() RETURNS SETOF pg_stat_statements AS
    $$ SELECT * FROM pg_stat_statements; $$
LANGUAGE sql
SECURITY DEFINER;
  • Additional permissions and explain-plan helper

    If dbm_sample.enabled = true, create the helper function in every database where execution plans need to be collected:

    CREATE OR REPLACE FUNCTION datakit.explain_statement(
       l_query TEXT,
       OUT explain JSON
    )
    RETURNS SETOF JSON AS
    $$
    DECLARE
      curs REFCURSOR;
      plan JSON;
    BEGIN
      OPEN curs FOR EXECUTE pg_catalog.concat('EXPLAIN (FORMAT JSON) ', l_query);
      FETCH curs INTO plan;
      CLOSE curs;
      RETURN QUERY SELECT plan;
    END;
    $$
    LANGUAGE 'plpgsql'
    RETURNS NULL ON NULL INPUT
    SECURITY DEFINER;
    

Metric

For all of the following data collections, the global election tags will added automatically, we can add extra tags in [inputs.postgresql.tags] if needed:

postgresql_metric

Metric set including PostgreSQL database, function, lock, statistics, index, size, relation I/O, replication, replication slot, simple LRU cache, I/O, background writer, connection, conflict, archiver, and DBM (metric/session/connection) statistics, unified in v2

Tags & Fields Description
application_name
(tag)
Name of the application connected to this backend
backend_type
(tag)
Type of backend (e.g. background worker, autovacuum worker)
client_addr
(tag)
IP address of the client connected to this backend.
context
(tag)
The context of an I/O operation
database_instance
(tag)
PostgreSQL instance identifier from configured tag database_instance or system_identifier. Common tag.
db
(tag)
The database name
function
(tag)
The function name
host
(tag)
The server host address. Common tag.
locktype
(tag)
The lock type
mode
(tag)
The lock mode
name
(tag)
The name of the SLRU
object
(tag)
Target object of an I/O operation
pg_index
(tag)
The index name
query_signature
(tag)
The hash signature computed from db, rolname, and normalized query text.
queryid
(tag)
The query ID reported by pg_stat_statements.
rolname
(tag)
The role name
schema
(tag)
The schema name
server
(tag)
The address of the server. The value is host:port. Common tag.
session_status
(tag)
Derived session status: active / idle / blocked.
slot_name
(tag)
The replication slot name
slot_type
(tag)
The replication slot type
state
(tag)
Current overall state of this backend
table
(tag)
The table name
usename
(tag)
Name of the user logged into this backend
wait_event
(tag)
Wait event name if backend is currently waiting.
wait_event_type
(tag)
Type of event for which the backend is waiting.
wait_group
(tag)
Datakit unified wait group: Lock, I/O, Concurrency, Network, CPU, Commit/Log, Other.
active_time Time spent executing SQL statements in this database, in milliseconds.
Type: float | (count)
Unit: count
Tagged by: db
analyze_count The number of times this table has been manually analyzed.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
archived_count Number of WAL files that have been successfully archived.
Type: int | (count)
Unit: count
archived_failed_count Number of failed attempts for archiving WAL files.
Type: int | (count)
Unit: count
autoanalyze_count The number of times this table has been analyzed by the autovacuum daemon.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
autovacuum_count The number of times this table has been vacuumed by the autovacuum daemon.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
avg_total_exec_time Average execution time per call in current interval.
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
avg_total_plan_time Average plan time per call in current interval.
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
blk_read_time Total time the statement spent reading data file blocks, in milliseconds (if track_io_timing is enabled, otherwise zero). Postgres < 17
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
blk_write_time Total time the statement spent writing data file blocks, in milliseconds (if track_io_timing is enabled, otherwise zero). Postgres < 17
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
blks_exists Number of blocks checked for existence for this SLRU (simple least-recently-used) cache.
Type: int | (gauge)
Unit: count
Tagged by: name
blks_hit Number of block cache hits. Emitted either as database-wide statistics (tagged by db) or SLRU cache statistics (tagged by name).
Type: int | (gauge)
Unit: count
Tagged by: name
blks_read Number of disk blocks read. Emitted either as database-wide statistics (tagged by db) or SLRU cache statistics (tagged by name).
Type: int | (gauge)
Unit: count
Tagged by: name
blks_written Number of disk blocks written for this SLRU (simple least-recently-used) cache.
Type: int | (gauge)
Unit: count
Tagged by: name
blks_zeroed Number of blocks zeroed during initializations of SLRU (simple least-recently-used) cache.
Type: int | (gauge)
Unit: count
Tagged by: name
buffers_alloc The number of buffers allocated
Type: int | (count)
Unit: count
buffers_backend The number of buffers written directly by a backend.
Type: int | (count)
Unit: count
buffers_backend_fsync The of times a backend had to execute its own fsync call instead of the background writer.
Type: int | (count)
Unit: count
buffers_checkpoint The number of buffers written during checkpoints.
Type: int | (count)
Unit: count
buffers_clean The number of buffers written by the background writer.
Type: int | (count)
Unit: count
calls Number of times the statement was executed.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
checkpoint_sync_time The total amount of checkpoint processing time spent synchronizing files to disk.
Type: float | (count)
Unit: time,ms
checkpoint_write_time The total amount of checkpoint processing time spent writing files to disk.
Type: float | (count)
Unit: time,ms
checkpoints_req The number of requested checkpoints that were performed.
Type: int | (count)
Unit: count
checkpoints_timed The number of scheduled checkpoints that were performed.
Type: int | (count)
Unit: count
confl_bufferpin Number of queries in this database that have been canceled due to pinned buffers.
Type: int | (count)
Unit: count
Tagged by: db
confl_deadlock Number of queries in this database that have been canceled due to deadlocks.
Type: int | (count)
Unit: count
Tagged by: db
confl_lock Number of queries in this database that have been canceled due to dropped tablespaces. This will occur when a temp_tablespace is dropped while being used on a standby.
Type: int | (count)
Unit: count
Tagged by: db
confl_snapshot Number of queries in this database that have been canceled due to old snapshots.
Type: int | (count)
Unit: count
Tagged by: db
confl_tablespace Number of queries in this database that have been canceled due to dropped tablespaces. This will occur when a temp_tablespace is dropped while being used on a standby.
Type: int | (count)
Unit: count
Tagged by: db
connection_count Number of active user connections grouped by application, state, user and database.
Type: int | (gauge)
Unit: count
Tagged by: application_name, db, state, usename
database_size The disk space used by this database.
Type: int | (gauge)
Unit: count
Tagged by: db
dbm_qps DBM query throughput derived from delta_total_calls divided by the DBM metric collection interval.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
deadlocks The number of deadlocks detected in this database.
Type: int | (gauge)
Unit: count
Tagged by: db
delta_blk_read_time Delta value of blk_read_time between collection intervals.
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
delta_blk_write_time Delta value of blk_write_time between collection intervals.
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
delta_calls Delta value of calls between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_local_blks_dirtied Delta value of local_blks_dirtied between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_local_blks_hit Delta value of local_blks_hit between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_local_blks_read Delta value of local_blks_read between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_local_blks_written Delta value of local_blks_written between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_rows Delta value of rows between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_shared_blk_read_time Delta value of shared_blk_read_time between collection intervals.
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
delta_shared_blk_write_time Delta value of shared_blk_write_time between collection intervals.
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
delta_shared_blks_dirtied Delta value of shared_blks_dirtied between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_shared_blks_hit Delta value of shared_blks_hit between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_shared_blks_read Delta value of shared_blks_read between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_shared_blks_written Delta value of shared_blks_written between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_temp_blks_read Delta value of temp_blks_read between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_temp_blks_written Delta value of temp_blks_written between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_total_calls Delta value of total_calls between collection intervals.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_total_exec_time Delta value of execution time between collection intervals. For older PostgreSQL versions this is derived from total_time.
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
delta_total_plan_time Delta value of total_plan_time between collection intervals.
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
delta_wal_bytes Delta value of wal_bytes between collection intervals.
Type: float | (gauge)
Unit: digital,B
Tagged by: db, query_signature, queryid, rolname
delta_wal_fpi Delta value of wal_fpi between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
delta_wal_records Delta value of wal_records between collection intervals.
Type: float | (gauge)
Unit: count
Tagged by: db, query_signature, queryid, rolname
evictions Number of times a block has been written out from a shared or local buffer in order to make it available for another use.
Type: int | (gauge)
Unit: count
Tagged by: backend_type, context, object
extend_time Time spent in extend operations in milliseconds (if track_io_timing is enabled, otherwise zero).
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: backend_type, context, object
extends Number of relation extend operations, each of the size specified in op_bytes.
Type: int | (gauge)
Unit: count
Tagged by: backend_type, context, object
flushes Number of flush of dirty data for this SLRU (simple least-recently-used) cache.
Type: int | (gauge)
Unit: count
Tagged by: name
fsync_time Time spent in fsync operations in milliseconds (if track_io_timing is enabled, otherwise zero).
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: backend_type, context, object
fsyncs Number of fsync calls. These are only tracked in context normal.
Type: int | (gauge)
Unit: count
Tagged by: backend_type, context, object
function_calls The number of times this function has been called.
Type: int | (count)
Unit: count
Tagged by: db, function, schema
function_self_time Time spent in this function, not including time spent in called functions, in milliseconds.
Type: float | (timeStamp,msec)
Unit: count
Tagged by: db, function, schema
function_total_time Total time spent in this function, in milliseconds.
Type: float | (timeStamp,msec)
Unit: count
Tagged by: db, function, schema
heap_blks_hit The number of buffer hits in this table.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
heap_blks_read The number of disk blocks read from this table.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
hits The number of times a desired block was found in a shared buffer.
Type: int | (gauge)
Unit: count
Tagged by: backend_type, context, object
idle_in_transaction_time Time spent idling while in a transaction in this database, in milliseconds.
Type: float | (count)
Unit: count
Tagged by: db
idx_blks_hit The number of buffer hits in all indexes on this table.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
idx_blks_read The number of disk blocks read from all indexes on this table.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
idx_scan Number of index scans. When pg_index is present the metric comes from per-index statistics; otherwise it comes from table-level statistics.
Type: int | (gauge)
Unit: count
Tagged by: db, pg_index, schema, table
idx_tup_fetch Number of live rows fetched by index scans. When pg_index is present the metric comes from per-index statistics; otherwise it comes from table-level statistics.
Type: int | (gauge)
Unit: count
Tagged by: db, pg_index, schema, table
idx_tup_read The number of index entries returned by scans on this index.
Type: int | (gauge)
Unit: count
Tagged by: db, pg_index, schema, table
index_size The total disk space used by indexes attached to the specified table.
Type: int | (gauge)
Unit: digital,B
Tagged by: db, schema, table
local_blks_dirtied Total number of local blocks dirtied by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
local_blks_hit Total number of local block cache hits by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
local_blks_read Total number of local blocks read by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
local_blks_written Total number of local blocks written by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
lock_count The number of locks active for this database.
Type: int | (gauge)
Unit: count
Tagged by: db, locktype, mode, schema, table
max_connections The maximum number of client connections allowed to this database.
Type: float | (gauge)
Unit: count
max_plan_time Maximum time spent planning the statement, in milliseconds (if pg_stat_statements.track_planning is enabled, otherwise zero).
Type: int | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
maxwritten_clean The number of times the background writer stopped a cleaning scan due to writing too many buffers.
Type: int | (count)
Unit: count
min_plan_time Minimum time spent planning the statement, in milliseconds (if pg_stat_statements.track_planning is enabled, otherwise zero).
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
n_dead_tup The estimated number of dead rows.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
n_live_tup The estimated number of live rows.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
n_tup_del The number of rows deleted by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
n_tup_hot_upd The number of rows HOT updated, meaning no separate index update was needed.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
n_tup_ins The number of rows inserted by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
n_tup_upd The number of rows updated by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
numbackends The number of active connections to this database.
Type: int | (gauge)
Unit: count
Tagged by: db
percent_usage_connections The number of connections to this database as a fraction of the maximum number of allowed connections.
Type: float | (gauge)
Unit: count
read_time Time spent in read operations in milliseconds (if track_io_timing is enabled, otherwise zero).
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: backend_type, context, object
reads Number of read operations, each of the size specified in op_bytes.
Type: int | (gauge)
Unit: count
Tagged by: backend_type, context, object
replication_delay The current replication delay in seconds. Only available with postgresql 9.1 and newer.
Type: int | (gauge)
Unit: time,s
Tagged by: db
replication_delay_bytes The current replication delay in bytes. Only available with postgresql 9.2 and newer.
Type: int | (gauge)
Unit: digital,B
Tagged by: db
rows Total number of rows retrieved or affected by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
seq_scan The number of sequential scans initiated on this table.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
seq_tup_read The number of live rows fetched by sequential scans.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
session_blocked_count Number of blocked sessions in this dimension group.
Type: int | (gauge)
Unit: count
Tagged by: application_name, client_addr, db, session_status, usename, wait_event, wait_event_type, wait_group
session_group_count Number of sessions in this dimension group.
Type: int | (gauge)
Unit: count
Tagged by: application_name, client_addr, db, session_status, usename, wait_event, wait_event_type, wait_group
session_time Time spent by database sessions in this database, in milliseconds.
Type: float | (count)
Unit: count
Tagged by: db
sessions Total number of sessions established to this database.
Type: int | (count)
Unit: count
Tagged by: db
sessions_abandoned Number of database sessions to this database that were terminated because connection to the client was lost.
Type: int | (count)
Unit: count
Tagged by: db
sessions_fatal Number of database sessions to this database that were terminated by fatal errors.
Type: int | (count)
Unit: count
Tagged by: db
sessions_killed Number of database sessions to this database that were terminated by operator intervention.
Type: int | (count)
Unit: count
Tagged by: db
shared_blk_read_time Total time the statement spent reading shared blocks, in milliseconds (if track_io_timing is enabled, otherwise zero). Postgres >= 17
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
shared_blk_write_time Total time the statement spent writing shared blocks, in milliseconds (if track_io_timing is enabled, otherwise zero). Postgres >= 17
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
shared_blks_dirtied Total number of shared blocks dirtied by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
shared_blks_hit Total number of shared block cache hits by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
shared_blks_read Total number of shared blocks read by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
shared_blks_written Total number of shared blocks written by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
spill_bytes Amount of decoded transaction data spilled to disk while performing decoding of changes from WAL for this slot. This and other spill counters can be used to gauge the I/O which occurred during logical decoding and allow tuning logical_decoding_work_mem. Only available with PostgreSQL 14 and newer.
Type: int | (gauge)
Unit: digital,B
Tagged by: db, slot_name, slot_type
spill_count Number of times transactions were spilled to disk while decoding changes from WAL for this slot. This counter is incremented each time a transaction is spilled, and the same transaction may be spilled multiple times. Only available with PostgreSQL 14 and newer.
Type: int | (gauge)
Unit: count
Tagged by: db, slot_name, slot_type
spill_txns Number of transactions spilled to disk once the memory used by logical decoding to decode changes from WAL has exceeded logical_decoding_work_mem. The counter gets incremented for both top-level transactions and subtransactions. Only available with PostgreSQL 14 and newer.
Type: int | (gauge)
Unit: count
Tagged by: db, slot_name, slot_type
stream_bytes Amount of transaction data decoded for streaming in-progress transactions to the decoding output plugin while decoding changes from WAL for this slot. This and other streaming counters for this slot can be used to tune logical_decoding_work_mem. Only available with PostgreSQL 14 and newer.
Type: int | (gauge)
Unit: digital,B
Tagged by: db, slot_name, slot_type
stream_count Number of times in-progress transactions were streamed to the decoding output plugin while decoding changes from WAL for this slot. This counter is incremented each time a transaction is streamed, and the same transaction may be streamed multiple times. Only available with PostgreSQL 14 and newer.
Type: int | (gauge)
Unit: count
Tagged by: db, slot_name, slot_type
stream_txns Number of in-progress transactions streamed to the decoding output plugin after the memory used by logical decoding to decode changes from WAL for this slot has exceeded logical_decoding_work_mem. Streaming only works with top-level transactions (subtransactions can't be streamed independently), so the counter is not incremented for subtransactions. Only available with PostgreSQL 14 and newer.
Type: int | (gauge)
Unit: count
Tagged by: db, slot_name, slot_type
table_size The total disk space used by the specified table with TOAST data. Free space map and visibility map are not included.
Type: int | (gauge)
Unit: digital,B
Tagged by: db, schema, table
temp_blks_read Total number of temp blocks read by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
temp_blks_written Total number of temp blocks written by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
temp_bytes The amount of data written to temporary files by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db
temp_files The number of temporary files created by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db
tidx_blks_hit The number of buffer hits in this table's TOAST table index.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
tidx_blks_read The number of disk blocks read from this table's TOAST table index.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
toast_blks_hit The number of buffer hits in this table's TOAST table.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
toast_blks_read The number of disk blocks read from this table's TOAST table.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
total_bytes Amount of transaction data decoded for sending transactions to the decoding output plugin while decoding changes from WAL for this slot. Note that this includes data that is streamed and/or spilled. Only available with PostgreSQL 14 and newer.
Type: int | (gauge)
Unit: digital,B
Tagged by: db, slot_name, slot_type
total_calls Total number of statement calls aggregated across all pg_stat_statements rows.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
total_exec_time Total time spent executing the statement, in milliseconds.
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
total_plan_time Total time spent planning the statement, in milliseconds (if pg_stat_statements.track_planning is enabled, otherwise zero).
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: db, query_signature, queryid, rolname
total_size The total disk space used by the table, including indexes and TOAST data.
Type: int | (gauge)
Unit: digital,B
Tagged by: db, schema, table
total_txns Number of decoded transactions sent to the decoding output plugin for this slot. This counts top-level transactions only, and is not incremented for subtransactions. Note that this includes the transactions that are streamed and/or spilled. Only available with PostgreSQL 14 and newer.
Type: int | (gauge)
Unit: count
Tagged by: db, slot_name, slot_type
truncates Number of truncates for this SLRU (simple least-recently-used) cache.
Type: int | (gauge)
Unit: count
Tagged by: name
tup_deleted The number of rows deleted by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db
tup_fetched The number of rows fetched by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db
tup_inserted The number of rows inserted by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db
tup_returned The number of rows returned by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db
tup_updated The number of rows updated by queries in this database.
Type: int | (gauge)
Unit: count
Tagged by: db
vacuum_count The number of times this table has been manually vacuumed.
Type: int | (gauge)
Unit: count
Tagged by: db, schema, table
wal_bytes Total amount of WAL generated by the statement in bytes.
Type: int | (gauge)
Unit: digital,B
Tagged by: db, query_signature, queryid, rolname
wal_fpi Total number of WAL full page images generated by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
wal_records Total number of WAL records generated by the statement.
Type: int | (count)
Unit: count
Tagged by: db, query_signature, queryid, rolname
wraparound The number of transactions that can occur until a transaction wraparound.
Type: float | (gauge)
Unit: count
Tagged by: db
write_time Time spent in write operations in milliseconds (if track_io_timing is enabled, otherwise zero).
Type: float | (gauge)
Unit: timeStamp,msec
Tagged by: backend_type, context, object
writes Number of write operations, each of the size specified in op_bytes.
Type: int | (gauge)
Unit: count
Tagged by: backend_type, context, object
xact_commit The number of transactions that have been committed in this database.
Type: int | (gauge)
Unit: count
Tagged by: db
xact_rollback The number of transactions that have been rolled back in this database.
Type: int | (gauge)
Unit: count
Tagged by: db

collector

Tags & Fields Description
instance
(tag)
Server addr of the instance
job
(tag)
Server name of the instance
up
Type: int | (gauge)
Unit: -

Object

db_query

PostgreSQL DBM query object. Each object represents a unique normalized SQL statement identified by query_signature, which is derived from db, rolname, and query text.

Tags & Fields Description
database_instance
(tag)
PostgreSQL instance identifier from configured tag database_instance or system_identifier.
database_type
(tag)
The type of database. The value is PostgreSQL
db
(tag)
The database name
name
(tag)
Object identity built from server, optional database_instance, and query_signature.
query_signature
(tag)
The hash signature computed from db, rolname, and normalized query text.
queryid
(tag)
The query ID reported by pg_stat_statements, if available.
rolname
(tag)
The role name
server
(tag)
The PostgreSQL server address
message The normalized/obfuscated SQL text.
Type: string
Unit: -

db_exec_plan

PostgreSQL database statement sample information, including statement text, execution plans, used for detailed analysis of SQL execution and performance issues.( Version-1.84.0)

Tags & Fields Description
application_name
(tag)
Name of the application that is connected to this backend.
client_addr
(tag)
IP address of the client that is connected to this backend.
client_hostname
(tag)
Host name of the connected client, as reported by a reverse DNS lookup of client_addr. This field will only be non-null for IP connections, and only when log_hostname is enabled.
client_port
(tag)
TCP port number that the client is using for communication, or -1 if a Unix socket is used.
database_instance
(tag)
PostgreSQL instance identifier from configured tag database_instance or system_identifier.
database_type
(tag)
The database type. The value is PostgreSQL
db
(tag)
Name of the database this backend is connected to.
name
(tag)
Object identity built from server, optional database_instance, and query_signature:plan_signature.
plan_signature
(tag)
The hash signature computed from the normalized execution plan.
plan_type
(tag)
The format of the plan content. The value is JSON.
query_signature
(tag)
The hash value computed from query
server
(tag)
The server address
service
(tag)
The service name postgresql
usename
(tag)
Name of the user logged into this backend.
message The obfuscated/normalized JSON execution plan definition.
Type: string | (gauge)
Unit: -
statement The obfuscated/normalized SQL text corresponding to this execution plan.
Type: string | (gauge)
Unit: -

database

PostgreSQL object metrics( Version-1.76.0)

Tags & Fields Description
database_instance
(tag)
PostgreSQL instance identifier from configured tag database_instance or system_identifier.
database_type
(tag)
The type of the database. The value is PostgreSQL
host
(tag)
The hostname of the PostgreSQL server
name
(tag)
The object identifier. The value is <server>-<database_instance> when database_instance is configured, otherwise host:port.
port
(tag)
The port of the PostgreSQL server
server
(tag)
The server address of the PostgreSQL server. The value is host:port
version
(tag)
The version of the PostgreSQL server
avg_query_time The average time taken by a query to execute
Type: float
Unit: timeStamp,usec
message Summary of database information
Type: string
Unit: -
qps The number of queries executed by the database per second
Type: float
Unit: gauge
slow_queries The number of queries that have taken more than long_query_time seconds. This counter increments regardless of whether the slow query log is enabled.
Type: int
Unit: count
slow_query_log Whether the slow query log is enabled. The value can OFF to disable the log or ON to enable the log.
Type: string
Unit: -
tps The number of transactions executed by the database per second
Type: float
Unit: gauge
uptime The number of seconds that the server has been up
Type: int
Unit: time,s

Structure of the message field

The basic structure of the message field is as follows:​

{
  "setting": {
    "DateStyle":"ISO, MDY",
    ...
  },

  "databases": [ # databases information
    {
      "name": "db1",
      "encoding": "utf8",
      "owner": "datakit",
      "schemas": [ # schemas information
        {
          "name": "schema1",
          "owner": "datakit",
          "tables": [ # tables information
            {
              "name": "table1",
              "columns": [], # columns information
              "indexes": [], # indexes information
              "foreign_keys": [], # foreign keys information
            }
            ...
          ]
        },
        ...
      ]
    }
    ...
  ]
}

setting

The data in the setting field is sourced from the pg_settings system view and is used to display the configuration parameter information of the current database. For more details, please refer to the PostgreSQL documentation.

databases

The databases field stores information about all databases on the PostgreSQL server. The detailed information of each database is shown in the following table:

Field Name Description Type
name Database name string
encoding Database encoding string
owner Role name string
description Description text string
schemas List containing schema information list

The schemas field contains information about all the schemas in the database. The information for each schema is as follows:

Field Name Description Type
name schema name string
owner Role name string
tables List containing table information list

The tables field contains information about all the tables in the database. The information for each table is as follows:

Field Name Description Type
name table name string
owner table owner string
has_indexes Whether there are indexes bool
has_partitions Whether there are partitions bool
toast_table toast table name string
partition_key Partition key string
num_partitions Number of partitions int64
foreign_keys List containing foreign key information list
columns List containing column information list
indexes List containing index information list
  • tables.columns

The columns field contains information about all the columns in the tables. The information for each column is as follows:

Field Name Description Type
name Name string
data_type Data type string
nullable Whether column can be null bool
default Default value string
  • tables.indexes

The indexes field contains information about all the indexes in the database tables. The information for each index is as follows:

Field Name Description Type
name Name string
columns Columns included in the index list
index_type Index type string
definition Index definition string
is_unique Whether unique bool
is_primary Whether primary bool
is_exclusion Whether exclusion constraint index bool
is_immediate Whether to check constraints immediately after each statement execution bool
is_valid Whether it is valid bool
is_clustered Whether it is a clustered index bool
is_checkxmin Whether to check xmin bool
is_ready Whether it is ready bool
is_live Whether it is live bool
is_replident Whether it is a row identifier index bool
is_partial Whether it is a partial index bool

The indexes.columns field, which contains information about the columns included in the index, has the following details for each column:

Field Name Description Type
name Name string
  • tables.foreign_keys

The foreign_keys field contains information about all the foreign_keys in the database tables. The information for each foreign_key is as follows:

Field Name Description Type
name Name string
definition Foreign key definition string
constraint_schema Foreign key schema string
column_names Foreign key column names string
referenced_table_schema Referenced table schema string
referenced_table_name Referenced table name string
referenced_column_names Referenced column names string
update_action Cascade update rule (such as CASCADE, RESTRICT) string
delete_action Cascade delete rule (such as CASCADE, SET NULL) string

Logging

postgresql_dbm_activity

PostgreSQL database activity information collected from the pg_stat_activity view. Provides detailed information about the current activity of each backend, including query status, execution time, and resource usage.( Version-1.84.0)

Tags & Fields Description
application_name
(tag)
Name of the application that is connected to this backend.
backend_type
(tag)
Type of current backend.
client_addr
(tag)
IP address of the client that is connected to this backend. If this field is null, it indicates either that the client is connected via a Unix socket or that this is an internal process (such as autovacuum worker).
client_hostname
(tag)
Host name of the connected client, as reported by a reverse DNS lookup of client_addr. This field will only be non-null for IP connections, and only when log_hostname is enabled.
client_port
(tag)
TCP port number that the client is using for communication, or -1 if a Unix socket is used.
db
(tag)
The database name.
message
(tag)
Text of a normalized statement.
pid
(tag)
Process ID of this backend.
query_signature
(tag)
The hash value computed from query
server
(tag)
The server address
service
(tag)
The service name postgresql
state
(tag)
Current overall state of this backend.
status
(tag)
The status of the statement. The value is only info for now.
usename
(tag)
Name of the user logged into this backend.
wait_event
(tag)
Wait event name if backend is currently waiting.
wait_event_type
(tag)
Type of event for which the backend is waiting.
wait_group
(tag)
Datakit unified wait group: Lock, I/O, Concurrency, Network, CPU, Commit/Log, Other.
backend_start Time when this process was started. For client backends, this is the time the client connected to the server.
Type: int | (count)
Unit: timeStamp,usec
blocking_pids The raw pg_blocking_pids(pid) result for this backend.
Type: string | (string)
Unit: -
duration The query duration derived from query_start and state/sample time.
Type: int | (gauge)
Unit: time,μs
query_start Time when the currently active query was started, or if state is not active, when the last query was started.
Type: int | (count)
Unit: timeStamp,usec
state_change Time when the state was last changed.
Type: int | (count)
Unit: timeStamp,usec
tx_duration The transaction duration derived from sample time - xact_start.
Type: int | (gauge)
Unit: time,μs
wait_duration The wait duration derived from sample time - state_change.
Type: int | (gauge)
Unit: time,μs
xact_start Time when this process' current transaction was started, or null if no transaction is active. If the current query is the first of its transaction, this column is equal to the query_start column.
Type: int | (count)
Unit: timeStamp,usec

File log

Log Collection

  • PostgreSQL logs are output to stderr by default. To open file logs, configure them in postgresql's configuration file /etc/postgresql/<VERSION>/main/postgresql.conf as follows:
logging_collector = on    # Enable log writing to files

log_directory = 'pg_log'  # Set the file storage directory, absolute path or relative path (relative PGDATA)

log_filename = 'pg.log'   # Log file name
log_statement = 'all'     # Record all queries

#log_duration = on
log_line_prefix= '%m [%p] %d [%a] %u [%h] %c ' # 日志行前缀
log_file_mode = 0644

# For Windows
#log_destination = 'eventlog'

For more configuration, please refer to the doc

  • The PostgreSQL collector does not have log collection enabled by default. You can open files in conf.d/db/postgresql.conf and write to the absolute path of the PostgreSQL log file. For example:
[[inputs.postgresql]]

  ...

  [inputs.postgresql.log]
  files = ["/tmp/pgsql/postgresql.log"]

When log collection is turned on, a log with a log source of postgresql is generated by default.

Notices:

  • Log collection only supports logs on hosts where DataKit is installed.

Log Pipeline Cut

The original log is

2021-05-31 15:23:45.110 CST [74305] test [pgAdmin 4 - DB:postgres] postgres [127.0.0.1] 60b48f01.12241 LOG:  statement:
        SELECT psd.*, 2^31 - age(datfrozenxid) as wraparound, pg_database_size(psd.datname) as pg_database_size
        FROM pg_stat_database psd
        JOIN pg_database pd ON psd.datname = pd.datname
        WHERE psd.datname not ilike 'template%'   AND psd.datname not ilike 'rdsadmin'
        AND psd.datname not ilike 'azure_maintenance'   AND psd.datname not ilike 'postgres'

Description of the cut field:

Field name Field Value Description
application_name pgAdmin 4 - DB:postgres The name of the application connecting to the current database
db_name test Database accessed
process_id 74305 The client process ID of the current connection
remote_host 127.0.0.1 Address of the client
session_id 60b48f01.12241 ID of the current session
user postgres Current Access User Name
status LOG Current log level (LOG,ERROR,FATAL,PANIC,WARNING,NOTICE,INFO)
time 1622445825110000000 Log generation time

FAQ

Missing metrics

For metrics postgresql_lock/postgresql_stat/postgresql_index/postgresql_size/postgresql_statio, the relations field in the configuration file needs to be enabled. If some of these metrics are partially missing, it may be because there is no data for the relevant metrics.