PostgreSQL
PostgreSQL collector can collect the running status index from PostgreSQL instance, and collect the index to TrueWatch to help monitor and analyze various abnormal situations of PostgreSQL.
Configuration¶
Preconditions¶
- PostgreSQL version >= 9.0
-
Create user
-
Enable the
pg_stat_statements
Extension (Optional)When collecting PostgreSQL Object, some key metrics such as
qps/tps/avg_query_time
rely on thepg_stat_statements
extension. The specific steps to enable this extension are as follows:-
Modify the Configuration File
Locate and edit the PostgreSQL configuration file (common paths are
/var/lib/pgsql/data/postgresql.conf
or/etc/postgresql/<version>/main/postgresql.conf
), and add or modify the following configuration items:# Enable the pg_stat_statements shared_preload_libraries = 'pg_stat_statements' # Optional configuration pg_stat_statements.track = 'all' # collect all SQL statements pg_stat_statements.max = 10000 # max number of SQL statements to track pg_stat_statements.track_utility = off # ignore utility statements, only track regular SQL statements like `SELECT`, `INSERT`, `UPDATE`, `DELETE`
-
Restart the PostgreSQL Service
After modifying the configuration file, it is necessary to restart the PostgreSQL service for the configurations to take effect.
-
Create the Extension
Connect to the target database and execute the following SQL statement to create the extension:
- Verify the Successful Enablement of the Extension:
-
Collector Configuration¶
Go to the conf.d/db
directory under the DataKit installation directory, copy postgresql.conf.sample
and name it postgresql.conf
. Examples are as follows:
[[inputs.postgresql]]
## Server address
# URI format
# postgres://[datakit[:PASSWORD]]@localhost[/dbname]?sslmode=[disable|verify-ca|verify-full]
# or simple string
# host=localhost user=pqgotest password=... sslmode=... dbname=app_production
address = "postgres://datakit:PASSWORD@localhost/postgres?sslmode=disable"
## Ignore databases which are gathered. Do not use with 'databases' option.
#
# ignored_databases = ["db1"]
## Specify the list of the databases to be gathered. Do not use with the 'ignored_databases' option.
#
# databases = ["db1"]
## Specify the name used as the "server" tag.
#
# outputaddress = "db01"
## Collect interval
# Time unit: "ns", "us", "ms", "s", "m", "h"
#
interval = "10s"
## Set true to enable election
#
election = true
## Metric name in metric_exclude_list will not be collected.
#
metric_exclude_list = [""]
## collect object
[inputs.postgresql.object]
# Set true to enable collecting objects
enabled = true
# interval to collect postgresql object which will be greater than collection interval
interval = "600s"
[inputs.postgresql.object.collect_schemas]
# Set true to enable collecting schemas
enabled = true
# Maximum number of tables to collect
max_tables = 300
# Set true to enable auto discovery database
auto_discovery_database = false
# Maximum number of databases to collect
max_database = 100
## Relations config
#
# The list of relations/tables can be specified to track per-relation metrics. To collect relation
# relation_name refer to the name of a relation, either relation_name or relation_regex must be set.
# relation_regex is a regex rule, only takes effect when relation_name is not set.
# schemas used for filtering, ignore this field when it is empty
# relkind can be a list of the following options:
# r(ordinary table), i(index), S(sequence), t(TOAST table), p(partitioned table),
# m(materialized view), c(composite type), f(foreign table)
#
# [[inputs.postgresql.relations]]
# relation_name = "<TABLE_NAME>"
# relation_regex = "<TABLE_PATTERN>"
# schemas = ["public"]
# relkind = ["r", "p"]
## Run a custom SQL query and collect corresponding metrics.
#
# [[inputs.postgresql.custom_queries]]
# sql = '''
# select datname,numbackends,blks_read
# from pg_stat_database
# limit 10
# '''
# metric = "postgresql_custom_stat"
# tags = ["datname" ]
# fields = ["numbackends", "blks_read"]
# interval = "10s"
## Log collection
#
# [inputs.postgresql.log]
# files = []
# pipeline = "postgresql.p"
## Custom tags
#
[inputs.postgresql.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
# ...
After configuration, restart DataKit.
The collector can now be turned on by ConfigMap Injection Collector Configuration.
Metric¶
For all of the following data collections, the global election tags will added automatically, we can add extra tags in [inputs.postgresql.tags]
if needed:
postgresql
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
server ( tag ) |
The address of the server. The value is host:port |
active_time | Time spent executing SQL statements in this database, in milliseconds. Type: float Unit: count |
blks_hit | The number of times disk blocks were found in the buffer cache, preventing the need to read from the database. Type: int Unit: count |
blks_read | The number of disk blocks read in this database. Type: int Unit: count |
database_size | The disk space used by this database. Type: int Unit: count |
deadlocks | The number of deadlocks detected in this database. Type: int Unit: count |
idle_in_transaction_time | Time spent idling while in a transaction in this database, in milliseconds. Type: float Unit: count |
numbackends | The number of active connections to this database. Type: int Unit: count |
session_time | Time spent by database sessions in this database, in milliseconds. Type: float Unit: count |
sessions | Total number of sessions established to this database. Type: int Unit: count |
sessions_abandoned | Number of database sessions to this database that were terminated because connection to the client was lost. Type: int Unit: count |
sessions_fatal | Number of database sessions to this database that were terminated by fatal errors. Type: int Unit: count |
sessions_killed | Number of database sessions to this database that were terminated by operator intervention. Type: int Unit: count |
temp_bytes | The amount of data written to temporary files by queries in this database. Type: int Unit: count |
temp_files | The number of temporary files created by queries in this database. Type: int Unit: count |
tup_deleted | The number of rows deleted by queries in this database. Type: int Unit: count |
tup_fetched | The number of rows fetched by queries in this database. Type: int Unit: count |
tup_inserted | The number of rows inserted by queries in this database. Type: int Unit: count |
tup_returned | The number of rows returned by queries in this database. Type: int Unit: count |
tup_updated | The number of rows updated by queries in this database. Type: int Unit: count |
wraparound | The number of transactions that can occur until a transaction wraparound. Type: float Unit: count |
xact_commit | The number of transactions that have been committed in this database. Type: int Unit: count |
xact_rollback | The number of transactions that have been rolled back in this database. Type: int Unit: count |
postgresql_lock
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
locktype ( tag ) |
The lock type |
mode ( tag ) |
The lock mode |
schema ( tag ) |
The schema name |
server ( tag ) |
The address of the server. The value is host:port |
table ( tag ) |
The table name |
lock_count | The number of locks active for this database. Type: int Unit: count |
postgresql_index
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
pg_index ( tag ) |
The index name |
schema ( tag ) |
The schema name |
server ( tag ) |
The address of the server. The value is host:port |
table ( tag ) |
The table name |
idx_scan | The number of index scans initiated on this table, tagged by index. Type: int Unit: count |
idx_tup_fetch | The number of live rows fetched by index scans. Type: int Unit: count |
idx_tup_read | The number of index entries returned by scans on this index. Type: int Unit: count |
postgresql_replication
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
server ( tag ) |
The address of the server. The value is host:port |
replication_delay | The current replication delay in seconds. Only available with postgresql 9.1 and newer.Type: int Unit: time,s |
replication_delay_bytes | The current replication delay in bytes. Only available with postgresql 9.2 and newer.Type: int Unit: digital,B |
postgresql_replication_slot
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
server ( tag ) |
The address of the server. The value is host:port |
slot_name ( tag ) |
The replication slot name |
slot_type ( tag ) |
The replication slot type |
spill_bytes | Amount of decoded transaction data spilled to disk while performing decoding of changes from WAL for this slot. This and other spill counters can be used to gauge the I/O which occurred during logical decoding and allow tuning logical_decoding_work_mem . Only available with PostgreSQL 14 and newer.Type: int Unit: digital,B |
spill_count | Number of times transactions were spilled to disk while decoding changes from WAL for this slot. This counter is incremented each time a transaction is spilled, and the same transaction may be spilled multiple times. Only available with PostgreSQL 14 and newer. Type: int Unit: count |
spill_txns | Number of transactions spilled to disk once the memory used by logical decoding to decode changes from WAL has exceeded logical_decoding_work_mem . The counter gets incremented for both top-level transactions and subtransactions. Only available with PostgreSQL 14 and newer.Type: int Unit: count |
stream_bytes | Amount of transaction data decoded for streaming in-progress transactions to the decoding output plugin while decoding changes from WAL for this slot. This and other streaming counters for this slot can be used to tune logical_decoding_work_mem . Only available with PostgreSQL 14 and newer.Type: int Unit: digital,B |
stream_count | Number of times in-progress transactions were streamed to the decoding output plugin while decoding changes from WAL for this slot. This counter is incremented each time a transaction is streamed, and the same transaction may be streamed multiple times. Only available with PostgreSQL 14 and newer. Type: int Unit: count |
stream_txns | Number of in-progress transactions streamed to the decoding output plugin after the memory used by logical decoding to decode changes from WAL for this slot has exceeded logical_decoding_work_mem . Streaming only works with top-level transactions (subtransactions can't be streamed independently), so the counter is not incremented for subtransactions. Only available with PostgreSQL 14 and newer.Type: int Unit: count |
total_bytes | Amount of transaction data decoded for sending transactions to the decoding output plugin while decoding changes from WAL for this slot. Note that this includes data that is streamed and/or spilled. Only available with PostgreSQL 14 and newer. Type: int Unit: digital,B |
total_txns | Number of decoded transactions sent to the decoding output plugin for this slot. This counts top-level transactions only, and is not incremented for subtransactions. Note that this includes the transactions that are streamed and/or spilled. Only available with PostgreSQL 14 and newer. Type: int Unit: count |
postgresql_size
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
schema ( tag ) |
The schema name |
server ( tag ) |
The address of the server. The value is host:port |
table ( tag ) |
The table name |
index_size | The total disk space used by indexes attached to the specified table. Type: int Unit: digital,B |
table_size | The total disk space used by the specified table with TOAST data. Free space map and visibility map are not included. Type: int Unit: digital,B |
total_size | The total disk space used by the table, including indexes and TOAST data. Type: int Unit: digital,B |
postgresql_statio
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
schema ( tag ) |
The schema name |
server ( tag ) |
The address of the server. The value is host:port |
table ( tag ) |
The table name |
heap_blks_hit | The number of buffer hits in this table. Type: int Unit: count |
heap_blks_read | The number of disk blocks read from this table. Type: int Unit: count |
idx_blks_hit | The number of buffer hits in all indexes on this table. Type: int Unit: count |
idx_blks_read | The number of disk blocks read from all indexes on this table. Type: int Unit: count |
tidx_blks_hit | The number of buffer hits in this table's TOAST table index. Type: int Unit: count |
tidx_blks_read | The number of disk blocks read from this table's TOAST table index. Type: int Unit: count |
toast_blks_hit | The number of buffer hits in this table's TOAST table. Type: int Unit: count |
toast_blks_read | The number of disk blocks read from this table's TOAST table. Type: int Unit: count |
postgresql_stat
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
schema ( tag ) |
The schema name |
server ( tag ) |
The address of the server. The value is host:port |
table ( tag ) |
The table name |
analyze_count | The number of times this table has been manually analyzed. Type: int Unit: count |
autoanalyze_count | The number of times this table has been analyzed by the autovacuum daemon.Type: int Unit: count |
autovacuum_count | The number of times this table has been vacuumed by the autovacuum daemon.Type: int Unit: count |
idx_scan | The number of index scans initiated on this table, tagged by index. Type: int Unit: count |
idx_tup_fetch | The number of live rows fetched by index scans. Type: int Unit: count |
n_dead_tup | The estimated number of dead rows. Type: int Unit: count |
n_live_tup | The estimated number of live rows. Type: int Unit: count |
n_tup_del | The number of rows deleted by queries in this database. Type: int Unit: count |
n_tup_hot_upd | The number of rows HOT updated, meaning no separate index update was needed. Type: int Unit: count |
n_tup_ins | The number of rows inserted by queries in this database. Type: int Unit: count |
n_tup_upd | The number of rows updated by queries in this database. Type: int Unit: count |
seq_scan | The number of sequential scans initiated on this table. Type: int Unit: count |
seq_tup_read | The number of live rows fetched by sequential scans. Type: int Unit: count |
vacuum_count | The number of times this table has been manually vacuumed. Type: int Unit: count |
postgresql_slru
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
name ( tag ) |
The name of the SLRU |
server ( tag ) |
The address of the server. The value is host:port |
blks_exists | Number of blocks checked for existence for this SLRU (simple least-recently-used) cache.Type: int Unit: count |
blks_hit | Number of times disk blocks were found already in the SLRU (simple least-recently-used.)Type: int Unit: count |
blks_read | Number of disk blocks read for this SLRU (simple least-recently-used) cache. SLRU caches are created with a fixed number of pages.Type: int Unit: count |
blks_written | Number of disk blocks written for this SLRU (simple least-recently-used) cache.Type: int Unit: count |
blks_zeroed | Number of blocks zeroed during initializations of SLRU (simple least-recently-used) cache.Type: int Unit: count |
flushes | Number of flush of dirty data for this SLRU (simple least-recently-used) cache.Type: int Unit: count |
truncates | Number of truncates for this SLRU (simple least-recently-used) cache.Type: int Unit: count |
postgresql_bgwriter
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
server ( tag ) |
The address of the server. The value is host:port |
buffers_alloc | The number of buffers allocated Type: int Unit: count |
buffers_backend | The number of buffers written directly by a backend. Type: int Unit: count |
buffers_backend_fsync | The of times a backend had to execute its own fsync call instead of the background writer. Type: int Unit: count |
buffers_checkpoint | The number of buffers written during checkpoints. Type: int Unit: count |
buffers_clean | The number of buffers written by the background writer. Type: int Unit: count |
checkpoint_sync_time | The total amount of checkpoint processing time spent synchronizing files to disk. Type: float Unit: time,ms |
checkpoint_write_time | The total amount of checkpoint processing time spent writing files to disk. Type: float Unit: time,ms |
checkpoints_req | The number of requested checkpoints that were performed. Type: int Unit: count |
checkpoints_timed | The number of scheduled checkpoints that were performed. Type: int Unit: count |
maxwritten_clean | The number of times the background writer stopped a cleaning scan due to writing too many buffers. Type: int Unit: count |
postgresql_connection
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
server ( tag ) |
The address of the server. The value is host:port |
max_connections | The maximum number of client connections allowed to this database. Type: float Unit: count |
percent_usage_connections | The number of connections to this database as a fraction of the maximum number of allowed connections. Type: float Unit: count |
postgresql_conflict
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
server ( tag ) |
The address of the server. The value is host:port |
confl_bufferpin | Number of queries in this database that have been canceled due to pinned buffers. Type: int Unit: count |
confl_deadlock | Number of queries in this database that have been canceled due to deadlocks. Type: int Unit: count |
confl_lock | Number of queries in this database that have been canceled due to dropped tablespaces. This will occur when a temp_tablespace is dropped while being used on a standby.Type: int Unit: count |
confl_snapshot | Number of queries in this database that have been canceled due to old snapshots. Type: int Unit: count |
confl_tablespace | Number of queries in this database that have been canceled due to dropped tablespaces. This will occur when a temp_tablespace is dropped while being used on a standby.Type: int Unit: count |
postgresql_archiver
¶
Tags & Fields | Description |
---|---|
db ( tag ) |
The database name |
server ( tag ) |
The address of the server. The value is host:port |
archived_count | Number of WAL files that have been successfully archived. Type: int Unit: count |
archived_failed_count | Number of failed attempts for archiving WAL files. Type: int Unit: count |
collector
¶
Tags & Fields | Description |
---|---|
instance ( tag ) |
Server addr of the instance |
job ( tag ) |
Server name of the instance |
up | Type: int Unit: - |
Object¶
database
¶
PostgreSQL object metrics( Version-1.76.0)
Tags & Fields | Description |
---|---|
database_type ( tag ) |
The type of the database. The value is PostgreSQL |
host ( tag ) |
The hostname of the PostgreSQL server |
name ( tag ) |
The name of the database. The value is host:port in default |
port ( tag ) |
The port of the PostgreSQL server |
server ( tag ) |
The server address of the PostgreSQL server. The value is host:port |
version ( tag ) |
The version of the PostgreSQL server |
avg_query_time | The average time taken by a query to execute Type: float Unit: timeStamp,usec |
message | Summary of database information Type: string Unit: - |
qps | The number of queries executed by the database per second Type: float Unit: gauge |
slow_queries | The number of queries that have taken more than long_query_time seconds. This counter increments regardless of whether the slow query log is enabled. Type: int Unit: count |
slow_query_log | Whether the slow query log is enabled. The value can OFF to disable the log or ON to enable the log. Type: string Unit: - |
tps | The number of transactions executed by the database per second Type: float Unit: gauge |
uptime | The number of seconds that the server has been up Type: int Unit: time,s |
Structure of the message
field¶
The basic structure of the message
field is as follows:
{
"setting": {
"DateStyle":"ISO, MDY",
...
},
"databases": [ # databases information
{
"name": "db1",
"encoding": "utf8",
"owner": "datakit",
"schemas": [ # schemas information
{
"name": "schema1",
"owner": "datakit",
"tables": [ # tables information
{
"name": "table1",
"columns": [], # columns information
"indexes": [], # indexes information
"foreign_keys": [], # foreign keys information
}
...
]
},
...
]
}
...
]
}
setting
¶
The data in the setting
field is sourced from the pg_settings
system view and is used to display the configuration parameter information of the current database. For more details, please refer to the PostgreSQL documentation.
databases
¶
The databases
field stores information about all databases on the PostgreSQL
server. The detailed information of each database is shown in the following table:
Field Name | Description | Type |
---|---|---|
name |
Database name | string |
encoding |
Database encoding | string |
owner |
Role name | string |
description |
Description text | string |
schemas |
List containing schema information |
list |
The schemas
field contains information about all the schemas
in the database. The information for each schema
is as follows:
Field Name | Description | Type |
---|---|---|
name |
schema name |
string |
owner |
Role name | string |
tables |
List containing table information |
list |
The tables
field contains information about all the tables
in the database. The information for each table
is as follows:
Field Name | Description | Type |
---|---|---|
name |
table name |
string |
owner |
table owner |
string |
has_indexes |
Whether there are indexes | bool |
has_partitions |
Whether there are partitions | bool |
toast_table |
toast table name |
string |
partition_key |
Partition key | string |
num_partitions |
Number of partitions | int64 |
foreign_keys |
List containing foreign key information | list |
columns |
List containing column information |
list |
indexes |
List containing index information |
list |
tables.columns
The columns
field contains information about all the columns
in the tables
. The information for each column
is as follows:
Field Name | Description | Type |
---|---|---|
name |
Name | string |
data_type |
Data type | string |
nullable |
Whether column can be null |
bool |
default |
Default value | string |
tables.indexes
The indexes
field contains information about all the indexes
in the database tables. The information for each index
is as follows:
Field Name | Description | Type |
---|---|---|
name |
Name | string |
columns |
Columns included in the index | list |
index_type |
Index type | string |
definition |
Index definition | string |
is_unique |
Whether unique | bool |
is_primary |
Whether primary | bool |
is_exclusion |
Whether exclusion constraint index | bool |
is_immediate |
Whether to check constraints immediately after each statement execution | bool |
is_valid |
Whether it is valid | bool |
is_clustered |
Whether it is a clustered index | bool |
is_checkxmin |
Whether to check xmin |
bool |
is_ready |
Whether it is ready | bool |
is_live |
Whether it is live | bool |
is_replident |
Whether it is a row identifier index | bool |
is_partial |
Whether it is a partial index | bool |
The indexes.columns
field, which contains information about the columns included in the index
, has the following details for each column:
Field Name | Description | Type |
---|---|---|
name |
Name | string |
tables.foreign_keys
The foreign_keys
field contains information about all the foreign_keys
in the database tables. The information for each foreign_key
is as follows:
Field Name | Description | Type |
---|---|---|
name |
Name | string |
definition |
Foreign key definition | string |
constraint_schema |
Foreign key schema | string |
column_names |
Foreign key column names | string |
referenced_table_schema |
Referenced table schema | string |
referenced_table_name |
Referenced table name | string |
referenced_column_names |
Referenced column names | string |
update_action |
Cascade update rule (such as CASCADE, RESTRICT) | string |
delete_action |
Cascade delete rule (such as CASCADE, SET NULL) | string |
Log Collection¶
- PostgreSQL logs are output to
stderr
by default. To open file logs, configure them in postgresql's configuration file/etc/postgresql/<VERSION>/main/postgresql.conf
as follows:
logging_collector = on # Enable log writing to files
log_directory = 'pg_log' # Set the file storage directory, absolute path or relative path (relative PGDATA)
log_filename = 'pg.log' # Log file name
log_statement = 'all' # Record all queries
#log_duration = on
log_line_prefix= '%m [%p] %d [%a] %u [%h] %c ' # 日志行前缀
log_file_mode = 0644
# For Windows
#log_destination = 'eventlog'
For more configuration, please refer to the doc。
- The PostgreSQL collector does not have log collection enabled by default. You can open
files
inconf.d/db/postgresql.conf
and write to the absolute path of the PostgreSQL log file. For example:
When log collection is turned on, a log with a log source
of postgresql
is generated by default.
Notices:
- Log collection only supports logs on hosts where DataKit is installed.
Log Pipeline Cut¶
The original log is
2021-05-31 15:23:45.110 CST [74305] test [pgAdmin 4 - DB:postgres] postgres [127.0.0.1] 60b48f01.12241 LOG: statement:
SELECT psd.*, 2^31 - age(datfrozenxid) as wraparound, pg_database_size(psd.datname) as pg_database_size
FROM pg_stat_database psd
JOIN pg_database pd ON psd.datname = pd.datname
WHERE psd.datname not ilike 'template%' AND psd.datname not ilike 'rdsadmin'
AND psd.datname not ilike 'azure_maintenance' AND psd.datname not ilike 'postgres'
Description of the cut field:
Field name | Field Value | Description |
---|---|---|
application_name |
pgAdmin 4 - DB:postgres |
The name of the application connecting to the current database |
db_name |
test |
Database accessed |
process_id |
74305 |
The client process ID of the current connection |
remote_host |
127.0.0.1 |
Address of the client |
session_id |
60b48f01.12241 |
ID of the current session |
user |
postgres |
Current Access User Name |
status |
LOG |
Current log level (LOG,ERROR,FATAL,PANIC,WARNING,NOTICE,INFO) |
time |
1622445825110000000 |
Log generation time |
FAQ¶
Missing metrics¶
For metrics postgresql_lock/postgresql_stat/postgresql_index/postgresql_size/postgresql_statio
, the relations
field in the configuration file needs to be enabled. If some of these metrics are partially missing, it may be because there is no data for the relevant metrics.