Oracle
Oracle monitoring metrics collection has the following data collection functions.
- process correlation
- Tablespace related data
- system data collection
- Custom query data collection
Already tested version:
- Oracle 19c
- Oracle 12c
- Oracle 11g
Starting from DataKit 1.32.0 版本, support is provided for collecting Oracle metrics using both direct collection through DataKit and via external collectors.
Configuration¶
Precondition¶
- Create a monitoring account
If you wish to monitor only a single PDB or non-CDB instance, a local user is sufficient:
-- Create the datakit user. Replace the password placeholder with a secure password.
CREATE USER datakit IDENTIFIED BY <PASSWORD>;
-- Grant access to the datakit user.
GRANT CONNECT, CREATE SESSION TO datakit;
GRANT SELECT_CATALOG_ROLE to datakit;
GRANT SELECT ON DBA_TABLESPACE_USAGE_METRICS TO datakit;
GRANT SELECT ON DBA_TABLESPACES TO datakit;
GRANT SELECT ON DBA_USERS TO datakit;
GRANT SELECT ON SYS.DBA_DATA_FILES TO datakit;
GRANT SELECT ON V_$ACTIVE_SESSION_HISTORY TO datakit;
GRANT SELECT ON V_$ARCHIVE_DEST TO datakit;
GRANT SELECT ON V_$ASM_DISKGROUP TO datakit;
GRANT SELECT ON V_$DATABASE TO datakit;
GRANT SELECT ON V_$DATAFILE TO datakit;
GRANT SELECT ON V_$INSTANCE TO datakit;
GRANT SELECT ON V_$LOG TO datakit;
GRANT SELECT ON V_$OSSTAT TO datakit;
GRANT SELECT ON V_$PGASTAT TO datakit;
GRANT SELECT ON V_$PROCESS TO datakit;
GRANT SELECT ON V_$RECOVERY_FILE_DEST TO datakit;
GRANT SELECT ON V_$RESTORE_POINT TO datakit;
GRANT SELECT ON V_$SESSION TO datakit;
GRANT SELECT ON V_$SGASTAT TO datakit;
GRANT SELECT ON V_$SYSMETRIC TO datakit;
GRANT SELECT ON V_$SYSTEM_PARAMETER TO datakit;
If you want to monitor table spaces from the CDB and all PDBs, you need a common user with the appropriate permissions:
-- Create the datakit user. Replace the password placeholder with a secure password.
CREATE USER datakit IDENTIFIED BY <PASSWORD>;
-- Grant access to the datakit user.
ALTER USER datakit SET CONTAINER_DATA=ALL CONTAINER=CURRENT;
GRANT CONNECT, CREATE SESSION TO datakit;
GRANT SELECT_CATALOG_ROLE to datakit;
GRANT SELECT ON v_$instance TO datakit;
GRANT SELECT ON v_$database TO datakit;
GRANT SELECT ON v_$sysmetric TO datakit;
GRANT SELECT ON v_$system_parameter TO datakit;
GRANT SELECT ON v_$session TO datakit;
GRANT SELECT ON v_$recovery_file_dest TO datakit;
GRANT SELECT ON v_$active_session_history TO datakit;
GRANT SELECT ON v_$osstat TO datakit;
GRANT SELECT ON v_$restore_point TO datakit;
GRANT SELECT ON v_$process TO datakit;
GRANT SELECT ON v_$datafile TO datakit;
GRANT SELECT ON v_$pgastat TO datakit;
GRANT SELECT ON v_$sgastat TO datakit;
GRANT SELECT ON v_$log TO datakit;
GRANT SELECT ON v_$archive_dest TO datakit;
GRANT SELECT ON v_$asm_diskgroup TO datakit;
GRANT SELECT ON sys.dba_data_files TO datakit;
GRANT SELECT ON DBA_TABLESPACES TO datakit;
GRANT SELECT ON DBA_TABLESPACE_USAGE_METRICS TO datakit;
GRANT SELECT ON DBA_USERS TO datakit;
Attention
Some of the SQL above may lead to non-existent failure due to diverse Oracle version, just ignore it.
- Deploy dependency package
If you are using DataKit direct collection, you may skip this step.
Select the appropriate installation package based on the operating system and Oracle version, refer to here. For example:
wget https://download.oracle.com/otn_software/linux/instantclient/2110000/instantclient-basiclite-linux.x64-21.10.0.0.0dbru.zip
unzip instantclient-basiclite-linux.x64-21.10.0.0.0dbru.zip
Add the extracted directory file path to the LD_LIBRARY_PATH
environment variable path in the following configuration information.
You can also download our pre-prepared dependency package directly:
wget https://static.truewatch.com/otn_software/instantclient/instantclient-basiclite-linux.x64-21.10.0.0.0dbru.zip \
-O /usr/local/datakit/externals/instantclient-basiclite-linux.zip \
&& unzip /usr/local/datakit/externals/instantclient-basiclite-linux.zip -d /opt/oracle \
&& mv /opt/oracle/instantclient_21_10 /opt/oracle/instantclient;
wget https://download.oracle.com/otn_software/linux/instantclient/1919000/instantclient-basiclite-linux.arm64-19.19.0.0.0dbru.zip
unzip instantclient-basiclite-linux.arm64-19.19.0.0.0dbru.zip
Add the extracted directory file path to the LD_LIBRARY_PATH
environment variable path in the following configuration information.
You can also download our pre-prepared dependency package directly:
wget https://static.truewatch.com/otn_software/instantclient/instantclient-basiclite-linux.arm64-19.19.0.0.0dbru.zip \
-O /usr/local/datakit/externals/instantclient-basiclite-linux.zip \
&& unzip /usr/local/datakit/externals/instantclient-basiclite-linux.zip -d /opt/oracle \
&& mv /opt/oracle/instantclient_19_19 /opt/oracle/instantclient;
For some OS need to install additional dependent libraries:
Configuration¶
Go to the conf.d/db
directory under the DataKit installation directory, copy oracle.conf.sample
and name it oracle.conf
. Examples are as follows:
[[inputs.oracle]]
# host name
host = "localhost"
## port
port = 1521
## user name
user = "datakit"
## password
password = "<PASS>"
## service
service = "XE"
## interval
interval = "10s"
## connection timeout
connect_timeout = "30s"
## slow query time threshold defined. If larger than this, the executed sql will be reported.
slow_query_time = "0s"
## Metric name in metric_exclude_list will not be collected.
metric_exclude_list = [""]
## Set true to enable election
election = true
## collect object
[inputs.oracle.object]
# Set true to enable collecting objects
enabled = true
# interval to collect oracle object which will be greater than collection interval
interval = "600s"
## Run a custom SQL query and collect corresponding metrics.
# [[inputs.oracle.custom_queries]]
# sql = '''
# SELECT
# GROUP_ID, METRIC_NAME, VALUE
# FROM GV$SYSMETRIC
# '''
# metric = "oracle_custom"
# tags = ["GROUP_ID", "METRIC_NAME"]
# fields = ["VALUE"]
# interval = "10s"
[inputs.oracle.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
Once configured, restart DataKit.
The collector can now be turned on by ConfigMap Injection Collector Configuration.
Example for external collector is as follows:
[[inputs.external]]
daemon = true
name = "oracle"
cmd = "/usr/local/datakit/externals/oracle"
## Set true to enable election
election = true
## Modify below if necessary.
## The password use environment variable named "ENV_INPUT_ORACLE_PASSWORD".
args = [
"--interval" , "1m" ,
"--host" , "<your-oracle-host>" ,
"--port" , "1521" ,
"--username" , "<oracle-user-name>" ,
"--service-name" , "<oracle-service-name>" ,
"--slow-query-time" , "0s" ,
"--log" , "/var/log/datakit/oracle.log" ,
]
envs = [
"ENV_INPUT_ORACLE_PASSWORD=<oracle-password>",
"LD_LIBRARY_PATH=/opt/oracle/instantclient:$LD_LIBRARY_PATH",
]
[inputs.external.tags]
# some_tag = "some_value"
# more_tag = "some_other_value"
## Run a custom SQL query and collect corresponding metrics.
# [[inputs.external.custom_queries]]
# sql = '''
# SELECT
# GROUP_ID, METRIC_NAME, VALUE
# FROM GV$SYSMETRIC
# '''
# metric = "oracle_custom"
# tags = ["GROUP_ID", "METRIC_NAME"]
# fields = ["VALUE"]
#############################
# Parameter Description (Marked with * is required field)
#############################
# *--interval : Collect interval (Default is 1m).
# *--host : Oracle instance address (IP).
# *--port : Oracle listen port (Default is 1521).
# *--username : Oracle username.
# *--service-name : Oracle service name.
# *--slow-query-time : Oracle slow query time threshold defined. If larger than this, the executed sql will be reported.
# *--log : Collector log path.
# *ENV_INPUT_ORACLE_PASSWORD : Oracle password.
Tip
The configuration above would shows in the process list(including password). If want to hide the password, can use the environment variable ENV_INPUT_ORACLE_PASSWORD
, like below:
The environment variable has highest priority, which means if existed that environment variable, the value in the environment variable will always treated as the password.
Metric¶
For all of the following data collections, the global election tags will added automatically, we can add extra tags in [inputs.oracle.tags]
if needed:
oracle_process
¶
Tags & Fields | Description |
---|---|
host ( tag ) |
Host name |
oracle_server ( tag ) |
Server addr. Deprecated. Please use server |
oracle_service ( tag ) |
Server service |
pdb_name ( tag ) |
PDB name |
program ( tag ) |
Program in progress |
server ( tag ) |
The address of the server. The value is host:port |
pga_alloc_mem | PGA memory allocated by process Type: float Unit: digital,B |
pga_freeable_mem | PGA memory freeable by process Type: float Unit: digital,B |
pga_max_mem | PGA maximum memory ever allocated by process Type: float Unit: digital,B |
pga_used_mem | PGA memory used by process Type: float Unit: digital,B |
pid | Oracle process identifier Type: int Unit: N/A |
oracle_tablespace
¶
Tags & Fields | Description |
---|---|
host ( tag ) |
Host name |
oracle_server ( tag ) |
Server addr. Deprecated. Please use server |
oracle_service ( tag ) |
Server service |
pdb_name ( tag ) |
PDB name |
server ( tag ) |
The address of the server. The value is host:port |
tablespace_name ( tag ) |
Table space name |
in_use | Percentage of used space,as a function of the maximum possible Tablespace size Type: float Unit: percent,percent |
off_use | Total space consumed by the Tablespace, in database blocks Type: float Unit: digital,B |
ts_size | Table space size Type: float Unit: digital,B |
used_space | Used space Type: float Unit: digital,B |
oracle_system
¶
You have to wait for a few minutes to see these metrics when your running Oracle database's version is earlier than 12c.
Tags & Fields | Description |
---|---|
host ( tag ) |
Host name |
oracle_server ( tag ) |
Server addr. Deprecated. Please use server |
oracle_service ( tag ) |
Server service |
pdb_name ( tag ) |
PDB name |
server ( tag ) |
The address of the server. The value is host:port |
active_sessions | Number of active sessions Type: float Unit: count |
buffer_cachehit_ratio | Ratio of buffer cache hits Type: float Unit: percent,percent |
cache_blocks_corrupt | Corrupt cache blocks Type: float Unit: count |
cache_blocks_lost | Lost cache blocks Type: float Unit: count |
consistent_read_changes | Consistent read changes per second Type: float Unit: count |
consistent_read_gets | Consistent read gets per second Type: float Unit: count |
cursor_cachehit_ratio | Ratio of cursor cache hits Type: float Unit: percent,percent |
database_cpu_time_ratio | Database CPU time ratio Type: float Unit: percent,percent |
database_wait_time_ratio | Memory sorts per second Type: float Unit: percent,percent |
db_block_changes | DB block changes per second Type: float Unit: count |
db_block_gets | DB block gets per second Type: float Unit: count |
disk_sorts | Disk sorts per second Type: float Unit: count |
enqueue_timeouts | Enqueue timeouts per second Type: float Unit: count |
execute_without_parse | Execute without parse ratio Type: float Unit: count |
gc_cr_block_received | GC CR block received Type: float Unit: count |
host_cpu_utilization | Host CPU utilization (%) Type: float Unit: percent,percent |
library_cachehit_ratio | Ratio of library cache hits Type: float Unit: percent,percent |
logical_reads | Logical reads per second Type: float Unit: count |
logons | Number of logon attempts Type: float Unit: count |
memory_sorts_ratio | Memory sorts ratio Type: float Unit: percent,percent |
pga_over_allocation_count | Over-allocating PGA memory count Type: float Unit: count |
physical_reads | Physical reads per second Type: float Unit: count |
physical_reads_direct | Physical reads direct per second Type: float Unit: count |
physical_writes | Physical writes per second Type: float Unit: count |
redo_generated | Redo generated per second Type: float Unit: count |
redo_writes | Redo writes per second Type: float Unit: count |
rows_per_sort | Rows per sort Type: float Unit: count |
service_response_time | Service response time Type: float Unit: time,ms |
session_count | Session count Type: float Unit: count |
session_limit_usage | Session limit usage Type: float Unit: percent,percent |
shared_pool_free | Shared pool free memory % Type: float Unit: percent,percent |
soft_parse_ratio | Soft parse ratio Type: float Unit: percent,percent |
sorts_per_user_call | Sorts per user call Type: float Unit: count |
temp_space_used | Temp space used Type: float Unit: digital,B |
uptime | Instance uptime Type: int Unit: time,s |
user_rollbacks | Number of user rollbacks Type: float Unit: count |
oracle_waiting_event
¶
Tags & Fields | Description |
---|---|
event ( tag ) |
Event name |
event_type ( tag ) |
Event type, such as USER/BACKGROUND |
oracle_server ( tag ) |
Server addr. Deprecated. Please use server |
program ( tag ) |
Program(process) name that waiting the event |
server ( tag ) |
The address of the server. The value is host:port |
username ( tag ) |
Oracle username that waiting the event |
count | Waiting event count Type: int Unit: count |
oracle_locked_session
¶
Tags & Fields | Description |
---|---|
event ( tag ) |
Locked session that waiting the specified event name |
host ( tag ) |
Host name |
oracle_server ( tag ) |
Server addr. Deprecated. Please use server |
oracle_service ( tag ) |
Server service |
server ( tag ) |
The address of the server. The value is host:port |
waiting_session_count | Locked session count Type: int Unit: count |
collector
¶
Tags & Fields | Description |
---|---|
instance ( tag ) |
Server addr of the instance |
job ( tag ) |
Server name of the instance |
up | Type: int Unit: - |
Object¶
database
¶
Oracle object metrics( Version-1.77.0)
Tags & Fields | Description |
---|---|
database_type ( tag ) |
The type of the database. The value is Oracle |
host ( tag ) |
The hostname of the Oracle server |
name ( tag ) |
The name of the database. The value is host:port in default |
port ( tag ) |
The port of the Oracle server |
server ( tag ) |
The address of the server. The value is host:port |
version ( tag ) |
The version of the Oracle server |
avg_query_time | The average time taken by a query to execute Type: float Unit: timeStamp,usec |
qps | The number of queries executed by the database per second Type: float Unit: gauge |
slow_queries | The number of queries that have taken more than slow_query_time .Type: int Unit: count |
slow_query_log | Whether the slow query log is enabled according to whether slow_query_time is greater than 0 . The value can be OFF to disable the log or ON to enable the log.Type: string Unit: - |
tps | The number of transactions executed by the database per second Type: float Unit: gauge |
uptime | The number of seconds that the server has been up Type: int Unit: time,s |
Slow query¶
DataKit could reports the SQLs, those executed time exceeded the threshold time defined by user, to TrueWatch, displays in the Logs
side bar, the source name is oracle_log
.
This function is disabled by default, user could enabling it by modify DataKit's Oracle configuration like followings:
Change the value of the field slow_query_time
from 0s
to the threshold time, minimal value is 1 millsecond. Generally, recommand it to 10s
.
Fields description
avg_elapsed
: The SQL executed average time cost.username
: The user who executed the SQL.failed_obfuscate
:SQL obfuscated failed reason. Only exist when SQL obfuscated failed. Original SQL will be reported when SQL obfuscated failed.- If
slow-query-time
is empty or less than 1 millisecond, this function is disabled, which is also the default state. - The SQL would not display here when NOT executed completed.
For more fields, see here.
oracle_log
¶
For full and detailed field into, see here
Tags & Fields | Description |
---|---|
action ( tag ) |
Contains the name of the action that was executing when the SQL statement was first parsed as set by calling DBMS_APPLICATION_INFO.SET_ACTION |
command_type ( tag ) |
Oracle command type definition |
module ( tag ) |
Contains the name of the module that was executing when the SQL statement was first parsed as set by calling DBMS_APPLICATION_INFO.SET_MODULE |
oracle_server ( tag ) |
Server addr. Deprecated. Please use server |
server ( tag ) |
The address of the server. The value is host:port |
sql_id ( tag ) |
SQL identifier of the parent cursor in the library cache |
status ( tag ) |
Log level, always warning here |
application_wait_time | Application wait time (in microseconds) Type: int Unit: time,μs |
avg_elapsed | Average elapsed time of executions(elapsed_time/executions )Type: int Unit: time,μs |
buffer_gets | Sum of buffer gets over all child cursors Type: int Unit: count |
cluster_wait_time | Cluster wait time (in microseconds) Type: int Unit: time,μs |
concurrency_wait_time | Concurrency wait time (in microseconds) Type: int Unit: time,μs |
cpu_time | CPU time (in microseconds) used by this cursor for parsing, executing, and fetching Type: int Unit: time,μs |
disk_reads | Sum of the number of disk reads over all child cursors Type: int Unit: count |
elapsed_time | Elapsed time (in microseconds) used by this cursor for parsing, executing, and fetching. If the cursor uses parallel execution, then ELAPSED_TIME is the cumulative time...Type: int Unit: time,μs |
executions | Total number of executions, totalled over all the child cursors Type: int Unit: count |
last_active_time | Time at which the query plan was last active Type: string Unit: N/A |
message | JSON dump of all queried fields of table V$SQLAREA Type: string Unit: N/A |
parse_calls | Sum of all parse calls to all the child cursors under this parent Type: int Unit: count |
parsing_schema_name | Schema name that was used to parse this child cursor Type: string Unit: N/A |
plan_hash_value | Numeric representation of the current SQL plan for this cursor. Comparing one PLAN_HASH_VALUE to another easily identifies whether or not two plans are the same (rather than comparing the two plans line by line).Type: int Unit: N/A |
rows_processed | Total number of rows processed on behalf of this SQL statement Type: int Unit: count |
sorts | Sum of the number of sorts that were done for all the child cursors Type: int Unit: count |
sql_fulltext | All characters of the SQL text for the current cursor Type: string Unit: N/A |
user_io_wait_time | User I/O Wait Time (in microseconds) Type: int Unit: time,μs |
username | Name of the user Type: string Unit: N/A |
FAQ¶
How to view the log of Oracle Collector?¶
Because the Oracle collector is an external collector, its logs by default are stored separately in [DataKit-install-path]/externals/oracle.log.
In addition, the log path could modified by using --log
parameter in configuration file.
Since DataKit version 1.32.0, the Oracle collector is no longer an external collector, and its logs are located together with the DataKit's own logs (under Linux, at /var/log/datakit/log).
Why is there no data displayed in monitor?¶
There are several possible reasons:
- Oracle dynamic library dependencies are problematic
Even though you may already have a corresponding Oracle package on your machine, it is recommended to use the dependency package specified in the above document and ensure that its installation path is consistent with the path specified by LD_LIBRARY_PATH
.
- There is a problem with the glibc version
As the Oracle collector is compiled independently and CGO is turned on, its runtime requires glibc dependencies. On Linux, you can check whether there is any problem with the glibc dependencies of the current machine by the following command:
$ ldd <DataKit-install-path>/externals/oracle
linux-vdso.so.1 (0x00007ffed33f9000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f70144e1000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f70144be000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f70142cc000)
/lib64/ld-linux-x86-64.so.2 (0x00007f70144fc000)
If the following information is reported, it is basically caused by the low glibc version on the current machine:
- Oracle Collector is only available on Linux x86_64/ARM64 architecture DataKit and is not supported on other platforms.
This means that the Oracle collector can only run on x86_64/ARM64 Linux, and no other platform can run the current Oracle collector.
oracle_system
measurements not found?¶
It needs to taking 1 minute to see them after the database system starting up.