AWS MemoryDB¶
Use the "TrueWatch Cloud Sync" series of script packages in the script market to synchronize cloud monitoring and cloud asset data to TrueWatch.
Configuration¶
Install Func¶
We recommend activating DataFlux Func (Automata) under TrueWatch Integration - Extensions: all prerequisites are installed automatically, so you can proceed directly to the script installation.
If you deploy Func manually, refer to Manual Func Deployment.
Install Script¶
Note: Prepare an Amazon AK with the required permissions in advance (for simplicity, you can grant the global read-only permission `ReadOnlyAccess`).
Activate Script for Hosted Version¶
- Log in to the TrueWatch console
- Click on the 【Integration】 menu, select 【Cloud Account Management】
- Click on 【Add Cloud Account】, select 【AWS】, and fill in the required information on the interface. If you have already configured the cloud account information before, you can skip this step.
- Click on 【Test】, and if the test is successful, click on 【Save】. If the test fails, please check if the relevant configuration information is correct and test again.
- Click on 【Cloud Account Management】, and you can see the added cloud account in the list. Click on the corresponding cloud account to enter the details page.
- Click on the 【Integration】 button on the cloud account details page, and find `AWS MemoryDB` under the `Not Installed` list. Click on the 【Install】 button, and complete the installation in the interface that pops up.
Manually Activate Script¶
- Log in to the Func console, click on 【Script Market】, enter the TrueWatch script market, and search for `integration_aws_memorydb`.
- Click on 【Install】, and enter the corresponding parameters: AWS AK ID, AK Secret, and account name.
- Click on 【Deploy Startup Script】. The system will automatically create the `Startup` script set and configure the corresponding startup script.
- After activation, you can see the corresponding automatic trigger configuration in 「Manage / Automatic Trigger Configuration」. Click on 【Execute】 to run it once immediately without waiting for the scheduled time. After a while, you can check the execution task records and corresponding logs.
A default set of metrics is collected; see the Metrics section for details.
Verification¶
- In 「Manage / Automatic Trigger Configuration」, confirm whether the corresponding task has the automatic trigger configuration, and check the corresponding task records and logs to see if there are any exceptions.
- In TrueWatch, check if the asset information exists in 「Infrastructure / Custom」.
- In TrueWatch, check whether the corresponding monitoring data exists in 「Metrics」.
Metrics¶
After Amazon Cloud Monitoring is configured, the default metrics are as follows. More metrics can be collected through configuration; see Amazon Cloud Monitoring Metrics Details.
Metric | Description | Unit |
---|---|---|
ActiveDefragHits | The number of value reallocations performed by the active defragmentation process per minute. This is derived from the active_defrag_hits statistic in Redis INFO. | Number |
AuthenticationFailures | The total number of failed attempts to authenticate to Redis using the AUTH command. You can use the ACL LOG command to find more information about individual authentication failures. We recommend setting an alarm for this to detect unauthorized access attempts. | Count |
BytesUsedForMemoryDB | The total number of bytes allocated by MemoryDB for all purposes, including datasets, buffers, etc. | Bytes |
CommandAuthorizationFailures | The number of failed attempts by users to run commands they do not have permission to invoke. You can use the ACL LOG command to find more information about individual authentication failures. We recommend setting an alarm for this to detect unauthorized access attempts. | Count |
CurrConnections | The number of client connections, excluding connections from read-only replicas. MemoryDB uses two to four connections to monitor the cluster in various scenarios. This is derived from the connected_clients statistic in Redis INFO. | Count |
CurrItems | The number of items in the cache. This value is derived from the Redis keyspace statistic obtained by summing the total number of keys in the entire keyspace. | Count |
DatabaseMemoryUsagePercentage | The percentage of available memory used by the cluster. This is calculated using used_memory/maxmemory from Redis INFO. | Percentage |
EngineCPUUtilization | Provides the CPU utilization of the Redis engine thread. Since Redis is single-threaded, you can use this metric to analyze the load of the Redis process itself. The EngineCPUUtilization metric more accurately represents the Redis process. You can use it in conjunction with the CPUUtilization metric. CPUUtilization exposes the overall CPU usage of the server instance, including other operating system and management processes. For node types with four or more vCPUs, use the EngineCPUUtilization metric to monitor and set scaling thresholds. Note that on MemoryDB hosts, background processes monitor the host to provide a managed database experience. These background processes may consume a significant portion of the CPU workload. This has less impact on larger hosts with more than two vCPUs, but has a greater impact on smaller hosts with two or fewer vCPUs. If you only monitor the EngineCPUUtilization metric, you will not be able to detect host overload situations caused by high CPU usage of Redis or background monitoring processes. Therefore, we recommend monitoring the CPUUtilization metric for hosts with two or fewer vCPUs. | Percentage |
Evictions | The number of keys evicted due to the maxmemory limit. This is derived from the evicted_keys statistic in Redis INFO. | Count |
IsPrimary | Indicates whether the node is the primary node of the current partition. The metric can be 0 (not primary) or 1 (primary). | Count |
KeyAuthorizationFailures | The number of failed attempts by users to access keys they do not have permission to access. You can use the ACL LOG command to find more information about individual authentication failures. We recommend setting an alarm for this to detect unauthorized access attempts. | Count |
KeyspaceHits | The number of successful read-only key lookups in the main dictionary. This is derived from the keyspace_hits statistic in Redis INFO. | Count |
KeyspaceMisses | The number of failed read-only key lookups in the main dictionary. This is derived from the keyspace_misses statistic in Redis INFO. | Count |
KeysTracked | The number of keys tracked by Redis key tracking, as a percentage of tracking-table-max-keys. Key tracking is used to help client-side caching and notify clients when keys are modified. | Count |
MaxReplicationThroughput | The maximum replication throughput observed in the last measurement period. | Bytes per second |
MemoryFragmentationRatio | Indicates the efficiency of Redis engine memory allocation. Certain thresholds will indicate different behaviors. The recommended value is to have fragmentation greater than 1.0. This is calculated from the mem_fragmentation_ratio statistic in Redis INFO. | Number |
NewConnections | The total number of connections accepted by the server during this period. This is derived from the total_connections_received statistic in Redis INFO. | Count |
PrimaryLinkHealthStatus | This status has two values: 0 or 1. A value of 0 indicates that the data in the MemoryDB primary node is not synchronized with Redis on EC2. A value of 1 indicates that the data is synchronized. | Boolean |
Reclaimed | The total number of key expiration events. This is derived from the expired_keys statistic in Redis INFO. | Count |
ReplicationBytes | For nodes in a replication configuration, ReplicationBytes reports the number of bytes sent by the primary to all its replicas. This metric represents the write load on the cluster. This is derived from the master_repl_offset statistic in Redis INFO. | Bytes |
ReplicationDelayedWriteCommands | The number of commands delayed due to exceeding the maximum replication throughput. | Count |
ReplicationLag | This metric is only applicable to nodes running as read-only replicas. It represents the time (in seconds) that the replica lags behind in applying changes from the primary node. | Seconds |
CPUUtilization | The percentage of CPU utilization of the entire host. Since Redis is single-threaded, we recommend monitoring the EngineCPUUtilization metric for nodes with 4 or more vCPUs. | Percentage |
FreeableMemory | The amount of free memory available on the host. This is derived from the RAM and buffers that the operating system reports as freeable. | Bytes |
NetworkBytesIn | The number of bytes read from the network by the host. | Bytes |
NetworkBytesOut | The number of bytes sent by the instance on all network interfaces. | Bytes |
NetworkConntrackAllowanceExceeded | The number of packets shaped because connection tracking exceeded the maximum for the instance and new connections could not be established. This may cause packet loss in traffic to and from the instance. | Count |
SwapUsage | The swap usage on the host. | Bytes |
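
Several of the metrics above are easier to act on once combined. The following is a minimal Python sketch, not part of the script package, showing how a hit ratio could be derived from KeyspaceHits and KeyspaceMisses, how the used_memory/maxmemory calculation behind DatabaseMemoryUsagePercentage works, and how the vCPU guidance for the two CPU metrics might be encoded. The function names are hypothetical, and the handling of a 3-vCPU node is an assumption because the table gives no rule for it.

```python
from typing import Optional


def keyspace_hit_ratio(hits: float, misses: float) -> Optional[float]:
    """Return the fraction of read-only key lookups that succeeded.

    Returns None when there were no lookups in the period, so callers
    can tell "no traffic" apart from "every lookup missed".
    """
    total = hits + misses
    if total <= 0:
        return None
    return hits / total


def memory_usage_percentage(used_memory: float, maxmemory: float) -> Optional[float]:
    """Compute used_memory / maxmemory as a percentage.

    Returns None when maxmemory is not a positive number.
    """
    if maxmemory <= 0:
        return None
    return 100.0 * used_memory / maxmemory


def cpu_metric_to_watch(vcpus: int) -> str:
    """Choose the CPU metric to alarm on, following the table's guidance.

    4 or more vCPUs: EngineCPUUtilization. 2 or fewer: CPUUtilization.
    The table does not cover 3 vCPUs; falling back to CPUUtilization
    there is an assumption made by this sketch.
    """
    if vcpus >= 4:
        return "EngineCPUUtilization"
    return "CPUUtilization"


if __name__ == "__main__":
    print(keyspace_hit_ratio(900, 100))        # 0.9
    print(memory_usage_percentage(512, 2048))  # 25.0
    print(cpu_metric_to_watch(2))              # CPUUtilization
```

The inputs are expected to be values for the same time window, for example the sums of KeyspaceHits and KeyspaceMisses over one collection period.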
Object¶
The collected AWS MemoryDB object data structure can be seen in 「Infrastructure - Custom」.
{
  "measurement": "aws_memorydb",
  "tags": {
    "RegionId": "cn-north-1",
    "Status": "xxxx",
    "ClusterName": "xxxxxx",
    "AvailabilityMode": "xxxxxx",
    "NodeType": "xxxxxx",
    "EngineVersion": "xxxxxx",
    "EnginePatchVersion": "xxxxxx",
    "ParameterGroupName": "xxxxxx",
    "ParameterGroupStatus": "xxxxxx",
    "ARN": "arn:aws-cn:memorydb:cn-north-1:xxxx",
    "SnsTopicStatus": "xxxxxx",
    "SnsTopicArn": "xxxxxx",
    "MaintenanceWindow": "xxxxxx",
    "SnapshotWindow": "xxxxxx",
    "ACLName": "xxxxxx",
    "name": "xxxxxx"
  },
  "fields": {
    "Description": "xxxxxx",
    "SecurityGroups": "xxxxxx",
    "NumberOfShards": "1",
    "TLSEnabled": "xxxxxx",
    "SnapshotRetentionLimit": "xxxxxx",
    "AutoMinorVersionUpgrade": "xxxxxx",
    "message": "{Instance JSON Data}"
  }
}
Note: The fields in `tags` and `fields` may change with subsequent updates.
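
Because the keys in `tags` and `fields` may change, code that consumes this object is safer if it reads keys defensively. The following is a minimal Python sketch under that assumption; `summarize_cluster` is a hypothetical helper, not part of the script package, and it only uses key names shown in the sample structure above.

```python
import json
from typing import Any, Dict


def summarize_cluster(record: Dict[str, Any]) -> Dict[str, Any]:
    """Build a small summary from an aws_memorydb object record.

    Missing keys fall back to "unknown" (or None for the shard count)
    instead of raising, since the schema may change between updates.
    """
    tags = record.get("tags") or {}
    fields = record.get("fields") or {}

    # NumberOfShards is a string in the sample; convert it when possible.
    try:
        shards = int(fields.get("NumberOfShards"))
    except (TypeError, ValueError):
        shards = None

    return {
        "name": tags.get("name", "unknown"),
        "region": tags.get("RegionId", "unknown"),
        "status": tags.get("Status", "unknown"),
        "node_type": tags.get("NodeType", "unknown"),
        "shards": shards,
    }


if __name__ == "__main__":
    sample = json.loads(
        '{"measurement": "aws_memorydb",'
        ' "tags": {"name": "demo", "RegionId": "cn-north-1"},'
        ' "fields": {"NumberOfShards": "1"}}'
    )
    print(summarize_cluster(sample))
```

Returning placeholder values rather than raising keeps a downstream report or dashboard working when a key is renamed or dropped.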