Volcengine VKE

Collects Volcengine Kubernetes Engine (VKE) metrics, covering Cluster, Container, Node, Pod, and other resources.

Configuration

Install Func

It is recommended to enable TrueWatch Integration - Extensions - DataFlux Func (Automata): all prerequisites are installed automatically. Then proceed with the script installation.

If you want to deploy Func yourself, refer to Self-deploy Func.

Install Script

Note: Please prepare the Volcengine AK with the required permissions in advance (for simplicity, you can grant the global read-only permission ReadOnlyAccess).

To synchronize the monitoring data of VKE cloud resources, install the corresponding collection script: "TrueWatch Integration (Volcengine-VKE Collection)" (ID: integration_volcengine_vke).

After clicking 【Install】, enter the corresponding parameters: Volcengine AK, Volcengine account name.

Click 【Deploy Startup Script】. The system automatically creates the Startup script set and configures the corresponding startup scripts.

After enabling, you can see the corresponding automatic trigger configuration in 「Manage / Automatic Trigger Configuration」. Click 【Execute】 to execute it immediately without waiting for the scheduled time. After a while, you can check the execution task records and corresponding logs.

If you want to collect corresponding logs, you also need to enable the corresponding log collection script. If you want to collect billing data, you need to enable the cloud billing collection script.

Verification

  1. In 「Manage / Automatic Trigger Configuration」, confirm that the corresponding task has an automatic trigger configuration, and check the task records and logs for exceptions.
  2. In TrueWatch, check if the asset information exists in 「Infrastructure / Custom」.
  3. In TrueWatch, check if there is corresponding monitoring data in 「Metrics」.

Metrics

After configuring Volcengine Cloud Monitoring, the default Metrics are as follows. You can collect more Metrics through configuration; see Volcengine Cloud Monitoring Metrics Details.

Note: You need to install the monitoring plugin in the Volcengine VKE console.

| MetricName | SubNamespace | Description | MetricUnit | Dimension |
| --- | --- | --- | --- | --- |
| Cluster_MemoryUsed | Cluster | Cluster Memory Used | Bytes(SI) | Cluster |
| Cluster_CPUUsage | Cluster | Cluster CPU Usage | Percent | Cluster |
| Cluster_MemoryUsage | Cluster | Cluster Memory Usage | Percent | Cluster |
| Cluster_NodeCount | Cluster | Cluster Node Count | Count | Cluster |
| Cluster_CPUUsed | Cluster | Cluster CPU Used | Core | Cluster |
| Container_MemoryUsed | Container | Container Memory Used | Bytes(SI) | Cluster, Namespace, Deployment, StatefulSet, DaemonSet, CronJob, Job, Pod, Container |
| Container_CPUUsage | Container | Container CPU Usage (against limit) | Percent | Cluster, Namespace, Deployment, StatefulSet, DaemonSet, CronJob, Job, Pod, Container |
| Container_MemoryUsage | Container | Container Memory Usage (against limit) | Percent | Cluster, Namespace, Deployment, StatefulSet, DaemonSet, CronJob, Job, Pod, Container |
| Container_CPUUsed | Container | Container CPU Used | Core | Cluster, Namespace, Deployment, StatefulSet, DaemonSet, CronJob, Job, Pod, Container |
| Container_GPU_Memory_Free | Container | Container GPU Memory Free | Megabytes | Cluster, Namespace, Deployment, StatefulSet, DaemonSet, CronJob, Job, Pod, Container, GPU |
| Container_GPU_Memory_Used | Container | Container GPU Memory Used | Megabytes | Cluster, Namespace, Deployment, StatefulSet, DaemonSet, CronJob, Job, Pod, Container, GPU |
| Container_GPU_Usage | Container | Container GPU Usage | Percent | Cluster, Namespace, Deployment, StatefulSet, DaemonSet, CronJob, Job, Pod, Container, GPU |
| Container_GPU_Count | Container | Container GPU Count | Count | Cluster, Namespace, Deployment, StatefulSet, DaemonSet, CronJob, Job, Pod, Container, GPU |
| Container_GPU_Memory_Usage | Container | Container GPU Memory Usage | Percent | Cluster, Namespace, Deployment, StatefulSet, DaemonSet, CronJob, Job, Pod, Container, GPU |
| CronJob_MemoryUsed | CronJob | CronJob Memory Used | Bytes(SI) | Cluster, Namespace, CronJob |
| CronJob_CPUUsage | CronJob | CronJob CPU Usage (against limit) | Percent | Cluster, Namespace, CronJob |
| CronJob_MemoryUsage | CronJob | CronJob Memory Usage (against limit) | Percent | Cluster, Namespace, CronJob |
| CronJob_CPUUsed | CronJob | CronJob CPU Used | Core | Cluster, Namespace, CronJob |
| CronJob_GPU_Memory_Free | CronJob | CronJob GPU Memory Free | Megabytes | Cluster, Namespace, CronJob, GPU |
| CronJob_GPU_Memory_Used | CronJob | CronJob GPU Memory Used | Megabytes | Cluster, Namespace, CronJob, GPU |
| CronJob_GPU_Usage | CronJob | CronJob GPU Usage | Percent | Cluster, Namespace, CronJob, GPU |
| CronJob_GPU_Count | CronJob | CronJob GPU Count | Count | Cluster, Namespace, CronJob, GPU |
| CronJob_GPU_Memory_Usage | CronJob | CronJob GPU Memory Usage | Percent | Cluster, Namespace, CronJob, GPU |
| DaemonSet_MemoryUsed | DaemonSet | DaemonSet Memory Used | Bytes(SI) | Cluster, Namespace, DaemonSet |
| DaemonSet_CPUUsage | DaemonSet | DaemonSet CPU Usage (against limit) | Percent | Cluster, Namespace, DaemonSet |
| DaemonSet_MemoryUsage | DaemonSet | DaemonSet Memory Usage (against limit) | Percent | Cluster, Namespace, DaemonSet |
| DaemonSet_CPUUsed | DaemonSet | DaemonSet CPU Used | Core | Cluster, Namespace, DaemonSet |
| DaemonSet_GPU_Memory_Free | DaemonSet | DaemonSet GPU Memory Free | Megabytes | Cluster, Namespace, DaemonSet, GPU |
| DaemonSet_GPU_Memory_Used | DaemonSet | DaemonSet GPU Memory Used | Megabytes | Cluster, Namespace, DaemonSet, GPU |
| DaemonSet_GPU_Usage | DaemonSet | DaemonSet GPU Usage | Percent | Cluster, Namespace, DaemonSet, GPU |
| DaemonSet_GPU_Count | DaemonSet | DaemonSet GPU Count | Count | Cluster, Namespace, DaemonSet, GPU |
| DaemonSet_GPU_Memory_Usage | DaemonSet | DaemonSet GPU Memory Usage | Percent | Cluster, Namespace, DaemonSet, GPU |
| Deployment_MemoryUsed | Deployment | Deployment Memory Used | Bytes(SI) | Cluster, Namespace, Deployment |
| Deployment_CPUUsage | Deployment | Deployment CPU Usage (against limit) | Percent | Cluster, Namespace, Deployment |
| Deployment_MemoryUsage | Deployment | Deployment Memory Usage (against limit) | Percent | Cluster, Namespace, Deployment |
| Deployment_CPUUsed | Deployment | Deployment CPU Used | Core | Cluster, Namespace, Deployment |
| Deployment_GPU_Memory_Free | Deployment | Deployment GPU Memory Free | Megabytes | Cluster, Namespace, Deployment, GPU |
| Deployment_GPU_Memory_Used | Deployment | Deployment GPU Memory Used | Megabytes | Cluster, Namespace, Deployment, GPU |
| Deployment_GPU_Usage | Deployment | Deployment GPU Usage | Percent | Cluster, Namespace, Deployment, GPU |
| Deployment_GPU_Count | Deployment | Deployment GPU Count | Count | Cluster, Namespace, Deployment, GPU |
| Deployment_GPU_Memory_Usage | Deployment | Deployment GPU Memory Usage | Percent | Cluster, Namespace, Deployment, GPU |
| Job_CPUUsed | Job | Job CPU Used | Core | Cluster, Namespace, Job |
| Job_MemoryUsed | Job | Job Memory Used | Bytes(SI) | Cluster, Namespace, Job |
| Job_CPUUsage | Job | Job CPU Usage (against limit) | Percent | Cluster, Namespace, Job |
| Job_MemoryUsage | Job | Job Memory Usage (against limit) | Percent | Cluster, Namespace, Job |
| Job_GPU_Memory_Free | Job | Job GPU Memory Free | Megabytes | Cluster, Namespace, Job, GPU |
| Job_GPU_Memory_Used | Job | Job GPU Memory Used | Megabytes | Cluster, Namespace, Job, GPU |
| Job_GPU_Usage | Job | Job GPU Usage | Percent | Cluster, Namespace, Job, GPU |
| Job_GPU_Count | Job | Job GPU Count | Count | Cluster, Namespace, Job, GPU |
| Job_GPU_Memory_Usage | Job | Job GPU Memory Usage | Percent | Cluster, Namespace, Job, GPU |
| Namespace_CPUUsed | Namespace | Namespace CPU Used | Core | Cluster, Namespace |
| Namespace_MemoryUsed | Namespace | Namespace Memory Used | Bytes(SI) | Cluster, Namespace |
| Node_PodCount | Node | Node Pod Count | Count | Cluster, Node |
| Node_CPURequestUsage | Node | Node CPU Allocation Rate (request) | Percent | Cluster, Node |
| Node_MemoryRequestUsage | Node | Node Memory Allocation Rate (request) | Percent | Cluster, Node |
| Node_CPULimitUsage | Node | Node CPU Allocation Rate (limit) | Percent | Cluster, Node |
| Node_MemoryLimitUsage | Node | Node Memory Allocation Rate (limit) | Percent | Cluster, Node |
| Node_CPUUsage | Node | Node CPU Usage | Percent | Cluster, Node |
| Node_MemoryUsage | Node | Node Memory Usage | Percent | Cluster, Node |
| PersistentVolumeClaim_VolumeUsage | PersistentVolumeClaim | Persistent Volume Claim Capacity Usage | Percent | Cluster, Namespace, PersistentVolumeClaim |
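
Every MetricName in the table starts with its SubNamespace followed by an underscore. The following is a minimal sketch, not part of the collection script, of how that naming pattern could be used to group metric names on the consumer side. The helper names `split_metric_name` and `group_by_sub_namespace` are hypothetical and chosen only for illustration.

```python
# Sub-namespaces copied from the SubNamespace column of the metrics table.
SUB_NAMESPACES = (
    "Cluster", "Container", "CronJob", "DaemonSet", "Deployment",
    "Job", "Namespace", "Node", "PersistentVolumeClaim",
)


def split_metric_name(metric_name):
    """Split e.g. 'Container_GPU_Usage' into ('Container', 'GPU_Usage').

    Hypothetical helper: relies only on the '<SubNamespace>_<Metric>'
    pattern visible in the table.
    """
    prefix, sep, rest = metric_name.partition("_")
    if not sep or not rest or prefix not in SUB_NAMESPACES:
        raise ValueError(f"unrecognized VKE metric name: {metric_name!r}")
    return prefix, rest


def group_by_sub_namespace(metric_names):
    """Group metric names into {sub_namespace: [metric, ...]}."""
    groups = {}
    for name in metric_names:
        sub_namespace, metric = split_metric_name(name)
        groups.setdefault(sub_namespace, []).append(metric)
    return groups


if __name__ == "__main__":
    names = ["Node_PodCount", "Node_CPUUsage", "Cluster_NodeCount"]
    print(group_by_sub_namespace(names))
```

Grouping by prefix rather than by a hard-coded metric list means additional metrics enabled through configuration would still be grouped, as long as they follow the same naming pattern.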

Object

The structure of the collected Volcengine VKE object data can be viewed in 「Infrastructure / Custom」.

    {
        "fields": {
            "ClusterConfig": {},
            "CreateTime": "2024-04-07T06:13:08Z",
            "KubernetesConfig": {},
            "PodsConfig": {},
            "message": {}
        },
        "measurement": "volcengine_vke",
        "tags": {
            "ChargeType": "PostPaid",
            "ClusterId": "cco93ispooc7b6ohg00b0",
            "ClusterName": "test",
            "KubernetesVersion": "v1.26.10-vke.14",
            "RegionId": "cn-shanghai",
            "Status": "Running",
            "name": "cco93ispooc7b6ohg00b0"
        }
    }
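
As a rough illustration of how this structure might be consumed, the sketch below parses a copy of the sample object and pulls out a few tag values. It assumes the object is available as a JSON string shaped exactly like the sample above; the function name `summarize_cluster` and the keys of the returned dictionary are hypothetical, not part of TrueWatch or DataFlux Func.

```python
import json
from datetime import datetime, timezone

# Copy of the sample object shown above; the empty dicts are placeholders
# exactly as in the sample.
SAMPLE = """
{
    "fields": {
        "ClusterConfig": {},
        "CreateTime": "2024-04-07T06:13:08Z",
        "KubernetesConfig": {},
        "PodsConfig": {},
        "message": {}
    },
    "measurement": "volcengine_vke",
    "tags": {
        "ChargeType": "PostPaid",
        "ClusterId": "cco93ispooc7b6ohg00b0",
        "ClusterName": "test",
        "KubernetesVersion": "v1.26.10-vke.14",
        "RegionId": "cn-shanghai",
        "Status": "Running",
        "name": "cco93ispooc7b6ohg00b0"
    }
}
"""


def summarize_cluster(obj):
    """Return a small summary dict for one volcengine_vke object (hypothetical helper)."""
    tags = obj["tags"]
    # CreateTime in the sample is a UTC timestamp with a trailing 'Z'.
    created = datetime.strptime(
        obj["fields"]["CreateTime"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return {
        "id": tags["ClusterId"],
        "name": tags["ClusterName"],
        "region": tags["RegionId"],
        "version": tags["KubernetesVersion"],
        "running": tags["Status"] == "Running",
        "created": created,
    }


if __name__ == "__main__":
    summary = summarize_cluster(json.loads(SAMPLE))
    print(summary["name"], summary["region"], summary["created"].isoformat())
```

In the sample, `tags.name` carries the same value as `tags.ClusterId`, so either can serve as the lookup key in this sketch.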