Collector Configuration Manual for "Volcengine - Cloud Monitor"

Before reading this document, please read:

Tip

Before using this collector, you must install "Integration Core" and its accompanying third-party dependencies.

Tip

This collector runs multi-threaded by default, with five threads. To change the thread pool size, set the environment variable COLLECTOR_THREAD_POOL_SIZE.

1. Configuration Structure

The configuration structure of this collector is as follows:

Field                     Type  Required  Description
targets                   list  Required  List of cloud monitor target configurations. Multiple configurations under the same namespace are combined with a logical "AND".
targets[#].namespace      str   Required  The cloud monitor namespace to collect. Example: 'VCM_ECS'. See the appendix for the complete list.
targets[#].subnamespace   str   Required  The cloud monitor sub-namespace to collect. Example: 'Instance'. See the appendix for the complete list.
targets[#].metrics        list  Required  List of cloud monitor metrics to collect. See the appendix for the complete list.
targets[#].metrics[#]     str   Required  Metric name
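
As a quick sanity check before running the collector, the structure above can be verified with a few lines of Python. This is only a sketch: validate_collector_configs is a hypothetical helper, not part of the collector, and it checks only the field types listed in the table.

```python
def validate_collector_configs(configs):
    """Return a list of problems found in a collector_configs dict (empty if none).

    Hypothetical helper: it only checks the field types from the table above.
    """
    targets = configs.get('targets')
    if not isinstance(targets, list):
        return ["'targets' must be a list"]

    problems = []
    for i, target in enumerate(targets):
        if not isinstance(target, dict):
            problems.append(f"targets[{i}] must be a dict")
            continue
        # namespace and subnamespace are both required strings
        for key in ('namespace', 'subnamespace'):
            if not isinstance(target.get(key), str):
                problems.append(f"targets[{i}].{key} must be a str")
        # metrics is a required list of strings
        metrics = target.get('metrics')
        if not isinstance(metrics, list):
            problems.append(f"targets[{i}].metrics must be a list")
        elif not all(isinstance(m, str) for m in metrics):
            problems.append(f"targets[{i}].metrics must contain only str")
    return problems


example = {
    'targets': [
        {'namespace': 'VCM_ECS', 'subnamespace': 'Instance', 'metrics': ['CpuTotal']},
    ]
}
print(validate_collector_configs(example))
```

An empty list means the configuration has the expected shape; it does not confirm that the namespace or metric names exist on the Volcengine side.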

2. Configuration Examples

Specifying Specific Metrics

Collecting three metrics named CpuSystem, CpuTotal, and MemoryUsedSpace from ECS

collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['CpuSystem', 'CpuTotal', 'MemoryUsedSpace'],
        },
    ]
}

Wildcard Matching for Metrics

Metric names can use the * wildcard for matching.

In this example, the following metrics will be collected:

  • Metrics named MemoryUsedSpace
  • Metrics whose names start with Cpu
  • Metrics whose names end with Connection
  • Metrics whose names contain Used
collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['MemoryUsedSpace', 'Cpu*', '*Connection', '*Used*'],
        },
    ],
}
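
To preview which metric names a pattern list would select, the matching can be approximated locally. The sketch below assumes the * wildcard behaves like a shell-style glob and uses Python's standard fnmatch module; the collector's real matching code is not shown in this document and may differ (fnmatch also gives special meaning to ? and [...]). The names in available are illustrative, not a real metric list.

```python
from fnmatch import fnmatchcase

patterns = ['MemoryUsedSpace', 'Cpu*', '*Connection', '*Used*']

# Illustrative metric names only; see the appendix for the real list
available = [
    'CpuTotal',
    'CpuSystem',
    'MemoryUsedSpace',
    'MemoryUsedUtilization',
    'TcpConnection',
    'DiskReadBytes',
]


def matches_any(name, patterns):
    """True if the name matches at least one pattern (case-sensitive)."""
    return any(fnmatchcase(name, p) for p in patterns)


selected = [m for m in available if matches_any(m, patterns)]
print(selected)
```

Here every name except DiskReadBytes is selected, because it neither starts with Cpu, ends with Connection, nor contains Used.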

Excluding Specific Metrics

Adding the "NOT" marker as the first element of the metrics list indicates that the metrics that follow should be excluded.

In this example, the following metrics will not be collected:

  • Metrics named MemoryUsedSpace
  • Metrics whose names start with Cpu
  • Metrics whose names end with Connection
  • Metrics whose names contain Used
collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['NOT', 'MemoryUsedSpace', 'Cpu*', '*Connection', '*Used*'],
        },
    ],
}

Multiple Filters to Specify Required Metrics

The same namespace can be specified multiple times, with metrics filtered sequentially from top to bottom.

In this example, the following filtering steps are performed on the metric names:

  1. Select all metrics whose names contain Cpu
  2. From the results of the previous step, exclude metrics named CpuTotal
collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['*Cpu*'],
        },
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['NOT', 'CpuTotal'],
        },
    ],
}
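
The top-to-bottom filtering can be sketched as a small function that applies each metrics list in turn. This is an illustration under assumptions, not the collector's implementation: apply_metric_filters is a hypothetical name, wildcards are treated as shell-style globs via fnmatch, and a leading 'NOT' is read as "exclude the patterns that follow".

```python
from fnmatch import fnmatchcase


def apply_metric_filters(available, metric_lists):
    """Apply each metrics list in order; each step narrows the previous result."""
    selected = list(available)
    for metrics in metric_lists:
        # A leading 'NOT' turns the list into an exclusion list (assumed semantics)
        exclude = bool(metrics) and metrics[0] == 'NOT'
        patterns = metrics[1:] if exclude else metrics
        selected = [
            name for name in selected
            if any(fnmatchcase(name, p) for p in patterns) != exclude
        ]
    return selected


# Illustrative metric names only
available = ['CpuTotal', 'CpuSystem', 'MemoryUsedSpace']
result = apply_metric_filters(available, [['*Cpu*'], ['NOT', 'CpuTotal']])
print(result)
```

With these inputs the first step keeps CpuTotal and CpuSystem, and the second step removes CpuTotal.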

Configuring Filters (Optional)

This collector script supports custom filters, allowing users to filter target resources based on object attributes. The filter function returns True or False.

  • True: The target resource should be collected.
  • False: The target resource should not be collected.

When custom object collection is enabled, more object attributes can be filtered. For details, refer to the corresponding product's custom object collector documentation (coming soon...)

# Example: a filter based on the InstanceId and Status attributes of the object. The configuration format is as follows:

def filter_instance(instance, namespace='VCM_ECS'):
    '''
    Collect metrics only for instances whose InstanceId is i-xxxxxa or i-xxxxxb and whose Status is RUNNING
    '''
    instance_id = instance['tags'].get('InstanceId')
    status = instance['tags'].get('Status')
    if instance_id in ['i-xxxxxa', 'i-xxxxxb'] and status in ['RUNNING']:
        return True
    return False

from integration_core__runner import Runner
import integration_volcengine_monitor__main as main

# account and collector_configs are assumed to be defined earlier in the script
@DFF.API('Volcengine-monitor', timeout=3600, fixed_crontab='*/5 * * * *')
def run():
    Runner(main.DataCollector(account, collector_configs, filters=[filter_instance])).run()

Tip

When multiple filters are configured under the same namespace, all filters must be satisfied for the data to be reported.
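
The "all filters must be satisfied" rule amounts to a logical AND over the filter functions. The sketch below illustrates that behavior only; passes_all_filters and the two filter functions are hypothetical, and the way the collector actually invokes filters (for example, whether it passes a namespace argument) is an assumption not confirmed here.

```python
def is_running(instance):
    """Keep only instances whose Status tag is RUNNING."""
    return instance['tags'].get('Status') == 'RUNNING'


def in_allow_list(instance):
    """Keep only instances with one of the listed InstanceId values."""
    return instance['tags'].get('InstanceId') in ('i-xxxxxa', 'i-xxxxxb')


def passes_all_filters(instance, filters):
    # Assumption: filters are combined with logical AND, as the tip above describes
    return all(f(instance) for f in filters)


# Illustrative objects shaped like the filter example (a dict with a 'tags' dict)
instances = [
    {'tags': {'InstanceId': 'i-xxxxxa', 'Status': 'RUNNING'}},
    {'tags': {'InstanceId': 'i-xxxxxb', 'Status': 'STOPPED'}},
    {'tags': {'InstanceId': 'i-xxxxxc', 'Status': 'RUNNING'}},
]
kept = [
    i['tags']['InstanceId']
    for i in instances
    if passes_all_filters(i, [is_running, in_allow_list])
]
print(kept)
```

Only i-xxxxxa passes both filters; the other two instances each fail one of them.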

3. Data Reporting Format

After data is successfully synchronized, it can be viewed in the "Metrics" section of TrueWatch.

Taking the following collector configuration as an example:

collector_configs = {
    'targets': [
        {
            'namespace': 'VCM_ECS',
            'subnamespace':'Instance',
            'metrics'  : ['CpuTotal', 'MemoryUsedSpace'],
        },
    ],
}

The reported data example is as follows:

{
  "measurement": "volcengine_VCM_ECS",
  "tags": {
    "ResourceID": "i-xxxx"
  },
  "fields": {
    "CpuTotal"       : 1.23,
    "MemoryUsedSpace": 1.23
  }
}
Tip

All metric values will be reported as float type.

4. Interaction with Custom Object Collectors

When other custom object collectors (such as ECS or MySQL) are running in the same DataFlux Func, this collector automatically attempts to match the tags.ResourceID field against the tags.name field of the custom objects.

Because the cloud monitor collector needs the custom object information for this matching, it is generally recommended to place the cloud monitor collector at the end of the collector list, for example:

# Create collectors
collectors = [
  main.DataCollector(account, collector_configs),
  monitor_main.DataCollector(account, monitor_configs)
]

When a successful match is made, the tags from the custom object (except for name) will be added to the tags of the monitoring data, enabling effects such as filtering cloud monitor metrics by instance name. The specific effect is as follows:

Assuming the original data collected by the cloud monitor is as follows:

{
  "measurement": "volcengine_VCM_ECS",
  "tags": {
    "ResourceID": "i-xxxx",
    "{key}": "{value}"
  },
  "fields": {
    "{metric}": "{metric_value}"
  }
}

At the same time, the custom object data collected by the Volcengine ECS collector is as follows:

{
  "measurement": "volcengine_VCM_ECS",
  "tags": {
    "name"      : "i-xxxx",
    "InstanceId": "i-xxxx",
    "RegionId"  : "cn-shanghai",
    "{key}": "{value}"
  },
  "fields": {
    "{key}": "{value}"
  }
}

Then, the final reported cloud monitor data will be as follows:

{
  "measurement": "volcengine_VCM_ECS",
  "tags": {
    "ResourceID": "i-xxxx",
    "InstanceId": "i-xxxx",
    "RegionId"  : "cn-shanghai",
    "{key}": "{value}"
  },
  "fields": {
    "{metric}": "{metric_value}"
  }
}
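
The matching described in this section can be sketched as a lookup keyed on tags.name. This is an illustration only: merge_custom_object_tags is a hypothetical function, and the choice to keep the monitor's own tag value when a key exists on both sides is an assumption, since the precedence is not stated in this document.

```python
def merge_custom_object_tags(point, custom_objects):
    """Copy tags (except 'name') from the matching custom object into the point."""
    # Index custom objects by tags.name for direct lookup
    by_name = {obj['tags']['name']: obj['tags'] for obj in custom_objects}
    matched = by_name.get(point['tags'].get('ResourceID'))
    if matched is None:
        return point  # no match: report the monitoring data unchanged

    merged_tags = dict(point['tags'])
    for key, value in matched.items():
        if key != 'name':
            # Assumption: tags already present on the monitoring data win
            merged_tags.setdefault(key, value)
    return {**point, 'tags': merged_tags}


point = {
    'measurement': 'volcengine_VCM_ECS',
    'tags': {'ResourceID': 'i-xxxx'},
    'fields': {'CpuTotal': 1.23},
}
custom_objects = [
    {'tags': {'name': 'i-xxxx', 'InstanceId': 'i-xxxx', 'RegionId': 'cn-shanghai'}},
]
merged = merge_custom_object_tags(point, custom_objects)
print(merged['tags'])
```

The original point is not modified; a new dict is returned so unmatched and matched data can be handled the same way downstream.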

5. Cloud Monitor API Call Limits

  1. Volcengine rate-limits the GetMetricData API: a primary account and its IAM accounts can call GetMetricData no more than 20 times per second; beyond that, requests are throttled. (Currently, exceeding the limit only triggers throttling, and no fees are charged.)

  2. Avoiding rate limiting when multiple auto-triggered tasks execute simultaneously: because the collector uses multi-threaded collection by default and several auto-triggered tasks may execute in parallel, the limit is easy to hit. Two suggestions to mitigate this:

     • Set a small value for the environment variable COLLECTOR_THREAD_POOL_SIZE.

     • Delay the execution of auto-triggered tasks so that they do not all call the API at the same moment. Add the delayed_crontab parameter to the startup function decorator DFF.API(xxx). delayed_crontab is the execution delay, in seconds. For example, the following configuration executes the task at the 5th second of every minute:
    @DFF.API('Volcengine-ECS Collection', timeout=3600, fixed_crontab='* * * * *', delayed_crontab=5)
    def run():
        collectors = [
            main.DataCollector(account, collector_configs),
            monitor_main.DataCollector(account, monitor_configs),
        ]
        Runner(collectors).run()
    

Note: The above suggestions should be adjusted based on task execution time to find suitable parameters.
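
To get a feel for how a 20-calls-per-second cap interacts with several worker threads, the limit can be modeled with a small sliding-window throttle. This is a standalone sketch, not something the collector exposes: CallThrottle is a hypothetical class, it only coordinates threads inside one process, and it does not coordinate separate auto-triggered tasks (which is what delayed_crontab addresses).

```python
import threading
import time
from collections import deque


class CallThrottle:
    """Allow at most max_calls calls per period (sliding window, thread-safe)."""

    def __init__(self, max_calls=20, period=1.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._calls = deque()    # timestamps of recent calls, oldest first
        self._lock = threading.Lock()

    def wait(self):
        """Block until one more call fits in the window, then record it."""
        with self._lock:
            while True:
                now = self._clock()
                # Forget calls that have left the window
                while self._calls and now - self._calls[0] >= self.period:
                    self._calls.popleft()
                if len(self._calls) < self.max_calls:
                    self._calls.append(now)
                    return
                # Window is full: sleep until the oldest call expires
                self._sleep(self.period - (now - self._calls[0]))


throttle = CallThrottle(max_calls=20, period=1.0)
for _ in range(5):
    throttle.wait()  # call this before each API request
```

Holding the lock while sleeping is deliberate here: it makes waiting threads queue up rather than all wake at once, at the cost of serializing them.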

X. Appendix

Please refer to the official Volcengine documentation: