Alibaba Cloud SAE

Collect Metrics, Logs, and Tracing information from Alibaba Cloud SAE (Serverless App Engine).

Configuration

Applications deployed on SAE send Tracing, Metrics, and Logs data to TrueWatch as follows:

  • Applications integrate APM and report Trace data to DataKit
  • Application logs are written to Kafka and consumed by DataKit through the KafkaMQ collector
  • Metrics for application containers are collected through Alibaba Cloud's monitoring API and reported to TrueWatch by the Function platform (DataFlux.f(x))
  • DataKit processes the data it collects and reports it to TrueWatch

Note: Deploying DataKit on SAE can save bandwidth.

Create DataKit Application

Create a DataKit application on SAE:

  • In the SAE console, click Application List, then Create Application.
  • Fill in the application information
    • Enter the Application Name
    • Select a namespace; create one if none is available
    • Select a VPC; create one if none is available
    • Select a security group. The vSwitch must match the NAT switch
    • Adjust the number of instances as needed
    • Set CPU to 1 core and memory to 1 GB
    • Click Next when finished
  • Add image: pubrepo.truewatch.com/datakit/datakit:1.31.0
  • Add environment variables, configure as follows:
{
  "ENV_DATAWAY": "https://openway.truewatch.com?token=tkn_xxx",
  "KAFKAMQ": "# {\"version\": \"1.22.7-1510\", \"desc\": \"do NOT edit this line\"}\n\n[[inputs.kafkamq]]\n  # addrs = [\"alikafka-serverless-cn-8ex3y7ciq02-1000.alikafka.aliyuncs.com:9093\",\"alikafka-serverless-cn-8ex3y7ciq02-2000.alikafka.aliyuncs.com:9093\",\"alikafka-serverless-cn-8ex3y7ciq02-3000.alikafka.aliyuncs.com:9093\"]\n  addrs = [\"alikafka-serverless-cn-8ex3y7ciq02-1000-vpc.alikafka.aliyuncs.com:9092\",\"alikafka-serverless-cn-8ex3y7ciq02-2000-vpc.alikafka.aliyuncs.com:9092\",\"alikafka-serverless-cn-8ex3y7ciq02-3000-vpc.alikafka.aliyuncs.com:9092\"]\n  # your kafka version:0.8.2 ~ 3.2.0\n  kafka_version = \"3.3.1\"\n  group_id = \"datakit-group\"\n  # consumer group partition assignment strategy (range, roundrobin, sticky)\n  assignor = \"roundrobin\"\n\n  ## kafka tls config\n   tls_enable = false\n\n  ## -1:Offset Newest, -2:Offset Oldest\n  offsets=-1\n\n\n  ## user custom message with PL script.\n  [inputs.kafkamq.custom]\n    #spilt_json_body = true\n    ## spilt_topic_map determines whether to enable log splitting for specific topic based on the values in the spilt_topic_map[topic].\n    #[inputs.kafkamq.custom.spilt_topic_map]\n     # \"log_topic\"=true\n     # \"log01\"=false\n    [inputs.kafkamq.custom.log_topic_map]\n      \"springboot-server_log\"=\"springboot_log.p\"\n    #[inputs.kafkamq.custom.metric_topic_map]\n    #  \"metric_topic\"=\"metric.p\"\n    #  \"metric01\"=\"rum_apm.p\"\n    #[inputs.kafkamq.custom.rum_topic_map]\n    #  \"rum_topic\"=\"rum_01.p\"\n    #  \"rum_02\"=\"rum_02.p\"\n",
  "SPRINGBOOT_LOG_P": "abc = load_json(_)\n\nadd_key(file, abc[\"file\"])\n\nadd_key(message, abc[\\"message\\"])\nadd_key(host, abc[\"host\"])\nmsg = abc[\"message\"]\ngrok(msg, \"%{TIMESTAMP_ISO8601:time} %{NOTSPACE:thread_name} %{LOGLEVEL:status}%{SPACE}%{NOTSPACE:class_name} - \\\\[%{NOTSPACE:method_name},%{NUMBER:line}\\\\] %{DATA:service_name} %{DATA:trace_id} %{DATA:span_id} - %{GREEDYDATA:msg}\")\n\nadd_key(topic, abc[\"topic\"])\n\ndefault_time(time,\"Asia/Shanghai\")",
  "ENV_GLOBAL_HOST_TAGS": "host=__datakit_hostname,host_ip=__datakit_ip",
  "ENV_HTTP_LISTEN": "0.0.0.0:9529",
  "ENV_DEFAULT_ENABLED_INPUTS": "dk,cpu,disk,diskio,mem,swap,system,hostobject,net,host_processes,container,ddtrace,statsd,profile"
}

Configuration Details:

  1. ENV_DATAWAY: Required. The gateway address used to report data to TrueWatch.
  2. KAFKAMQ: Optional. The KafkaMQ collector configuration; refer to Kafka Collector Configuration File Introduction.
  3. SPRINGBOOT_LOG_P: Optional. Used together with KAFKAMQ; the log Pipeline script that parses log data from Kafka.
  4. ENV_GLOBAL_HOST_TAGS: Required. Global tags for the collector.
  5. ENV_HTTP_LISTEN: Required. The DataKit listen address. The IP must be 0.0.0.0; otherwise other pods cannot access DataKit.
  6. ENV_DEFAULT_ENABLED_INPUTS: Required. The collectors enabled by default.
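
The KAFKAMQ and SPRINGBOOT_LOG_P values are multi-line text embedded in JSON, so every quote and newline has to be escaped, which is easy to get wrong by hand. The following is a minimal sketch, assuming you keep the collector configuration and the Pipeline script as plain text and let Python's json module do the escaping. The shortened configuration, the placeholder Kafka address, the reduced collector list, and the helper names build_env and missing_required are illustrative and not part of DataKit.

```python
import json

# Hypothetical, shortened collector config; substitute your real kafkamq TOML.
KAFKAMQ_TOML = """[[inputs.kafkamq]]
  addrs = ["kafka-host-placeholder:9092"]
  group_id = "datakit-group"
  [inputs.kafkamq.custom]
    [inputs.kafkamq.custom.log_topic_map]
      "springboot-server_log"="springboot_log.p"
"""

# Hypothetical, shortened Pipeline script.
SPRINGBOOT_LOG_P = 'abc = load_json(_)\nadd_key(message, abc["message"])\n'

# Keys marked "Required" in the configuration details.
REQUIRED_KEYS = (
    "ENV_DATAWAY",
    "ENV_GLOBAL_HOST_TAGS",
    "ENV_HTTP_LISTEN",
    "ENV_DEFAULT_ENABLED_INPUTS",
)


def build_env(dataway: str) -> dict:
    """Assemble the environment variables as a plain dict of strings."""
    return {
        "ENV_DATAWAY": dataway,
        "KAFKAMQ": KAFKAMQ_TOML,
        "SPRINGBOOT_LOG_P": SPRINGBOOT_LOG_P,
        "ENV_GLOBAL_HOST_TAGS": "host=__datakit_hostname,host_ip=__datakit_ip",
        "ENV_HTTP_LISTEN": "0.0.0.0:9529",
        "ENV_DEFAULT_ENABLED_INPUTS": "dk,cpu,mem,container,ddtrace",
    }


def missing_required(env: dict) -> list:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


env = build_env("https://openway.truewatch.com?token=tkn_xxx")
env_json = json.dumps(env, indent=2)  # json.dumps escapes quotes and newlines
print(env_json)
```

Loading the printed text back with json.loads should return the original strings unchanged, which is a quick way to confirm that the escaping is valid before pasting it into the SAE console.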

Tracing

Applications deployed on Alibaba Cloud SAE need APM introduced into their containers:

  • Upload the APM package to OSS, or add it to the application's Dockerfile so that it is built into the image
  • Load APM at startup, following the same steps as for accessing APM in a regular environment
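
Because applications report trace data to DataKit over the network, it can help to confirm that the DataKit listen address is reachable from the application container before wiring up APM. The following is a minimal sketch, assuming DataKit listens on port 9529 as set by ENV_HTTP_LISTEN; the function name datakit_reachable and the host name in the usage comment are placeholders. A successful TCP connection only shows that the port is open, not that trace ingestion works.

```python
import socket


def datakit_reachable(host: str, port: int = 9529, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Usage (placeholder address; use the one SAE assigns to the DataKit application):
# print(datakit_reachable("datakit.internal"))
```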

Metrics

Install Func

It is recommended to activate DataFlux Func (Automata) under TrueWatch Integration - Extensions.

To deploy Func yourself, refer to Deploy Func Manually.

Activate Script

Note: Prepare an Alibaba Cloud AccessKey (AK) in advance. For simplicity, you can grant it the global read-only permission ReadOnlyAccess.

Activate Script for Automata

  1. Log in to the TrueWatch console
  2. Click the 【Integration】 menu, select 【Cloud Account Management】
  3. Click 【Add Cloud Account】, select 【Alibaba Cloud】, and fill in the required information. Skip this step if the cloud account has already been configured
  4. Click 【Test】. If the test succeeds, click 【Save】; if it fails, check the configuration and test again
  5. In the cloud account list under 【Cloud Account Management】, click the added cloud account to open its details page
  6. On the cloud account details page, click 【Integration】, find Alibaba Cloud SAE in the Not Installed list, click 【Install】, and complete the installation in the dialog that pops up

Activate Script Manually

  1. Log in to the Func console, click 【Script Market】, open the TrueWatch script market, and search for integration_alibabacloud_sae_app and integration_alibabacloud_sae_instance.

  2. Click 【Install】, then enter the corresponding parameters: Alibaba Cloud AK ID, AK Secret, and account name.

  3. Click 【Deploy Startup Script】. The system automatically creates a Startup script set and configures the corresponding startup script.

  4. After activation, the corresponding automatic trigger configuration appears in 「Management / Automatic Trigger Configuration」. Click 【Execute】 to run it immediately without waiting for the scheduled time. After a short wait, you can view the task execution records and the corresponding logs.

Verification

  1. In 「Management / Automatic Trigger Configuration」, confirm that the task has an automatic trigger configuration, and check its task records and logs for exceptions
  2. In TrueWatch, check whether there is asset information in 「Infrastructure / Custom」
  3. In TrueWatch, check whether there is corresponding monitoring data in 「Metrics」

Metrics Introduction

| Metric | Unit | Dimensions | Description |
| --- | --- | --- | --- |
| cpu_Average | % | userId, appId | Application CPU |
| diskIopsRead_Average | Count/Second | userId, appId | Application Disk IOPS Read |
| diskIopsWrite_Average | Count/Second | userId, appId | Application Disk IOPS Write |
| diskRead_Average | Byte/Second | userId, appId | Application Disk IO Throughput Read |
| diskTotal_Average | Kilobyte | userId, appId | Application Disk Total |
| diskUsed_Average | Kilobyte | userId, appId | Application Disk Used |
| diskWrite_Average | Byte/Second | userId, appId | Application Disk IO Throughput Write |
| instanceId_memoryUsed_Average | MB | userId, appId, instanceId | Instance Memory Used |
| instance_cpu_Average | % | userId, appId, instanceId | Instance CPU |
| instance_diskIopsRead_Average | Count/Second | userId, appId, instanceId | Instance Disk IOPS Read |
| instance_diskIopsWrite_Average | Count/Second | userId, appId, instanceId | Instance Disk IOPS Write |
| instance_diskRead_Average | Byte/Second | userId, appId, instanceId | Instance Disk IO Throughput Read |
| instance_diskTotal_Average | Kilobyte | userId, appId, instanceId | Instance Disk Total |
| instance_diskUsed_Average | Kilobyte | userId, appId, instanceId | Instance Disk Used |
| instance_diskWrite_Average | Byte/Second | userId, appId, instanceId | Instance Disk IO Throughput Write |
| instance_load_Average | min | userId, appId, instanceId | Instance Average Load |
| instance_memoryTotal_Average | MB | userId, appId, instanceId | Instance Total Memory |
| instance_memoryUsed_Average | MB | userId, appId, instanceId | Instance Memory Used |
| instance_netRecv_Average | Byte/Second | userId, appId, instanceId | Instance Received Bytes |
| instance_netRecvBytes_Average | Byte | userId, appId, instanceId | Instance Total Received Bytes |
| instance_netRecvDrop_Average | Count/Second | userId, appId, instanceId | Instance Received Packet Drop |
| instance_netRecvError_Average | Count/Second | userId, appId, instanceId | Instance Received Error Packets |
| instance_netRecvPacket_Average | Count/Second | userId, appId, instanceId | Instance Received Packets |
| instance_netTran_Average | Byte/Second | userId, appId, instanceId | Instance Sent Bytes |
| instance_netTranBytes_Average | Byte | userId, appId, instanceId | Instance Total Sent Bytes |
| instance_netTranDrop_Average | Count/Second | userId, appId, instanceId | Instance Sent Packet Drop |
| instance_netTranError_Average | Count/Second | userId, appId, instanceId | Instance Sent Error Packets |
| instance_netTranPacket_Average | Count/Second | userId, appId, instanceId | Instance Sent Packets |
| instance_tcpActiveConn_Average | Count | userId, appId, instanceId | Instance Active TCP Connections |
| instance_tcpInactiveConn_Average | Count | userId, appId, instanceId | Instance Inactive TCP Connections |
| instance_tcpTotalConn_Average | Count | userId, appId, instanceId | Instance Total TCP Connections |
| load_Average | min | userId, appId | Application Average Load |
| memoryTotal_Average | MB | userId, appId | Application Total Memory |
| memoryUsed_Average | MB | userId, appId | Application Memory Used |
| netRecv_Average | Byte/Second | userId, appId | Application Received Bytes |
| netRecvBytes_Average | Byte | userId, appId | Application Total Received Bytes |
| netRecvDrop_Average | Count/Second | userId, appId | Application Received Packet Drop |
| netRecvError_Average | Count/Second | userId, appId | Application Received Error Packets |
| netRecvPacket_Average | Count/Second | userId, appId | Application Received Packets |
| netTran_Average | Byte/Second | userId, appId | Application Sent Bytes |
| netTranBytes_Average | Byte | userId, appId | Application Total Sent Bytes |
| netTranDrop_Average | Count/Second | userId, appId | Application Sent Packet Drop |
| netTranError_Average | Count/Second | userId, appId | Application Sent Error Packets |
| netTranPacket_Average | Count/Second | userId, appId | Application Sent Packets |
| tcpActiveConn_Average | Count | userId, appId | Application Active TCP Connections |
| tcpInactiveConn_Average | Count | userId, appId | Application Inactive TCP Connections |
| tcpTotalConn_Average | Count | userId, appId | Application Total TCP Connections |
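
The used and total metrics listed above are reported as absolute values, so a usage ratio has to be derived from each pair. The following is a minimal sketch of that arithmetic, assuming both values of a pair share the unit shown in the list; the sample numbers and the helper name usage_percent are invented for illustration.

```python
def usage_percent(used: float, total: float) -> float:
    """Return used/total as a percentage; 0.0 when total is not positive."""
    if total <= 0:
        return 0.0
    return 100.0 * used / total


# Hypothetical data point for one application, keyed by metric name.
sample = {
    "memoryUsed_Average": 512.0,        # MB
    "memoryTotal_Average": 1024.0,      # MB
    "diskUsed_Average": 2_000_000.0,    # Kilobyte
    "diskTotal_Average": 20_000_000.0,  # Kilobyte
}

mem_pct = usage_percent(sample["memoryUsed_Average"], sample["memoryTotal_Average"])
disk_pct = usage_percent(sample["diskUsed_Average"], sample["diskTotal_Average"])
print(f"memory {mem_pct:.1f}%, disk {disk_pct:.1f}%")  # memory 50.0%, disk 10.0%
```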

Logs

Alibaba Cloud SAE can send application logs to TrueWatch through Kafka. The process is as follows:

  • Enable Kafka log reporting for SAE applications
  • Enable KafkaMQ log collection in DataKit and subscribe to the Topic that the application logs are reported to

Example:

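[[inputs.kafkamq]]
  [inputs.kafkamq.custom]
    # Each entry maps a Kafka Topic (key) to the Pipeline script (value) that parses its messages.
    # "topic_name" and "pipeline_name.p" are placeholders.
    [inputs.kafkamq.custom.log_topic_map]
      "topic_name"="pipeline_name.p"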