How to Enable Network Monitoring
Introduction
Whether they run on on-premises servers or in cloud environments, modern services and applications rely heavily on network performance, and network stability is critical to business systems. If a key business application fails and becomes unavailable, you may need to locate the fault by analyzing logs, checking systems, accessing programs, viewing processes, and debugging services, all of which can cost valuable time.
Therefore, comprehensive visibility into the network has become a key part of monitoring application status and performance. However, as your applications scale and grow increasingly complex, achieving this visibility becomes challenging.
Prerequisites
- Create a TrueWatch account.
- Install DataKit on your host.
Steps
Step 1: Enable the eBPF integration to collect network data
The ebpf collector gathers host network TCP/UDP connection information, bash execution logs, and similar data. It includes two plugins, ebpf-net and ebpf-bash:
- ebpf-net:
  - Data category: Network
  - Composed of netflow and dnsflow, which collect host TCP/UDP connection statistics and DNS resolution information, respectively.
- ebpf-bash:
  - Data category: Logging
  - Collects bash execution logs, including process ID, username, executed command, and timestamp.
Operating system support for the ebpf collector: linux/amd64. Apart from CentOS 7.6+ and Ubuntu 16.04, other distributions require a Linux kernel version higher than 4.0.0.
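The kernel requirement above can be checked before enabling the collector. The following is a minimal sketch, not part of DataKit: the helper names `parse_kernel_version` and `kernel_newer_than` are hypothetical, and the check covers only the generic "higher than 4.0.0" rule. It does not detect the CentOS 7.6+ and Ubuntu 16.04 exceptions, which would need a separate distribution check.

```python
import platform
import re
from typing import Tuple


def parse_kernel_version(release: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a release string such as '5.4.0-42-generic'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    # A missing patch component (e.g. '4.19') is treated as 0.
    major, minor, patch = match.groups(default="0")
    return int(major), int(minor), int(patch)


def kernel_newer_than(release: str, minimum: Tuple[int, int, int] = (4, 0, 0)) -> bool:
    """Return True when the kernel version is strictly higher than `minimum`."""
    return parse_kernel_version(release) > minimum


if __name__ == "__main__" and platform.system() == "Linux":
    release = platform.release()
    if kernel_newer_than(release):
        print(f"kernel {release}: meets the generic requirement")
    else:
        print(f"kernel {release}: supported only on the listed exception distributions")
```

On a Linux host, running the script prints whether the running kernel satisfies the generic requirement; a "false" result on CentOS 7.6+ or Ubuntu 16.04 does not by itself mean the collector is unsupported.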
Configure the ebpf collector
Navigate to the conf.d/host directory under the DataKit installation directory, copy ebpf.conf.sample, and rename the copy to ebpf.conf. Example configuration:
[[inputs.ebpf]]
  daemon = true
  name = 'ebpf'
  cmd = "/usr/local/datakit/externals/datakit-ebpf"
  args = ["--datakit-apiserver", "0.0.0.0:9529"]
  envs = []
  ## all supported plugins:
  ## - "ebpf-net":
  ##     contains L4-network, dns collection
  ## - "ebpf-bash":
  ##     log bash
  ##
  enabled_plugins = ["ebpf-net"]
  [inputs.ebpf.tags]
    # some_tag = "some_value"
    # more_tag = "some_other_value"
#############################
# Parameter Description (Marked * indicates required)
#############################
# --hostname : Hostname, this parameter changes the value of the host tag when uploading data from this collector. Priority: specified parameter > ENV_HOSTNAME value in datakit.conf (if not empty, automatically added at startup) > self-collected (default)
# --datakit-apiserver : DataKit API Server address, default 0.0.0.0:9529
# --log : Log output path, default DataKitInstallDir/externals/datakit-ebpf.log
# --log-level : Log level, default info
# --service : Default ebpf
By default, ebpf-bash is not enabled. To enable it, add "ebpf-bash" to the enabled_plugins configuration option.
After editing the configuration, restart DataKit for the changes to take effect.
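Adding a plugin to enabled_plugins can also be scripted. The sketch below is illustrative only and not a DataKit tool: `enable_plugin` is a hypothetical helper that edits the configuration text with a regular expression, assuming the option sits on a single uncommented line as in the example configuration above.

```python
import re

# Matches one uncommented line of the form: enabled_plugins = ["a", "b"]
_PLUGINS_LINE = re.compile(
    r'^([ \t]*enabled_plugins[ \t]*=[ \t]*)\[(.*?)\][ \t]*$',
    re.MULTILINE,
)


def enable_plugin(conf_text: str, plugin: str) -> str:
    """Return conf_text with `plugin` present in the enabled_plugins list."""

    def _rewrite(match):
        current = re.findall(r'"([^"]+)"', match.group(2))
        if plugin not in current:
            current.append(plugin)
        items = ", ".join(f'"{name}"' for name in current)
        return f"{match.group(1)}[{items}]"

    new_text, count = _PLUGINS_LINE.subn(_rewrite, conf_text, count=1)
    if count == 0:
        raise ValueError("no uncommented enabled_plugins line found")
    return new_text


if __name__ == "__main__":
    sample = 'envs = []\nenabled_plugins = ["ebpf-net"]\n'
    print(enable_plugin(sample, "ebpf-bash"))
```

To apply it to a real host, read ebpf.conf from the conf.d/host directory, pass its contents through `enable_plugin`, write the result back, and then restart DataKit. Calling the function again with the same plugin leaves the text unchanged.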
Step 2: Log in to TrueWatch to view detailed host network views
Once host network data is collected successfully, it is reported to the TrueWatch console. On the "Network" tab of the "Infrastructure" -> "Host" details page, you can view all network performance monitoring data within the workspace.
TrueWatch lets you query detailed network information for individual hosts. It currently supports network performance monitoring for the TCP and UDP protocols, each combined with the incoming (inbound) and outgoing (outbound) directions. It also provides seven network metrics for comprehensive real-time monitoring of network traffic: bytes received, bytes sent, TCP latency, TCP jitter, TCP connections, TCP retransmissions, and TCP closures.
Network Connection Analysis
TrueWatch supports viewing network connection data based on source IP/port and destination IP/port. You can add the metrics you care about and filter the data by field. For example, you can query the network connections whose destination port is 443 and view their real-time network flow data for traffic analysis.
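The destination-port filter described above amounts to selecting connection records by one field and aggregating the rest. The sketch below shows that logic on in-memory sample data. It does not call any TrueWatch API, and the record layout and field names (`src_ip`, `dst_port`, `bytes_sent`, `bytes_received`) are assumptions made for illustration, not the product's schema.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, int]]

# Hypothetical sample records; real data would come from the console or an export.
CONNECTIONS: List[Record] = [
    {"src_ip": "10.0.0.5", "src_port": 51234, "dst_ip": "10.0.1.9", "dst_port": 443,
     "bytes_sent": 1200, "bytes_received": 5400},
    {"src_ip": "10.0.0.5", "src_port": 51240, "dst_ip": "10.0.1.7", "dst_port": 3306,
     "bytes_sent": 300, "bytes_received": 900},
    {"src_ip": "10.0.0.8", "src_port": 40022, "dst_ip": "10.0.1.9", "dst_port": 443,
     "bytes_sent": 800, "bytes_received": 2600},
]


def filter_by_dst_port(records: List[Record], port: int) -> List[Record]:
    """Keep only the connections whose destination port equals `port`."""
    return [r for r in records if int(r["dst_port"]) == port]


def total_bytes(records: List[Record]) -> Dict[str, int]:
    """Sum sent and received bytes across the given connections."""
    return {
        "bytes_sent": sum(int(r["bytes_sent"]) for r in records),
        "bytes_received": sum(int(r["bytes_received"]) for r in records),
    }


if __name__ == "__main__":
    https_conns = filter_by_dst_port(CONNECTIONS, 443)
    print(len(https_conns), total_bytes(https_conns))
```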
Step 3: Advanced Usage - Host Network Topology Map
Under "Infrastructure" -> "Host", click the small network topology map icon in the upper left corner to switch to the host network distribution view. In the "Network Topology Map", you can visually query the network traffic between hosts in the current workspace and quickly analyze TCP latency, jitter, retransmissions, connections, and closures among different hosts.
- Time Widget: By default, it fetches data from the last 48 hours and does not support auto-refresh; manual refresh is needed to get new data.
- Search and Filter: You can quickly search for host names using keyword fuzzy matching or filter host nodes and their relationships using tags.
- Color Fill: Use "Fill" to customize the coloring of host nodes. The fill value's size and custom range determine the node color. Supported metrics include TCP latency, jitter, retransmissions, connections, and closures.
- Host Nodes:
  - Host node icons are divided into regular hosts and cloud hosts; cloud hosts display as logos of cloud service providers.
  - Node edge colors change according to the fill field value and custom ranges.
  - Host nodes are connected via lines indicating network traffic, represented by bidirectional curves showing incoming/outgoing direction from source to destination host.
  - Node sizes reflect inbound traffic volume.
  - Line thickness reflects the amount of incoming and outgoing traffic data.
- Correlation Query: Clicking on a host icon enables correlation queries, supporting viewing host details, related logs, traces, and events.
- Custom Ranges: Turn on "Custom Range" to define custom legend color intervals for selected fill metrics. Legend colors are divided equally into five intervals based on maximum and minimum values, each interval assigned a distinct color. Connections and nodes outside the defined data range are grayed out.
- Mouse Hover: Hover over a host object node to view sent/received bytes, TCP latency, jitter, retransmissions, connections, and closures.
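The custom range behavior in the list above (five equal intervals between the minimum and maximum, with out-of-range values grayed out) can be sketched as follows. This is an illustration of the described rule, not TrueWatch's implementation: `interval_index` is a hypothetical helper, and the handling of values that fall exactly on an interval boundary is an assumption.

```python
from typing import Optional


def interval_index(value: float, lo: float, hi: float, intervals: int = 5) -> Optional[int]:
    """Map `value` to one of `intervals` equal-width legend intervals.

    Returns an index from 0 (lowest interval) to intervals - 1 (highest),
    or None when the value lies outside [lo, hi] and would be grayed out.
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    if value < lo or value > hi:
        return None
    width = (hi - lo) / intervals
    # Assumption: a boundary value belongs to the upper interval,
    # except the maximum, which is clamped into the last interval.
    return min(int((value - lo) / width), intervals - 1)


if __name__ == "__main__":
    # Example: TCP retransmissions with a custom range of 0 to 100.
    for sample in (5, 20, 55, 100, 130):
        print(sample, interval_index(sample, 0, 100))
```

A node color would then be looked up from a five-entry palette by the returned index, with a gray fallback for `None`.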
By observing the host network distribution and key performance indicators, you can understand current traffic conditions. Suppose a newly added managed service unintentionally consumes all your bandwidth. You can visualize the bandwidth data and use TCP retransmissions to quickly identify the network issues in your infrastructure that are causing connectivity problems. You can then review the related logs and request traces to follow the traffic flow while troubleshooting.
More References
For more information on network performance monitoring, refer to the ebpf Collector documentation.