Skip to content

Hadoop HDFS NameNode

Collect HDFS namenode Metrics information.

Installation and Deployment

Since the NameNode is developed in the java LANGUAGE, it is possible to use the jmx-exporter method to collect Metrics information.

1. NameNode Configuration

1.1 Download jmx-exporter

Download address: https://github.com/prometheus/jmx_exporter

1.2 Download jmx Script

Download address: https://github.com/lrwh/jmx-exporter/blob/main/hadoop-hdfs-namenode.yml

1.3 Adjust NameNode Startup Parameters

Add the following to the startup parameters of namenode:

{JAVA_GC_ARGS} -javaagent:/opt/guance/jmx/jmx_exporter-1.0.1.jar=localhost:17107:/opt/guance/jmx/hadoop-hdfs-namenode.yml

1.4 Restart NameNode

2. DataKit Collector Configuration

2.1 Install DataKit

2.2 Configure Collector

The jmx-exporter can directly expose metrics url, so it can be collected directly through the prom collector.

Go to the conf.d/prom under the DataKit installation directory, copy prom.conf.sample to namenode.conf.

cp prom.conf.sample namenode.conf

Adjust the content of namenode.conf as follows:

  urls = ["http://localhost:17107/metrics"]
  source ="hdfs-namenode"
  [inputs.prom.tags]
    component = "hdfs-namenode"
  interval = "10s"

Other configurations can be adjusted as needed

, adjustment parameter description :

  • urls: The jmx-exporter Metrics address, fill in the Metrics url exposed by the corresponding component here
  • source: Collector alias, it's recommended to make distinctions
  • keep_exist_metric_name: Keep the metric name
  • interval: Collection interval
  • inputs.prom.tags: Add extra tags

3. Restart DataKit

Restart DataKit

Metrics

Hadoop Measurement Sets

NameNode Metrics are located under the Hadoop Measurement sets, here we mainly introduce the related NameNode Metrics descriptions.

Metrics Description Unit
namenode_add_block_ops Add block operation counts count
namenode_allow_snapshot_ops Allow snapshot operation counts count
namenode_block_capacity Block capacity byte
namenode_block_deletion_start_time Block deletion start time count
namenode_block_ops_batched Batched block operation counts count
namenode_block_ops_queued Queued block operation counts count
namenode_block_pool_used_space Used block pool space count
namenode_block_received_and_deleted_ops Received and deleted block operation counts count
namenode_blocks Block counts count
namenode_bytes_in_future_ecblock_groups Bytes in future EC block groups count
namenode_bytes_in_future_replicated_blocks Bytes in future replicated blocks count
namenode_bytes_with_future_generation_stamps Bytes with future generation stamps count
namenode_cache_capacity Cache capacity byte
namenode_cache_report_avg_time Cache report average time count
namenode_cache_report_num_ops Cache report operation counts count
namenode_cache_used Used cache count
namenode_capacity Capacity count
namenode_capacity_remaining Remaining capacity byte
namenode_capacity_remaining_gb Remaining capacity (GB) GB
namenode_capacity_total_gb Total capacity (GB) GB
namenode_capacity_used Used capacity byte
namenode_capacity_used_gb Used capacity (GB) GB
namenode_capacity_used_non_dfs Non-DFS used capacity GB
namenode_corrupt_blocks Corrupt blocks count
namenode_corrupt_ecblock_groups Corrupt EC block groups count
namenode_corrupt_replicated_blocks Corrupt replicated blocks count
namenode_create_file_ops Create file operation counts count
namenode_create_snapshot_ops Create snapshot operation counts count
namenode_create_symlink_ops Create symbolic link operation counts count
namenode_delete_file_ops Delete file operation counts count
namenode_delete_snapshot_ops Delete snapshot operation counts count
namenode_disallow_snapshot_ops Disallow snapshot operation counts count
namenode_distinct_version_count Distinct version counts count
namenode_distinct_versions Distinct versions count
namenode_dropped_pub_all Dropped pub_all count
namenode_elapsed_time Elapsed time ms
namenode_estimated_capacity_lost Estimated lost capacity byte
namenode_excess_blocks Excess blocks count
namenode_expired_heartbeats Expired heartbeats count
namenode_file_info_ops File info operation counts count
namenode_files File counts count
namenode_files_appended Appended file counts count
namenode_files_deleted Deleted file counts count
namenode_files_in_get_listing_ops File counts in get listing operations count
namenode_files_renamed Renamed file counts count
namenode_files_truncated Truncated file counts count
namenode_free Free count
namenode_fs_image_load_time File system image load time ms
namenode_fs_lock_queue_length File system lock queue length count
namenode_gc_count Garbage collection counts count
namenode_generate_edektime_avg_time Generate EDEK time average time ms
namenode_generate_edektime_num_ops Generate EDEK operation counts count
namenode_get_additional_datanode_ops Get additional data node operation counts count
namenode_highest_priority_low_redundancy_ecblocks Highest priority low redundancy EC blocks count
namenode_highest_priority_low_redundancy_replicated_blocks Highest priority low redundancy replicated blocks count
namenode_last_checkpoint_time Last checkpoint time ms
namenode_last_hatransition_time Last HA transition time ms
namenode_last_written_transaction_id Last written transaction ID count
namenode_list_snapshottable_dir_ops List snapshottable directory operation counts count
namenode_lock_queue_length Lock queue length count
namenode_low_redundancy_ecblock_groups Low redundancy EC block groups count
namenode_low_redundancy_replicated_blocks Low redundancy replicated blocks count
namenode_max_objects Max object counts count
namenode_millis_since_last_loaded_edits Milliseconds since last loaded edits ms
namenode_missing_blocks Missing blocks count
namenode_missing_ecblock_groups Missing EC block groups count
namenode_missing_repl_one_blocks Missing replication one blocks count
namenode_missing_replicated_blocks Missing replicated blocks count
namenode_missing_replication_one_blocks Missing replication one blocks count
namenode_nnstarted_time_in_millis Start time (milliseconds) ms
namenode_non_dfs_used_space Non-DFS used space count
namenode_num_active_clients Active client counts count
namenode_num_active_sinks Active sink data node counts count
namenode_num_active_sources Active source data node counts count
namenode_num_all_sinks All sink data node counts count
namenode_num_all_sources All source data node counts count
namenode_num_dead_data_nodes Dead data node counts count
namenode_num_decom_dead_data_nodes Decommissioned dead data node counts count
namenode_num_decom_live_data_nodes Decommissioned live data node counts count
namenode_num_decommissioning_data_nodes Decommissioning data node counts count
namenode_num_edit_log_loaded_avg_count Edit log loaded average counts count
namenode_num_edit_log_loaded_num_ops Edit log loaded operation counts count
namenode_num_encryption_zones Encryption zone counts count
namenode_num_entering_maintenance_data_nodes Entering maintenance data node counts count
namenode_num_files_under_construction Files under construction counts count
namenode_num_in_maintenance_dead_data_nodes Maintenance dead data node counts count
namenode_num_in_maintenance_live_data_nodes Maintenance live data node counts count
namenode_num_live_data_nodes Live data node counts count
namenode_num_stale_data_nodes Stale data node counts count
namenode_num_stale_storages Stale storage counts count
namenode_num_timed_out_pending_reconstructions Timed out pending reconstruction counts count
namenode_num_times_re_replication_not_scheduled Not scheduled re-replication counts count
namenode_number_of_missing_blocks Missing block counts count
namenode_number_of_missing_blocks_with_replication_factor_one Missing block counts with replication factor one count
namenode_number_of_snapshottable_dirs Snapshottable directory counts count
namenode_pending_data_node_message_count Pending data node message counts count
namenode_pending_deletion_blocks Pending deletion block counts count
namenode_pending_deletion_ecblocks Pending deletion EC block counts count
namenode_pending_deletion_replicated_blocks Pending deletion replicated block counts count
namenode_pending_reconstruction_blocks Pending reconstruction block counts count
namenode_pending_replication_blocks Pending replication block counts count
namenode_percent_block_pool_used Block pool used percentage percent
namenode_percent_complete Complete percentage percent
namenode_percent_remaining Remaining percentage percent
namenode_percent_used Used percentage percent
namenode_postponed_misreplicated_blocks Postponed misreplicated block counts count
namenode_publish_avg_time Publish average time ms
namenode_publish_num_ops Publish operation counts count
namenode_put_image_avg_time Put image average time ms
namenode_put_image_num_ops Put image operation counts count
namenode_rename_snapshot_ops Rename snapshot operation counts count
namenode_resource_check_time_avg_time Resource check average time ms
namenode_resource_check_time_num_ops Resource check operation counts count
namenode_safe_mode Safe mode count
namenode_safe_mode_count Safe mode counts count
namenode_safe_mode_elapsed_time Safe mode elapsed time count
namenode_safe_mode_percent_complete Safe mode complete percentage percent
namenode_safe_mode_time Safe mode time ms
namenode_saving_checkpoint Saving checkpoint count
namenode_saving_checkpoint_count Saving checkpoint counts count
namenode_saving_checkpoint_elapsed_time Saving checkpoint elapsed time ms
namenode_saving_checkpoint_percent_complete Saving checkpoint complete percentage count
namenode_scheduled_replication_blocks Scheduled replication block counts count
namenode_stale_data_nodes Stale data nodes count
namenode_storage_block_report_avg_time Storage block report average time ms
namenode_storage_block_report_num_ops Storage block report operation counts count
namenode_successful_re_replications Successful re-replications count
namenode_syncs_avg_time Syncs average time ms
namenode_syncs_num_ops Syncs operation counts count
namenode_tag_total_sync_times Tag total sync times count
namenode_timeout_re_replications Timeout re-replications count
namenode_total_blocks Total block counts count
namenode_total_ecblock_groups Total EC block group counts count
namenode_total_file_ops Total file operation counts count
namenode_total_load Total load count
namenode_total_replicated_blocks Total replicated block counts count
namenode_total_sync_count Total sync counts count
namenode_total_sync_times Total sync times count
namenode_transactions_avg_time Transactions average time ms
namenode_transactions_batched_in_sync Transactions batched in sync count
namenode_transactions_num_ops Transactions operation counts count
namenode_transactions_since_last_checkpoint Transactions since last checkpoint count
namenode_transactions_since_last_log_roll Transactions since last log roll count
namenode_under_replicated_blocks Under replicated block counts count
namenode_used Used count
namenode_volume_failures Volume failure counts count
namenode_warm_up_edektime_avg_time Warm up EDEK average time ms
namenode_warm_up_edektime_num_ops Warm up EDEK operation counts count