Cloud Dataproc is a managed Apache Spark and Apache Hadoop service that lets you use open source data tools for batch processing, querying, streaming, and machine learning.
Cloud Dataproc automation helps you create clusters quickly, manage them easily, and save money by turning clusters off when you do not need them. With less time and money spent on administration, you can focus on your jobs and your data.
Setup
To set up the Google integration and discover the Google service, go to Google Integration Discovery Profile and select GOOGLE/Dataproc Cluster.
Supported metrics
OpsRamp Metric | Description | Metric Display Name | Unit | Aggregation Type |
---|---|---|---|---|
google_dataproc_cluster_hdfs_datanodes | Number of HDFS DataNodes that are running inside a cluster. | Cluster Hdfs Datanodes | Count | Average |
google_dataproc_cluster_hdfs_storage_capacity | Capacity of the HDFS system running on the cluster, in GB. | Cluster Hdfs Storage Capacity | Count | Average |
google_dataproc_cluster_hdfs_storage_utilization | Percentage of HDFS storage currently used. | Cluster Hdfs Storage Utilization | Count | Average |
google_dataproc_cluster_hdfs_unhealthy_blocks | Number of unhealthy blocks inside the cluster. | Cluster Hdfs Unhealthy Blocks | Count | Average |
google_dataproc_cluster_job_completion_time | Amount of time that jobs took to complete, from the time the user submits a job to the time Dataproc reports it is completed. | Cluster Job Completion Time | Count | Average |
google_dataproc_cluster_job_duration | Amount of time that jobs have spent in a given state. | Cluster Job Duration | Count | Average |
google_dataproc_cluster_job_failed_count | Number of jobs that have failed on a cluster. | Cluster Job Failed Count | Count | Average |
google_dataproc_cluster_job_running_count | Number of jobs that are running on a cluster. | Cluster Job Running Count | Count | Average |
google_dataproc_cluster_job_submitted_count | Number of jobs submitted to a cluster. | Cluster Job Submitted Count | Count | Average |
google_dataproc_cluster_operation_completion_time | Amount of time that operations took to complete, from the time the user submits an operation to the time Dataproc reports it is completed. | Cluster Operation Completion Time | Count | Average |
google_dataproc_cluster_operation_duration | Amount of time that operations have spent in a given state. | Cluster Operation Duration | Count | Average |
google_dataproc_cluster_operation_failed_count | Number of operations that have failed on a cluster. | Cluster Operation Failed Count | Count | Average |
google_dataproc_cluster_operation_running_count | Number of operations that are running on a cluster. | Cluster Operation Running Count | Count | Average |
google_dataproc_cluster_operation_submitted_count | Number of operations submitted to a cluster. | Cluster Operation Submitted Count | Count | Average |
google_dataproc_cluster_yarn_allocated_memory_percentage | Percentage of YARN memory that is allocated. | Cluster Yarn Allocated Memory Percentage | Count | Average |
google_dataproc_cluster_yarn_apps | Number of active YARN applications. | Cluster Yarn Apps | Count | Average |
google_dataproc_cluster_yarn_containers | Number of YARN containers. | Cluster Yarn Containers | Count | Average |
google_dataproc_cluster_yarn_memory_size | YARN memory size in GB. | Cluster Yarn Memory Size | Count | Average |
google_dataproc_cluster_yarn_nodemanagers | Number of YARN NodeManagers running inside the cluster. | Cluster Yarn Nodemanagers | Count | Average |
google_dataproc_cluster_yarn_pending_memory_size | Current memory request, in GB, that is pending to be fulfilled by the scheduler. | Cluster Yarn Pending Memory Size | Count | Average |
google_dataproc_cluster_yarn_virtual_cores | Number of virtual cores in YARN. | Cluster Yarn Virtual Cores | Count | Average |
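
Several of these metrics are easier to act on when combined. The following Python sketch shows one possible way to derive a few indicators from a single set of readings. It is illustrative only: the function names, the dictionary layout, and the sample values are hypothetical, and it assumes that storage utilization is reported on a 0 to 100 scale and capacity in GB, as the descriptions above suggest. It does not call any OpsRamp or Google API.

```python
from typing import Mapping, Optional


def safe_ratio(numerator: float, denominator: float) -> Optional[float]:
    """Return numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return numerator / denominator


def summarize_cluster(sample: Mapping[str, float]) -> dict:
    """Derive a few health indicators from one set of metric readings.

    Assumption: utilization is a percentage (0-100) and capacity is in GB.
    """
    capacity_gb = sample["google_dataproc_cluster_hdfs_storage_capacity"]
    utilization_pct = sample["google_dataproc_cluster_hdfs_storage_utilization"]
    failed = sample["google_dataproc_cluster_job_failed_count"]
    submitted = sample["google_dataproc_cluster_job_submitted_count"]
    unhealthy = sample["google_dataproc_cluster_hdfs_unhealthy_blocks"]
    return {
        # Approximate HDFS space in use, in GB.
        "hdfs_used_gb": capacity_gb * utilization_pct / 100.0,
        # Share of submitted jobs that failed; None if nothing was submitted.
        "job_failure_ratio": safe_ratio(failed, submitted),
        # True when at least one unhealthy HDFS block is reported.
        "has_unhealthy_blocks": unhealthy > 0,
    }


# Hypothetical readings, made up for illustration only.
example_sample = {
    "google_dataproc_cluster_hdfs_storage_capacity": 1000.0,
    "google_dataproc_cluster_hdfs_storage_utilization": 25.0,
    "google_dataproc_cluster_job_failed_count": 2.0,
    "google_dataproc_cluster_job_submitted_count": 8.0,
    "google_dataproc_cluster_hdfs_unhealthy_blocks": 0.0,
}

summary = summarize_cluster(example_sample)
print(summary)
```

Because every metric uses the Average aggregation type, ratios computed this way are rough indicators over the aggregation window, not exact counts. If utilization turns out to be reported as a fraction between 0 and 1, drop the division by 100.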
Event support
- Not supported