Cloud Dataflow is a fully managed service for transforming and enriching data in stream (real-time) and batch (historical) modes with equal reliability and expressiveness, without complex workarounds or compromises. Its serverless approach to resource provisioning and management gives you access to virtually limitless capacity for your largest data processing challenges, and you pay only for what you use.
Cloud Dataflow unlocks transformational use cases across industries, including:
- Clickstream, point-of-sale, and segmentation analysis in retail.
- Fraud detection in financial services.
- Personalized user experience in gaming.
- IoT analytics in manufacturing, healthcare, and logistics.
Setup
To set up the Google integration and discover the Google service, go to Google Integration Discovery Profile and select GOOGLE/Dataflow Job.
Supported metrics
OpsRamp Metric | Description | Metric Display Name | Unit | Aggregation Type |
---|---|---|---|---|
google_dataflow_job_current_num_vcpus | Number of vCPUs currently being used by this Dataflow job. This is the current number of workers times the number of vCPUs per worker. | Current number of vCPUs in use | Count | Average |
google_dataflow_job_data_watermark_age | Age (time since event timestamp) of the most recent item of data fully processed by the pipeline. | Data watermark age | Seconds | Average |
google_dataflow_job_elapsed_time | Duration that the current run of this pipeline has spent in the Running state so far, in seconds. When a run completes, this stays at the duration of that run until the next run starts. | Elapsed time | Seconds | Average |
google_dataflow_job_element_count | Number of elements added to the PCollection so far. | Element count | Count | Average |
google_dataflow_job_error_count | Number of errors that happened so far. | Error count | Count | Average |
google_dataflow_job_estimated_byte_count | Estimated number of bytes added to the PCollection so far. Dataflow calculates the average encoded size of elements in a PCollection and multiplies it by the number of elements. | Estimated byte count | Bytes | Average |
google_dataflow_job_is_failed | Whether this job has failed. | Failed | Count | Average |
google_dataflow_job_status | Current state of this pipeline (for example: RUNNING, DONE, CANCELLED, FAILED, ...). Not reported while the pipeline is not running. | Status | String | Average |
google_dataflow_job_system_lag | Current maximum duration that an item of data is awaiting processing, in seconds. | System lag | Seconds | Average |
google_dataflow_job_total_memory_usage_time | Total GB seconds of memory allocated to this Dataflow job. | Total memory usage time | GB.seconds | Average |
google_dataflow_job_total_pd_usage_time | Total GB seconds for all persistent disk used by all workers associated with this Dataflow job. | Total PD usage time | GB.seconds | Average |
google_dataflow_job_total_vcpu_time | Total vCPU seconds used by this Dataflow job. | Total vCPU time | Seconds | Average |
google_dataflow_job_billable_shuffle_data_processed | Billable bytes of shuffle data processed by this Dataflow job. Sampled every 60 seconds. | Job Billable Shuffle Data Processed | Bytes | Average |
google_dataflow_job_total_shuffle_data_processed | Total bytes of shuffle data processed by this Dataflow job. | Job Total Shuffle Data Processed | Bytes | Average |
google_dataflow_job_total_streaming_data_processed | Total bytes of streaming data processed by this Dataflow job. | Job Total Streaming Data Processed | Bytes | Average |
google_dataflow_job_user_counter | A user-defined counter metric. | Job User Counter | Count | Average |
google_dataflow_job_current_shuffle_slots | Current shuffle slots used by this Dataflow job. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. | Current shuffle slots in use | Count | Average |
google_dataflow_job_elements_produced_count | Number of elements produced by each PTransform. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. | Elements Produced | Count | Average |
google_dataflow_job_estimated_bytes_produced_count | Estimated total byte size of elements produced by each PTransform. Sampled every 60 seconds. After sampling, data is not visible for up to 180 seconds. | Estimated Bytes Produced | Count | Average |
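
Two of the metrics above are described as simple products: the vCPU count is the number of workers times the vCPUs per worker, and the estimated byte count is the average encoded element size times the number of elements. The following Python sketch illustrates that arithmetic only. The function and parameter names are hypothetical, and the name mapping in `to_cloud_monitoring_type` rests on the assumption that each OpsRamp metric name corresponds to a Cloud Monitoring metric type under `dataflow.googleapis.com/job/`; verify that against your own discovery results before relying on it.

```python
# Illustrative sketch only: all names below are hypothetical helpers,
# not part of any OpsRamp or Google Cloud API.

OPSRAMP_PREFIX = "google_dataflow_job_"
# Assumed Cloud Monitoring metric type prefix for Dataflow job metrics.
CLOUD_MONITORING_PREFIX = "dataflow.googleapis.com/job/"


def current_num_vcpus(num_workers: int, vcpus_per_worker: int) -> int:
    """Current number of workers times the number of vCPUs per worker."""
    return num_workers * vcpus_per_worker


def estimated_byte_count(avg_encoded_size_bytes: float, element_count: int) -> float:
    """Average encoded element size multiplied by the number of elements."""
    return avg_encoded_size_bytes * element_count


def to_cloud_monitoring_type(opsramp_metric: str) -> str:
    """Map an OpsRamp metric name to its assumed Cloud Monitoring type."""
    if not opsramp_metric.startswith(OPSRAMP_PREFIX):
        raise ValueError(f"not a Dataflow job metric: {opsramp_metric}")
    return CLOUD_MONITORING_PREFIX + opsramp_metric[len(OPSRAMP_PREFIX):]


if __name__ == "__main__":
    # A job with 4 workers of 2 vCPUs each.
    print(current_num_vcpus(4, 2))
    # 1,000 elements averaging 128 encoded bytes each.
    print(estimated_byte_count(128.0, 1000))
    print(to_cloud_monitoring_type("google_dataflow_job_system_lag"))
```

Because the byte count is derived from an average, treat it as an estimate for capacity planning rather than an exact measure of data volume.
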
Event support
- Supported
- Configurable in OpsRamp Google Integration Discovery Profile.