The Azure Batch AI service has been retired.
The at-scale training capabilities of Batch AI are now available in the Azure Machine Learning service. In addition to many other machine learning capabilities, the Azure Machine Learning service includes a cloud-based managed compute target for training and batch scoring machine learning models. The Azure Machine Learning service is generally available, which means that it includes a committed SLA and a choice of support plans. Pricing for Azure infrastructure should not differ between the Batch AI service and the Azure Machine Learning service: in both cases, only the cost of the underlying compute is charged.
Use the Azure Public cloud integration to discover Azure Batch AI Workspaces and collect metrics from them.
External reference
Setup
To set up the Azure integration and discover the Azure service, go to Azure Integration Discovery Profile and select Machine Learning Services Workspaces.
Event support
- Supported: Azure events for Azure Machine Learning Services Workspaces.
- Configure Azure Events in the OpsRamp Azure Integration Discovery Profile.
Supported metrics
OpsRamp Metric | Description | Metric Display Name | Unit | Aggregation Type |
---|---|---|---|---|
azure_ml_services_workspaces_Active_Cores | Number of active cores. | Active Cores | Count | Average |
azure_ml_services_workspaces_Active_Nodes | Number of active nodes. | Active Nodes | Count | Average |
azure_ml_services_workspaces_Cancel_Requested_Runs | Number of runs where cancel was requested for this workspace. | Cancel Requested Runs | Count | Total |
azure_ml_services_workspaces_Cancelled_Runs | Number of runs cancelled for this workspace. | Cancelled Runs | Count | Total |
azure_ml_services_workspaces_Completed_Runs | Number of runs completed successfully for this workspace. | Completed Runs | Count | Total |
azure_ml_services_workspaces_CpuUtilization | Percentage of utilization on a CPU node. | CpuUtilization | Count | Average |
azure_ml_services_workspaces_Errors | Number of run errors in this workspace. | Errors | Count | Total |
azure_ml_services_workspaces_Failed_Runs | Number of runs failed for this workspace. | Failed Runs | Count | Total |
azure_ml_services_workspaces_Finalizing_Runs | Number of runs entered finalizing state for this workspace. | Finalizing Runs | Count | Total |
azure_ml_services_workspaces_GpuUtilization | Percentage of utilization on a GPU node. | GpuUtilization | Count | Average |
azure_ml_services_workspaces_Idle_Cores | Number of idle cores. | Idle Cores | Count | Average |
azure_ml_services_workspaces_Idle_Nodes | Number of idle nodes. | Idle Nodes | Count | Average |
azure_ml_services_workspaces_Leaving_Cores | Number of leaving cores. | Leaving Cores | Count | Average |
azure_ml_services_workspaces_Model_Deploy_Failed | Number of model deployments that failed in this workspace. | Model Deploy Failed | Count | Total |
azure_ml_services_workspaces_Model_Deploy_Started | Number of model deployments started in this workspace. | Model Deploy Started | Count | Total |
azure_ml_services_workspaces_Model_Deploy_Succeeded | Number of model deployments that succeeded in this workspace. | Model Deploy Succeeded | Count | Total |
azure_ml_services_workspaces_Model_Register_Failed | Number of model registrations that failed in this workspace. | Model Register Failed | Count | Total |
azure_ml_services_workspaces_Model_Register_Succeeded | Number of model registrations that succeeded in this workspace. | Model Register Succeeded | Count | Total |
azure_ml_services_workspaces_Not_Responding_Runs | Number of runs not responding for this workspace. | Not Responding Runs | Count | Total |
azure_ml_services_workspaces_Not_Started_Runs | Number of runs in Not Started state for this workspace. | Not Started Runs | Count | Total |
azure_ml_services_workspaces_Preempted_Cores | Number of preempted cores. | Preempted Cores | Count | Average |
azure_ml_services_workspaces_Preempted_Nodes | Number of preempted nodes. | Preempted Nodes | Count | Average |
azure_ml_services_workspaces_Preparing_Runs | Number of runs that are preparing for this workspace. | Preparing Runs | Count | Total |
azure_ml_services_workspaces_Provisioning_Runs | Number of runs that are provisioning for this workspace. | Provisioning Runs | Count | Total |
azure_ml_services_workspaces_Queued_Runs | Number of runs that are queued for this workspace. | Queued Runs | Count | Total |
azure_ml_services_workspaces_Quota_Utilization_Percentage | Percent of quota utilized. | Quota Utilization Percentage | Count | Average |
azure_ml_services_workspaces_Started_Runs | Number of runs running for this workspace. | Started Runs | Count | Total |
azure_ml_services_workspaces_Starting_Runs | Number of runs started for this workspace. | Starting Runs | Count | Total |
azure_ml_services_workspaces_Total_Cores | Number of total cores. | Total Cores | Count | Average |
azure_ml_services_workspaces_Total_Nodes | Number of total nodes. | Total Nodes | Count | Average |
azure_ml_services_workspaces_Unusable_Cores | Number of unusable cores. | Unusable Cores | Count | Average |
azure_ml_services_workspaces_Unusable_Nodes | Number of unusable nodes. | Unusable Nodes | Count | Average |
azure_ml_services_workspaces_Warnings | Number of run warnings in this workspace. | Warnings | Count | Total |
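
The metric names in the table appear to follow a simple pattern: the prefix `azure_ml_services_workspaces_` followed by the display name with spaces replaced by underscores. The Python sketch below illustrates that pattern and shows one way a derived value could be computed from two of the listed metrics. The function names `opsramp_metric_name` and `idle_core_ratio` are hypothetical helpers for illustration only. They are not part of any OpsRamp or Azure API, and the naming rule is inferred from the table rather than documented behavior.

```python
from typing import Optional

# Prefix shared by every metric name in the table above.
PREFIX = "azure_ml_services_workspaces_"


def opsramp_metric_name(display_name: str) -> str:
    """Build a metric name from a display name, following the pattern
    observed in the table (assumption: spaces become underscores)."""
    return PREFIX + "_".join(display_name.split())


def idle_core_ratio(idle_cores: float, total_cores: float) -> Optional[float]:
    """Return the share of cores that are idle, given values of the
    Idle_Cores and Total_Cores metrics. Returns None when there are no
    cores, to avoid dividing by zero."""
    if total_cores <= 0:
        return None
    return idle_cores / total_cores


if __name__ == "__main__":
    print(opsramp_metric_name("Cancel Requested Runs"))
    print(idle_core_ratio(2, 8))
```

A ratio like this can help compare workspaces of different sizes, since the raw Idle Cores value alone does not show how much of the cluster is unused.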