Azure Batch AI Workspaces

The Azure Batch AI service has been retired.

The at-scale training capabilities of Batch AI are now available in the Azure Machine Learning service. In addition to many other machine learning capabilities, the Azure Machine Learning service includes a cloud-based managed compute target for training and batch scoring of machine learning models. The Azure Machine Learning service is generally available, which means that it includes a committed SLA and a choice of support plans. Pricing for Azure infrastructure should not differ between the Batch AI service and the Azure Machine Learning service: in both cases, only the underlying compute is charged.

Use the Azure Public cloud integration to discover and collect metrics against Azure Batch AI Workspaces.

External reference

Setup

To set up the Azure integration and discover the Azure service, go to Azure Integration Discovery Profile and select Machine Learning Services Workspaces.

Event support

Supported: Azure events for Azure Machine Learning Services Workspaces.
Configure Azure Events in the OpsRamp Azure Integration Discovery Profile.

Supported metrics

OpsRamp Metric	Metric Display Name	Unit	Aggregation Type	Description
azure_ml_services_workspaces_Active_Cores	Active Cores	Count	Average	Number of active cores.
azure_ml_services_workspaces_Active_Nodes	Active Nodes	Count	Average	Number of active nodes.
azure_ml_services_workspaces_Cancel_Requested_Runs	Cancel Requested Runs	Count	Total	Number of runs where cancel was requested for this workspace.
azure_ml_services_workspaces_Cancelled_Runs	Cancelled Runs	Count	Total	Number of runs cancelled for this workspace.
azure_ml_services_workspaces_Completed_Runs	Completed Runs	Count	Total	Number of runs completed successfully for this workspace.
azure_ml_services_workspaces_CpuUtilization	CpuUtilization	Count	Average	Percentage of memory utilization on a CPU node.
azure_ml_services_workspaces_Errors	Errors	Count	Total	Number of run errors in this workspace.
azure_ml_services_workspaces_Failed_Runs	Failed Runs	Count	Total	Number of runs failed for this workspace.
azure_ml_services_workspaces_Finalizing_Runs	Finalizing Runs	Count	Total	Number of runs entered finalizing state for this workspace.
azure_ml_services_workspaces_GpuUtilization	GpuUtilization	Count	Average	Percentage of memory utilization on a GPU node.
azure_ml_services_workspaces_Idle_Cores	Idle Cores	Count	Average	Number of idle cores.
azure_ml_services_workspaces_Idle_Nodes	Idle Nodes	Count	Average	Number of idle nodes.
azure_ml_services_workspaces_Leaving_Cores	Leaving Cores	Count	Average	Number of leaving cores.
azure_ml_services_workspaces_Model_Deploy_Failed	Model Deploy Failed	Count	Total	Number of model deployments that failed in this workspace.
azure_ml_services_workspaces_Model_Deploy_Started	Model Deploy Started	Count	Total	Number of model deployments started in this workspace.
azure_ml_services_workspaces_Model_Deploy_Succeeded	Model Deploy Succeeded	Count	Total	Number of model deployments that succeeded in this workspace.
azure_ml_services_workspaces_Model_Register_Failed	Model Register Failed	Count	Total	Number of model registrations that failed in this workspace.
azure_ml_services_workspaces_Model_Register_Succeeded	Model Register Succeeded	Count	Total	Number of model registrations that succeeded in this workspace.
azure_ml_services_workspaces_Not_Responding_Runs	Not Responding Runs	Count	Total	Number of runs not responding for this workspace.
azure_ml_services_workspaces_Not_Started_Runs	Not Started Runs	Count	Total	Number of runs in Not Started state for this workspace.
azure_ml_services_workspaces_Preempted_Cores	Preempted Cores	Count	Average	Number of preempted cores.
azure_ml_services_workspaces_Preempted_Nodes	Preempted Nodes	Count	Average	Number of preempted nodes.
azure_ml_services_workspaces_Preparing_Runs	Preparing Runs	Count	Total	Number of runs that are preparing for this workspace.
azure_ml_services_workspaces_Provisioning_Runs	Provisioning Runs	Count	Total	Number of runs that are provisioning for this workspace.
azure_ml_services_workspaces_Queued_Runs	Queued Runs	Count	Total	Number of runs that are queued for this workspace.
azure_ml_services_workspaces_Quota_Utilization_Percentage	Quota Utilization Percentage	Count	Average	Percent of quota utilized.
azure_ml_services_workspaces_Started_Runs	Started Runs	Count	Total	Number of runs running for this workspace.
azure_ml_services_workspaces_Starting_Runs	Starting Runs	Count	Total	Number of runs started for this workspace.
azure_ml_services_workspaces_Total_Cores	Total Cores	Count	Average	Number of total cores.
azure_ml_services_workspaces_Total_Nodes	Total Nodes	Count	Average	Number of total nodes.
azure_ml_services_workspaces_Unusable_Cores	Unusable Cores	Count	Average	Number of unusable cores.
azure_ml_services_workspaces_Unusable_Nodes	Unusable Nodes	Count	Average	Number of unusable nodes.
azure_ml_services_workspaces_Warnings	Warnings	Count	Total	Number of run warnings in this workspace.
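
The metric names in the table above follow a visible pattern: each OpsRamp metric is the prefix azure_ml_services_workspaces_ followed by the metric display name with spaces replaced by underscores. The following is a minimal Python sketch of that mapping, which may be useful when building dashboards or scripts that translate between the two forms. The helper names display_name and opsramp_name are hypothetical and are not part of any OpsRamp or Azure API; the pattern is inferred from the table only.

```python
# Prefix shared by every metric in the table above.
PREFIX = "azure_ml_services_workspaces_"


def display_name(opsramp_metric: str) -> str:
    """Derive the metric display name from an OpsRamp metric name.

    Hypothetical helper: assumes the naming pattern seen in the table
    (prefix + display name with spaces replaced by underscores).
    """
    if not opsramp_metric.startswith(PREFIX):
        raise ValueError(f"unexpected metric name: {opsramp_metric!r}")
    return opsramp_metric[len(PREFIX):].replace("_", " ")


def opsramp_name(display: str) -> str:
    """Inverse of display_name under the same assumed pattern."""
    return PREFIX + display.replace(" ", "_")


if __name__ == "__main__":
    for metric in (
        "azure_ml_services_workspaces_Active_Cores",
        "azure_ml_services_workspaces_Cancel_Requested_Runs",
        "azure_ml_services_workspaces_CpuUtilization",
    ):
        print(metric, "->", display_name(metric))
```

Names without underscores after the prefix, such as CpuUtilization, pass through unchanged, which matches the table.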