HPE Serviceguard

Introduction

HPE Serviceguard ensures the resilience of mission-critical applications by organizing multiple nodes into an enterprise cluster. Specifically designed for Linux environments and ProLiant servers, it protects against software and hardware failures by monitoring server health and responding swiftly. Serviceguard for Linux allows clustering of HP ProLiant server products with shared storage solutions, enabling configurations from 2-node SCSI to 2 to 16-node Fibre Channel setups.

Key benefits

Discovers HPE Serviceguard components.
Publishes relationships between resources for a topological view and ease of maintenance.
Provides metrics related to job scheduling time and status.
Generates concern alerts for each metric to notify administrators of resource issues.

Supported Target Version

Supported Serviceguard versions: A.12.80.05, A.15.30.01

Prerequisites

OpsRamp Classic Gateway (Linux) 15.0.0 and above.
OpsRamp Nextgen Gateway 15.0.0 and above.
Note: OpsRamp recommends using the latest Gateway version for full coverage of recent bug fixes, enhancements, etc.
All Serviceguard nodes must use the same credentials.
Jetty-Sgmgr.Service must be enabled on all the Serviceguard nodes.
HPE Serviceguard is monitored through the Serviceguard REST API (SGRAPI). To use SGRAPI, the following conditions must be met:
- Installation of Serviceguard for Linux A.15.XX.XX is required.
- Port 5511, the HTTPS port (5522 by default, if not configured during installation or upgrade), and port 5301 must be available for SGRAPI usage.
- Access to HPE Serviceguard requires a valid IP address/hostname, API credentials, and SSH credentials.
Jetty-Sgmgr.Service must also be enabled on the secondary nodes so that API calls can be made to them if the primary node is down.
For non-root users: Add the following lines to the “/etc/sudoers” file so that the non-root user can execute the Serviceguard node Serial Number, Product Name, and System UUID commands.

Cmnd_Alias DMIDECODE_CMDS = /bin/cat /sys/class/dmi/id/product_serial, /bin/cat /sys/class/dmi/id/product_name, /usr/sbin/dmidecode -s system-uuid
{username} ALL=(ALL) NOPASSWD: DMIDECODE_CMDS

Example for the opsramp non-root user:

opsramp ALL=(ALL) NOPASSWD: DMIDECODE_CMDS
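
The port prerequisite above can be checked from the Gateway host before installing the integration. The following is a minimal sketch, assuming plain TCP reachability is a sufficient first check; the function name check_ports and the host name used in the usage note are hypothetical, and 5522 applies only if the HTTPS port was left at its default.

```python
import socket

# Ports listed in the prerequisites. 5522 is the default HTTPS port and
# may differ if it was changed during installation or upgrade.
SGRAPI_PORTS = (5511, 5522, 5301)


def check_ports(host, ports=SGRAPI_PORTS, timeout=3.0):
    """Return {port: bool}, True when a TCP connection to host:port succeeds."""
    results = {}
    for port in ports:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name resolution errors.
            results[port] = False
    return results
```

For example, check_ports("sg-node1.example.com") returns a dictionary with one entry per port. A False value only indicates that the TCP connection failed from the machine running the check; it does not say why.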

Hierarchy of HPE Serviceguard

· HPE Serviceguard Cluster
· HPE Serviceguard Node
· HPE Serviceguard Node Interface
· HPE Serviceguard Package
· HPE Serviceguard Package Dependency
· HPE Serviceguard Volume Group
· HPE Serviceguard File System

Supported Metrics


Native Type	Metric Name	Display Name	Metric Label	Units	Application Version	Description
HPE Serviceguard Cluster	hpe_serviceguard_cluster_Status	HPE Serviceguard Cluster Status	Availability		1.0.0	HPE Serviceguard cluster status. Possible values are - up : 0, down : 1, starting : 2, halting : 3, detached : 4, partially_down : 5, unknown : 6
	hpe_serviceguard_quorumserver_Status	HPE Serviceguard Quorumserver Status	Availability		1.0.0	HPE Serviceguard quorum server status. Possible values are - up : 0, down : 1, unknown : 2
	hpe_serviceguard_quorumserver_State	HPE Serviceguard Quorumserver State	Availability		1.0.0	HPE Serviceguard quorum server state. Possible values are - running : 0, Unsupported Version : 1, Access Denied : 2, unknown : 3, Error : 4
HPE Serviceguard Node	hpe_serviceguard_node_Status	HPE Serviceguard Node Status	Availability		1.0.0	HPE Serviceguard node status. Possible values are - up : 0, down : 1, starting : 2, halting : 3, detached : 4, partially_down : 5, unknown : 6
	hpe_serviceguard_node_State	HPE Serviceguard Node State	Availability		1.0.0	The status of the host hardware component. Possible values are - GREEN : 0, YELLOW : 1, RED : 2, UNKNOWN : 3
	hpe_serviceguard_node_licenseValidity	HPE Serviceguard Node License Validity	Availability	Days	1.0.0	HPE Serviceguard node license days to expiry
HPE Serviceguard Node Interface	hpe_serviceguard_nodeInterface_Status	HPE Serviceguard Node Interface Status	Availability		1.0.0	HPE Serviceguard node interface status. Possible values are - up : 0, down : 1, unknown : 2
HPE Serviceguard Package	hpe_serviceguard_package_autorun	HPE Serviceguard Package Auto Run	Availability		3.0.0	HPE Serviceguard package Auto Run. Possible values are - ENABLED : 0, DISABLED : 1
	hpe_serviceguard_package_switching_node_status	HPE Serviceguard Package Switching Node Status	Availability		3.0.0	HPE Serviceguard Package Switching Node Status. Possible values are - ENABLED : 0, DISABLED : 1
	hpe_serviceguard_package_Status	HPE Serviceguard Package Status	Availability		1.0.0	HPE Serviceguard package status. Possible values are - up : 0, down : 1, start_wait : 2, starting : 3, halting : 4, halt_wait : 5, failing : 6, fail_wait : 7, relocate_wait : 8, reconfiguring : 9, reconfigure_wait : 10, detached : 11, unknown : 12
	hpe_serviceguard_package_State	HPE Serviceguard Package State	Availability		1.0.0	HPE Serviceguard package state. Possible values are - Starting : 0, start_wait : 1, running : 2, halting : 3, halt_wait : 4, halted : 5, halt_aborted : 6, failing : 7, fail_wait : 8, failed : 9, relocate_wait : 10, maintenance : 11, detached : 12, reconfiguring : 13, reconfigure_wait : 14, unknown : 15, blocked : 16, changing : 17
	hpe_serviceguard_package_serviceStatus	HPE Serviceguard Package Service Status	Availability		1.0.0	HPE Serviceguard service status. Possible values are - Up : 0, Down : 1, Unknown : 2
	hpe_serviceguard_package_Summary	HPE Serviceguard Package Summary	Availability		1.0.0	HPE Serviceguard package summary. Possible values are - ok : 0, critical : 1, starting : 2, degraded : 3, unknown : 4
HPE Serviceguard File System	hpe_serviceguard_filesystem_UsedSpace	HPE Serviceguard FileSystem Used Space	Usage	MB	1.0.0	HPE Serviceguard FileSystem Used Space
	hpe_serviceguard_filesystem_Utilisation	HPE Serviceguard FileSystem Utilisation	Usage	%	1.0.0	HPE Serviceguard FileSystem Utilisation
	hpe_serviceguard_filesystem_FreeSpace	HPE Serviceguard FileSystem Free Space	Usage	MB	1.0.0	HPE Serviceguard FileSystem Free Space
	hpe_serviceguard_filesystem_inodesUsed	HPE Serviceguard FileSystem Inodes Used	Usage	count	1.0.0	HPE Serviceguard FileSystem Inodes Used
	hpe_serviceguard_filesystem_inodesFree	HPE Serviceguard FileSystem Inodes Free	Usage	count	1.0.0	HPE Serviceguard FileSystem Inodes Free
	hpe_serviceguard_filesystem_inodesUtilisation	HPE Serviceguard FileSystem Inodes Utilisation	Usage	%	1.0.0	HPE Serviceguard FileSystem Inodes Utilisation
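
Because the status metrics in the table above are reported as numeric codes, it can help to translate them back into state names when reading raw metric values. The sketch below copies a few of the value mappings from the table; the dictionary layout and the helper name decode_metric are hypothetical and are not part of the integration.

```python
# Value-to-state mappings copied from the Supported Metrics table.
UP_DOWN_EXTENDED = {
    0: "up", 1: "down", 2: "starting", 3: "halting",
    4: "detached", 5: "partially_down", 6: "unknown",
}

METRIC_STATES = {
    "hpe_serviceguard_cluster_Status": UP_DOWN_EXTENDED,
    "hpe_serviceguard_node_Status": UP_DOWN_EXTENDED,
    "hpe_serviceguard_quorumserver_Status": {0: "up", 1: "down", 2: "unknown"},
    "hpe_serviceguard_nodeInterface_Status": {0: "up", 1: "down", 2: "unknown"},
    "hpe_serviceguard_package_Summary": {
        0: "ok", 1: "critical", 2: "starting", 3: "degraded", 4: "unknown",
    },
}


def decode_metric(metric_name, value):
    """Return the state name for a metric value, or None if it is not mapped."""
    states = METRIC_STATES.get(metric_name)
    if states is None:
        return None  # metric is not an enumerated status in this sketch
    return states.get(int(value))
```

For instance, decode_metric("hpe_serviceguard_cluster_Status", 5) returns "partially_down". Only a subset of the metrics is included here; the remaining enumerations can be added in the same way from the table.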

Default Monitoring Configurations

HPE Serviceguard has default Global Device Management Policies, Global Templates, Global Monitors and Global Metrics in OpsRamp. You can customize these default monitoring configurations as per your business requirement by cloning respective Global Templates and Global Device Management Policies. It is recommended to clone them before installing the application to avoid noise alerts and data.

Default Global Device Management Policies
You can find the Device Management Policy for each Native Type at Setup > Resources > Device Management Policies. Search with the suggested names in the global scope. Each policy adheres to the following naming convention:
{appName nativeType - version}
Ex: hpe-serviceguard HPE Serviceguard Cluster - 1 (i.e., appName = hpe-serviceguard, nativeType = HPE Serviceguard Cluster, version = 1)
Default Global Templates
You can find the Global Templates for each Native Type at Setup > Monitoring > Templates. Search with the suggested names in the global scope. Each template adheres to the following naming convention:
{appName nativeType 'Template' - version}
Ex: hpe-serviceguard HPE Serviceguard Cluster Template - 1 (i.e., appName = hpe-serviceguard, nativeType = HPE Serviceguard Cluster, version = 1)
Default Global Monitors
You can find the Global Monitors for each Native Type at Setup > Monitoring > Monitors. Search with the suggested names in the global scope. Each monitor adheres to the following naming convention:
{monitorKey appName nativeType - version}
Ex: HPE Serviceguard Cluster Monitor hpe-serviceguard HPE Serviceguard Cluster 1 (i.e., monitorKey = HPE Serviceguard Cluster Monitor, appName = hpe-serviceguard, nativeType = HPE Serviceguard Cluster, version = 1)
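
The three naming conventions above can be expressed as small helpers that build a search string for each Native Type. This is only a sketch: the function names are hypothetical, and the exact spacing and hyphen placement in the stored names is an assumption based on the stated patterns (the monitor example in the text, for instance, shows no hyphen before the version), so treat the results as search hints and not as exact identifiers.

```python
def policy_name(app_name, native_type, version):
    """Pattern: {appName nativeType - version}"""
    return f"{app_name} {native_type} - {version}"


def template_name(app_name, native_type, version):
    """Pattern: {appName nativeType 'Template' - version}"""
    return f"{app_name} {native_type} Template - {version}"


def monitor_name(monitor_key, app_name, native_type, version):
    """Pattern: {monitorKey appName nativeType - version}"""
    return f"{monitor_key} {app_name} {native_type} - {version}"


# Build the three search strings for one Native Type.
APP = "hpe-serviceguard"
NATIVE_TYPE = "HPE Serviceguard Cluster"
names = [
    policy_name(APP, NATIVE_TYPE, 1),
    template_name(APP, NATIVE_TYPE, 1),
    monitor_name("HPE Serviceguard Cluster Monitor", APP, NATIVE_TYPE, 1),
]
```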

Configure and Install the HPE Serviceguard Integration

From All Clients, select a client.
Navigate to Setup > Account.
Select the Integrations tab.
The Installed Integrations page appears, displaying all the installed integrations. Click + ADD on the Installed Integrations page.
If you do not have any installed applications, you will be navigated to the Available Integrations page. The Available Integrations page displays all the available applications along with the newly created application with the version.
Note: Search for the application using the search option available. Alternatively, use the All Categories option to search.
Click ADD in the HPE Serviceguard application.
Note: Select the version from the drop-down menu.
In the Configurations page, click + ADD. The Add Configuration page appears.
Enter the following BASIC INFORMATION:

Field Name	Description
Name	Enter the name for the configuration.
Serviceguard Node IP Address/Host Name	IP address or host name of the Serviceguard node.
API Port	API port. Note: The default port is 5522.
API Credentials	Select the Credential from the drop-down list. (Optional): Click + Add to create a credential. The ADD CREDENTIAL window is displayed. Enter the following information. Name: Credential name. Description: Brief description of the credential. User Name: User name. Password: Password. Confirm Password: Confirm password.
SSH Port	SSH port. Note: The default port is 22.
SSH Credentials	Select the Credential from the drop-down list. (Optional): Click + Add to create a credential. The ADD CREDENTIAL window is displayed. Enter the following information. Name: Credential name. Description: Brief description of the credential. User Name: User name. Password: Password. Confirm Password: Confirm password.

Notes:

By default the Is Secure checkbox is selected.
The IP Address/Host Name and Port should be accessible from the Gateway.
Select the following:
- App Failure Notifications: If turned on, you will be notified in case of an application failure, that is, a Connectivity Exception or an Authentication Exception.
- Alert Configuration: Maps the alert configuration for third-party alerts into OpsRamp.
- Alert On Root Resource: When selected, event polling alerts are generated on the root resource.
  - Below are the default values set for:
    - Alert Severity: Possible values of Alert Severity filter configuration property are [“CRITICAL”,“ERROR”,“WARNING”,“DEGRADED”]
    - Alert Severity Mapping: Provides alert severity mapping configuration. Default values for Alert Severity Mapping configuration are {“DEGRADED”:“Critical”,“CRITICAL”:“Critical”,“ERROR”:“Warning”,“WARNING”:“Warning”}.
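
The two default settings above work together: the severity filter decides which Serviceguard alerts are kept, and the severity mapping decides which OpsRamp severity each kept alert receives. The following sketch illustrates that logic with the default values from the text; the alert dictionary shape (a "severity" key and a "message" key) and the function name map_alerts are assumptions for illustration and do not reflect the integration's internal code.

```python
# Default values as listed in the configuration notes.
ALERT_SEVERITY_FILTER = ["CRITICAL", "ERROR", "WARNING", "DEGRADED"]
ALERT_SEVERITY_MAPPING = {
    "DEGRADED": "Critical",
    "CRITICAL": "Critical",
    "ERROR": "Warning",
    "WARNING": "Warning",
}


def map_alerts(alerts, severity_filter=ALERT_SEVERITY_FILTER,
               mapping=ALERT_SEVERITY_MAPPING):
    """Keep alerts whose severity passes the filter and attach the mapped severity.

    `alerts` is an iterable of dicts with a hypothetical "severity" key.
    """
    allowed = {s.upper() for s in severity_filter}
    mapped = []
    for alert in alerts:
        severity = str(alert.get("severity", "")).upper()
        if severity not in allowed or severity not in mapping:
            continue  # filtered out, or no OpsRamp severity defined for it
        mapped.append({**alert, "opsramp_severity": mapping[severity]})
    return mapped


sample = [
    {"severity": "DEGRADED", "message": "package degraded"},
    {"severity": "INFO", "message": "routine event"},
    {"severity": "ERROR", "message": "node interface error"},
]
result = map_alerts(sample)
```

With the defaults, the DEGRADED alert is mapped to Critical, the ERROR alert to Warning, and the INFO alert is dropped because it is not in the filter.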

Select the following Custom Attribute:


Field Name	Description
Custom Attribute	Select the custom attribute from the drop-down list.
Value	Select the value from the drop-down list.

Note: The custom attribute that you add here will be assigned to all the resources that are created by the integration. You can add a maximum of five custom attributes (key and value pair).

In the RESOURCE TYPE section, select:
- ALL: All the existing and future resources will be discovered.
- SELECT: You can select one or multiple resources to be discovered.
In the DISCOVERY SCHEDULE section, select recurrence pattern to add one of the following patterns:
- Minutes
- Hourly
- Daily
- Weekly
- Monthly
Click ADD.

The configuration is saved and displayed on the Configurations page.
Note: From the same page, you may Edit and Remove the created configuration.

Under ADVANCED SETTINGS, select the Bypass Resource Reconciliation option if you wish to bypass resource reconciliation when the same resources are discovered by multiple applications.
Note: If two different applications provide identical discovery attributes, two separate resources will be generated with those respective attributes from the individual discoveries.
Click NEXT.
(Optional) Click +ADD to create a new collector. You can either use the pre-populated name or give the name to your collector.
Select an existing registered profile.

Click FINISH.

The integration is installed and displayed on the INSTALLED INTEGRATION page. Use the search field to find the installed integration.

Modify the Configuration

See Modify an Installed Integration or Application article.
Note: Select HPE Serviceguard.

View the HPE Serviceguard Details

To view the resources discovered for HPE Serviceguard:

Navigate to Infrastructure > Search > HIGH AVAILABILITY > HPE Serviceguard. The HPE Serviceguard page is displayed.
Select the application on the HPE Serviceguard page.
The RESOURCE page appears from the right.
Click the ellipsis (…) on the top right and select View Details.

Navigate to the Attributes tab to view the discovery details.

Click the Metrics tab to view the metric details for HPE Serviceguard.

View resource metrics

To confirm HPE Serviceguard monitoring, review the following:

Metric graphs: A graph is plotted for each metric that is enabled in the configuration.
Alerts: Alerts are generated for metrics according to the thresholds configured for the integration.

Resource Filter Input Keys

HPE Serviceguard application resources are filtered and discovered based on the keys below.

Resource Type	Supported Input Key
All Types	resourceName
	hostName
	aliasName
	dnsName
	ipAddress
	macAddress
	os
	make
	model
HPE Serviceguard Cluster	Cluster Type
	Quorum Server Name
	Site Aware
HPE Serviceguard Node	ServiceGuard Version
	Site Name
	Operating System Flavor
	CPU Architecture
	ServiceGuard Manager Build Version
	Root Resource UUID
	Root Resource IPAddress
	Root Resource Name
	Root Resource HostName
HPE Serviceguard Node Interface	HeartBeat
	IP Address
	Subnet
	Root Resource UUID
	Root Resource IPAddress
	Root Resource Name
	Root Resource HostName
HPE Serviceguard Package	Package Type
	Package Description
	Style
	Root Resource UUID
	Root Resource IPAddress
	Root Resource Name
	Root Resource HostName
HPE Serviceguard Package Dependency	Dependency Location
	Dependee Package
	Root Resource UUID
	Root Resource IPAddress
	Root Resource Name
	Root Resource HostName
HPE Serviceguard VolumeGroup	vgchange cmd
	Root Resource UUID
	Root Resource IPAddress
	Root Resource Name
	Root Resource HostName
HPE Serviceguard File System	File System Directory
	File System Type
	File Sytem Mount
	Root Resource UUID
	Root Resource IPAddress
	Root Resource Name
	Root Resource HostName

Supported Alert Custom Macros

Customize the alert subject and description with the following macros so that alerts are generated accordingly.
Supported macro keys:

${resource.name}

${resource.ip}

${resource.mac}

${resource.aliasname}

${resource.os}

${resource.type}

${resource.dnsname}

${resource.alternateip}

${resource.make}

${resource.model}

${resource.serialnumber}

${resource.systemId}

${parent.resource.name}

${Custom attributes on the resource}
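
To show how the macro keys above are meant to be used, the sketch below expands ${...} placeholders in an alert subject from a dictionary of resource values. It illustrates the substitution idea only: the function name expand_macros and the sample values are hypothetical, and the actual expansion is performed by the platform, whose behavior for unknown macros may differ from the choice made here (leaving them unchanged).

```python
import re

# Matches ${...} placeholders such as ${resource.name}.
MACRO_PATTERN = re.compile(r"\$\{([^}]+)\}")


def expand_macros(template, values):
    """Replace each ${key} in template with values[key]; leave unknown keys as-is."""
    def replace(match):
        key = match.group(1)
        if key in values:
            return str(values[key])
        return match.group(0)  # unknown macro: keep the placeholder text
    return MACRO_PATTERN.sub(replace, template)


subject = "Package down on ${resource.name} (${resource.ip}), parent ${parent.resource.name}"
expanded = expand_macros(subject, {
    "resource.name": "sg-node1",   # sample value
    "resource.ip": "10.0.0.5",     # sample value
})
```

Here the first two placeholders are replaced, while ${parent.resource.name} stays in the text because no value was supplied for it.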

Risks, Limitations And Assumptions

The integration can manage critical/recovery failure alerts for the following two scenarios when the user activates App Failure Notifications in the settings:
- Connectivity Exception
- Authentication Exception
HPE Serviceguard sends duplicate/repeat failure alert notifications every 6 hours.
HPE Serviceguard can control monitoring pause/resume actions based on the above alerts. Metrics can be used to monitor HPE Serviceguard resources and can generate alerts based on the threshold values.
HPE Serviceguard event/alert polling starts only if the user enables Event Polling in the configuration. Possible values of the Alert Severity Filter configuration are “DEGRADED”, “WARNING”, “CRITICAL”, “ERROR”. Alerts are filtered based on these values, and the user can customize them at any time.
Default mappings between Serviceguard alert severities and OpsRamp severities are provided as part of the Alert Severity Mapping configuration. Users can modify them as per their use case at any time from the application configuration page. Possible severities are Critical and Warning. The default severity mapping in the configuration is {“DEGRADED”:“Critical”,“WARNING”:“Warning”,“CRITICAL”:“Critical”,“ERROR”:“Warning”}.
In alert polling, alerts are raised on the root resource by default. To raise an alert on its actual resource, clear the Alert On Root Resource checkbox.
The default ports for API and SSH are 5522 and 22, respectively. Users can modify these values from the configuration page if required.
Component level thresholds can be configured on each resource level.
The Template Applied Time will only be displayed if the collector profile (Classic and NextGen Gateway) is version 18.1.0 or higher.
The minimum supported version for the option to get the latest snapshot metric and Full discovery is Nextgen-15.0.0.
HPE Serviceguard supports both Classic Gateway and NextGen Gateway.

Version History


Application Version	Bug fixes / Enhancements
4.0.0	Added support for Root Resource UUID as a custom attribute. Default availability and threshold metric definition changes. Changes made to the metric data precision value.
3.0.1	Added code support for Get Target Response command for each native type. On demand latest snapshot support and Activity logger changes. Validation of the Serviceguard version A.15.30.01.
3.0.0	Added support for two metrics, "hpe_serviceguard_package_autorun" and "hpe_serviceguard_package_switching_node_status", in the Native Type HPE Serviceguard Package.
2.0.0	Added support for Serviceguard Node Serial Number, Model and BIOS/DMI code. Error state has been added to the Quorum server state metric in Serviceguard Monitoring. If the primary node is unable to make the API calls, support for performing the calls with the other nodes is provided.
1.0.0	Initial SDK2.0 application Discovery and Monitoring Implementations.

References

HPE Serviceguard REST API Reference Guide