Introduction
HPE Serviceguard keeps mission-critical applications resilient by organizing multiple nodes into an enterprise cluster. Designed for Linux environments and ProLiant servers, it protects against software and hardware failures by monitoring server health and responding to faults. Serviceguard for Linux clusters HP ProLiant servers with shared storage, supporting configurations from 2-node SCSI to 2- to 16-node Fibre Channel setups.
Key benefits
- Discovers HPE Serviceguard components.
- Publishes relationships between resources for a topological view and ease of maintenance.
- Provides metrics related to job scheduling time and status.
- Generates concern alerts for each metric to notify administrators of resource issues.
Supported Target Version |
---|
Serviceguard Manager version: B.12.80.05 |
Serviceguard version: A.12.80.05 |
Prerequisites
- OpsRamp Classic Gateway 15.0.0 and above.
- OpsRamp Nextgen Gateway 15.0.0 and above.
Note: OpsRamp recommends using the latest Gateway version for full coverage of recent bug fixes and enhancements.
- The integration monitors HPE Serviceguard through the Serviceguard REST API (SGRAPI).
- To use SGRAPI for monitoring HPE Serviceguard, the following conditions must be met:
- Installation of Serviceguard for Linux A.15.XX.XX is required.
- Ports 5511 and 5301 and the HTTPS port (5522 by default, unless a different port was configured during installation or upgrade) must be available for SGRAPI.
- Access to HPE Serviceguard requires a valid IP address or hostname, API credentials, and SSH credentials.
- If the primary node is down, jetty-sgmgr.service must be enabled on the secondary nodes so that API calls can be made to them.
- For non-root users: Add the following lines to the “/etc/sudoers” file so that the non-root user can run the commands that retrieve the Serviceguard node serial number, product name, and system UUID.
Cmnd_Alias DMIDECODE_CMDS = /bin/cat /sys/class/dmi/id/product_serial, /bin/cat /sys/class/dmi/id/product_name, /usr/sbin/dmidecode -s system-uuid
{username} ALL=(ALL) NOPASSWD: DMIDECODE_CMDS
Example for opsramp non-root user: opsramp ALL=(ALL) NOPASSWD: DMIDECODE_CMDS
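The port prerequisites above can be checked before installing the integration. The following is a minimal sketch, assuming the check is run from the Gateway host and that a plain TCP connection is an acceptable reachability test. The function name `check_ports` is hypothetical and is not part of OpsRamp or Serviceguard.

```python
import socket

# Ports named in the prerequisites: 5511, 5301, and the default HTTPS port 5522.
# Replace 5522 if a different HTTPS port was configured on the node.
SGRAPI_PORTS = (5511, 5522, 5301)


def check_ports(host, ports=SGRAPI_PORTS, timeout=3.0):
    """Return a dict {port: bool} telling whether a TCP connection succeeded."""
    results = {}
    for port in ports:
        try:
            # Only a TCP handshake is attempted; no TLS or API request is sent.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Covers refused connections, timeouts, and name resolution errors.
            results[port] = False
    return results
```

Calling `check_ports` with the Serviceguard node's IP address or hostname returns one entry per port. A `False` value only indicates that the TCP connection failed from the machine running the script, which may point to a firewall rule or a stopped service.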
Hierarchy of HPE Serviceguard
· HPE Serviceguard Cluster
· HPE Serviceguard Node
· HPE Serviceguard Node Interface
· HPE Serviceguard Package
· HPE Serviceguard Package Dependency
· HPE Serviceguard Volume Group
· HPE Serviceguard File System
Supported Metrics
Native Type | Metric Name | Display Name | Metric Label | Units | Application Version | Description |
---|---|---|---|---|---|---|
HPE Serviceguard Cluster | hpe_serviceguard_cluster_Status | HPE Serviceguard Cluster Status | Availability | | 1.0.0 | HPE Serviceguard cluster status. Possible values are - up : 0, down : 1, starting : 2, halting : 3, detached : 4, partially_down : 5, unknown : 6 |
HPE Serviceguard Cluster | hpe_serviceguard_quorumserver_Status | HPE Serviceguard Quorumserver Status | Availability | | 1.0.0 | HPE Serviceguard quorum server status. Possible values are - up : 0, down : 1, unknown : 2 |
HPE Serviceguard Cluster | hpe_serviceguard_quorumserver_State | HPE Serviceguard Quorumserver State | Availability | | 1.0.0 | HPE Serviceguard quorum server state. Possible values are - running : 0, Unsupported Version : 1, Access Denied : 2, unknown : 3, Error : 4 |
HPE Serviceguard Node | hpe_serviceguard_node_Status | HPE Serviceguard Node Status | Availability | | 1.0.0 | HPE Serviceguard Node status. Possible values are - up : 0, down : 1, starting : 2, halting : 3, detached : 4, partially_down : 5, unknown : 6 |
HPE Serviceguard Node | hpe_serviceguard_node_State | HPE Serviceguard Node State | Availability | | 1.0.0 | The status of the host hardware component. Possible values are - GREEN : 0, YELLOW : 1, RED : 2, UNKNOWN : 3 |
HPE Serviceguard Node | hpe_serviceguard_node_licenseValidity | HPE Serviceguard Node License Validity | Availability | Days | 1.0.0 | HPE Serviceguard node license days to expiry |
HPE Serviceguard Node Interface | hpe_serviceguard_nodeInterface_Status | HPE Serviceguard Node Interface Status | Availability | | 1.0.0 | HPE Serviceguard node interface status. Possible values are - up : 0, down : 1, unknown : 2 |
HPE Serviceguard Package | hpe_serviceguard_package_autorun | HPE Serviceguard Package Auto Run | Availability | | 3.0.0 | HPE Serviceguard package Auto Run. Possible values are - ENABLED : 0, DISABLED : 1 |
HPE Serviceguard Package | hpe_serviceguard_package_switching_node_status | HPE Serviceguard Package Switching Node Status | Availability | | 3.0.0 | HPE Serviceguard Package Switching Node Status. Possible values are - ENABLED : 0, DISABLED : 1 |
HPE Serviceguard Package | hpe_serviceguard_package_Status | HPE Serviceguard Package Status | Availability | | 1.0.0 | HPE Serviceguard package status. Possible values are - up : 0, down : 1, start_wait : 2, starting : 3, halting : 4, halt_wait : 5, failing : 6, fail_wait : 7, relocate_wait : 8, reconfiguring : 9, reconfigure_wait : 10, detached : 11, unknown : 12 |
HPE Serviceguard Package | hpe_serviceguard_package_State | HPE Serviceguard Package State | Availability | | 1.0.0 | HPE Serviceguard package state. Possible values are - Starting : 0, start_wait : 1, running : 2, halting : 3, halt_wait : 4, halted : 5, halt_aborted : 6, failing : 7, fail_wait : 8, failed : 9, relocate_wait : 10, maintenance : 11, detached : 12, reconfiguring : 13, reconfigure_wait : 14, unknown : 15, blocked : 16, changing : 17 |
HPE Serviceguard Package | hpe_serviceguard_package_serviceStatus | HPE Serviceguard Package Service Status | Availability | | 1.0.0 | HPE Serviceguard service status. Possible values are - Up : 0, Down : 1, Unknown : 2 |
HPE Serviceguard Package | hpe_serviceguard_package_Summary | HPE Serviceguard Package Summary | Availability | | 1.0.0 | HPE Serviceguard package summary. Possible values are - ok : 0, critical : 1, starting : 2, degraded : 3, unknown : 4 |
HPE Serviceguard File System | hpe_serviceguard_filesystem_UsedSpace | HPE Serviceguard FileSystem Used Space | Usage | MB | 1.0.0 | HPE Serviceguard FileSystem Used Space |
HPE Serviceguard File System | hpe_serviceguard_filesystem_Utilisation | HPE Serviceguard FileSystem Utilisation | Usage | % | 1.0.0 | HPE Serviceguard FileSystem Utilisation |
HPE Serviceguard File System | hpe_serviceguard_filesystem_FreeSpace | HPE Serviceguard FileSystem Free Space | Usage | MB | 1.0.0 | HPE Serviceguard FileSystem Free Space |
HPE Serviceguard File System | hpe_serviceguard_filesystem_inodesUsed | HPE Serviceguard FileSystem Inodes Used | Usage | count | 1.0.0 | HPE Serviceguard FileSystem Inodes Used |
HPE Serviceguard File System | hpe_serviceguard_filesystem_inodesFree | HPE Serviceguard FileSystem Inodes Free | Usage | count | 1.0.0 | HPE Serviceguard FileSystem Inodes Free |
HPE Serviceguard File System | hpe_serviceguard_filesystem_inodesUtilisation | HPE Serviceguard FileSystem Inodes Utilisation | Usage | % | 1.0.0 | HPE Serviceguard FileSystem Inodes Utilisation |
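Because the status metrics are reported as numeric codes, a lookup table can help when reading exported metric values. The sketch below copies the codes from the metric descriptions above for three of the metrics. The helper `decode_status` and the dictionary layout are illustrative assumptions, not part of the integration.

```python
# Codes copied from the metric descriptions in the Supported Metrics table.
CLUSTER_OR_NODE_STATUS = {
    0: "up", 1: "down", 2: "starting", 3: "halting",
    4: "detached", 5: "partially_down", 6: "unknown",
}

PACKAGE_STATUS = {
    0: "up", 1: "down", 2: "start_wait", 3: "starting", 4: "halting",
    5: "halt_wait", 6: "failing", 7: "fail_wait", 8: "relocate_wait",
    9: "reconfiguring", 10: "reconfigure_wait", 11: "detached", 12: "unknown",
}

# Cluster and node status share the same value list in the table.
STATUS_CODES = {
    "hpe_serviceguard_cluster_Status": CLUSTER_OR_NODE_STATUS,
    "hpe_serviceguard_node_Status": CLUSTER_OR_NODE_STATUS,
    "hpe_serviceguard_package_Status": PACKAGE_STATUS,
}


def decode_status(metric_name, value):
    """Translate a numeric metric value into its documented label."""
    labels = STATUS_CODES.get(metric_name)
    if labels is None:
        raise KeyError(f"no code table defined for {metric_name}")
    # Values outside the documented range are reported rather than guessed.
    return labels.get(int(value), f"unrecognized ({value})")
```

For example, `decode_status("hpe_serviceguard_package_Status", 8)` returns `"relocate_wait"`. The remaining status and state metrics can be added to `STATUS_CODES` in the same way.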
Default Monitoring Configurations
HPE Serviceguard has default Global Device Management Policies, Global Templates, Global Monitors and Global Metrics in OpsRamp. You can customize these default monitoring configurations as per your business requirement by cloning respective Global Templates and Global Device Management Policies. It is recommended to clone them before installing the application to avoid noise alerts and data.
Default Global Device Management Policies
You can find the Device Management Policy for each Native Type at Setup > Resources > Device Management Policies. Search with suggested names in global scope:
{appName nativeType - version}
Ex: hpe-serviceguard HPE Serviceguard Cluster - 1 (i.e., appName = hpe-serviceguard, nativeType = HPE Serviceguard Cluster, version = 1)
Default Global Templates
You can find the Global Templates for each Native Type at Setup > Monitoring > Templates. Search with suggested names in global scope. Each template adheres to the following naming convention:
{appName nativeType 'Template' - version}
Ex: hpe-serviceguard HPE Serviceguard Cluster Template - 1 (i.e., appName = hpe-serviceguard, nativeType = HPE Serviceguard Cluster, version = 1)
Default Global Monitors
You can find the Global Monitors for each Native Type at Setup > Monitoring > Monitors. Search with suggested names in global scope. Each monitor adheres to the following naming convention:
{monitorKey appName nativeType - version}
Ex: HPE Serviceguard Cluster Monitor hpe-serviceguard HPE Serviceguard Cluster 1 (i.e., monitorKey = HPE Serviceguard Cluster Monitor, appName = hpe-serviceguard, nativeType = HPE Serviceguard Cluster, version = 1)
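The naming conventions above can be turned into search strings with a few helper functions. This is a minimal sketch. The function names are hypothetical, and the monitor name follows the worked example, which has no hyphen before the version, rather than the stated pattern. That choice is an assumption.

```python
def policy_name(app_name, native_type, version):
    """Device Management Policy name: {appName nativeType - version}."""
    return f"{app_name} {native_type} - {version}"


def template_name(app_name, native_type, version):
    """Global Template name: {appName nativeType 'Template' - version}."""
    return f"{app_name} {native_type} Template - {version}"


def monitor_name(monitor_key, app_name, native_type, version):
    """Global Monitor name, formatted like the documented example.

    Assumption: the example string is used as the reference, so no
    " - " separator is placed before the version.
    """
    return f"{monitor_key} {app_name} {native_type} {version}"


if __name__ == "__main__":
    app, native = "hpe-serviceguard", "HPE Serviceguard Cluster"
    print(policy_name(app, native, 1))
    print(template_name(app, native, 1))
    print(monitor_name("HPE Serviceguard Cluster Monitor", app, native, 1))
```

If a search by the generated monitor name returns nothing, try the hyphenated form from the stated convention instead.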
Configure and Install the HPE Serviceguard Integration
- From All Clients, select a client.
- Navigate to Setup > Account.
- Select the Integrations tab.
- The Installed Integrations page displays all installed integrations. Click + ADD on the Installed Integrations page.
- If you do not have any installed applications, you will be navigated to the Available Integrations page. The Available Integrations page displays all the available applications along with the newly created application with the version.
Note: Search for the application using the search option available. Alternatively, use the All Categories option to search.
- Click ADD in the HPE Serviceguard application.
Note: Select the version from the drop-down menu.
- In the Configurations page, click + ADD. The Add Configuration page appears.
- Enter the following BASIC INFORMATION:
Field Name | Description |
---|---|
Name | Enter the name for the configuration. |
Serviceguard Node IP Address/Host Name | IP address or host name of the Serviceguard node. |
API Port | Port used for the API. Note: The default port is 5522. |
API Credentials | Select the Credential from the drop-down list. (Optional): Click + Add to create a credential. The ADD CREDENTIAL window is displayed. Enter the following information. |
SSH Port | Port used for SSH. Note: The default port is 22. |
SSH Credentials | Select the Credential from the drop-down list. (Optional): Click + Add to create a credential. The ADD CREDENTIAL window is displayed. Enter the following information. |
Notes:
- By default the Is Secure checkbox is selected.
- The IP Address/Host Name and Port must be accessible from the Gateway.
- Select the following:
- App Failure Notifications: if turned on, you are notified of application failures, that is, a Connectivity Exception or an Authentication Exception.
- Alert Configuration: map alert configuration for third party alerts into OpsRamp.
- Alert On Root Resource: Checking this will generate event polling alerts on root resource.
- Below are the default values set for:
- Alert Severity: Possible values of Alert Severity filter configuration property are [“CRITICAL”,“ERROR”,“WARNING”,“DEGRADED”]
- Alert Severity Mapping: Provides alert severity mapping configuration. Default values for Alert Severity Mapping configuration are {“DEGRADED”:“Critical”,“CRITICAL”:“Critical”,“ERROR”:“Warning”,“WARNING”:“Warning”}.
- Select the following Custom Attribute:
Field Name | Description |
---|---|
Custom Attribute | Select the custom attribute from the drop down list box. |
Value | Select the value from the drop down list box. |
Note: The custom attribute that you add here will be assigned to all the resources that are created by the integration. You can add a maximum of five custom attributes (key and value pair).
- In the RESOURCE TYPE section, select:
- ALL: All the existing and future resources will be discovered.
- SELECT: You can select one or multiple resources to be discovered.
- In the DISCOVERY SCHEDULE section, select recurrence pattern to add one of the following patterns:
- Minutes
- Hourly
- Daily
- Weekly
- Monthly
- Click ADD.
The configuration is saved and displayed on the Configurations page.
Note: From the same page, you may Edit and Remove the created configuration.
Under ADVANCED SETTINGS, select the Bypass Resource Reconciliation option if you want to bypass resource reconciliation when the same resources are discovered by multiple applications.
Note: If two different applications provide identical discovery attributes, two separate resources will be generated with those respective attributes from the individual discoveries.
Click NEXT.
(Optional) Click +ADD to create a new collector. You can either use the pre-populated name or enter your own name for the collector.
Select an existing registered profile.
- Click FINISH.
The integration is installed and displayed on the INSTALLED INTEGRATION page. Use the search field to find the installed integration.
Modify the Configuration
See the Modify an Installed Integration or Application article.
Note: Select HPE Serviceguard.
View the HPE Serviceguard Details
To discover resources for HPE Serviceguard:
- Navigate to Infrastructure > Search > HIGH AVAILABILITY > HPE Serviceguard. The HPE Serviceguard page is displayed.
- Select the application on the HPE Serviceguard page.
- The RESOURCE page appears from the right.
- Click the ellipsis (…) on the top right and select View Details.
- Navigate to the Attributes tab to view the discovery details.
- Click the Metrics tab to view the metric details for HPE Serviceguard.
View resource metrics
To confirm HPE Serviceguard monitoring, review the following:
- Metric graphs: A graph is plotted for each metric that is enabled in the configuration.
- Alerts: Alerts are generated for metrics as configured for the integration.
Resource Filter Input Keys
HPE Serviceguard application resources are filtered and discovered based on the keys below.
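Resource Type | Supported Input Keys |
---|---|
All Types | resourceName |
All Types | hostName |
All Types | aliasName |
All Types | dnsName |
All Types | ipAddress |
All Types | macAddress |
All Types | os |
All Types | make |
All Types | model |
HPE Serviceguard Cluster | Cluster Type |
HPE Serviceguard Cluster | Quorum Server Name |
HPE Serviceguard Cluster | Site Aware |
HPE Serviceguard Node | ServiceGuard Version |
HPE Serviceguard Node | Site Name |
HPE Serviceguard Node | Operating System Flavor |
HPE Serviceguard Node | CPU Architecture |
HPE Serviceguard Node | ServiceGuard Manager Build Version |
HPE Serviceguard Node Interface | HeartBeat |
HPE Serviceguard Node Interface | IP Address |
HPE Serviceguard Node Interface | Subnet |
HPE Serviceguard Package | Package Type |
HPE Serviceguard Package | Package Description |
HPE Serviceguard Package | Style |
HPE Serviceguard Package Dependency | Dependency Location |
HPE Serviceguard Package Dependency | Dependee Package |
HPE Serviceguard Volume Group | vgchange cmd |
HPE Serviceguard File System | File System Directory |
HPE Serviceguard File System | File System Type |
HPE Serviceguard File System | File System Mount |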
Supported Alert Custom Macros
Use the following macros to customize the alert subject and description so that generated alerts carry resource-specific details.
Supported macro keys:
${resource.name}
${resource.ip}
${resource.mac}
${resource.aliasname}
${resource.os}
${resource.type}
${resource.dnsname}
${resource.alternateip}
${resource.make}
${resource.model}
${resource.serialnumber}
${resource.systemId}
${parent.resource.name}
${Custom attributes on the resource}
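To preview how a subject or description using these macros might read, the substitution can be approximated offline. The sketch below is an illustration only: the actual expansion is performed by the platform, and the function `expand_macros`, the sample template, and the sample values are all hypothetical.

```python
import re

# Matches ${...} placeholders such as ${resource.name}.
MACRO_PATTERN = re.compile(r"\$\{([^}]+)\}")


def expand_macros(template, values):
    """Replace each ${key} in template with values[key].

    Macros without a matching key are left unchanged so that missing
    data stays visible in the preview.
    """
    def replace(match):
        key = match.group(1)
        return str(values.get(key, match.group(0)))

    return MACRO_PATTERN.sub(replace, template)


if __name__ == "__main__":
    subject = "Package issue on ${resource.name} (${resource.ip})"
    sample = {"resource.name": "sg-node-1", "resource.ip": "192.0.2.10"}
    print(expand_macros(subject, sample))
```

The sample values use a documentation-reserved IP address and a made-up node name.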
Risks, Limitations And Assumptions
- The integration can manage critical/recovery failure alerts for the following two scenarios when the user activates App Failure Notifications in the settings:
- Connectivity Exception
- Authentication Exception
- HPE Serviceguard sends duplicate/repeat failure alert notifications every 6 hours.
- HPE Serviceguard can control monitoring pause/resume actions based on the above alerts. Metrics can be used to monitor HPE Serviceguard resources and can generate alerts based on the threshold values.
- HPE Serviceguard Event/Alert polling starts only if the user enables Event Polling in the configuration. Possible values of the Alert Severity Filter configuration are “DEGRADED”,“WARNING”,“CRITICAL”,“ERROR”. Alerts are filtered based on these values, and users can customize the filter at any time.
- Default mappings from Serviceguard alert severities to OpsRamp severities are provided in the Alert Severity Mapping configuration. Users can modify them at any time from the application configuration page. Possible OpsRamp severities are Critical and Warning. The default severity mapping is {“DEGRADED”:“Critical”,“WARNING”:“Warning”,“CRITICAL”:“Critical”,“ERROR”:“Warning”}
- In alert polling, alerts are raised on the root resource by default. To raise an alert on its actual resource instead, clear the Alert On Root Resource checkbox.
- The default ports are 5522 for the API and 22 for SSH. Users can modify these values from the configuration page if required.
- Component level thresholds can be configured on each resource level.
- Activity logs are not supported.
- The Template Applied Time will only be displayed if the collector profile (Classic and NextGen Gateway) is version 18.1.0 or higher.
- The minimum supported version for the option to get the latest snapshot metric and Full discovery is Nextgen-15.0.0.
- HPE Serviceguard supports both Classic Gateway and NextGen Gateway.
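The Alert Severity filter and the default Alert Severity Mapping described above can be pictured as a filter step followed by a lookup. This is a minimal sketch under stated assumptions: the alert dictionaries with a `severity` field and the function `map_alerts` are hypothetical, and only the filter values and the default mapping come from this document.

```python
# Default filter and mapping values as listed in this document.
DEFAULT_SEVERITY_FILTER = ["CRITICAL", "ERROR", "WARNING", "DEGRADED"]
DEFAULT_SEVERITY_MAPPING = {
    "DEGRADED": "Critical",
    "CRITICAL": "Critical",
    "ERROR": "Warning",
    "WARNING": "Warning",
}


def map_alerts(alerts, severity_filter=None, severity_mapping=None):
    """Keep alerts whose severity passes the filter and attach the mapped severity.

    Each alert is assumed to be a dict with a "severity" key (hypothetical shape).
    """
    allowed = set(severity_filter or DEFAULT_SEVERITY_FILTER)
    mapping = severity_mapping or DEFAULT_SEVERITY_MAPPING
    mapped = []
    for alert in alerts:
        severity = str(alert["severity"]).upper()
        target = mapping.get(severity)
        # Drop alerts that are filtered out or have no mapping entry.
        if severity not in allowed or target is None:
            continue
        mapped.append({**alert, "opsramp_severity": target})
    return mapped
```

With the defaults, a Serviceguard alert of severity `ERROR` is kept and mapped to `Warning`, while an alert with a severity outside the filter list is dropped.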
Version History
Application Version | Bug fixes / Enhancements |
---|---|
3.0.0 | Added Support for two metrics "hpe_serviceguard_package_autorun" and "hpe_serviceguard_package_switching_node_status" in the NativeType HPE Serviceguard Package. |
2.0.0 |
|
1.0.0 | Initial SDK2.0 application Discovery and Monitoring Implementations. |