Introduction
Monitors MSLync2013_FrontEndServers metrics.
Discovery with the agent
Collector Type: Agent
Category: Application Monitors
Application Name: Microsoft Lync 2013 Front End Servers
Global Template Name: Microsoft Lync 2013 DotNet v4 - FrontEnd Servers
Prerequisites: The Lync monitors require Microsoft .NET Framework 4.
Collected Metrics
Metric Name | Display Name | Description |
---|---|---|
dbStoreQueueLatency | dbStoreQueueLatency | This component monitor returns the average time, in milliseconds, that a request is held in the queue of the BackEnd Database Server. If the topology is healthy, this counter averages less than 100 ms. Occasional spikes are acceptable. |
dbStoreQueueDepth | dbStoreQueueDepth | The average number of database requests waiting to be executed. A high value indicates that the back end might be busy and unable to respond to requests quickly. This might be a temporary condition. |
totalMessagesInAllQueues | totalMessagesInAllQueues | The size of the queue will vary depending on load. Verify that the queue is not increasing unbounded. Establish a baseline for the counter, and monitor the counter to ensure that it does not exceed that baseline. |
SIP_503ResponseRate | SIP_503ResponseRate | This component monitor returns the rate of 503 responses generated by the server, per second. The 503 code corresponds to the server being unavailable. On a healthy server, you should not receive this code at a steady rate. |
SIP_504ResponseRate | SIP_504ResponseRate | This component monitor returns the rate of 504 responses generated by the server, per second. A few 504 responses to clients (for clients disconnecting abruptly) are to be expected, but this counter mainly indicates connectivity issues with other servers. |
SIP_ConnectionsActive | SIP_ConnectionsActive | This component monitor returns the number of established connections that are currently active. A connection is considered established when peer credentials are verified (e.g. via MTLS), or the peer receives a 2xx response. |
SIP_TLSConnectionsActive | SIP_TLSConnectionsActive | This component monitor returns the number of established TLS connections that are currently active. A TLS connection is considered established when the peer certificate, and possibly the host name, are verified for a trust relationship. |
SIP_SendsOutstanding | SIP_SendsOutstanding | This component monitor returns the number of messages that are currently present in the outgoing queues. If you receive error message 504, investigate the results from this counter. Doing so will indicate which servers are having problems. |
SIP_AvgOutgoingQueueDelay | SIP_AvgOutgoingQueueDelay | This component monitor returns the average time, in seconds, that messages have been delayed in outgoing queues. |
SIP_FlowControlledConnectionsDropped | SIP_FlowControlledConnectionsDropped | This component monitor returns the total number of connections dropped because of excessive flow control. You will need to baseline this counter by testing and monitoring the server's health. The returned value should be as low as possible. |
SIP_AvgFlowControlDelay | SIP_AvgFlowControlDelay | This component monitor returns the average delay, in seconds, in message processing when the socket is flow-controlled. You will need to baseline this counter by testing and monitoring the server's health. The returned value should be as low as possible. |
SIP_IncomingRequestRate | SIP_IncomingRequestRate | This component monitor returns the rate of received requests, per second. You will need to baseline this counter by testing and monitoring the user load. |
SIP_IncomingMessageRate | SIP_IncomingMessageRate | This component monitor returns the rate of received messages, per second. You will need to baseline this counter by testing and monitoring the user load. |
SIP_EventsInProcessing | SIP_EventsInProcessing | This component monitor returns the number of SIP transactions, or dialog state change events, that are currently being processed. You will need to baseline this counter by testing and monitoring the user load. |
SIP_500ResponseRate | SIP_500ResponseRate | This component monitor returns the rate of 500 responses generated by the server, per second. This can indicate that there is a server component that is not functioning correctly. |
SIP_AvgHoldingTimeForIncomingMessage | SIP_AvgHoldingTimeForIncomingMessage | This component monitor returns the average time that the server held the incoming messages currently being processed. |
SIP_AddressSpaceUsage | SIP_AddressSpaceUsage | This component monitor returns the percentage of available address space currently in use by the server process. The returned value should be as low as possible. |
SIP_PageFileUsage | SIP_PageFileUsage | This component monitor returns the percentage of available page file space currently in use by the server process. The returned value should be as low as possible. |
SIP_IncomingMessagesTimedOut | SIP_IncomingMessagesTimedOut | The number of incoming messages currently being held by the server for processing for more than the maximum tracking interval. A high value indicates that the server is too busy to process user requests in a timely fashion. |
IM_NumberOfActiveConferences | IM_NumberOfActiveConferences | This component monitor returns the number of active instant messaging conferences. You will need to baseline this counter by testing and monitoring the user load. |
IM_NumberOfConnectedIMUsers | IM_NumberOfConnectedIMUsers | This component monitor returns the number of connected instant messaging users in all conferences. You will need to baseline this counter by testing and monitoring the user load. |
IM_WithThrottledSIPConnections | IM_WithThrottledSIPConnections | This component monitor returns the number of throttled SIP connections. A value greater than ten could indicate that the peer is not processing requests in a timely fashion. This can happen if the peer machine is overloaded. |
IM_MCUHealthState | IM_MCUHealthState | The Multipoint Conferencing Units (MCU) health counters give an indication of the overall system health; these should be 0 at all times, indicating normal operation. |
IM_MCUDrainingState | IM_MCUDrainingState | This component monitor returns the current draining status of the MCU. Possible values: 0 = Not requesting to drain. 1 = Requesting to drain. 2 = Draining. When a server is drained, it stops taking new connections and calls. |
User_services_DBStoreSprocLatency | User_services_DBStoreSprocLatency | This component monitor returns the average time, in milliseconds, it takes to execute a stored procedure call. A healthy state is considered to be less than 100 ms. Server health decreases as latency increases to 12 seconds, when server throttling begins. |
User_services_NumberOfFailedHTTPConnections | User_services_NumberOfFailedHTTPConnections | This component monitor returns the rate of connection attempt failures, per second. You will need to baseline this counter by testing and monitoring the server's health. |
Memory_PagesPerSec | Memory_PagesPerSec | The rate at which pages in memory are swapped with those on disk. If a page has to be retrieved from the disk instead of from memory, performance suffers; this rate needs to stay below 500 pages per second. |
AVMCU_NumberofAudiovideoconferences | AVMCU_NumberofAudiovideoconferences | Number of audio/video conferences. Ideally, conferences should be evenly distributed across all Front End Servers. |
ASMCU_NumberOfApplicationSharingConferences | ASMCU_NumberOfApplicationSharingConferences | Number of application sharing conferences. Ideally, conferences should be evenly distributed across all Front End Servers. |
DATAMCU_HealthState | DATAMCU_HealthState | The Multipoint Conferencing Units (MCU) health counters give an indication of the overall system health; these should be 0 at all times, indicating normal operation. The current health of the data sharing MCU. 0 = Normal. 1 = Loaded. 2 = Full. 3 = Unavailable. |
DATAMCU_DrainingState | DATAMCU_DrainingState | The Multipoint Conferencing Units (MCU) health counters give an indication of the overall system health; these should be 0 at all times, indicating normal operation. The current draining status of the data sharing MCU. 0 = Not requesting to drain. 1 = Required. |
DataMCU_EstimatedConferenceWorkitemsLoad | DataMCU_EstimatedConferenceWorkitemsLoad | The estimated time to process all pending items on the session queues measured in milliseconds. |
DataMCU_StateOfSessionQueues | DataMCU_StateOfSessionQueues | The state of the session queues. It indicates whether the Data MCU is overloaded. |
DATAMCU_NumberOfDataSharingConferences | DATAMCU_NumberOfDataSharingConferences | Number of data sharing conferences. Ideally, conferences should be evenly distributed across all Front End Servers. |
ApplicationSharingMCU_HealthState | ApplicationSharingMCU_HealthState | The Multipoint Conferencing Units (MCU) health counters give an indication of the overall system health; these should be 0 at all times, indicating normal operation. The current health of the application sharing MCU. 0 = Normal. 1 = Loaded. 2 = Full. |
ApplicationSharingMCU_DrainingState | ApplicationSharingMCU_DrainingState | The Multipoint Conferencing Units (MCU) health counters give an indication of the overall system health; these should be 0 at all times, indicating normal operation. The current draining status of the application sharing MCU. 0 = Not requesting to drain. |
AudioVideoMCU_HealthState | AudioVideoMCU_HealthState | The Multipoint Conferencing Units (MCU) health counters give an indication of the overall system health; these should be 0 at all times, indicating normal operation. The current health of the audio/video MCU. 0 = Normal. 1 = Loaded. 2 = Full. 3 = Unavailable. |
AudioVideoMCU_DrainingState | AudioVideoMCU_DrainingState | The Multipoint Conferencing Units (MCU) health counters give an indication of the overall system health; these should be 0 at all times, indicating normal operation. The current draining status of the audio/video MCU. 0 = Not requesting to drain. |
AddressBook_SearchResponseTime | AddressBook_SearchResponseTime | The average processing time, in milliseconds, for an address book search request. High values could be due to back-end database performance issues. Verify the CPU load on the back-end database machine. Upgrade hardware if needed. |
AddressBook_SearchFailureRate | AddressBook_SearchFailureRate | The per-second rate of failed address book search requests. Failures could be due to back-end database performance issues. Verify that the back-end database is running and accessible. |
SIP_ConnectionsRefusedDueToServerOverload | SIP_ConnectionsRefusedDueToServerOverload | The per-second rate of connections that were refused with a Service Unavailable response because the server was overloaded. If the problem persists, ensure that the server's hardware and software meet the requirements for the user usage characteristics. |
ExpandDistributionList_ResponseTimeInms | ExpandDistributionList_ResponseTimeInms | Average processing time, in milliseconds, for a successful request to be completed. High values can indicate Active Directory performance issues. |
ExpandDistributionList_SOAPExceptionRate | ExpandDistributionList_SOAPExceptionRate | The per-second rate of SOAP exceptions. |
AddressBookFileDownload_FailedRequestsPerSecond | AddressBookFileDownload_FailedRequestsPerSecond | The per-second rate of failed Address Book file requests. A high failure rate can be caused by authentication issues or network connectivity issues. |
LSCommunicatorWebApp_FailedDataCollaborationAuthenticationRequestsPerSecond | LSCommunicatorWebApp_FailedDataCollaborationAuthenticationRequestsPerSecond | The number of failed Data Collaboration authentication requests per second. Attempts to authenticate incoming client connections for data collaboration failed. This may indicate a network attack. |
LSCommunicatorWebApp_NumberOfDataCollaborationConnectionFailuresWithDataCollaborationServers | LSCommunicatorWebApp_NumberOfDataCollaborationConnectionFailuresWithDataCollaborationServers | The number of Data Collaboration connection failures with Data Collaboration servers. Connections may be closed by the local party, by the remote party, or because of network issues. Check the availability of the Web Conferencing servers. |
LSCommunicatorWebApp_ThrottledClientDataCollaborationConnectionsPerSecond | LSCommunicatorWebApp_ThrottledClientDataCollaborationConnectionsPerSecond | The number of Data Collaboration client connections closed due to throttling, per second. A client Data Collaboration connection is closed when the client fails to read data in a timely manner. This may indicate a network failure or an organized attack. |
CallPark_FailedCallParkRequests | CallPark_FailedCallParkRequests | The total number of park requests that failed. |
CallPark_FailedRequestsBecauseNoOrbitIsAvailable | CallPark_FailedRequestsBecauseNoOrbitIsAvailable | The total number of park requests that failed because no orbit was available. Consider adding more orbits by using the management console or the PowerShell commands for managing orbit ranges. |
CallPark_FailedTransfersToFallbackURI | CallPark_FailedTransfersToFallbackURI | The total number of failed fallback attempts. The fallback destination might not be reachable. |
AudioVideoConferencing_NumberOfOccasionsConferenceProcessingIsDelayed | AudioVideoConferencing_NumberOfOccasionsConferenceProcessingIsDelayed | Number of occasions conference processing is delayed. This issue may occur if the Audio Video Conferencing server is overloaded, or is not getting enough CPU resources to process audio in real time. |
SIP_MessagesPerSecondDroppedDueToUnknownDomain | SIP_MessagesPerSecondDroppedDueToUnknownDomain | The per-second rate of messages that could not be routed because the message domain is not configured and does not appear to belong to a federated partner. The Access Edge Server received SIP messages with an unknown domain. |
IMMCU_ThrottledSIPConnections | IMMCU_ThrottledSIPConnections | The number of throttled SIP connections. A nonzero value suggests that the peer is not processing requests in a timely fashion. This can happen if the peer machine is overloaded. |
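
Several descriptions in the table quote a concrete healthy range (for example, less than 100 ms for dbStoreQueueLatency, below 500 pages per second for Memory_PagesPerSec, and 0 for the MCU health states). The following is a minimal sketch of how a collected sample could be checked against those documented values. The function names, the dictionary layout, and the sample format are assumptions made for illustration; they are not part of the template or the agent.

```python
from typing import Callable, Dict, List

# Healthy-range rules taken from the metric descriptions in the table.
# The rule layout itself is a hypothetical structure for this sketch.
HEALTHY_RULES: Dict[str, Callable[[float], bool]] = {
    "dbStoreQueueLatency": lambda v: v < 100,                # ms, averages less than 100
    "User_services_DBStoreSprocLatency": lambda v: v < 100,  # ms, healthy below 100
    "Memory_PagesPerSec": lambda v: v < 500,                 # pages per second
    "IM_WithThrottledSIPConnections": lambda v: v <= 10,     # more than ten is a warning sign
    "IM_MCUHealthState": lambda v: v == 0,                   # 0 = normal operation
    "DATAMCU_HealthState": lambda v: v == 0,
    "ApplicationSharingMCU_HealthState": lambda v: v == 0,
    "AudioVideoMCU_HealthState": lambda v: v == 0,
}

# Health state codes as listed for the data sharing and audio/video MCUs.
HEALTH_STATES = {0: "Normal", 1: "Loaded", 2: "Full", 3: "Unavailable"}


def describe_health(state: int) -> str:
    """Translate an MCU health state code into its documented label."""
    return HEALTH_STATES.get(state, "Unknown")


def find_violations(sample: Dict[str, float]) -> List[str]:
    """Return the sorted names of metrics in `sample` that fall outside
    their documented healthy range. Metrics without a rule are ignored."""
    return sorted(
        name
        for name, value in sample.items()
        if name in HEALTHY_RULES and not HEALTHY_RULES[name](value)
    )


if __name__ == "__main__":
    # Hypothetical sample values, not real collected data.
    sample = {
        "dbStoreQueueLatency": 120,
        "Memory_PagesPerSec": 499,
        "IM_WithThrottledSIPConnections": 10,
        "DATAMCU_HealthState": 2,
    }
    for name in find_violations(sample):
        print(name, sample[name])
```

Metrics that the table says must be baselined (such as SIP_IncomingRequestRate) are deliberately left out of the rules, since their acceptable range depends on the observed user load.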