Standard load balancer diagnostics with metrics, alerts, and resource health
Azure Load Balancer exposes the following diagnostic capabilities:
Multi-dimensional metrics and alerts: Provides multi-dimensional diagnostic capabilities through Azure Monitor for Azure Load Balancer configurations. You can monitor, manage, and troubleshoot your standard load balancer resources.
Resource health: The Resource Health status of your load balancer is available in the Resource health page under Monitor. This automatic check informs you of the current availability of your load balancer resource.
This article provides a quick tour of these capabilities, and it offers ways to use them for a standard load balancer.
Multi-dimensional metrics
Azure Load Balancer provides multi-dimensional metrics through Azure Metrics in the Azure portal, which help you get real-time diagnostic insights into your load balancer resources. Multi-dimensional metrics aren't supported for Basic load balancers.
The various load balancer configurations provide the following metrics:
| Metric | Resource type | Description | Recommended aggregation |
|---|---|---|---|
| Data Path Availability | Public and internal load balancer | A load balancer continuously exercises the data path from within a region to the load balancer frontend, all the way to the network that supports your VM. As long as healthy instances remain, the measurement follows the same path as your application's load-balanced traffic, so the data path in use is also validated. The measurement is invisible to your application and doesn’t interfere with other operations. | Average |
| Health Probe Status | Public and internal load balancer | A load balancer uses a distributed health-probing service that monitors your application endpoint's health according to your configuration settings. This metric provides an aggregate or per-endpoint filtered view of each instance endpoint in the load balancer pool. You can see how the load balancer views the health of your application, as indicated by your health probe configuration. | Average |
| SYN Count | Public and internal load balancer | A load balancer doesn’t terminate Transmission Control Protocol (TCP) connections or interact with TCP or User Datagram Protocol (UDP) flows. Flows and their handshakes are always between the source and the VM instance. To better troubleshoot your TCP scenarios, you can use the SYN packet counters to understand how many TCP connection attempts are made. The metric reports the number of TCP SYN packets that were received. | Sum |
| Source Network Address Translation (SNAT) Connection Count | Public load balancer | A load balancer reports the number of outbound flows that are masqueraded to the Public IP address frontend. SNAT ports are an exhaustible resource. This metric can give an indication of how heavily your application is relying on SNAT for outbound originated flows. Counters for successful and failed outbound SNAT flows are reported. The counters can be used to troubleshoot and understand the health of your outbound flows. | Sum |
| Allocated SNAT Ports | Public load balancer | A load balancer reports the number of SNAT ports allocated per backend instance. | Average |
| Used SNAT Ports | Public load balancer | A load balancer reports the number of SNAT ports that are utilized per backend instance. | Average |
| Byte Count | Public and internal load balancer | A load balancer reports the data processed per frontend. You may notice that the bytes aren’t distributed equally across the backend instances. This is expected because the Azure Load Balancer algorithm is based on flows. | Sum |
| Packet Count | Public and internal load balancer | A load balancer reports the packets processed per frontend. | Sum |
Note
Bandwidth-related metrics such as SYN count, byte count, and packet count don't capture any traffic that reaches an internal load balancer via a user-defined route (UDR), for example from a network virtual appliance (NVA) or firewall.
Max and Min aggregations aren't available for the SYN count, packet count, SNAT connection count, and byte count metrics. The Count aggregation isn't recommended for data path availability and health probe status. Use Average instead for the most representative health data.
Note
The Data Path Availability metric can take up to 10 minutes to appear in Azure Monitor metrics after a load balancer is created or updated.
View your load balancer metrics in the Azure portal
The Azure portal exposes the load balancer metrics via the Metrics page. This page is available on both the load balancer's resource page for a particular resource and the Azure Monitor page.
Note
Azure Load Balancer doesn't send health probes to deallocated virtual machines. When a virtual machine is deallocated, the load balancer stops reporting metrics for that instance. Metrics that are unavailable appear as a dashed line in the Azure portal, or the portal displays an error message indicating that metrics can't be retrieved.
To view the metrics for your load balancer resources:
Go to the metrics page and do either of the following tasks:
On the load balancer's resource page, select the metric type in the drop-down list.
On the Azure Monitor page, select the load balancer resource.
Set the appropriate metric aggregation type.
Optionally, configure the required filtering and grouping.
Optionally, configure the time range and aggregation. By default, time is displayed in UTC.
Note
Time aggregation is important when interpreting certain metrics because data is sampled once per minute. If time aggregation is set to five minutes and the Sum aggregation type is used for a metric such as Allocated SNAT Ports, the graph displays five times the actual number of allocated SNAT ports.
Recommendation: When analyzing metric aggregation type Sum and Count, we recommend using a time aggregation value that is greater than one minute.
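The arithmetic behind the note above can be sketched in a few lines of Python. This is only an illustration: the allocation of 1,024 ports is a made-up value, and the one-sample-per-minute model is taken from the note rather than from any API.

```python
# Illustrative only: ALLOCATED_PORTS is a hypothetical value, not a real default.
ALLOCATED_PORTS = 1024

# Data is sampled once per minute, so a five-minute time grain holds five samples.
samples = [ALLOCATED_PORTS] * 5

# Sum adds every sample in the grain, which inflates a gauge-style metric.
sum_over_grain = sum(samples)

# Average divides by the number of samples and recovers the real allocation.
avg_over_grain = sum(samples) / len(samples)

print("Sum over five minutes:", sum_over_grain)
print("Average over five minutes:", avg_over_grain)
```

For gauge-style metrics such as allocated or used SNAT ports, the Average value is the one that matches the per-instance allocation.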
Retrieve multi-dimensional metrics programmatically via APIs
For API guidance on retrieving multi-dimensional metric definitions and values, see Azure Monitoring REST API walkthrough. You can also write these metrics to a storage account by adding a diagnostic setting for the 'All Metrics' category.
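As a rough sketch of what such a request might look like, the following Python builds a query URL for the Azure Monitor metrics endpoint without sending it. Several details are assumptions not stated in this article: the `2018-01-01` API version, the REST metric names `VipAvailability` and `DipAvailability` (believed to correspond to Data Path Availability and Health Probe Status), and the placeholder resource ID. Verify them against the REST API walkthrough before relying on them. The helper name `build_metrics_url` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: Azure Resource Manager public endpoint.
ARM_ENDPOINT = "https://management.azure.com"


def build_metrics_url(resource_id, metric_names, aggregation,
                      interval="PT1M", timespan=None,
                      api_version="2018-01-01"):
    """Build (but do not send) a metrics query URL for one resource.

    resource_id must start with '/subscriptions/...'. The api_version default
    is an assumption; check the current Azure Monitor REST reference.
    """
    params = {
        "api-version": api_version,
        "metricnames": ",".join(metric_names),
        "aggregation": aggregation,
        "interval": interval,
    }
    if timespan is not None:
        # ISO 8601 start/end pair, for example 2024-01-01T00:00:00Z/2024-01-01T01:00:00Z
        params["timespan"] = timespan
    query = urlencode(params)
    return f"{ARM_ENDPOINT}{resource_id}/providers/Microsoft.Insights/metrics?{query}"


# Placeholder resource ID: subscription, resource group, and name are made up.
resource_id = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/example-rg"
    "/providers/Microsoft.Network/loadBalancers/example-lb"
)

url = build_metrics_url(resource_id, ["VipAvailability", "DipAvailability"], "Average")
print(url)
```

Sending the request additionally requires a bearer token in the `Authorization` header, which is outside the scope of this sketch.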
Common diagnostic scenarios and recommended views
Is the data path up and available for my load balancer frontend?
Are the backend instances for my load balancer responding to probes?
How do I check my outbound connection statistics?
How do I check my SNAT port usage and allocation?
How do I check inbound/outbound connection attempts for my service?
How do I check my network bandwidth consumption?
How do I diagnose my load balancer deployment?
Configure alerts for multi-dimensional metrics
Azure Load Balancer supports easily configurable alerts for multi-dimensional metrics. You can configure custom thresholds for specific metrics to trigger alerts with varying levels of severity, which enables a no-touch resource monitoring experience.
To configure alerts:
Go to the alert page for the load balancer.
Create a new alert rule.
Configure the alert condition. To avoid noisy alerts, we recommend setting the aggregation type to Average, looking back over a five-minute window of data, with a threshold of 95%.
(Optional) Add an action group for automated repair.
Assign an alert severity, name, and description that enable an intuitive reaction.
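To make the recommended condition concrete, here is a minimal Python sketch of how an Average aggregation over a five-minute lookback window behaves against a 95% threshold. It models the logic only and isn't how Azure Monitor is implemented. The function name `should_alert` is hypothetical, and the "less than" comparison is an assumption, because the steps above don't state the operator.

```python
from statistics import mean


def should_alert(samples, threshold=95.0, window=5):
    """Return True when the average of the last `window` samples is below threshold.

    samples: per-minute availability percentages, oldest first.
    Assumption: the alert fires on 'average < threshold'.
    """
    recent = samples[-window:]
    if not recent:
        # No data in the window: nothing to evaluate, so do not fire.
        return False
    return mean(recent) < threshold


# A single bad minute out of five drags the average to 80 and fires the alert.
print(should_alert([100, 100, 0, 100, 100]))

# A small dip (average 98) stays above the threshold and does not fire.
print(should_alert([100, 100, 100, 100, 90]))
```

Averaging over the window is what suppresses noise: a brief dip must be deep enough, or last long enough, to pull the five-minute average below the threshold.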
Inbound availability alerting
Note
If your load balancer's backend pools are empty, the load balancer will not have any valid data paths to test. As a result, the data path availability metric will not be available, and any configured Azure Alerts on the data path availability metric will not trigger.
To alert on inbound availability, you can create two separate alerts using the data path availability and health probe status metrics. Your scenario may require specific alerting logic, but the following examples are helpful for most configurations.
Using data path availability, you can fire alerts whenever a specific load-balancing rule becomes unavailable. Configure this alert by setting an alert condition for data path availability and splitting by all current and future values for both frontend port and frontend IP address. Setting the alert logic to less than or equal to 0 causes the alert to fire whenever any load-balancing rule becomes unresponsive. Set the aggregation granularity and frequency of evaluation according to how quickly you need to detect an outage.
With health probe status, you can alert when a given backend instance fails to respond to the health probe for a significant amount of time. Set up your alert condition to use the health probe status metric and split by backend IP address and backend port, using the Average aggregation type. This ensures that you can alert separately for each individual backend instance’s ability to serve traffic on a specific port.
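The effect of splitting by dimensions can be sketched as follows. This hedged Python example groups data path availability samples by frontend IP address and frontend port, then reports the combinations whose average is less than or equal to 0, mirroring the alert logic described above. The tuple layout and the function name `unavailable_rules` are assumptions made for illustration, and the IP addresses are documentation placeholders.

```python
from collections import defaultdict
from statistics import mean


def unavailable_rules(samples):
    """Return the (frontend_ip, frontend_port) pairs whose average availability is <= 0.

    samples: iterable of (frontend_ip, frontend_port, availability_percent).
    """
    by_rule = defaultdict(list)
    for frontend_ip, frontend_port, value in samples:
        # Each distinct dimension combination is evaluated as its own time series.
        by_rule[(frontend_ip, frontend_port)].append(value)
    return sorted(rule for rule, values in by_rule.items() if mean(values) <= 0)


samples = [
    ("203.0.113.10", 80, 100), ("203.0.113.10", 80, 100),
    ("203.0.113.10", 443, 0), ("203.0.113.10", 443, 0),
    ("203.0.113.20", 80, 0), ("203.0.113.20", 80, 100),
]
print(unavailable_rules(samples))
```

Only the rule on port 443 is reported, because it is the only combination with no availability across the whole window. The same grouping idea applies to health probe status when split by backend IP address and backend port.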
Outbound availability alerting
For outbound availability, you can configure two separate alerts using the SNAT connection count and used SNAT port metrics.
To detect outbound connection failures, configure an alert using SNAT connection count and filtering to Connection State = Failed. Use the Total aggregation. Then, you can split this by backend IP address set to all current and future values to alert separately for each backend instance experiencing failed connections. Set the threshold to be greater than zero or a higher number if you expect to see some outbound connection failures.
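A minimal Python sketch of this alert logic follows, assuming the metric records arrive as (backend IP address, connection state, count) tuples. That layout and the function name `backends_with_failed_snat` are illustrative assumptions, not an Azure API.

```python
from collections import Counter


def backends_with_failed_snat(records, threshold=0):
    """Total failed SNAT connections per backend and keep those above threshold.

    records: iterable of (backend_ip, connection_state, count).
    """
    totals = Counter()
    for backend_ip, state, count in records:
        # Filter to Connection State = Failed, as in the alert condition.
        if state == "Failed":
            totals[backend_ip] += count
    # Splitting by backend IP address means each instance is judged separately.
    return {ip: total for ip, total in totals.items() if total > threshold}


records = [
    ("10.0.0.4", "Successful", 120),
    ("10.0.0.4", "Failed", 3),
    ("10.0.0.5", "Successful", 80),
    ("10.0.0.4", "Failed", 2),
]
print(backends_with_failed_snat(records))
print(backends_with_failed_snat(records, threshold=10))
```

Raising `threshold` corresponds to tolerating a known baseline of outbound failures before the alert fires.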
With used SNAT ports, you can alert on a higher risk of SNAT exhaustion and outbound connection failure. Ensure you’re splitting by backend IP address and protocol when using this alert. Use the Average aggregation. Set the threshold to be greater than a percentage of the number of ports you’ve allocated per instance that you determine is unsafe. For example, configure a low severity alert when a backend instance uses 75% of its allocated ports. Configure a high severity alert when it uses 90% or 100% of its allocated ports.
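The two-tier threshold in this example can be expressed as a small Python function. This is a sketch under stated assumptions: the function name `snat_severity` is hypothetical, the comparison uses "greater than or equal to", and the 75% and 90% defaults are simply the example values from the paragraph above.

```python
def snat_severity(used_ports, allocated_ports, low=0.75, high=0.90):
    """Classify SNAT port usage for one backend instance and protocol.

    Returns "high", "low", or None when usage is below both thresholds.
    """
    if allocated_ports <= 0:
        raise ValueError("allocated_ports must be positive")
    ratio = used_ports / allocated_ports
    if ratio >= high:
        return "high"
    if ratio >= low:
        return "low"
    return None


# With a hypothetical allocation of 1,024 ports per instance:
print(snat_severity(100, 1024))
print(snat_severity(768, 1024))
print(snat_severity(1024, 1024))
```

Because port allocation can differ per instance, the ratio should be computed against each instance's own allocation rather than a single global number.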
Resource health status
Health status for standard load balancer resources is exposed through the existing Resource health page under Monitor > Service health. Health is evaluated every two minutes by measuring data path availability, which determines whether your frontend load-balancing endpoints are available.
| Resource health status | Description |
|---|---|
| Available | Your standard load balancer resource is healthy and available. |
| Degraded | Your standard load balancer has platform-initiated or user-initiated events impacting performance. The metric for data path availability has reported less than 90% but greater than 25% health for at least two minutes. With this status, you might experience a moderate to severe performance impact. See Support and troubleshooting for Azure Load Balancer to determine whether there are user-initiated events impacting your availability. |
| Unavailable | Your standard load balancer resource isn’t healthy. The metric for data path availability has reported less than 25% health for at least two minutes. With this status, you might experience a significant performance impact or a lack of availability for inbound connectivity. There may be user or platform events causing unavailability. See Support and troubleshooting for Azure Load Balancer to determine whether there are user-initiated events impacting your availability. |
| Unknown | Health status for your load balancer resource hasn’t been updated or hasn’t received information for data path availability for the last 10 minutes. This state should be transient and will reflect correct status as soon as data is received. |
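The thresholds in the table can be summarized as a simple mapping from data path availability to health status. The Python sketch below is a simplification: it ignores the "for at least two minutes" duration requirement, and it treats a value of exactly 25% as Degraded, which is an assumption because the table doesn't define that boundary. The function name `resource_health` is hypothetical.

```python
def resource_health(availability_percent):
    """Map a data path availability value (0-100, or None) to a health status."""
    if availability_percent is None:
        # No data received, for example for the last 10 minutes.
        return "Unknown"
    if availability_percent < 25:
        return "Unavailable"
    if availability_percent < 90:
        # Assumption: exactly 25% falls in this bucket.
        return "Degraded"
    return "Available"


for value in (None, 10, 60, 99.9):
    print(value, "->", resource_health(value))
```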
To view the health of your public standard load balancer resources:
Select Monitor > Service health.
Select Resource health, and then make sure that Subscription ID and Resource type = load balancer are selected.
In the list, select the load balancer resource to view its historical health status.
A generic description of a resource health status is available in the resource health documentation.
Resource health alerts
Azure Resource Health alerts can notify you in near real-time when the health state of your Load balancer resource changes. It's recommended that you set resource health alerts to notify you when your Load balancer resource is in a Degraded or Unavailable state.
When you create Azure resource health alerts for Load balancer, Azure sends resource health notifications to your Azure subscription. You can create and customize alerts based on:
- The subscription affected
- The resource group affected
- The resource type affected (Load balancer)
- The specific resource (any Load balancer resource you choose to set up an alert for)
- The event status of the Load balancer resource affected
- The current status of the Load balancer resource affected
- The previous status of the Load balancer resource affected
- The reason type of the Load balancer resource affected
You can also configure who the alert should be sent to:
- A new action group (that can be used for future alerts)
- An existing action group
For more information on how to set up these resource health alerts, see:
Next steps
- Learn about Network Analytics.
- Learn about using Insights to view these metrics preconfigured for your load balancer.
- Learn more about Standard load balancer.