Notice: Due to size constraints and loading performance considerations, scripts referenced in blog posts are not attached directly. To request access, please complete the following form: Script Request Form Note: A Google account is required to access the form.
Disclaimer: I do not accept responsibility for any issues arising from scripts being run without adequate understanding. It is the user's responsibility to review and assess any code before execution. More information

Proactive Exchange Queue Monitoring: Building a Smart Alert System for Hybrid Environments

This post derived from a silent failure, I detected  where email queue issues can quickly escalate from minor delays to full-blown communication breakdowns. After dealing with one too many situations where queues silently backed up over weekends, I decided to build a robust PowerShell monitoring solution that would proactively alert me to problems before users started complaining.

The Challenge

Exchange Server queue monitoring presents several unique challenges:

  • Multiple servers - Modern Exchange deployments typically span multiple servers
  • Different queue types - Not all queues are created equal (shadow redundancy vs. actual delivery queues)
  • Dynamic discovery - Servers come and go, and hardcoding server names isn't scalable
  • Actionable alerts - Generic "queue is full" alerts don't help with troubleshooting

I needed a script that would intelligently monitor only the queues that matter and provide detailed diagnostic information when problems occur.

The Email Alert Paradox (And Why Hybrid Saves Us)

At first glance, using email alerts to notify about email problems seems like a classic catch-22. If your email system is down, how can it send alerts about being down? This is a legitimate concern for organizations running purely on-premises Exchange.

However, in a hybrid Exchange environment, this paradox doesn't apply. Here's why:

In a hybrid setup:

  • Exchange Online mailboxes remain fully operational and can receive emails
  • On-premises Exchange servers handle outbound mail flow through send connectors
  • Queue issues typically affect outbound delivery, not inbound mail reception

This means that even when my on-premises Exchange servers have queue problems preventing outbound email delivery, my Exchange Online mailboxes can still receive the monitoring alerts. The script monitors the health of send connectors and outbound queues while leveraging the cloud infrastructure for reliable alert delivery.

This architectural advantage makes email-based alerting not just viable, but actually quite elegant for hybrid environments.

Notificatoin Email

When the queue does go above 200 messages on a relevant queue you will receive an alert something like this:

Script Flow Overview

My approach centers around four key components:

  1. Dynamic Exchange server discovery
  2. Intelligent queue filtering
  3. Detailed error analysis
  4. Professional alerting with mobile-responsive design

Let me walk through how I implemented each piece.

Dynamic Server Discovery

Rather than maintaining a static list of Exchange servers, I built the script to automatically discover all relevant servers in the organization:

function Get-ExchangeServers {
    try {
        Write-Log "Discovering Exchange servers..."
        $exchangeServers = Get-ExchangeServer | Where-Object {$_.ServerRole -match 
        "Mailbox|Hub|Edge"}
        Write-Log "Found $($exchangeServers.Count) Exchange servers"
        return $exchangeServers
    }
    catch {
        Write-Log "ERROR: Failed to get Exchange servers - $($_.Exception.Message)"
        return $null
    }
}

This approach automatically adapts to environment changes and ensures I'm always monitoring the complete Exchange topology.

Intelligent Queue Filtering

One of the biggest lessons I learned early on was that not all queues deserve monitoring. Shadow redundancy queues, for example, are internal Exchange mechanisms that can show high message counts during normal operation. I needed to focus only on actual outbound delivery queues:

# Get all queues on the server, focusing on outbound queues only
$queues = Get-Queue -Server $ServerName | Where-Object {
    # Include only actual sending queues
    ($_.DeliveryType -eq "DnsConnectorDelivery" -or 
     $_.DeliveryType -eq "SmartHostConnectorDelivery") -and
    # Exclude internal Exchange queues
    $_.DeliveryType -ne "ShadowRedundancy" -and
    $_.DeliveryType -ne "Heartbeat" -and
    $_.DeliveryType -ne "Undefined" -and
    # Exclude system queues
    $_.NextHopDomain -notmatch "^(Submission|Unreachable|Poison|Shadow)$" -and
    $_.Identity -notmatch "Shadow"
}

This filtering logic eliminates false positives while ensuring I capture genuine email delivery problems.

Deep Queue Analysis

When a queue exceeds the threshold, I don't just want to know that there's a problem - I want to know what the problem is. The script performs detailed analysis using Exchange's Format-List output to extract error information:

function Get-QueueErrorInfo {
    param(
        [string]$ServerName,
        [string]$QueueIdentity
    )
    
    $errorInfo = @{
        HasErrors = $false
        ErrorDetails = @()
    }
    
    try {
        # Get detailed queue information
        $queueDetails = Get-Queue -Server $ServerName -Identity $QueueIdentity | 
        Format-List | Out-String
        
        # Define error patterns to search for
        $errorPatterns = @("failure", "error", "certificate", "retry", "suspended", "paused")
        
        foreach ($pattern in $errorPatterns) {
            if ($queueDetails -match $pattern) {
                $errorInfo.HasErrors = $true
                # Extract lines containing the error pattern
                $lines = $queueDetails -split "`n" | Where-Object { $_ -match $pattern }
                foreach ($line in $lines) {
                    if ($line.Trim() -ne "") {
                        $errorInfo.ErrorDetails += $line.Trim()
                    }
                }
            }
        }
    }
    catch {
        Write-Log "WARNING: Could not get detailed info for queue $QueueIdentity 
        on $ServerName"
    }
    
    return $errorInfo
}

This function searches for specific keywords that indicate problems and extracts the relevant diagnostic information automatically.

Multi-Version Exchange Support

One challenge I encountered was ensuring the script works across different Exchange versions. I implemented a comprehensive PowerShell environment initialization function:

function Initialize-ExchangePowerShell {
    Write-Log "Initializing Exchange PowerShell environment..."
    
    # Check if Exchange cmdlets are already available
    if (Get-Command Get-ExchangeServer -ErrorAction SilentlyContinue) {
        Write-Log "Exchange cmdlets already available"
        return $true
    }
    
    # Try different Exchange PowerShell loading methods
    $exchangeLoaded = $false
    
    # Method 1: Exchange 2010
    if (Get-PSSnapin -Registered -Name Microsoft.Exchange.Management.PowerShell.E2010 
        -ErrorAction SilentlyContinue) {
        Add-PSSnapin Microsoft.Exchange.Management.PowerShell.E2010
        Write-Log "Exchange 2010 snap-in loaded successfully"
        $exchangeLoaded = $true
    }
    
    # Method 2: Exchange 2013+
    if (!$exchangeLoaded -and (Get-PSSnapin -Registered -Name 
        Microsoft.Exchange.Management.PowerShell.SnapIn -ErrorAction SilentlyContinue)) {
        Add-PSSnapin Microsoft.Exchange.Management.PowerShell.SnapIn
        Write-Log "Exchange 2013+ snap-in loaded successfully"
        $exchangeLoaded = $true
    }
    
    return $exchangeLoaded
}

This ensures the script works whether I'm running it on Exchange 2010, 2013, 2016, 2019, or 2022.

Professional Mobile-Responsive Alerting

When problems occur, I need alerts that work equally well on my phone and desktop. I built a clean, card-based HTML email design:

function Create-EmailBody {
    param([array]$ProblematicQueues)
    
    $totalQueues = $ProblematicQueues.Count
    $totalMessages = ($ProblematicQueues | Measure-Object -Property MessageCount -Sum)
    .Sum
    
    $html = @"
<!DOCTYPE html>
<html>
<head>
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <style>
        .container {
            max-width: 800px;
            margin: 0 auto;
            background: white;
            border-radius: 8px;
            box-shadow: 0 2px 10px rgba(0,0,0,0.1);
        }
        .header {
            background: #d73502;
            color: white;
            padding: 20px;
            text-align: center;
        }
        @media (max-width: 600px) {
            .queue-info { grid-template-columns: 1fr; }
        }
    </style>
</head>
<!-- Rest of HTML structure -->
"@
    return $html
}

The design uses CSS Grid for responsive layouts and includes specific error detection sections that only appear when issues are found.

Real-World Output

When I run the script in my environment, I get clean, informative logging that shows exactly what's happening:

[2025-06-22 16:35:01] Starting Exchange Queue Monitor
[2025-06-22 16:35:01] Initializing Exchange PowerShell environment...
[2025-06-22 16:35:01] Exchange cmdlets already available
[2025-06-22 16:35:01] Discovering Exchange servers...
[2025-06-22 16:35:01] Found 6 Exchange servers
[2025-06-22 16:35:01] Checking queues on server: EXCHANGE01
[2025-06-22 16:35:01] Checking queues on server: EXCHANGE02
[2025-06-22 16:35:01] Checking queues on server: EXCHANGE03
[2025-06-22 16:35:01] Checking queues on server: EXCHANGE04
[2025-06-22 16:35:01] Checking queues on server: EXCHANGE05
[2025-06-22 16:35:01] Checking queues on server: EXCHANGE06
[2025-06-22 16:35:02] No queues exceed the threshold of 200 messages. All systems normal.
[2025-06-22 16:35:02] Exchange Queue Monitor completed successfully

This gives me confidence that the script is working correctly and monitoring my entire Exchange infrastructure.

Key Benefits

After implementing this solution in my hybrid Exchange environment, I've seen several major improvements:

Proactive Problem Detection

Instead of waiting for user complaints, I now get immediate alerts when delivery issues begin to develop on the on-premises infrastructure.

Hybrid-Optimized Alerting

The solution leverages the reliability of Exchange Online for alert delivery while monitoring on-premises queue health, eliminating the email alert paradox.

Reduced False Positives

By filtering out shadow redundancy and internal queues, I only get alerts for genuine problems that require investigation.

Mobile Accessibility

The responsive email design means I can effectively triage issues even when I'm not at my desk.

Threshold Tuning

The default 200-message threshold works well for most environments, but you may need to adjust based on your normal email volumes and delivery patterns.

Log Management

The script creates logs in its own directory, making it easy to track performance and troubleshoot issues. I keep about 30 days of logs for trend analysis.

SMTP Configuration

Make sure to configure the SMTP settings for your environment the defaults in the script will be used, but if you want to customise them then you can use this syntax:

.\ExchangeQueueMonitor.ps1 -SMTPServer "smtp.bear.local" -FromAddress 
"alerts@bythepowerofgreyskull.com" -ToAddresses @("lee@bythepowerofgreyskull.com")

Conclusion

Building this Exchange queue monitoring solution has transformed how I manage email infrastructure in my hybrid environment. Instead of reactive firefighting when send connectors fail, I now have proactive visibility into potential issues before they impact users.

The hybrid architecture provides a significant advantage here - while I'm monitoring the health of on-premises outbound mail flow, I can rely on Exchange Online's infrastructure to deliver the alerts reliably. This eliminates the classic "email alerts about email being down" problem that plagues purely on-premises environments.

Previous Post Next Post

نموذج الاتصال