Exchange Hybrid Mail Loops: When Your Email Goes on a Magical Mystery Adventure

If you manage an Exchange hybrid environment, you will know that Exchange Online and Exchange On-Premises are linked by a hybrid connector and hybrid mail transport. But what happens when you add a new domain to that connector with the Hybrid Configuration Wizard (HCW), the wizard tells you the domain has been added, and internally the operation has actually failed? Let's start the story there, with the domains we are talking about:

bythepowerofgreyskull.com (primary domain, working perfectly)
pokebearswithsticks.com (working fine)
idobeleiveinfairies.com (the failed intra-connected domain)

All domains are configured as Internal Relay in Exchange Online, all are included in my hybrid connector, and mail flow works seamlessly—except for one domain that was driving me absolutely insane.

The Problem: Mail Loops

When I added idobeleiveinfairies.com to my existing hybrid environment, everything appeared to be configured correctly. The domain was added to the outbound connector, set as Internal Relay, and I had proper MailUser objects in Exchange Online representing my on-premises mailboxes.

But when I tested mail delivery, I got this frustrating result:

Reason: [{LED=554 5.4.14 Hop count exceeded - possible mail loop ATTR34}]
Message sent to mxa-srv57445.pricklybear.com at 84.77.11.299 using TLS1.2 with AES256

The mail was routed to PricklyBear (our fictitious third-party email filter), which handed it to Exchange Online, which in turn looked up the MX record and sent it straight back to PricklyBear. What the hell?

Down the Rabbit Hole of Troubleshooting

Was it a problem with the PricklyBear service? The thought did go through my mind, but bythepowerofgreyskull.com and pokebearswithsticks.com worked perfectly through the same filter, so why would idobeleiveinfairies.com behave differently?

Theory 1: Third-Party MX Records Break Hybrid

I spent hours researching whether having PricklyBear as the MX record was incompatible with Exchange hybrid. I found plenty of documentation about third-party filters, but nothing definitive about this specific scenario. The product did support hybrid mail flow, so why did this one domain fail?

Theory 2: Domain Configuration Issues

I verified everything multiple times:

Get-AcceptedDomain idobeleiveinfairies.com
# Result: DomainType = InternalRelay ✓

Get-OutboundConnector "Office 365 to On-Prem"
# Result: RecipientDomains includes idobeleiveinfairies.com ✓
# RecipientDomains: {idobeleiveinfairies.com, pokebearswithsticks.com, bythepowerofgreyskull.com}
# RouteAllMessagesViaOnPremises: False
# ConnectorType: OnPremises
# SmartHosts: {hybrid.bythepowerofgreyskull.com}

Get-MailUser lee.test@idobeleiveinfairies.com
# Result: RecipientType = MailUser ✓
# ExternalEmailAddress = SMTP:lee.test@idobeleiveinfairies.com

Everything looked correct, but that ExternalEmailAddress was suspicious. It was pointing back to the same domain, which could explain the loop.

Theory 3: Enhanced Filtering Will Fix Everything

Convinced that the issue was with how PricklyBear was forwarding mail to Exchange Online, I enabled Enhanced Filtering for Connectors on our PricklyBear inbound connector:

Set-InboundConnector -Identity "PricklyBear" `
  -EFSkipIPs "84.77.11.299" `
  -EFUsers "lee.test@idobeleiveinfairies.com"

I waited the recommended 15-30 minutes for propagation, but the mail still looped. Enhanced Filtering was supposed to help Exchange Online better understand the true sender of messages, but it had zero impact on the routing decisions.

Theory 4: The MX Record Test

I then decided to test whether the problem was specifically with PricklyBear as the MX record. I temporarily changed the MX record for idobeleiveinfairies.com from PricklyBear to Exchange Online Protection.

The result: Even with EOP as the MX record, I still got the same 5.4.14 mail loop error!

This was a revelation. The problem wasn't PricklyBear at all—it was something fundamental in the Exchange Online configuration.

The Message Trace Deep Dive

To understand what was actually happening, I pulled detailed message traces and headers. Here's what I found in a failed message:

HOP	TIME (UTC)	FROM	TO	WITH
1	7/4/2025 6:16:19 AM	mail-yw1-x112d.google.com	AM3PEPF0000A78E.mail.protection.outlook.com	Microsoft SMTP Server
2	7/4/2025 6:16:19 AM	AM3PEPF0000A78E.eurprd04.prod.outlook.com	AS4PR10CA0030.outlook.office365.com	Microsoft SMTP Server
3	7/4/2025 6:16:24 AM	AM0PR83CU005.outbound.protection.outlook.com	AMS1EPF0000004E.mail.protection.outlook.com	Microsoft SMTP Server
4	7/4/2025 6:16:27 AM	AS8PR03CU001.outbound.protection.outlook.com	AMS0EPF0000019C.mail.protection.outlook.com	Microsoft SMTP Server

The message was bouncing around within Exchange Online's infrastructure—multiple outbound.protection.outlook.com hops showed that EOP was trying to deliver externally multiple times. There were no on-premises server hops anywhere in the chain.

The key headers told the story:

X-MS-Exchange-CrossTenant-AuthAs: Anonymous - EOP treating the message as external
Multiple ARC-Seal entries - Message being processed repeatedly
No hybrid connector usage - Zero evidence of on-premises routing
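
The same reading of the hop table can be done mechanically. What follows is only a rough Python sketch, not an Exchange tool: the (FROM, TO) pairs are copied from the trace above, while the function name, the result keys, and the idea of recognising on-premises servers by a hostname suffix are my own assumptions for illustration.

```python
# Hypothetical helper: summarise a Received-header hop list.
EOP_OUTBOUND_SUFFIX = ".outbound.protection.outlook.com"


def summarize_hops(hops, on_prem_suffixes):
    """hops is a list of (from_host, to_host) pairs in delivery order."""
    # Count hops where EOP handed the message off for external delivery.
    outbound = sum(
        1 for src, _ in hops if src.lower().endswith(EOP_OUTBOUND_SUFFIX)
    )
    # Check whether any host in the chain looks like an on-premises server.
    suffixes = tuple(s.lower() for s in on_prem_suffixes)
    touched_on_prem = any(
        host.lower().endswith(suffixes) for pair in hops for host in pair
    )
    return {"eop_outbound_hops": outbound, "touched_on_prem": touched_on_prem}


# Hops taken from the failed message shown above.
hops = [
    ("mail-yw1-x112d.google.com", "AM3PEPF0000A78E.mail.protection.outlook.com"),
    ("AM3PEPF0000A78E.eurprd04.prod.outlook.com", "AS4PR10CA0030.outlook.office365.com"),
    ("AM0PR83CU005.outbound.protection.outlook.com", "AMS1EPF0000004E.mail.protection.outlook.com"),
    ("AS8PR03CU001.outbound.protection.outlook.com", "AMS0EPF0000019C.mail.protection.outlook.com"),
]

# Assumption: on-premises servers would appear under the smart host's domain.
summary = summarize_hops(hops, ["bythepowerofgreyskull.com"])
print(summary)
```

More than one outbound hop with no on-premises host in the chain matches the pattern described here: the message keeps leaving and re-entering Exchange Online without ever reaching the hybrid connector.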

Back to Basics: Comparing What Actually Works

Eventually, I stopped focusing on the broken domain and started examining what was different between the working and non-working domains.

When I checked the MailUser objects, this is what I found:

# Working domain users
Get-MailUser lee.croucher@bythepowerofgreyskull.com | Select PrimarySmtpAddress,ExternalEmailAddress
# Result: RecipientType = MailUser, but ExternalEmailAddress = (blank)

# Wait, that's wrong - let me check if it's actually a mailbox
Get-Mailbox lee.croucher@bythepowerofgreyskull.com | Select PrimarySmtpAddress,ExternalEmailAddress
# Result: This works! It's actually a cloud mailbox, not a MailUser!

# Broken domain users  
Get-MailUser lee.test@idobeleiveinfairies.com | Select PrimarySmtpAddress,ExternalEmailAddress
# Result: ExternalEmailAddress = SMTP:lee.test@idobeleiveinfairies.com

I had been comparing a cloud mailbox to an on-premises MailUser!

This sent me down another rabbit hole where I started questioning why new on-premises mailboxes were creating MailUser contacts instead of proper RemoteMailboxes. I was convinced the entire hybrid configuration was broken.

But then I realized the truth: MailUser contacts are the correct behavior for on-premises mailboxes that haven't been migrated yet. The issue wasn't the object type—it was the ExternalEmailAddress pointing back to itself, creating the loop.

The Inconsistency That Made No Sense

Here's what was driving me absolutely crazy:

bythepowerofgreyskull.com = Mostly cloud mailboxes, working fine
pokebearswithsticks.com = Same thing, cloud mailboxes working fine
idobeleiveinfairies.com = On-premises MailUsers with a loop-causing ExternalEmailAddress

When Exchange Online tried to deliver mail to the MailUser, it looked at the ExternalEmailAddress, saw it pointed to the same domain, did an MX lookup, found PricklyBear (or EOP, when I tested), and sent the mail back to where it came from. Loop city.
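
To make that check concrete, here is a small Python sketch of the comparison I was doing by eye. It is a heuristic based purely on the diagnosis in this post, not an Exchange API: the function names are made up, and the "contoso" routing address in the usage example is a hypothetical tenant name. A hit means "go and look at this recipient", not proof of a loop.

```python
# Hypothetical helpers: flag MailUsers whose ExternalEmailAddress points at
# a domain the tenant itself accepts mail for.
from typing import Iterable


def external_domain(external_email_address: str) -> str:
    """Return the lower-cased domain of a value like 'SMTP:user@example.com'."""
    # Drop an optional address-type prefix such as 'SMTP:' or 'smtp:'.
    address = external_email_address.split(":", 1)[-1]
    return address.rsplit("@", 1)[-1].strip().lower()


def is_loop_candidate(external_email_address: str,
                      accepted_domains: Iterable[str]) -> bool:
    """True when the target domain is one of the tenant's accepted domains."""
    accepted = {d.strip().lower() for d in accepted_domains}
    return external_domain(external_email_address) in accepted


# The three accepted domains from this post.
ACCEPTED = [
    "bythepowerofgreyskull.com",
    "pokebearswithsticks.com",
    "idobeleiveinfairies.com",
]

# Value seen on the broken MailUser.
print(is_loop_candidate("SMTP:lee.test@idobeleiveinfairies.com", ACCEPTED))
# Hypothetical routing address on a made-up tenant, for comparison.
print(is_loop_candidate("SMTP:lee.test@contoso.mail.onmicrosoft.com", ACCEPTED))
```

Feeding it an export of MailUser addresses would give a quick list of recipients worth reviewing before any mail is tested.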

But why did idobeleiveinfairies.com have this problem when I manually added it to an existing hybrid connector? The other domains were configured properly via the Hybrid Configuration Wizard (HCW), but I had manually added this one later.

Root Cause: A "Successful" Hybrid Configuration That Actually Failed

So why did idobeleiveinfairies.com end up with an ExternalEmailAddress pointing back to itself when the working domains did not? The answer lay in how each domain had been configured.

The working domains (bythepowerofgreyskull.com and pokebearswithsticks.com) were configured via the Hybrid Configuration Wizard originally. idobeleiveinfairies.com was manually added to the connector later.

That's when I found the smoking gun in my HCW logs from when I tried to add the domain properly:

[2025-07-04 14:32:15.847] [INFO] Starting Hybrid Configuration Update...
[2025-07-04 14:32:16.102] [INFO] Validating on-premises connectivity...
[2025-07-04 14:32:17.256] [INFO] Validating Exchange Online connectivity...
[2025-07-04 14:32:18.445] [INFO] Updating connectors configuration...
[2025-07-04 14:32:19.778] [INFO] Adding domain 'idobeleiveinfairies.com' to outbound connector...
[2025-07-04 14:32:20.123] [INFO] Connector update completed successfully.
[2025-07-04 14:32:20.389] [INFO] Updating Organization Relationship configuration...
[2025-07-04 14:32:45.567] [WARNING] Operation timeout occurred while updating Organization Relationship table for domain 'idobeleiveinfairies.com'
[2025-07-04 14:32:45.568] [WARNING] Timeout: The operation did not complete within the expected timeframe (25 seconds)
[2025-07-04 14:32:45.569] [WARNING] Continuing with remaining configuration tasks...
[2025-07-04 14:32:46.234] [INFO] Updating Hybrid Configuration object...
[2025-07-04 14:32:47.891] [INFO] Configuration validation completed.
[2025-07-04 14:32:48.445] [SUCCESS] Hybrid Configuration Wizard completed successfully.
[2025-07-04 14:32:48.446] [INFO] Some warnings were encountered during configuration. Review log for details.

There it was! The HCW reported "success" but had actually failed to update the Organization Relationship table due to a timeout. This meant:

✅ The domain was added to the connector (that's why it appeared in my configuration)
❌ The Organization Relationship mapping was incomplete
❌ Exchange Online didn't know how to properly resolve recipients for this domain
❌ MailUser objects got created with incorrect ExternalEmailAddress values
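
A check like this is easy to script. The Python sketch below assumes the "[timestamp] [LEVEL] message" layout of the excerpt quoted above; a real HCW log may be formatted differently, and the function name and the "trustworthy" flag are my own invention rather than anything the wizard provides.

```python
# Hypothetical log review: do not trust SUCCESS if warnings or errors exist.
import re

# Matches lines shaped like: [2025-07-04 14:32:45.567] [WARNING] message
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+\[(?P<level>[A-Z]+)\]\s+(?P<msg>.*)$")


def review_hcw_log(lines):
    """Collect warnings/errors and note whether the run claimed success."""
    problems = []
    reported_success = False
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # Ignore lines that do not follow the assumed layout.
        if match["level"] in ("WARNING", "ERROR"):
            problems.append(match["msg"])
        elif match["level"] == "SUCCESS":
            reported_success = True
    return {
        "reported_success": reported_success,
        "problems": problems,
        # Only treat the run as clean when it succeeded with no problems.
        "trustworthy": reported_success and not problems,
    }


# Three lines copied from the excerpt above.
sample = [
    "[2025-07-04 14:32:20.123] [INFO] Connector update completed successfully.",
    "[2025-07-04 14:32:45.567] [WARNING] Operation timeout occurred while updating Organization Relationship table for domain 'idobeleiveinfairies.com'",
    "[2025-07-04 14:32:48.445] [SUCCESS] Hybrid Configuration Wizard completed successfully.",
]

result = review_hcw_log(sample)
print(result["reported_success"], result["trustworthy"])
for problem in result["problems"]:
    print("review:", problem)
```

Run against my log, this would have reported a claimed success that was not trustworthy, which is exactly the situation I was in.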

What the Hybrid Configuration Wizard Actually Does

The HCW does far more than just add domains to connectors. When it works properly, it:

Creates Federation Trust between organizations (if Exchange 2010 exists)
Establishes Organization Relationships for cross-premises features
Configures OAuth authentication between environments
Updates the HybridConfiguration Active Directory object that controls recipient resolution
Sets up proper domain routing logic for Exchange Online
Configures proper MailUser ExternalEmailAddress values to point to .mail.onmicrosoft.com routing addresses

The Frustrating Reality

The most maddening part of this entire experience was that the HCW reported success when it had actually failed. I spent days troubleshooting what I thought was a configuration issue, when it was actually a failed wizard run disguised as a successful one.

If the HCW had properly failed and reported an error, I would have known to investigate and re-run it. Instead, it gave me a false sense of security with its "successful" completion.

The Fix

Re-running the Hybrid Configuration Wizard properly resolved the issue. The Organization Relationship timeout didn't occur on the second run, and idobeleiveinfairies.com finally got the proper recipient resolution configuration it needed.

Final Thoughts

This experience taught me never to trust a "successful" HCW run without thoroughly reviewing the logs. Microsoft's tooling should be more transparent about partial failures, especially ones that leave environments in a broken state.

And most importantly: read the damn logs. The answer was sitting there the entire time, hidden behind a deceptive success message and buried under layers of assumptions about what "should" work.
