In my previous post, I shared how a simple chkdsk command completely destroyed a production domain controller. Once the "Unmountable Boot Volume" blue screen sealed its fate, I was left with a dead server that Active Directory still thought was alive and well. This is the story of how I performed a complete metadata cleanup to remove all traces of that failed domain controller from Active Directory.
The Problem: Zombie Domain Controller
When a domain controller dies unexpectedly (hardware failure, corruption, or in my case, filesystem disaster), Active Directory doesn't automatically know it's gone. The domain continues to:
- Reference the dead DC in replication topology
- Maintain DNS records pointing to the failed server
- Keep the computer object in the Domain Controllers OU
- Preserve site and service references
These "zombie" references cause several problems:
- Slow authentication as clients try to contact the dead DC
- Replication errors and warnings
- DNS resolution issues
- General Active Directory health problems
Preparation (collect relevant data)
Before starting the cleanup process, I gathered critical information:
1. Verify the DC is truly dead
There is no going back after this step, so be certain. My server showed "Unmountable Boot Volume" and wouldn't boot.
2. Check FSMO roles
netdom query fsmo
Fortunately, my failed DC (fuzzybear) didn't hold any FSMO roles. If it had, I would have needed to seize those roles on a healthy DC first.
3. Identify other healthy DCs
netdom query /domain:bear.local dc
I had multiple other domain controllers, so the domain was stable.
4. Gather server details
- Server name: fuzzybear.bear.local
- Domain: bear.local
- Site: New York
- IP Address: [noted from DNS records]
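The FSMO check from step 2 can be scripted so it is not left to eyeballing the output. This is a minimal sketch, assuming `netdom query fsmo` prints one role per line with the role name, a run of spaces, and then the holder's FQDN. The function name `Get-FsmoRolesHeldBy` is hypothetical and not part of any module.

```powershell
# Hypothetical helper: lists the FSMO roles a given server holds,
# based on the text output of "netdom query fsmo".
function Get-FsmoRolesHeldBy {
    param(
        [string[]]$NetdomOutput,
        [Parameter(Mandatory = $true)]
        [string]$ServerName
    )

    # Require a dot or end of line after the name so that "fuzzybear"
    # does not also match a server called "fuzzybear2".
    $holderPattern = '^(?<Role>.+?)\s{2,}' + [regex]::Escape($ServerName) + '(\.|\s*$)'

    foreach ($line in $NetdomOutput) {
        if ($line -match $holderPattern) {
            $Matches['Role'].Trim()
        }
    }
}

# Usage (run on a machine that has netdom available):
# $roles = Get-FsmoRolesHeldBy -NetdomOutput (netdom query fsmo) -ServerName 'fuzzybear'
# if ($roles) { Write-Warning "Seize these roles first: $($roles -join ', ')" }
```

An empty result means the failed DC holds no roles and the cleanup can proceed.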
Zombie 🧟 Domain Controller Cleanup Process
‼️ Warning: Before proceeding, make sure the failed domain controller will never come back online. These steps forcibly remove it from Active Directory and should only be followed when a graceful demotion is not possible.
Here's the step-by-step process I used to completely remove all traces of the dead domain controller:
Step 1: NTDSUTIL Metadata Cleanup
This is the core of the cleanup process. NTDSUTIL removes the domain controller from Active Directory's replication topology:
ntdsutil
metadata cleanup
connections
connect to server bearclaws.bear.local
quit
select operation target
list domains
select domain 0
list sites
select site 0
list servers in site
select server 1
quit
remove selected server
quit
quit
What this does:
- Connects to a healthy DC (bearclaws)
- Selects the target domain and site
- Lists servers in the site and selects the failed one
- Removes the server from AD replication metadata
Step 1A: Remove Computer Object from Domain Controllers OU
Sometimes NTDSUTIL doesn't remove the computer object, so I verified and cleaned it manually:
dsrm "CN=fuzzybear,OU=Domain Controllers,DC=bear,DC=local" -subtree -noprompt
Or through Active Directory Users and Computers GUI:
- Navigate to Domain Controllers OU
- Delete the "fuzzybear" computer object
- Select "This DC is permanently offline..."
Step 1B: Verify Sites and Services Cleanup
Check Active Directory Sites and Services to ensure the server object is gone:
- Sites → New York → Servers
- Verify "fuzzybear" is removed
- If still present, delete it manually
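To check Steps 1A and 1B from a prompt instead of the GUI, it helps to know the distinguished names where the leftovers would live. The sketch below only builds those names. It assumes a single-domain forest (so the Configuration partition sits under the same `DC=` components as the domain), the default Domain Controllers OU, and names that need no DN escaping. `Get-DcMetadataDn` is a hypothetical helper name.

```powershell
# Hypothetical helper: builds the DNs of the objects a dead DC leaves behind.
function Get-DcMetadataDn {
    param(
        [Parameter(Mandatory = $true)][string]$ServerName,
        [Parameter(Mandatory = $true)][string]$SiteName,
        [Parameter(Mandatory = $true)][string]$DomainName
    )

    # "bear.local" -> "DC=bear,DC=local"
    $domainDn = ($DomainName.Split('.') | ForEach-Object { "DC=$_" }) -join ','

    # Server object under Sites and Services (assumes a single-domain forest)
    $serverDn = "CN=$ServerName,CN=Servers,CN=$SiteName,CN=Sites,CN=Configuration,$domainDn"

    [pscustomobject]@{
        ComputerObject = "CN=$ServerName,OU=Domain Controllers,$domainDn"
        ServerObject   = $serverDn
        NtdsSettings   = "CN=NTDS Settings,$serverDn"
    }
}

# Usage: each lookup should fail with "object not found" once cleanup is complete.
# $dn = Get-DcMetadataDn -ServerName 'fuzzybear' -SiteName 'New York' -DomainName 'bear.local'
# Get-ADObject -Identity $dn.ServerObject -Server 'bearclaws'
```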
Step 2: DNS Records Cleanup
This was the most time-consuming part. DNS records don't get automatically cleaned up by NTDSUTIL, so I had to remove them manually.
PowerShell Method (Recommended):
# Clean up _msdcs zone records (match the FAILED DC, fuzzybear, never the healthy one)
Get-DnsServerResourceRecord -ZoneName "_msdcs.bear.local" | Where-Object {
    $_.RecordData.IPv4Address -eq "10.85.66.266" -or
    $_.RecordData.NameServer -eq "fuzzybear.bear.local." -or
    $_.RecordData.DomainName -eq "fuzzybear.bear.local." -or
    $_.RecordData.HostNameAlias -eq "fuzzybear.bear.local."   # the DC's GUID CNAME
} | Remove-DnsServerResourceRecord -ZoneName "_msdcs.bear.local" -Force
# Clean up main domain zone
Get-DnsServerResourceRecord -ZoneName "bear.local" | Where-Object {
    $_.RecordData.IPv4Address -eq "10.85.66.266" -or
    $_.RecordData.NameServer -eq "fuzzybear.bear.local." -or
    $_.RecordData.DomainName -eq "fuzzybear.bear.local."
} | Remove-DnsServerResourceRecord -ZoneName "bear.local" -Force
# Clean up reverse lookup zone (PTR records store the host name in PtrDomainName)
Get-DnsServerResourceRecord -ZoneName "66.85.10.in-addr.arpa" | Where-Object {
    $_.RecordData.PtrDomainName -eq "fuzzybear.bear.local."
} | Remove-DnsServerResourceRecord -ZoneName "66.85.10.in-addr.arpa" -Force
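Because these pipelines delete with `-Force`, it is worth previewing exactly which records match before removing anything. The sketch below is a hypothetical filter, `Select-FailedDcRecord`, that applies the same kind of matching but only passes the records through so they can be reviewed. It assumes the host name lives in one of the `RecordData` properties `NameServer`, `DomainName`, `HostNameAlias`, or `PtrDomainName`, depending on record type.

```powershell
# Hypothetical dry-run filter: emits only the records that reference the failed DC.
function Select-FailedDcRecord {
    param(
        [Parameter(ValueFromPipeline = $true)]
        $Record,
        [Parameter(Mandatory = $true)][string]$Fqdn,
        [string]$IPAddress
    )
    begin {
        # DNS record data stores fully qualified names with a trailing dot
        $target = $Fqdn.TrimEnd('.') + '.'
        $nameProps = 'NameServer', 'DomainName', 'HostNameAlias', 'PtrDomainName'
    }
    process {
        $data = $Record.RecordData
        if ($null -eq $data) { return }

        $hit = $false
        if ($IPAddress -and $data.IPv4Address -and "$($data.IPv4Address)" -eq $IPAddress) {
            $hit = $true
        }
        foreach ($p in $nameProps) {
            if ($data.$p -eq $target) { $hit = $true }
        }
        if ($hit) { $Record }
    }
}

# Usage: review the list first, then pipe the same result into the removal.
# Get-DnsServerResourceRecord -ZoneName '_msdcs.bear.local' |
#     Select-FailedDcRecord -Fqdn 'fuzzybear.bear.local' |
#     Format-Table HostName, RecordType
```

If the preview shows anything belonging to a healthy DC, the filter values are wrong and nothing has been deleted yet.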
Step 3: Force Active Directory Replication
After all the cleanup, I forced replication across all domain controllers to ensure the changes propagated:
repadmin /syncall /AdeP
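Command breakdown:
- /syncall = Synchronize the home server with all of its replication partners
- /A = All partitions (naming contexts) held on the home server
- /d = Identify servers by distinguished name in messages
- /e = Enterprise-wide (include partners in all sites)
- /P = Push mode (push changes outward from the home server)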
Step 4: Verification and Health Checks
Finally, I verified that all traces of the dead DC were gone:
# Check overall AD health
dcdiag /v
# Check replication status
repadmin /replsummary
# Verify no references to old DC
repadmin /showrepl
# Check domain controller list
netdom query /domain:bear.local dc
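Reading through `repadmin /showrepl` output by eye is error-prone, so the check can be narrowed to the rows that matter. This is a sketch, assuming the CSV form of the command (`repadmin /showrepl * /csv`) and the column names `Source DSA`, `Destination DSA`, and `Number of Failures` as I recall them from that output. Check them against your own output before relying on it. `Find-ReplicationProblem` is a hypothetical name.

```powershell
# Hypothetical helper: returns replication links that are failing
# or that still reference the removed DC.
function Find-ReplicationProblem {
    param(
        [string[]]$CsvLines,
        [Parameter(Mandatory = $true)][string]$RemovedDcName
    )

    $CsvLines | ConvertFrom-Csv | Where-Object {
        [int]$_.'Number of Failures' -gt 0 -or
        $_.'Source DSA' -eq $RemovedDcName -or
        $_.'Destination DSA' -eq $RemovedDcName
    }
}

# Usage (run on a healthy DC):
# Find-ReplicationProblem -CsvLines (repadmin /showrepl * /csv) -RemovedDcName 'fuzzybear' |
#     Format-Table 'Destination DSA', 'Source DSA', 'Number of Failures', 'Last Failure Status'
```

No output means no failing links and no remaining references to the removed DC in the replication topology.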
Interested in automating the cleanup?
Here's the complete PowerShell script I created. Not that this should be a frequent activity, but it runs the actions in the correct order. Note that it does not run the NTDSUTIL metadata cleanup from Step 1, so do that first:
# Complete Domain Controller Cleanup Script
param(
    [Parameter(Mandatory=$true)]
    [string]$FailedDCName,
    [Parameter(Mandatory=$true)]
    [string]$FailedDCIP,
    [Parameter(Mandatory=$true)]
    [string]$DomainName,
    [Parameter(Mandatory=$true)]
    [string]$HealthyDCName
)
Write-Host "Starting cleanup for failed DC: $FailedDCName" -ForegroundColor Yellow
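# Step 1: Remove computer object from Domain Controllers OU
Write-Host "Removing computer object from Domain Controllers OU..." -ForegroundColor Green
try {
    # Run the AD operations against the healthy DC that was passed in
    $dn = "CN=$FailedDCName,OU=Domain Controllers," + (Get-ADDomain -Server $HealthyDCName).DistinguishedName
    # -Recursive handles child objects; -ErrorAction Stop makes failures reach the catch block
    Remove-ADObject -Identity $dn -Server $HealthyDCName -Recursive -Confirm:$false -ErrorAction Stop
    Write-Host "Computer object removed successfully" -ForegroundColor Green
} catch {
    Write-Host "Computer object not found or already removed: $($_.Exception.Message)" -ForegroundColor Yellow
}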
# Step 2: DNS Cleanup
Write-Host "Cleaning up DNS records..." -ForegroundColor Green
# Clean _msdcs zone
Write-Host "Cleaning _msdcs zone records..." -ForegroundColor Cyan
Get-DnsServerResourceRecord -ZoneName "_msdcs.$DomainName" | Where-Object {
    $_.RecordData.IPv4Address -eq $FailedDCIP -or
    $_.RecordData.NameServer -eq "$FailedDCName.$DomainName." -or
    $_.RecordData.DomainName -eq "$FailedDCName.$DomainName." -or
    $_.RecordData.HostNameAlias -eq "$FailedDCName.$DomainName."   # the DC's GUID CNAME
} | Remove-DnsServerResourceRecord -ZoneName "_msdcs.$DomainName" -Force
# Clean main domain zone
Write-Host "Cleaning main domain zone records..." -ForegroundColor Cyan
Get-DnsServerResourceRecord -ZoneName $DomainName | Where-Object {
    $_.RecordData.IPv4Address -eq $FailedDCIP -or
    $_.RecordData.NameServer -eq "$FailedDCName.$DomainName." -or
    $_.RecordData.DomainName -eq "$FailedDCName.$DomainName."
} | Remove-DnsServerResourceRecord -ZoneName $DomainName -Force
# Clean reverse lookup zone (assuming standard /24 subnet)
$octets = $FailedDCIP.Split('.')
$reverseZone = "$($octets[2]).$($octets[1]).$($octets[0]).in-addr.arpa"
Write-Host "Cleaning reverse lookup zone: $reverseZone" -ForegroundColor Cyan
try {
    # PTR records expose the host name as PtrDomainName, not DomainName.
    # -ErrorAction Stop lets the catch block see a missing zone.
    Get-DnsServerResourceRecord -ZoneName $reverseZone -ErrorAction Stop | Where-Object {
        $_.RecordData.PtrDomainName -eq "$FailedDCName.$DomainName."
    } | Remove-DnsServerResourceRecord -ZoneName $reverseZone -Force
} catch {
    Write-Host "Reverse lookup zone not found or no records to clean" -ForegroundColor Yellow
}
# Step 3: Force replication
Write-Host "Forcing Active Directory replication..." -ForegroundColor Green
Start-Process -FilePath "repadmin" -ArgumentList "/syncall /AdeP" -Wait -NoNewWindow
# Step 4: Verification
Write-Host "Running verification checks..." -ForegroundColor Green
Write-Host "Domain Controllers in domain:" -ForegroundColor Cyan
netdom query /domain:$DomainName dc
Write-Host "Replication summary:" -ForegroundColor Cyan
repadmin /replsummary
Write-Host "Cleanup completed for $FailedDCName" -ForegroundColor Green
Write-Host "Please verify no references remain and monitor replication health" -ForegroundColor Yellow
Usage:
.\Cleanup-FailedDC.ps1 -FailedDCName "fuzzybear" -FailedDCIP "10.85.66.266" -DomainName "bear.local" -HealthyDCName "bearclaws"
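The script derives the reverse lookup zone by assuming a /24 subnet, and it does not check the IP address it is given. The sketch below pulls that logic into a small function that validates the address and also handles /8 and /16 zones. `Get-ReverseZoneName` is a hypothetical name, and classless reverse zones are deliberately out of scope.

```powershell
# Hypothetical helper: derives the in-addr.arpa zone name for an IPv4 address.
function Get-ReverseZoneName {
    param(
        [Parameter(Mandatory = $true)][string]$IPAddress,
        [ValidateSet(8, 16, 24)][int]$PrefixLength = 24
    )

    $octets = $IPAddress.Split('.')
    if ($octets.Count -ne 4) {
        throw "Not a dotted-quad IPv4 address: $IPAddress"
    }
    foreach ($o in $octets) {
        # Each octet must be 1-3 digits and no larger than 255
        if ($o -notmatch '^\d{1,3}$' -or [int]$o -gt 255) {
            throw "Invalid octet '$o' in $IPAddress"
        }
    }

    # Keep the network octets (1, 2, or 3 of them) and reverse their order
    $count = [int]($PrefixLength / 8)
    $networkOctets = @($octets[0..($count - 1)])
    [array]::Reverse($networkOctets)

    ($networkOctets -join '.') + '.in-addr.arpa'
}

# Usage inside the cleanup script:
# $reverseZone = Get-ReverseZoneName -IPAddress $FailedDCIP
```

Failing early on a malformed address is safer than silently querying a zone name that cannot exist.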
Common Issues and Troubleshooting
1. NTDSUTIL Errors
If you get "Invalid DN Syntax" or "Object not found" errors:
- The DC might already be partially cleaned
- Try the GUI method instead (ADUC and Sites & Services)
- Verify you're connected to the right healthy DC
2. DNS Record Persistence
Some DNS records might not delete automatically:
- Check for scavenging settings
- Manually verify each zone
- Use dnscmd for stubborn records
3. Replication Errors
After cleanup, you might see temporary replication errors:
- This is normal as AD adjusts to the topology change
- Monitor with repadmin /replsummary
- Errors should clear within hours
Building the Replacement
After the cleanup was complete, I built a new server to replace the failed one. Once it is built and joined to the domain, it is only a member server, so the first step is to add the required role:
# Install AD DS role first
Install-WindowsFeature -Name AD-Domain-Services -IncludeManagementTools
Then run the command that promotes the member server to a domain controller:
# Promote to domain controller
Install-ADDSDomainController `
-DomainName "bear.local" `
-InstallDns `
-Credential (Get-Credential) `
-SiteName "New York" `
-NoGlobalCatalog:$false `
-DatabasePath "C:\Windows\NTDS" `
-LogPath "D:\NTDS" `
-SysvolPath "E:\SYSVOL" `
-SafeModeAdministratorPassword (Read-Host -AsSecureString "Enter DSRM Password") `
-Force
Specific partner required for replication?
If all your domain controllers are healthy, there is no real need for this parameter. It doesn't matter which one you replicate from, because every domain controller should hold the same consistent information.
However, consider for a moment that not all of the domain controllers are healthy and some are "degraded".
In that situation, I would highly recommend specifying a known working domain controller to replicate from. To do that, add the following parameter when running the command to promote the domain controller:
-ReplicationSourceDC "bearclaws.bear.local" `
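One way to keep that parameter optional is to build the promotion arguments as a hashtable and splat them, adding `ReplicationSourceDC` only when a source is named. This is a minimal sketch covering just a few of the parameters shown earlier. `New-DcPromotionParameter` is a hypothetical helper name.

```powershell
# Hypothetical helper: builds a splattable parameter set for Install-ADDSDomainController.
function New-DcPromotionParameter {
    param(
        [Parameter(Mandatory = $true)][string]$DomainName,
        [Parameter(Mandatory = $true)][string]$SiteName,
        [string]$ReplicationSourceDC
    )

    $params = @{
        DomainName = $DomainName
        SiteName   = $SiteName
        InstallDns = $true
    }

    # Only pin the replication partner when one was explicitly requested
    if ($ReplicationSourceDC) {
        $params['ReplicationSourceDC'] = $ReplicationSourceDC
    }

    $params
}

# Usage:
# $promo = New-DcPromotionParameter -DomainName 'bear.local' -SiteName 'New York' `
#     -ReplicationSourceDC 'bearclaws.bear.local'
# Install-ADDSDomainController @promo -Credential (Get-Credential) `
#     -SafeModeAdministratorPassword (Read-Host -AsSecureString 'Enter DSRM Password')
```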
Lessons Learned
Let’s review the lessons learned from this exercise, because learning is just as important as running commands.
1. Cleanup is Critical
Leaving dead domain controller references in AD causes ongoing problems. Clean them up immediately.
2. DNS is Often Forgotten
NTDSUTIL doesn't clean DNS records. This is a manual process that's easy to overlook but critical for proper operation.
3. Automation Saves Time
Creating a reusable script makes future cleanups much faster and less error-prone.
4. Verification is Essential
Always verify your cleanup worked. Run health checks and monitor replication to ensure everything is working properly.
Conclusion
Forcibly removing a dead domain controller from Active Directory requires careful attention to detail. The process involves:
- NTDSUTIL metadata cleanup - Removes core AD references
- Computer object removal - Cleans up the Domain Controllers OU
- DNS record cleanup - Removes all DNS references (most time-consuming)
- Replication - Ensures changes propagate
- Verification - Confirms the cleanup worked
While this process seems complex, it's essential for maintaining a healthy Active Directory environment in the event of an unexpected disaster.
The silver lining of my chkdsk disaster was tuning and refining the cleanup process. Now I have a solid understanding of how to properly remove failed domain controllers and the automation to do it effectively.