When the time comes to rebuild your domain controllers—whether for a major OS upgrade, hardware refresh, or infrastructure modernization—the process can feel daunting. I've created an interactive runbook that walks you through this complex procedure step by step, ensuring minimal downtime and maximum reliability.
Visuals of the Runbook
This is the interactive runbook; all the sections appear in a menu-style system:
Expanding the "Quick Index" options reveals clickable links:
Next are the FSMO transfer and NTP configuration commands:
Finally, these are the rebuild instructions for each domain controller:
Runbook: Is there a live interactive version?
Yes, there is. You can view that version here
Why Domain Controller Rebuilds Are Complex
Domain controllers are the backbone of your Active Directory environment. They handle authentication, authorization, DNS services, and critical FSMO (Flexible Single Master Operations) roles. A poorly planned rebuild can result in authentication failures, broken trust relationships, and significant business disruption.
The interactive runbook I've developed addresses these challenges by providing a structured, collapsible format where you can focus on exactly the section you need without getting overwhelmed by the entire process.
Understanding FSMO Role Distribution
In the runbook, you'll notice that one domain controller (GRIMCLAW in our example) holds all five FSMO roles initially. This centralized approach simplifies the rebuild process, but your environment may be different. You might have FSMO roles distributed across multiple domain controllers for redundancy or performance reasons.
Before starting any rebuild, document your current FSMO role distribution:
- Schema Master (forest-wide)
- Domain Naming Master (forest-wide)
- RID Master (domain-specific)
- PDC Emulator (domain-specific)
- Infrastructure Master (domain-specific)
Use netdom query fsmo to identify which servers currently hold these roles, as this will determine your rebuild sequence.
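If you want to keep that record in a structured form, the command output can be parsed with a few lines of Python. This is only a sketch: the sample text reflects the usual layout of netdom query fsmo output, and the example.local domain suffix is a placeholder I've made up for illustration.

```python
import re

# Labels as printed by `netdom query fsmo`, mapped to the role names used in this post.
ROLE_LABELS = {
    "Schema master": "Schema Master",
    "Domain naming master": "Domain Naming Master",
    "PDC": "PDC Emulator",
    "RID pool manager": "RID Master",
    "Infrastructure master": "Infrastructure Master",
}


def parse_fsmo(output: str) -> dict[str, str]:
    """Return {role name: holder} from the text output of `netdom query fsmo`."""
    holders = {}
    for line in output.splitlines():
        # The label and the holder are separated by a run of two or more spaces.
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) == 2 and parts[0] in ROLE_LABELS:
            holders[ROLE_LABELS[parts[0]]] = parts[1]
    return holders


def roles_by_server(holders: dict[str, str]) -> dict[str, list[str]]:
    """Group roles by the server that holds them, to see which rebuilds need a transfer first."""
    grouped: dict[str, list[str]] = {}
    for role, server in holders.items():
        grouped.setdefault(server, []).append(role)
    return grouped


# Illustrative sample only; the domain suffix is a placeholder.
SAMPLE = """\
Schema master               GRIMCLAW.example.local
Domain naming master        GRIMCLAW.example.local
PDC                         GRIMCLAW.example.local
RID pool manager            GRIMCLAW.example.local
Infrastructure master       GRIMCLAW.example.local
The command completed successfully.
"""

if __name__ == "__main__":
    for server, roles in roles_by_server(parse_fsmo(SAMPLE)).items():
        print(server, "holds", ", ".join(roles))
```

Any server that appears in the grouped result needs its roles transferred away before it is rebuilt; servers that do not appear can be rebuilt without a transfer step.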
Critical: PDC Emulator and Time Synchronization
One aspect that cannot be overlooked is the relationship between the PDC Emulator role and time synchronization. The PDC Emulator must always be configured as the authoritative time source for your domain (in a multi-domain forest, the forest root domain's PDC Emulator sits at the top of the time hierarchy). This is not optional; it's a fundamental requirement for Active Directory to function correctly.
The PDC Emulator serves as the authoritative time source because:
- It handles password changes and account lockouts domain-wide
- Kerberos authentication is extremely time-sensitive (by default it tolerates only 5 minutes of clock skew)
- All other domain controllers and domain members should trace their time back to the PDC Emulator, either directly or through the domain hierarchy
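To make the clock skew limit concrete, here is a minimal sketch of a skew check. It assumes you have already collected a timestamp from each server by whatever means you prefer; the host names DC2 and DC3 are hypothetical, and the 5-minute limit is the Kerberos default, which your environment may have changed.

```python
from datetime import datetime, timedelta, timezone

# Kerberos default tolerance; adjust if your domain policy differs.
MAX_SKEW = timedelta(minutes=5)


def skew_report(reference: datetime,
                readings: dict[str, datetime],
                limit: timedelta = MAX_SKEW) -> dict[str, timedelta]:
    """Return {host: skew} for every host whose clock is further than `limit` from `reference`."""
    offenders = {}
    for host, reading in readings.items():
        skew = abs(reading - reference)
        if skew > limit:
            offenders[host] = skew
    return offenders


if __name__ == "__main__":
    pdc_time = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    readings = {
        "DC2": pdc_time + timedelta(seconds=30),  # hypothetical host, within tolerance
        "DC3": pdc_time - timedelta(minutes=7),   # hypothetical host, outside tolerance
    }
    print(skew_report(pdc_time, readings))
```

The reference time should come from whichever server currently holds the PDC Emulator role, since that is the clock everything else is measured against.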
Failure to properly configure time synchronization will result in:
- Clock drift across your domain
- Kerberos authentication failures
- Account lockout inconsistencies
- Event log errors related to time synchronization
- Potential replication issues
In the runbook, you'll see that when FSMO roles are moved temporarily during the rebuild process, the NTP configuration must also be transferred. When the PDC Emulator role moves back to GRIMCLAW, the authoritative time configuration must move with it. This ensures continuous, accurate time synchronization throughout the rebuild process.
Time Configuration Best Practices:
- Configure the PDC Emulator to sync with reliable external NTP sources
- Ensure all other DCs sync from the PDC Emulator (or via domain hierarchy)
- Leave domain members on the default domain hierarchy synchronization so their time traces back to the PDC Emulator, and use Group Policy where you need to enforce it
- Monitor for time drift and Kerberos-related authentication errors
Remember: every time you move the PDC Emulator role, you must also reconfigure the authoritative time source to follow it.
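As a rough illustration of what "following the role" means in practice, the helper below builds the w32tm commands for a server depending on whether it currently holds the PDC Emulator role. It is a sketch rather than the runbook's exact commands: the pool.ntp.org peers are placeholders for whatever external sources you trust, and the function only returns strings for you to review and run yourself.

```python
def time_commands(is_pdc_emulator: bool, ntp_peers: list[str]) -> list[str]:
    """Build w32tm commands for the current or former PDC Emulator holder."""
    resync = "w32tm /resync /rediscover"
    if is_pdc_emulator:
        if not ntp_peers:
            raise ValueError("the PDC Emulator needs at least one external NTP peer")
        # 0x8 asks the Windows Time service to use client mode for each peer.
        peers = " ".join(f"{peer},0x8" for peer in ntp_peers)
        return [
            f'w32tm /config /manualpeerlist:"{peers}" '
            "/syncfromflags:manual /reliable:yes /update",
            resync,
        ]
    # Every other DC goes back to following the domain hierarchy.
    return ["w32tm /config /syncfromflags:domhier /reliable:no /update", resync]


if __name__ == "__main__":
    # Placeholder peers; substitute the external sources you actually use.
    for command in time_commands(True, ["0.pool.ntp.org", "1.pool.ntp.org"]):
        print(command)
```

Run the first variant on the server that has just received the PDC Emulator role and the second on the server that has just given it up, so that only one server claims to be the reliable source at any point.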
Critical Pre-Planning: Document Everything Non-AD Integrated
One of the most overlooked aspects of domain controller rebuilds is documenting services and configurations that aren't integrated with Active Directory. These components won't automatically replicate and will be lost during the rebuild if not properly documented.
DNS Zones to Document:
- Conditional forwarders that are not stored in Active Directory
- Standard forward lookup zones (non-AD integrated)
- Standard reverse lookup zones (non-AD integrated)
- Any custom DNS records or configurations
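One way to build that list is to export the zone inventory to JSON and filter it. The sketch below assumes an export along the lines of Get-DnsServerZone piped to ConvertTo-Json, with ZoneName, ZoneType, IsDsIntegrated and IsAutoCreated fields; treat those field names and the sample zones as assumptions and check them against your own export.

```python
import json


def zones_to_document(zone_export_json: str) -> list[str]:
    """Return the names of zones that will not replicate through Active Directory."""
    zones = json.loads(zone_export_json)
    if isinstance(zones, dict):
        # A single-zone export can arrive as one object instead of a list.
        zones = [zones]
    return sorted(
        zone["ZoneName"]
        for zone in zones
        # Skip AD-integrated zones and the built-in auto-created reverse zones.
        if not zone.get("IsDsIntegrated") and not zone.get("IsAutoCreated")
    )


# Made-up inventory for illustration only.
SAMPLE_EXPORT = json.dumps([
    {"ZoneName": "example.local", "ZoneType": "Primary",
     "IsDsIntegrated": True, "IsAutoCreated": False},
    {"ZoneName": "legacy.example.local", "ZoneType": "Primary",
     "IsDsIntegrated": False, "IsAutoCreated": False},
    {"ZoneName": "partner.example.net", "ZoneType": "Forwarder",
     "IsDsIntegrated": False, "IsAutoCreated": False},
    {"ZoneName": "0.in-addr.arpa", "ZoneType": "Primary",
     "IsDsIntegrated": False, "IsAutoCreated": True},
])

if __name__ == "__main__":
    print(zones_to_document(SAMPLE_EXPORT))
```

Every zone the function returns needs its own export or written record before the server is touched.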
File Shares and Local Services:
- Local file shares created directly on domain controllers (though this isn't recommended practice)
- Custom applications or services installed locally
- Non-standard network configurations
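- A locally installed Certificate Authority role or custom PKI configuration (certificate templates themselves are stored in Active Directory, but the CA database and private key are not)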
- Any scheduled tasks or local group policies
Security Configurations:
- Local security policies that differ from domain defaults
- Firewall rules specific to the server
- Any third-party agents or monitoring software
The One-at-a-Time Approach: Minimizing Risk
The runbook emphasizes rebuilding domain controllers one at a time, and this approach is crucial for maintaining service availability. Here's why this methodology works:
- Service Continuity: With multiple domain controllers remaining online, authentication and DNS services continue uninterrupted. Client systems automatically fail over to available domain controllers.
- Validation Points: After each rebuild, you can thoroughly test replication, DNS resolution, and FSMO role functionality before proceeding to the next server.
- Rollback Capability: If issues arise with a newly rebuilt domain controller, you can demote it and restore from backup without affecting the remaining infrastructure.
- Gradual Risk Management: By spreading the work across multiple maintenance windows, you reduce the blast radius of any potential issues.
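The control flow behind this approach can be sketched as a loop with a validation gate. The rebuild and validate callables here are hypothetical stand-ins for your own procedure; in practice the validation step might wrap checks such as dcdiag and repadmin /replsummary, but that wiring is left out.

```python
from typing import Callable, Optional


def rebuild_one_at_a_time(
    dcs: list[str],
    rebuild: Callable[[str], None],
    validate: Callable[[str], bool],
) -> tuple[list[str], Optional[str]]:
    """Rebuild servers in order, stopping at the first one that fails validation.

    Returns (servers rebuilt and validated, server that failed or None).
    """
    completed: list[str] = []
    for dc in dcs:
        rebuild(dc)
        if not validate(dc):
            # Stop here: the remaining servers stay untouched and keep serving clients.
            return completed, dc
        completed.append(dc)
    return completed, None


if __name__ == "__main__":
    # Stub actions for illustration; DC2 and DC3 are hypothetical names.
    done, failed = rebuild_one_at_a_time(
        ["DC3", "DC2", "GRIMCLAW"],
        rebuild=lambda dc: print(f"rebuilding {dc}"),
        validate=lambda dc: dc != "DC2",
    )
    print("validated:", done, "failed:", failed)
```

The point of returning early is that a failed validation never leads to a second server being taken down while the first is still in doubt.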
The FSMO Role Recycling Strategy
The runbook introduces what I call "FSMO Role Recycling"—temporarily moving all FSMO roles to a secondary domain controller during the primary rebuild, then moving them back once the primary is validated. This ensures:
- No FSMO roles are lost during the rebuild process
- Critical forest and domain operations continue uninterrupted
- Role ownership returns to your intended primary server
- You maintain control over role placement through planned transfers rather than being forced into an emergency seizure
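The recycling sequence can be written out as an ordered plan. This sketch generates the transfer commands with the Move-ADDirectoryServerOperationMasterRole cmdlet and leaves the time and rebuild steps as plain reminders; DC2 is a hypothetical secondary server, and the plan is meant to be reviewed by a person rather than executed blindly.

```python
ALL_ROLES = [
    "SchemaMaster",
    "DomainNamingMaster",
    "PDCEmulator",
    "RIDMaster",
    "InfrastructureMaster",
]


def transfer_command(target: str, roles: list[str] = ALL_ROLES) -> str:
    """Build the PowerShell command that transfers the given roles to `target`."""
    return (
        f'Move-ADDirectoryServerOperationMasterRole -Identity "{target}" '
        f'-OperationMasterRole {",".join(roles)}'
    )


def recycling_plan(primary: str, temporary: str) -> list[str]:
    """Ordered steps: park the roles on `temporary`, rebuild `primary`, move them back."""
    return [
        transfer_command(temporary),
        f"Make {temporary} the authoritative time source",
        f"Rebuild and validate {primary}",
        transfer_command(primary),
        f"Make {primary} the authoritative time source again",
    ]


if __name__ == "__main__":
    # DC2 is a hypothetical name for the temporary role holder.
    for number, step in enumerate(recycling_plan("GRIMCLAW", "DC2"), start=1):
        print(f"{number}. {step}")
```

Keeping the time source steps in the same list as the transfers makes it harder to move the PDC Emulator role and forget the NTP configuration that has to travel with it.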
Interactive Runbook Benefits
The collapsible, section-based format of the runbook offers several advantages:
- Focused Execution: Expand only the section you're currently working on, reducing cognitive load and the chance of skipping steps.
- Role-Based Views: Different team members can focus on their specific responsibilities without being overwhelmed by the entire process.
- Validation Checkpoints: Each section includes validation steps to ensure success before proceeding.
- Copy-Paste Commands: PowerShell commands are formatted for easy copying, reducing transcription errors.
Planning Your Rebuild Project
Before diving into the technical steps, consider these planning elements:
- Maintenance Windows: Budget 4-6 hours per domain controller, though the actual rebuild time may be shorter. The additional time accounts for validation, unexpected issues, and rollback if necessary.
- Communication: Inform stakeholders about potential brief service interruptions during FSMO role transfers, even though these should be transparent to end users.
- Backup Strategy: Ensure recent system state backups exist for all domain controllers before beginning any work.
- Testing Environment: If possible, practice the procedure in a lab environment that mirrors your production setup.
Post-Rebuild Considerations
Once all domain controllers are rebuilt, you can consider additional improvements:
- Raise domain and forest functional levels to take advantage of newer features, once every domain controller runs an OS version that supports the target level
- Implement any new security hardening measures
- Update DNS aging and scavenging policies
- Review and optimize replication topology
- Update monitoring and backup procedures for the new infrastructure
Conclusion
Domain controller rebuilds don't have to be overwhelming projects. With proper planning, comprehensive documentation of non-AD integrated components, and a methodical one-at-a-time approach, you can modernize your infrastructure while maintaining business continuity.
The interactive runbook provides the technical framework, but success ultimately depends on thorough preparation and understanding your environment's unique requirements. Take time to document everything, plan your maintenance windows carefully, and don't rush the validation steps.
Remember: in infrastructure work, slow and methodical wins over fast and risky every time.