Friday, May 16, 2014

New DC: DFSr trying to replicate with the wrong server

I recently had a problem with a newly promoted domain controller that would not finish its initial synchronization of SYSVOL via DFS-R. The event log showed a series of connection events (ID 5004), followed by event 5014:

The DFS Replication service is stopping communication with partner FARFARAWAY for replication group Domain System Volume due to an error. The service will retry the connection periodically. 

DFSR Event ID 5014

Additional Information: 
Error: 1726 (The remote procedure call failed.) 
Connection ID: C526A9D5-6694-4A4D-AF89-EA943200461F 
Replication Group ID: 79993E0B-57C0-49CA-9BA0-FE0D62ABB93E
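The same events can be pulled from PowerShell rather than Event Viewer. This is only a sketch: it assumes the log name is "DFS Replication" and that it is run from an elevated session on the new domain controller.

```powershell
# List the most recent DFSR connection (5004) and communication-error (5014) events.
# Assumption: the event log is named 'DFS Replication'.
Get-WinEvent -FilterHashtable @{ LogName = 'DFS Replication'; Id = 5004, 5014 } -MaxEvents 20 |
    Select-Object TimeCreated, Id, Message |
    Format-List
```

The partner name in each message shows which server DFS-R is actually trying to reach.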

The server it was trying to reach was not in the same site, nor in the next closest site. It was several sites away and hard to reach because of poor network connectivity, so the RPC failures were no surprise. In Active Directory Sites and Services, the connection object for the new domain controller pointed to a server in the next closest site, which is what I wanted. Even after a few reboots and some fiddling in Sites and Services, it kept trying to connect to the faraway domain controller. After some digging around in the registry, I found the distant domain controller's name in:

HKLM\System\CurrentControlSet\Services\NTDS\Parameters

stored in a value called "Src Root Domain Srv".  This is the domain controller that dcpromo uses as the source for the initial replication of the new domain controller.  After manually editing this value and restarting the DFS-R service, it connected to the server that I wanted it to and finished the synchronization.
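For reference, the check and the edit can also be done from an elevated PowerShell prompt. This is a rough sketch, not a tested procedure: NEARBYDC.example.com is a made-up name standing in for the partner you actually want, and it assumes the service name is DFSR.

```powershell
$ntdsParams = 'HKLM:\SYSTEM\CurrentControlSet\Services\NTDS\Parameters'
$valueName  = 'Src Root Domain Srv'

# Show which domain controller is currently recorded as the source.
(Get-ItemProperty -Path $ntdsParams -Name $valueName).$valueName

# Hypothetical name: replace with the replication partner you want.
$preferredDc = 'NEARBYDC.example.com'
Set-ItemProperty -Path $ntdsParams -Name $valueName -Value $preferredDc

# Restart DFS Replication so it retries with the new partner (assumes service name 'DFSR').
Restart-Service -Name DFSR
```

Exporting the key first (for example with reg export) is a cheap safety net before changing anything under NTDS.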

For DFS-R, there is also another entry that you will find in HKLM\System\CurrentControlSet\Services\DFSR\Parameters\SysVols\Seeding Sysvols\(domain-name), called Parent Computer.  Updating this to the desired replication partner and restarting the DFS-R service will cause the machine to try to replicate with the specified machine.
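A similar sketch for the Parent Computer entry follows. The domain name example.com and the partner name are placeholders, and the Test-Path check is there only because I am assuming the key may not be present on every machine.

```powershell
# Placeholders: substitute your own domain name and preferred partner.
$domainName  = 'example.com'
$preferredDc = 'NEARBYDC.example.com'

$seedingKey = Join-Path 'HKLM:\SYSTEM\CurrentControlSet\Services\DFSR\Parameters\SysVols\Seeding Sysvols' $domainName

if (Test-Path -Path $seedingKey) {
    # Show the current seeding partner, then point it at the preferred one.
    (Get-ItemProperty -Path $seedingKey -Name 'Parent Computer').'Parent Computer'
    Set-ItemProperty -Path $seedingKey -Name 'Parent Computer' -Value $preferredDc
    Restart-Service -Name DFSR
} else {
    Write-Warning "Key not found: $seedingKey"
}
```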

This server had been built from an unattend file that was generated by a script.  Checking that, I realized it wasn't providing a value for ReplicationSourceDC, so dcpromo was probably just picking an arbitrary RWDC from the list to use as a partner.  To fix it, I added a discovery step that finds the nearest domain controller and supplies that value to the file:

$srcdc = Get-ADDomainController -Discover -ForceDiscover -Writable | Select-Object -ExpandProperty HostName
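Roughly, the discovery step can feed the generated file like this. It is a sketch under a few assumptions: the ActiveDirectory module is available on the machine generating the file, the file path is made up, and the [DCInstall] section is the last section in the file so that appending a line lands in the right place.

```powershell
Import-Module ActiveDirectory

try {
    # Ask the DC locator for a writable domain controller, bypassing the cached result.
    $srcdc = Get-ADDomainController -Discover -ForceDiscover -Writable -ErrorAction Stop |
        Select-Object -ExpandProperty HostName
} catch {
    throw "No writable domain controller could be discovered: $_"
}

# Hypothetical path to the generated dcpromo answer file.
$answerFile = 'C:\Temp\dcpromo-unattend.txt'

# Assumes [DCInstall] is the final section, so the appended line belongs to it.
Add-Content -Path $answerFile -Value "ReplicationSourceDC=$srcdc"
```

Failing loudly when nothing is discovered is deliberate: an empty ReplicationSourceDC would put dcpromo back to picking its own partner.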
