Because domain controller access is heavily restricted for security reasons, large organizations often find that the team receiving alerts on domain controllers is not the team that manages them. The receiving team could be a helpdesk, server team, monitoring team, etc. With some basic tools from the Windows administrative tools and other Microsoft-provided support tools, you can run a wide range of tests against the machine that will give you a good picture of whether it is functional.
First of all, DCDIAG.exe is a good command-line tool for checking any domain controller's status. Even though some of the tests will not work with a non-privileged account, you can still see some of the most important status results for the server. With a non-privileged account, you can skip the tests that would fail with access denied errors (or just visually ignore those failures in the output):
dcdiag /s:DCServer01 /skip:frsevent /skip:kccevent /skip:systemlog /skip:sysvolcheck /skip:netlogons /skip:replications /skip:services /skip:dfsrevent
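If you run this check often, the command can be wrapped in a small script. The following is only a rough sketch in Python: the function names are made up for illustration, it assumes dcdiag is on the PATH of a Windows machine, and the summary helper assumes the English "passed test" / "failed test" wording in the dcdiag output.

```python
import re
import subprocess

# Tests skipped for non-privileged accounts, taken from the command above.
SKIPPED_TESTS = (
    "frsevent", "kccevent", "systemlog", "sysvolcheck",
    "netlogons", "replications", "services", "dfsrevent",
)


def build_dcdiag_command(server, skipped=SKIPPED_TESTS):
    """Return the dcdiag argument list for one domain controller."""
    return ["dcdiag", f"/s:{server}"] + [f"/skip:{name}" for name in skipped]


def run_dcdiag(server):
    """Run dcdiag and return its text output (Windows only, dcdiag on PATH)."""
    result = subprocess.run(
        build_dcdiag_command(server),
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def summarize_dcdiag(output):
    """Map each test name to True (passed) or False (failed).

    Assumption: result lines contain 'passed test <Name>' or
    'failed test <Name>', as in English-language dcdiag output.
    """
    summary = {}
    for verdict, name in re.findall(r"(passed|failed) test (\S+)", output):
        summary[name] = (verdict == "passed")
    return summary
```

Calling summarize_dcdiag(run_dcdiag("DCServer01")) would then give a dictionary you can scan for the Advertising result.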
If the Advertising test passes, the domain controller is likely functioning pretty well (at least shortly after a boot). Beyond that, you can test LDAP responsiveness. portqry.exe will test the port (389 and/or 3268) and dump out the server capabilities that are advertised on connection. If you don't have portqry.exe, ldp.exe (a GUI tool) or PowerShell can be used to connect. You can check whether the SYSVOL and NETLOGON shares are available with a simple dir \\dcserver01\sysvol or dir \\dcserver01\netlogon command.
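If none of those tools are at hand, a basic reachability check can be scripted. This is a minimal sketch in Python, not a replacement for portqry.exe: it only confirms that the TCP ports accept a connection and that the shares can be listed, and it does not read the advertised server capabilities. The function names are hypothetical, and the share check assumes it runs on a Windows machine where UNC paths resolve.

```python
import os
import socket


def check_tcp_port(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


def check_dc_ports(host, ports=(389, 3268), timeout=3.0):
    """Check the LDAP (389) and global catalog (3268) ports by default."""
    return {port: check_tcp_port(host, port, timeout) for port in ports}


def share_available(host, share):
    """Return True if \\\\host\\share is reachable as a directory.

    Assumption: run on Windows by an account allowed to read the share.
    """
    return os.path.isdir(rf"\\{host}\{share}")
```

For example, check_dc_ports("dcserver01") returns a dictionary keyed by port number, and share_available("dcserver01", "sysvol") mirrors the dir test above.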
For hardware, it depends on the manufacturer. My own experience is with Dell servers and OpenManage. Wherever I go, server support teams seem to be convinced that they can't check hardware on a domain controller because they are not domain admins. OpenManage Server Administrator (OMSA) allows anyone to log on with user-level access and see the status of the components.
SCOM monitoring and other tools can provide details of Active Directory events and errors, although some of these alerts are based on single events and don't close automatically once everything is fine again. They are an additional way of making status information available to lower-level teams.