Wednesday, July 18, 2018

Browser video problem, media_error_unknown, ssl_error_rx_record_too_long

My wife recently brought her MacBook to me with a problem that had suddenly popped up. Pretty much every website with embedded video, other than YouTube, had stopped working. The media players were showing various errors, one of which was media_error_unknown. I opened Chrome's developer tools for a better look at the errors. A few of them indicated that a plugin or extension might be interfering with some of the content. I tried disabling a few extensions, such as ad blockers and other security-related plugins, but that didn't help. I found some suggestions that pointed to proxies, but no proxy was configured. Avast was doing some web filtering, but turning that off didn't help either. I checked a few other browsers besides Chrome, and all had the same issue. Opening the media link directly produced a browser "site is insecure" type of SSL error page. Lastly, I thought DNS would be a good place to check, as our home wifi pushes out cleanbrowsing.org's DNS servers. Filtering services like this usually redirect blocked SSL sites to some other SSL site that presents an invalid cert. After switching her DNS to static with Google's 8.8.8.8, everything was up and running again. So the likely cause was a content delivery network being misclassified by the filter, which broke video on news sites and other normal content sites.

Thursday, April 5, 2018

FIM / MIM checking PCNS events for a specific user

The following script builds on some functions I have written previously: the AD object metadata check and the time functions.  It looks up the user's last password set time and the domain controller on which the change was recorded.  It then takes the change time from the AD metadata for that password reset and uses it to remotely search that domain controller's application log for PCNS (Password Change Notification Service) events that match the user's SamAccountName.  It needs to run under an account that has remote WMI permissions on the domain controller, which typically means domain admin unless you have modified the WMI permissions on the cimv2 namespace.

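param (
    # SamAccountName of the user whose last password change should be checked
    [Parameter(Mandatory = $true)]
    [string]$samaccountname
)

# Dot-source the time functions and the metadata check here if they are not already in your profile

# Search a domain controller's application log for PCNS events that mention $username
# within the window that wmitime-timerange returns for $time and $seconds.
function get-PCNSEvents-inrange([string]$server, $time, $seconds, $username) {
    $myTimeRange = wmitime-timerange $time $seconds
    $filter = "logfile='application' and timegenerated>='" + $myTimeRange[0] + "' and timegenerated<='" + $myTimeRange[1] + "' and sourcename='PCNSSVC'"
    # Escape the name so it is matched literally rather than as a regular expression
    $results = Get-WmiObject -ComputerName $server -Class Win32_NTLogEvent -Filter $filter |
        Where-Object { $_.Message -match [regex]::Escape($username) } |
        Select-Object -Last 1 -ExpandProperty Message
    return $results
}

try {
    # The unicodePwd metadata holds the time and originating DC of the last password set
    $pwdChangeEvent = show-adobjmeta -type user -name $samaccountname | Where-Object { $_.attribute -eq "unicodePwd" }
    if ($null -eq $pwdChangeEvent) { throw "Cannot find user in active directory" }

    $eventtime = dt-toWMITime $pwdChangeEvent.ChangeTime
    # The first component of the originator's distinguished name is the server name
    $server = $pwdChangeEvent.originator.split(",")[0].replace("CN=", "")

    # Named $pcnsEvent because $event is an automatic variable in PowerShell
    $pcnsEvent = get-PCNSEvents-inrange $server $eventtime 10 $samaccountname
    if ($null -eq $pcnsEvent) { throw "No events found on domain controller around the time of the last password change." }
    $pcnsEvent
} catch { $_ }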
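
The two time helpers the script calls, dt-toWMITime and wmitime-timerange, are not shown in this post. A minimal sketch of what they might look like follows. ConvertTo-WmiTime and Get-WmiTimeRange are hypothetical stand-ins, not the original functions, and the plus-or-minus window is an assumption about how the range is built. Times are written in UTC with a +000 offset to avoid local offset handling.

```powershell
# Hypothetical stand-in for dt-toWMITime: format a DateTime as a DMTF/WMI
# datetime string (yyyyMMddHHmmss.ffffff followed by the UTC offset in minutes).
function ConvertTo-WmiTime {
    param([Parameter(Mandatory)][datetime]$Time)
    $stamp = $Time.ToUniversalTime().ToString('yyyyMMddHHmmss.ffffff', [cultureinfo]::InvariantCulture)
    return $stamp + '+000'
}

# Hypothetical stand-in for wmitime-timerange: return the WMI times for
# (time - seconds) and (time + seconds). The +/- window is an assumption.
function Get-WmiTimeRange {
    param(
        [Parameter(Mandatory)][string]$WmiTime,
        [Parameter(Mandatory)][int]$Seconds
    )
    $styles = [System.Globalization.DateTimeStyles]::AssumeUniversal -bor
              [System.Globalization.DateTimeStyles]::AdjustToUniversal
    # First 21 characters are the timestamp, the last 4 are the signed offset in minutes
    $stamp = [datetime]::ParseExact($WmiTime.Substring(0, 21), 'yyyyMMddHHmmss.ffffff',
                                    [cultureinfo]::InvariantCulture, $styles)
    $offsetMinutes = [int]::Parse($WmiTime.Substring(21), [cultureinfo]::InvariantCulture)
    $utc = $stamp.AddMinutes(-$offsetMinutes)

    $start = ConvertTo-WmiTime -Time $utc.AddSeconds(-$Seconds)
    $end   = ConvertTo-WmiTime -Time $utc.AddSeconds($Seconds)
    return @($start, $end)
}
```

With these, Get-WmiTimeRange (ConvertTo-WmiTime (Get-Date)) 10 would give the two strings that get spliced into the timegenerated filter.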

Thursday, March 29, 2018

IIS certificate completion failure 0x80094004

I had this error brought up to me recently as someone was trying to install a certificate that was issued and sent along with the CA chain certificates.  Certificate providers often include multiple formats as well as certificate chain information, which can lead to some confusion among application owners who aren't so familiar with all aspects of PKI.  Unfortunately this is usually necessary, as different systems have different requirements for setting up certificates, some of which include the need to manually import the chain.

This particular error suggests something is wrong with a detail on the certificate.  After discussing it with the application team that was doing the install, it turned out they were trying to complete the request with one of the certificate authority certificates instead of the certificate that was issued for the CSR.  After clarifying the matter, they were able to install the correct certificate without any issue.
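
A quick way to tell the files in such a bundle apart is to look at the Basic Constraints extension, which marks CA certificates. The following is a rough sketch, not part of the original troubleshooting. Test-IsCaCertificate is a made-up helper name, and it assumes the CA certificates carry a Basic Constraints extension (very old root certificates may not).

```powershell
# Hypothetical helper: returns $true when the certificate's Basic Constraints
# extension (OID 2.5.29.19) marks it as a certificate authority.
function Test-IsCaCertificate {
    param(
        [Parameter(Mandatory)]
        [System.Security.Cryptography.X509Certificates.X509Certificate2]$Certificate
    )
    $ext = $Certificate.Extensions['2.5.29.19']
    if ($null -eq $ext) {
        # No Basic Constraints at all: treated here as "not a CA" (an assumption)
        return $false
    }
    # Decode the raw extension so the CA flag can be read
    $basic = New-Object System.Security.Cryptography.X509Certificates.X509BasicConstraintsExtension -ArgumentList $ext, $ext.Critical
    return $basic.CertificateAuthority
}
```

Loading each file from the provider with New-Object System.Security.Cryptography.X509Certificates.X509Certificate2 and the file path, then passing it to the helper, should return False only for the issued certificate, which is the one to use when completing the request.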


Sunday, November 12, 2017

Linux mint 18.2 and Windows 10 dual boot

I recently bought a new desktop machine that came with Windows 10 preinstalled.  I had asked the company doing the build to create a 500 GB partition for Windows on one of the two hard drives.  This would leave a significant amount of space on both drives to allow some cross-disk partitioning for Linux and other uses.  It's been a while since I had to select a distro, and I ended up picking Linux Mint.

Starting up the install process was fine with the bootable CD.  I put swap and /tmp in two partitions on drive #1 (the same disk as Windows) and another partition for / on drive #2.  All was fine until the bootloader was going to be written and grub-install failed.  I tried selecting a few other locations for the loader, but either the installer didn't like those, or the install process hung at some point.

I double-checked the BIOS to ensure fast boot and secure boot were off (both were from the start).  Devices were set to UEFI boot mode in the BIOS as well.  After some further checking, I found the Windows partitions had been built in MBR/BIOS boot mode, and that was screwing up grub.  I did a quick Google search and found this article to be pretty straightforward for conversion to UEFI.

I went back to Windows, booted into recovery and tried the steps, but the validate step came back with an error for the disk.  Looking around some more, it seemed like the Linux partitions created during the Mint installation attempt might be the problem.  So I booted back into the live Mint install environment and used the partition tools to wipe out all the Linux partitions.  I booted back into Windows, then recovery, tried the steps again, and everything worked fine this time.

One more reboot, back to mint installer, set up the partitions again and everything was smooth and successful.

Friday, October 13, 2017

Opensuse 42.2 to 42.3 upgrade booting into emergency mode

A few days ago I ran the normal "zypper up" on my 42.2 system and received a lovely update from nvidia for their G03 driver on the 4.4.27-2 kernel.  After this, I noticed vlc had stopped working due to some plugin.  I rebooted my machine and X no longer started up.  The error was something about the nvidia driver not being able to load.  Many hours of trying to fix that, and of trying alternatives like Nouveau, only got me a graphical interface that couldn't seem to do more than 800x600 resolution.  I then noticed there was a new distro release with a different nvidia driver and kernel, so I gave that a try.

The upgrade went OK, and as usual (for the past 10 distro updates this machine has gone through) I expected some problems.  Usually it's my bootloader pointing to the wrong drive and not being able to start up, but not this time.  For the first time, I had my machine booting into emergency mode with no obvious errors in the output on the screen.  All file systems (root, /tmp, swap, Windows partitions) were mounted as RW.  I had no networking, but otherwise could do pretty much anything from the command line except start a graphical interface.  The output of the "journalctl -xb" command showed 2 errors for systemd targets, one for local-fs.target and one that I can't remember at the moment, but I think it was USB related or something else that looked like file systems.  After googling around, I couldn't find anything specific to this problem, though a few results mentioned looking at fstab for partition issues.  I found one line in there which seemed suspicious given the errors in systemd:

usbfs               /proc/bus/usb       usbfs       auto,devmode=0666     0 0

I commented this out and rebooted, and the boot was fine.  Later, installing the nvidia driver and downgrading the kernel to the version that the nvidia driver was built for resolved my original problem.  I spent some more time recreating my desktop environment for the new KDE version.  Altogether it was about a 12 hour recovery process.  So thanks Nvidia, you guys are awesome.

Thursday, October 5, 2017

Master list of Domain Join errors

This article is a collection of error messages from the domain join process, the Windows event viewer, and general observations.  All of these were tested on a Windows Server 2012 R2 machine joining a single 2012 R2 domain controller over a simulated router.  The domain is testforest.local and the domain controller IP is 10.1.1.50.  Various ports were blocked for each test and the results are recorded below.



Main Error Message on client: "An Active Directory Domain Controller (AD DC) for the domain 'testforest.local' could not be contacted.  Ensure that the domain name is typed correctly"



Situation: No functional DNS.  That is, the client has no DNS server IPs configured, the configured IPs are not valid DNS servers, the servers are not reachable from this client, etc.

Sub Error Message when Details are expanded:

Note: This information is intended for a network administrator.  If you are not your network's administrator, notify the administrator that you received this information, which has been recorded in the file C:\Windows\debug\dcdiag.txt.

The following error occurred when DNS was queried for the service location (SRV) resource record used to locate an Active Directory Domain Controller (AD DC) for domain "testforest.local":

The error was: "This operation returned because the timeout period expired."
(error code 0x000005B4 ERROR_TIMEOUT)

The query was for the SRV record for _ldap._tcp.dc._msdcs.testforest.local

The DNS servers used by this computer for name resolution are not responding. This computer is configured to use DNS servers with the following IP addresses:

10.1.1.50

Verify that this computer is connected to the network, that these are the correct DNS server IP addresses, and that at least one of the DNS servers is running.

Steps to perform: Ensure the client is pointing to a valid DNS server that can resolve this Active Directory domain.  Using nslookup as a troubleshooting tool, or nltest /dnsgetdc:<domain>, will help test connectivity.
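
The same SRV lookup the join process performs can also be repeated by hand from PowerShell. This is only a sketch: Get-DcLocatorSrvName and Find-DomainControllerSrv are made-up helper names, and it assumes the Resolve-DnsName cmdlet is available on the client (Windows 8 / Server 2012 and later).

```powershell
# Hypothetical helper: build the SRV record name that shows up in the
# domain join error details for a given AD domain.
function Get-DcLocatorSrvName {
    param([Parameter(Mandatory)][string]$Domain)
    return '_ldap._tcp.dc._msdcs.' + $Domain.TrimEnd('.')
}

# Hypothetical helper: ask one specific DNS server for that SRV record and
# list the domain controllers it returns.
function Find-DomainControllerSrv {
    param(
        [Parameter(Mandatory)][string]$Domain,
        [Parameter(Mandatory)][string]$DnsServer
    )
    Resolve-DnsName -Name (Get-DcLocatorSrvName -Domain $Domain) -Type SRV -Server $DnsServer -ErrorAction Stop |
        Where-Object { $_.Type -eq 'SRV' } |
        Select-Object NameTarget, Port, Priority, Weight
}
```

For the lab in this article the call would be Find-DomainControllerSrv -Domain testforest.local -DnsServer 10.1.1.50. A timeout or a server failure here lines up with the DNS situations described in this section.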



Situation:  An RODC is accessible, but a writable (RW) domain controller is not.  Your machine may be at a branch office with a local RODC that is handling DNS queries, while the link back to a writable domain controller is down.  This error can also come up if the client has a functioning DNS server that provides answers but, due to some connectivity problem, the machine can't connect to a domain controller.

Sub Error Message when Details are expanded:

DNS was successfully queried for the service location (SRV) resource record used to locate a domain controller for domain "testforest.local":

The query was for the SRV record _ldap._tcp.dc._msdcs.testforest.local

The following domain controllers were identified by the query:
forest1dc1.testforest.local

However no domain controllers could be contacted.



Situation: Functional DNS server, but the server doesn't cover this zone.  That is, the DNS server is accessible and is providing answers, but it cannot resolve anything in this Active Directory zone.  It does not host the zone, it does not forward to another server that can answer, and it does not do any recursion to find the answer.


Sub Error Message when Details are expanded:
Note: This information is intended for a network administrator.  If you are not your network's administrator, notify the administrator that you received this information, which has been recorded in the file C:\Windows\debug\dcdiag.txt.

The following error occurred when DNS was queried for the service location (SRV) resource record used to locate an Active Directory Domain Controller (AD DC) for domain "testforest2.local":

The error was: "DNS server failure."
(error code 0x0000232A RCODE_SERVER_FAILURE)

The query was for the SRV record for _ldap._tcp.dc._msdcs.testforest2.local

Common causes of this error include the following:

- The DNS servers used by this computer contain incorrect root hints. This computer is configured to use DNS servers with the following IP addresses:

10.1.1.50

- One or more of the following zones contains incorrect delegation:

testforest2.local
local
. (the root zone)

Steps to perform: 1) Ensure that the domain name typed in on the client is correct, 2) check the DNS infrastructure to find a server that is capable of resolving the Active Directory domain's DNS zone.



Situation: Port 389 blocked (LDAP udp/tcp) 

Sub Error Message when Details are expanded:

Note: This information is intended for a network administrator.  If you are not your network's administrator, notify the administrator that you received this information, which has been recorded in the file C:\Windows\debug\dcdiag.txt.

DNS was successfully queried for the service location (SRV) resource record used to locate a domain controller for domain "testforest.local":

The query was for the SRV record for _ldap._tcp.dc._msdcs.testforest.local

The following domain controllers were identified by the query:
forest1dc1.testforest.local




## This ends the above section where the primary error message is domain controller could not be contacted.  In all three of these cases, there will be no prompt for credentials.


Error:  the RPC Server is unavailable

Situation: Block of port 135.  

What is seen:  User is prompted for credentials.  Domain join is slow but eventually succeeds with the usual welcome-to-the-domain message.  After the success, it may pop up: "Changing the primary domain dns name of this computer to "" failed.  The name will remain "testforest.local"."




Error:  Extremely slow domain join and everything else (boot up, logon, etc)


Situation: kerberos blocked (port 88 with DROP by firewall)

What is seen: Domain join still works but it is much slower, boot up is very slow, logons are very slow, GP update is very slow

Causes errors in system log
-lsasrv 6038  Microsoft Windows Server has detected NTLM authentication is presently being used between clients and this server....

-GroupPolicy 1055  Windows could not resolve the computer name

-TerminalServices-RemoteConnectionManager  1067   The RD Session Host server cannot register 'TERMSRV' Service Principal Name to be used for server authentication.  The following error occured: The system cannot contact a domain controller to service the authentication request.

-DNS Client Events 8019.  The system failed to register host (A or AAAA) resource records (RRs) for network adapter with settings:...

In the application log
-Winlogon 6006 GPClient errors


Situation: Kerberos blocked with icmp reject (port unreachable), same slowness


Error:  none

Situation: port 137 is blocked

What is seen:  prompts for cred, no problem in domain join, works quickly, no issues.



Situation: port 445 blocked

What is seen: Domain join works quickly, boot speed is fine, and logon speed is fine. Gpupdate seems to work over ports 137/139 (blocking those ports as well breaks group policy with event ID 1096 in the system log).  TCP 139 is the primary fallback for 445, though the other ports may be required to get the connection started.


Situation: port 3268  (AD global catalog) blocked

What is seen: No problem, fast join, no obvious problems after join



Situation: All ICMP traffic is blocked

What is seen: Join is fast, boot is fine, logon is fine.  Nothing significant seen here.  Firewall didn't catch any pkt drop.



Situation: Clock time of machine doesn't match domain controller (large skew >5min)

What is seen:  No problem in domain join.  System reboot and logon are both fine.  Clock time syncs after the domain join reboot.

Error: "An Active Directory Domain Controller (AD DC) for the domain 'testforest.local' could not be contacted.  Ensure that the domain name is typed correctly"


 Sub error message in Details:

Note: This information is intended for a network administrator.  If you are not your network's administrator, notify the administrator that you received this information, which has been recorded in the file C:\Windows\debug\dcdiag.txt.

The following error occurred when DNS was queried for the service location (SRV) resource record used to locate an Active Directory Domain Controller (AD DC) for domain "testforest.local":

The error was: "This operation returned because the timeout period expired."
(error code 0x000005B4 ERROR_TIMEOUT)

The query was for the SRV record for _ldap._tcp.dc._msdcs.testforest.local

The DNS servers used by this computer for name resolution are not responding. This computer is configured to use DNS servers with the following IP addresses:

10.1.1.50

Verify that this computer is connected to the network, that these are the correct DNS server IP addresses, and that at least one of the DNS servers is running.

Situation:  all dynamic ports above 1023 dropped in both directions.

Causes: DNS replies are dropped on the return path, which produces the error above.  If the return DNS traffic is allowed through, domain join is fine, but boot is slow and logon is slow.

System log:

Group policy 1053.  The processing of Group Policy failed.  Windows could not resolve the user name.  This could be caused by ...

Group policy 1055.  The processing of Group policy failed.  Windows could not resolve the computer name.  This could be caused by ...

TerminalServices-RemoteConnection Manager 1067   The RD Session Host server cannot register 'TERMSRV' Service Principal Name to be used for server authentication. The following error occured: The RPC server is unavailable.

Service control manager 7022  The Network Location Awareness service hung on starting.

Windows Remote Management 10154

The WinRM service failed to create the following SPNs: WSMAN/Slave1.testforest.local; WSMAN/Slave1.

Additional Data
 The error received was 1722: %%1722.

User Action
 The SPNs can be created by an administrator using setspn.exe utility.

Application Log - winlogon 6006  GPClient taking a long time
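
To check a freshly joined machine for the events listed in this article without clicking through event viewer, something like the sketch below may help. The function names are made up for illustration, and the filter matches on event ID only, so unrelated providers that happen to use the same IDs can show up as well. The ProviderName column is included so those can be spotted.

```powershell
# Event IDs collected from the observations in this article, per log
$DomainJoinEventIds = @{
    System      = 1053, 1055, 1067, 1096, 6038, 7022, 8019, 10154
    Application = 6006
}

# Hypothetical helper: build a Get-WinEvent filter for one of the two logs.
function New-DomainJoinEventFilter {
    param(
        [Parameter(Mandatory)][ValidateSet('System', 'Application')][string]$LogName,
        [datetime]$Since = (Get-Date).AddHours(-1)
    )
    return @{
        LogName   = $LogName
        Id        = $DomainJoinEventIds[$LogName]
        StartTime = $Since
    }
}

# Hypothetical helper: pull the matching events from both logs.
function Get-DomainJoinEvent {
    param(
        [string]$ComputerName = 'localhost',
        [datetime]$Since = (Get-Date).AddHours(-1)
    )
    foreach ($log in 'System', 'Application') {
        $filter = New-DomainJoinEventFilter -LogName $log -Since $Since
        # SilentlyContinue keeps an empty log from raising "no events found"
        Get-WinEvent -ComputerName $ComputerName -FilterHashtable $filter -ErrorAction SilentlyContinue |
            Select-Object TimeCreated, LogName, ProviderName, Id, Message
    }
}
```

Get-DomainJoinEvent with no arguments looks at the last hour on the local machine. Pass -Since to widen the window after a slow boot or logon.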