Tuesday, August 13, 2013

Kerberos SPN configuration errors for dummies

In a previous article, I wrote about the problem of duplicate Kerberos SPNs (service principal names) and how to identify them.  Since then, I have noticed a recurring theme in my life: application and database people typically don't understand authentication configurations at all.  As a result, accounts get swapped out, configuration changes are made without any thought as to what will work, and so on.  In the end, the whole application environment may have downgraded itself to NTLM or just stopped working altogether.  So I thought I would take another shot at simplifying Kerberos interactions for the typical application web server talking to a database server.

First of all, let's understand what Kerberos is doing for us.  Authentication is how we identify ourselves.  In the example Web Server->SQL Server, the identity being presented could be:

1) A service account on the web server that is logging into the SQL server
2) The end user (at the browser) authenticating to the web server, with the web server set to log into the SQL server on the user's behalf (delegation)

Authentication uses protocols to ensure that the various applications and servers are all speaking the same language.  Typically this is NTLM, NTLMv2, or Kerberos v5.  Here we will focus on Kerberos.

The way Kerberos works is that you have a "Service" that you want to access.  This "Service" has a type and a host machine that it runs on.  Example:

1) Web service on machine Server1.mydomain.com.  In Kerberos SPN format:   HTTP/server1.mydomain.com
2) MSSQL service on machine ServerSQL1.mydomain.com.  In Kerberos SPN format:    MSSQLSvc/ServerSQL1.mydomain.com

There are other variations that include port numbers and domain names, but to keep things simple we will stick to standard ports and Windows services here.
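To make the format concrete, here is a minimal Python sketch that splits an SPN into its parts.  The names `Spn`, `parse_spn` and `same_spn` are made up for illustration, and the sketch ignores the optional third "service name" component that some SPNs carry.  The part after the colon is kept as text because, for SQL Server named instances, it can be an instance name rather than a port number.

```python
from typing import NamedTuple, Optional


class Spn(NamedTuple):
    service_class: str            # e.g. "HTTP" or "MSSQLSvc"
    host: str                     # machine the service runs on
    suffix: Optional[str] = None  # port (or SQL instance name), if present


def parse_spn(text: str) -> Spn:
    """Split 'class/host' or 'class/host:suffix' into its parts."""
    service_class, slash, rest = text.strip().partition("/")
    if not slash or not service_class or not rest:
        raise ValueError(f"not an SPN: {text!r}")
    host, colon, suffix = rest.partition(":")
    return Spn(service_class, host, suffix if colon else None)


def same_spn(a: str, b: str) -> bool:
    """SPNs are compared without regard to case."""
    return a.strip().lower() == b.strip().lower()
```

For example, `parse_spn("HTTP/server1.mydomain.com")` gives the class `HTTP`, the host `server1.mydomain.com` and no suffix, while `same_spn("HTTP/Server1.mydomain.com", "http/server1.mydomain.com")` is true.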

So what is the SPN used for?  Let's look at it in less technical terms first:

John wants to call Amy on the phone.  Amy wants to ensure that the people who call her are really who they say they are.  To meet Amy's requirements, John calls Amy through a phone operator.  The operator has a list of names (accounts), phone numbers (services) and passwords (secrets/keys) for everyone who calls through their system, including John and Amy.  John tells the operator his password and the number he is calling; the operator looks up the phone number and gives him a temporary code to use for his conversation.  John gets through to Amy on the phone and tells her the code.  Amy uses a special program that takes her password and decrypts the temporary code that John got from the operator.  If she can decrypt the code, she knows that she is talking to John.

And now for the technical terms.  When a client connects to the service, it is told that it needs to authenticate.  The client connects to a KDC (Kerberos Key Distribution Center) to request a ticket.  In the Windows world, the KDC is a domain controller (Active Directory).  During a user's logon (or when an application starts running under a service account), the user logs into the KDC to get a Ticket Granting Ticket (TGT).  When it wants to connect to a service, the user sends a request to the KDC for a Service Ticket.  The KDC looks through its database to see which account holds the SPN for the service that the user wants to connect to.  If it finds one, it issues a ticket that is encrypted with the key of the account holding the SPN, along with a session key that only the requestor can read.  The user then presents this ticket to the application they are connecting to, and the application reviews the ticket to grant or deny access.  (See the previous article for the step by step.)

The problem can come in at this point in several ways.  If the SPN was set up on the wrong account, then the ticket is encrypted for the wrong account, and the service that actually receives it cannot decrypt it.
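The ticket flow and this failure can be sketched as a toy model in Python.  This is not real Kerberos: the account names and keys are made up, and an HMAC stands in for the encryption a real KDC performs.  It only shows why a ticket sealed with one account's key is useless to a service running as a different account.

```python
import hashlib
import hmac

# Hypothetical accounts and secret keys (in AD the keys derive from passwords).
KEYS = {"svc_sql": b"sql-secret", "svc_web": b"web-secret"}


def issue_ticket(spn_table, client, spn):
    """KDC side: find the account holding the SPN and seal a ticket with its key."""
    account = spn_table.get(spn.lower())
    if account is None:
        # A real KDC answers KDC_ERR_S_PRINCIPAL_UNKNOWN in this case.
        raise LookupError(f"no account holds {spn}")
    payload = f"{client}->{spn}".encode()
    seal = hmac.new(KEYS[account], payload, hashlib.sha256).digest()
    return payload, seal


def service_accepts(running_as, ticket):
    """Service side: only the account whose key sealed the ticket can verify it."""
    payload, seal = ticket
    expected = hmac.new(KEYS[running_as], payload, hashlib.sha256).digest()
    return hmac.compare_digest(seal, expected)


spn = "MSSQLSvc/serversql1.mydomain.com"
right = {spn.lower(): "svc_sql"}  # SPN on the account SQL Server runs as
wrong = {spn.lower(): "svc_web"}  # SPN registered on the wrong account

print(service_accepts("svc_sql", issue_ticket(right, "john", spn)))  # True
print(service_accepts("svc_sql", issue_ticket(wrong, "john", spn)))  # False
```

In both cases the KDC happily issues a ticket.  The failure only shows up when the service tries to use it, which is why this kind of misconfiguration tends to surface as an application error rather than a directory error.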

Back to the non-technical example:

When John calls the operator, let us assume there was some bad information in the operator's list of names and passwords.  The operator then provides a temporary code that works for Susan.  When John gives this code to Amy, Amy cannot decrypt the code and will have to reject the phone call.

In another form of this problem, if more than one person has the same phone number (a duplicate SPN in Kerberos), the operator may look up the wrong name.

To solve these problems, it is important to know:

  1. What accounts (users or computer objects) are in use
    1. What services they run
    2. What servers they are configured on
    3. Whether they run services on non-standard ports
  2. Whether there is delegation from one service to connect to another service (double hop)
  3. How authentication flows from end to end (have a diagram or documentation, as many of the support people you end up working with do not know anything about your application)

From here you can search for the SPNs that would be in use to look for duplicates.  While searching for duplicates, you can find where the SPNs are assigned.  If the SPNs are assigned to the wrong accounts, then obviously it won't work.  Make sure you get things in the right places, and try to avoid changing things once they are set up and working.  Document, document, document, and update the document.  Avoid running multiple services and application environments on the same account.
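If you can export a list of (account, SPN) pairs from the directory, the duplicate check itself is simple.  Below is a minimal Python sketch, assuming the pairs are already in hand; the function name and the sample accounts are made up for illustration.

```python
from collections import defaultdict


def find_duplicate_spns(assignments):
    """Return {spn: sorted accounts} for every SPN held by more than one account."""
    holders = defaultdict(set)
    for account, spn in assignments:
        holders[spn.lower()].add(account)  # compare SPNs without regard to case
    return {spn: sorted(accts) for spn, accts in holders.items() if len(accts) > 1}


# Hypothetical export: the web SPN sits on both a service account and the
# machine account, differing only in letter case.
sample = [
    ("svc_web", "HTTP/server1.mydomain.com"),
    ("SERVER1$", "http/Server1.mydomain.com"),
    ("svc_sql", "MSSQLSvc/serversql1.mydomain.com"),
]
print(find_duplicate_spns(sample))
# {'http/server1.mydomain.com': ['SERVER1$', 'svc_web']}
```

The same output also tells you where each SPN currently lives, which is the information needed to spot an SPN sitting on the wrong account.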

Symptoms of duplicate SPNs  (the same SPN was set on more than one account)
1) Log events on domain controllers pointing out the duplicate SPN
2) SCOM alerts from the AD management pack for the duplicate SPN alerts in #1
3) Application running NTLM authentication when it was configured for Kerberos
4) Application working sometimes, and giving access denied at other times

Symptoms of incorrectly assigned SPN  (the SPN was assigned to the wrong account)
1) Authentication fails all the time.

Symptoms of missing SPN  (the SPN was never assigned, was misspelled, not entered correctly, or the SPN name doesn't match how users access the service [ex: short name vs fully qualified domain name])
1) Authentication fails completely
2) Authentication is using NTLM
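
The short name vs. fully qualified name case is easy to check on paper.  Here is a minimal Python sketch, assuming the client builds the SPN directly from the host name in the URL and the service is on a standard port; real browsers may resolve DNS aliases before building the SPN, so treat this as a first approximation.  The function names are made up for illustration.

```python
from urllib.parse import urlsplit


def spn_for_url(url):
    """SPN a client would ask for, built from the host part of the URL."""
    host = urlsplit(url).hostname
    if not host:
        raise ValueError(f"no host in URL: {url!r}")
    return f"HTTP/{host}"  # the HTTP class is used for https URLs as well


def is_registered(url, registered_spns):
    """True if the SPN implied by the URL is among the registered SPNs."""
    wanted = spn_for_url(url).lower()
    return wanted in {spn.lower() for spn in registered_spns}


registered = ["HTTP/server1.mydomain.com"]
print(is_registered("http://server1.mydomain.com/app", registered))  # True
print(is_registered("http://server1/app", registered))               # False
```

In the second case the users are typing the short name, but only the fully qualified name was registered, so the KDC has nothing to look up.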

Tools to use:
1) Queryspn.vbs script from Microsoft
2) Setspn
3) Event viewer on multiple machines
4) klist (to view Kerberos tickets, or the lack of one after connecting to an application)
5) Fiddler or some other similar web debugging tool that can show authentication details in the packets to show protocol type
6) Netmon to view Kerberos KDC interactions to find any errors (SPN not found, encryption type not supported, etc.)
7) Increased debug logging in the Microsoft OS.  Kerberos debug logging can be turned on for all machines involved.
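
When looking at a web request in a debugging tool, the Authorization header usually gives away which protocol was actually used.  Below is a minimal Python sketch of that check, based on the common rule of thumb that NTLM tokens begin with the bytes "NTLMSSP" and Kerberos tokens wrapped for Negotiate begin with the byte 0x60.  The function name is made up, and the result is a strong hint rather than proof.

```python
import base64


def auth_protocol(header_value):
    """Guess the protocol from an HTTP Authorization header value."""
    scheme, _, token = header_value.strip().partition(" ")
    if scheme.upper() == "NTLM":
        return "NTLM"
    if scheme.lower() != "negotiate" or not token:
        return "other"
    raw = base64.b64decode(token.strip())
    if raw.startswith(b"NTLMSSP\x00"):
        return "NTLM (inside Negotiate)"  # base64 text starts with "TlRMTVNT"
    if raw[:1] == b"\x60":
        return "Kerberos (most likely)"   # base64 text starts with "YI"
    return "unknown"
```

A header of `Negotiate TlRMTVNT...` on an application that was configured for Kerberos is the downgrade to NTLM described in the symptoms above.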
