Wednesday, March 12, 2014

Parsing DNS Debug logs (microsoft)

I have played around a few times with methods of parsing the ugly data lines that come with Microsoft DNS Server's DNS debug log. Due to the differences in types of queries, there is no fixed number of "columns" defined by spaces. Since this is the delimiter, it causes issues in parsing. Besides that, there is the messed up hostnames in the query values that replace the periods with a parenthesis and length of chars in the following value. As logs can get quite large, trying to parse these with powershell can have mixed results. Sometimes it works ok, other times you watch the process grow to several GB of memory utilization and nothing is happening. So, to find a better way, I thought I would dust off the old Unix Shells by Example book and use some gnuwin32 versions of grep, awk and sed to take care of this file. In order to get down to the raw information that I care about, I'm looking at queries received by the server, the source IP, type of record being searched, and the hostname being looked up. To get this I came up with this to transform to csv output:


   grep.exe Rcv c:\temp\dns.log |grep " Q " | gawk -v OFS="," "{print $8,$14,$15}"| sed -n "s/([0-9]*)/./gp"|sed -n "s/\,\./,/gp"|sed -n "s/\.$//gp"

The $8,$14,$15 numbers represent text columns and you may need to adjust this based on output. Also the number of columns may be inconsistent as the data that shows up between the brackets is not always consistent in the log. You can use notepad++ to do a regex find/replace using \[.*\] to clear this out first. Once columns are aligned this output can be dumped to the script, but if you try to put a redirector to dump to text, it will do it, however it seems grep will give you an infinite loop of errors.  So to work around that, you can split this up into two commands.

First use grep:
   grep.exe Rcv c:\temp\dns.log |grep " Q " > temp.txt

Then:
   gawk -v OFS="," "{print $8,$14,$15}" temp.txt | sed -n "s/([0-9]*)/./gp" | sed -n "s/\,\./,/gp" | sed -n "s/\.$//gp" >output.csv


If you want to add the name of the dns server, you can put an extra sed command right before the output rediection
   sed -n "s/^/%computername%,/gp"
if you run it locally, otherwise put in text or some other defined variable there


Additionally you can play with the output, such as looking for source IP's
   awk -v FS="," "{print $1}" output.csv|sort |uniq -c
To get a list of unique client IP's and number of queries


Don't try to run this in powershell. Run in cmd or as a bat file, collect the csv and then you can import to powershell to play around with grouping or whatever you might want to do to see client behavior or records being queried.  If your file is large (I was testing with 200MB), you still won't want to try import-csv in powershell or your machine will grind to a halt.

You can use powershell to try to convert your source IP addresses to hostnames with reverse dns. Copy the text, dump to a variable, split by new-line, run through a foreach loop with: [net.dns]::GetHostByAddress($_).hostname

Additional reference and tools:
1) Gnuwin32 utilities, *nix tools for windows:  http://gnuwin32.sourceforge.net
2) Parsing logs other DNS logs  http://isc.sans.edu/diary/A+Poor+Man%27s+DNS+Anomaly+Detection+Script/13918
3) Reasons why this can be important: https://media.defcon.org/DEF%20CON%2021/DEF%20CON%2021%20video%20and%20slides/DEF%20CON%2021%20Hacking%20Conference%20Presentation%20By%20Robert%20Stucke%20-%20DNS%20May%20Be%20Hazardous%20to%20Your%20Health%20-%20Video%20and%20Slides.m4v

1 comment:

  1. I tried using this again on a dns debug log from a 2012R2 system and the fields don't seem to line up the same anymore. So you need to adjust the print $8,$14,$15 to $9,$15,$16

    ReplyDelete