I still think DNS logs are one of the most overlooked resources for intrusion and malware detection. Command and control servers frequently use specific top-level domains or host names, and because of short TTL values, infected hosts will query DNS servers for these names often. DNS servers are also overlooked choke points, which are as valuable for collecting network-wide data as the firewalls and routers connecting the network to the internet.

In this diary, I would like to introduce a simple shell script to answer one question that in my opinion is quite useful for detecting anomalous DNS queries: Which are the top 10 new host names that we looked up today?

First, you need DNS query logs. There are two ways to collect them: you could either enable query logging in your DNS server, or you could just use tcpdump on the DNS server to collect the logs. Query logging works fine for me, but it can put too much strain on a very busy name server. Running tcpdump on the name server, or on a sensor monitoring the name server, may work better. We do not have to capture every single query for this technique to work.

Next, we need to summarize past queries. In my case, the query logs are rotated hourly, and saved in files with names like "query.log.*" (* is a number). A sample line from my query logs:
To extract the host names and summarize them, I use the following script:
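A minimal sketch of this step, assuming BIND-style query log lines in which the queried name follows the string "query: ". The helper name summarize and the sample input lines are made up for illustration; the pipeline itself is the same one applied to the current log in the next step.

```shell
#!/bin/sh
# Sketch only: "summarize" is a hypothetical helper name, and the
# BIND-style sample lines below are invented for illustration.
LC_ALL=C; export LC_ALL   # keep sort order consistent for the later join

# Read query log lines on stdin and print "count hostname",
# sorted by host name so the result can be fed to join later.
summarize() {
    sed -e 's/.*query: //' | cut -f 1 -d' ' | sort | uniq -c | sort -k2
}

# With real logs this would be: cat query.log.* | summarize > oldlog
printf '%s\n' \
    'client 192.0.2.10#5353: query: www.example.com IN A +' \
    'client 192.0.2.11#4242: query: mail.example.org IN MX +' \
    'client 192.0.2.10#5354: query: www.example.com IN A +' \
    | summarize
```

The sed expression strips everything up to and including "query: ", and cut keeps the first space-delimited field, which is the host name. uniq -c then prefixes each distinct name with its count.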
This will sort the output by host name (sort -k2 sorts by the second column), which becomes important later. Next, I apply the same procedure to the current log:

    cat query.log | sed -e 's/.*query: //' | cut -f 1 -d' ' | sort | uniq -c | sort -k2 > newlog

Now, we need to find all entries in "newlog" that are not included in "oldlog". To do so, we use the Unix command "join", which works pretty much like the SQL JOIN, but uses two text files as input. It is important that both files are sorted on the "join" column (the host name), which was the reason for the -k2 argument earlier.

    join -1 2 -2 2 -a 2 oldlog newlog > combined

"-a 2" will include all records from newlog that are not found in oldlog. "combined" now includes the lines found in both files, as well as the lines only found in "newlog". We need to remove the lines found in both files (which are identified by having two numbers):

    cat combined | egrep -v '.* [0-9]+ [0-9]+$' | sort -nr -k2 | head -10

In the end, we sort the host names by frequency, and return the top 10.

To summarize the script for simple "copy/paste". I broke some lines up to a
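The steps above can be put together roughly as follows. This is a sketch under assumptions, not the original listing: the function name top_new_names, its argument order and the temporary directory are made up here, and grep -E stands in for egrep. The file names query.log, query.log.* and suspects are the ones used in the text.

```shell
#!/bin/sh
# Sketch only: function name, argument order and temp directory are
# assumptions; the pipeline follows the steps described in the text.
LC_ALL=C; export LC_ALL   # sort and join must agree on collation

# usage: top_new_names NEWLOG OLDLOG...
# Prints up to ten "hostname count" lines for the most frequently
# queried host names in NEWLOG that appear in none of the OLDLOG files.
top_new_names() {
    [ "$#" -ge 2 ] || { echo "usage: top_new_names NEWLOG OLDLOG..." >&2; return 2; }
    newlog=$1; shift
    tmpdir=$(mktemp -d) || return 1

    # count queries per host name, sorted by name (the join column)
    sed -e 's/.*query: //' "$@" | cut -f 1 -d' ' \
        | sort | uniq -c | sort -k2 > "$tmpdir/oldlog"
    sed -e 's/.*query: //' "$newlog" | cut -f 1 -d' ' \
        | sort | uniq -c | sort -k2 > "$tmpdir/newlog"

    # keep every name from newlog; names also in oldlog carry two counts
    join -1 2 -2 2 -a 2 "$tmpdir/oldlog" "$tmpdir/newlog" > "$tmpdir/combined"

    # drop names seen before (two trailing numbers), rank the rest
    grep -E -v ' [0-9]+ [0-9]+$' "$tmpdir/combined" | sort -nr -k2 | head -10

    rm -rf "$tmpdir"
}

# Example call with the file names from the text:
#   top_new_names query.log query.log.* > suspects
```

Setting LC_ALL=C is a precaution: join expects its input in the same collation order that sort produced, and mixing locales between the two can make it skip matches.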
The file "suspects" will now include the top 10 suspect domains. For added credit: add the ability to keep a whitelist.
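One possible way to add the whitelist, as a sketch: it assumes a file with one trusted domain per line and input lines in the "hostname count" form that join produces. The file name whitelist and the helper name drop_whitelisted are made up for illustration.

```shell
#!/bin/sh
# Sketch only: the whitelist file (one domain per line) and the
# helper name are assumptions, not part of the original script.

# usage: ... | drop_whitelisted WHITELIST
# Reads "hostname count" lines on stdin and drops every host name that
# equals a whitelisted domain or is a subdomain of one.
drop_whitelisted() {
    awk -v wl="$1" '
        BEGIN {
            # load the whitelist; DNS names are case-insensitive
            while ((getline d < wl) > 0)
                if (d != "") allow[tolower(d)] = 1
        }
        {
            name = tolower($1); keep = 1
            # test the name itself, then each parent domain
            while (name != "") {
                if (name in allow) { keep = 0; break }
                if (index(name, ".") == 0) break
                name = substr(name, index(name, ".") + 1)
            }
            if (keep) print
        }'
}

# Example: filter before taking the top 10, so whitelisted names
# do not use up slots in the result:
#   ... | sort -nr -k2 | drop_whitelisted whitelist | head -10 > suspects
```

Matching on whole labels rather than substrings means that whitelisting example.com covers www.example.com but not a lookalike such as evil-example.com.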
Johannes, ISC Handler, Aug 16th 2012
If your top domains were fairly static, you could combine this with OSSEC's ability to report on changes, so that you are alerted automatically when the output of a command changes (they have a sample script to monitor listening ports, for instance).
Shawn, Aug 17th 2012
Great idea!
For those running Bro [1] on Security Onion [2], I've modified the script [3].
[1] - http://bro-ids.org
[2] - http://securityonion.blogspot.com/
[3] - http://code.google.com/p/security-onion/wiki/DNSAnomalyDetection
DougBurks, Aug 17th 2012
Actually, the "cat" is superfluous.
Should be:

    sed -e 's/.*query: //' $oldlogs | cut -f 1 -d' ' | sort | uniq -c | sort -k2 > $tmpdir/oldlog
    sed -e 's/.*query: //' $newlog | cut -f 1 -d' ' | sort | uniq -c | sort -k2 > $tmpdir/newlog
MikeDawg, Aug 20th 2012