Last Updated: 2008-10-07 18:28:15 UTC
by Jim Clausing (Version: 4)
One of the sources we use to identify incidents is the network-based intrusion detection system (NIDS) that most of our enterprises have, at least at the border, at our known internet connections. The NIDS, however, can be pretty noisy, so how do we turn the noise into actionable data? How much access does the incident handler have to the raw NIDS data? As Steve pointed out yesterday, the alerts from the NIDS are just events; they don't (usually) become an incident until those events have been correlated with other data. How do you use NIDS data to identify incidents requiring activation of your IH process? Let us know via the contact page and this story will be updated throughout the day.
This is a great question, but I'm really interested in the answer to a related question: "How do you use non-NIDS data to validate NIDS alerts?" I don't have to tell you guys that the amount of data that comes from a single alert is sometimes very skimpy, and doesn't always provide good decision-making support.
As I evaluate an alert, I routinely ask myself a series of questions, then try to find the answers. In most cases, the questions are something like:
1. Was this an actual attack?
2. If so, was the attack successful?
3. What other systems may also have been attacked?
4. What activities did the intruder try to carry out?
5. What other resources were they able to gain access to?
6. How should we contain, eradicate and recover from the intrusion?
Most of these questions are difficult to answer just by looking at an individual alert, but I can usually answer them quite easily (and quickly) by examining sessions and/or PCAP data. Well, except for #6, which is usually pretty tricky.
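As a sketch of that kind of pivot from a single alert into session data, here's one way to pull flow records surrounding an alert's timestamp and host. The record fields and sample values below are purely illustrative assumptions, not the output of any particular collector:

```python
from datetime import datetime, timedelta

# Hypothetical flow records, e.g. exported from a NetFlow-style collector.
# Field names and addresses are illustrative only.
flows = [
    {"ts": datetime(2008, 10, 7, 14, 2, 5), "src": "10.1.1.5",
     "dst": "192.0.2.80", "dport": 80, "bytes": 512},
    {"ts": datetime(2008, 10, 7, 14, 2, 9), "src": "10.1.1.5",
     "dst": "192.0.2.80", "dport": 4444, "bytes": 90210},
    {"ts": datetime(2008, 10, 7, 16, 30, 0), "src": "10.1.1.9",
     "dst": "192.0.2.80", "dport": 80, "bytes": 300},
]

def flows_near_alert(flows, host, alert_time, window_minutes=15):
    """Return all flows involving `host` within +/- window of the alert."""
    window = timedelta(minutes=window_minutes)
    return [f for f in flows
            if host in (f["src"], f["dst"])
            and abs(f["ts"] - alert_time) <= window]

alert_time = datetime(2008, 10, 7, 14, 2, 7)
related = flows_near_alert(flows, "10.1.1.5", alert_time)
for f in related:
    print(f["ts"], f["src"], "->", f["dst"], "port", f["dport"], f["bytes"], "bytes")
```

A large transfer to an unusual port in the minutes right after the alert (as in the second record) is often enough context to answer questions 1, 2 and 4 above before any box is touched.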
I'm curious to know what your other readers are doing to validate their NIDS alerts, even before they feed into the incident handling processes.
So, what do you think? Keep the thoughts and ideas coming. Over the next couple of days, we will be looking at some other non-NIDS sources for identification, but there's no reason we can't start some of that conversation today.
When I analyze events in our NIDS, I use a simple and straightforward approach.
Turn noise into data:
1. I check the event for possible false positives by looking at the data in the payload of the captured packet (in BASE, using Snort for NIDS) and at the rule itself to determine which property triggered the alert.
Access as an IH:
2. If I determine that it's not a false positive, I verify the packet payload using a tcpdump file that is set to capture the full snaplen of packets from the network of the PC in question. I run both Snort and tcpdump on the same systems, just on different interfaces. I run this through Wireshark and analyze the conversation. BTW, this process doesn't take very long, 10 minutes if that. One should not make their systems so complicated that it takes a PhD to find and analyze data.
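A capture setup along the lines of step 2 might look like the following. The interface names and file paths are examples only, and these commands need root and a live monitoring interface, so treat them as a sketch rather than a drop-in config:

```shell
# Full-snaplen capture on the monitoring interface (-s 0 = whole packet):
tcpdump -i eth1 -s 0 -w /var/log/pcap/full-20081007.pcap

# Later, carve out just the conversation for the host that tripped the alert:
tcpdump -r /var/log/pcap/full-20081007.pcap -w /tmp/suspect.pcap host 10.1.1.5

# Open /tmp/suspect.pcap in Wireshark and use "Follow TCP Stream" to review
# the session payload against the Snort rule that fired.
```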
Using all the data:
3. If I find that the event is positive and requires action, I start logging in a notebook what actions and steps I am taking to find and clean the PC. In most cases this simply involves calling the Support Desk and having someone retrieve the PC. The PC is swapped out and the infected PC is re-imaged or analyzed further, depending on the severity of the incident.
4. I review what I logged and take action if needed, such as updating firewall rules.
This process is simplified and would change based on the severity of the event/incident. Take care to note that the escalation process from event to incident should not be convoluted by over-analyzing 10 different log sources. If you have a PC that is questionable and the truth cannot be ascertained quickly, then go with your gut and remove the PC from the network. Better to be safe than sorry.
* Statistical analysis of firewall logs for larger attacks (which often pointed out attacks missed otherwise).
* Looked at the source's activity in the firewall logs, or at destination/time if there was more than one source.
* Looked at the web logs.
* If the traffic was outbound, looked at traffic bouncing off the internal interface of the firewall AND the proxy logs.
* Ran whois on the source.
* Reviewed news about the source and/or data from the IDS and other logs.
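The statistical-analysis step at the top of that list can be as simple as counting denied connections per source. A minimal sketch, using a made-up firewall log format (real formats vary by vendor):

```python
from collections import Counter

# Illustrative firewall log lines; field layout is an assumption.
log_lines = [
    "2008-10-07 14:01:02 DENY TCP 203.0.113.7:53211 -> 10.1.1.5:445",
    "2008-10-07 14:01:03 DENY TCP 203.0.113.7:53212 -> 10.1.1.6:445",
    "2008-10-07 14:01:04 DENY TCP 203.0.113.7:53213 -> 10.1.1.7:445",
    "2008-10-07 14:05:00 DENY TCP 198.51.100.9:40000 -> 10.1.1.5:22",
]

def top_denied_sources(lines, n=5):
    """Count DENY entries per source IP. One source hitting many internal
    hosts in quick succession is a classic sweep signature that a NIDS
    tuned for single-packet signatures can miss."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if "DENY" in fields:
            src_ip = fields[4].split(":")[0]
            counts[src_ip] += 1
    return counts.most_common(n)

print(top_denied_sources(log_lines))
```

Even this crude tally surfaces 203.0.113.7 sweeping port 445 across consecutive internal hosts, the kind of larger attack the firewall logs catch when the NIDS stays quiet.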
An open-source, NetFlow-like data source.
We feed it into our ArcSight collector and sort and parse from there. It's invaluable as an IH tool.
Also feeding into it are firewall logs, VPN logs, and my Snort alerts.
In my experience, handling volumes of raw data in real time boils down to simple algorithms built into your processes or your technology.
For example, if you have no Linux systems, you can filter out Linux-specific concerns before you even see them. Ditto web services, VPN, whatever you do or don't want to see. If you don't serve up a website, what's the value in constantly examining that traffic? The data can still be logged, but you don't necessarily need it live.
On the flip side, as David asked, you can gather more correlating intelligence as well. This could include firewall logs, various server logs or Windows events. This is an example of building algorithms into your monitoring procedure. (bonus points if you can have the firewall and IDS correlate for you.)
The data can be skimpy, but sometimes all you need to start is an IP, a port, or just a timestamp. Once you pull on the first thread, the rest begins to unravel.
You can also build more intelligence into alerts that you do want to see. For example, inbound traffic should be encrypted if it is coming over ports designated for VPN, SSL, IPsec, etc. Any non-encrypted traffic on such ports would be a "red flag". How you define such anomalies is up to you, but it is simple to think of a few things that are outside the norm. A little good intelligence is better than volumes of questionable intelligence.
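One crude way to implement that "unencrypted traffic on an encrypted port" red flag: TLS records begin with a content-type byte (0x14-0x17) followed by a 0x03 version byte, so anything else on a TLS-designated port is worth a look. This heuristic and the port list are illustrative, not exhaustive:

```python
ENCRYPTED_PORTS = {443, 993, 995}  # ports where we expect TLS (example set)

def looks_like_tls(payload: bytes) -> bool:
    """Crude heuristic: a TLS record starts with a content-type byte
    (0x14-0x17) followed by a 0x03 major-version byte."""
    return len(payload) >= 3 and 0x14 <= payload[0] <= 0x17 and payload[1] == 0x03

def flag_plaintext(dport: int, payload: bytes) -> bool:
    """Red-flag traffic on a TLS-designated port that doesn't look like TLS."""
    return dport in ENCRYPTED_PORTS and not looks_like_tls(payload)

print(flag_plaintext(443, b"\x16\x03\x01\x00\x2f"))    # TLS handshake -> False
print(flag_plaintext(443, b"GET / HTTP/1.0\r\n\r\n"))  # plaintext on 443 -> True
```

A handful of sharp anomaly checks like this generates far less noise than trying to inspect every session, which is exactly the "little good intelligence" trade-off described above.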
At the same time, while you need to limit what you see in real time, you may want to keep more raw data logged for investigative purposes. Firewall logs are great here - not real-time monitoring, but supporting data for other events. They can be examined in near-real-time as required.
Of course all of this depends on your goals which you must first define. Getting a baseline of routine vs. anomalous traffic is different for every system. No matter what you do, if your goal is to filter raw data down to real information (whatever's outside your baseline), you need to spend extensive time and analysis on the front end first.