Your APT Anti-Hype
In the interest of helping you cope with the "APT" hype, I thought I'd offer a few observations and ideas about things you can do that might actually help. After all, it's too easy to point and shout "hype" - the truth is that there is a problem, and system and network administrators who are concerned with security do have to worry about long-term embedded penetrations in their network.
There are two primary approaches to Intrusion Detection and they both work. But they work against different threats, for different reasons. One is the 'classical' IDS approach: know what an attack looks like, and look for the attack. That's what most signature-based IDSes do, and they're good at it and therefore they are useful. The second is the 'analytical' approach (what Richard Bejtlich, in his excellent books, calls "network security monitoring"): know what your network and systems usually do, and begin an investigation if you see them suddenly start doing something new. As with everything, there are trade-offs. Some people would say that the first approach has a problem of "too many false positives" although, seriously, if your network is carrying so much apparently hostile traffic that your IDS is constantly ringing off the hook, I think you've already got a serious problem. The second approach has the problem that "start an investigation" may be outside the purview, skill set, or energy level of many system/network managers - especially now that the typical system/network admin is chief cook, busboy, and bottle-washer all rolled up in one.
There is a third approach, which I've used effectively for the last 20 years, and it's based on a design philosophy of 'burglar alarms'. The idea of a burglar alarm is that, when you're not home and the alarm is engaged, the presumption is that people aren't walking around inside your house. Of course, you might have a house-resident dog who'd set off motion detectors - in which case your policy needs to be adapted to your operational reality. Since my dogs live outside (and, weighing 150lbs each, they act as my first line of defense) I can have a policy of "no movement in the house" but one of you might find it more effective to put magnetic reed-switch alarms under your stereo gear, on some perimeter and interior doors, and in the lid of your wife's jewelry-box. In other words, your detection is based on identifying a set of events that should not happen, and looking for signs of their occurrence.
When you're looking for APT, what are some of the things that shouldn't happen? You shouldn't see command/control traffic for a remote control program. But what would it look like? Your conventional IDS might catch the 'usual' botnet controls, or you might be able to identify patterns in your firewall's egress logs using network security monitoring techniques. The burglar alarm model might be to have something specifically looking for machines opening connections to the outside at times when they never have before. For example, if you are pretty sure that your employees are not at their desks after 10:00pm, you ought to be able to look for events that are not auto-updates taking place between 11:00pm and 5:00am from desktop machines that are usually doing ordinary desktop stuff during waking hours. Or, you might hypothesize that someone probing to identify servers in your server cluster might try accessing all the IP addresses in a particular subnet. I don't know about you, but during the course of a given week, I never try all the IP addresses on my local network. In fact, a couple of years ago I did a test and discovered that the error rate, counting every time one of my systems tried to connect to a non-existent IP address, was under 5 events per month. Your mileage may vary but if you were watching for ARP failures and noticed your ARP failure rate had just quadrupled and all the failures were generated by a single desktop, what might you conclude?
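To make the after-hours idea concrete, here is a minimal sketch in Python. It assumes you have already parsed your firewall's egress log into (timestamp, source, destination) tuples; the parsing, the host names, the quiet-hours window, and the UPDATE_HOSTS allow-list are all hypothetical placeholders you would replace with your own site's policy.

```python
from datetime import datetime, time

# Hypothetical policy values: adjust to your own operational reality.
QUIET_START = time(23, 0)   # 11:00pm
QUIET_END = time(5, 0)      # 5:00am
UPDATE_HOSTS = {"updates.example.com"}  # assumed allow-list of auto-update servers


def in_quiet_hours(ts: datetime) -> bool:
    """True if ts falls in the overnight window, which wraps past midnight."""
    t = ts.time()
    return t >= QUIET_START or t < QUIET_END


def after_hours_alarms(events, desktops):
    """Yield (timestamp, source, destination) for suspicious egress events.

    events:   iterable of (datetime, source_host, destination_host) tuples,
              e.g. parsed from firewall egress logs (parsing not shown).
    desktops: set of source hosts that are ordinary desktop machines.
    """
    for ts, src, dst in events:
        if src in desktops and in_quiet_hours(ts) and dst not in UPDATE_HOSTS:
            yield ts, src, dst
```

The point is not the code, which is trivial, but the policy behind it: the alarm only makes sense if "desktops are silent overnight except for updates" is actually true at your site.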
My friend Ron Dilley has a passive DNS collector that he wrote, which he used to accidentally detect Conficker a long time before anyone else knew about it. Whenever Conficker tried to connect to one of its randomly generated command/control domain names it triggered a DNS lookup for a nonexistent address; the result was a sudden and unexplained ramp-up in DNS failures - Ron's detector was configured to flag when there were more than 2X the 'usual' amount during a given time-period. If you think about that for a second, you'll realize that approach might detect all kinds of naughty things. Some administrators might just choose to re-image the affected machines, others might take the time to investigate. What's important is that one site was able to easily crush a new attack tool before the community of 'good guys' even knew about it.
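I don't have Ron's code to show you, so what follows is not his tool; it is only a rough Python sketch of the "more than 2X the usual amount" rule. It assumes something else is already counting DNS failures per time period, and the class name, the rolling-average baseline, and the min_baseline floor are my own illustrative choices.

```python
from collections import deque


class FailureRateAlarm:
    """Flag a time period whose failure count exceeds a multiple of the usual."""

    def __init__(self, factor=2.0, history=24, min_baseline=1.0):
        self.factor = factor              # 2.0 means "more than 2X the usual"
        self.min_baseline = min_baseline  # floor, so a very quiet network
                                          # doesn't alarm on one stray failure
        self.counts = deque(maxlen=history)  # recent non-alarming periods

    def observe(self, failures: int) -> bool:
        """Record one period's failure count; return True if it should alarm."""
        if self.counts:
            usual = max(sum(self.counts) / len(self.counts), self.min_baseline)
            alarm = failures > self.factor * usual
        else:
            alarm = False  # no history yet, nothing to compare against
        if not alarm:
            self.counts.append(failures)  # keep spikes out of the baseline
        return alarm
```

Feed it one count per period (say, DNS failures per hour) and investigate whenever observe() returns True. Leaving alarming periods out of the baseline is a design choice: otherwise a long-running infection would gradually become "usual".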
I did some research back in 2002, attempting to use outgoing firewall logs to model which desktops had a human behind the keyboard doing typical human stuff. In the course of doing that, I discovered that one of my highly-paid software engineers was basically playing Diablo II 40 hours a week. But if you look at a desktop machine and see it do one 'browse activity' after another, without more than a 5-minute break, you can probably assume that someone is behind the keyboard. Now, suppose that the human-generated network traffic dies off - except for a single SSL event every couple of hours.
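Here is a rough Python sketch of that "is a human driving?" test. It is not the code from my 2002 research; it simply groups one desktop's outgoing event timestamps into bursts using the 5-minute rule, then picks out the events that don't belong to any human-looking burst. The MIN_BURST threshold and the function names are assumptions made up for this illustration.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=5)  # from the text: no more than a 5-minute break
MIN_BURST = 3                   # assumption: this many events in a row looks human


def split_bursts(timestamps, max_gap=MAX_GAP):
    """Group timestamps into bursts whose internal gaps are <= max_gap."""
    bursts = []
    for ts in sorted(timestamps):
        if bursts and ts - bursts[-1][-1] <= max_gap:
            bursts[-1].append(ts)   # close enough: same burst of activity
        else:
            bursts.append([ts])     # long silence: start a new burst
    return bursts


def lonely_events(timestamps, min_burst=MIN_BURST):
    """Return timestamps from bursts too short to look like a person browsing."""
    return [ts
            for burst in split_bursts(timestamps)
            if len(burst) < min_burst
            for ts in burst]
```

A desktop whose only traffic for hours is a string of "lonely" events, evenly spaced, is worth a closer look.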
Simple burglar alarms are insanely effective. If you know that (by design) your web server never does a "select * from some table" then why not have a log analysis subroutine watching the server's log for exactly that? (If you let your DBAs turn off SQL logging 'for performance reasons' you deserve a spanking with a clue-by-4). If you have a small server lying around that exports a file system with a bunch of PDFs that nobody should ever try to access, you've got a simple honeypot/burglar alarm. Burglar alarms go a ways toward dealing with the "false positive" problem because they should only go off when something is really, truly wrong. You still don't know what's wrong, but you're looking for second-order effects of something having gone wrong. It's why nuclear power-plants don't just have 'has this particular pipe blown?' detectors - they have general 'radiation leak' detectors. Looking for specific failures in progress is your last line of defense against innovative attacks.
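The "select *" alarm really can be just a few lines. The Python sketch below assumes your SQL log is plain text with one query per line; the log format and the exact pattern are hypothetical, so tune the regular expression to whatever your application should never do.

```python
import re

# Assumed pattern: any log line containing "select * from", in any letter case.
FORBIDDEN = re.compile(r"\bselect\s+\*\s+from\b", re.IGNORECASE)


def sql_alarms(lines):
    """Yield (line_number, line) for each log line containing a forbidden query."""
    for lineno, line in enumerate(lines, start=1):
        if FORBIDDEN.search(line):
            yield lineno, line.rstrip("\n")
```

Because an open file is an iterable of lines, you can pass one straight in, e.g. sql_alarms(open(path)) with path pointing at your own query log.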
The bad guys have generally got the initiative on us because they're being innovative and are coming up with new attacks all the time - and we keep meeting them with the same old defenses. The way to catch the APTs is to meet them with unexpected defenses that they've never heard of before. I'll write more about this later.
'Til then, good hunting!