More on dealing with image spam

Published: 2007-02-05. Last Updated: 2007-02-06 20:23:19 UTC
by Jim Clausing (Version: 3)

During my last shift on 15 Jan, I did a story on dealing with the image spam that I was getting on the little mail server I run at home. I got quite a few excellent responses to that story, so I wanted to summarize those and share them with our readers. My thanks to Steve, Dave, Tim, Alexander, Joanne, and John (I hope I didn't miss anyone).

dspam

Several people suggested looking at dspam. Some said they had given up on SpamAssassin and gone strictly with dspam. I've added dspam to the mix, and mostly get pretty good results. The biggest problem I'm seeing with dspam is that it still doesn't detect some of the image spam that takes its text from legit sources on the internet. FuzzyOCR and some of the blocklists seem to catch most of these, but even when I feed all the false negatives back through dspam for training, some still get through. Having said that, I like dspam and will definitely keep it in the mix. I've had a suggestion (that I haven't tried yet) that I should run dspam outside of amavisd-new rather than from within it, which is how I am running it now.

clamav

Steve suggested I take a look at the clamav phish and scam rules from sanesecurity.com, which can be found here. I haven't tried them out yet. If you do, let me know what you think.

greylisting

I didn't mention it, but I do, in fact, do greylisting using gld (readers also suggested postgrey and sqlgrey) in my postfix setup. Unfortunately, because most of the addresses that receive mail on my server are forwarded from elsewhere, and those other sites have already accepted the e-mail, greylisting is only moderately useful in my personal situation, but I recommend trying it out. I also should note that because I sometimes *want* to get spam and viruses at some of these e-mail addresses (including my isc.sans.org address), I turn off spam and virus filtering at these forwarding services. If your job (or hobby) doesn't include playing with malware, leaving that filtering turned on might save you from some of the problems that I've been seeing.
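
For readers who haven't met greylisting before, here is a minimal Python sketch of the general idea: temporarily refuse the first delivery attempt from an unknown (client IP, sender, recipient) triplet and accept a retry that arrives after a delay. This is not how gld, postgrey, or sqlgrey are implemented; the class name, the 300-second delay, and the in-memory dict are assumptions made purely for illustration.

```python
import time


class Greylist:
    """Toy in-memory greylist keyed on (client IP, sender, recipient).

    Hypothetical illustration only; real greylisting daemons persist
    their state and expire old entries.
    """

    def __init__(self, delay_seconds=300):
        # Assumed minimum wait before a retry is accepted.
        self.delay_seconds = delay_seconds
        self.first_seen = {}

    def check(self, client_ip, sender, recipient, now=None):
        """Return "defer" (temp-fail the message) or "accept"."""
        now = time.time() if now is None else now
        key = (client_ip, sender.lower(), recipient.lower())
        # Record the first time this triplet was seen, if it is new.
        seen = self.first_seen.setdefault(key, now)
        if now - seen >= self.delay_seconds:
            return "accept"
        # The MTA would answer with a 4xx temporary failure here.
        return "defer"
```

A well-behaved MTA queues the message and retries later, at which point `check` returns "accept"; a sender that never retries never gets through. When a forwarding service has already accepted the mail, the triplet always comes from that service, which is why this helps less in a forwarding setup.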

DNS blocklists

Several folks suggested blocklists such as the Spamhaus sbl+xbl list. I actually have those configured in postfix and I have the DNSBL SpamAssassin rules (25_uribl.cf) enabled. As with greylisting, the postfix use of the blocklists doesn't help if another MTA has already accepted the mail and is forwarding it to me, but the SpamAssassin usage then increases the score if it detects those source IPs in the Received: headers.
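
As a rough illustration of what a DNSBL check involves, the Python sketch below pulls bracketed IPv4 addresses out of Received: headers and builds the DNS name that would be queried for each (reversed octets followed by the list's zone). It is a sketch of the general mechanism, not of SpamAssassin's or postfix's code; the function names and the simple regular expression are assumptions, and the zone is whatever list you have chosen to use.

```python
import ipaddress
import re
import socket
from email import message_from_string

# Assumed pattern: relays usually log the peer address as [a.b.c.d].
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_message):
    """Return IPv4 addresses found in Received: headers, top to bottom."""
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        ips.extend(IP_IN_BRACKETS.findall(header))
    return ips


def dnsbl_query_name(ip, zone):
    """Build the DNSBL lookup name: reversed octets plus the list zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a bad address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone):
    """True if the name resolves; any lookup failure counts as not listed."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        return False
```

Checking every address in the Received: chain, rather than only the connecting client, is what lets a filter still react when the mail arrives by way of a forwarder.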

block dynamic IPs

This debate tends to take on the tone of a religious argument and I'm not going to rehash all of that here. Yes, I'm aware that most spambots seem to be infected home machines and that if I rejected all mail from them and/or if ISPs blocked outbound e-mail from them, that would greatly reduce the problem. It would also punish people like me who have a domain website and e-mail (very low volume) hosted on a home system connected to the internet via cable modem. Having said that, some of the DNSBLs discussed above do, in fact, block e-mail from dynamic IP ranges. Also, as noted above, that isn't quite as useful in my particular case as it might be because of the forwarding.

block all gif images

One suggestion was to block all gif images (either block e-mail containing them or strip them from the e-mail). This is another suggestion I haven't tried and probably won't in the near future. There can certainly be some backlash and/or collateral damage with this one, but since I'm reading my e-mail as plain text, I wouldn't really miss most of those images. One reader suggested that there was some fallout because of the company logo gifs getting dropped, so this person adjusted the rules to block gifs over a certain size. Of course, if you drop gifs, what about jpegs? other image types? mis-identified image types?
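
The size-threshold idea can be sketched in a few lines of Python using the standard email package. This is only an illustration: the function name and the 20,000-byte default are assumptions, not values any reader reported. The sketch also looks at the first bytes of each part, as one possible answer to the mis-identified image type question.

```python
from email.message import EmailMessage

# Every GIF file starts with one of these two signatures.
GIF_MAGIC = (b"GIF87a", b"GIF89a")


def oversized_gifs(msg: EmailMessage, max_bytes=20_000):
    """Return (filename, size) for each GIF part larger than max_bytes.

    A part counts as a GIF if it is declared image/gif or if its decoded
    payload starts with a GIF signature, whatever its declared type.
    """
    hits = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers have no payload of their own
        payload = part.get_payload(decode=True) or b""
        is_gif = (
            part.get_content_type() == "image/gif"
            or payload.startswith(GIF_MAGIC)
        )
        if is_gif and len(payload) > max_bytes:
            hits.append((part.get_filename(), len(payload)))
    return hits
```

A filter could then reject the message, or strip only the listed parts, while small logo-sized GIFs pass untouched. The same loop would extend to JPEG or PNG signatures if the spammers move on.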

playing with SA scores for mailing lists

Finally, another reader commented that they were able to cut out some of the mailing list spam by some judicious playing with the scores assigned by SpamAssassin. This amounts to giving mail to the mailing list an initial negative score (assume that most mail to the list is not spam) and then adding points back if the Bayes tests show it is likely to be spam (e.g., add back another few points if it hits on BAYES_95 or BAYES_99, etc.). As a result of discussions with this reader I joined the spamassassin-users list and have had to tweak some of my own scoring to deal with (half-)false positives on that list. Imagine, a mailing list that deals with a tool for assassinating spam might actually include samples of spam. Doh!
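
The arithmetic behind that trick is easy to lose in prose, so here is a small Python sketch of it. It models only the score adjustment, not SpamAssassin's rule syntax, and every number in it (the -3.0 list bonus and the add-back values) is a hypothetical chosen for illustration rather than anything the reader reported.

```python
# Hypothetical values: a negative starting bonus for list mail and the
# extra points added back when a high-confidence Bayes rule fires.
LIST_BONUS = -3.0
BAYES_ADDBACK = {"BAYES_95": 3.0, "BAYES_99": 4.0}


def adjusted_score(base_score, hits, is_list_mail):
    """Return the score after the mailing-list adjustment.

    base_score   -- the score the message earned from the normal rules
    hits         -- names of the rules that fired on the message
    is_list_mail -- True if the message arrived via the mailing list
    """
    if not is_list_mail:
        return base_score
    score = base_score + LIST_BONUS
    # Cancel (or more than cancel) the bonus when Bayes is confident.
    score += sum(BAYES_ADDBACK.get(rule, 0.0) for rule in hits)
    return score
```

With these made-up numbers, list mail that Bayes is unsure about drops 3 points and most likely falls under the spam threshold, while list mail that hits BAYES_99 ends up a point higher than it would have without the adjustment.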

Update: (2007-02-06 14:30 UTC) Yves e-mailed me and pointed out that I had forgotten to mention "nolisting" in addition to greylisting. Both greylisting and nolisting take advantage of the fact that most spambots make only one attempt to deliver the e-mail and then move on. I had read about this in the last two weeks and find the idea interesting, but have not implemented it (and probably will not) in my own setup for several reasons that I won't go into here. The page that I saw that describes this is here.

Update 2: (2007-02-06 19:55 UTC) W.r.t. DNSBLs, Nathan wrote in to recommend the new spamhaus zen BL (see http://www.spamhaus.org/zen/). He also recommended looking at XOCR and XWall for those who need to do the spam filtering on a Windows platform. Another reader challenged the assertion that most spambots only make a single attempt to deliver the e-mail. It was only a matter of time before the bot authors started making multiple attempts in order to thwart greylisting/nolisting. I still haven't seen much of this behavior personally, but I don't doubt that it will be happening. Tony mentioned that in his SOHO setup, he simply blocks all messages with 'Content-Type: multipart/related;' headers and hasn't seen any issues with this (YMMV). John wrote in to recommend ORF (Open Relay Filter), which appears to be a commercial product that he says is "dirt cheap". I haven't taken a look at this one yet, but info can be found here. And finally, Frank said that he blocks all inbound GIFs with qmail, informing the sender to resend without GIFs or to zip them up, and (in answer to my question above) hasn't seen anything but GIFs used for the image spam (because of their size and because that is what the spam toolkits create). He also puts all e-mail with certain character sets in the subject directly into the spam folder. I also do this (especially for Cyrillic, Kanji, etc.) since I don't read the languages. Again, a huge thank you to all the readers that have helped out with this story.
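
Two of the checks in this update, blocking multipart/related and filing mail by the character set of the subject, can be sketched with Python's standard email package. This is an illustration only, not Tony's or Frank's actual configuration; the function names, the verdict strings, and the particular charset list are all assumptions.

```python
from email import message_from_string
from email.header import decode_header

# Assumed examples of charsets a reader might choose to file as spam.
BLOCKED_CHARSETS = {"koi8-r", "windows-1251", "iso-2022-jp", "shift_jis"}


def subject_charsets(msg):
    """Return the charsets named in the Subject's RFC 2047 encoded words."""
    raw = msg.get("Subject", "")
    return {charset.lower() for _, charset in decode_header(raw) if charset}


def triage(msg):
    """Return "reject", "spam-folder", or "deliver" for a parsed message."""
    # Look at nested parts too, not just the top-level Content-Type.
    if any(p.get_content_type() == "multipart/related" for p in msg.walk()):
        return "reject"
    if subject_charsets(msg) & BLOCKED_CHARSETS:
        return "spam-folder"
    return "deliver"
```

A message would be parsed with `message_from_string(raw_text)` and handed to `triage`. Legitimate HTML mail with inline images also uses multipart/related, so the first check carries the same collateral-damage risk as dropping all GIFs.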

Jim Clausing, jclausing ++at++ isc dot sans dot org
