Once again I’m rebuilding an anti-spam gateway. This time I’m puppetizing it as I go, so I wanted to take some time today to think about the design.
MTA (flame war #1)
About four years ago I built a personal mail server and used qmail. Before that I don’t remember what I used, probably sendmail. Qmail’s nice because it’s small and well designed, but the author had some RFC fixation and support for things like TLS had to be patched in. This qmail install was on gentoo though, and the emerge auto-patched more than 20 features in as it built it. I believe the idea was that these features wouldn’t make it into the official source, so they wouldn’t be in a binary build either. Pain in the ass really.
I do have memories of using sendmail. Actually, horrible dreams of youthful innocence being torn to shreds by m4. We’ll stay away from the beast.
A couple years ago I built an anti-spam gateway using postfix and it was easy enough.
Queueing
In the past I’ve used amavisd with postfix to run the clamav and spamassassin checks. This works by having postfix take incoming SMTP messages and route them to amavisd on a locally bound port; amavisd scans them and then redelivers them to another locally bound port. One neat thing about this design is that you could have amavis running on separate boxes, with one doing spam, one doing antivirus, and just route between them all, with the final one doing the delivery to the internal mail servers.
qmail had qmail-scanner-queue, which tied all of this together in a way that looks similar to MailScanner: it picks up the messages in one folder and, when it’s done, leaves them somewhere else.
postfix, by contrast, uses content_filter to tie into antispam. The trouble with this is that postfix has already accepted a message by the time it gets this far.
When you decide something is spam, you can do a couple of things. If you’re still in the SMTP phase, you can reject it before you accept it. I prefer this. Otherwise you’ve accepted it and you can (#1) delete it, (#2) return it, (#3) tag it (modify the subject), or (#4) grey list it somewhere (hold it in quarantine). #1 is bad because it may not have been spam. #2 is bad because you have to generate an email message back to the sender address saying “We think this is spam”, and if it was spam, whoever gets that message is certainly not the person who sent it. It is still better than #1, though, because you get fewer support calls about disappearing email. #3 and #4 are annoying because someone still has to look at the mail.
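The choices above boil down to a small decision, sketched here in Python. This is only an illustration of the reasoning, not anything postfix or amavisd ships: the names Phase, Action and choose_action are made up for the example, and the default of tagging after acceptance is an assumption.

```python
from enum import Enum


class Phase(Enum):
    SMTP = "smtp"          # message is being offered, not yet accepted
    ACCEPTED = "accepted"  # message is already in our queue


class Action(Enum):
    DELIVER = "deliver"        # not spam, pass it on untouched
    REJECT = "reject"          # refuse it during the SMTP dialogue
    DELETE = "delete"          # option 1: silently drop it
    BOUNCE = "bounce"          # option 2: return it to the (likely forged) sender
    TAG = "tag"                # option 3: modify the subject and deliver
    QUARANTINE = "quarantine"  # option 4: hold it somewhere for review


def choose_action(phase, is_spam, after_accept=Action.TAG):
    """Pick a disposition for one message (hypothetical helper)."""
    if not is_spam:
        return Action.DELIVER
    if phase is Phase.SMTP:
        # Still talking to the sending host, so we can refuse the message
        # and never become responsible for it.
        return Action.REJECT
    # Too late to reject: fall back to whichever post-acceptance option
    # the site has settled on.
    return after_accept
```

Calling choose_action(Phase.SMTP, True) gives Action.REJECT, while the same spam seen after acceptance gets whatever after_accept is set to.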
In the past I’ve used RBLs in postfix to reject mail, which catches a lot of spam, and then tagged the rest in spamassassin so it filters into users’ JunkMail folders, where they only see it if they go looking for something. This is probably still acceptable. Sometimes I’ll delete mail outright if its spamassassin score is really high, because if someone sends you a legitimate email that scores that high, you probably don’t want to talk to them anyway.
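For mail that gets past the RBLs, the tag-or-delete rule can be sketched as two thresholds on the spamassassin score. The numbers are assumptions: 5.0 matches spamassassin’s default required_score, and 15.0 is a made-up stand-in for “really high”. The function name is hypothetical too.

```python
TAG_THRESHOLD = 5.0      # assumed: spamassassin's default required_score
DELETE_THRESHOLD = 15.0  # hypothetical cutoff for "really high"


def score_action(score, tag_at=TAG_THRESHOLD, delete_at=DELETE_THRESHOLD):
    """Map a spam score to 'deliver', 'tag' or 'delete' (illustrative only)."""
    if score >= delete_at:
        # Scored so high that we accept the risk of dropping it.
        return "delete"
    if score >= tag_at:
        # Probably spam: tag it so it lands in the user's JunkMail folder.
        return "tag"
    return "deliver"
```

The gap between the two thresholds is the band where a false positive is still recoverable, since tagged mail is sitting in a folder rather than gone.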
With all the botnets sourcing spam now, I’ve found that a surprising amount of spam is squashed by using ‘reject_non_fqdn_hostname’ in smtpd_recipient_restrictions.
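To show why this catches botnet traffic, here is a rough Python approximation of the kind of test involved: does the name given in HELO/EHLO look like a fully-qualified domain or an address literal? This is not postfix’s own parser, the function name is made up, and the label rules are simplified. Note that postfix 2.3 and later call the restriction reject_non_fqdn_helo_hostname.

```python
import re

# One DNS label: letters, digits and hyphens, not starting or ending with a hyphen.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def looks_fully_qualified(helo_name):
    """Return True if helo_name resembles an FQDN or address literal (sketch)."""
    # Address literals such as [192.0.2.1] are a permitted HELO form.
    if helo_name.startswith("[") and helo_name.endswith("]"):
        return len(helo_name) > 2
    # Allow a single trailing dot, as in "mail.example.com."
    name = helo_name[:-1] if helo_name.endswith(".") else helo_name
    labels = name.split(".")
    # A bare host name like "localhost" or "WINXP-PC" has only one label.
    if len(labels) < 2:
        return False
    return all(_LABEL.fullmatch(label) is not None for label in labels)
```

A compromised desktop tends to announce itself with a bare machine name, so looks_fully_qualified("WINXP-PC") is False, while a properly configured relay saying "mail.example.com" passes.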