Free Daemon Consulting, LLC

The first real content in this area is everybody's favorite: spam!

Seriously, though, I do not profess to be an expert on the topic, but I do have several pieces of technology in place (read: software solutions) that may be of interest to some of you. In my anti-spam arsenal, I have:
  • sendmail: check for a message-id header, etc.
  • procmail: is it to me? pipe to spam filters, or mail folders
  • bmf: a learning mail filter
  • relaydb: which relay is bad?
  • pf: redirect known spammers to spamd
  • spamd: sandbox/tarpit, yet leave a reason why in the reject message
I also have some special email addresses that have a direct line to being classified as spam:

So in general, my solution at my own apartment flows like this:
  • Incoming email is either accepted by sendmail (the sender is not a known spammer yet) or redirected via pf to spamd. Sandbox away!
  • mail to me is filtered through procmail, which uses bmf to determine whether it seems to be spam or not
  • spam gets shoved into a mail folder named bulk.spam-YYYY-MM
  • if the recipient is one of the special addresses above, I shove the message into a special sendmail queue that re-classifies non-spam as spam (training the filter), and it also gets stored in bulk.spam-YYYY-MM
  • if the mail is 'to' or 'cc' me, it gets into my inbox
  • otherwise, procmail dumps the mail into one of many mailing list folders
  • as I read my mailing lists or inbox and find that spam has slipped through the above mechanisms, I have a few macros defined in my .muttrc that are of use:

    macro index S "|/usr/sbin/sendmail -L sm-spamd-queuer -C/u/todd/etc/mail/sendmail.cf todd@spam.fries.net\ns=spam-`date +%Y-%m`.bz2"

    macro index X ";|formail -s /usr/sbin/sendmail -L sm-spamd-queuer -C/u/todd/etc/mail/sendmail.cf todd@spam.fries.net\n;s=spam-`date +%Y-%m`.bz2"

    macro index A "|/u/todd/bin/pipetogoodprogs\ns"

    macro index V "|bmf -t\n"

    Without going into too many details, the above .muttrc macros allow me to press 'S' on a specific message that is spam and add it to my 're-classify' queue. If I tag a bunch of messages, 'X' does the same thing, only on all of the tagged messages. If I go into my 'bulk.spam-YYYY-MM' file and notice something that should not be flagged as spam, I can tap 'A'. If I am curious whether bmf will classify a particular message as spam or not, I can tap 'V'.

  • The scripts above (pipetogoodprogs and the sendmail queue) reclassify email as good or bad, respectively. This is necessary because if a message is 'learned' to be good when it is in fact bad, the result is spam in my inbox. Or vice versa. The programs I pipe to are bmf and relaydb.
  • bmf is a Bayesian mail filter. Suffice it to say, it counts the frequency of words in good and bad email messages, and uses these frequencies to determine which messages are good or bad. When classifying a message, it also learns by adding the counts to the good or bad tallies. This feedback mechanism makes it much more flexible than a static set of rules. At one point in time, when I initially tried it, spamassassin only had a static set of rules. I hear it uses Bayesian filtering as well now. C'est la vie!
  • relaydb (from the man page) is a mail header analyzer that builds a database of IP addresses either known as legitimate senders or spammers. I invoke relaydb when it is known that a particular mail is spam, or not. I also invoke relaydb when I am re-classifying email. This is how it is meant to be used.
  • spamd is fed a blacklist from spews and relaydb. The whitelist in relaydb is used to remove any addresses that are 'known to be good mail relays'. I have my own personal whitelist, hosts I always want to receive mail from, and my own personal blacklist, hosts that have annoyed me and I never wish to receive mail from them again.
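
To make the word-counting idea concrete, here is a minimal Python sketch of a frequency-based Bayesian filter with a reclassify step. It is not bmf's actual algorithm or code: the class name WordFrequencyFilter, the tokenizer, and the add-one smoothing are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


class WordFrequencyFilter:
    """Toy Bayesian filter (hypothetical, not bmf): word tallies per class."""

    def __init__(self):
        self.counts = {"good": Counter(), "bad": Counter()}
        self.messages = {"good": 0, "bad": 0}

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase runs of letters, digits and a few symbols.
        return re.findall(r"[a-z0-9$'-]+", text.lower())

    def learn(self, text, label):
        """Add the message's word counts to the 'good' or 'bad' tallies."""
        self.counts[label].update(self.tokenize(text))
        self.messages[label] += 1

    def unlearn(self, text, label):
        """Undo an earlier learn() of the same message under the same label."""
        self.counts[label].subtract(self.tokenize(text))
        self.messages[label] -= 1

    def reclassify(self, text, old_label, new_label):
        """Move a wrongly learned message from one tally to the other."""
        self.unlearn(text, old_label)
        self.learn(text, new_label)

    def spam_score(self, text):
        """Log-odds of spam versus good mail; positive means more spam-like."""
        vocab = len(set(self.counts["good"]) | set(self.counts["bad"])) or 1
        totals = {k: sum(c.values()) for k, c in self.counts.items()}
        score = math.log((self.messages["bad"] + 1) / (self.messages["good"] + 1))
        for word in self.tokenize(text):
            # Add-one smoothing keeps unseen words from zeroing a probability.
            p_bad = (self.counts["bad"][word] + 1) / (totals["bad"] + vocab)
            p_good = (self.counts["good"][word] + 1) / (totals["good"] + vocab)
            score += math.log(p_bad / p_good)
        return score

    def is_spam(self, text):
        return self.spam_score(text) > 0

    def classify(self, text):
        """Classify, then feed the verdict back into the tallies."""
        label = "bad" if self.is_spam(text) else "good"
        self.learn(text, label)
        return label
```

The classify method shows why the reclassify step matters: every verdict is fed back into the tallies, so a wrong verdict keeps reinforcing itself until the message is moved to the correct tally.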

So there you have it. From pf to sendmail or spamd, to procmail, to bmf and classification, to relaydb, and then feedback to pf/spamd. This is my setup.
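
The folder-routing decisions in that flow can be sketched as a small Python function. This is only my reading of the steps described above, not the actual procmail recipes: the function names, the spam-trap address, and the mailing list folder name are hypothetical placeholders.

```python
from datetime import date

MY_ADDRESSES = {"todd@fries.net"}               # taken from the page signature
SPAMTRAP_ADDRESSES = {"trap@example.invalid"}   # hypothetical placeholder


def spam_folder(today):
    """Monthly spam folder name, e.g. bulk.spam-2004-03."""
    return f"bulk.spam-{today:%Y-%m}"


def route(recipients, looks_like_spam, list_folder, today):
    """Return (action, folder) for one message.

    recipients      -- addresses from the To and Cc headers
    looks_like_spam -- verdict from the learning filter
    list_folder     -- folder to use when the mail is list traffic
    """
    rcpts = {r.lower() for r in recipients}
    if looks_like_spam:
        return ("keep", spam_folder(today))
    if rcpts & SPAMTRAP_ADDRESSES:
        # The filter thought this was good mail, so retrain it as spam.
        return ("retrain-as-spam", spam_folder(today))
    if rcpts & MY_ADDRESSES:
        return ("keep", "inbox")
    return ("keep", list_folder)
```

For example, route(["todd@fries.net"], False, "lists.misc", date(2004, 3, 7)) returns ("keep", "inbox"), while the same call with looks_like_spam=True returns ("keep", "bulk.spam-2004-03").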

Any questions?

Todd T. Fries todd@fries.net
