POPFile

I've been using POPFile as my new spam filter. It uses Bayes' theorem to sort email into buckets, one per message type. I set up two buckets, mail and spam, and trained POPFile on 2000 legitimate and 2000 spam messages.
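To make the bucket idea concrete, here is a minimal sketch of a two-bucket naive Bayes classifier in Python. It is not POPFile's code (POPFile is written in Perl and its tokenizing and scoring are more involved); the class name BucketClassifier, the tokenizer, and the add-one smoothing are all my own assumptions for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class BucketClassifier:
    """Toy naive Bayes classifier (hypothetical, not POPFile's implementation)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # bucket -> word -> occurrences
        self.message_counts = Counter()          # bucket -> training messages seen

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase runs of letters, digits and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, bucket, text):
        self.word_counts[bucket].update(self.tokenize(text))
        self.message_counts[bucket] += 1

    def classify(self, text):
        # Vocabulary across all buckets, used for add-one (Laplace) smoothing.
        vocabulary = set()
        for counts in self.word_counts.values():
            vocabulary.update(counts)
        total_messages = sum(self.message_counts.values())

        scores = {}
        for bucket, counts in self.word_counts.items():
            total_words = sum(counts.values())
            # Log prior: share of training messages that landed in this bucket.
            score = math.log(self.message_counts[bucket] / total_messages)
            # Log likelihood of each word given the bucket, smoothed.
            for word in self.tokenize(text):
                score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
            scores[bucket] = score
        return max(scores, key=scores.get)


clf = BucketClassifier()
clf.train("mail", "meeting agenda for the project review on monday")
clf.train("spam", "win free money now click here for your prize")
print(clf.classify("free prize click now"))
print(clf.classify("agenda for monday meeting"))
```

With one training message per bucket this is only a toy. The point is that each bucket keeps its own word counts, and a new message goes to whichever bucket makes its words most probable.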

POPFile is a Perl script that works as a POP3 proxy, sitting between the mail client and the mail server. For each new message it estimates, from word statistics gathered during training, which bucket the message most likely belongs to, then tags the message either by altering the subject line or by adding an X-Text-Classification header. I use the header method since Mozilla (my mail client) can filter on mail headers.
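The header-based filtering is simple enough to sketch outside a mail client. The Python below reads the X-Text-Classification header with the standard email module and picks a folder. The folder names, the helper functions, and the sample message are made up for illustration; Mozilla does this through its own filter rules, not through code like this.

```python
from email import message_from_string


def bucket_of(raw_message, default="unclassified"):
    """Return the bucket named in X-Text-Classification, or a default if absent."""
    msg = message_from_string(raw_message)
    return msg.get("X-Text-Classification", default).strip().lower()


def target_folder(raw_message):
    # Hypothetical folder names; a real client would use its own.
    return "Junk" if bucket_of(raw_message) == "spam" else "Inbox"


# Made-up sample message carrying the header added by the proxy.
sample = (
    "From: someone@example.com\n"
    "Subject: You have won\n"
    "X-Text-Classification: spam\n"
    "\n"
    "Click here to claim your prize.\n"
)

print(bucket_of(sample), "->", target_folder(sample))
```

A message without the header falls through to the inbox here, which seems like the safer default for mail that never passed through the proxy.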

So far, out of about 250 emails, I've had four false positives and one false negative, roughly a 2% error rate. I'd rather have it the other way around, but I collect each misclassified message and feed it back into the proper training set.