SpamAssassin flagged something as spam that is not spam. How do I tell it so?

There is both specific and general advice that may be useful in this case.

Specific

The underlying problem here is that Garuda Airlines, bless their little cotton socks, are sending confirmation emails that bear many of the hallmarks of spam. The subject line is VERY SHOUTY; they send HTML-only emails which contain quite a lot of images and very little text; the envelope-sender (tdsfndprd@amadeus.com) is pretty clearly a machine-constructed nonce; and the email provider for their (outsourced) confirmation system (amadeus.com) has a useless SPF record. (Despite all our advice to the contrary, some people mistakenly think there is value in a record that lists some of their sending systems and ends ~all, i.e. a softfail for everything else.)

There is not much you can do about most of this. If you want to be sure of these messages getting through, add a line to your ~/.spamassassin/user_prefs that says whitelist_from *@amadeus.com. Going further and tampering with the weights of the rules that were triggered is probably a bad idea. The SpamAssassin (SA) ruleset is created by filtering a huge weight of spam, and working out what characteristics apply to most of it; you are likely to open your INBOX to a lot more than just Garuda confirmation emails by turning off those rules.
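
If you would rather script the whitelist change than edit the file by hand, the following is a minimal sketch. The helper name add_whitelist_entry is made up for illustration and is not part of SpamAssassin; the sketch only assumes that user_prefs takes one directive per line, as the whitelist_from line above does.

```shell
# Sketch: add a whitelist_from entry to a SpamAssassin user_prefs file,
# skipping the write if the exact line is already present.
# add_whitelist_entry is a hypothetical helper, not an SA command.
add_whitelist_entry() {
    prefs_file=$1
    pattern=$2
    line="whitelist_from $pattern"

    # Create the directory and file if they do not exist yet.
    mkdir -p "$(dirname "$prefs_file")"
    touch "$prefs_file"

    # -x matches the whole line, -F treats the text literally (no regex),
    # so the '*' in the address pattern is not interpreted by grep.
    grep -qxF -- "$line" "$prefs_file" || printf '%s\n' "$line" >> "$prefs_file"
}

# Example call (left commented out so that sourcing this changes nothing):
# add_whitelist_entry "$HOME/.spamassassin/user_prefs" '*@amadeus.com'
```

Running it twice with the same address leaves a single line in the file, so it is safe to re-run.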

General

This is exactly the sort of situation the Bayesian engine handles well. It is designed to filter out email that doesn’t trigger the other rules but contains stuff you don’t want to read, whilst helping through email that does trigger those rules but contains stuff you do want to read.

IIRC, the engine won’t do anything if you’re not training it. The easiest way to train it is to maintain two folders, called (say) spam and ham. Into spam you put copies of email that made it into your INBOX but you didn’t want; into ham you put copies of emails that fell foul of SA but you did want, such as this confirmation email.

Then nightly (or so) you have a cron job that says

sa-learn --spam --mbox mail/spam
sa-learn --ham  --mbox mail/ham

modifying the paths accordingly. Over time, this will teach the engine what you do, and don’t, like to read. Since a high Bayesian score can add 4.0 points to an email’s SA score, while a low one can subtract 1.9, a well-trained engine can really help SA distinguish what you want to read from what you don’t, but you have to put the effort in to teach it. For example, against SA’s default threshold of 5.0, a wanted email that scored 6.0 on the other rules would drop to 4.1 and be delivered.
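
The nightly job described above could be wrapped in a small script, sketched here. The names train_bayes and SA_LEARN are invented for this example, and it assumes the spam and ham folders are mbox files under a single mail directory, as in the commands above. It skips a folder that is missing or empty rather than passing it to sa-learn.

```shell
# Sketch of a nightly training wrapper around sa-learn.
# train_bayes and SA_LEARN are hypothetical names used for illustration;
# spam and ham are the example folder names from the text.
SA_LEARN="${SA_LEARN:-sa-learn}"

train_bayes() {
    mail_dir=$1

    # -s is true only if the file exists and is not empty.
    if [ -s "$mail_dir/spam" ]; then
        "$SA_LEARN" --spam --mbox "$mail_dir/spam"
    fi
    if [ -s "$mail_dir/ham" ]; then
        "$SA_LEARN" --ham --mbox "$mail_dir/ham"
    fi
}

# Example call (left commented out so that sourcing this runs nothing):
# train_bayes "$HOME/mail"
```

To use it from cron, save it as a script with the last line uncommented and schedule that script with a crontab entry whose time fields are something like 30 3 * * * (03:30 every night).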
