Q. I’m used to seeing spam littered with strange spellings (“V!agra,” “Human Grgowth Hormgone”), but now I get messages with a string of nonsense words in the subject line (“bust brassiere manic”). What gives?
A. Spammers have taken up Dadaist poetry.
No, the truth is, those e-mails with oddly evocative subject lines (“convoy cantaloupe psyche grandpa,” “employer egregious household”) are the result of the spam industry’s attempts to overcome “Bayesian” spam filters.
Bayesian filters examine every word in an e-mail and rate the likelihood of each word’s being part of a spam message. According to Bayesian guru Paul Graham, the probability that the word “mortgage” is part of a spam will be high, but “lunch” will rate low on the spam-o-meter.
“When a new e-mail arrives, the filter combines all the probabilities of the words in it to arrive at an overall probability that it is a spam,” Graham said. Messages whose overall score is high enough are diverted to a separate spam folder instead of the inbox.
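For the curious, here is a rough sketch in Python of how that combining step can work. It uses the naive-Bayes-style formula Graham has described for pooling per-word scores; the word list, the numbers in it, the default score for unknown words and the function name are all made up for illustration and do not come from any real filter.

```python
from math import prod

# Hypothetical per-word spam probabilities (illustrative values only).
WORD_SPAM_PROB = {
    "mortgage": 0.95,
    "viagra": 0.99,
    "lunch": 0.05,
    "meeting": 0.10,
}

# Assumed neutral-ish score for words the filter has never seen.
UNKNOWN_WORD_PROB = 0.4


def spam_probability(text):
    """Combine per-word probabilities into one overall spam probability."""
    words = text.lower().split()
    probs = [WORD_SPAM_PROB.get(w, UNKNOWN_WORD_PROB) for w in words]
    spam_like = prod(probs)                  # evidence the message is spam
    ham_like = prod(1.0 - p for p in probs)  # evidence the message is legitimate
    return spam_like / (spam_like + ham_like)


if __name__ == "__main__":
    for subject in ("mortgage viagra", "lunch meeting", "mortgage lunch"):
        print(f"{subject!r}: {spam_probability(subject):.3f}")
```

In this toy version, a message made only of spammy words scores close to 1 and one made only of everyday words scores close to 0. A spammy word paired with an innocent one lands near the middle, which is the effect the nonsense-word messages are after.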
By filling the subject line and even the body of an e-mail with words from the verbal portion of the SAT, spammers hope to trick Bayesian filters into thinking their e-mails are harmless. Graham said these ploys usually fail, not only because of the word checks, but also because information in the message headers helps filters identify mass e-mails (and those misspellings are a giveaway as well).
Though some Internet service providers now use Bayesian filters on behalf of all their users, individuals who install their own filters may get even better results. As computer expert Brian Burton notes, “the biggest advantage of a Bayesian filter is that you can change it immediately” by telling the filter that a certain piece of mail was — or was not — a spam; the program will modify its parameters accordingly. “Our latest-generation software is now capable of better discrimination than a well-trained human,” said Richard Jowsey, director of the Death2Spam Project.
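The retraining Burton describes can be pictured with a toy example. The Python sketch below is an assumption-laden simplification, not the method of any product named here: the class name `TinyFilter`, its methods and the add-one smoothing are all hypothetical. It simply keeps count of how many spam and non-spam messages each word has appeared in, and updates those counts whenever the user labels a message.

```python
from collections import Counter


class TinyFilter:
    """Hypothetical toy filter; real products are far more elaborate."""

    def __init__(self):
        self.spam_words = Counter()  # word -> number of spam messages containing it
        self.ham_words = Counter()   # word -> number of legitimate messages containing it
        self.spam_msgs = 0
        self.ham_msgs = 0

    def train(self, text, is_spam):
        """Record the user's verdict on one message."""
        words = set(text.lower().split())  # count each word once per message
        if is_spam:
            self.spam_words.update(words)
            self.spam_msgs += 1
        else:
            self.ham_words.update(words)
            self.ham_msgs += 1

    def word_probability(self, word):
        """Estimate how likely a message containing this word is spam."""
        # Add-one smoothing (an assumption) keeps unseen words at a neutral 0.5.
        spam_rate = (self.spam_words[word] + 1) / (self.spam_msgs + 2)
        ham_rate = (self.ham_words[word] + 1) / (self.ham_msgs + 2)
        return spam_rate / (spam_rate + ham_rate)


if __name__ == "__main__":
    f = TinyFilter()
    print(f.word_probability("refinance"))  # neutral before any training
    f.train("refinance your mortgage now", is_spam=True)
    f.train("lunch on friday", is_spam=False)
    print(f.word_probability("refinance"), f.word_probability("lunch"))
```

Each correction from the user shifts the word scores right away, which is the advantage a personal filter has over one tuned for everybody at once.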
By the way, Rev. Thomas Bayes, for whom the filters are named, was an 18th Century English mathematician whose best-known work, “An Essay Towards Solving a Problem in the Doctrine of Chances,” was published after his death, so he didn’t get any mail about it whatsoever.




