EmailTalk.org Blog

Bayesian Spam Filters

24 December 2008  |  Filed under: Spam

The emails that spammers send have been evolving with incredible craftiness over the past few years. They seem to slip right past the simple-minded spam filters that we keep installing. We read spam, curse spam, and have come to hate spam, but what if there was a spam filter that evolved along with the endlessly developing spam, staying one step ahead every time?

What is the origin of Bayesian Spam Filters?

Thomas Bayes, who lived from 1702 to 1761, was an English Presbyterian minister and mathematician. After he died, the Royal Society published one of his most important findings in 1763 in its Philosophical Transactions. Put simply, his finding works like this: suppose a deadly disease called Cluvitis existed (it doesn’t) and its symptoms were fever, runny nose, toothache, and more. Just because you get a runny nose doesn’t mean you have Cluvitis. However, if you then pick up more of the symptoms, such as a fever, the chances that you have Cluvitis go up considerably.

Bayes’ law gives the exact probability that you have Cluvitis given the symptoms you show. Now take the same concept and fast forward a couple of hundred years (to 2002, to be precise) and you’ll find Paul Graham applying the same strategy to spam filters, and that is how the Bayesian spam filter as we know it came about.
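To put some numbers on the Cluvitis example before we move on to spam, here is a small Python sketch of Bayes’ law. Every number in it is invented for illustration (Cluvitis doesn’t exist, after all), the function name posterior is our own, and it assumes the two symptoms show up independently of each other.

```python
def posterior(prior, p_symptom_if_sick, p_symptom_if_healthy):
    """Bayes' law: probability of being sick after seeing one symptom."""
    # Overall probability of seeing the symptom, sick or not.
    p_symptom = (p_symptom_if_sick * prior
                 + p_symptom_if_healthy * (1 - prior))
    return p_symptom_if_sick * prior / p_symptom


# Invented numbers: 1% of people have Cluvitis.
prior = 0.01
# A runny nose is common even in healthy people (20%), so it says little.
after_runny_nose = posterior(prior, 0.9, 0.2)
# A fever is rare in healthy people (5%), so it says a lot more.
after_fever = posterior(after_runny_nose, 0.8, 0.05)

print(f"after a runny nose: {after_runny_nose:.3f}")
print(f"after a fever too:  {after_fever:.3f}")
```

With these made-up numbers, a runny nose alone leaves you at roughly a 4% chance of Cluvitis, while adding a fever pushes that to roughly 42%. One symptom barely moves the needle; several together move it a lot.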

Graham showed how Bayes’ law could be applied to sort legitimate emails from spam. One of the major strengths of the Bayesian spam filter is that it evolves as the spam itself becomes more advanced: the more spam messages and legitimate emails the filter sees, the higher the chances that it correctly picks out the illegitimate ones. Simply said, the more emails it receives, the more accurate it becomes.
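The “more emails, more accuracy” idea comes down to counting. Below is a minimal sketch of what the training side might look like, assuming we simply record how many spam and legitimate emails each word appears in. The names tokenize and WordCounts are hypothetical and not taken from Graham or from any particular filter.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class WordCounts:
    """Hypothetical store: in how many spam / legitimate emails each word appears."""

    def __init__(self):
        self.spam = Counter()
        self.ham = Counter()   # "ham" is the usual nickname for legitimate email
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text, is_spam):
        words = set(tokenize(text))  # count each word once per email
        if is_spam:
            self.spam.update(words)
            self.n_spam += 1
        else:
            self.ham.update(words)
            self.n_ham += 1


counts = WordCounts()
counts.train("Cheap pills, buy now", is_spam=True)
counts.train("Shopping list for the weekend", is_spam=False)
print(counts.spam["cheap"], counts.ham["shopping"])
```

Every email you mark as spam or not spam adds to these counts, which is why the filter’s picture of your mail keeps getting sharper over time.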

At first you’ll have to train the filter a little so that it can effectively tell legitimate email from illegitimate email. After you have sorted the first few emails by hand, the Bayesian spam filter will take care of the rest. Bayesian email filters work because they don’t depend on fixed rules that never change. They don’t depend on spelling or grammar either; their main focus is on the words that are actually used within the email!
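Here is a rough Python sketch of how a filter might turn the words in an email into a spam score. The word counts are invented, the function names are our own, and the sketch assumes spam and legitimate mail are equally common and that words appear independently of each other. Treat it as an illustration of the idea rather than a description of any real filter.

```python
import math

# Invented counts: how many of 50 spam and 50 legitimate
# training emails contained each word.
SPAM_COUNTS = {"lottery": 40, "free": 30, "shopping": 2}
HAM_COUNTS = {"lottery": 0, "free": 10, "shopping": 25}
N_SPAM, N_HAM = 50, 50


def word_spam_probability(word):
    """Chance an email is spam given that it contains this word."""
    # Add-one smoothing keeps the result away from exactly 0 or 1,
    # and gives never-seen words a neutral 0.5.
    spam_rate = (SPAM_COUNTS.get(word, 0) + 1) / (N_SPAM + 2)
    ham_rate = (HAM_COUNTS.get(word, 0) + 1) / (N_HAM + 2)
    return spam_rate / (spam_rate + ham_rate)


def spam_score(words):
    """Combine the per-word probabilities into one score between 0 and 1."""
    log_odds = 0.0
    for word in set(words):
        p = word_spam_probability(word)
        log_odds += math.log(p) - math.log(1 - p)
    return 1 / (1 + math.exp(-log_odds))


print(f"{spam_score(['free', 'lottery']):.2f}")
print(f"{spam_score(['shopping', 'list']):.2f}")
```

With these counts, an email mentioning “free” and “lottery” scores close to 1 (almost certainly spam), while one mentioning “shopping” scores close to 0, because that word turned up mostly in legitimate mail during training.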

For example, if the word “shopping” appears extensively in your non-spam emails, chances are that a new email containing it isn’t spam. This is all thanks to the Bayesian spam filter’s ability to adapt automatically when it needs to.

Now for how your Bayesian spam filter can come to disappoint you. The only bad thing about this spam filter is that, simply said, it’s not indestructible (as most of our childhood dreams seemed to be at one point in time). Yes, it has an incredible ability to evolve and adapt as time passes, but a spammer can still make it past a well-trained Bayesian filter if they manage to cleverly (and annoyingly) mask their emails to look just like regular emails.

Then again, since spammers aren’t accustomed to sending out perfect emails, there is still a chance that such an email won’t get past the great Bayesian guard.
