136 CHAPTER 4 ADDING CONTRIBUTED MODULES with (Web file server)

with blatant advertisements for their products. They will comment on other people's content, sometimes masking their intent in a thin veneer of compliments before getting to the business of self-promotion. They will use scripts to find the holes in your site and flood you with postings. Spammers show a lot of resourcefulness and absolutely no mercy. If given the chance, they will turn your web site into a wasteland of Viagra, poker, and payday loan advertisements.

It is important to the health of your online community that you have a strong defense against this malicious activity. Just as people need to have means to deal with spam e-mail in their inbox, you need to have a way to deal with spam posts on your web site. You will find this defense in the Spam module.

Detecting Spam

The Spam module can be configured and trained to detect content of any kind that is considered spam, including comments and node types. The administrator has configuration options that allow the Spam module to automatically unpublish that content and/or notify the administrator. Up to four different mechanisms can be used to identify content as spam: the Bayesian filter, custom filters, URL counting, and the Distributed Server Boycott List.

Bayesian Filter

The Bayesian filter learns to detect spam by being shown content that has been identified as spam by the site administrator. The best way to describe this method is to quote Jeremy Andrews, the author of the Spam module:

The Bayesian filter does statistical analysis on spam content, learning from spam and non-spam that it sees to determine the likelihood that new content is or is not spam. The filter starts out knowing nothing, and has to be trained every time it makes a mistake. This is done by marking spam content on your site as spam when you see it. Each word of the spam content will be remembered and assigned a probability.
The more often a word shows up in spam content, the higher the probability that future content with the same word is also spam. As most comment spam contains links back to the spammer's websites (i.e., to sell Prozac), the Bayesian filter provides a special option to quickly learn and block content that contains links to known spammer websites.

For more information about Bayesian filtering, see http://en.wikipedia.org/wiki/Bayesian_filtering.

Custom Filters

As the site administrator, you can define custom filters that change how likely certain words and patterns are to indicate spam. The filters will cause content to be marked as follows:

Blacklisted content will definitely be marked as spam.
Whitelisted content will definitely be marked as not spam.
Graylisted content is marked as either usually spam or usually not spam, increasing or decreasing the probability assigned by the Bayesian filter.
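
To make the interplay between the Bayesian filter and the custom filters more concrete, here is a minimal sketch in Python. It is not the Spam module's actual code (the module itself is written in PHP), and every name in it (ToyBayesianFilter, score, the pattern lists, the graylist adjustment values) is hypothetical. It assumes a simple word-level model with add-one smoothing, and it assumes that whitelist patterns are checked before blacklist patterns; the text above does not say which takes precedence.

```python
import math
import re
from collections import Counter


class ToyBayesianFilter:
    """Hypothetical word-level Bayesian scorer (illustration only)."""

    def __init__(self):
        self.spam_words = Counter()  # word -> number of spam items containing it
        self.ham_words = Counter()   # word -> number of non-spam items containing it
        self.spam_total = 0          # spam items seen so far
        self.ham_total = 0           # non-spam items seen so far

    @staticmethod
    def tokenize(text):
        # Each distinct word counts once per piece of content.
        return set(re.findall(r"[a-z0-9']+", text.lower()))

    def train(self, text, is_spam):
        """Called when the administrator marks content as spam or not spam."""
        words = self.tokenize(text)
        if is_spam:
            self.spam_words.update(words)
            self.spam_total += 1
        else:
            self.ham_words.update(words)
            self.ham_total += 1

    def word_probability(self, word):
        # Add-one smoothing: a word never seen before scores exactly 0.5.
        p_in_spam = (self.spam_words[word] + 1) / (self.spam_total + 2)
        p_in_ham = (self.ham_words[word] + 1) / (self.ham_total + 2)
        return p_in_spam / (p_in_spam + p_in_ham)

    def spam_probability(self, text):
        # Combine the per-word probabilities by summing their log-odds.
        log_odds = 0.0
        for word in self.tokenize(text):
            p = self.word_probability(word)
            log_odds += math.log(p) - math.log(1 - p)
        log_odds = max(-50.0, min(50.0, log_odds))  # avoid overflow in exp()
        return 1 / (1 + math.exp(-log_odds))


def score(text, bayes, whitelist=(), blacklist=(), graylist=()):
    """Apply custom filters on top of the Bayesian probability.

    whitelist and blacklist are regex patterns; graylist holds
    (pattern, adjustment) pairs, where a positive adjustment means
    "usually spam" and a negative one means "usually not spam".
    """
    lowered = text.lower()
    if any(re.search(pattern, lowered) for pattern in whitelist):
        return 0.0  # definitely not spam
    if any(re.search(pattern, lowered) for pattern in blacklist):
        return 1.0  # definitely spam
    probability = bayes.spam_probability(text)
    for pattern, adjustment in graylist:
        if re.search(pattern, lowered):
            probability += adjustment
    return max(0.0, min(1.0, probability))
```

An untrained filter returns 0.5 for any content, which matches the statement that the filter starts out knowing nothing. After a few calls to train(), words seen mostly in spam push the result toward 1, and a graylist entry such as (r"poker", -0.3) pulls it back down without forcing a verdict the way a blacklist or whitelist match does.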