I had a look at the old "Writer's Workbench" because I wondered whether grammar, rather than the spelling checker, was the way to attack them.

Bayes has difficulty with the spams that contain large numbers of unconnected words: perfectly spelled, but perfectly meaningless. Adding a grammar check, maybe an upper limit on the Gunning Fog index or on the number of incorrect sentences, would take them out very reliably, leaving only the group that use a long passage quoted from a book as their masking.

Maybe one line of attack is to tackle each MIME section individually, and reject the message if any of them is rubbish.

On Wednesday 14 April 2004 13:04, Simon Waters wrote:
Adrian Midgley wrote:
What is amazing is how quickly people can do it though, isn't it? We only need to look further than the subject in maybe 10% of the messages that make it through the Bayesian filter and other protections, and even when we do it takes only moments.

As opposed to Dave's 500 a second?!
500 a second that the machine can categorise... we get better at the difficult end, on the population it failed to correctly categorise.
The Wetware is also more expensive and harder to fix when it malfunctions
cheap, and can be produced by unskilled labour in 9 months...

--
Adrian Midgley (Linux desktop)
GP, Exeter
http://www.defoam.net/

--
The Mailing List for the Devon & Cornwall LUG
Mail majordomo@xxxxxxxxxxxx with "unsubscribe list" in the
message body to unsubscribe.
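
For anyone who wants to try the Gunning Fog ceiling and the per-MIME-section idea from the top of this message, here is a rough sketch in Python using only the standard library. It is not taken from any existing filter. The threshold FOG_CEILING, the function names and the vowel-group syllable counter are all my own assumptions, and the syllable count is crude. It only looks at text/plain sections, and a run of unpunctuated words is treated as one very long sentence, which is what pushes the score up.

```python
import re
from email import message_from_string

# Hypothetical ceiling; it would need tuning against real mail.
FOG_CEILING = 20.0


def count_syllables(word):
    """Crude estimate: count runs of vowels (an assumption, not a dictionary lookup)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def gunning_fog(text):
    """Gunning Fog index: 0.4 * (words per sentence + percentage of complex words)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    # "Complex" here means three or more estimated syllables.
    complex_words = sum(1 for w in words if count_syllables(w) >= 3)
    return 0.4 * (len(words) / len(sentences) + 100.0 * complex_words / len(words))


def plain_text_parts(msg):
    """Yield the decoded body of every text/plain section of the message."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if payload:
            charset = part.get_content_charset() or "utf-8"
            yield payload.decode(charset, errors="replace")


def looks_like_word_salad(msg, ceiling=FOG_CEILING):
    """Reject if ANY single section scores above the ceiling."""
    return any(gunning_fog(text) > ceiling for text in plain_text_parts(msg))
```

Feed it a parsed message, e.g. looks_like_word_salad(message_from_string(raw_mail)). Checking each section separately means a sensible first part cannot hide a rubbish second part. It will not catch the ones that quote a long passage from a real book, since those read as proper sentences.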