[Rspamd-Users] How to improve Bayes effectiveness?
Andrei Goldchleger
agoldchleger at vbtec.com.br
Tue Sep 20 19:51:09 UTC 2022
Hi,
My rspamd deployment is now almost a year old (currently on version 3.2),
but I cannot get the Bayes classifier to work effectively and reliably.
The spam I get is pretty "dumb": 95% of it is 5 text variations with a
few words changed. I have learning configured so that when a user moves
a message into or out of the spam folder, it is supposed to train the
classifier (I checked the logs and, as far as I can tell, this is really
happening). I would expect that as soon as a message is learned,
subsequent messages based on the same template would be correctly
classified, but unfortunately I cannot get consistent behavior. I am
resorting to domain blacklisting as a stopgap, but this is suboptimal.
I know that Bayes can be quite effective, since I used Thunderbird's
embedded Bayes classifier a long time ago and it worked well. So I
wonder what I am missing. My configuration follows:
*****
classifier {
    bayes {
        learn_condition = "return require(\"lua_bayes_learn\").can_learn";
        new_schema = true;
        autolearn [
            -3,
            5,
        ]
        backend = "redis";
        cache {
            backend = "redis";
        }
        expire = 2144448000;
        tokenizer {
            name = "osb";
        }
        statfile {
            spam = false;
            symbol = "BAYES_HAM";
        }
        statfile {
            spam = true;
            symbol = "BAYES_SPAM";
        }
        store_tokens = true;
        signatures = true;
        min_tokens = 11;
        min_learns = 200;
    }
}
*****
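For anyone reading the configuration above, here is a simplified Python
sketch (not Rspamd's actual code) of how the min_learns and min_tokens
thresholds are documented to gate classification. The function name and
its arguments are hypothetical. It assumes that the classifier is skipped
until both the spam and the ham class have at least min_learns learned
messages, and that messages with fewer than min_tokens tokens are skipped.

```python
def bayes_would_classify(spam_learns, ham_learns, token_count,
                         min_learns=200, min_tokens=11):
    """Hypothetical helper mirroring the thresholds in the config above."""
    # Assumption: BOTH classes need at least min_learns learned messages
    # before the classifier produces any result.
    if spam_learns < min_learns or ham_learns < min_learns:
        return False
    # Assumption: messages with fewer than min_tokens tokens are skipped.
    return token_count >= min_tokens


if __name__ == "__main__":
    # Many spam learns but few ham learns: the classifier stays inactive.
    print(bayes_would_classify(spam_learns=350, ham_learns=40, token_count=120))
    # Both classes above the threshold and a long enough message.
    print(bayes_would_classify(spam_learns=350, ham_learns=220, token_count=120))
```

Under these assumptions, a setup with plenty of spam learns but too few
ham learns would yield no Bayes symbols at all, no matter how closely a
new message matches already learned spam.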
Thanks,
Andrei