[Rspamd-Users] Autolearn BAYES does not want to work

Wed Jun 26 18:24:15 UTC 2024

Hello Gerald,

yes, I've always had problems with BAYES autolearn.

It "worked" for a while, but I'm not quite sure what it does. The 
weirdest things happen.

I tried your suggestion:

autolearn = [-0.5, 4];

That form of entry doesn't work for me at all.

This is how I've got it working now:

learn_condition = 'return require("lua_bayes_learn").can_learn';
autolearn = true;
autolearn {
   spam_threshold = 6.0;
   junk_threshold = 4.0;
   ham_threshold = -0.5;
   check_balance = true;
   min_balance = 0.95;
}
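
For reference, here is a rough Python model of how I read these settings.
It is only a sketch under assumptions, not rspamd's actual code: the
function name autolearn_decision, the verdict strings and the balance
formula are illustrative guesses. Only the option names and values come
from the config above.

```python
from typing import Optional


def autolearn_decision(
    verdict: str,
    score: float,
    spam_learns: int,
    ham_learns: int,
    spam_threshold: float = 6.0,
    junk_threshold: float = 4.0,
    ham_threshold: float = -0.5,
    check_balance: bool = True,
    min_balance: float = 0.95,
) -> Optional[str]:
    """Return "spam", "ham" or None (do not learn). Hypothetical model."""
    learn = None
    if verdict == "spam" and score >= spam_threshold:
        learn = "spam"
    elif verdict == "junk" and score >= junk_threshold:
        learn = "spam"
    elif verdict == "ham" and score <= ham_threshold:
        learn = "ham"

    # Assumed balance rule: refuse to learn a class that is already
    # over-represented by more than a factor of 1 / min_balance.
    if learn and check_balance and (spam_learns > 0 or ham_learns > 0):
        max_ratio = 1.0 / min_balance
        if learn == "spam" and spam_learns / (ham_learns + 1) > max_ratio:
            return None
        if learn == "ham" and ham_learns / (spam_learns + 1) > max_ratio:
            return None
    return learn


# A clean mail with score 0.0 is above ham_threshold, so nothing is learned.
print(autolearn_decision("ham", 0.0, spam_learns=0, ham_learns=0))      # -> None
# A clear spam is learned while the counters are balanced ...
print(autolearn_decision("spam", 7.0, spam_learns=100, ham_learns=100))  # -> spam
# ... but skipped once spam learns far outnumber ham learns.
print(autolearn_decision("spam", 7.0, spam_learns=200, ham_learns=100))  # -> None
```

If the real logic is anything like this, a "no action" mail with a score
between -0.5 and 4.0 would be learned as neither ham nor spam, and the
balance check could skip further mails on top of that.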

1. It's still not clear to me when an incoming email is automatically
learned as spam or ham. Does it matter whether a multimap rule is used as
a pre-filter? Can learning also be triggered by hand, for example by
setting autolearn = true on a multimap rule such as Whitelist_Domain, so
that everything that is whitelisted is automatically learned?

2. I can't check or control what is stored in the Redis database. Is
there a way to edit and adjust it? Wouldn't MySQL be better? I don't have
that many emails.

3. I set the lifetime of BAYES entries in classifier-bayes.conf to 20
days and waited until the 20 days had passed. I have now reduced it to
10 days, but older entries are still not being deleted from the database.
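
One possible explanation, as a sketch: if the lifetime is applied as a
per-key TTL at the moment an entry is written (which is how the Redis
EXPIRE command behaves), then lowering the setting later would not touch
keys that already exist. Whether the Bayes backend really works this way
is an assumption on my part; the TtlStore class below is a hypothetical
in-memory stand-in, not a Redis client.

```python
class TtlStore:
    """Tiny in-memory stand-in for a key store with per-key expiry."""

    def __init__(self):
        self._expires_at = {}  # key -> absolute expiry time, in days

    def write(self, key, now, ttl_days):
        # The expiry is fixed when the key is written (assumed behaviour).
        self._expires_at[key] = now + ttl_days

    def live_keys(self, now):
        return sorted(k for k, t in self._expires_at.items() if t > now)


store = TtlStore()
store.write("old_token", now=0, ttl_days=20)   # written under the 20-day setting
store.write("new_token", now=12, ttl_days=10)  # written after lowering it to 10

# Day 15: old_token is already older than 10 days but keeps its 20-day TTL.
print(store.live_keys(now=15))  # -> ['new_token', 'old_token']
# Day 21: old_token has expired on its original schedule.
print(store.live_keys(now=21))  # -> ['new_token']
```

Under that assumption, old entries would only disappear once their
original lifetime runs out or they are written again with the new value.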

Repeated spam that hits RBL, Fuzzy, Neural, Multimap, SPF and DKIM rules
and has a score of +40 is still marked as BAYES ham.

It looks as if only the body of an email, without the headers, is used
for the BAYES evaluation. Or can this be adjusted?

Christian

On 25.06.2024 at 17:12, Gerald Galster wrote:
>> I'm currently having the problem that not all incoming emails marked as "HAM no action" and "SPAM add header/rewrite subject" are being trained and included in the BAYES statistics. Most of the emails that come in I have to learn afterwards with rspamc learn_ham or learn_spam.
>>
>> Why could it be that they are not being trained?
>> Autolearn is active.
>>
>> autolearn = true;
>> autolearn {
>>   spam_threshold = 6.0;
>>   junk_threshold = 4.0;
>>   ham_threshold = -0.5;
>>   check_balance = true;
>>   min_balance = 0.95;
>> }
> 
> Is this the same problem you reported in March?
> 
> https://lists.rspamd.com/pipermail/users/2024-March/003211.html
> https://lists.rspamd.com/pipermail/users/2024-March/003217.html
> 
> Otherwise see:
> https://rspamd.com/doc/configuration/statistic.html#autolearning
> 
> You could try to remove your autolearn config and replace it with
> something like
> 
> 	autolearn = [-0.5, 4];
> 
> Best regards,
> Gerald