[Rspamd-Users] Bayes questions and observations

Sat Mar 16 14:55:28 UTC 2024

On 16/03/2024 14:20, christian via Users wrote:
> Hello Vsevolod,
> thank you for your feedback.
> First of all: I'm a Rspamd beginner and still have a lot to learn. After 
> a few weeks, the filter results from Rspamd are already better than with 
> my old spam filter ASSP, which I used for a few years.
> 

>>
>> That's very interesting and I would like to investigate further. In fact, 
>> both SA and Rspamd use more or less the same Bayes algorithm, with 
>> some slight differences in the tokenisation logic.
>>
>> If you have samples of misclassification, could you please do the 
>> following things:
>>
>> 1) Enable "bayes" debugging (add "bayes" to the `debug_modules` array 
>> in local.d/logging.inc)
>> 2) Check all logs with tag "bayes" when you scan those messages and 
>> send them to me (probably via private email if there's some 
>> confidential data or large attachment)
>> 3) Send me both samples and your Redis dump so I can try to experiment 
>> with that
>>
>> (3) may be overkill in terms of privacy and amount of data, so I 
>> would appreciate it if you could do at least (1) and (2).
>>
>> Thanks in advance!

Again, could you please do what I have asked here? It might be very 
interesting to look at.
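
For reference, steps (1) and (2) could look roughly like the sketch below. 
The config directory, the log file path and the helper name 
`bayes_log_lines` are assumptions for illustration, so adjust them to your 
installation.

```shell
# Step 1 (config, not shell): in local.d/logging.inc (assumed to live under
# /etc/rspamd/) add the line
#   debug_modules = ["bayes"];
# then reload or restart Rspamd so the setting takes effect.

# Step 2: print the log lines that mention "bayes".
# The default path below is an assumption; pass your own log file as $1.
bayes_log_lines() {
    grep -F 'bayes' "${1:-/var/log/rspamd/rspamd.log}"
}
```

After scanning the misclassified samples, something like 
`bayes_log_lines > bayes-debug.txt` would collect the lines to send.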

Another thing you could try is running `rspamadm mime stat -b` on a 
sample. It shows the tokens Rspamd uses for that particular email. 
Afterwards, you can even check them in Redis using something like 
`HGETALL RS_<number>`, where `number` is printed by `mime stat`. You will 
also see this information if you do what I asked in my previous email, 
quoted above.
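
A minimal sketch of that token check, assuming `redis-cli` can reach the 
Redis instance that holds the Bayes data with its default connection 
settings. The function names are hypothetical helpers, not part of Rspamd.

```shell
# Build the Redis key for a token number printed by `mime stat`.
bayes_key() {
    printf 'RS_%s\n' "$1"
}

# Print the Bayes tokens Rspamd derives from a message file.
show_tokens() {
    rspamadm mime stat -b "$1"
}

# Dump the Redis hash stored for one token number.
lookup_token() {
    redis-cli HGETALL "$(bayes_key "$1")"
}
```

Usage would be `show_tokens sample.eml`, then `lookup_token <number>` with 
one of the numbers copied by hand from the output.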