[Rspamd-Users] Bayes questions and observations
Vsevolod Stakhov
vsevolod at rspamd.com
Sat Mar 16 14:55:28 UTC 2024
On 16/03/2024 14:20, christian via Users wrote:
> Hello Vsevolod,
> thank you for your feedback signal.
> First of all: I'm a Rspamd beginner and still have a lot to learn. After
> a few weeks, the filter results from Rspamd are already better than with
> my old spam filter ASSP, which I used for a few years.
>
>>
>> That's very interesting and I would like to investigate more. In fact,
>> both SA and Rspamd are using more or less the same Bayes algorithm
>> with some slight differences on tokenisation logic.
>>
>> If you have samples of misclassification, could you please do the
>> following things:
>>
>> 1) Enable "bayes" debugging (add "bayes" to the list of
>> `debug_modules` array in the local.d/logging.inc)
>> 2) Check all logs with tag "bayes" when you scan those messages and
>> send them to me (probably via private email if there's some
>> confidential data or large attachment)
>> 3) Send me both samples and your Redis dump so I can try to experiment
>> with that
>>
>> Maybe (3) would be a huge overkill in terms of privacy and amount of
>> data, so I would appreciate if you can do 1-2.
>>
>> Thanks in advance!
Again, could you please do what I asked above (at least steps 1 and 2)?
The debug logs would be very interesting to look at.
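If the log is large, a small filter can pull out only the lines worth sending. This is a minimal sketch, assuming that lines logged by the Bayes debug module contain the string "bayes"; the helper name `bayes_log_lines` is made up for illustration and is not part of Rspamd.

```python
def bayes_log_lines(lines, tag="bayes"):
    """Return the log lines that mention `tag`, without trailing newlines.

    `lines` is any iterable of strings, for example an open log file.
    Matching is a plain substring test, which is an assumption about how
    the tag shows up in the log, not a parser for the Rspamd log format.
    """
    return [line.rstrip("\n") for line in lines if tag in line]
```

To use it, pass an open file object for your Rspamd log (its location depends on your setup) and write the returned lines to a separate file before mailing them.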
Another thing you could try is to run the `rspamadm mime stat -b` command
on a sample. It will show you the tokens Rspamd uses for that particular
email. Afterwards, you can even check them in Redis using something like
`HGETALL RS_<number>`, where `number` is printed by `mime stat`. You will
also see this information if you do what I asked in my previous email,
quoted above.
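Checking many tokens by hand gets tedious, so the Redis lookup can be scripted. This is a minimal sketch, assuming you have already copied the token numbers from the `mime stat` output into a list; the helpers `token_key` and `lookup_tokens` are made up for illustration. The client can be any object with an `hgetall(key)` method, such as a `redis.Redis` instance from the redis-py package. The hash fields are returned as stored, with no assumption about their names.

```python
def token_key(number, prefix="RS_"):
    """Build the Redis key for one token number, e.g. 42 -> "RS_42"."""
    return f"{prefix}{number}"


def lookup_tokens(client, numbers):
    """Run HGETALL for every token number and collect the results.

    `client` is anything that offers hgetall(key), for example a
    redis.Redis connection. Returns a dict mapping each key to the hash
    stored under it (an empty dict if the key is unknown to Redis).
    """
    result = {}
    for number in numbers:
        key = token_key(number)
        result[key] = client.hgetall(key)
    return result
```

With redis-py installed, something like `lookup_tokens(redis.Redis(decode_responses=True), numbers)` should work against a local Redis; host, port and database depend on your statistics backend configuration.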