[Rspamd-Users] Which data is used for learning bayes or fuzzy storage?

Achim Lammerts ml-rspamd at syntaxys.de
Fri Nov 22 06:57:58 UTC 2024


Many thanks for the information. On this basis I assume that the hashing
in auto_learn takes place before the subject is rewritten.
So if I want to train the filter manually afterwards via the web
interface, should I at least remove the added string, for example
'*** SPAM [8.57/12] ***', beforehand in order to generate correct hashes?
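
To make the question concrete, removing the tag could look roughly like
the minimal Python sketch below. The tag pattern is an assumption derived
from the example above, and the names strip_tag and strip_subject_tag are
made up for illustration; whether this step is needed at all is exactly
what I am asking.

```python
import re
from email import message_from_string

# Assumed tag format: '*** SPAM [score/threshold] ***' at the start of the
# subject, as in the example '*** SPAM [8.57/12] ***'. Adjust to the local
# subject-rewrite setting.
SUBJECT_TAG = re.compile(
    r"^\s*\*{3}\s*SPAM\s*\[\d+(?:\.\d+)?/\d+(?:\.\d+)?\]\s*\*{3}\s*"
)


def strip_tag(subject: str) -> str:
    """Remove one leading spam tag from a subject string, if present."""
    return SUBJECT_TAG.sub("", subject, count=1)


def strip_subject_tag(raw: str) -> str:
    """Return the raw message with the spam tag removed from its Subject."""
    msg = message_from_string(raw)
    subject = msg.get("Subject")
    if subject is not None:
        cleaned = strip_tag(subject)
        if cleaned != subject:
            # replace_header keeps the header at its original position.
            msg.replace_header("Subject", cleaned)
    return msg.as_string()
```

The cleaned message would then be uploaded for learning instead of the
tagged one. The sketch only handles a plain-text subject; a subject that
is RFC 2047 encoded would have to be decoded first.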

The same problem arises with user-initiated training via the mail store,
when a message is moved into or out of the junk folder. In general,
however, user training is given its own flag and less weight in scoring.

Kind regards
Achim

On 22.11.24 at 06:22, Gerald Galster wrote:
> This might help:
> 
> Bayes
> https://rspamd.com/doc/configuration/statistic.html#classifier-and-headers
> https://github.com/rspamd/rspamd/blob/41eab5b874721f7abc23144e5c3386392f1820f7/conf/options.inc#L40
> 
> Fuzzy
> https://rspamd.com/doc/modules/fuzzy_check.html#module-outline  (cfg line with headers = "...")
> https://github.com/rspamd/rspamd/blob/41eab5b874721f7abc23144e5c3386392f1820f7/src/plugins/fuzzy_check.c#L190