[Rspamd-Users] Which data is used for learning bayes or fuzzy storage?

Gerald Galster list+rspamd at gcore.biz
Fri Nov 22 05:22:05 UTC 2024


> Currently I reject messages with a score above 12, but I also export a copy of each one to the postmaster's junk folder, kept for a few days as a backup.
> 
> In this case, the messages lose their original "Envelope From", "Envelope To" and "Received" headers. Instead, headers like the following are present:
> 
> Return-Path: <postmaster at mydomain.tld>
> Delivered-To: mbox_admin at mydomain.tld
> Received: from rspamd.mydomain.tld ([192.168.123.2])
>        by mailstore.mydomain.tld with LMTP
>        id mDmYAC0QP2fkOAkADDlYHw
>        (envelope-from <postmaster at mydomain.tld>)
>        for <mbox_admin at mydomain.tld>; Thu, 21 Nov 2024 11:49:17 +0100
> Received: from rspamd.mydomain.tld (localhost [127.0.0.1])
>        by mta.mydomain.tld (Postfix) with SMTP id F16417D0CF
>        for <quarantine at mydomain.tld>; Thu, 21 Nov 2024 11:49:16 +0100 (CET)
> 
> So my question is: if I use such mail files for manual learning, will these headers also be used for hashing? Is there any documentation on which parts of a message are used for hashing?


This might help:

Bayes
https://rspamd.com/doc/configuration/statistic.html#classifier-and-headers
https://github.com/rspamd/rspamd/blob/41eab5b874721f7abc23144e5c3386392f1820f7/conf/options.inc#L40

Fuzzy
https://rspamd.com/doc/modules/fuzzy_check.html#module-outline  (see the config line with headers = "...")
https://github.com/rspamd/rspamd/blob/41eab5b874721f7abc23144e5c3386392f1820f7/src/plugins/fuzzy_check.c#L190
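To check a saved quarantine copy against those header lists, something like the following rough sketch might help. It only uses Python's standard email module and does not talk to Rspamd at all. CONFIGURED_HEADERS is a placeholder I made up for illustration; copy the real names from classify_headers in options.inc or from the headers setting of your fuzzy rule. It also assumes that those lists are what decides which headers are considered, as the links above suggest.

```python
from email import message_from_string

# Hypothetical list for illustration only; replace it with the names from
# classify_headers (Bayes) or from "headers" in your fuzzy rule.
CONFIGURED_HEADERS = ["Subject", "Content-Type", "X-Mailer"]

# Shortened stand-in for a quarantined copy with rewritten headers.
SAMPLE = """\
Return-Path: <postmaster@mydomain.tld>
Delivered-To: mbox_admin@mydomain.tld
Received: from rspamd.mydomain.tld (localhost [127.0.0.1])
Subject: Cheap pills
Content-Type: text/plain

body
"""


def split_headers(raw, configured):
    """Split the header names of a message into (configured, other)."""
    wanted = {name.lower() for name in configured}
    msg = message_from_string(raw)
    used, other = [], []
    for name in msg.keys():
        # Header names are case-insensitive, so compare in lower case.
        (used if name.lower() in wanted else other).append(name)
    return used, other


used, other = split_headers(SAMPLE, CONFIGURED_HEADERS)
print("in configured list:", used)
print("not in configured list:", other)
```

If Return-Path, Delivered-To and Received only show up in the second list for your real header settings, the rewritten headers are not among the configured ones.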

Best regards,
Gerald

