[Rspamd-Users] Bayes and Redis on multiple mx servers

Thu Feb 18 21:07:30 UTC 2021

On 18/02/2021 21:03, Vsevolod Stakhov wrote:
> On 17/02/2021 13:28, Sandy Drobic wrote:
>> Hello,
>>
>> I run rspamd with Redis as the backend on several servers in our company.
>> During testing I discovered that the trained Bayes database was only applied
>> to mail on the primary server, although Redis replication does copy the Bayes
>> database to the other servers.
>>
>> local.d/classifier.conf:
>>
>> backend = "redis";
>>
>> read_servers = "localhost:6376, mx-srv1.example.com:6376";
>> write_servers = "mx-srv1.example.com:6376";
>>
>> new_schema = true;
>> expire = 86400000;  # 1000 days
>> autolearn = [-1, 12]
>> min_learns = 200;
>>
>> The output of "rspamc stat" shows that the Bayes data was replicated properly:
>>
>> [root@mx-srv1 tmp]# rspamc stat
>> Results for command: stat (2.000 seconds)
>> Messages scanned: 443
>> Messages with action reject: 10, 2.25%
>> Messages with action soft reject: 0, 0.00%
>> Messages with action rewrite subject: 37, 8.35%
>> Messages with action add header: 4, 0.90%
>> Messages with action greylist: 7, 1.58%
>> Messages with action no action: 385, 86.90%
>> Messages treated as spam: 51, 11.51%
>> Messages treated as ham: 392, 88.48%
>> Messages learned: 419
>> Connections count: 0
>> Control connections count: 95
>> Pools allocated: 170
>> Pools freed: 143
>> Bytes allocated: 26.72MiB
>> Memory chunks allocated: 140
>> Shared chunks allocated: 16
>> Chunks freed: 0
>> Oversized chunks: 1
>> Statfile: BAYES_SPAM type: redis; length: 0; free blocks: 0; total blocks: 0;
>> free: 0.00%; learned: 216; users: 1; languages: 0
>> Statfile: BAYES_HAM type: redis; length: 0; free blocks: 0; total blocks: 0;
>> free: 0.00%; learned: 202; users: 1; languages: 0
>> Total learns: 418
>>
>> [root@mx-ext ~]# rspamc stat
>> Results for command: stat (2.001 seconds)
>> Messages scanned: 16
>> Messages with action reject: 0, 0.00%
>> Messages with action soft reject: 0, 0.00%
>> Messages with action rewrite subject: 7, 43.75%
>> Messages with action add header: 0, 0.00%
>> Messages with action greylist: 2, 12.50%
>> Messages with action no action: 7, 43.75%
>> Messages treated as spam: 7, 43.75%
>> Messages treated as ham: 9, 56.25%
>> Messages learned: 0
>> Connections count: 0
>> Control connections count: 46
>> Pools allocated: 93
>> Pools freed: 67
>> Bytes allocated: 26.72MiB
>> Memory chunks allocated: 142
>> Shared chunks allocated: 16
>> Chunks freed: 0
>> Oversized chunks: 1
>> Statfile: BAYES_SPAM type: redis; length: 0; free blocks: 0; total blocks: 0;
>> free: 0.00%; learned: 216; users: 1; languages: 0
>> Statfile: BAYES_HAM type: redis; length: 0; free blocks: 0; total blocks: 0;
>> free: 0.00%; learned: 202; users: 1; languages: 0
>> Total learns: 418
>>
>> Is Bayes only used once the number of mails scanned on the local host exceeds
>> min_learns, regardless of the learned counts in BAYES_SPAM and BAYES_HAM?
>> Is there a workaround to make the other servers use Bayes?
>>
>> Greetings
>> Sandy Drobic
>>
> 
> It should be fixed by
> https://github.com/rspamd/rspamd/commit/dbc9ff655dfb459eb8af328a82a5b8c848cda480
> 
> As a workaround, you can specify explicit weights for all upstreams, e.g.:
> 
> read_servers = "localhost:6376:1, mx-srv1.example.com:6376:1";
> 

Or not; from what I can see, that might not help :( Another workaround is to
set the rotation algorithm to random, e.g.:

read_servers = "random:localhost:6376, mx-srv1.example.com:6376";
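
For reference, here is how the classifier settings from the original post
might look with that workaround applied on a secondary server. This is an
untested sketch that only combines the values already quoted in this thread;
whether it is enough depends on the rspamd version in use, and the commit
linked above remains the actual fix.

```
# local.d/classifier.conf (file name as given in the original post)
backend = "redis";

# "random:" prefix: pick a read upstream at random instead of using the
# default rotation algorithm
read_servers = "random:localhost:6376, mx-srv1.example.com:6376";
# learning is still written only to the primary, which replicates to the rest
write_servers = "mx-srv1.example.com:6376";

new_schema = true;
expire = 86400000;  # 1000 days, in seconds
autolearn = [-1, 12];
min_learns = 200;
```

After reloading rspamd, scanning a test message on the secondary server and
looking for a BAYES_SPAM or BAYES_HAM symbol in the result is one way to check
whether the classifier is now being consulted there.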