[Rspamd-Users] Redis huge database

azurit at pobox.sk azurit at pobox.sk
Wed Nov 23 08:37:37 UTC 2022


Quoting Alexander Moisseev via Users <users at lists.rspamd.com>:

> On 23.11.2022 10:31, azurit at pobox.sk wrote:
>>
>> Quoting Alexander Moisseev via Users <users at lists.rspamd.com>:
>>
>>> On 22.11.2022 22:45, azurit at pobox.sk wrote:
>>>> I'm having problems with the Redis database - it's huge and keeps
>>>> getting bigger, no matter what I do. Redis is taking more and more
>>>> memory. If I look at the keys, I see about 500 000 keys with names
>>>> similar to 'RS_1028729928929298385'. Is this normal?
>>>>
>>>
>>> AFAIK no one has ever researched the correlation of number of  
>>> statistical tokens and quality of classification, but I guess 0.5M  
>>> keys may not be enough. Just for reference, I have about 4.2M RS_*  
>>> keys in the bayes database (per user classifier is not enabled),
>>> used_memory_human:554.79M.
>>
>>
>> My Redis database is almost 1 GB on disk and Redis needs 4 GB of memory.
>> With a lower maxmemory value, I'm getting this error from rspamd:
>> Nov 17 00:13:56 server00 rspamd[4086]: <177d52>; lua;  
>> history_redis.lua:132: got error OOM command not allowed when used  
>> memory > 'maxmemory'. when writing history row: no value
>>
> The on-disk .rdb file is compressed, so 1GB is a relatively large database.
>
>> rspamd is the only service using this Redis instance.
>>
> As you store everything related to Rspamd in the single Redis  
> instance there is no easy way to determine how much database space  
> each module consumes, I'm afraid. Probably you need to count the
> number of keys matching patterns with redis-cli.
> Also some excessive numbers (like stored fuzzy hashes, history  
> nrows, etc.) can indirectly indicate the source of the problem.


Ok, here are the numbers:

keys beginning with "RS_": 1007930
keys beginning with "RR:": 27141
keys beginning with "rs_first_": 895
everything else: 37983

I don't see any pattern in the other keys; they seem random, for example
rrxc1p8tu5skays4a5gu63 (but lots of them begin with 'rr' like the one
in the example).
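
For anyone who wants to reproduce counts like the ones above, here is a
minimal sketch. It is not how I produced my numbers, just one way to do
it. It assumes the third-party redis-py package is installed and that
Redis listens on 127.0.0.1:6379; the function names are made up for
illustration.

```python
from collections import Counter
from typing import Iterable, Sequence


def count_by_prefix(keys: Iterable, prefixes: Sequence) -> Counter:
    """Count keys per prefix; keys matching no prefix land under 'other'.

    Works with str or bytes, as long as keys and prefixes use the same type.
    """
    counts = Counter()
    for key in keys:
        for prefix in prefixes:
            if key.startswith(prefix):
                counts[prefix] += 1
                break
        else:
            # No prefix matched this key.
            counts["other"] += 1
    return counts


def count_live_keys(host: str = "127.0.0.1", port: int = 6379) -> Counter:
    """Hypothetical helper: count keys of a running Redis instance by prefix."""
    # Assumption: redis-py is installed (pip install redis).
    import redis

    r = redis.Redis(host=host, port=port)
    # scan_iter walks the keyspace incrementally with SCAN instead of KEYS,
    # so it does not block the server on a large database. Keys are returned
    # as bytes, hence the bytes prefixes.
    return count_by_prefix(r.scan_iter(count=1000), [b"RS_", b"RR:", b"rs_first_"])
```

Calling count_live_keys() and printing the result gives one line of
totals per prefix. The matching is case-sensitive, so the random-looking
'rr...' keys are counted under 'other' and not under 'RR:'.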

Other data:

127.0.0.1:6379> debug object BAYES_HAM
Value at:0x7f836d192ab0 refcount:1 encoding:hashtable  
serializedlength:762292662 lru:8248068 lru_seconds_idle:43

127.0.0.1:6379> debug object BAYES_SPAM
Value at:0x7f838bae6450 refcount:1 encoding:hashtable  
serializedlength:138423713 lru:8248136 lru_seconds_idle:11
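
As far as I know, serializedlength is reported in bytes and describes
the serialized form of the value, not the memory Redis actually uses for
it. A small sketch to convert the two values above into MiB (the helper
name to_mib is just for illustration):

```python
def to_mib(n_bytes: int) -> float:
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


# serializedlength values copied from the DEBUG OBJECT output above.
serialized = {"BAYES_HAM": 762292662, "BAYES_SPAM": 138423713}

for name, size in serialized.items():
    print(f"{name}: {to_mib(size):.1f} MiB")
print(f"total: {to_mib(sum(serialized.values())):.1f} MiB")
```

That comes to roughly 727 MiB for BAYES_HAM and 132 MiB for BAYES_SPAM,
which matches the "almost 1 GB" database size mentioned earlier.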

Configuration in /etc/rspamd/local.d/statistic.conf:
classifier "bayes" {
     expire = 100d;
     new_schema = true;
     tokenizer {
         name = "osb";
     }

     # Minimum number of words required for statistics processing
     min_tokens = 11;
     # Minimum learn count for both spam and ham classes to perform classification
     min_learns = 200;

     backend = "redis";
     autolearn = [-4, 10];
     statfile {
         symbol = "BAYES_HAM";
         spam = false;
     }
     statfile {
         symbol = "BAYES_SPAM";
         spam = true;
     }
}




Btw:
rspamd 3.4
Redis 5.0.14



