[Rspamd-Users] Redis huge database
azurit at pobox.sk
Wed Nov 23 08:37:37 UTC 2022
Quoting Alexander Moisseev via Users <users at lists.rspamd.com>:
> On 23.11.2022 10:31, azurit at pobox.sk wrote:
>>
>> Quoting Alexander Moisseev via Users <users at lists.rspamd.com>:
>>
>>> On 22.11.2022 22:45, azurit at pobox.sk wrote:
>>>> i'm having problems with Redis database - it's huge and getting
>>>> bigger, no matter what i do. Redis is taking more and more
>>>> memory. If i look at the keys i see about 500 000 keys with name
>>>> similar to 'RS_1028729928929298385'. Is this normal?
>>>>
>>>
>>> AFAIK no one has ever researched the correlation of number of
>>> statistical tokens and quality of classification, but I guess 0.5M
>>> keys may not be enough. Just for reference, I have about 4.2M RS_*
>>> keys in the bayes database (per-user classifier is not enabled),
>>> used_memory_human:554.79M.
>>
>>
>> My Redis database has almost 1 GB and Redis needs 4 GB of memory.
>> With lower values, i'm getting this error from rspamd:
>> Nov 17 00:13:56 server00 rspamd[4086]: <177d52>; lua;
>> history_redis.lua:132: got error OOM command not allowed when used
>> memory > 'maxmemory'. when writing history row: no value
>>
> The on-disk .rdb file is compressed, so 1GB is a relatively large database.
>
>> rspamd is the only service using this Redis instance.
>>
> As you store everything related to Rspamd in the single Redis
> instance there is no easy way to determine how much database space
> each module consumes, I'm afraid. Probably you need to count the
> number of keys matching patterns with redis-cli.
> Also some excessive numbers (like stored fuzzy hashes, history
> nrows, etc.) can indirectly indicate the source of the problem.
Ok, here are the numbers:
keys beginning with "RS_": 1007930
keys beginning with "RR:": 27141
keys beginning with "rs_first_": 895
everything else: 37983
I don't see any pattern in the other keys; they seem random, for example
rrxc1p8tu5skays4a5gu63 (but lots of them begin with 'rr', like the one
in the example).
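For anyone who wants to reproduce counts like the ones above, here is a minimal sketch in Python. The function name count_by_prefix and the "other" bucket are made up for illustration; the prefixes are the ones listed above. It only classifies key names, so it works on any iterable of strings.

```python
from collections import Counter
from typing import Iterable, Sequence


def count_by_prefix(keys: Iterable[str], prefixes: Sequence[str]) -> Counter:
    """Count keys per prefix; keys matching no prefix go to 'other'."""
    counts: Counter = Counter()
    for key in keys:
        for prefix in prefixes:
            # startswith is case-sensitive, so "RS_" and "rs_first_" stay separate
            if key.startswith(prefix):
                counts[prefix] += 1
                break
        else:
            counts["other"] += 1
    return counts


# Assumption: with the redis-py package installed, the keys could be fed in
# via SCAN (which iterates incrementally instead of blocking like KEYS):
#
#   import redis
#   r = redis.Redis(host="127.0.0.1", port=6379, decode_responses=True)
#   print(count_by_prefix(r.scan_iter(count=1000), ["RS_", "RR:", "rs_first_"]))
```

On a database with about a million keys the scan takes a while, but it should not stall Redis the way a single KEYS call can.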
Other data:
127.0.0.1:6379> debug object BAYES_HAM
Value at:0x7f836d192ab0 refcount:1 encoding:hashtable
serializedlength:762292662 lru:8248068 lru_seconds_idle:43
127.0.0.1:6379> debug object BAYES_SPAM
Value at:0x7f838bae6450 refcount:1 encoding:hashtable
serializedlength:138423713 lru:8248136 lru_seconds_idle:11
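To put the serializedlength values above on a readable scale, here is a small sketch that pulls the field out of the DEBUG OBJECT output and converts it to MiB. The helper names are hypothetical. Note that serializedlength is the size of the value in serialized form (as it would be written to a dump), not its footprint in memory.

```python
import re


def serialized_length(debug_output: str) -> int:
    """Extract the serializedlength field (bytes) from DEBUG OBJECT output."""
    match = re.search(r"serializedlength:(\d+)", debug_output)
    if match is None:
        raise ValueError("no serializedlength field found")
    return int(match.group(1))


def to_mib(num_bytes: int) -> float:
    """Convert bytes to MiB (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


# The two outputs quoted above, joined back into single lines.
ham = serialized_length(
    "Value at:0x7f836d192ab0 refcount:1 encoding:hashtable "
    "serializedlength:762292662 lru:8248068 lru_seconds_idle:43"
)
spam = serialized_length(
    "Value at:0x7f838bae6450 refcount:1 encoding:hashtable "
    "serializedlength:138423713 lru:8248136 lru_seconds_idle:11"
)

print(f"BAYES_HAM:  {to_mib(ham):.1f} MiB")
print(f"BAYES_SPAM: {to_mib(spam):.1f} MiB")
print(f"total:      {to_mib(ham + spam):.1f} MiB")
```

Together the two hashes come to about 900 MB in serialized form, which would be consistent with the almost 1 GB database mentioned earlier.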
Configuration in /etc/rspamd/local.d/statistic.conf:
classifier "bayes" {
    expire = 100d;
    new_schema = true;
    tokenizer {
        name = "osb";
    }
    # Minimum number of words required for statistics processing
    min_tokens = 11;
    # Minimum learn count for both spam and ham classes to perform classification
    min_learns = 200;
    backend = "redis";
    autolearn = [-4, 10];
    statfile {
        symbol = "BAYES_HAM";
        spam = false;
    }
    statfile {
        symbol = "BAYES_SPAM";
        spam = true;
    }
}
Btw:
rspamd 3.4
Redis 5.0.14