[Rspamd-Users] Email hits BAYES_* after a few times

Sun Jun 2 20:33:30 UTC 2019

> On 2 Jun 2019, at 22:17, Tim Harman via Users <users at lists.rspamd.com> wrote:
> 
> On 03/06/2019 6:47 am, Sophie Loewenthal wrote:
>> Hi,
>> For some reason, emails that come in more than twice start hitting the
>> BAYES_* rules, even though they were never processed by 'rspamc
>> learn_spam' or 'rspamc learn_ham', so manual learning can be ruled out.
>> How does this email get into BAYES when I didn’t feed any emails from
>> the sender into rspamc learn_spam?
> 
> <snip>
> 
>> It’s a bit rum: how could I investigate this?
>> Thanks, Sophie
> 
> What does "rspamadm configdump classifier" tell you?
> Probably you have autolearn enabled, thus rspamd is automatically learning your ham/spam.
> 
> Suggested Reading: https://rspamd.com/doc/configuration/statistic.html

Hi Tim,

I thought autolearn was disabled, unless it’s on by default. As far as I know, I don’t have autolearn = true in my config. Bayes shouldn’t be autolearning, and configdump didn’t shed any light.

# rspamadm configdump classifier
*** Section classifier ***
bayes {
    backend = "sqlite3";
    min_tokens = 11;
    languages_enabled = true;
    cache {
        path = "/var/lib/rspamd/learn_cache.sqlite";
    }
    statfile {
        path = "/var/lib/rspamd/bayes.ham.sqlite";
        spam = false;
        symbol = "BAYES_HAM";
    }
    statfile {
        path = "/var/lib/rspamd/bayes.spam.sqlite";
        spam = true;
        symbol = "BAYES_SPAM";
    }
    tokenizer {
        name = "osb";
    }
    learn_condition = <<EOD
return function(task, is_spam, is_unlearn)
  local learn_type = task:get_request_header('Learn-Type')

  if not (learn_type and tostring(learn_type) == 'bulk') then
    local prob = task:get_mempool():get_variable('bayes_prob', 'double')

    if prob then
      local in_class = false
      local cl
      if is_spam then
        cl = 'spam'
        in_class = prob >= 0.95
      else
        cl = 'ham'
        in_class = prob <= 0.05
      end

      if in_class then
        return false,string.format('already in class %s; probability %.2f%%',
          cl, math.abs((prob - 0.5) * 200.0))
      end
    end
  end

  return true
end
EOD;
    min_learns = 200;
}

*** End of section classifier ***
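
A quick way to double-check a dump like the one above for an autolearn setting is to filter it rather than read it by eye. The following is only a minimal sketch: has_autolearn is a hypothetical helper (not part of rspamd), and it assumes the key would appear in the dump as "autolearn = ..." or "autolearn {" at the start of a line.

```shell
# has_autolearn: hypothetical helper, not shipped with rspamd.
# Reads a classifier dump on stdin and reports whether a line starts
# with an "autolearn" key (assumed form: "autolearn = ..." or "autolearn {").
has_autolearn() {
  if grep -Eq '^[[:space:]]*autolearn[[:space:]]*[={]'; then
    echo "autolearn key present"
  else
    echo "no autolearn key in dump"
  fi
}

# Usage (assumes rspamadm is on PATH):
#   rspamadm configdump classifier | has_autolearn
```

Run against the dump above, this should report that no autolearn key is present, since the section only lists backend, min_tokens, languages_enabled, cache, statfile, tokenizer, learn_condition and min_learns.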