[Rspamd-Users] Multimap and syntax...

G.W. Haywood rspamd at jubileegroup.co.uk
Wed Feb 28 19:16:31 UTC 2024


Hi there,

On Wed, 28 Feb 2024, Gerald Galster wrote:
> On Wed, 28 Feb 2024, christian via Users wrote:
> 
>> I use the filter email:domain:tld which, according to the docs
>> ("<user at foo.example.com> -> example.com"), only returns the domain.
>> So I enter e.g. aok.de in my map. But what about canford.co.uk? Is
>> co.uk then considered a TLD, or a domain and a TLD?

The documentation at

https://rspamd.com/doc/modules/multimap.html#from-rcpt-and-header-filters

is very confusing on these issues and your problem is understandable.

If you look at

https://rspamd.com/doc/modules/multimap.html#helo-hostname-filters

you'll see that the term 'tld' in the rspamd configuration is distinct
from the term 'top'.  To me this looks like an afterthought.  But even
if this were not so

> Rspamd includes the public suffix list (see https://publicsuffix.org/list/).
> https://github.com/rspamd/rspamd/blob/master/contrib/publicsuffix/effective_tld_names.dat

keeping lists like this current is an onerous task.  I wouldn't want
to (a) rely on that currency and (b) let my configuration be changed
without my approval when the list is updated - perhaps in ways which I
would not myself have chosen.

For example, looking at the .uk TLD, rspamd and Wikipedia disagree on
second level domains.  Compared with

https://en.wikipedia.org/wiki/.uk#Second-level_domains

the list at .../effective_tld_names.dat is missing at least these:

     .bl.uk – used solely for the British Library
     .judiciary.uk – judiciary of England and Wales
     .mod.uk – armed forces and Ministry of Defence establishments and systems
     .nic.uk – network use only (reserved exclusively for Nominet UK)
     .parliament.uk – Parliament of the United Kingdom and the devolved national parliaments and assemblies
     .rct.uk – used solely for the Royal Collection Trust
     .royal.uk – used solely for the royal family website
     .ukaea.uk – used solely for the United Kingdom Atomic Energy Authority

(The .sch.uk domain is debatable; it's given as "*.sch.uk" by Nominet/rspamd.)

My feeling is that in preference to "email:domain:tld" I might use
"email:domain" and decide for myself.  So I'd take on a maintenance
task, but at least I'd know who would be to blame when it all went
wrogn.
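
To make the difference concrete, here is a minimal Python sketch of the
"decide for myself" approach.  It is not rspamd code and does not claim
to reproduce rspamd's behaviour; the suffix set, the function names and
the example hostnames are all hypothetical, chosen only to show how the
answer depends on which suffixes the list contains.

```python
# Hypothetical, hand-maintained suffix set. This is NOT the real
# public suffix list, just a handful of entries for illustration.
MY_SUFFIXES = {"de", "com", "uk", "co.uk", "org.uk", "mod.uk"}


def email_domain(address):
    """Return the lower-cased domain part of an email address."""
    return address.rsplit("@", 1)[1].lower()


def registrable_domain(domain, suffixes=MY_SUFFIXES):
    """Return the longest known suffix plus one more label, or None.

    None means either that no suffix is known for the name or that
    the name is itself a suffix (e.g. "co.uk").
    """
    labels = domain.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                return None  # the whole name is a suffix
            return ".".join(labels[i - 1:])
    return None


if __name__ == "__main__":
    print(registrable_domain(email_domain("user@foo.example.com")))
    # example.com
    print(registrable_domain("mail.canford.co.uk"))
    # canford.co.uk
    print(registrable_domain("www.army.mod.uk"))
    # army.mod.uk
    # Same name, but with "mod.uk" missing from the list:
    print(registrable_domain("www.army.mod.uk", MY_SUFFIXES - {"mod.uk"}))
    # mod.uk
```

The last two calls are the point: the same hostname yields a different
key for the map depending on whether "mod.uk" is in the suffix list, so
whoever maintains the list decides what your map entries match.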

>> Should I do this with regex or not?
>
> With hyperscan enabled you can use lots of regexes without performance penalty.
> On the other hand you need to be familiar with regular expressions and be exact.
> Given the problems you currently have I don't recommend it because it's harder to debug.

Agreed.  I've used regexes almost daily for decades, and on occasion I
still find myself staring at one for hours before I finally figure out
what I've done wrong.  Sometimes you have to write code to debug them.
In addition it can take a bit of experience to avoid some pitfalls; if
you aren't careful you can easily craft a regex which will perform a
denial of service attack on your own system if somebody just sends a
big image file.
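
As an illustration of that last pitfall, here is a small Python sketch.
It uses Python's own backtracking 're' module, not hyperscan or rspamd,
and the two patterns are textbook examples that I have picked for the
purpose.  Both match exactly the same strings, but the first nests one
quantifier inside another, so when a long input fails to match the
engine tries a very large number of ways to split it up.

```python
import re
import time

# Nested quantifiers: a run of 'a' can be split between the inner
# and outer '+' in many ways, and all of them get tried on failure.
RISKY = re.compile(r"^(a+)+$")

# Matches the same strings with a single quantifier.
SAFER = re.compile(r"^a+$")


def time_match(pattern, text):
    """Return (matched, seconds taken) for one match attempt."""
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    return matched, time.perf_counter() - start


def demo(lengths=(10, 14, 18)):
    """Print timings for inputs that almost match but fail at the end."""
    for n in lengths:
        text = "a" * n + "b"  # the trailing 'b' forces a failure
        _, risky = time_match(RISKY, text)
        _, safer = time_match(SAFER, text)
        print(f"n={n:2d}  risky={risky:.6f}s  safer={safer:.6f}s")


if __name__ == "__main__":
    demo()
```

Keep the lengths small if you try this.  With the risky pattern each
extra character roughly doubles the work, which is why a large message
body can be enough to tie up a machine.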

>> ... In my current white domain list I have around 2000
>> entries. Could it be that there are too many?
>
> Generally speaking, no.

Again I'd agree with Mr. Galster: rspamd can handle it.  But I'd go
further and ask *why* you have 2,000 entries.  For what I take to be
a relatively new installation that seems to me like an awful lot, and
I wonder if it's a symptom of something.  Perhaps it's that your spam
rules are catching more than they really ought to?  If you
try to get around woolly spam rules which catch things that they
shouldn't catch by whitelisting everything then you're building
brokenness into your configuration.  Inevitably this becomes difficult
to cleanse other than by throwing away the baby with the bathwater.

> ... if you add e.g. adidas.com to your whitelist, any spammer that
> sends with @adidas.com is probably whitelisted due to score -20.

If you rely on the address in the 'From:' header, then unless you have
some other way of knowing that it's not forged you're more or less
obliged to check that it's vouched for by a DKIM signature.  This is
unlike the envelope 'from' address, which (apart, obviously, from all
the freemail domains) you can usually trust if SPF gives it the OK.
You'll find legitimate senders who can't get SPF right, but these days
their numbers are shrinking.
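
The rule of thumb above can be sketched in a few lines of Python.  This
is not how rspamd implements anything; the function names, the
whitelist contents and the idea of passing in already-computed DKIM and
SPF results are all assumptions made for the illustration, and the
domain comparison is deliberately simplified to an exact match.

```python
# Hypothetical whitelist of sender domains.
WHITELIST = {"adidas.com"}


def header_from_trusted(header_from_domain, dkim_pass_domains,
                        whitelist=WHITELIST):
    """Trust a whitelisted 'From:' header domain only if a valid DKIM
    signature from that same domain vouches for the message."""
    return (header_from_domain in whitelist
            and header_from_domain in dkim_pass_domains)


def envelope_from_trusted(envelope_from_domain, spf_pass,
                          whitelist=WHITELIST):
    """Trust a whitelisted envelope 'from' domain only if SPF passed."""
    return envelope_from_domain in whitelist and spf_pass


if __name__ == "__main__":
    # Forged header: whitelisted domain, but no DKIM signature for it.
    print(header_from_trusted("adidas.com", set()))           # False
    # Same header domain, vouched for by a passing DKIM signature.
    print(header_from_trusted("adidas.com", {"adidas.com"}))  # True
    # Envelope sender from a whitelisted domain, SPF failed.
    print(envelope_from_trusted("adidas.com", False))         # False
```

The design point is simply that the whitelist entry alone never grants
the negative score; it has to be paired with an authentication result
for the same domain.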

-- 

73,
Ged.

