[Rspamd-Users] multimap and header lines

Gerald Galster list+rspamd at gcore.biz
Tue Nov 19 19:01:51 UTC 2024


> Thanks for your reply. To start, I'm a bit puzzled by the debug log:
> 
> On 19-11-2024 01:08, Gerald Galster wrote:
> [...]
>> You may enable multimap debugging in local.d/logging.inc:
>> debug_modules=['multimap'];
>> Then you will see the input that will be searched by your multimap regexes.
>> type = "content"; and filter = "headers"; receives undecoded headers.
>> (https://rspamd.com/doc/modules/multimap.html#content-filters)
>> It might look like this:
>> rspamd[1927161]: <4Xxk1b>; multimap; multimap.lua:563: check value Received: from localhost (localhost.localdomain [127.0.0.1])\x0A\x09by mx1.example.com (Postfix) with ESMTP id 4X2kAV5t2qzvR7n\x0A\x09for <user at example.com>; Tue, 19 Nov 2024 00:18:15 +0100 (CET)\x0D\x0AX-Virus-Scanned: amavisd-new at example.com\x0D\x0AX....
>> So this is one long string containing ALL headers in undecoded form.
> 
> ... it shows <LF> 0x0a between multiline headers and <CR><LF> 0x0d 0x0a between headers - which looks like the headers actually /are/ normalized? Because otherwise, a proper CRLF would be available between every header line, wouldn't it?

Unfolding a multiline header is typically done by removing each <CR><LF> that is
immediately followed by <WHITESPACE>, yes. Only <CR><LF> separates headers, so
technically a header field body including <LF><WHITESPACE> would be part of the
whole header line.
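
To illustrate the point, here is a minimal Python sketch (not rspamd code). The
sample string is only modelled on the log line quoted above, and the helper
names split_fields and unfold are hypothetical. It splits the header block on
<CR><LF> only, so the bare <LF><TAB> stays inside the Received field until it
is unfolded:

```python
import re

# Hypothetical sample modelled on the log line above: bare LF + TAB inside the
# folded Received header, CRLF between header fields.
raw_headers = (
    "Received: from localhost (localhost.localdomain [127.0.0.1])\n"
    "\tby mx1.example.com (Postfix) with ESMTP id 4X2kAV5t2qzvR7n\r\n"
    "X-Virus-Scanned: amavisd-new at example.com\r\n"
)


def split_fields(raw):
    """Split on CRLF only, the separator between header fields."""
    return [field for field in raw.split("\r\n") if field]


def unfold(field):
    """Drop a line break (CRLF or bare LF) that is followed by whitespace."""
    return re.sub(r"\r?\n(?=[ \t])", "", field)


fields = split_fields(raw_headers)
print(len(fields))              # two fields, the bare LF did not split Received
print(repr(unfold(fields[0])))  # Received field as a single line
```

Accepting a bare <LF> in unfold is a lenient choice made for this sketch; a
strict reading would only remove <CR><LF> before whitespace.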

I can't tell you at which stage (postfix/milter/network/rspamd) this might have been
altered. Perhaps you can find the time to investigate further.

<LF> may simply be used in the traditional (obsoleted) ASCII sense in this context.
https://datatracker.ietf.org/doc/html/rfc5322#section-4.1


>> /^X-Virus-Scanned: amavisd-new at/m MSEU_HWL:1.23
>> Note the /m switch, which is a regexp modifier that switches to
>> multiline mode, so that you can match the start and end of each
>> header line using the ^ and $ symbols.
> Thank you for the /m modifier, that looks like the one I was looking for. However, the regexp does not seem to apply. Regexp says:
> 
> /^X-fc9822d6-c227-4fb2-a50a-c86656e68129: yes$/m
> 
> Rspamd log:
> 
> 2024-11-19 11:39:31 #1932889(normal) <2c506b>; multimap; multimap.lua:437: check value X-fc9822d6-c227-4fb2-a50a-c86656e68129: yes\x0D\x0ATo: "'Valentijn Sessink'" <valentijn at sessink.nl>\x0D\x0ASubject:
> 
> It looks a bit like the /m modifier makes <LF> the line ending, instead of the proper <CR><LF>?

https://perldoc.perl.org/perlre#Metacharacters

$   Match the end of the string (or before newline at the end of the
    string; or before any newline if /m is used)

Try /^X-fc9822d6-c227-4fb2-a50a-c86656e68129: yes\r?$/m
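
The effect can be reproduced outside rspamd. The following is a small sketch
using Python's re module, which treats $ in multiline mode the same way as the
Perl documentation quoted above (it matches before "\n" only). Using Python
instead of rspamd's own regexp engine is an assumption made purely for
demonstration, and the sample string is modelled on the log line above:

```python
import re

# Sample modelled on the log line above; CRLF separates the header fields.
headers = (
    "X-fc9822d6-c227-4fb2-a50a-c86656e68129: yes\r\n"
    "To: \"'Valentijn Sessink'\" <valentijn at sessink.nl>\r\n"
)

# Original pattern: "$" would have to match directly after "yes".
strict = re.compile(r"^X-fc9822d6-c227-4fb2-a50a-c86656e68129: yes$", re.M)

# Suggested pattern: allow an optional "\r" before the end of the line.
tolerant = re.compile(r"^X-fc9822d6-c227-4fb2-a50a-c86656e68129: yes\r?$", re.M)

# No match: "\r" sits between "yes" and "\n", so "$" cannot match after "yes".
print(strict.search(headers))

# Match: "\r?" consumes the carriage return, then "$" matches before "\n".
print(tolerant.search(headers) is not None)
```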


> (I'm aware that I probably won't need the line end for this very purpose, but I'm also trying to understand how the matching works).
> 
> [...]
>> Or you could write a lua rule:
>> https://rspamd.com/doc/modules/regexp.html#regular-expressions
> 
> The reason for not using the regexp module is that, from what I understood, I had to write a lua module which would mean reloading rspamd every time a new regexp would be added. I was looking for a simple file with regexps that I could match to just block mails, and multimap looked like the best candidate for that.


Yes, go with multimaps. I just wanted to show there are other options, depending on individual usage.

Best regards,
Gerald



