[Rspamd-Users] Block emails with Chinese and Cyrillic characters in the subject

Steve Witten stevewi.niteflyte at gmail.com
Wed Aug 6 14:08:19 UTC 2025


Here's what I do.

X_LOCAL_CONTENT_BL {
    type = "content";
    filter = "full";
    map = "$LOCAL_CONFDIR/local.d/maps.d/blacklist/content.map";
    description = "Local content blacklist.";
    regexp = true;
    score = 10000;
}


Here are the relevant regexps in
$LOCAL_CONFDIR/local.d/maps.d/blacklist/content.map:

# Non-Western charsets
#
/(=\?|charset\=)big5/i                  # Big5 (Traditional Chinese)
/(=\?|charset\=)euc-kr/i                # EUC-KR (Extended Unix Code, Korean)

/(=\?|charset\=)windows-1251/i          # Windows-1251 (Cyrillic)
/(=\?|charset\=)koi8-r/i                # KOI8-R (Cyrillic)
/(=\?|charset\=)gb2312/i                # GB2312 (Simplified Chinese)
/(=\?|charset\=)ks_c_5601-1987/i        # KS C 5601-1987 (Korean Hangul)


Big5 and GB2312 are the Traditional and Simplified Chinese charsets,
respectively. Here's a longer list of character set names:

https://www.lsoft.com/manuals/maestro/4.0/htmlhelp/data%20administrator/CharacterSets.html
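
A quick way to sanity-check charset patterns like these offline is a small
Python sketch. It uses Python's `re` module as a stand-in for Rspamd's regex
engine, so it only approximates what the map does. The sample headers and the
helper names (`build`, `hits`, `TOLERANT`) are made up for illustration. The
sketch also shows one gap worth knowing about: as posted, the patterns do not
match a quoted parameter such as charset="Big5". The `TOLERANT` variant, which
allows an optional double quote, is an assumption of this sketch and not part
of the map above.

```python
import re

# Charset names taken from the map above.
CHARSETS = ["big5", "euc-kr", "windows-1251", "koi8-r", "gb2312", "ks_c_5601-1987"]


def build(prefix):
    """Compile one case-insensitive pattern per charset name."""
    return [re.compile(prefix + re.escape(name), re.IGNORECASE) for name in CHARSETS]


# Same prefix as in the posted map.
AS_POSTED = build(r'(=\?|charset\=)')
# Hypothetical variant: also accept charset="..." with a double quote.
TOLERANT = build(r'(=\?|charset\="?)')


def hits(patterns, text):
    """Return True if any pattern is found anywhere in the text."""
    return any(p.search(text) for p in patterns)


# Made-up sample header lines.
encoded_word = "Subject: =?KOI8-R?B?8NLJ18XU?="
bare_param = "Content-Type: text/plain; charset=gb2312"
quoted_param = 'Content-Type: text/html; charset="Big5"'
utf8_subject = "Subject: =?UTF-8?B?0J/RgNC40LLQtdGC?="

print(hits(AS_POSTED, encoded_word))   # True
print(hits(AS_POSTED, bare_param))     # True
print(hits(AS_POSTED, quoted_param))   # False: the quote sits between "=" and the name
print(hits(TOLERANT, quoted_param))    # True
print(hits(AS_POSTED, utf8_subject))   # False: UTF-8 is not in the list
```

Whether the quoted form matters in practice depends on the mail you receive;
many senders do quote the charset parameter.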

Note that this DOES NOT block UTF-8, which can also contain Asian and
Cyrillic characters, so this method is not foolproof. However, I don't get
much of that, so the above has worked for me for several years.
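
To illustrate why the charset name alone misses UTF-8 mail, here is a minimal
Python sketch that decodes an RFC 2047 encoded Subject and then looks at the
decoded code points. It is not Rspamd configuration, only a way to see the
idea. The code point ranges (U+0400 to U+04FF for Cyrillic, U+4E00 to U+9FFF
for CJK ideographs) are the ones used in the original question quoted below;
the function names and sample subjects are made up for this example.

```python
from email.header import decode_header, make_header


def decode_subject(raw):
    """Decode RFC 2047 encoded-words into a plain Unicode string."""
    return str(make_header(decode_header(raw)))


def scripts_in(text):
    """Return the set of flagged scripts found in the decoded text."""
    found = set()
    for ch in text:
        cp = ord(ch)
        if 0x0400 <= cp <= 0x04FF:      # Cyrillic block
            found.add("cyrillic")
        elif 0x4E00 <= cp <= 0x9FFF:    # CJK Unified Ideographs block
            found.add("cjk")
    return found


# Made-up subjects: the same Russian word in UTF-8 and in KOI8-R,
# a Chinese greeting in UTF-8, and a plain ASCII subject.
print(scripts_in(decode_subject("=?UTF-8?B?0J/RgNC40LLQtdGC?=")))  # {'cyrillic'}
print(scripts_in(decode_subject("=?KOI8-R?B?8NLJ18XU?=")))         # {'cyrillic'}
print(scripts_in(decode_subject("=?UTF-8?B?5L2g5aW9?=")))          # {'cjk'}
print(scripts_in(decode_subject("Meeting notes")))                 # set()
```

The point of the design is that the check runs on decoded text, so the
transport charset no longer matters. Separately, PCRE-style engines generally
write code points as \x{0400} rather than \u0400, so the escapes in the maps
quoted further down may be worth checking against the Rspamd regexp
documentation; that is a guess from the syntax, not something tested here.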

I use this map to block any kind of unwanted content.  For example:

/.*norton\s*(360)?.*/i                     # Mentions Norton (360)
/.*m(a)?c\s*afee.*/i                       # Mentions McAfee
/.*geek\s*squad.*/i                        # Mentions Geek Squad
/.*m(icro)?((\-|\s)*)?s(oft)?\s*office.*/i # Mentions Microsoft Office
/.*office\s*365.*/i                        # Mentions Office 365
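
Before loading patterns like these into a live map, it can help to try them
against a few sample strings. The sketch below does that with Python's `re`
module, which is close to but not identical to the engine Rspamd uses, so
treat the results as indicative. The labels and sample strings are made up.
It also shows that, under search semantics, the pattern with optional
`(360)?` matches any mention of "norton", including unrelated ones.

```python
import re

# Patterns copied from the map above; the labels are hypothetical.
PATTERNS = {
    "norton": r".*norton\s*(360)?.*",
    "mcafee": r".*m(a)?c\s*afee.*",
    "geek-squad": r".*geek\s*squad.*",
    "ms-office": r".*m(icro)?((\-|\s)*)?s(oft)?\s*office.*",
    "office-365": r".*office\s*365.*",
}
COMPILED = {label: re.compile(p, re.IGNORECASE) for label, p in PATTERNS.items()}


def first_label(text):
    """Return the label of the first pattern found in the text, or None."""
    for label, pattern in COMPILED.items():
        if pattern.search(text):
            return label
    return None


print(first_label("Your Norton 360 subscription has expired"))  # norton
print(first_label("McAfee renewal notice"))                     # mcafee
print(first_label("MS Office license key"))                     # ms-office
print(first_label("Renew Office 365 today"))                    # office-365
print(first_label("Norton Street block party"))                 # norton (false positive)
print(first_label("Lunch on Friday?"))                          # None
```

With an unanchored search the leading and trailing `.*` do not change what
matches, so a bare /norton/i would behave the same way in this sketch.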


I hope this helps...

Regards,

Steve Witten
Portland, OR

On Wed, Aug 6, 2025 at 2:29 AM Andreas <rspamd at linuxmaker.com> wrote:

> Hello everyone,
>
> I want to use Rspamd to block emails whose subject line contains Chinese
> or Cyrillic characters. The reason is that such emails are unreadable in
> our environment and usually contain unwanted content.
>
> I have defined the following rules for this:
>
> BANNED_CYRILLIC {
>         type = "header";
>         header = "Subject";
>         filter = "regexp";
>         map = "${LOCAL_CONFDIR}/local.d/maps.d/banned_cyrillic.map";
>         symbol = "BANNED_CYRILLIC";
>         description = "Subject contains Cyrillic characters";
>         action = "reject";
> }
>
> BANNED_CHINESE {
>         type = "header";
>         header = "Subject";
>         filter = "regexp";
>         map = "${LOCAL_CONFDIR}/local.d/maps.d/banned_chinese.map";
>         symbol = "BANNED_CHINESE";
>         description = "Subject contains Chinese characters";
>         action = "reject";
> }
>
> The associated maps show:
>
> banned_chinese.map:
> /[\u4E00-\u9FFF]/u
>
> banned_cyrillic.map:
> /[\u0400-\u04FF]/u
>
> Unfortunately, this doesn't work as expected. Emails containing these
> characters aren't reliably blocked.
>
> Does anyone have a tip on how to reliably implement this with Rspamd?
> Perhaps I'm doing something wrong with the regular expressions or the map
> configuration.
>
> Thanks in advance!
>
> Andreas

