From usenet at schani.com  Fri Mar  1 10:54:44 2024
From: usenet at schani.com (christian)
Date: Fri, 1 Mar 2024 11:54:44 +0100
Subject: [Rspamd-Users] Multimap and syntax...
In-Reply-To: <5FDD5A1F-DF46-4E34-8761-7F10B0C96C5E@gcore.biz>
References: <03bdf4ff-f8c1-4706-9df1-862397d977d3@schani.com>
 <5FDD5A1F-DF46-4E34-8761-7F10B0C96C5E@gcore.biz>
Message-ID: <58ada3d4-c345-42b0-b5fe-e37e78ab3c03@schani.com>

Hello,

I have attached my config dump file here: 
https://www.leicht.info/rspamd-dump.txt

Rspamd 3.8.4
rspamadm configtest
syntax OK

When I take a closer look at your answers, it seems that the incoming 
filtering is mainly done by Bayes and you train this filter. The 
decisive factor is the score of an email as to whether it is listed as 
spam or ham in the Bayes filter.

I completely deleted the redis entries for rspamd and started learning 
from scratch. But after a few hours I have a large surplus of Ham 
entries - about 100:10. I don't think that's the point of the matter. 
After one day I have 5000 BAYES_HAM entries and 600 BAYES_SPAM.

But when I look at spam emails that get through, BAYES_SPAM/HAM is not 
checked at all.

Here is an example of Spam:

X-Spamd-Result: default: False [0.81 / 30.00];
	R_DKIM_ALLOW(1.11)[gexton.us:s=root];
	MX_INVALID(0.50)[];
	DMARC_POLICY_ALLOW(-0.50)[gexton.us,reject];
	R_SPF_ALLOW(-0.20)[+ip4:209.141.51.0/24];
	MIME_GOOD(-0.10)[multipart/alternative,text/plain];
	RCPT_COUNT_ONE(0.00)[1];
	DKIM_TRACE(0.00)[gexton.us:+];
	ASN(0.00)[asn:53667, ipnet:209.141.32.0/19, country:US];
	MIME_TRACE(0.00)[0:+,1:+,2:~];
	MISSING_XM_UA(0.00)[];
	SPF_REPUTATION_SPAM(0.00)[0.78822517302659];
	DKIM_REPUTATION(0.00)[0.78822517302659];
	HAS_WP_URI(0.00)[];
	GENERIC_REPUTATION(0.00)[0.78822517302659];
	FROM_EQ_ENVFROM(0.00)[];
The sender email is on my multimap blacklist, yet the result shows no 
multimap test and no Bayes test.


Here is an example of a non-spam:
X-Spamd-Result: default: False [1.87 / 30.00];
	INFO_TO_INFO_LU(2.00)[];
	SUBJECT_HAS_CURRENCY(1.00)[];
	DMARC_POLICY_ALLOW(-0.50)[unitedplugins.com,reject];
	R_DKIM_ALLOW(-0.20)[unitedplugins.com:s=mailjet];
	R_SPF_ALLOW(-0.20)[+ip4:185.250.236.0/22];
	MAILLIST(-0.11)[generic];
	MIME_GOOD(-0.10)[multipart/alternative,text/plain];
	MX_GOOD(-0.01)[];
	HAS_LIST_UNSUB(-0.01)[];
	DKIM_TRACE(0.00)[unitedplugins.com:+];
	RCPT_COUNT_ONE(0.00)[1];
	TO_MATCH_ENVRCPT_ALL(0.00)[];
	SPF_REPUTATION_HAM(0.00)[-0.51883337370734];
	IP_REPUTATION_HAM(0.00)[asn: 200069(-0.21), country: FR(0.00), ip: 185.250.237.60(0.00)];

I trained the email as HAM. But no BAYES entry appears. In addition, the 
domain is in a multimap whitelist which is also not displayed. The email 
is accepted, but only just.


On 28.02.2024 at 15:15, Gerald Galster wrote:

> Rspamd includes the public suffix list (see https://publicsuffix.org/list/).
> https://github.com/rspamd/rspamd/blob/master/contrib/publicsuffix/effective_tld_names.dat

Ok, then I don't have to worry about the multiple TLDs. Rspamd does this 
automatically.

> 
> Try to be more precise when reading the documentation.
> 

Unfortunately, the documentation is very confusing and not well 
structured. It is hard to see how the pieces connect.


> Just a hint: if you add e.g. adidas.com to your whitelist, any spammer that sends with @adidas.com is probably whitelisted due to score -20.
> I'd rather train rspamd to filter spam and use those maps to assist learning. Otherwise a spammail with an added score of -20 will probably be learned as ham, which can ruin your bayes filter.


Shouldn't an email that does not actually come from adidas.com be 
checked further and be assessed as phishing instead? A check against 
DKIM and MX makes it clear that the email doesn't really come from 
adidas.com, right? OK, maybe -20 is a bit much.

But what always surprises me is that my multimaps sometimes work and 
then do not work for the next email, and that I can see the Bayes 
statistics counting up for incoming emails while no Bayes check is 
displayed in the email headers.

Please don't be mad at me for my stupid questions, but I want to learn this.

Thanks
Christian


From albrecht.backhaus at gmail.com  Sat Mar  2 16:05:51 2024
From: albrecht.backhaus at gmail.com (Albrecht Backhaus)
Date: Sat, 2 Mar 2024 17:05:51 +0100
Subject: [Rspamd-Users] Problems with dmarc reports
Message-ID: <0bd85a43-bad5-4da7-971f-580fb5fbd7ad@gmail.com>

Hi there

I am trying to set up DMARC reporting properly. I followed the 
description on 
https://rspamd.com/doc/modules/dmarc.html

When executing "rspamadm dmarc_report" I get a bunch of error messages 
(partially redacted for privacy reasons) in the terminal:

> Couldn't send mail for github.com: error on stage connect: IO read 
> error while trying to read data: Connection refused
> Couldn't send mail for xxxx.de: error on stage connect: IO read error 
> while trying to read data: Connection refused
> Couldn't send mail for yyyyy.de: error on stage connect: IO read error 
> while trying to read data: Connection refused
> Couldn't send mail for zzzzzz.com: error on stage connect: IO read 
> error while trying to read data: Connection refused
> Couldn't send mail for xxx-yyyyy.com: error on stage connect: IO read 
> error while trying to read data: Connection refused
> Couldn't send mail for zzzz.domain.tld: error on stage connect: IO 
> read error while trying to read data: Connection refused
> Reporting collection has finished 1 dates processed, 6 reports: 0 
> completed, 6 failed

I then tried to find any related log entries in the rspamd logs in 
general or in any other system log or mail log. There is nothing to find 
which is related to the mentioned issue.

I am now looking for advice on what the error messages really mean and 
where I can start to solve the problem.

My second problem is that the docs say:

> Starting from Rspamd 3.0, the recommended way to send DMARC reports is 
> to use the "rspamadm dmarc_report" command with cron or systemd timers. 
> Depending on the amount of traffic, this should be scheduled either 
> daily or hourly.

When I execute "rspamadm dmarc_report" a second time on the same day (it 
was today), I get the message: "No reports for 20240301". It seems that 
the command processes all reports for the previous day in a single run 
and that the data for these reports is gone afterwards. I have not found 
any setting related to this. If that is true, it would not make sense to 
run this command on an hourly basis. Am I missing or misunderstanding 
something?

Any help is highly appreciated.

Regards, Albrecht

From rspamd at jubileegroup.co.uk  Sun Mar  3 11:32:34 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Sun, 3 Mar 2024 11:32:34 +0000 (GMT)
Subject: [Rspamd-Users] Problems with dmarc reports
In-Reply-To: <0bd85a43-bad5-4da7-971f-580fb5fbd7ad@gmail.com>
References: <0bd85a43-bad5-4da7-971f-580fb5fbd7ad@gmail.com>
Message-ID: <3b4be591-4e38-9679-673c-8852748fa25b@jubileegroup.co.uk>

Hi there,

On Sat, 2 Mar 2024, Albrecht Backhaus wrote:

> I try to setup dmarc reporting properly. I followed the description on 
> https://rspamd.com/doc/modules/dmarc.html

In that document it says

" A working MTA running on a specific host is required to send the
reports. Ideally, the local MTA should allow email to be sent without
authentication or SSL."

> ...
> Couldn't send mail for github.com: error on stage connect: IO read error 
> while trying to read data: Connection refused
> ...

It looks like you don't have a working MTA listening for the connections
which will send the DMARC reports.  If you think that my suggestion isn't
correct, please supply more detailed information.
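
One way to test the "no MTA listening" hypothesis without guessing is to attempt a plain TCP connection to the SMTP host and port that the DMARC reporting is configured to use. The sketch below is a generic Python check, not part of Rspamd. The host 127.0.0.1 and port 25 are assumptions and should be replaced with the values from your own configuration.

```python
import socket

def smtp_port_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # ConnectionRefusedError (nothing listening) and timeouts both land here.
        return False

if __name__ == "__main__":
    # Assumed values; use the SMTP host/port from your own configuration.
    print(smtp_port_reachable("127.0.0.1", 25))
```

If this prints False for the configured host and port, the refusal happens before any MTA sees the connection, which would also explain why nothing shows up in the mail logs.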

You could try to send a DMARC report to me.

If you'll let me know from where I can expect the connection (the IP
address and the MTA HELO name), and very roughly when I can expect it,
then I can look in the mail logs for any attempt to send the report.

My guess is that the 'connection refused' is from your own MTA (or you
don't have an MTA listening for the reports) and that I'll see nothing.

-- 

73,
Ged.

From usenet at schani.com  Sun Mar  3 12:54:06 2024
From: usenet at schani.com (christian)
Date: Sun, 3 Mar 2024 13:54:06 +0100
Subject: [Rspamd-Users] Question about Bayes / Statistic
Message-ID: <cc67a99e-54e5-4f76-9e7f-89339dae369d@schani.com>

Hello,
I have a question about Bayes, i.e. the integrated statistical function 
of Rspamd.
What data is used for the evaluation? Only email body components or 
words that are weighted, or header components such as sender, recipient, 
Dmarc, RBL results, etc.

The standard setting for Bayes_ham and Bayes_spam is only -2 and +2, 
which, however, has a very low impact on a test. Does Bayes have an 
impact in other ways?
Thanks for info
Christian

From albrecht.backhaus at gmail.com  Sun Mar  3 16:07:52 2024
From: albrecht.backhaus at gmail.com (Albrecht Backhaus)
Date: Sun, 3 Mar 2024 17:07:52 +0100
Subject: [Rspamd-Users] Problems with dmarc reports
In-Reply-To: <3b4be591-4e38-9679-673c-8852748fa25b@jubileegroup.co.uk>
References: <0bd85a43-bad5-4da7-971f-580fb5fbd7ad@gmail.com>
 <3b4be591-4e38-9679-673c-8852748fa25b@jubileegroup.co.uk>
Message-ID: <82084d47-35e9-4bc8-ba63-18cc6679fd25@gmail.com>

*From:* G.W. Haywood <rspamd at jubileegroup.co.uk>
*Sent:* Sunday, 03.03.2024 - 12:32
*To:* User questions <users at lists.rspamd.com>
*CC:* Albrecht Backhaus <albrecht.backhaus at gmail.com>
*Subject:* Re: [Rspamd-Users] Problems with dmarc reports
> Hi there,
>
> On Sat, 2 Mar 2024, Albrecht Backhaus wrote:
>
>> I try to setup dmarc reporting properly. I followed the description 
>> on https://rspamd.com/doc/modules/dmarc.html
>
> In that document it says
>
> " A working MTA running on a specific host is required to send the
> reports. Ideally, the local MTA should allow email to be sent without
> authentication or SSL."
>
I have seen this already as well. The wording "Ideally... " is not very 
helpful - it would be desirable to find a clear statement of what works 
and what does not.
But anyway - there is no entry in the mail logs of my mta - so no 
rejected attempt to send an email.

>> ...
>> Couldn't send mail for github.com: error on stage connect: IO read 
>> error while trying to read data: Connection refused
>> ...
>
> It looks like you don't have a working MTA listening for the connections
> which will send the DMARC reports.  If you think that my suggestion isn't
> correct, please supply more detailed information.

See my statement above. If that were the case, there would be log 
entries documenting rejected attempts to access the MTA.

>
> You could try to send a DMARC report to me.
>
> If you'll let me know from where I can expect the connection (the IP
> address and the MTA HELO name), and very roughly when I can expect it,
> then I can look in the mail logs for any attempt to send the report.
>
I don't think this is a good idea. It is also a mystery to me how this 
should work at all. The sending of dmarc reports (valid recipients for 
specific domains etc.) is regulated via corresponding DNS entries for 
the domains concerned. I don't know how I could "manipulate" your 
recipient data into these DNS entries for emails already received from 
different domains on my mail server - especially as I have no access to 
DNS entries for other domains ....

> My guess is that the 'connection refused' is from your own MTA (or you
> don't have an MTA listening for the reports) and that I'll see nothing.
>
"Guessing" is exactly what I wanted to avoid, hence my question if 
anyone can tell me exactly what the messages of the "rspamadm 
dmarc_report" command mean and where this is documented. If it is 
unclear or nobody knows, it might be a good idea to supplement the 
documentation accordingly.


From rspamd at jubileegroup.co.uk  Sun Mar  3 17:14:05 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Sun, 3 Mar 2024 17:14:05 +0000 (GMT)
Subject: [Rspamd-Users] Problems with dmarc reports
In-Reply-To: <82084d47-35e9-4bc8-ba63-18cc6679fd25@gmail.com>
References: <0bd85a43-bad5-4da7-971f-580fb5fbd7ad@gmail.com>
 <3b4be591-4e38-9679-673c-8852748fa25b@jubileegroup.co.uk>
 <82084d47-35e9-4bc8-ba63-18cc6679fd25@gmail.com>
Message-ID: <e5ed5334-6bb4-41c8-f5af-cd5aaee581de@jubileegroup.co.uk>

Hi there,

On Sun, 3 Mar 2024, Albrecht Backhaus wrote:
> ...
> ...
> But anyway - there is no entry in the mail logs of my mta - so no
> rejected attempt to send an email.

Another explanation could be that the connection attempt was not an
attempt to connect to the mail server at whose logs you are looking.
Can you say how you have configured the connection?


> G.W. Haywood wrote:
>> You could try to send a DMARC report to me.
>> 
>> If you'll let me know from where I can expect the connection (the IP
>> address and the MTA HELO name), and very roughly when I can expect it,
>> then I can look in the mail logs for any attempt to send the report.
>> 
> I don't think this is a good idea.  It is also a mystery to me how
> this should work at all.

Normally I reply to the mailing list only.  You may have noticed that
I sent my previous message both to the list and directly to you.  You
can send DMARC reports to us in the same way that Amazon, Gmail, Yahoo
and hundreds of other reporting organizations currently send them:

...
10:05 dmarc_reports at ausics.net          Report Domain: jubileegroup.co.uk Submitter: ausics.net ...
10:27 postmaster at amazonses.com          Dmarc Aggregate Report Domain: {jubileegroup.co.uk}  Submitter: {Amazon SES} ...
11:33 reports at fastmaildmarc.com         Report Domain: jubileegroup.co.uk Submitter: fastmail.com ...
12:32 Seznam.cz                         Report Domain: jubileegroup.co.uk Submitter: seznam.cz ...
15:00 postmaster at aegis.com              DMARC Failure Report for jubileegroup.co.uk ...
15:56 dmarc_reports                     Report Domain: jubileegroup.co.uk Submitter: acsbbs.org ...
15:59 noreply-dmarc-support at google.com  Report domain: jubileegroup.co.uk Submitter: google.com ...
16:30 dmarc-bounces at unipi.it            Report Domain: jubileegroup.co.uk Submitter: unipi.it ...
16:49 Comcast DMARC Report Generator    Report Domain: jubileegroup.co.uk Submitter: comcast.net ...
...

(Of course many mailing lists may cause DKIM failures, a few will even
cause SPF failures too; some reporters (such as aegis.com above) don't
handle that too well.  But for the purposes of testing we need not be
concerned about those issues.)

> The sending of dmarc reports (valid recipients for specific domains
> etc.) is regulated via corresponding DNS entries for the domains

Er, well, yes:

$ dig +short -t txt _dmarc.jubileegroup.co.uk
"v=DMARC1;p=none;adkim=s;aspf=s;pct=100;fo=1:d:s;rua=mailto:dmarc at jubileegroup.co.uk!2m;ruf=mailto:dmarc at jubileegroup.co.uk"


> I don't know how I could "manipulate" your recipient data into these
> DNS entries for emails already received from different domains ...

Nothing like this was suggested.
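
For readers unfamiliar with the record format: a DMARC TXT record such as the one shown above is a semicolon-separated list of tag=value pairs, and the rua tag names the mailbox for aggregate reports. The following is a minimal Python sketch of extracting that mailbox. The record is a made-up example for example.org with the same shape as the dig output, and the helper names are illustrative only.

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT record into a {tag: value} dictionary."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip()] = value.strip()
    return tags

def rua_addresses(record: str) -> list[str]:
    """Return aggregate-report mailboxes, without mailto: and size limits."""
    rua = parse_dmarc(record).get("rua", "")
    addresses = []
    for uri in rua.split(","):
        uri = uri.strip()
        if uri.startswith("mailto:"):
            # A trailing "!2m" style suffix is a maximum report size.
            addresses.append(uri[len("mailto:"):].split("!", 1)[0])
    return addresses

# Hypothetical record, shaped like the one returned by dig above.
record = "v=DMARC1;p=none;adkim=s;aspf=s;pct=100;rua=mailto:dmarc@example.org!2m"
print(rua_addresses(record))  # ['dmarc@example.org']
```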

-- 

73,
Ged.

From rspamd at jubileegroup.co.uk  Sun Mar  3 17:15:25 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Sun, 3 Mar 2024 17:15:25 +0000 (GMT)
Subject: [Rspamd-Users] Question about Bayes / Statistic
In-Reply-To: <cc67a99e-54e5-4f76-9e7f-89339dae369d@schani.com>
References: <cc67a99e-54e5-4f76-9e7f-89339dae369d@schani.com>
Message-ID: <6d565ff-1827-2ffd-3956-5034ab9ce0e5@jubileegroup.co.uk>

Hi there,

On Sun, 3 Mar 2024, christian via Users wrote:

> I have a question about Bayes. So the integrated statistical
> function of Rspamd.  What data is used for the evaluation? Only
> email body components or words that are weighted, or header
> components such as sender, recipient, Dmarc, RBL results, etc.

Selected headers are used.  There are configuration options to control
which, see the documentation:

https://rspamd.com/doc/configuration/statistic.html

-- 

73,
Ged.

From list+rspamd at gcore.biz  Sun Mar  3 23:16:20 2024
From: list+rspamd at gcore.biz (Gerald Galster)
Date: Mon, 4 Mar 2024 00:16:20 +0100
Subject: [Rspamd-Users] Multimap and syntax...
In-Reply-To: <58ada3d4-c345-42b0-b5fe-e37e78ab3c03@schani.com>
References: <03bdf4ff-f8c1-4706-9df1-862397d977d3@schani.com>
 <5FDD5A1F-DF46-4E34-8761-7F10B0C96C5E@gcore.biz>
 <58ada3d4-c345-42b0-b5fe-e37e78ab3c03@schani.com>
Message-ID: <21EFF320-B078-47C1-9434-226D66A33A40@gcore.biz>

> When I take a closer look at your answers, it seems that the incoming filtering is mainly done by Bayes

No. Spam filtering consists of many tests that add to the global score. A mail is considered spam if that global score is high enough.
Bayes is just one of those tests, don't forget blocklists (Spamhaus, ...), whitelist module, multimaps, neural checks, dkim checks, antivirus, fuzzy, reputation ...
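
To make the "many tests add to the global score" point concrete, here is a minimal Python sketch (not Rspamd code) that parses an X-Spamd-Result header and sums the per-symbol scores. The header text is an excerpt of the spam example quoted earlier in this thread. The function names and the regular expression are illustrative assumptions.

```python
import re

# Matches entries such as "R_DKIM_ALLOW(1.11)[...]" at the start of a line
# and captures the symbol name and its score.
SYMBOL_RE = re.compile(r"^\s*([A-Z0-9_]+)\((-?\d+(?:\.\d+)?)\)", re.MULTILINE)

def parse_symbols(header: str) -> dict[str, float]:
    """Return {symbol: score} for every symbol listed in the header."""
    return {name: float(score) for name, score in SYMBOL_RE.findall(header)}

def total_score(header: str) -> float:
    """Sum the individual symbol scores, rounded to two decimals."""
    return round(sum(parse_symbols(header).values()), 2)

SPAM_EXAMPLE = """X-Spamd-Result: default: False [0.81 / 30.00];
    R_DKIM_ALLOW(1.11)[gexton.us:s=root];
    MX_INVALID(0.50)[];
    DMARC_POLICY_ALLOW(-0.50)[gexton.us,reject];
    R_SPF_ALLOW(-0.20)[+ip4:209.141.51.0/24];
    MIME_GOOD(-0.10)[multipart/alternative,text/plain];
    RCPT_COUNT_ONE(0.00)[1];
    SPF_REPUTATION_SPAM(0.00)[0.78822517302659];"""

symbols = parse_symbols(SPAM_EXAMPLE)
print(total_score(SPAM_EXAMPLE))                            # 0.81
print(any(name.startswith("BAYES_") for name in symbols))   # False
```

The sum reproduces the 0.81 shown in the header, and the second line confirms that no Bayes symbol contributed to this particular result.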

> and you train this filter.

Yes, Bayes has to be trained. It won't work until it has been sufficiently trained.
https://rspamd.com/doc/configuration/statistic.html
(min_learns = 200;)

> The decisive factor is the score of an email as to whether it is listed as spam or ham in the Bayes filter.

I don't know what you mean by that. When an email is learned as spam, its text is tokenized (sort of split into words) and those tokens are then associated with spam.
Any new mail is tokenized and compared with existing spam/ham tokens. The score the Bayes filter calculates from that tells you how likely it considers an email as spam. 
 
As I mentioned before the Bayes filter is just one test. Many other tests may add their scores as well.
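
The tokenize-and-compare idea can be illustrated with a toy naive Bayes classifier. This is a deliberately simplified Python sketch, not Rspamd's implementation: Rspamd's real tokenizer and probability combination differ, and the class and method names here are made up for illustration. The min_learns guard mirrors the setting mentioned above, with a tiny value so that the example runs.

```python
import math
import re
from collections import Counter

class ToyBayes:
    def __init__(self, min_learns: int = 2):
        self.tokens = {"spam": Counter(), "ham": Counter()}
        self.learns = {"spam": 0, "ham": 0}
        self.min_learns = max(min_learns, 1)  # need at least one of each class

    @staticmethod
    def tokenize(text: str) -> list[str]:
        return re.findall(r"[a-z0-9]+", text.lower())

    def learn(self, label: str, text: str) -> None:
        self.tokens[label].update(self.tokenize(text))
        self.learns[label] += 1

    def spam_probability(self, text: str):
        """Return P(spam) between 0 and 1, or None while undertrained."""
        if min(self.learns.values()) < self.min_learns:
            return None
        vocab = len(set(self.tokens["spam"]) | set(self.tokens["ham"]))
        totals = {k: sum(c.values()) + vocab for k, c in self.tokens.items()}
        log_odds = math.log(self.learns["spam"] / self.learns["ham"])
        for tok in self.tokenize(text):
            # Laplace (+1) smoothing keeps unseen tokens from zeroing a class.
            log_odds += math.log((self.tokens["spam"][tok] + 1) / totals["spam"])
            log_odds -= math.log((self.tokens["ham"][tok] + 1) / totals["ham"])
        # Numerically stable logistic function.
        if log_odds >= 0:
            return 1 / (1 + math.exp(-log_odds))
        return math.exp(log_odds) / (1 + math.exp(log_odds))

clf = ToyBayes()
clf.learn("spam", "cheap pills buy now")
clf.learn("spam", "buy cheap watches now")
print(clf.spam_probability("buy cheap pills"))        # None: no ham learned yet
clf.learn("ham", "meeting agenda for monday")
clf.learn("ham", "lunch on monday")
print(clf.spam_probability("buy cheap pills") > 0.5)  # True
```

The first call returns None because one class is still below min_learns, which is the toy analogue of a Bayes symbol not appearing until enough messages of both kinds have been learned.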

> I completely deleted the redis entries for rspamd and started learning from scratch. But after a few hours I have a large surplus of Ham entries - about 100:10. I don't think that's the point of the matter. After one day I have 5000 BAYES_HAM entries and 600 BAYES_SPAM.

For initial learning you should manually train a corpus of ham and spam mails (more than min_learns of each). It's best to train the same number of ham mails and spam mails.

> But when I look at spam emails that get through, BAYES_SPAM/HAM is not checked at all.

Then you should manually train that mail as spam (rspamc learn_spam /path/to/spammail.eml).


> Here is an example of Spam:
> The sender email is on my multimap blacklist. No multimap test and no BAYES test.

For multimap I see two possibilities:

- config is wrong (checking for wrong selector or something like that)
- with regex: the regex does not match (wrong regex definition)


> Here is an example of a non-spam:
> X-Spamd-Result: default: False [1.87 / 30.00];
> 	INFO_TO_INFO_LU(2.00)[];
> 	SUBJECT_HAS_CURRENCY(1.00)[];
> 	DMARC_POLICY_ALLOW(-0.50)[unitedplugins.com,reject];
> 	R_DKIM_ALLOW(-0.20)[unitedplugins.com:s=mailjet];
> 	R_SPF_ALLOW(-0.20)[+ip4:185.250.236.0/22];
> 	MAILLIST(-0.11)[generic];
> 	MIME_GOOD(-0.10)[multipart/alternative,text/plain];
> 	MX_GOOD(-0.01)[];
> 	HAS_LIST_UNSUB(-0.01)[];
> 	DKIM_TRACE(0.00)[unitedplugins.com:+];
> 	RCPT_COUNT_ONE(0.00)[1];
> 	TO_MATCH_ENVRCPT_ALL(0.00)[];
> 	SPF_REPUTATION_HAM(0.00)[-0.51883337370734];
> 	IP_REPUTATION_HAM(0.00)[asn: 200069(-0.21), country: FR(0.00), 	 ip: 185.250.237.60(0.00)];
> 
> I trained the email as HAM. But no BAYES entry appears.

Check your logs if rspamd complains that Bayes has not been trained enough.

Otherwise learn the message with: rspamc learn_spam /path/to/spammail.eml
Then check if it's recognized: rspamc /path/to/spammail.eml  (BAYES_SPAM should be listed)

You do not train spam per user, right?
#per_user = true; # Enable per user classifier
https://rspamd.com/doc/configuration/statistic.html

> In addition, the domain is in a multimap whitelist which is also not displayed. The email is accepted, but only just.

Then one of your definition/selector/regex settings is wrong. Multimap works if configured correctly.

>> Rspamd includes the public suffix list (see https://publicsuffix.org/list/).
>> https://github.com/rspamd/rspamd/blob/master/contrib/publicsuffix/effective_tld_names.dat
> 
> Ok, then I don't have to worry about the multiple TLDs. Rspamd does this automatically.
> 
>> Try to be more precise when reading the documentation.
> 
> Unfortunately, the documentation is very confusing and not very structured. You don't recognize the connections.

As I wrote before:

  You've copied the example "email:domain:tld" which converts user@foo.example.com to example.com.
  So user@cmp.dotmail.co.uk will be converted to dotmail.co.uk, which is not in your list and therefore does not match.

You've added "email:domain" style domains to your multimap but configured "email:domain:tld" and wondered why it did not work.
The example in the documentation was clear about that and that's why I wrote you should try to be more precise when reading the documentation.
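
The difference between the two selectors can be sketched in a few lines of Python. This only illustrates the behaviour described above and is not Rspamd code: the tiny PUBLIC_SUFFIXES set is a stand-in assumption for the full public suffix list that Rspamd ships, and the function names are invented.

```python
# Tiny stand-in for the public suffix list (assumption for illustration).
PUBLIC_SUFFIXES = {"com", "uk", "co.uk"}

def email_domain(address: str) -> str:
    """Roughly what "email:domain" yields: everything after the "@"."""
    return address.rsplit("@", 1)[1].lower()

def email_domain_tld(address: str) -> str:
    """Roughly what "email:domain:tld" yields: one label plus public suffix."""
    labels = email_domain(address).split(".")
    # Try the longest candidate suffix first, then keep one label before it.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            return ".".join(labels[max(i - 1, 0):])
    return ".".join(labels[-2:])  # fallback: last two labels

whitelist = {"cmp.dotmail.co.uk"}            # an "email:domain" style entry
addr = "user@cmp.dotmail.co.uk"
print(email_domain(addr) in whitelist)       # True
print(email_domain_tld(addr) in whitelist)   # False
```

With the "email:domain:tld" selector the map would need to contain dotmail.co.uk instead; the map contents and the selector have to agree.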


>> Just a hint: if you add e.g. adidas.com to your whitelist, any spammer that sends with @adidas.com is probably whitelisted due to score -20.
>> I'd rather train rspamd to filter spam and use those maps to assist learning. Otherwise a spammail with an added score of -20 will probably be learned as ham, which can ruin your bayes filter.
> 
> 
> Should an email that does not actually come from adidas.com not be checked further and be assessed differently as phishing? Check against DKIM and MX. This makes it clear that the email doesn't really come from adidas.com, right? OK, maybe -20 is a bit much.

This was just an example of what can happen when you set extreme scores like -20, it was not about the domain adidas.com.

Of course there are other tests and rspamd will check DKIM/DMARC/... if configured.

> But what always surprises me is that it's hard to understand why sometimes my multimaps work and the next email doesn't.

That means rspamd generally knows about the multimap, otherwise it would never match.
If it matches only sometimes, either you did not configure the selector/type correctly or the multimap content does not match (errors in regex, incomplete domain names, ...).

> Why I can see that Bayesian statistics counts up for incoming emails, but no check is displayed in the email fields.

Check the logs if it complains about too few learned emails.

Best regards,
Gerald


From rspamd at linuxmaker.com  Tue Mar  5 09:49:39 2024
From: rspamd at linuxmaker.com (Andreas)
Date: Tue, 05 Mar 2024 10:49:39 +0100
Subject: [Rspamd-Users] "soft reject" and "clamav: failed to scan,
 maximum retransmits exceed"
Message-ID: <6036955.lOV4Wx5bFT@stuttgart>

Hello,

Postfix with Dovecot, Rspamd and ClamAV is installed on Debian 12. On the one 
hand, I see the following in the log file and in the Rspamd web GUI:

tail -f /var/log/rspamd/rspamd.log 
2024-03-05 10:26:41 #46897(rspamd_proxy) <5d7d22>; lua; clamav.lua:117: clamav: failed to scan, maximum retransmits exceed
2024-03-05 10:26:41 #46897(rspamd_proxy) rspamd_log_reset_repeated: Last message repeated 58 times
2024-03-05 10:26:41 #46897(rspamd_proxy) <f77b91>; oletools; common.lua:414: oletools: extension matched: |docx|nil|
2024-03-05 10:26:41 #46897(rspamd_proxy) <f77b91>; oletools; oletools.lua:111: oletools: error: Socket error detected: Verbindungsaufbau abgelehnt; retry IP: 127.0.0.1; retries left: 1
2024-03-05 10:26:41 #46897(rspamd_proxy) <f77b91>; oletools; oletools.lua:111: oletools: error: Socket error detected: Verbindungsaufbau abgelehnt; retry IP: 127.0.0.1; retries left: 0
2024-03-05 10:26:41 #46897(rspamd_proxy) <f77b91>; lua; oletools.lua:125: oletools: failed to scan, maximum retransmits exceed - err: Socket error detected: Verbindungsaufbau abgelehnt
2024-03-05 10:26:45 #46897(rspamd_proxy) <99ae35>; lua; clamav.lua:117: clamav: failed to scan, maximum retransmits exceed
2024-03-05 10:26:45 #46897(rspamd_proxy) <99ae35>; lua; clamav.lua:117: clamav: failed to scan, maximum retransmits exceed
2024-03-05 10:26:45 #46897(rspamd_proxy) <99ae35>; lua; clamav.lua:117: clamav: failed to scan, maximum retransmits exceed
2024-03-05 10:26:45 #46897(rspamd_proxy) <99ae35>; lua; clamav.lua:117: clamav: failed to scan, maximum retransmits exceed

On the other hand, the action "soft reject" is applied to normal emails, where 
I would actually expect "no action".
Where can I configure the "soft reject", and how do I get the "clamav: failed to 
scan, maximum retransmits exceed" errors under control? For the latter, I set 
timeout=20 in /etc/rspamd/local.d/antivirus.conf.

Best regards

Andreas


From rspamd at jubileegroup.co.uk  Tue Mar  5 11:16:45 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Tue, 5 Mar 2024 11:16:45 +0000 (GMT)
Subject: [Rspamd-Users] "soft reject" and "clamav: failed to scan,
 maximum retransmits exceed"
In-Reply-To: <6036955.lOV4Wx5bFT@stuttgart>
References: <6036955.lOV4Wx5bFT@stuttgart>
Message-ID: <133e8bfa-8db7-2da3-5163-554d35c17c5@jubileegroup.co.uk>

Hi there,

On Tue, 5 Mar 2024, Andreas wrote:

> ...
> 2024-03-05 10:26:41 #46897(rspamd_proxy) <f77b91>; oletools; oletools.lua:111:
> oletools: error: Socket error detected: Verbindungsaufbau abgelehnt; retry IP:
> ...

Has this all worked for you in the past, or is it a new installation?

"Connection rejected" likely means either that the clamd daemon is not running
(and so no connection to the daemon can be made) or your configuration is broken
(in which case the connection attempts are e.g. being made to the wrong socket).
It's possible that you have not arranged appropriate permissions.

> ... I set timeout=20 in /etc/rspamd/local.d/antivirus.conf.

Although I would not expect "Connection rejected" to be the message
which you see when a connection is timed out, in my experience twenty
seconds is very optimistic for ClamAV.  I've seen it take minutes to
scan a PDF.  Obviously this will depend on the performance of (the
machine providing) your scanning process, and what you send to it.

Below are the sizes and approximate ClamAV scanning times of those
messages which arrived here in March and took longer than 20 seconds
to scan.  The scanner is an 8Gbyte Pi4b dedicated to scanning.

   bytes  seconds
   69268  20 - 80
  158175  25 - 33
  187887  33
  872245  140
 1055157  142

We see very few messages much larger than this.  They are rejected.

-- 

73,
Ged.

From rspamd at linuxmaker.com  Tue Mar  5 13:01:57 2024
From: rspamd at linuxmaker.com (Andreas)
Date: Tue, 05 Mar 2024 14:01:57 +0100
Subject: [Rspamd-Users] "soft reject" and "clamav: failed to scan,
 maximum retransmits exceed"
In-Reply-To: <133e8bfa-8db7-2da3-5163-554d35c17c5@jubileegroup.co.uk>
References: <6036955.lOV4Wx5bFT@stuttgart>
 <133e8bfa-8db7-2da3-5163-554d35c17c5@jubileegroup.co.uk>
Message-ID: <4552196.LvFx2qVVIh@stuttgart>

Hello,

and thanks for your answer.
> 
> Has this all worked for you in the past, or is it a new installation?
> 
I only mentioned "clamav: failed to scan" because I don't know whether that is the cause 
of my core question about the "soft reject".

So let me ask my question again what I can do to turn "soft reject" into a normal delivery.
Yes, this is a new installation in which I added rejects to the multimap.conf, as shown here 
recently, which selects according to subject content.
So where would I have to look so that the soft rejects are delivered normally?

Best regards

Andreas


From rspamd at jubileegroup.co.uk  Tue Mar  5 13:26:08 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Tue, 5 Mar 2024 13:26:08 +0000 (GMT)
Subject: [Rspamd-Users] "soft reject" and "clamav: failed to scan,
 maximum retransmits exceed"
In-Reply-To: <4552196.LvFx2qVVIh@stuttgart>
References: <6036955.lOV4Wx5bFT@stuttgart>
 <133e8bfa-8db7-2da3-5163-554d35c17c5@jubileegroup.co.uk>
 <4552196.LvFx2qVVIh@stuttgart>
Message-ID: <925f998f-498c-e0af-751e-dfe1bb061f2@jubileegroup.co.uk>

Hi there,

On Tue, 5 Mar 2024, Andreas wrote:

> ...  where would I have to look so that the soft rejects are
> delivered normally?

You could try something like

rspamadm configdump | grep -C10 soft
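
Because grep -C10 also returns a lot of unrelated context, a small script can narrow the search to the blocks that actually set the action. The following Python sketch is a hypothetical helper, not an Rspamd tool. It assumes that configdump prints named blocks as "name {" and closes them with "}" on their own lines, and it reports the enclosing block path of every line that sets action = "soft reject". The sample text is a made-up, shortened dump.

```python
import re

OPEN_RE = re.compile(r"^\s*([\w.-]+)\s*\{\s*$")
CLOSE_RE = re.compile(r"^\s*\}\s*$")
ACTION_RE = re.compile(r'^\s*action\s*=\s*"soft reject";\s*$')

def soft_reject_blocks(dump: str) -> list[str]:
    """Return block paths (joined with "/") that set a soft reject action."""
    stack, hits = [], []
    for line in dump.splitlines():
        opened = OPEN_RE.match(line)
        if opened:
            stack.append(opened.group(1))
        elif CLOSE_RE.match(line):
            if stack:
                stack.pop()
        elif ACTION_RE.match(line):
            hits.append("/".join(stack))
    return hits

# Hypothetical, shortened dump used only to exercise the helper.
SAMPLE = '''
force_actions {
    rules {
        VIRUS_REJECT {
            action = "reject";
        }
        VIRUS_SCANNER_FAIL_EXC {
            action = "soft reject";
            expression = "CLAM_VIRUS_FAIL";
        }
    }
}
greylist {
    action = "soft reject";
}
'''
print(soft_reject_blocks(SAMPLE))
```

To run it against a real configuration, the text passed to soft_reject_blocks could be read from standard input (for example with sys.stdin.read()) while piping the configdump output into the script.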

-- 

73,
Ged.

From rspamd at linuxmaker.com  Tue Mar  5 13:37:39 2024
From: rspamd at linuxmaker.com (Andreas)
Date: Tue, 05 Mar 2024 14:37:39 +0100
Subject: [Rspamd-Users] "soft reject" and "clamav: failed to scan,
 maximum retransmits exceed"
In-Reply-To: <925f998f-498c-e0af-751e-dfe1bb061f2@jubileegroup.co.uk>
References: <6036955.lOV4Wx5bFT@stuttgart> <4552196.LvFx2qVVIh@stuttgart>
 <925f998f-498c-e0af-751e-dfe1bb061f2@jubileegroup.co.uk>
Message-ID: <12397230.O9o76ZdvQC@stuttgart>

On Tuesday, 5 March 2024, 14:26:08 CET, G.W. Haywood wrote:
> Hi there,
> 
> On Tue, 5 Mar 2024, Andreas wrote:
> > ...  where would I have to look so that the soft rejects are
> > delivered normally?
> 
> You could try something like
> 
> rspamadm configdump | grep -C10 soft

Thanks for that. Here's the output:

rspamadm configdump | grep -C10 soft 
conflicting files /etc/rspamd/local.d/statistic.conf and /etc/rspamd/local.d/classifier-bayes.conf are found: Rspamd classifier configuration might be broken!
ip_score module is deprecated in honor of reputation module!
redefining fallback backend from /etc/rspamd/maps.d/surbl-whitelist.inc to /etc/rspamd/maps.d/surbl-whitelist.inc
implicitly enabling luapattern returncodes_matcher for rule SURBL_HASHBL 
implicitly enabling luapattern returncodes_matcher for rule DWL_DNSWL 
implicitly enabling luapattern returncodes_matcher for rule RCVD_IN_DNSWL 
       } 
       VIRUS_REJECT { 
           action = "reject"; 
           expression = "CLAM_VIRUS"; 
           message = "REJECT - virus found (support-id ${queueid})"; 
       } 
       VIRUS_SCANNER_FAIL_EXC { 
           honor_action [ 
               "reject", 
           ] 
           action = "soft reject"; 
           expression = "CLAM_VIRUS_FAIL"; 
           message = "Tempfail - internal scan engine error. (support-id ${queueid})";
       } 
   } 
} 
worker { 
   normal { 
       bind_socket = "localhost:11333"; 
       mime = true; 
   } 
-- 
           } 
           ARC_NA { 
               weight = 0; 
               description = "ARC signature absent"; 
               groups [ 
                   "arc", 
               ] 
           } 
           R_DKIM_TEMPFAIL { 
               weight = 0; 
               description = "DKIM verification soft-failed"; 
               groups [ 
                   "dkim", 
               ] 
           } 
           R_DKIM_NA { 
               weight = 0; 
               description = "Missing DKIM signature"; 
               groups [ 
                   "dkim", 
               ] 
-- 
           } 
           DMARC_POLICY_ALLOW_WITH_FAILURES { 
               weight = -0.500000; 
               description = "DMARC permit policy with DKIM/SPF failure"; 
               groups [ 
                   "dmarc", 
               ] 
           } 
           R_SPF_SOFTFAIL { 
               weight = 0; 
               description = "SPF verification soft-failed"; 
               groups [ 
                   "spf", 
               ] 
           } 
           R_SPF_NEUTRAL { 
               weight = 0; 
               description = "SPF policy is neutral"; 
               groups [ 
                   "spf", 
               ] 
-- 
       "169.254.0.0/16", 
       "fe80::/10", 
       "127.2.4.7", 
   ] 
   pidfile = "/run/rspamd/rspamd.pid"; 
   check_all_filters = true; 
   cache_file = "/var/lib/rspamd/symbols.cache"; 
   map_watch_interval = 300; 
   map_file_watch_multiplier = 0.100000; 
   dynamic_conf = "/var/lib/rspamd/rspamd_dynamic"; 
   soft_reject_on_timeout = false; 
   history_file = "/var/lib/rspamd/rspamd.history"; 
   hs_cache_dir = "/var/lib/rspamd/"; 
   dns_max_requests = 64; 
   max_lua_urls = 1024; 
   max_urls = 10240; 
   max_recipients = 1024; 
   task_timeout = 8; 
   tempdir = "/tmp"; 
   dns { 
       timeout = 1; 
-- 
   timeout = 300; 
   message = "Try again later"; 
   expire = 86400; 
   whitelist_domains_url [ 
       "/etc/rspamd/local.d/greylist-whitelist-domains.inc", 
       "/etc/rspamd/local.d/maps.d/greylist-whitelist-domains.inc", 
   ] 
   ipv6_mask = 64; 
   max_data_len = 10000; 
   key_prefix = "rg"; 
   action = "soft reject"; 
   ipv4_mask = 19; 
} 
url_tags { 
   enabled = false; 
} 
mime_types { 
   file [ 
       "https://maps.rspamd.com/rspamd/mime_types.inc.zst", 
       "/etc/rspamd/local.d/maps.d/mime_types.inc.local", 
       "/var/lib/rspamd/mime_types.inc.local", 
-- 
       expression = "FORGED_SENDER & (ENVFROM_PRVS | ENVFROM_VERP)"; 
   } 
   IP_SCORE_FREEMAIL { 
       score = 0; 
       description = "Negate IP_SCORE when message comes from FreeMail"; 
       expression = "FREEMAIL_FROM & SENDER_REP_SPAM"; 
       policy = "remove_weight"; 
   } 
   VIOLATED_DIRECT_SPF { 
       score = 3.500000; 
       description = "Has no Received (or no trusted received relays) and SPF policy fails or soft fails";
       expression = "(R_SPF_FAIL | R_SPF_SOFTFAIL) & (RCVD_COUNT_ZERO | RCVD_NO_TLS_LAST)";
       policy = "leave"; 
   } 
   AUTH_NA { 
       score = 1; 
       policy = "remove_weight"; 
       expression = "R_DKIM_NA & R_SPF_NA & DMARC_NA & ARC_NA"; 
       description = "Authenticating message via SPF/DKIM/DMARC/ARC not available";
   } 
   BAD_REP_POLICIES { 
-- 
       group = "compromised_hosts"; 
   } 
   INTRODUCTION { 
       score = 2; 
       description = "Sender introduces themselves"; 
       re = "/\\b(?:my name is\\b|(?:i am|this is)\\s+(?:mr|mrs|ms|miss|master|sir|prof(?:essor)?|d(?:octo)?r|rev(?:erend)?)(?:\\.|\\b))/{sa_body}i";
       group = "scams"; 
       one_shot = true; 
   } 
   OLD_X_MAILER { 
       re = "X-Mailer=/^(?:Microsoft Outlook Express|QUALCOMM Windows Eudora (Pro )?Version [1-6]\\.|The Bat! \\(v[12]\\.|Microsoft Outlook IMO, Build 9\\.0\\.|Microsoft Outlook, Build 10\\.|i(Phone|Pad) Mail \\((?:[1-8][A-L]|12H|13E))/{header}";
       description = "X-Mailer header has a very old MUA version"; 
       group = "headers"; 
       score = 2; 
   } 
   TO_EXCESS_QP { 
       re = "To=/=\\?\\S+\\?Q\\?/iX & !To=/[\\x00-\\x08\\x0b\\x0c\\x0e-\\x1f\\x7f-\\xff]/Hr";
       description = "To header is unnecessarily encoded in quoted-printable"; 
       group = "excessqp"; 
       score = 1.200000; 
   } 
-- 
       score = 0; 
   } 
   MAIL_RU_MAILER { 
       re = "(X-Mailer=/^Mail\\.Ru Mailer 1\\.0$/H) & (Received=/^(?:from \\[\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\] )?by e\\.mail\\.ru with HTTP;/mH)";
       description = "Sent with Mail.Ru webmail"; 
       group = "headers"; 
       score = 0; 
   } 
   MICROSOFT_SPAM { 
       re = "X-Forefront-Antispam-Report=/SFV:SPM/H"; 
       description = "Microsoft says the message is spam"; 
       group = "upstream_spam_filters"; 
       score = 4; 
   } 
   MIME_HTML_ONLY { 
       re = "has_only_html_part()"; 
       description = "Message has only an HTML part"; 
       group = "headers"; 
       score = 0.200000; 
   } 
   SUBJ_EXCESS_QP { 
-- 
       description = "Forged X-Mailer header"; 
       group = "headers"; 
       score = 4.500000; 
   } 
   HAS_X_ANTIABUSE { 
       re = "header_exists('X-AntiAbuse')"; 
       description = "Has X-AntiAbuse headers"; 
       group = "compromised_hosts"; 
   } 
   MISSING_MIMEOLE { 
       re = "(header_exists(X-MSMail-Priority)) & !(header_exists(X-MimeOLE)) & !(X-Mailer=/SquirrelMail\\b/H) & !(X-Mailer=/^Microsoft (?:Office )?Outlook [12]\\d\\.0/) & !(header_exists(X-Android-Message-Id))";
       description = "Mime-OLE is needed but absent (e.g. fake Outlook or fake Exchange)";
       group = "headers"; 
       score = 2; 
   } 
   MISSING_SUBJECT { 
       score = 2; 
       description = "Subject header is missing"; 
       re = "!raw_header_exists(Subject)"; 
       group = "headers"; 
       mime_only = true; 
-- 
       description = "Message contains X-PHP-Script pattern"; 
       group = "compromised_hosts"; 
   } 
   PRECEDENCE_BULK { 
       re = "Precedence=/bulk/Hi"; 
       description = "Message marked as bulk"; 
       group = "upstream_spam_filters"; 
       score = 0; 
   } 
   RATWARE_MS_HASH { 
       re = "(Message-Id=/[0-9a-f]{4,}\\$[0-9a-f]{4,}\\$[0-9a-f]{4,}\\@\\S+/H) & !(X-MimeOLE=/^Produced By Microsoft MimeOLE/H) & !(Received=/with Microsoft Exchange Server/H)";
       description = "Forged Exchange messages"; 
       group = "headers"; 
       score = 2; 
   } 
   R_RCVD_SPAMBOTS { 
       score = 3; 
       description = "Spambots signatures in received headers"; 
       re = "Received=/^from \\[\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\.\\d{1,3}\\] by [-.\\w+]{5,255}; [SMTWF][a-z][a-z], [\\s\\d]?\\d [JFMAJSOND][a-z][a-z] \\d{4} \\d{2}:\\d{2}:\\d{2} [-+]\\d{4}$/mH";
       group = "headers"; 
       mime_only = true; 
-- 
       group = "headers"; 
       mime_only = true; 
   } 
   FORGED_MSGID_YAHOO { 
       re = "(Message-Id=/\\@yahoo\\.com\\b/iH) & !(From=/\\@yahoo\\.com\\b/iH)";
       description = "Forged Yahoo Message-ID header"; 
       group = "headers"; 
       score = 2; 
   } 
   FORGED_MUA_OUTLOOK { 
       re = "((X-Mailer=/\\bOutlook Express [456]\\./H & !Message-Id=/^<?[A-
Za-z0-9-]{7}[A-Za-z0-9]{20}\\@hotmail\\.com>?$/mH & !Message-Id=/^<?(?:[0-9a-
f]{8}|[0-9a-f]{12})\\$[0-9a-f]{8}\\$[0-9a-f]{8}\\@\\S+>?$/H & !(List-
Unsubscribe=/<
mailto:(?:leave-\\S+|\\S+-unsubscribe)\\@\\S+>$/H | Received=/\\/CWT\\/DCE\\)/
H | Received=/iPlanet Messaging Server/H | Message-Id=/^<?BAY\\d+-DAV\\d+[A-
Z0-9]{25}\\@phx\\.gbl?>$/H | Message-Id=/^<?BAYC\\d+-PASMTP\\d+[A-Z0-9]{25}\
\@CEZ\\
.ICE>?$/H | Message-ID=/^<mailman\\.\\d+\\.\\d+\\.\\d+\\.[-+.:=\\w]+@[-a-zA-Z\
\d.]+>$/H)) | (X-Mailer=/^Microsoft Outlook(?: 8| CWS, Build 9|, Build 10)\\./
H & !Message-Id=/^<?(?:[0-9a-f]{8}|[0-9a-f]{12})\\$[0-9a-f]{8}\\$[0-9a-f]{8}\
\@\\
S+>?$/H & !Message-Id=/^<?\\!\\~\\!>?/H & !Message-Id=/^<?[A-F\\d]{32}\\@\
\S+>?$/H & !Message-Id=/^<?[A-F\\d]{36,40}\\@\\S+>?$/H & !(List-Unsubscribe=/
<mailto:(?:leave-\\S+|\\S+-unsubscribe)\\@\\S+>$/H | Received=/\\/CWT\\/DCE\
\)/H | Rec
eived=/iPlanet Messaging Server/H | Message-Id=/^<?BAY\\d+-DAV\\d+[A-Z0-9]{25}
\\@phx\\.gbl?>$/H | Message-Id=/^<?BAYC\\d+-PASMTP\\d+[A-Z0-9]{25}\\@CEZ\
\.ICE>?$/H | Message-ID=/^<mailman\\.\\d+\\.\\d+\\.\\d+\\.[-+.:=\\w]+@[-a-zA-
Z\\d.]+>$
/H))) & !X-Mailer=/^Microsoft Outlook, Build 10.0.3416$/H & !X-Mailer=/
^Microsoft Outlook Express 6.00.3790.3959$/H & !Message-Id=/^<?[A-F\\d]{32}\
\@\\S+>?$/H"; 
       description = "Forged Outlook MUA"; 
       group = "mua"; 
       score = 3; 
   } 
   FROM_EXCESS_BASE64 { 
       score = 1.500000; 
       description = "From header is unnecessarily encoded in base64"; 
       re = "From=/=\\?\\S+\\?B\\?/iX & !From=/[\\x00-\\x08\\x0b\\x0c\\x0e-\\x1f\\x7f-\\xff]/Hr";
       group = "excessb64"; 
       mime_only = true; 
-- 
   } 
   SUBJ_EXCESS_BASE64 { 
       re = "Subject=/\\=\\?\\S+\\?B\\?/iX & !Subject=/[\\x00-\\x08\\x0b\\x0c\\x0e-\\x1f\\x7f-\\xff]/Hr";
       description = "Subject header is unnecessarily encoded in base64"; 
       group = "excessb64"; 
       score = 1.500000; 
   } 
   FORGED_OUTLOOK_HTML { 
       score = 5; 
       description = "Forged Outlook HTML signature"; 
       re = "!Received=/from \\[\\S+\\] by \\S+\\.(?:groups|scd|dcn)\\.yahoo\\.com with NNFMP/H & X-Mailer=/^Microsoft Outlook\\b/H & has_only_html_part()";
       group = "headers"; 
       mime_only = true; 
   } 
   FORGED_OUTLOOK_TAGS { 
       re = "!Received=/from \\[\\S+\\] by \\S+\\.(?:groups|scd|dcn)\\.yahoo\\.com with NNFMP/H & X-Mailer=/^Microsoft Outlook\\b/H & content_type_is_type(text) & content_type_is_subtype(/.?html/) & !(has_html_tag(html) & has_html_tag(head) & has_html_tag(meta) & has_html_tag(body))";
       description = "Message pretends to be send from Outlook but has 'strange' tags";
       group = "headers"; 
       score = 2.100000; 
   } 
   FROM_NEEDS_ENCODING { 
       score = 1; 
       description = "From header needs encoding"; 
       re = "!(From=/=\\?\\S+\\?B\\?/iX) & !(From=/=\\?\\S+\\?Q\\?/iX) & (From=/[\\x00-\\x08\\x0b\\x0c\\x0e-\\x1f\\x7f-\\xff]/X)";
       group = "headers"; 
       mime_only = true;


From rspamd at jubileegroup.co.uk  Tue Mar  5 15:14:16 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Tue, 5 Mar 2024 15:14:16 +0000 (GMT)
Subject: [Rspamd-Users] "soft reject" and "clamav: failed to scan,
 maximum retransmits exceed"
In-Reply-To: <12397230.O9o76ZdvQC@stuttgart>
References: <6036955.lOV4Wx5bFT@stuttgart> <4552196.LvFx2qVVIh@stuttgart>
 <925f998f-498c-e0af-751e-dfe1bb061f2@jubileegroup.co.uk>
 <12397230.O9o76ZdvQC@stuttgart>
Message-ID: <33219a3b-7142-ee3-77b-87503fe2397@jubileegroup.co.uk>

Hi there,

On Tue, 5 Mar 2024, Andreas wrote:

> On Tuesday, 5 March 2024, 14:26:08 CET, G.W. Haywood wrote:
> > On Tue, 5 Mar 2024, Andreas wrote:
> > > ...  where would I have to look so that the soft rejects are
> > > delivered normally?
> > 
> > You could try something like
> > 
> > rspamadm configdump | grep -C10 soft
> 
> Thanks for that. Here's the output:
> 
> rspamadm configdump | grep -C10 soft
> conflicting files /etc/rspamd/local.d/statistic.conf and /etc/rspamd/local.d/classifier-bayes.conf are found: Rspamd classifier configuration might be broken!
> ...

Does the output not tell you what you need to know?

-- 

73,
Ged.

From rspamd at linuxmaker.com  Tue Mar  5 15:49:22 2024
From: rspamd at linuxmaker.com (Andreas)
Date: Tue, 05 Mar 2024 16:49:22 +0100
Subject: [Rspamd-Users] "soft reject" and "clamav: failed to scan,
 maximum retransmits exceed"
In-Reply-To: <33219a3b-7142-ee3-77b-87503fe2397@jubileegroup.co.uk>
References: <6036955.lOV4Wx5bFT@stuttgart> <12397230.O9o76ZdvQC@stuttgart>
 <33219a3b-7142-ee3-77b-87503fe2397@jubileegroup.co.uk>
Message-ID: <4555443.LvFx2qVVIh@stuttgart>

On Tuesday, 5 March 2024, 16:14:16 CET, G.W. Haywood wrote:
> Does the output not tell you what you need to know?

I actually saw that too. On the one hand, I see the same conflict message on a 
similarly installed mail server where the problem with "soft reject" does not 
occur.
On the other hand, the two files contain the following:

/etc/rspamd/local.d/classifier-bayes.conf
backend = "redis";

and

/etc/rspamd/local.d/statistic.conf
classifier "bayes" {
   tokenizer {
     name = "osb";
   }
   cache {
   }
   new_schema = true; # Always use new schema
   store_tokens = false; # Redefine if storing of tokens is desired
   signatures = false; # Store learn signatures
   #per_user = true; # Enable per user classifier
   min_tokens = 11;
   backend = "redis";
   min_learns = 200;

   statfile {
     symbol = "BAYES_HAM";
     spam = false;
   }
   statfile {
     symbol = "BAYES_SPAM";
     spam = true;
   }
   learn_condition = 'return require("lua_bayes_learn").can_learn';

   # Autolearn sample
    autolearn {
     spam_threshold = 17.0; # When to learn spam (score >= threshold and action is rejected)
     junk_threshold = 4.0; # When to learn spam (score >= threshold and action is rewrite subject or add header, and has two or more positive results)
     ham_threshold = -0.5; # When to learn ham (score <= threshold and action is no action, and score is negative or has three or more negative results)
     check_balance = true; # Check spam and ham balance
     min_balance = 0.9; # Keep diff for spam/ham learns for at least this value
   }

   .include(try=true; priority=1) "$LOCAL_CONFDIR/local.d/classifier-bayes.conf"
   .include(try=true; priority=10) "$LOCAL_CONFDIR/override.d/classifier-bayes.conf"
}

I cannot follow the statement "Rspamd classifier configuration might be 
broken!". Furthermore, I have not edited either file. These are, so 
to speak, the original states after installation.
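
The autolearn comments in the statistic.conf above can be read as a small
decision rule. The Python sketch below only restates those comments; the
function name and the positive_results / negative_results counters are
assumptions made for illustration, and this is not rspamd's own
implementation.

```python
def autolearn_decision(score, action, positive_results=0, negative_results=0,
                       spam_threshold=17.0, junk_threshold=4.0,
                       ham_threshold=-0.5):
    """Return "spam", "ham" or None, following the config comments only."""
    # spam_threshold: score >= threshold and action is reject
    if action == "reject" and score >= spam_threshold:
        return "spam"
    # junk_threshold: score >= threshold, action is rewrite subject or
    # add header, and two or more positive results
    if (action in ("rewrite subject", "add header")
            and score >= junk_threshold and positive_results >= 2):
        return "spam"
    # ham_threshold: score <= threshold, action is no action, and the score
    # is negative or there are three or more negative results
    if (action == "no action" and score <= ham_threshold
            and (score < 0 or negative_results >= 3)):
        return "ham"
    return None


print(autolearn_decision(18.0, "reject"))        # spam
print(autolearn_decision(-1.2, "no action"))     # ham
print(autolearn_decision(5.69, "no action"))     # None
```

Under this reading, a message that scores between -0.5 and 4.0 is never
learned automatically in either direction.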


From t.hendricks at interpool.de  Thu Mar  7 09:50:43 2024
From: t.hendricks at interpool.de (Tino Hendricks)
Date: Thu, 7 Mar 2024 10:50:43 +0100
Subject: [Rspamd-Users] No greylisting with score 5.70 ?
Message-ID: <25CC620B-B8DF-45B4-A2EE-16FF885B5220@interpool.de>

Dear list,

it seems I'm still having a little difficulty understanding this.

My configdump says the greylist score is 4:

> actions {
>     greylist = 4;
>     add_header = 6;
>     reject = 15;
> }


Shouldn't this mail then be greylisted?

2024-03-07 09:50:42 #1190659(normal) <dc6c73>; task; rspamd_task_write_log: id: <300f7a2d3629f4ac01902b53846ee69a at bn-yx.com>, qid: <2859C472D5B>, ip: 64.188.4.213, from: <rothgmbhkf at bn-yx.com>, (default: F (no action): [5.69/15.00] [BAYES_SPAM(5.09){99.99%;},FORGED_SENDER(0.30){rothgmbhse at bn-yx.com;rothgmbhkf at bn-yx.com;},MIME_HTML_ONLY(0.20){},BAD_REP_POLICIES(0.10){},ARC_NA(0.00){},ASN(0.00){asn:8100, ipnet:64.188.0.0/20, country:US;},DKIM_TRACE(0.00){bn-yx.com:+;},DMARC_POLICY_ALLOW(0.00){bn-yx.com;none;},FROM_HAS_DN(0.00){},FROM_NEQ_ENVFROM(0.00){rothgmbhse at bn-yx.com;rothgmbhkf at bn-yx.com;},GREYLIST(0.00){pass;body;},HAS_REPLYTO(0.00){rothgmbh at bn-yx.com;},MID_RHS_MATCH_FROM(0.00){},MIME_TRACE(0.00){0:~;},NEURAL_SPAM(0.00){0.548;},RCPT_COUNT_ONE(0.00){1;},RCVD_COUNT_ZERO(0.00){0;},REPLYTO_DOM_EQ_FROM_DOM(0.00){},RWL_MAILSPIKE_POSSIBLE(0.00){64.188.4.213:from;},R_DKIM_ALLOW(0.00){bn-yx.com:s=mail;},R_SPF_ALLOW(0.00){+ip4:64.188.4.213:c;},TO_DN_NONE(0.00){},TO_MATCH_ENVRCPT_ALL(0.00){}]), len: 6755, time: 76.995ms, dns req: 22, digest: <de411862bd4aec0f38b62dab2e42218b>, rcpts: <t.hendricks at interpool.de>, mime_rcpts: <t.hendricks at interpool.de> 
2024-03-07 09:50:42 #1190659(normal) <dc6c73>; task; rspamd_protocol_http_reply: regexp statistics: 0 pcre regexps scanned, 4 regexps matched, 174 regexps total, 44 regexps cached, 0B scanned using pcre, 12.75KiB scanned total

Thanks for clarification!

Tino

From rspamd-users at judo.za.org  Thu Mar  7 10:17:46 2024
From: rspamd-users at judo.za.org (Andrew Lewis)
Date: Thu, 07 Mar 2024 12:17:46 +0200
Subject: [Rspamd-Users] No greylisting with score 5.70 ?
In-Reply-To: <25CC620B-B8DF-45B4-A2EE-16FF885B5220@interpool.de>
References: <25CC620B-B8DF-45B4-A2EE-16FF885B5220@interpool.de>
Message-ID: <8c27ce3b2d2b9e356b0917d1282bc3336ca2df4a.camel@judo.za.org>

Hi Tino,

On Thu, 2024-03-07 at 10:50 +0100, Tino Hendricks via Users wrote:
> My configdump says, greylist score is 4:
> Shouldn't this mail then be greylisted?

Greylisting module replaces `greylist` action with `no action` once a
message has passed greylisting; that seems to be what you're seeing:

> GREYLIST(0.00){pass;body;}
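
As a rough illustration of that behaviour, here is a simplified Python
sketch. The thresholds are the ones from the configdump quoted in this
thread; the function names and the greylist_passed flag are hypothetical,
and the real module's logic is more involved than this.

```python
# Thresholds copied from the "actions" block quoted in this thread.
THRESHOLDS = {"greylist": 4.0, "add_header": 6.0, "reject": 15.0}


def action_for_score(score, thresholds=THRESHOLDS):
    """Return the action with the highest threshold that the score reaches."""
    chosen = "no action"
    for name, limit in sorted(thresholds.items(), key=lambda kv: kv[1]):
        if score >= limit:
            chosen = name
    return chosen


def final_action(score, greylist_passed):
    """Simplified model: a message that has passed greylisting is not
    greylisted again, so "greylist" becomes "no action"."""
    action = action_for_score(score)
    if action == "greylist" and greylist_passed:
        return "no action"
    return action


print(final_action(5.69, greylist_passed=True))   # no action
print(final_action(5.69, greylist_passed=False))  # greylist
```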

Best,
-AL.

From lucas at lucasrolff.com  Thu Mar  7 10:07:20 2024
From: lucas at lucasrolff.com (Lucas Rolff)
Date: Thu, 7 Mar 2024 10:07:20 +0000
Subject: [Rspamd-Users] No greylisting with score 5.70 ?
In-Reply-To: <25CC620B-B8DF-45B4-A2EE-16FF885B5220@interpool.de>
References: <25CC620B-B8DF-45B4-A2EE-16FF885B5220@interpool.de>
Message-ID: <D85048C0-6328-493A-9B9E-5EDA713544A1@lucasrolff.com>

Have you checked whether `rothgmbhkf at bn-yx.com` attempted to send an email to the recipient in the 24 hours prior to that email? By default, greylisting will only happen once for `from : to : ip` (/19 for v4, /64 for v6) in a 24 hour period.
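
To make the effect of the masks concrete, here is a minimal Python sketch
using the standard ipaddress module. The key layout and the function names
are assumptions for illustration only (rspamd does not store its greylist
records in this form); the /19 and /64 defaults are the values mentioned
above.

```python
import ipaddress


def masked_network(ip, ipv4_mask=19, ipv6_mask=64):
    """Reduce an address to the network that greylisting would compare."""
    addr = ipaddress.ip_address(ip)
    prefix = ipv4_mask if addr.version == 4 else ipv6_mask
    return ipaddress.ip_network(f"{ip}/{prefix}", strict=False)


def greylist_key(sender, recipient, ip):
    """Hypothetical from:to:ip key, for illustration only."""
    return f"{sender.lower()}:{recipient.lower()}:{masked_network(ip)}"


print(masked_network("64.188.4.213"))   # 64.188.0.0/19
print(masked_network("64.188.4.207"))   # 64.188.0.0/19
print(masked_network("2001:db8::1"))    # 2001:db8::/64
```

With a /19 mask, every address from 64.188.0.0 to 64.188.31.255 falls into
the same network, so a retry from a neighbouring IP in that range would
count as the same sender.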

--
Users mailing list
Users at lists.rspamd.com
https://lists.rspamd.com/mailman/listinfo/users


From t.hendricks at interpool.de  Thu Mar  7 11:20:11 2024
From: t.hendricks at interpool.de (Tino Hendricks)
Date: Thu, 7 Mar 2024 12:20:11 +0100
Subject: [Rspamd-Users] No greylisting with score 5.70 ?
In-Reply-To: <8c27ce3b2d2b9e356b0917d1282bc3336ca2df4a.camel@judo.za.org>
References: <25CC620B-B8DF-45B4-A2EE-16FF885B5220@interpool.de>
 <8c27ce3b2d2b9e356b0917d1282bc3336ca2df4a.camel@judo.za.org>
Message-ID: <3123AEAE-810E-4A46-A479-526C99849692@interpool.de>

Hi all,

thank you all for your valuable input, as usual!

I always thought greylisting would take place on a sender/receiver basis.
Now that I have read your answers and the fine manual thoroughly, I see it is (besides the body) based on the sender's IP, which makes a lot of sense.

Sorry for the noise; so I need to tweak other parameters to catch these annoying mails.

Thanks and have a great day

Tino



From rspamd at jubileegroup.co.uk  Thu Mar  7 12:36:24 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Thu, 7 Mar 2024 12:36:24 +0000 (GMT)
Subject: [Rspamd-Users] No greylisting with score 5.70 ?
In-Reply-To: <25CC620B-B8DF-45B4-A2EE-16FF885B5220@interpool.de>
References: <25CC620B-B8DF-45B4-A2EE-16FF885B5220@interpool.de>
Message-ID: <18b5e9-31e9-16bb-bca2-a8aeda6f25@jubileegroup.co.uk>

Hi there,

On Thu, 7 Mar 2024, Tino Hendricks via Users wrote:

> ...
> 2024-03-07 09:50:42 ... ip: 64.188.4.213 ...
> ...

> ...
> ... I need to tweak other parameters to catch these annoying mails.
> ...

You can probably eliminate around half your spam by using DNS-based
blacklists.  For some examples, take a look at

https://multirbl.valli.org/dnsbl-lookup/64.188.4.213.html

which as of this morning shows that IP listed in 26 blacklists.  As it
happens, we use about half of those to score connecting IP addresses.

This IP would *never* get spam through to us, based entirely on the IP
that connects.  The decision is made even before the connecting server
says 'EHLO'.  After the message has been received (but not accepted),
the connection then gets dropped into the tarpit until the sending
server gives up trying.  This does use milter children to monitor the
state of all the connections, but there are a lot of them and they're
doing nothing while they wait for a connection to break so it's cheap.

8<----------------------------------------------------------------------
milter=> SELECT timestamp,ip,bl_count,bl_score,delay \
         FROM connections \
         WHERE ip && '64.188.4.0/24' AND \
               timestamp > '2024-02-23' ORDER BY timestamp ;

       timestamp      |      ip      | bl_count | bl_score | delay 
---------------------+--------------+----------+----------+-------
  2024-02-28 14:29:32 | 64.188.4.208 |        3 |        5 |   130
  2024-02-28 14:30:24 | 64.188.4.208 |        3 |        5 |   150
  2024-02-28 14:32:18 | 64.188.4.208 |        3 |        5 |    60
  2024-02-28 14:33:21 | 64.188.4.208 |        3 |        5 |    30
  2024-02-28 14:38:33 | 64.188.4.208 |        3 |        5 |    90
  2024-02-28 14:38:33 | 64.188.4.208 |        3 |        5 |  1090
  2024-02-28 14:40:48 | 64.188.4.208 |        4 |        6 |  1690
  2024-03-04 15:27:37 | 64.188.4.215 |        4 |        7 |    80
  2024-03-04 16:58:38 | 64.188.4.215 |        7 |       10 |    30
  2024-03-04 17:29:50 | 64.188.4.215 |        7 |       10 |   210
  2024-03-04 17:33:49 | 64.188.4.215 |        7 |       10 |    30
  2024-03-04 18:10:56 | 64.188.4.215 |        8 |       12 |   300
  2024-03-04 18:39:33 | 64.188.4.215 |        9 |       13 |    10
  2024-03-04 19:40:06 | 64.188.4.215 |        9 |       13 |    10
  2024-03-05 13:28:49 | 64.188.4.207 |        2 |        2 |   440
  2024-03-05 15:41:01 | 64.188.4.207 |        6 |        9 |  7840
(16 rows)
8<----------------------------------------------------------------------

You can see in the table that the score for each of the three IPs in
this /24 which tried to connect to us in the past two weeks had, by
the time of the last connection, increased from the value which it had
at the first.  This is typical of many spammer IPs, and gives you an
idea of the magnitude of greylist delay which might be useful.  Our
servers reply with a 4xx (temporary failure) to everything that they
don't like, the idea being that by the time the greylist interval has
expired, the spammy IP has had plenty of time to get itself onto some
of the DNS block lists which we use.

The column 'delay' above is the number of seconds our server held onto
the connection without replying at End Of Message before the remote
server gave up.  That's over two hours for 64.188.4.207 on 2024-03-05
at 15:41:01, during which time that thread, at least, wasn't spamming
anybody else. }:-)
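
The two observations above (scores rising between the first and last
connection, and the longest hold time) can be recomputed from the table.
A small Python sketch, with the rows copied from the table and the helper
name chosen only for illustration:

```python
from datetime import timedelta

# (ip, bl_score, delay in seconds), in the timestamp order of the table.
ROWS = [
    ("64.188.4.208", 5, 130), ("64.188.4.208", 5, 150),
    ("64.188.4.208", 5, 60), ("64.188.4.208", 5, 30),
    ("64.188.4.208", 5, 90), ("64.188.4.208", 5, 1090),
    ("64.188.4.208", 6, 1690),
    ("64.188.4.215", 7, 80), ("64.188.4.215", 10, 30),
    ("64.188.4.215", 10, 210), ("64.188.4.215", 10, 30),
    ("64.188.4.215", 12, 300), ("64.188.4.215", 13, 10),
    ("64.188.4.215", 13, 10),
    ("64.188.4.207", 2, 440), ("64.188.4.207", 9, 7840),
]


def summarise(rows):
    """Per IP: first score, last score and longest delay in seconds."""
    summary = {}
    for ip, score, delay in rows:
        entry = summary.setdefault(
            ip, {"first": score, "last": score, "max_delay": 0})
        entry["last"] = score
        entry["max_delay"] = max(entry["max_delay"], delay)
    return summary


for ip, s in summarise(ROWS).items():
    print(ip, s["first"], "->", s["last"], timedelta(seconds=s["max_delay"]))
# 64.188.4.208 5 -> 6 0:28:10
# 64.188.4.215 7 -> 13 0:05:00
# 64.188.4.207 2 -> 9 2:10:40
```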

We also collect all the spammy messages and send them to e.g. SpamCop
and other spam collectors, and to law enforcement where appropriate.

You don't have to do anything so fancy as this of course, this is just
to show what's possible with SMTP and the resources that are available.

The spam from the IPs in my table was all about electric bicycles.

-- 

73,
Ged.

From t.hendricks at interpool.de  Thu Mar  7 12:56:28 2024
From: t.hendricks at interpool.de (Tino Hendricks)
Date: Thu, 7 Mar 2024 13:56:28 +0100
Subject: [Rspamd-Users] No greylisting with score 5.70 ?
In-Reply-To: <18b5e9-31e9-16bb-bca2-a8aeda6f25@jubileegroup.co.uk>
References: <25CC620B-B8DF-45B4-A2EE-16FF885B5220@interpool.de>
 <18b5e9-31e9-16bb-bca2-a8aeda6f25@jubileegroup.co.uk>
Message-ID: <98D4E1FC-27AB-4B4F-B27A-25E097D96451@interpool.de>

... which I, as a bicycle enthusiast, wouldn't define as SPAM. ;-D

Thanks, Andrew!

> On 07.03.2024 at 13:36, G.W. Haywood <rspamd at jubileegroup.co.uk> wrote:
> 
> The spam from the IPs in my table was all about electric bicycles.


From rspam.mailing at kk-computer-service.de  Mon Mar 11 10:32:08 2024
From: rspam.mailing at kk-computer-service.de (Knut Krüger)
Date: Mon, 11 Mar 2024 11:32:08 +0100
Subject: [Rspamd-Users] Rules for different separated spam words such as
 pharmac=euticals or pharmaceutic*als
Message-ID: <434cebc3-5a6a-43dd-b370-131aeba825be@kk-computer-service.de>

Can I add a rule that eliminates special characters before checking for 
spam words?
It would be even better if it only does this when it finds unseparated 
keywords.
Pharmaceutical spam actually always contains unseparated words such as 
pharmacy, so that users immediately recognize what it is about.
I think eliminating special characters from every message first would 
result in a high server load.

Or is there another way to recognize SPAM from botnets with constantly 
changing spellings of keywords?


From t.hendricks at interpool.de  Mon Mar 11 11:10:38 2024
From: t.hendricks at interpool.de (Tino Hendricks)
Date: Mon, 11 Mar 2024 12:10:38 +0100
Subject: [Rspamd-Users] Rules for different separated spam words such as
 pharmac=euticals or pharmaceutic*als
In-Reply-To: <434cebc3-5a6a-43dd-b370-131aeba825be@kk-computer-service.de>
References: <434cebc3-5a6a-43dd-b370-131aeba825be@kk-computer-service.de>
Message-ID: <46F6A0FD-5132-415B-821B-F3ED3253D2CE@interpool.de>

Hi Knut,

"Lua Master" Patrick helped me a lot with that in January:
https://lists.rspamd.com/pipermail/users/2024-January/003009.html
Maybe you'd like to give it a try, too?

Best,

Tino




From rspamd at jubileegroup.co.uk  Mon Mar 11 11:28:26 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Mon, 11 Mar 2024 11:28:26 +0000 (GMT)
Subject: [Rspamd-Users] Rules for different separated spam words such as
 pharmac=euticals or pharmaceutic*als
In-Reply-To: <434cebc3-5a6a-43dd-b370-131aeba825be@kk-computer-service.de>
References: <434cebc3-5a6a-43dd-b370-131aeba825be@kk-computer-service.de>
Message-ID: <7b6fc548-9eaa-5686-9eae-952247174915@jubileegroup.co.uk>

Hi there,

On Mon, 11 Mar 2024, Knut Krüger via Users wrote:

> Can I add a rule that eliminates special characters before checking for spam 
> words?
> It would be even better if it only does this when it finds unseparated 
> keywords.
> Pharmaceutical spam actually always contains unseparated words such as 
> pharmacy, so that users immediately recognize what it is about.

You can do things with regular expressions such as

p.?h.?a.?r.?m.?a.?c.?y

or

p\W?h\W?a\W?r\W?m\W?a\W?c\W?y

or even

p[\._-]?h[\._-]?a[\._-]?r[\._-]?m[\._-]?a[\._-]?c[\._-]?y

but it's a bit clumsy and you need to be careful that you don't catch
things unintentionally.  If you really do need to remove all 'special'
characters before scanning then you probably need to code something.
My feeling is that this kind of thing rapidly leads to diminishing
returns but I admit I do have quite a few rules which look for some of
the more common junk.
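
To see how the three patterns differ, here is a small Python sketch using
the standard re module. The sample strings and the dictionary keys are made
up for illustration, and rspamd uses its own regexp engine, so treat this
as a way to reason about the patterns and not as rspamd configuration.

```python
import re

# The three expressions from above, compiled case-insensitively.
PATTERNS = {
    "any_char": re.compile(r"p.?h.?a.?r.?m.?a.?c.?y", re.IGNORECASE),
    "non_word": re.compile(r"p\W?h\W?a\W?r\W?m\W?a\W?c\W?y", re.IGNORECASE),
    "listed": re.compile(
        r"p[\._-]?h[\._-]?a[\._-]?r[\._-]?m[\._-]?a[\._-]?c[\._-]?y",
        re.IGNORECASE),
}


def matching_patterns(text):
    """Return the names of the patterns that match somewhere in text."""
    return [name for name, rx in PATTERNS.items() if rx.search(text)]


for sample in ("pharmacy", "ph*arm=acy", "p_h_a_r_m_a_c_y", "pxhxaxrxmxaxcxy"):
    print(sample, matching_patterns(sample))
# pharmacy ['any_char', 'non_word', 'listed']
# ph*arm=acy ['any_char', 'non_word']
# p_h_a_r_m_a_c_y ['any_char', 'listed']
# pxhxaxrxmxaxcxy ['any_char']
```

The last sample shows the kind of unintended match the first pattern
allows, and the third shows that \W does not treat an underscore as a
separator.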

> I think eliminating special characters from every message first would result 
> in a high server load.

Not necessarily an excessive one, stripping characters from text isn't
a difficult operation.  Regular expressions however can bite you if
you're careless e.g. with dot-asterisk.  It depends on your situation;
on your spam profile, server performance, normal load, ...
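
A minimal sketch of what such stripping could look like, in Python for
illustration only. The keyword list and the function names are assumptions;
inside rspamd this would have to be written as a Lua rule instead.

```python
import re

# Example keywords taken from the subject of this thread.
KEYWORDS = ("pharmacy", "pharmaceuticals")


def normalise(text):
    """Lower-case the text and drop everything except ASCII letters,
    digits and whitespace (non-ASCII letters are dropped too)."""
    return re.sub(r"[^a-z0-9\s]", "", text.lower())


def find_keywords(text, keywords=KEYWORDS):
    """Return the keywords that appear once separators are stripped."""
    cleaned = normalise(text)
    return [kw for kw in keywords if kw in cleaned]


print(find_keywords("Cheap pharmac=euticals and pharmaceutic*als"))
# ['pharmaceuticals']
print(find_keywords("p.h.a.r.m.a.c.y online"))
# ['pharmacy']
```

Whitespace is kept on purpose so that unrelated neighbouring words are not
glued together, which also means a keyword spaced out letter by letter is
not caught by this sketch.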

> Or is there another way to recognize SPAM from botnets with constantly 
> changing spellings of keywords?

You might be better off looking for indications of the sources of the
spam rather than the content.  Try looking at the headers to see if
there are any common characteristics which help you identify the
unwanted messages.  I find blocking by ASN fairly effective, but if
you really are up against a world-wide botnet of hijacked boxes it's
going to be difficult to identify them all.  I use p0f to try to
identify compromised Windows boxes but it isn't especially reliable.

-- 

73,
Ged.

From rspam.mailing at kk-computer-service.de  Mon Mar 11 12:58:59 2024
From: rspam.mailing at kk-computer-service.de (Knut Krüger)
Date: Mon, 11 Mar 2024 13:58:59 +0100
Subject: [Rspamd-Users] Rules for different separated spam words such as
 pharmac=euticals or pharmaceutic*als
In-Reply-To: <7b6fc548-9eaa-5686-9eae-952247174915@jubileegroup.co.uk>
References: <434cebc3-5a6a-43dd-b370-131aeba825be@kk-computer-service.de>
 <7b6fc548-9eaa-5686-9eae-952247174915@jubileegroup.co.uk>
Message-ID: <6ea00580-6d6e-4d5c-a927-87b81f976410@kk-computer-service.de>

On 11.03.24 at 12:28, G.W. Haywood wrote:
>
> You might be better off looking for indications of the sources of the
> spam rather than the content.  Try looking at the headers to see if
> there are any common characteristics which help you identify the
> unwanted messages.  I find blocking by ASN fairly effective, but if
> you really are up against a world-wide botnet of hijacked boxes it's
> going to be difficult to identify them all.  I use p0f to try to
> identify compromised Windows boxes but it isn't especially reliable.
>
Thank you for your answer.

It looks like a botnet. Each mail is slightly different and comes from a 
different IP. The URL for buying the medicine always differs, but the 
medicines (for men) are always the same.
I will have a look at p0f.

From rspam.mailing at kk-computer-service.de  Mon Mar 11 13:18:53 2024
From: rspam.mailing at kk-computer-service.de (Knut Krüger)
Date: Mon, 11 Mar 2024 14:18:53 +0100
Subject: [Rspamd-Users] Rules for different separated spam words such as
 pharmac=euticals or pharmaceutic*als
In-Reply-To: <46F6A0FD-5132-415B-821B-F3ED3253D2CE@interpool.de>
References: <434cebc3-5a6a-43dd-b370-131aeba825be@kk-computer-service.de>
 <46F6A0FD-5132-415B-821B-F3ED3253D2CE@interpool.de>
Message-ID: <6b68dedd-ce21-41ed-a663-46c99ce572c1@kk-computer-service.de>

On 11.03.24 at 12:10, Tino Hendricks wrote:
Hi Tino,

In which config file do I have to insert this?
Does another config file have to be changed?

Best,
Knut


From t.hendricks at interpool.de  Mon Mar 11 13:35:13 2024
From: t.hendricks at interpool.de (Tino Hendricks)
Date: Mon, 11 Mar 2024 14:35:13 +0100
Subject: [Rspamd-Users] Rules for different separated spam words such as
 pharmac=euticals or pharmaceutic*als
In-Reply-To: <6b68dedd-ce21-41ed-a663-46c99ce572c1@kk-computer-service.de>
References: <434cebc3-5a6a-43dd-b370-131aeba825be@kk-computer-service.de>
 <46F6A0FD-5132-415B-821B-F3ED3253D2CE@interpool.de>
 <6b68dedd-ce21-41ed-a663-46c99ce572c1@kk-computer-service.de>
Message-ID: <9B8E3247-1538-44AF-9502-7292698FBB75@interpool.de>

Hi Knut,

To my (very limited) understanding, the script goes into /etc/rspamd/rspamd.local.lua; see https://rspamd.com/doc/tutorials/writing_rules.html

If you want to do the composites thing, it's $LOCAL_CONFDIR/local.d/composites.conf

I stopped using the composites after a while because one guy here started writing serious mails that matched the composite thing exactly (like "To" and "Subject" starting with the same word...)

Cheers,

Tino


> Am 11.03.2024 um 14:18 schrieb Knut Kr?ger via Users <users at lists.rspamd.com>:
> 
> On 11.03.24 at 12:10, Tino Hendricks wrote:
> Hi Tino,
> 
> In which config file do I have to insert this?
> Does another config file have to be changed?
> 
> Best,
> Knut


From george.asenov at wpx.net  Mon Mar 11 14:34:53 2024
From: george.asenov at wpx.net (George Asenov)
Date: Mon, 11 Mar 2024 16:34:53 +0200
Subject: [Rspamd-Users] Map rcpt and from in single map
Message-ID: <143fabae-5fe7-4977-9607-d104d7dac830@wpx.net>

Hello,

How can I create a map that achieves the result of this rule, but for multiple pairs?

  creative_blacklist_name_one {
         priority = high;
         from = "@example.com";
         rcpt = "@example.net";
         apply "default" {
             R_DUMMY = 100.0;
         }
  }

Instead of hard-coding the "from" and "rcpt" values, I would like to read them from a single map:
key=rcpt and value=from, or the other way round.

I want to create a personal whitelist
(whitelist sender X for user Y) using maps, without having to edit config 
files.

Thanks in advance!

-- 
Warm regards
George A.


From george.asenov at wpx.net  Tue Mar 12 07:55:52 2024
From: george.asenov at wpx.net (George Asenov)
Date: Tue, 12 Mar 2024 09:55:52 +0200
Subject: [Rspamd-Users] Map rcpt and from in single map
In-Reply-To: <143fabae-5fe7-4977-9607-d104d7dac830@wpx.net>
References: <143fabae-5fe7-4977-9607-d104d7dac830@wpx.net>
Message-ID: <bb228841-1df2-47fc-b694-679a33153aff@wpx.net>

Hello,

How can I create a map that achieves the result of this rule, but for multiple pairs?

  creative_blacklist_name_one {
         priority = high;
         from = "@example.com";
         rcpt = "@example.net";
         apply "default" {
             R_DUMMY = 100.0;
         }
  }

Instead of hard-coding the "from" and "rcpt" values, I would like to read them from a single map:
key=rcpt and value=from, or the other way round.

I want to create a personal whitelist
(whitelist sender X for user Y) using maps, without having to edit config 
files.

Thanks in advance!

-- 
Warm regards
George A.

From tkazmierczak at man.poznan.pl  Tue Mar 12 11:14:38 2024
From: tkazmierczak at man.poznan.pl (Tomasz Kaźmierczak)
Date: Tue, 12 Mar 2024 12:14:38 +0100
Subject: [Rspamd-Users] Avast antivirus - IO timeout
Message-ID: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>

Hello,

We are trying Avast as the antivirus for rspamd (currently on a 30-day trial license).

Rspamd and Avast run on the same VM.

Standard configuration:

# local.d/antivirus.conf

avast {

  symbol = "AVAST_VIRUS";
  servers = "127.0.0.1:8080";

  scan_mime_parts = true; # (Default) Just attachments
  use_files = false; # (Default) Or true if you need the file mode (not recommended)
  use_https = false; # (Default) Enable if you like to use SSL

  warnings_as_threat = false; # (Default)

  # https://repo.avcdn.net/linux-av/doc/avast-techdoc.pdf
  parameter = {
    archives = true, # (Default)
    # email = false,
    # full = false,
    # pup = false,
    # heuristics = 40,
    # detections = false,

  }
}

And rspamd.log:

2024-03-12 10:46:36 #909(main) <7festm>; lua; lua_util.lua:1216: enable 
debug for Lua module avast (antivirus aliased)
2024-03-12 10:46:36 #909(main) <7festm>; lua; antivirus.lua:209: added 
antivirus engine avast -> AVAST_VIRUS
2024-03-12 10:48:33 #1188(normal) <d6edc7>; avast; avast.lua:168: 
established connection to 127.0.0.1:8080; retransmits=0
2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; avast.lua:179: failed 
to request to avast (127.0.0.1:8080): IO timeout
2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; avast.lua:148: 
AVAST_VIRUS [avast]: failed to scan, maximum retransmits exceed
2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; common.lua:113: avast: 
result - FAILED with error: "failed to scan and retransmits exceed - 
score: 0"
2024-03-12 10:48:37 #1188(normal) <d6edc7>; task; finalize_item: slow 
rule: AVAST_VIRUS(328): 4006.03 ms; enable slow timer delay

Avast log:

Mar 12 10:48:33 mosaic-rspamd-proxy avast-rest[1275]: Session: 
[7fd678000cd0] New connection from 127.0.0.1:49954

avast-rest is working properly:

/usr/share/avast$ ./scan-rest.sh eicar.txt
eicar.txt       {"issues":[{"path":["eicar.txt"],"virus":"EICAR Test-NOT virus!!!"}],"vps_version":"24031202"}


Is there an option to "enable slow timer delay" or to increase the number of retransmits?

Maybe someone uses Avast and can help with the configuration.


Thank you

kazix



From usenet at schani.com  Tue Mar 12 12:55:57 2024
From: usenet at schani.com (christian)
Date: Tue, 12 Mar 2024 13:55:57 +0100
Subject: [Rspamd-Users] A question about greylisting
Message-ID: <351df58d-97bf-44a9-9525-cd4a96d02fb0@schani.com>

Hello, I'm just wondering about one thing:
An obvious spam email that has a value of +18 after the check is then 
softly rejected and immediately sorted out or rejected.

I actually thought that if the value for soft reject is 0, the message 
will be soft rejected from that value onwards, but if the value already 
reaches +4 (add header), soft reject shouldn't take effect at all and 
add header should occur straight away.
What is the reason for that?

Thanks
Christian

greylist	19.43 / 30

ABUSE_SURBL (5) 
[healtyfreak.life:replyto,healtyfreak.life:dkim,healtyfreak.life:url,janus.healtyfreak.life:helo]
MX_MISSING (5) [requested record is not found]
HFILTER_HOSTNAME_UNKNOWN (2.5)
BAD_REP_POLICIES (2)
SPF_REPUTATION_SPAM (1.435351) [0.71767547593916]
RDNS_NONE (1)
URI_COUNT_ODD (1) [3]
SPAMD (0.51) 
[DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,FROM_SUSPICIOUS_NTLD,FROM_SUSPICIOUS_NTLD_FP,HTML_FONT_LOW_CONTRAST,HTML_IMAGE_ONLY_28,HTML_IMAGE_RATIO_02,HTML_MESSAGE,NO_RECEIVED,NO_RELAYS,PP_MIME_FAKE_ASCII_TEXT,T_PDS_OTHER_BAD_TLD,T_SCC_BODY_TEXT_LINE,URIBL_ABUSE_SURBL,URIBL_BLOCKEDautolearn=no]
MV_CASE (0.5)
R_MISSING_CHARSET (0.5)
ONCE_RECEIVED (0.1)
MIME_GOOD (-0.1) [multipart/alternative,text/plain]
BAYES_HAM (-0.092727) [68.11%]
MANY_INVISIBLE_PARTS (0.05) [1]
IP_REPUTATION_SPAM (0.034134) [asn: 213035(0.00), country: NL(0.01), ip: 
185.121.123.172(0.00)]
MX_GOOD (-0.01) []
REPLYTO_EQ_FROM (0)
TO_MATCH_ENVRCPT_ALL (0)
DKIM_REPUTATION (0) [0]
DKIM_TRACE (0) [healtyfreak.life:+]
ASN (0) [asn:213035, ipnet:185.121.123.0/24, country:NL]
FROM_HAS_DN (0)
R_SPF_ALLOW (0) [+a]
FROM_NEQ_ENVFROM (0) 
[american_home_shield_today at healtyfreak.life,12395-13409-26710-382-andrea=beyond-history.de at mail.healtyfreak.life]
MID_RHS_MATCH_FROM (0)
MIME_TRACE (0) [0:+,1:+,2:~]
RCPT_COUNT_ONE (0) [1]
R_DKIM_ALLOW (0) [healtyfreak.life:s=k1]
MISSING_XM_UA (0)
GREYLIST (0) [greylisted,Tue, 12 Mar 2024 12:47:31 GMT,new record]
DMARC_POLICY_ALLOW (0) [healtyfreak.life,quarantine]
RCVD_COUNT_ZERO (0) [0]
FORGED_SENDER_VERP_SRS (0)
HAS_REPLYTO (0) [american_home_shield_today at healtyfreak.life]
TO_DN_NONE (0)

From t.hendricks at interpool.de  Wed Mar 13 10:13:50 2024
From: t.hendricks at interpool.de (Tino Hendricks)
Date: Wed, 13 Mar 2024 11:13:50 +0100
Subject: [Rspamd-Users] Yet another multimap mystery
Message-ID: <15E19E43-47F0-43D5-B489-026B9CF7C8A8@interpool.de>

Hi list,

I'm trying to create a multimap that catches a certain type of SPAM that always features three significant, individual headers.

To reduce it to maximum simplicity and for testing purposes I stripped everything down to a single header which I can't even get to match.

In my 
/etc/rspamd/local.d/multimap.conf I have (besides other, working maps)

BEWERBUNGEN {
    type = "content";
    filter = "headers";
    map = "${LOCAL_CONFDIR}/known_spam_headers.map";
    prefilter = false;
    score = 10.0;
    regexp = true;
}
(I also tried "filters = full" to no avail)

with 
/etc/rspamd/local.d//known_spam_headers.map nothing else but

/Return-Path: <hostmaster.*/

rspamadm configdump successfully confirms it's loaded, but output is

rspamc symbols <theEmail>
Results for file: 1710323151.2603_1.mail:2,S (0.144 seconds)
[Metric: default]
Action: no action
Spam: false
Score: 2.29 / 15.00
Symbol: ARC_NA (0.00)
Symbol: BAD_REP_POLICIES (0.50)
Symbol: BAYES_SPAM (0.09)[55.85%]
Symbol: DKIM_TRACE (0.00)[dom.com:+]
Symbol: DMARC_POLICY_ALLOW (0.00)[domain.com, quarantine]
Symbol: FROM_HAS_DN (0.00)
Symbol: FROM_NEQ_ENVFROM (0.00)[email at domain.com, hostmaster at domain.com]
Symbol: HAS_ATTACHMENT (0.00)
Symbol: HAS_REPLYTO (0.00)[email at domain.com]
Symbol: HFILTER_HOSTNAME_UNKNOWN (2.50)
Symbol: MID_RHS_MATCH_FROM (0.00)
Symbol: MIME_GOOD (-0.10)[multipart/mixed]
Symbol: MIME_HTML_ONLY (0.20)
Symbol: MIME_TRACE (0.00)[0:+, 1:~, 2:~]
Symbol: NEURAL_HAM (-0.00)[-0.980]
Symbol: PREVIOUSLY_DELIVERED (-1.00)[recipient at domain.com]
Symbol: RCPT_COUNT_ONE (0.00)[1]
Symbol: RCVD_COUNT_THREE (0.00)[3]
Symbol: RCVD_NO_TLS_LAST (0.10)
Symbol: RCVD_VIA_SMTP_AUTH (0.00)
Symbol: REPLYTO_EQ_FROM (0.00)
Symbol: R_DKIM_ALLOW (0.00)[domain.com:s=email]
Symbol: TO_DN_NONE (0.00)
Message-ID: hT13V8WfAf0KgOqCEZVcJHLWn0ulmIkQkywyMesneo at domain.com
Urls: []
Emails: ["email at domain.com"]

What am I missing?

Thank you very much.

Tino

From philipp.faeustlin at uni-hohenheim.de  Wed Mar 13 10:48:04 2024
From: philipp.faeustlin at uni-hohenheim.de (Philipp Fäustlin)
Date: Wed, 13 Mar 2024 11:48:04 +0100
Subject: [Rspamd-Users] Yet another multimap mystery
In-Reply-To: <15E19E43-47F0-43D5-B489-026B9CF7C8A8@interpool.de>
References: <15E19E43-47F0-43D5-B489-026B9CF7C8A8@interpool.de>
Message-ID: <f83e1610-02e6-4d13-b99a-00033171f5c0@uni-hohenheim.de>

On 13.03.24 at 11:13, Tino Hendricks via Users wrote:

I'm not sure, but the "Return-Path:" header is probably added by Postfix 
after rspamd has checked the message.

I'm guessing that because it is the last header in the received mail.

I think you should test against "ENVFROM" rather than the header for that.
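
A minimal sketch of what such an envelope-sender check could look like in local.d/multimap.conf, assuming the multimap "from" type (which looks at the envelope sender) combined with a regexp map. The symbol name, the map file name and the score are placeholders, not taken from the original setup.

```
# /etc/rspamd/local.d/multimap.conf
# Sketch only: symbol name and map file name are hypothetical.
BEWERBUNGEN_ENVFROM {
  type = "from";        # match the envelope sender instead of a header
  map = "${LOCAL_CONFDIR}/local.d/known_spam_senders.map";
  regexp = true;        # treat map entries as regular expressions
  score = 10.0;
}

# known_spam_senders.map would then contain, for example:
# /^hostmaster@/
```

This avoids depending on a "Return-Path:" header that may not exist yet when rspamd sees the message.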

Best regards

Philipp


From t.hendricks at interpool.de  Wed Mar 13 12:20:12 2024
From: t.hendricks at interpool.de (Tino Hendricks)
Date: Wed, 13 Mar 2024 13:20:12 +0100
Subject: [Rspamd-Users] Yet another multimap mystery
In-Reply-To: <f83e1610-02e6-4d13-b99a-00033171f5c0@uni-hohenheim.de>
References: <15E19E43-47F0-43D5-B489-026B9CF7C8A8@interpool.de>
 <f83e1610-02e6-4d13-b99a-00033171f5c0@uni-hohenheim.de>
Message-ID: <D3017FB9-686A-49DD-AA93-0102C4C32F54@interpool.de>

Hi Philipp,

thank you very much for diving into it!

Sounds like a trap I've fallen into before.

But since I'm testing with an exported, local .eml file, the "Return-Path:" header is present.

To be sure, I tested with another single header:

/\(envelope-from <hostmaster at .*/

Again same results (no match on BEWERBUNGEN) with 

rspamc symbols <theEmail>

But am I correct that a "systemctl reload rspamd" is sufficient for rspamd to take the changed files into account?

Thankful for any ideas,

Tino



From rspamd_users_ml at cmb.ch  Wed Mar 13 13:15:06 2024
From: rspamd_users_ml at cmb.ch (C. Bernard)
Date: Wed, 13 Mar 2024 14:15:06 +0100
Subject: [Rspamd-Users] Yet another multimap mystery
In-Reply-To: <D3017FB9-686A-49DD-AA93-0102C4C32F54@interpool.de>
References: <15E19E43-47F0-43D5-B489-026B9CF7C8A8@interpool.de>
 <f83e1610-02e6-4d13-b99a-00033171f5c0@uni-hohenheim.de>
 <D3017FB9-686A-49DD-AA93-0102C4C32F54@interpool.de>
Message-ID: <d6b74f4063b3343c7a58d897da4226b3@cmb.ch>

Hi Tino

On 2024-03-13 13:20, Tino Hendricks via Users wrote:
> Hi Philipp,
> 
> thank you very much for diving into it!
> 
> Sounds like a trap I've been falling into before.
> 
> But since I'm testing with an exported, local .eml-File the 
> "Return-Path:" Header is present.
> 
> To be sure I tested with another, single header
> 
> /\(envelope-from <hostmaster at .*/

use:
/envelope-from..?hostmaster/

My example uses root, as I can test better with that.
I couldn't match the ( nor the @. The "..?" matches the TAB and the "<" 
if there is one. My example is local, so the address is not yet 
canonicalized at that stage of the milter process.
The @ is not seen by rspamd in my case (canonicalization apparently 
happens after scanning; it is not in the Received header below).
I'm not sure why the ( did not match. rspamd uses "pcre" regexes, and in 
my tests on regex101 it matched with the parenthesis "(".

and set in logging.inc:
debug_modules = ["multimap"];

I see now:
2024-03-13 13:55:12 #66320(normal) <62a67c>; multimap; multimap.lua:563: 
check value Received: (from root at localhost)\x0A\x09by beastly.vbz.ch 
(8.18.1/8.17.1/Submit) id 42DCtC9i079538\x0A\x09for spamstat at vbz.ch; 
Wed, 13 Mar 2024 13:55:12 +0100 (CET)\x0A\x09(envelope-from 
root)\x0D\x0ADate: Wed, 13 Mar 2024 13:55:12 +0100 (CET)\x0D\x0AFrom: 
Charlie Root <root at beastly.vbz.ch>\x0D\x0AMessage-Id: 
<202403131255.42DCtC9i079538 at beastly.vbz.ch>\x0D\x0ATo: 
spamstat at vbz.ch\x0D\x0ASubject: Spam of the day statistic \x0D\x0A for 
multimap BEWERBUNGEN

/Domain "anonymised" :)

Maybe try one of the "match type" options (e.g. "raw") after the last /:
https://rspamd.com/doc/modules/regexp.html


[root at beastly /usr/local/etc/rspamd/local.d]# tail -9 multimap.conf
# This was only a test
BEWERBUNGEN {
     type = "content";
     filter = "headers";
     map = "/usr/local/etc/rspamd/local.d/test.map";
     prefilter = false;
     score = 0.1;
     regexp = true;
}

[root at beastly /usr/local/etc/rspamd/local.d]# cat test.map
/envelope-from root/

> 
> Again same results (no match on BEWERBUNGEN) with
> 
> rspamc symbols <theEmail>
> 
> But I'm correct doing a "systemctl reload rspamd" is sufficient for 
> rspamd to take into account the changed files, right?

Yes, I do that as well. Changes to a .map file take effect as soon as 
the file is changed; no restart is needed for that.

Cheers
Christian


From t.hendricks at interpool.de  Wed Mar 13 13:58:39 2024
From: t.hendricks at interpool.de (Tino Hendricks)
Date: Wed, 13 Mar 2024 14:58:39 +0100
Subject: [Rspamd-Users] Yet another multimap mystery
In-Reply-To: <d6b74f4063b3343c7a58d897da4226b3@cmb.ch>
References: <15E19E43-47F0-43D5-B489-026B9CF7C8A8@interpool.de>
 <f83e1610-02e6-4d13-b99a-00033171f5c0@uni-hohenheim.de>
 <D3017FB9-686A-49DD-AA93-0102C4C32F54@interpool.de>
 <d6b74f4063b3343c7a58d897da4226b3@cmb.ch>
Message-ID: <AC22DB11-D467-4EEB-8E9D-2506E6B6E31E@interpool.de>

Hi Christian,

thank you so much for your time!

I did as you suggested (I wasn't aware of the debug feature)

but even with just

root at mail:/tmp# tail -9 /etc/rspamd/local.d/multimap.conf

BEWERBUNGEN {
    type = "content";
    filter = "headers";
    map = "${LOCAL_CONFDIR}/local.d/known_spam_headers.map";
    prefilter = false;
    score = 10.0;
    regexp = true;
}

root at mail:/tmp# cat /etc/rspamd/local.d/known_spam_headers.map
/envelope-from/

I'm getting

2024-03-13 14:36:57 #1422681(controller) <8d24ef>; multimap; multimap.lua:435: check value Return-Path: <hostmaster?.Uynjxh1dmSHIPzX8EU=;\x0AReceived: from domain.com ([12.23.45.56])\x0A\x09by domain.com with esmtpsa  (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384\x0A\x09(Exim 4.96)\x0A\x09(envelope-from <hostmaster at domain.com>)\x0A\x09id 1rk0nc-006zEA-1M\x0A\x09for?. Content-Type: multipart/mixed;\x0A boundary="b1=_hT13V8WfAf0KgOqCEZVcJHLWn0ulmIkQkywyMesneo"\x0AContent-Transfer-Encoding: 8bit\x0AContent-Length: 2959359\x0A for multimap BEWERBUNGEN
2024-03-13 14:36:57 #1422681(controller) <8d24ef>; multimap; multimap.lua:474: found return "false" for multimap BEWERBUNGEN

But:
Now I finally did a _restart_ of rspamd, and now it uses the map. I have no idea why it didn't do so before.

Thanks to everyone!

Tino



From florian at effenberger.org  Wed Mar 13 15:07:31 2024
From: florian at effenberger.org (Florian Effenberger)
Date: Wed, 13 Mar 2024 16:07:31 +0100
Subject: [Rspamd-Users] Pyzor weight
Message-ID: <911f6f76-8cc5-45e8-b042-3f205e8a8d8b@effenberger.org>

Hello,

I have installed Pyzor as per the instructions on rspamd.com. As others 
have reported, without further configuration the weight always seems to 
be 0.0.

I found some references to this online, and based on that my current 
configuration in local.d/external_services_group.conf reads

  "PYZOR" {
    weight = 1.0;
    description = "check message digest on pyzor"
  }

That 1.0 was just a wild guess, I found also other examples that used 
3.0 and more.

Does anyone have a best practice/hands-on experience for this? Which 
weight is recommended for a Pyzor setup to be effective but not too 
aggressive either?

Thanks a lot,
Florian

From list+rspamd at gcore.biz  Wed Mar 13 22:32:05 2024
From: list+rspamd at gcore.biz (Gerald Galster)
Date: Wed, 13 Mar 2024 23:32:05 +0100
Subject: [Rspamd-Users] Map rcpt and from in single map
In-Reply-To: <143fabae-5fe7-4977-9607-d104d7dac830@wpx.net>
References: <143fabae-5fe7-4977-9607-d104d7dac830@wpx.net>
Message-ID: <898EC6AB-CFA9-44B1-814E-68302DD9A782@gcore.biz>

> How create map to achieve the result of this rule but for multiple pairs:
> 
> creative_blacklist_name_one {
>        priority = high;
>        from = "@example.com";
>        rcpt = "@example.net";
>        apply "default" {
>            R_DUMMY = 100.0;
> 
> but for "from" and "rcpt" values to use single map:
> key=rcpt and value=from or the opposite.
> 
> I want to create personal whitelist
> whitelist from X for user Y without the need to edit files and use maps instead.

Just a guess: a multimap, type = "selector", using a combined from/to selector.

https://rspamd.com/doc/modules/multimap.html#map-types
https://rspamd.com/doc/configuration/selectors.html#selectors-combinations

Or maybe extend some lua code. This checks if a map contains "from":
https://rspamd.com/doc/lua/rspamd_map.html

https://rspamd.com/doc/lua/rspamd_task.html#mc5168

Best regards,
Gerald

From list+rspamd at gcore.biz  Wed Mar 13 22:32:31 2024
From: list+rspamd at gcore.biz (Gerald Galster)
Date: Wed, 13 Mar 2024 23:32:31 +0100
Subject: [Rspamd-Users] Avast antivirus - IO timeout
In-Reply-To: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
References: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
Message-ID: <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>

> we trying Avast as antivirus for rspamd (now have 30-day trial license).
> 
> [...]
> 2024-03-12 10:46:36 #909(main) <7festm>; lua; lua_util.lua:1216: enable debug for Lua module avast (antivirus aliased)
> 2024-03-12 10:46:36 #909(main) <7festm>; lua; antivirus.lua:209: added antivirus engine avast -> AVAST_VIRUS
> 2024-03-12 10:48:33 #1188(normal) <d6edc7>; avast; avast.lua:168: established connection to 127.0.0.1:8080; retransmits=0
> 2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; avast.lua:179: failed to request to avast (127.0.0.1:8080): IO timeout
> 2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; avast.lua:148: AVAST_VIRUS [avast]: failed to scan, maximum retransmits exceed
> 2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; common.lua:113: avast: result - FAILED with error: "failed to scan and retransmits exceed - score: 0"
> 2024-03-12 10:48:37 #1188(normal) <d6edc7>; task; finalize_item: slow rule: AVAST_VIRUS(328): 4006.03 ms; enable slow timer delay
> 
> [...]
> Is any option to "enable slow timer delay" or increase retransmit?


Once I had to increase these values because clamav was too slow reloading its signature database.
Maybe this works with Avast as well.

/etc/rspamd/local.d/antivirus.conf:

clamav {
    ...
    timeout = 15.0;
    retransmits = 4;
}

Best regards,
Gerald

From tkazmierczak at man.poznan.pl  Thu Mar 14 14:03:15 2024
From: tkazmierczak at man.poznan.pl (Tomasz Kaźmierczak)
Date: Thu, 14 Mar 2024 15:03:15 +0100
Subject: [Rspamd-Users] Avast antivirus - IO timeout
In-Reply-To: <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>
References: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
 <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>
Message-ID: <f1e5d40f-8fed-4f84-a430-4e3c3d617fbb@man.poznan.pl>

Thank you, Gerald.

I added this to the avast config, and in the log I see only this:

2024-03-14 14:44:27 #5792(normal) <346d30>; avast; avast.lua:168: established connection to 127.0.0.1:8080; retransmits=3

And that's all there is about avast :(

Rspamd is not waiting for the response from Avast:

2024-03-14 14:44:27 #5792(normal) <346d30>; avast; avast.lua:168: established connection to 127.0.0.1:8080; retransmits=3
2024-03-14 14:44:35 #5792(normal) <346d30>; events; rspamd_session_cleanup: forced removed event on destroy: 00007FDA087A76C0, subsystem: rspamd lua tcp, scheduled from: AVAST_VIRUS

I see in rspamd.log that some modules (Greylist, ISN, etc.) report:

enable slow timer delay

But that is a topic for another question...

Regards,
kazix

On 13.03.2024 at 23:32, Gerald Galster wrote:
>> we trying Avast as antivirus for rspamd (now have 30-day trial license).
>>
>> [...]
>> 2024-03-12 10:46:36 #909(main) <7festm>; lua; lua_util.lua:1216: enable debug for Lua module avast (antivirus aliased)
>> 2024-03-12 10:46:36 #909(main) <7festm>; lua; antivirus.lua:209: added antivirus engine avast -> AVAST_VIRUS
>> 2024-03-12 10:48:33 #1188(normal) <d6edc7>; avast; avast.lua:168: established connection to 127.0.0.1:8080; retransmits=0
>> 2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; avast.lua:179: failed to request to avast (127.0.0.1:8080): IO timeout
>> 2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; avast.lua:148: AVAST_VIRUS [avast]: failed to scan, maximum retransmits exceed
>> 2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; common.lua:113: avast: result - FAILED with error: "failed to scan and retransmits exceed - score: 0"
>> 2024-03-12 10:48:37 #1188(normal) <d6edc7>; task; finalize_item: slow rule: AVAST_VIRUS(328): 4006.03 ms; enable slow timer delay
>>
>> [...]
>> Is any option to "enable slow timer delay" or increase retransmit?
>
> Once I had to increase these values because clamav was too slow reloading its signature database.
> Maybe this works with Avast as well.
>
> /etc/rspamd/local.d/antivirus.conf:
>
> clamav {
>      ...
>      timeout = 15.0;
>      retransmits = 4;
> }
>
> Best regards,
> Gerald

From vsevolod at rspamd.com  Thu Mar 14 15:00:19 2024
From: vsevolod at rspamd.com (Vsevolod Stakhov)
Date: Thu, 14 Mar 2024 15:00:19 +0000
Subject: [Rspamd-Users] Avast antivirus - IO timeout
In-Reply-To: <f1e5d40f-8fed-4f84-a430-4e3c3d617fbb@man.poznan.pl>
References: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
 <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>
 <f1e5d40f-8fed-4f84-a430-4e3c3d617fbb@man.poznan.pl>
Message-ID: <98574b9b-b343-9819-8e40-07666c9c1d9e@rspamd.com>

On 14/03/2024 14:03, Tomasz Kaźmierczak wrote:
> Thank you Gerald,
> 
> i add this to avast config, and in log i see only this:
> 
> 2024-03-14 14:44:27 #5792(normal) <346d30>; avast; avast.lua:168: 
> established connection to 127.0.0.1:8080; retransmits=3
> 
> and thats all about avast :(
> 
> rspamd not waiting for response from Avast.
> 
> 2024-03-14 14:44:27 #5792(normal) <346d30>; avast; avast.lua:168: 
> established connection to 127.0.0.1:8080; retransmits=3
> 2024-03-14 14:44:35 #5792(normal) <346d30>; events; 
> rspamd_session_cleanup: forced removed event on destroy: 
> 00007FDA087A76C0, subsystem: rspamd lua tcp, scheduled from: AVAST_VIRUS
> 
> i see in rspamd.log that some modules (Greylist, ISN, etc.) reports:
> 
> enable slow timer delay
> 

Please stop top-posting on this mailing list; place your reply below the 
quote.

WRT your issue: you need to adjust task_timeout as well. Honestly, I'd 
advise against using such a slow AV at all.
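
To illustrate why task_timeout matters here, the timestamps from the 
quoted log can be compared with the scanner settings suggested earlier in 
the thread (timeout = 15.0, retransmits = 4). This is only a rough sketch 
in Python, not Rspamd code: the helper names are made up for the example, 
and it assumes that attempts run one after another and that each may use 
its full timeout.

```python
from datetime import datetime

LOG_TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two timestamps written in the log format above."""
    delta = datetime.strptime(end, LOG_TIME_FORMAT) - datetime.strptime(start, LOG_TIME_FORMAT)
    return delta.total_seconds()


def worst_case_scan_seconds(timeout: float, retransmits: int) -> float:
    """Assumed upper bound: every attempt uses its full timeout in sequence."""
    return timeout * retransmits


# Connection established at 14:44:27, event forcibly removed at 14:44:35.
observed = elapsed_seconds("2024-03-14 14:44:27", "2024-03-14 14:44:35")
# Values from the antivirus.conf suggestion earlier in the thread.
needed = worst_case_scan_seconds(timeout=15.0, retransmits=4)

print(f"task was cleaned up after {observed:.0f} s")
print(f"scanner settings may need up to {needed:.0f} s")
print("scanner budget exceeds the observed task lifetime:", needed > observed)
```

Under these assumptions the task was torn down long before the scanner 
budget could be used up, which is consistent with the advice to raise 
task_timeout together with the antivirus timeout.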

From usenet at schani.com  Thu Mar 14 16:56:37 2024
From: usenet at schani.com (christian)
Date: Thu, 14 Mar 2024 17:56:37 +0100
Subject: [Rspamd-Users] Bayes questions and observations
Message-ID: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>

Hello,
I've been trying to optimize my Rspamd setup for a few weeks now and am 
still learning how everything is connected, so please excuse any naive 
questions. I have now looked more closely at Bayes and still have a few 
questions about it.

1. There appears to be a difference between BAYES_SPAM/HAM and 
SpamAssassin. The BAYES_SPAM/HAM variant is integrated under the name 
"statistic". It is configured in statistic.conf and 
classifier-bayes.conf. The results are stored in Redis and displayed in 
the web frontend under Status/Bayesian statistics.
The data is learned as emails arrive, together with the scores 
previously generated by RBL, reputation, fuzzy and many other checks.
I'm not too happy with the results because I often get ham scores even 
though all other checks declare the email to be spam. The content of an 
email can look quite reasonable even though it is spam. My experience 
with these results has not been good, which is why I only set the scores 
to -2 and +2. Emails can also be learned using rspamc 
learn_spam/learn_ham. I have learned about 10,000 emails, spam and ham.
Please correct me if I am wrong.

2. The next way to improve the results is via an external SpamAssassin. 
There is spamassassin.conf (SA), or you can integrate it via 
external_services.conf (SPAMD). The advantage is that external filter 
sources (Heinlein, Schaal-it, ...) can be used. The filter can then be 
trained further using spamc --spam/ham.
Please correct me if I am wrong.

My rspamd spamassassin.conf currently contains:
ruleset = "/etc/spamassassin/local.cf";
base_ruleset = "/var/lib/spamassassin/4.000000/*.cf";
# Limit search size to 100 kilobytes for all regular expressions
match_limit = 100k;

sa-update is working.

The SA local.cf specifies:
use_bayes 1
bayes_auto_learn 1
bayes_file_mode 777
bayes_path /var/lib/spamassassin/bayes_db

However, I can't find out whether these settings are also used by Rspamd. 
SpamAssassin itself does not generate any logs, and I can't find anything 
about this in the Rspamd logs (debug mode). There is also no symbol for 
SpamAssassin. How are these SA results processed? spamc --spam email.eml 
works and learns the email, but I don't know where the results are 
saved. I can't come up with a solution to this.

Thank you very much for your help
Christian

From vsevolod at rspamd.com  Thu Mar 14 17:51:29 2024
From: vsevolod at rspamd.com (Vsevolod Stakhov)
Date: Thu, 14 Mar 2024 17:51:29 +0000
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
Message-ID: <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>

On 14/03/2024 16:56, christian via Users wrote:
> Hello,
> I've been trying to optimize my RSPAMD for a few weeks now and continue 
> to learn how everything is connected.
> Please excuse my stupid questions.
> I have now looked more into Bayes and came across the following and 
> still have a few questions about it.
> 
> 1. There appears to be a difference between BAYES_SPAM/HAM and 
> spamassassin. The BAYES_SPAM/HAM variant is integrated under the name 
> "statistic". It is configured under statistic.conf and 
> classifier-bayes.conf. The results are saved in Redis and displayed in 
> the web frontend under Status/Bayesian statistics.
> The data is learned when the emails and previously generated scores from 
> RBL, reputation, fuzzy and much more are delivered.
> I'm not too happy with the results because I often get ham scores even 
> though all other checks declare the email as spam. The content of an 
> email can look quite reasonable even though it is spam. I don't have 
> good experience with these results and that's why I only specified -2 
> and +2. Emails can also be learned using rspamc learn_spam/ham. I have 
> learned about 10,000 emails - spam and ham.
> Please correct me, if I am wrong.
> 
> 2. The next way to improve the results is via the external Spamassassin. 
> There is also spamassassin.conf (SA), or you can integrate it via 
> external_services.conf (SPAMD). The advantage is that external filter 
> sources (Heinlein, Schaal-it,...) can be used. The filter can then be 
> further learned and improved using spamc --spam/ham.
> Please correct me, if I am wrong.
> 
> Now I have via rspamd spamassassin.conf:
> ruleset = "/etc/spamassassin/local.cf";
> base_ruleset = "/var/lib/spamassassin/4.000000/*.cf";
> # Limit search size to 100 kilobytes for all regular expressions
> match_limit = 100k;
> 
> sa-update is working
> 
> SA local.cf is
> use_bayes 1
> bayes_auto_learn 1
> bayes_file_mode 777
> bayes_path /var/lib/spamassassin/bayes_db
> 
> 
> specified, but I can't find out whether these are also used by rspamd. 
> spamassassin itself does not generate any logs. I can't find anything 
> about this in the RSPAMD logs (debug mode). There is also no symbol for 
> spamassassin. How are this SA results processed? spamc --spam email.eml 
> works and learns the email, but I don't know where the results are 
> saved. I can't come up with a solution to this.
> 
> Thank you very much for your help
> Christian


This looks like an XY problem to me: why do you need SA for Bayes, given 
that it uses a much cruder algorithm for it? Your whole problem looks 
very weird to me. The *only* reasons the SA integration exists are 
testing and legacy concerns (not Bayes or regexps, where Rspamd can do a 
much better job).

From usenet at schani.com  Fri Mar 15 09:55:50 2024
From: usenet at schani.com (christian)
Date: Fri, 15 Mar 2024 10:55:50 +0100
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
Message-ID: <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>

On 14.03.2024 at 18:51, Vsevolod Stakhov wrote:

> Looks like XY problem to me: why do you need SA for Bayes counting that 
> it uses much more stupid algorithm for it? Of course, your whole problem 
> looks very weird to me. The *only* reason why SA integration exists are 
> testing and legacy concerns (not Bayes or regexps where Rspamd can do 
> much better job).

I still get a lot of spam that isn't recognized. There are batches of 
spam campaigns that come from different senders in different 
countries, with the same appearance but different wording on the same 
topic (financial, "hoonky" kitchen knife), which I can currently only 
block with multimap and regex. But after 2 days the next wave arrives.
The statistical function (BAYES_SPAM) is of no help because the results 
are not correct. An email reaches a score of 20 through ASN, RBL, Neural 
and Reputation. Then Bayes comes along and says the email is OK (-2). 
Learning doesn't help. I now learn every spam email again using rspamc 
learn_spam, but the results do not improve.

How do you solve this?
Christian

From tkazmierczak at man.poznan.pl  Fri Mar 15 10:36:23 2024
From: tkazmierczak at man.poznan.pl (Tomasz Kaźmierczak)
Date: Fri, 15 Mar 2024 11:36:23 +0100
Subject: [Rspamd-Users] Avast antivirus - IO timeout
In-Reply-To: <98574b9b-b343-9819-8e40-07666c9c1d9e@rspamd.com>
References: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
 <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>
 <f1e5d40f-8fed-4f84-a430-4e3c3d617fbb@man.poznan.pl>
 <98574b9b-b343-9819-8e40-07666c9c1d9e@rspamd.com>
Message-ID: <f96dbc5c-20fc-400e-976d-f2a66a4f6f34@man.poznan.pl>

On 14.03.2024 at 16:00, Vsevolod Stakhov wrote:
> On 14/03/2024 14:03, Tomasz Kaźmierczak wrote:
>> Thank you Gerald,
>>
>> i add this to avast config, and in log i see only this:
>>
>> 2024-03-14 14:44:27 #5792(normal) <346d30>; avast; avast.lua:168: 
>> established connection to 127.0.0.1:8080; retransmits=3
>>
>> and thats all about avast :(
>>
>> rspamd not waiting for response from Avast.
>>
>> 2024-03-14 14:44:27 #5792(normal) <346d30>; avast; avast.lua:168: 
>> established connection to 127.0.0.1:8080; retransmits=3
>> 2024-03-14 14:44:35 #5792(normal) <346d30>; events; 
>> rspamd_session_cleanup: forced removed event on destroy: 
>> 00007FDA087A76C0, subsystem: rspamd lua tcp, scheduled from: AVAST_VIRUS
>>
>> i see in rspamd.log that some modules (Greylist, ISN, etc.) reports:
>>
>> enable slow timer delay
>>
>
> Please stop top-posting in this mailing list, place your reply below 
> the quote.
>
> WRT your issue: you need to adjust task_timeout as well. Honestly, I'd 
> better advise not to use such a slow AV at all.


Sorry, I chose the wrong button for the reply...


In other cases I use ClamAV, and it's really great.

In this case, the client requires one of the commercial AV products.

I'm testing:

- F-Secure Atlant (the successor to GateKeeper) via ICAP: I have written to support for help

- Avast: timeout error


I increased task_timeout to 30s and the result is still the same. Maybe it is an Avast error?

Is there any chance to see exactly what Rspamd sends to Avast?


kazix


From vsevolod at rspamd.com  Fri Mar 15 12:14:52 2024
From: vsevolod at rspamd.com (Vsevolod Stakhov)
Date: Fri, 15 Mar 2024 12:14:52 +0000
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
 <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
Message-ID: <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>

On 15/03/2024 09:55, christian via Users wrote:
> Am 14.03.2024 um 18:51 schrieb Vsevolod Stakhov:
> 
>> Looks like XY problem to me: why do you need SA for Bayes counting 
>> that it uses much more stupid algorithm for it? Of course, your whole 
>> problem looks very weird to me. The *only* reason why SA integration 
>> exists are testing and legacy concerns (not Bayes or regexps where 
>> Rspamd can do much better job).
> 
> I still get a lot of spam that isn't recognized. There are batches of 
> spam campaigns that come from different senders from different 
> countries, with the same appearance but different words on the same 
> topic (financial, ?hoonky? kitchen knife), which I can currently only 
> block with multimap and regex. But after 2 days the new wave comes.
> The statistical function (BAYES_SPAM) is of no help because the results 
> are not correct. The email has a value of 20, through ASN, RBL, Neural 
> and Reputation. Then BAYES_Spam comes and says the email is ok -2. 
> Learning doesn't help. I now learn every spam email again using rspamc 
> learn_spam. The results do not improve.
> 
> How do you solve this?
> Christian


That's very interesting and I would like to investigate further. In fact, 
both SA and Rspamd use more or less the same Bayes algorithm, with 
some slight differences in tokenisation logic.

If you have samples of misclassification, could you please do the 
following:

1) Enable "bayes" debugging (add "bayes" to the `debug_modules` array 
in local.d/logging.inc)
2) Collect all log lines tagged "bayes" from scanning those messages and 
send them to me (via private email if they contain confidential data 
or large attachments)
3) Send me both the samples and your Redis dump so I can experiment 
with them

(3) may be overkill in terms of privacy and amount of 
data, so I would appreciate it if you could at least do 1-2.
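
For step 2, the relevant lines can be pulled out of the log with a small 
script. The following Python sketch is only an illustration: it assumes 
the line layout seen in the Avast logs earlier in this archive 
("date time #pid(worker) <id>; module; message") and that Bayes debug 
lines carry `bayes` in the module field. The function name 
`lines_with_tag` is made up for the example, and the sample lines are 
copied from the Avast thread because no Bayes log lines have been posted.

```python
def lines_with_tag(lines, tag):
    """Return the log lines whose module field (second ';'-separated part) equals tag."""
    selected = []
    for line in lines:
        # Split at most twice so that '; ' inside the message itself is preserved.
        parts = line.rstrip("\n").split("; ", 2)
        if len(parts) == 3 and parts[1] == tag:
            selected.append(line)
    return selected


sample = [
    "2024-03-12 10:48:33 #1188(normal) <d6edc7>; avast; avast.lua:168: "
    "established connection to 127.0.0.1:8080; retransmits=0",
    "2024-03-12 10:48:37 #1188(normal) <d6edc7>; lua; avast.lua:179: "
    "failed to request to avast (127.0.0.1:8080): IO timeout",
    "2024-03-12 10:48:37 #1188(normal) <d6edc7>; task; finalize_item: "
    "slow rule: AVAST_VIRUS(328): 4006.03 ms; enable slow timer delay",
]

avast_lines = lines_with_tag(sample, "avast")
bayes_lines = lines_with_tag(sample, "bayes")

for line in avast_lines:
    print(line)
print("bayes lines in sample:", len(bayes_lines))
```

With a real log, the list would come from reading the log file line by 
line, and the tag would be "bayes" instead of "avast".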

Thanks in advance!

From george.asenov at wpx.net  Fri Mar 15 13:33:50 2024
From: george.asenov at wpx.net (George Asenov)
Date: Fri, 15 Mar 2024 15:33:50 +0200
Subject: [Rspamd-Users] Map rcpt and from in single map
In-Reply-To: <898EC6AB-CFA9-44B1-814E-68302DD9A782@gcore.biz>
References: <143fabae-5fe7-4977-9607-d104d7dac830@wpx.net>
 <898EC6AB-CFA9-44B1-814E-68302DD9A782@gcore.biz>
Message-ID: <cd150b38-01be-411e-9f1c-43f41db1ee13@wpx.net>

Thanks, Gerald.

This multimap with type=selector looks like it should do exactly what I 
want, but I can't get it to work.

/etc/rspamd/local.d/multimap.conf

whitelist_rcpt_from {
   type = "selector";
   selector = 'from.lower;rcpts.lower';
   symbol = "TEST_RCPT_FROM_WHITE";
   map = "/var/lib/rspamd/whitelits_rcpt_from.map";
   description = "whitelist rcpt_mail:from combination";
   #score = -10.0;
   action = "reject";
}


/var/lib/rspamd/whitelits_rcpt_from.map
from-mail at domain.com:to-mail at otherdomain.tld
from-mail at domain.com,to-mail at otherdomain.tld


The comma-delimited entry is there because when I test the received 
test email in the selectors test UI, I get the output with a comma.


But the symbol is not in the headers list. When I scan the source, the 
symbol is not listed there either.

There are other multimap rules in this file which work properly, 
but not this one.

I tested with the score setting and also with the action, with the same result.

What am I missing?


On 14-Mar-24 12:32 AM, Gerald Galster wrote:
>> How create map to achieve the result of this rule but for multiple pairs:
>>
>> creative_blacklist_name_one {
>>         priority = high;
>>         from = "@example.com";
>>         rcpt = "@example.net";
>>         apply "default" {
>>             R_DUMMY = 100.0;
>>
>> but for "from" and "rcpt" values to use single map:
>> key=rcpt and value=from or the opposite.
>>
>> I want to create personal whitelist
>> whitelist from X for user Y without the need to edit files and use maps instead.
> 
> Just a guess: a multimap, type = "selector", using a combined from/to selector.
> 
> https://rspamd.com/doc/modules/multimap.html#map-types
> https://rspamd.com/doc/configuration/selectors.html#selectors-combinations
> 
> Or maybe extend some lua code. This checks if a map contains "from":
> https://rspamd.com/doc/lua/rspamd_map.html
> 
> https://rspamd.com/doc/lua/rspamd_task.html#mc5168
> 
> Best regards,
> Gerald

-- 
Warm regards
George A.
WPXHosting

From cr at ncxs.de  Fri Mar 15 15:42:41 2024
From: cr at ncxs.de (Carsten Rosenberg)
Date: Fri, 15 Mar 2024 16:42:41 +0100
Subject: [Rspamd-Users] Avast antivirus - IO timeout
In-Reply-To: <f96dbc5c-20fc-400e-976d-f2a66a4f6f34@man.poznan.pl>
References: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
 <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>
 <f1e5d40f-8fed-4f84-a430-4e3c3d617fbb@man.poznan.pl>
 <98574b9b-b343-9819-8e40-07666c9c1d9e@rspamd.com>
 <f96dbc5c-20fc-400e-976d-f2a66a4f6f34@man.poznan.pl>
Message-ID: <76effb4f-6a51-4cb7-87b3-0a94c06dec98@ncxs.de>

On 15.03.24 11:36, Tomasz Kaźmierczak wrote:
> On 14.03.2024 at 16:00, Vsevolod Stakhov wrote:
>> On 14/03/2024 14:03, Tomasz Kaźmierczak wrote:
> 
> Sorry, I chose the wrong button for the reply...
> 
> 
> in other case i use CLAMAV - its really great.
> 
> in this case, the client requires one of the commercial AV.
> 
> i'm testing:
> 
> - F-Secure Atlant (successor GateKeeper) by ICAP - write to support for 
> help
> 
> - Avast - timeout error
> 
> 
> i increase task_timeout to 30s and still the same, maybe it is AVAST error?
> 
> is any chance to see what exacly rspamd send to AVAST
> 
> 
> kazix
> 

Your configuration for Avast cannot work, and here is why: there is an 
old Avast plugin (without REST API usage) which apparently was never 
documented.

There is also a new one, which was never merged into the main code:

https://github.com/rspamd/rspamd/pull/4284

We agreed to rework it a bit and then ...

But the documentation was merged into rspamd.com :-/


Here is what you can try now:

Use the old plugin by setting the server to the unix socket of the 
scanner:


avast {
   symbol = "AVAST_VIRUS";
   servers = "/var/run/avast/scan.sock";
   tmpdir = "/tmp";
}

tmpdir must be accessible by both services.


Alternatively, integrate the code from the PR above. We also use it in 
production.

In that case you need to add type = "avast_rest"; to the config section.


As for Atlant: it's strange that there seems to be no valid OPTIONS 
reply (it is mandatory in the RFC). The old F-Secure ICAP was pretty 
good. Maybe the problem is only your URL (scheme).

Have you got any reply from F-Secure?


Carsten


From rspamd at jubileegroup.co.uk  Fri Mar 15 15:57:51 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Fri, 15 Mar 2024 15:57:51 +0000 (GMT)
Subject: [Rspamd-Users] Avast antivirus - IO timeout
In-Reply-To: <f96dbc5c-20fc-400e-976d-f2a66a4f6f34@man.poznan.pl>
References: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
 <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>
 <f1e5d40f-8fed-4f84-a430-4e3c3d617fbb@man.poznan.pl>
 <98574b9b-b343-9819-8e40-07666c9c1d9e@rspamd.com>
 <f96dbc5c-20fc-400e-976d-f2a66a4f6f34@man.poznan.pl>
Message-ID: <9fa5dc9f-b979-92d0-6726-556130686863@jubileegroup.co.uk>

Hi there,

On Fri, 15 Mar 2024, Tomasz Kaźmierczak wrote:
> ...
> in other case i use CLAMAV - its really great.
> 
> in this case, the client requires one of the commercial AV.
>
> i'm testing:
>
> - F-Secure Atlant (successor GateKeeper) by ICAP - write to support for help
>
> - Avast - timeout error

Here are my results for the most recent approximately 500 malicious
emails sent to addresses at my business and scanned by Jotti's very
useful malware scanner (https://virusscan.jotti.org/):

8<----------------------------------------------------------------------

   YES     NO      %  VENDOR (alphabetical)
--------------------------------------
    84    418     17  anti-virus.by
   367    131     74  avast.com
   335    167     67  bitdefender.com
    15    487      3  clamav.net
   245     58     81  cyren.com
   234    268     47  drweb.com
   334    167     67  escanav.com
    59     75     44  eset.com
     9    141      6  f-prot.com
   263    236     53  f-secure.com
   421     77     85  fortinet.com
   352    144     71  gdatasoftware.com
   296    205     59  ikarussecurity.com
    65    435     13  k7computing.com
   180     87     67  kaspersky.com
   169    117     59  sophos.com
    22    480      4  trendmicro.com 
--------------------------------------
  3450 + 3693 = 7143 total tests
--------------------------------------

   %  VENDOR (sorted by detection rate)
--------------------------------------
84.5  fortinet.com
80.9  cyren.com
73.7  avast.com
71.0  gdatasoftware.com
67.4  kaspersky.com
66.7  bitdefender.com
66.7  escanav.com
59.1  sophos.com
59.1  ikarussecurity.com
52.7  f-secure.com
46.6  drweb.com
44.0  eset.com
16.7  anti-virus.by
13.0  k7computing.com
 6.0  f-prot.com
 4.4  trendmicro.com
 3.0  clamav.net

8<----------------------------------------------------------------------

You can probably see why your client doesn't want to use ClamAV.  Of
the two which you are testing, my results indicate that Avast is much
better than F-Secure.  However you do need to keep in mind that these
tests are (1) only on mail and (2) only on mail sent to my business.
I have no information of similar quality for scanning filesystems.
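
For anyone who wants to check the figures, the percentages in the second 
table appear to be YES / (YES + NO) for each vendor, which is why vendors 
with fewer completed tests (for example eset.com) are still comparable. 
The short Python sketch below recomputes a few rows from the counts in 
the first table; the function name `detection_rate` is only a label 
chosen for this example.

```python
# (YES, NO) counts copied from the first table above.
counts = {
    "fortinet.com": (421, 77),
    "avast.com": (367, 131),
    "f-secure.com": (263, 236),
    "clamav.net": (15, 487),
}


def detection_rate(yes: int, no: int) -> float:
    """Percentage of tested samples that the vendor flagged."""
    return 100.0 * yes / (yes + no)


ranked = sorted(counts.items(), key=lambda item: detection_rate(*item[1]), reverse=True)

for vendor, (yes, no) in ranked:
    print(f"{detection_rate(yes, no):5.1f}  {vendor}")
```

The recomputed values agree with the rounded figures listed in the table 
sorted by detection rate.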

You should also keep in mind that even on a good day, 15% of the mail
carrying malicious payloads will get past *all* available anti-virus
packages.  So you can't rely on anti-virus alone.  If you do, it is
inevitable that malware will get past your defences.

All the malicious emails above were detected by my own milter, but I
do have the luxury of making the rules here.

HTH

-- 

73,
Ged.

From cr at ncxs.de  Fri Mar 15 16:18:35 2024
From: cr at ncxs.de (Carsten Rosenberg)
Date: Fri, 15 Mar 2024 17:18:35 +0100
Subject: [Rspamd-Users] Avast antivirus - IO timeout
In-Reply-To: <9fa5dc9f-b979-92d0-6726-556130686863@jubileegroup.co.uk>
References: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
 <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>
 <f1e5d40f-8fed-4f84-a430-4e3c3d617fbb@man.poznan.pl>
 <98574b9b-b343-9819-8e40-07666c9c1d9e@rspamd.com>
 <f96dbc5c-20fc-400e-976d-f2a66a4f6f34@man.poznan.pl>
 <9fa5dc9f-b979-92d0-6726-556130686863@jubileegroup.co.uk>
Message-ID: <d0ef02db-9c81-4f2f-8649-7cc273138fd1@ncxs.de>


On 15.03.24 16:57, G.W. Haywood wrote:
> Hi there,
> 
> On Fri, 15 Mar 2024, Tomasz Kaźmierczak wrote:
>> ...
>> in other case i use CLAMAV - its really great.
>>
>> in this case, the client requires one of the commercial AV.
>>
>> i'm testing:
>>
>> - F-Secure Atlant (successor GateKeeper) by ICAP - write to support 
>> for help
>>
>> - Avast - timeout error
> 
> Here are my results for the most recent approximately 500 malicious
> emails sent to addresses at my business and scanned by Jotti's very
> useful malware scanner (https://virusscan.jotti.org/):
> 
> 8<----------------------------------------------------------------------
> 
>    YES     NO      %  VENDOR (alphabetical)
> --------------------------------------
>     84    418     17  anti-virus.by
>    367    131     74  avast.com
>    335    167     67  bitdefender.com
>     15    487      3  clamav.net
>    245     58     81  cyren.com
>    234    268     47  drweb.com
>    334    167     67  escanav.com
>     59     75     44  eset.com
>      9    141      6  f-prot.com
>    263    236     53  f-secure.com
>    421     77     85  fortinet.com
>    352    144     71  gdatasoftware.com
>    296    205     59  ikarussecurity.com
>     65    435     13  k7computing.com
>    180     87     67  kaspersky.com
>    169    117     59  sophos.com
>     22    480      4  trendmicro.com
> --------------------------------------
>   3450 + 3693 = 7143 total tests
> --------------------------------------
> 
>    %  VENDOR (sort by detection rate)
> --------------------------------------
> 84.5  fortinet.com
> 80.9  cyren.com
> 73.7  avast.com
> 71.0  gdatasoftware.com
> 67.4  kaspersky.com
> 66.7  bitdefender.com
> 66.7  escanav.com
> 59.1  sophos.com
> 59.1  ikarussecurity.com
> 52.7  f-secure.com
> 46.6  drweb.com
> 44.0  eset.com
> 16.7  anti-virus.by
> 13.0  k7computing.com
>  6.0  f-prot.com
>  4.4  trendmicro.com
>  3.0  clamav.net
> 
> 8<----------------------------------------------------------------------
> 
> You can probably see why your client doesn't want to use ClamAV.  Of
> the two which you are testing, my results indicate that Avast is much
> better than F-Secure.  However you do need to keep in mind that these
> tests are (1) only on mail and (2) only on mail sent to my business.
> I have no information of similar quality for scanning filesystems.
> 
> You should also keep in mind that even on a good day, 15% of the mail
> carrying malicious payloads will get past *all* available anti-virus
> packages.  So you can't rely on anti-virus alone.  If you do, it is
> inevitable that malware will get past your defences.


For business needs it's good to combine at least 2 independent vendors to 
cover a good portion of new samples.

And for ClamAV: add the Sanesecurity and in particular the Securiteinfo 
extra signatures and try again :)

> 
> All the malicious emails above were detected by my own milter, but I
> do have the luxury of making the rules here.
> 
> HTH
> 


Carsten

From list+rspamd at gcore.biz  Fri Mar 15 16:24:29 2024
From: list+rspamd at gcore.biz (Gerald Galster)
Date: Fri, 15 Mar 2024 17:24:29 +0100
Subject: [Rspamd-Users] Map rcpt and from in single map
In-Reply-To: <cd150b38-01be-411e-9f1c-43f41db1ee13@wpx.net>
References: <143fabae-5fe7-4977-9607-d104d7dac830@wpx.net>
 <898EC6AB-CFA9-44B1-814E-68302DD9A782@gcore.biz>
 <cd150b38-01be-411e-9f1c-43f41db1ee13@wpx.net>
Message-ID: <587FC28B-7E1A-483F-ACCF-5018C277E22C@gcore.biz>

> This multimap type=selector looks like should do exactly what I want but can't get it to work.
> 
> /etc/rspamd/local.d/multimap.conf
> 
> whitelist_rcpt_from {
>  type = "selector";
>  selector = 'from.lower;rcpts.lower';

   ^^^^^^ try:

selector = 'from.lower.append(";");rcpts.first.lower';

There might be more than one recipient.

>  symbol = "TEST_RCPT_FROM_WHITE";
>  map = "/var/lib/rspamd/whitelits_rcpt_from.map";
>  description = "whitelist rcpt_mail:from combination";
>  #score = -10.0;
>  action = "reject";
> }
> 
> /var/lib/rspamd/whitelits_rcpt_from.map
> from-mail at domain.com:to-mail at otherdomain.tld
> from-mail at domain.com,to-mail at otherdomain.tld

With your original selector the two values are concatenated without a separator, so the key would have been:
from-mail at domain.comto-mail at otherdomain.tld

With the append(";") added to the selector it's:
from-mail at domain.com;to-mail at otherdomain.tld

To see such details, enable multimap debugging in local.d/logging.inc and watch your log:
debug_modules=['multimap'];
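
To make the key format concrete, here is a small Python simulation of 
what the combined selector is expected to produce. It is a sketch under 
assumptions, not Rspamd code: it assumes that the selector parts are 
simply concatenated in order and that only the first recipient is used; 
the names `map_key` and `lookup` and the example addresses are made up 
for illustration.

```python
def map_key(sender, recipients, separator=";"):
    """Simulate from.lower.append(";") followed by rcpts.first.lower."""
    return sender.lower() + separator + recipients[0].lower()


def lookup(map_lines, sender, recipients):
    """Check whether the simulated key appears as a line in the map file."""
    keys = {
        line.strip()
        for line in map_lines
        if line.strip() and not line.lstrip().startswith("#")
    }
    return map_key(sender, recipients) in keys


# Map content in the corrected format (one "from;rcpt" pair per line).
map_lines = [
    "# sender;recipient",
    "from-mail@domain.com;to-mail@otherdomain.tld",
]

rcpts = ["To-Mail@otherdomain.tld", "second@otherdomain.tld"]

print(map_key("From-Mail@domain.com", rcpts))
print(map_key("From-Mail@domain.com", rcpts, separator=""))
print(lookup(map_lines, "From-Mail@domain.com", rcpts))
```

The second call shows the separator-less key that the original selector 
would have produced, which matches neither the ":" nor the "," entry in 
the map file.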

Best regards,
Gerald

From rspamd at jubileegroup.co.uk  Fri Mar 15 16:53:25 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Fri, 15 Mar 2024 16:53:25 +0000 (GMT)
Subject: [Rspamd-Users] Avast antivirus - IO timeout
In-Reply-To: <d0ef02db-9c81-4f2f-8649-7cc273138fd1@ncxs.de>
References: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
 <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>
 <f1e5d40f-8fed-4f84-a430-4e3c3d617fbb@man.poznan.pl>
 <98574b9b-b343-9819-8e40-07666c9c1d9e@rspamd.com>
 <f96dbc5c-20fc-400e-976d-f2a66a4f6f34@man.poznan.pl>
 <9fa5dc9f-b979-92d0-6726-556130686863@jubileegroup.co.uk>
 <d0ef02db-9c81-4f2f-8649-7cc273138fd1@ncxs.de>
Message-ID: <3f3dbb9f-9055-739b-7f1a-3ca4f3ca3224@jubileegroup.co.uk>

Hi there,

On Fri, 15 Mar 2024, Carsten Rosenberg wrote:

> ...
> For business needs it's good to combine at least 2 independent vendors to 
> cover a good portion of new samples.

I wouldn't argue that more scanners will probably catch more malware,
but the fact remains that even if you use *all* the scanners you still
won't catch all the malware with them.

> And for ClamAV: add Sanesecurity and in particular Securiteinfo extra 
> signatures and try again :)

We've used Sanesecurity for fifteen years, and we submit all the
undetected malware automatically to the AV vendors which don't detect
them (including ClamAV, Sanesecurity and Securiteinfo).  I don't have
figures to hand separately for malware detections by the Sanesecurity
and Securiteinfo signatures, however I can say that the Sanesecurity
detection rates for malware seen in email here average *roughly* 70%.
Over the past three years we've averaged about 500 reports of failed
detection per month (spam and malware combined) sent to Sanesecurity.

-- 

73,
Ged.

From usenet at schani.com  Sat Mar 16 14:20:55 2024
From: usenet at schani.com (christian)
Date: Sat, 16 Mar 2024 15:20:55 +0100
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
 <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
 <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>
Message-ID: <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>

Hello Vsevolod,
thank you for your feedback.
First of all: I'm a Rspamd beginner and still have a lot to learn. After 
a few weeks, the filter results from Rspamd are already better than with 
my old spam filter ASSP, which I used for a few years.

The reason I'm asking about the SpamAssassin integration is that I 
still can't handle a few waves of spam.
Since SpamAssassin also allows external rule sources (Heinlein, 
schaal-it), I thought it would give me a better handle on local 
(German) spam. I don't know if that's the case.
Unfortunately, I haven't been able to get spamd/SpamAssassin to run with 
Rspamd yet, so I can't offer any comparisons.
I'm currently training the statistical classifier (BAYES_SPAM) and making 
sure I keep it clean, but the results are still not very good. I don't 
really know what data underlies the results. For example, here is an email 
that has already undergone several checks in Rspamd:

X-Spamd-Result: default: False [20.03 / 30.00];
PH_SURBL_MULTI(7.50)[dennisberrien.com:url];
NEURAL_SPAM_SHORT(3.00)[1,000];
HFILTER_HOSTNAME_UNKNOWN(2.50)[];
MISSING_MID(2.50)[];
IP_REPUTATION_SPAM(1.39)[asn: 47674(0.23), country: MO(0.01), ip: 
185.236.231.93(0.00)];
R_BAD_CTE_7BIT(1.05)[7bit,utf8];
R_NO_SPACE_IN_FROM(1.00)[];
MV_CASE(0.50)[];
FORGED_SENDER(0.30)[no-reply at ehtakoskelo.fi,return at ehtakoskelo.fi];
MIME_HTML_ONLY(0.20)[];
ONCE_RECEIVED(0.10)[];
MX_GOOD(-0.01)[];
BAYES_SPAM(-5.00)[99.99%];

But I have already learned such emails using rspamc learn_spam, and 
BAYES_SPAM still says that it is HAM. For this example email it doesn't 
matter because the other values clearly indicate spam, but I have some 
in the border area where the Bayes value is important.

I have now
BAYES_SPAM redis 5378 1
BAYES_HAM redis 5283 1

and still there are spam emails that have a HAM Bayes value.
Is only the content, i.e. words and terms, of the emails learned or is 
it also header data such as From, Env From, Country, IP?
I currently mostly achieve good results with multimap and specially 
created spam words and domain blacklists. But I always have to stay up 
to date and find the spam terms with every wave of spam and enter them 
into my MAPs.

If I have learned 500 emails containing terms like
Bitcoin trading
Bitcoin\sAdvice
BlackRock
Blockchain
Blockchain assets
Cyber coins
Cyber transactions
Cyber currency
Digital\scurrencies
Japanese kitchen knives

then Rspamd Bayes should recognize these emails as spam. But 
the value is still -3.

The question arises as to whether my setup is correct.
I integrated RspamD into Postfix via the Milter interface.

milter_default_action = accept
milter_protocol = 6
smtpd_milters = inet:localhost:11332
non_smtpd_milters = inet:127.0.0.1:11332

And I still need those
always_bcc=mailarchive at meineDomain.de
for an archive and currently to check the filter results.

I check these emails redirected via always_bcc manually and also 
register them via rspamc learn_spam or learn_ham.
The individual users (approx. 300) are not yet able to learn Spam/Ham 
themselves (sieve to rspamc etc.)

I'm not sure whether I might be getting incorrect values into the Bayes 
database. In the last few weeks I have deleted the redis DB 2-3 times 
and started learning again and have also made a conscious effort to keep 
everything clean. But it still doesn't quite fit.

That's the reason why I tried to get Spamassassin to work, but so far 
with little success.
I will continue to observe how my results with RspamD Bayes develop and 
continue to learn.
But I'm still very happy with RspamD because the results are much better 
than with my old ASSP environment.

Thank you for your efforts.
Best regards
Christian

On 15.03.2024 at 13:14, Vsevolod Stakhov wrote:
> On 15/03/2024 09:55, christian via Users wrote:
>> On 14.03.2024 at 18:51, Vsevolod Stakhov wrote:
>>
>>> Looks like XY problem to me: why do you need SA for Bayes counting 
>>> that it uses much more stupid algorithm for it? Of course, your whole 
>>> problem looks very weird to me. The *only* reason why SA integration 
>>> exists are testing and legacy concerns (not Bayes or regexps where 
>>> Rspamd can do much better job).
>>
>> I still get a lot of spam that isn't recognized. There are batches of 
>> spam campaigns that come from different senders from different 
>> countries, with the same appearance but different words on the same 
>> topic (financial, "hoonky" kitchen knife), which I can currently only 
>> block with multimap and regex. But after 2 days the new wave comes.
>> The statistical function (BAYES_SPAM) is of no help because the 
>> results are not correct. The email has a value of 20, through ASN, 
>> RBL, Neural and Reputation. Then BAYES_Spam comes and says the email 
>> is ok -2. Learning doesn't help. I now learn every spam email again 
>> using rspamc learn_spam. The results do not improve.
>>
>> How do you solve this?
>> Christian
> 
> 
> That's very interesting and I would like to investigate more. In fact, 
> both SA and Rspamd are using more or less the same Bayes algorithm with 
> some slight differences on tokenisation logic.
> 
> If you have samples of misclassification, could you please do the 
> following things:
> 
> 1) Enable "bayes" debugging (add "bayes" to the list of `debug_modules` 
> array in the local.d/logging.inc)
> 2) Check all logs with tag "bayes" when you scan those messages and send 
> them to me (probably via private email if there's some confidential data 
> or large attachment)
> 3) Send me both samples and your Redis dump so I can try to experiment 
> with that
> 
> Maybe (3) would be a huge overkill in terms of privacy and amount of 
> data, so I would appreciate if you can do 1-2.
> 
> Thanks in advance!

From vsevolod at rspamd.com  Sat Mar 16 14:55:28 2024
From: vsevolod at rspamd.com (Vsevolod Stakhov)
Date: Sat, 16 Mar 2024 14:55:28 +0000
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
 <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
 <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>
 <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
Message-ID: <8edeb354-b02f-a20b-605d-062fefb9c5c2@rspamd.com>

On 16/03/2024 14:20, christian via Users wrote:
> Hello Vsevolod,
> thank you for your feedback.
> First of all: I'm a Rspamd beginner and still have a lot to learn. After 
> a few weeks, the filter results from Rspamd are already better than with 
> my old spam filter ASSP, which I used for a few years.
> 

>>
>> That's very interesting and I would like to investigate more. In fact, 
>> both SA and Rspamd are using more or less the same Bayes algorithm 
>> with some slight differences on tokenisation logic.
>>
>> If you have samples of misclassification, could you please do the 
>> following things:
>>
>> 1) Enable "bayes" debugging (add "bayes" to the list of 
>> `debug_modules` array in the local.d/logging.inc)
>> 2) Check all logs with tag "bayes" when you scan those messages and 
>> send them to me (probably via private email if there's some 
>> confidential data or large attachment)
>> 3) Send me both samples and your Redis dump so I can try to experiment 
>> with that
>>
>> Maybe (3) would be a huge overkill in terms of privacy and amount of 
>> data, so I would appreciate if you can do 1-2.
>>
>> Thanks in advance!

Again, could you please do what I have asked here? It might be very 
interesting to look at.

Another thing you could try is to use `rspamadm mime stat -b` command 
for a sample. Then, you will see the tokens Rspamd uses for that 
particular email. Afterwards, you can even check them in Redis using 
something like `HGETALL RS_<number>`, where `number` is printed by `mime 
stat`. You also will see this information if you do like I've asked in 
my previous email quoted above.

From bsd at todoo.biz  Sat Mar 16 15:46:16 2024
From: bsd at todoo.biz (bsd at todoo.biz)
Date: Sat, 16 Mar 2024 16:46:16 +0100
Subject: [Rspamd-Users] Problem with history display and recording / rspamd
 3.8.4
Message-ID: <934439AA-EAFE-4E15-BCD8-BB5A37FDAF3A@todoo.biz>

Since upgrading to the latest rspamd 3.8.4, I have noticed that a new error has popped up.
It seems to be related to a Redis error about a value type which has changed.

Unfortunately, I am not very experienced at managing Redis databases.


Does anyone know how to fix this properly?

This is the error reported:

2024-03-16 12:14:44 #2297419(controller) <f2dbc3>; lua; history_redis.lua:261: got error WRONGTYPE Operation against a key holding the wrong kind of value when getting history: no value


Thanks for your help. 
Greg

--

From rspamd at jubileegroup.co.uk  Sat Mar 16 16:00:28 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Sat, 16 Mar 2024 16:00:28 +0000 (GMT)
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
 <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
 <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>
 <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
Message-ID: <5b6bacce-e915-d15f-38a8-2da28b63e982@jubileegroup.co.uk>

Hi there,

On Sat, 16 Mar 2024, christian via Users wrote:

> ... e.g. an email that has already undergone several checks in RspamD:
>
> X-Spamd-Result: default: False [20.03 / 30.00];
> PH_SURBL_MULTI(7.50)[dennisberrien.com:url];
> NEURAL_SPAM_SHORT(3.00)[1,000];
> HFILTER_HOSTNAME_UNKNOWN(2.50)[];
> MISSING_MID(2.50)[];
> IP_REPUTATION_SPAM(1.39)[asn: 47674(0.23), country: MO(0.01), ip: 
> 185.236.231.93(0.00)];
> R_BAD_CTE_7BIT(1.05)[7bit,utf8];
> R_NO_SPACE_IN_FROM(1.00)[];
> MV_CASE(0.50)[];
> FORGED_SENDER(0.30)[no-reply at ehtakoskelo.fi,return at ehtakoskelo.fi];
> MIME_HTML_ONLY(0.20)[];
> ONCE_RECEIVED(0.10)[];
> MX_GOOD(-0.01)[];
> BAYES_SPAM(-5.00)[99.99%];
>
> But I have already learned such emails using rspamc learn_spam ...

I really do think that you're making it difficult for yourself.

AFAICT nothing good ever came out of AS47674.  The average DNSBL score
recorded in our database for connections from this ASN is 8.14.  That
means that on average, every connecting IP is on at least three of the
DNSBLs we use (the maximum weight for any single DNSBL here is 3.0).

There's really no point messing about with Bayes for ASNs like this
one, just drop everything from them.

If you wish I can easily provide a list of ASNs with scores greater
than whatever value you desire, which you then could drop with very
good confidence that nobody except the spammers would notice.

The 'score' here is the weighted average of the number of DNSBLs - in
our list of chosen BLs - on which an individual IP is found.  Spamhaus
'Zen' for example has a weight of 3; most of the others have weights
of only 1 or 2.  In my capacity as spam-hater-in-chief, I decide the
weights which we apply to the individual DNSBLs.  It works well now,
but I'm sure there's a lot of room for refinement.
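
For anyone who wants to try something similar, here is a minimal Python sketch of the scoring idea. It rests on assumptions: the per-IP score is taken to be the sum of the weights of the DNSBLs listing that IP, and the ASN score the mean over its connections. Only the weight of 3 for Spamhaus Zen comes from this message; the other list names and weights are placeholders.

```python
# Weight per DNSBL. Only the Zen weight is taken from the text above;
# the other two entries are hypothetical placeholders.
DNSBL_WEIGHTS = {
    "zen.spamhaus.org": 3.0,
    "bl.example-one.invalid": 2.0,
    "bl.example-two.invalid": 1.0,
}


def ip_score(listed_on):
    """Sum the weights of the DNSBLs on which one connecting IP is listed."""
    return sum(DNSBL_WEIGHTS.get(name, 0.0) for name in listed_on)


def asn_average(connections):
    """Average the per-IP scores over all connections seen from one ASN.

    `connections` is a list with one entry per connection; each entry is
    the list of DNSBL names that listed the connecting IP.
    """
    scores = [ip_score(listed_on) for listed_on in connections]
    return sum(scores) / len(scores) if scores else 0.0


# Three connections from one ASN: listed on two lists, on one list, on none.
example = [
    ["zen.spamhaus.org", "bl.example-two.invalid"],
    ["bl.example-one.invalid"],
    [],
]
print(asn_average(example))
```

With weights capped at 3.0, an average above 6.0 can only be reached if the typical connecting IP is on at least three lists, which is the reasoning used above for AS47674.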

score     count of
avg. >=   ASNs
------------------
  0.0     7283
  1.0     6406
  2.0     6197
  3.0     5985
  4.0     5705
  5.0     5417
  6.0     5047
  7.0     4647
  8.0     4252
  9.0     3797
 10.0     3275
 11.0     2804
 12.0     2330
 13.0     1826
 14.0     1296
 15.0      719
 16.0      325
 17.0      155
 18.0       46
 19.0       11
 20.0        1
------------------

As you can see, of the IPs from the 7283 ASNs which have connected to
us, about 6000 typically scored three or more from the DNSBLs we use.
The vast majority of those send absolutely nothing but spam.

We tempfail at a score > 1.5.  At 4.0 and above, if the spam rules
find a hit in the individual message, we autoreport.  In the past five
years this has produced one false report (a Microsoft server, which
managed to get itself listed by a couple of well-regarded blacklists
and which sent us a DMARC failure report from a mailing list).
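
Expressed as code, the two thresholds might look like the Python sketch below. It is a simplification under assumptions: the function name and action labels are invented here, and the two checks are treated as independent of each other, which this message does not actually spell out.

```python
TEMPFAIL_ABOVE = 1.5    # tempfail when the score is strictly greater than this
AUTOREPORT_FROM = 4.0   # autoreport at this score and above, if a spam rule hit


def actions_for(score, spam_rule_hit):
    """Return the actions suggested by the two thresholds described above.

    Hypothetical helper; the real setup is not shown in this thread.
    """
    actions = []
    if score > TEMPFAIL_ABOVE:
        actions.append("tempfail")
    if score >= AUTOREPORT_FROM and spam_rule_hit:
        actions.append("autoreport")
    return actions


print(actions_for(8.14, spam_rule_hit=True))
```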

As of this afternoon there are 5383 ASNs with average BL *counts* > 3
(that is ASNs with IPs which, on connecting to us are typically listed
on more than three of the DNSBLs which we use).

For most of the ASNs it's fairly pointless doing the scoring exercise
every time and I'd suggest that, unless you have other priorities than
running an efficient mail service, you just drop the connection like a
hot potato as soon as it comes in.

-- 

73,
Ged.

From vsevolod at rspamd.com  Sat Mar 16 16:03:34 2024
From: vsevolod at rspamd.com (Vsevolod Stakhov)
Date: Sat, 16 Mar 2024 16:03:34 +0000
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
 <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
 <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>
 <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
Message-ID: <748fda4d-d3d7-a66a-e8e4-6281a8624fdd@rspamd.com>

On 16/03/2024 14:20, christian via Users wrote:

> BAYES_SPAM(-5.00)[99.99%];

Oo, I totally missed that. Why do you have BAYES_SPAM symbol with a 
negative score?!
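
A quick way to spot this kind of misconfiguration is to scan the X-Spamd-Result header for symbols whose sign does not fit their name. The Python sketch below is only an illustration: the helper names are made up, and the rule "names ending in _SPAM should not be negative, names ending in _HAM should not be positive" is an assumption, not something Rspamd enforces.

```python
import re

# Matches lines such as "BAYES_SPAM(-5.00)[99.99%];" and captures name and score.
SYMBOL_RE = re.compile(r"^\s*([A-Z0-9_]+)\((-?\d+(?:\.\d+)?)\)")


def parse_symbols(header_text):
    """Return {symbol: score} for every symbol line in an X-Spamd-Result header."""
    symbols = {}
    for line in header_text.splitlines():
        match = SYMBOL_RE.match(line)
        if match:
            symbols[match.group(1)] = float(match.group(2))
    return symbols


def suspicious_signs(symbols):
    """List symbols whose sign contradicts their name (assumed naming rule)."""
    flagged = []
    for name, score in symbols.items():
        if name.endswith("_SPAM") and score < 0:
            flagged.append(name)
        elif name.endswith("_HAM") and score > 0:
            flagged.append(name)
    return sorted(flagged)


sample = """X-Spamd-Result: default: False [20.03 / 30.00];
PH_SURBL_MULTI(7.50)[dennisberrien.com:url];
NEURAL_SPAM_SHORT(3.00)[1,000];
MX_GOOD(-0.01)[];
BAYES_SPAM(-5.00)[99.99%];"""

flagged = suspicious_signs(parse_symbols(sample))
print(flagged)
```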

From usenet at schani.com  Sat Mar 16 16:16:02 2024
From: usenet at schani.com (christian)
Date: Sat, 16 Mar 2024 17:16:02 +0100
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <5b6bacce-e915-d15f-38a8-2da28b63e982@jubileegroup.co.uk>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
 <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
 <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>
 <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
 <5b6bacce-e915-d15f-38a8-2da28b63e982@jubileegroup.co.uk>
Message-ID: <08c6a608-d900-40e2-8d7b-57d0eb71b584@schani.com>


On 16.03.2024 at 17:00, G.W. Haywood wrote:

> 
> If you wish I can easily provide a list of ASNs with scores greater
> than whatever value you desire, which you then could drop with very
> good confidence that nobody except the spammers would notice.


Very gladly,
I put together a list of the 15 worst ASNs from this website:
https://emretosunkaya.com/bad-asn-list-to-block-in-your-web-firewall-to-harden-against-malicious-attacks/

The worst for me is: AS36352 AS-COLOCROSSING


Thank you
Christian

From usenet at schani.com  Sat Mar 16 16:17:16 2024
From: usenet at schani.com (christian)
Date: Sat, 16 Mar 2024 17:17:16 +0100
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <748fda4d-d3d7-a66a-e8e4-6281a8624fdd@rspamd.com>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
 <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
 <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>
 <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
 <748fda4d-d3d7-a66a-e8e4-6281a8624fdd@rspamd.com>
Message-ID: <23cde335-ccb9-4245-913c-9752b5dc8d92@schani.com>

On 16.03.2024 at 17:03, Vsevolod Stakhov wrote:
> On 16/03/2024 14:20, christian via Users wrote:
> 
>> BAYES_SPAM(-5.00)[99.99%];
> 
> Oo, I totally missed that. Why do you have BAYES_SPAM symbol with a 
> negative score?!


I think that's a mistake on my part. I don't see the details yet. Excuse me.

Christian

From rspamd at jubileegroup.co.uk  Sun Mar 17 11:04:45 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Sun, 17 Mar 2024 11:04:45 +0000 (GMT)
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <08c6a608-d900-40e2-8d7b-57d0eb71b584@schani.com>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
 <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
 <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>
 <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
 <5b6bacce-e915-d15f-38a8-2da28b63e982@jubileegroup.co.uk>
 <08c6a608-d900-40e2-8d7b-57d0eb71b584@schani.com>
Message-ID: <ee781771-2ccd-de9d-76d2-6435a566c36b@jubileegroup.co.uk>

Hi there,

On Sat, 16 Mar 2024, christian via Users wrote:
> On 16.03.2024 at 17:00, G.W. Haywood wrote:
>
>> If you wish I can easily provide a list of ASNs with scores greater
>> than whatever value you desire, which you then could drop with very
>> good confidence that nobody except the spammers would notice.
>
> Very gladly,
> I put together a list of the 15 worst ASNs from this website:
> https://emretosunkaya.com/bad-asn-list-to-block-in-your-web-firewall-to-harden-against-malicious-attacks/

Be cautious with that list.  For example from about 30,000 connections
Hetzner (AS24940) scores only 0.19 here.  We do have customers on that
AS, which will tend to skew our measurements - but not by very much.

> The worst for me is: AS36352 AS-COLOCROSSING

Yes, score here 10.48 from approaching 10,000 connections but it's far
from the worst offender that we see.  Below are those with scores more
than 4.0 and more than a thousand connections in the past year.  It's
difficult I think to call any one of them the 'worst' offender, as the
numbers of connections and what those connections try to do must both
be taken into account.  Some of them do nothing but send what I'd call
perfectly ordinary spam; some of them do nothing but make attacks which
try to compromise our servers; some of them send a mixture of malicious
mail and legitimate mail.  If anything those which send a mixture are a
bigger problem than those which have no legitimate reason to connect,
and that's the main reason that we need to make all these measurements.
If we could block all the Bad Guys all the time things would be easier.
You'll probably want to use a monospace font to see the table well.

  asnum  |                        asname                         | score | count 
---------+-------------------------------------------------------+-------+-------
   50613 | ADVANIA ISLAND EHF                                    |  4.09 |  1178
   51659 | LLC BAXET                                             |  4.40 |  1431
   14061 | DIGITALOCEAN-ASN                                      |  4.48 | 12697
    8100 | ASN-QUADRANET-GLOBAL                                  |  4.74 |  2990
   35913 | DEDIPATH-LLC                                          |  5.02 |  1134
   38365 | BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD. |  6.20 |  1807
    4837 | CHINA UNICOM CHINA169 BACKBONE                        |  6.51 |  8187
    4812 | CHINA TELECOM (GROUP)                                 |  6.63 |  1817
   12389 | ROSTELECOM                                            |  6.73 |  1315
   45090 | SHENZHEN TENCENT COMPUTER SYSTEMS COMPANY LIMITED     |  6.75 |  1922
    4134 | CHINANET                                              |  6.83 | 29141
    7922 | COMCAST-7922                                          |  7.38 |  1965
  208708 | EUROCABLE LTD                                         |  8.10 |  2984
   46573 | LAYER-HOST                                            |  8.20 |  2723
  136052 | PT CLOUD HOSTING INDONESIA                            |  8.50 |  1688
  210228 | WEB HOSTED GROUP LTD                                  |  8.55 |  7136
   42864 | GIGANET INTERNET SZOLGALTATO KFT                      |  8.88 |  3219
  200391 | KREZ 999 EOOD                                         |  9.07 |  1381
   24560 | BHARTI AIRTEL LTD., TELEMEDIA SERVICES                |  9.38 |  1753
  213035 | DES CAPITAL B.V.                                      |  9.39 | 27999
  206873 | GALAXYSTAR LLC                                        |  9.42 |  1836
    9808 | GUANGDONG MOBILE COMMUNICATION CO.LTD.                |  9.50 |  2904
    4766 | KOREA TELECOM                                         |  9.59 |  2417
   22773 | ASN-CXA-ALL-CCI-22773-RDC                             |  9.72 |  3325
  135905 | VIETNAM POSTS AND TELECOMMUNICATIONS GROUP            | 10.29 |  2782
  211760 | SUISSE LIMITED                                        | 10.44 |  2599
   36352 | AS-COLOCROSSING                                       | 10.49 |  9223
    1239 | SPRINTLINK                                            | 10.49 |  3003
  209605 | UAB HOST BALTIC                                       | 10.60 |  1349
  138687 | XDEER LIMITED                                         | 11.14 |  1793
   23650 | AS NUMBER FOR CHINANET JIANGSU PROVINCE BACKBONE      | 11.43 |  2112
  208476 | DANILENKO, ARTYOM                                     | 11.51 |  1208
  209371 | ENES KOKEN                                            | 11.62 |  3997
  211252 | DELIS LLC                                             | 11.97 | 15357
    4808 | CHINA UNICOM BEIJING PROVINCE NETWORK                 | 12.38 |  4451
  399471 | AS-SERVERION                                          | 12.41 |  1126
   35478 | BUNEA TELECOM SRL                                     | 12.73 |  2509
   17447 | NET4INDIA LTD                                         | 13.04 |  7091
   51447 | ROOTLAYER WEB SERVICES LTD.                           | 13.18 |  2697
  202306 | HOSTGLOBAL.PLUS LTD                                   | 14.20 |  4485
   17488 | HATHWAY IP OVER CABLE INTERNET                        | 14.33 |  1017
   39032 | IST TELEKOM LLC                                       | 14.98 |  2352
  207713 | GLOBAL INTERNET SOLUTIONS LLC                         | 17.48 |  1021

I think people who run these services should have to have a licence.  Then they
could have it taken away.  Pour encourager les autres, I'd start with AS207713,
and work upwards through the list, one per week, until they got the message.

HTH

-- 

73,
Ged.

From tkazmierczak at man.poznan.pl  Tue Mar 19 10:36:21 2024
From: tkazmierczak at man.poznan.pl (=?UTF-8?Q?Tomasz_Ka=C5=BAmierczak?=)
Date: Tue, 19 Mar 2024 11:36:21 +0100
Subject: [Rspamd-Users] Avast antivirus - IO timeout
In-Reply-To: <76effb4f-6a51-4cb7-87b3-0a94c06dec98@ncxs.de>
References: <1ca8065a-311f-42a1-99e0-7c84c8ac44fa@man.poznan.pl>
 <78B2AB73-E5C6-4221-B529-DF27E478239D@gcore.biz>
 <f1e5d40f-8fed-4f84-a430-4e3c3d617fbb@man.poznan.pl>
 <98574b9b-b343-9819-8e40-07666c9c1d9e@rspamd.com>
 <f96dbc5c-20fc-400e-976d-f2a66a4f6f34@man.poznan.pl>
 <76effb4f-6a51-4cb7-87b3-0a94c06dec98@ncxs.de>
Message-ID: <5db0fb04-3db7-4eb3-820f-cb72b27f88d0@man.poznan.pl>


On 15.03.2024 at 16:42, Carsten Rosenberg wrote:
> Your configuration for avast cannot work. And here is why: There is an 
> old Avast plugin (w/o rest-api usage) which seems never to have been documented.
>
> And there was a new one which was never merged into the main code:
>
> https://github.com/rspamd/rspamd/pull/4284
>
> We agreed to rework it a bit and then ...
>
> But the documentation was merged into rspamd.com :-/
>
Yes, I am a documentation believer.

Maybe it will be integrated someday, when PR 4284 is ready for production.

The documentation should also be improved for avast {...} and 
avast_rest {...}.

That would be great.

>
> Here is what you can try now:
>
> Use the old plugin by trying to set server to the unix socket of the 
> scanner:
>
>
> avast {
>
>   symbol = "AVAST_VIRUS";
>   servers = "/var/run/avast/scan.sock";
>   tmpdir = '/tmp';
>
> }
>
> tmpdir must be accessible by both services.
>
I added this old-style avast config and it works, thanks.

>
> Or you integrate the code from the PR above. We also use it in 
> production.
>
> But then you need to add type = "avast_rest"; to the config section.

I would prefer to wait for an official release; I am not an Rspamd expert.

>
>
> And as for Atlant: it's strange that there seems to be no valid 
> OPTIONS reply (it's a must-have in the RFC). The old F-Secure ICAP 
> was pretty good. Maybe it's only your URL (scheme).
>
> Have you got any reply from F-Secure?

"To update you, I checked your query further with a Product Expert and 
we have decided to escalate your case to R&D... "

They have talked to me twice, and I sent them logs and configuration.

Should I start a new thread on the list about the Atlant integration (ICAP)?

Kazix



From bjo at schafweide.org  Wed Mar 20 12:45:13 2024
From: bjo at schafweide.org (Bjoern Franke)
Date: Wed, 20 Mar 2024 13:45:13 +0100
Subject: [Rspamd-Users] Bayes questions and observations
In-Reply-To: <ee781771-2ccd-de9d-76d2-6435a566c36b@jubileegroup.co.uk>
References: <aaa970ef-82e1-421b-928e-0b2384a380c4@schani.com>
 <39f6d81d-fe78-1849-1f20-d0ac5045c5d3@rspamd.com>
 <7625005f-d759-4fa5-9b37-de76dd66511e@schani.com>
 <654e0cc9-67a8-d12c-837f-64fe3218d49c@rspamd.com>
 <deedb74d-9dac-4610-aa63-bfef2d07f625@schani.com>
 <5b6bacce-e915-d15f-38a8-2da28b63e982@jubileegroup.co.uk>
 <08c6a608-d900-40e2-8d7b-57d0eb71b584@schani.com>
 <ee781771-2ccd-de9d-76d2-6435a566c36b@jubileegroup.co.uk>
Message-ID: <7fd05208-cafb-4da2-8ca3-b22fba53c1a3@schafweide.org>

Hi,

> Be cautious with that list.  For example from about 30,000 connections
> Hetzner (AS24940) scores only 0.19 here.  We do have customers on that
> AS, which will tend to skew our measurements - but not by very much.
> 

Ack, even that list here runs on AS24940.

Regards
Bjoern


From dieter.schuetze at beo-doc.de  Fri Mar 22 12:30:11 2024
From: dieter.schuetze at beo-doc.de (=?UTF-8?Q?Dieter_Sch=C3=BCtze?=)
Date: Fri, 22 Mar 2024 13:30:11 +0100
Subject: [Rspamd-Users] rspamd 3.8.4 error on build
Message-ID: <9919beae-7c4b-471b-a463-f35b80e0b5df@beo-doc.de>

When building rspamd 3.8.4 from the sources I run into the following 
error.

[ 90%] Linking CXX executable rspamd
librspamd-server.so: error: undefined reference to 'sframe_encoder_write'
librspamd-server.so: error: undefined reference to 'sframe_encode'
librspamd-server.so: error: undefined reference to 'sframe_calc_fre_type'
librspamd-server.so: error: undefined reference to 'sframe_fde_create_func_info'
librspamd-server.so: error: undefined reference to 'sframe_encoder_add_funcdesc'
librspamd-server.so: error: undefined reference to 'sframe_encoder_add_fre'
librspamd-server.so: error: undefined reference to 'sframe_decode'
librspamd-server.so: error: undefined reference to 'sframe_decoder_get_num_fidx'
librspamd-server.so: error: undefined reference to 'sframe_decoder_free'
librspamd-server.so: error: undefined reference to 'sframe_decoder_get_abi_arch'
librspamd-server.so: error: undefined reference to 'sframe_encoder_get_abi_arch'
librspamd-server.so: error: undefined reference to 'sframe_encoder_get_num_fidx'
librspamd-server.so: error: undefined reference to 'sframe_decoder_get_funcdesc'
librspamd-server.so: error: undefined reference to 'sframe_decoder_get_hdr_size'
librspamd-server.so: error: undefined reference to 'sframe_decoder_get_fre'
librspamd-server.so: error: undefined reference to 'sframe_decoder_get_fixed_fp_offset'
librspamd-server.so: error: undefined reference to 'sframe_decoder_get_fixed_ra_offset'
librspamd-server.so: error: undefined reference to 'sframe_encoder_free'
collect2: error: ld returned 1 exit status

Can someone give me a hint? Thank you

Regards Dieter



From rspamd at jubileegroup.co.uk  Fri Mar 22 13:00:42 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Fri, 22 Mar 2024 13:00:42 +0000 (GMT)
Subject: [Rspamd-Users] rspamd 3.8.4 error on build
In-Reply-To: <9919beae-7c4b-471b-a463-f35b80e0b5df@beo-doc.de>
References: <9919beae-7c4b-471b-a463-f35b80e0b5df@beo-doc.de>
Message-ID: <62d3db3f-d15d-fa14-bc11-f5e8d490df3a@jubileegroup.co.uk>

Hi there,

On Fri, 22 Mar 2024, Dieter Schütze via Users wrote:

> When building rspamd 3.8.4 from the sources I run into the following error. [ 
> 90%] Linking CXX executable rspamd librspamd-server.so: error: undefined 
> reference to 'sframe_encoder_write' ...
> ...
> Can someone give me a hint?

You seem to be missing at least some of the dependencies.

Did you check that you have installed all the necessary packages?

-- 

73,
Ged.

From dieter.schuetze at beo-doc.de  Fri Mar 22 13:35:04 2024
From: dieter.schuetze at beo-doc.de (=?UTF-8?Q?Dieter_Sch=C3=BCtze?=)
Date: Fri, 22 Mar 2024 14:35:04 +0100
Subject: [Rspamd-Users] rspamd 3.8.4 error on build
In-Reply-To: <62d3db3f-d15d-fa14-bc11-f5e8d490df3a@jubileegroup.co.uk>
References: <9919beae-7c4b-471b-a463-f35b80e0b5df@beo-doc.de>
 <62d3db3f-d15d-fa14-bc11-f5e8d490df3a@jubileegroup.co.uk>
Message-ID: <383382c4-d99a-4205-8346-c811aac0d7ba@beo-doc.de>

   Hi,
I thought I had them all

glib2-devel
lib64event-devel
lapack-devel
sodium-devel
unwind-devel
openssl-devel
pcre2-devel
lib64archive-devel
lib64pkgconf-devel
cmake
gmime3.0-devel
file-devel
ragel
icu-devel
lua-devel
hyperscan-devel
jemalloc-devel
openblas-devel
binutils-devel
sqlite3-devel
lib64spf2-devel
lib64opendkim-devel
lib64milter-devel
lib64pkgconf-devel
lib64magic-devel
lib64icu-devel
lib64zlib-devel
lib64hiredis-devel
lib64dwarf-devel


On 22.03.24 at 14:00, G.W. Haywood wrote:

> Hi there,
> On Fri, 22 Mar 2024, Dieter Schütze via Users wrote:
>
>> When building rspamd 3.8.4 from the sources I run into the following
>> error. [ 90%] Linking CXX executable rspamd librspamd-server.so:
>> error: undefined reference to 'sframe_encoder_write' ...
>> ...
>> Can someone give me a hint?
>
> You seem to be missing at least some of the dependencies.
> Did you check that you have installed all the necessary packages?

From list+rspamd at gcore.biz  Fri Mar 22 14:49:25 2024
From: list+rspamd at gcore.biz (Gerald Galster)
Date: Fri, 22 Mar 2024 15:49:25 +0100
Subject: [Rspamd-Users] rspamd 3.8.4 error on build
In-Reply-To: <9919beae-7c4b-471b-a463-f35b80e0b5df@beo-doc.de>
References: <9919beae-7c4b-471b-a463-f35b80e0b5df@beo-doc.de>
Message-ID: <B017ECE9-5A50-4868-BBE1-45604AF443A2@gcore.biz>

> When building rspamd 3.8.4 from the sources I run into the following error.
> [ 90%] Linking CXX executable rspamd librspamd-server.so: error:
> undefined reference to 'sframe_encoder_write' librspamd-server.so:

This is probably related to binutils.

Which distribution are you using?

For RHEL9 there are extra packages installed when building the rpm:

gcc-toolset-12-runtime-12.0-6.el9.x86_64
gcc-toolset-12-gcc-12.2.1-7.4.el9.x86_64
gcc-toolset-12-libstdc++-devel-12.2.1-7.4.el9.x86_64
gcc-toolset-12-gcc-c++-12.2.1-7.4.el9.x86_64
gcc-toolset-12-annobin-docs-11.08-2.el9.noarch
gcc-toolset-12-annobin-plugin-gcc-11.08-2.el9.x86_64
gcc-toolset-12-binutils-gold-2.38-19.el9.x86_64
gcc-toolset-12-binutils-2.38-19.el9.x86_64

vs the standard binutils-devel-2.35.2-42.el9.x86_64

In that case you'd have to change the environment to use newer versions.
# source /opt/rh/gcc-toolset-12/enable
# see %build section of rspamd.spec file

Best regards,
Gerald

From rspamd at jubileegroup.co.uk  Fri Mar 22 14:58:00 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Fri, 22 Mar 2024 14:58:00 +0000 (GMT)
Subject: [Rspamd-Users] rspamd 3.8.4 error on build
In-Reply-To: <383382c4-d99a-4205-8346-c811aac0d7ba@beo-doc.de>
References: <9919beae-7c4b-471b-a463-f35b80e0b5df@beo-doc.de>
 <62d3db3f-d15d-fa14-bc11-f5e8d490df3a@jubileegroup.co.uk>
 <383382c4-d99a-4205-8346-c811aac0d7ba@beo-doc.de>
Message-ID: <e76fa031-a759-3e29-9d9f-d2a7d755b2de@jubileegroup.co.uk>

Hi there,

On Fri, 22 Mar 2024, Dieter Schütze via Users wrote:
>   On 22.03.24 at 14:00, G.W. Haywood wrote:
> >     On Fri, 22 Mar 2024, Dieter Schütze via Users wrote:
> > >
> > >   When building rspamd 3.8.4 from the sources I run into the following
> > >   error. [ 90%] Linking CXX executable rspamd librspamd-server.so:
> > >   error: undefined reference to 'sframe_encoder_write' ...
> > >   ...
> > >   Can someone give me a hint?
> >
> > You seem to be missing at least some of the dependencies.
> > Did you check that you have installed all the necessary packages?
>
> I thought I had them all
> ...

If you're sure you have all the dependencies then please let us know

1. on exactly what architecture you're building

2. exactly how you got hold of the 3.8.4. source and

3. exactly how you performed the build.

FWIW I've just built rspamd using the version from github (3.9.0) with
no problems, but I didn't try to build 3.8.4 - is there any particular
reason you need the older version?

P.S. Please don't top-post.

-- 

73,
Ged.

From usenet at schani.com  Sat Mar 23 11:32:58 2024
From: usenet at schani.com (christian)
Date: Sat, 23 Mar 2024 12:32:58 +0100
Subject: [Rspamd-Users] duplicate error in logfile
Message-ID: <e3a853ef-9795-4a3c-a593-1649c4aaceaf@schani.com>

Hello,
I have recently been getting around 10 "duplicate entry" warnings in my log,
but I can't find the duplicates in the associated map file. The term
only appears once. So where is this duplicate? I have already deleted the
Redis database.
Another problem now is that a "service rspamd restart" takes
almost a minute.
Thanks for any tips.
Christian

2024-03-23 12:27:22 [warn] #172084(main) <uqpyz9>; map; 
rspamd_map_helper_insert_re: duplicate re entry found for map 
/etc/rspamd/maps.d/regex_body-HAM.map: geburtsurkunde (old value: '1', 
new: '1')

From rspamd at jubileegroup.co.uk  Sat Mar 23 11:55:17 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Sat, 23 Mar 2024 11:55:17 +0000 (GMT)
Subject: [Rspamd-Users] duplicate error in logfile
In-Reply-To: <e3a853ef-9795-4a3c-a593-1649c4aaceaf@schani.com>
References: <e3a853ef-9795-4a3c-a593-1649c4aaceaf@schani.com>
Message-ID: <f9597058-90ef-9c22-6944-6d81dcd4eea@jubileegroup.co.uk>

Hi there,

On Sat, 23 Mar 2024, christian via Users wrote:

> I have recently had around 10 "duplicate" entry warnings in my log.
>  But I can't find the duplicates in the associated map file. The term only 
> appears once. Then where is this duplicate? I deleted the redis database.
> The problem that occurs now is that a service rspamd restart takes almost a 
> minute.
> ...
> 2024-03-23 12:27:22 [warn] #172084(main) <uqpyz9>; map; 
> rspamd_map_helper_insert_re: duplicate re entry found for map 
> /etc/rspamd/maps.d/regex_body-HAM.map: geburtsurkunde (old value: '1', new: 
> '1')

Regular expressions are tricky.  Taking the fact that you seem to be
experiencing a drop in performance together with the warnings in your
log, I suspect that you have crafted a regex which is (a) not doing
what you think it's doing and (b) extremely inefficient, so causing
excessive CPU usage.  It's easy to get regex ranges so badly wrong
that you DoS yourself, especially if they use the '*' character.

It might be easier if you can post the map file, but if you don't want
to put it on the public list then do please feel free to send it to me
privately.  The message will probably be rejected (private mail to my
list address) but I'll see it anyway.

-- 

73,
Ged.

From tobias.westerhever at skyline.link38.eu  Sat Mar 23 12:07:00 2024
From: tobias.westerhever at skyline.link38.eu (Tobias Westerhever)
Date: Sat, 23 Mar 2024 12:07:00 +0000
Subject: [Rspamd-Users] Questions regarding how to increase rspamd's
 coverage on abused legitimate services/"living off trusted services" (LOTS)
Message-ID: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>

Hello fellow rspamd users,

First of all, apologies if the questions/suggestions below have already been discussed
here before (in which case, please point me to the relevant thread, as I was unable
to find one). This e-mail is something between rspamd-users and rspamd-development,
as I hope to implement the ideas below soon, if they strike you all as sensible.

Triggered by https://lots-project.com/, I was thinking of ways to increase rspamd's
coverage on phishing or malspam campaigns that rely on the abuse of legitimate services
("living off trusted services" [LOTS], similar to the "living off the land" TTPs in the
malware ecosystem). Somewhat related, it seems like IPFS has recently gained momentum
again in spam campaigns, sometimes through URL redirectors and the like.

My ideas are as follows:

- Currently, to the best of my understanding, rspamd does attempt to dereference
  shortened URLs, and checks the FQDNs against configured DNSBLs (correct me if this
  is wrong).

  However, regexp-based checks such as for IPFS gateway URLs, etc. are not performed
  on URLs dereferenced from a shortener URL in a message. Fixing this would probably
  reduce false negatives for known bad URLs if they are being disguised by a shortener.

- rspamd maintains a list of redirectors, but not of abused legitimate services
  (such as those mentioned by https://lots-project.com/). Unless a DNSBL lists
  either the involved FQDN (appears to happen rarely due to false positives), or a
  hash of the involved URL, rspamd misses that a message contains a link involving
  an abused legitimate service.

  Maybe introducing a map of such services, including a score of "how bad" the
  situation is, would make sense - similar to attachment types. For example, while
  a OneDrive link (1drv[.]ms et al.) could be legitimate in an e-mail, there is
  very little legitimate use of distributing a *.workers[.]dev (yet another service
  dumped on the world by Cloudflare without any apparent abuse prevention whatsoever :-/ )
  directly via an e-mail.

  While blankly blocking messages based on the presence of such LOTS links is probably
  not feasible, it would at least allow for some scoring, and machine learning to
  pick up such characteristics in spam messages.

- rspamd currently checks the file suffixes and MIME types of attachments. But it
  does not attempt to figure out whether a URL in a message would lead to the
  download of a file with a "bad" suffix (.lnk, etc.).

  Although this is not a silver bullet, adding checks for trying to determine the
  file suffix from a URL in a message could increase coverage on spam mails containing
  malicious links that are not flagged by DNSBLs and the like already.
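
As a rough illustration of the suffix idea above (this is not rspamd code), here is a minimal Python sketch that guesses the file suffix from the path of a URL and compares it with a list of "bad" suffixes. The function name and the suffix list are assumptions made up for this example; only .lnk is named in the text above.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import unquote, urlsplit

# Hypothetical deny-list; only ".lnk" comes from the message above.
BAD_SUFFIXES = {".lnk", ".iso", ".js", ".vbs", ".scr"}


def bad_suffix_in_url(url: str) -> Optional[str]:
    """Return the suspicious suffix found in the URL path, or None."""
    # Query string and fragment are ignored; percent-escapes are decoded.
    path = unquote(urlsplit(url).path)
    suffix = PurePosixPath(path).suffix.lower()
    return suffix if suffix in BAD_SUFFIXES else None


print(bad_suffix_in_url("https://example.com/files/Invoice.LNK?id=1"))
print(bad_suffix_in_url("https://example.com/about.html"))
```

A check like this only sees the text of the URL. The real download name can still be set by a redirect or by the server's response headers, so it can only add to a score, not serve as a verdict.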

Somewhat related are two other occasions for further tuning:

- In contrast to SpamAssassin, rspamd currently does not by default resolve IP addresses
  for links in messages, and does not check the reputation of these IPs against DNSBLs.

  I get that enabling this by default has a performance impact, as there can be
  dozens of links in a message, and slow DNS response times may cause a DoS against
  rspamd. But from my experience, enabling this picks up a decent amount of badness,
  pushing more messages over the edge to "spam message rejected".

  I therefore wonder if this is something that can be enabled by default again, if
  additional safeguards are in place to prevent excessive performance decrease.

- As attachment policies are increasingly tightened, PDF abuse has increased. Sometimes,
  PDFs disseminated in spam campaigns include a blurred image of the lure, overlaid
  by an IPFS gateway link. Sometimes, they directly contain JavaScript exploits, and
  so on.

  I wonder if rspamd could extract URLs from PDF attachments, and check these against
  local rules, such as regexp patterns looking for IPFS gateway URLs. Checking all
  these links against DNSBLs, however, is probably out of the question, given that there
  can be hundreds in a single PDF file.
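
To make "regexp patterns looking for IPFS gateway URLs" a bit more concrete, here is a hedged Python sketch of such a pattern. It assumes the two common gateway layouts (host/ipfs/<cid> and <cid>.ipfs.host) and the usual CID spellings; the minimum length used for CIDv1 and the function name are assumptions of this example, not something taken from rspamd.

```python
import re

# CIDv0 is "Qm" plus 44 base58 characters; CIDv1 in base32 starts with "b".
# The minimum of 58 further characters fits common sha2-256 CIDs and is an
# assumption; other hash functions give other lengths.
CID = r"(?:Qm[1-9A-HJ-NP-Za-km-z]{44}|b[a-z2-7]{58,})"

IPFS_URL = re.compile(
    r"^https?://"
    r"(?:"
    rf"[^/]+/ipfs/{CID}"        # path-style gateway: host/ipfs/<cid>
    r"|"
    rf"{CID}\.ipfs\.[^/]+"      # subdomain-style gateway: <cid>.ipfs.host
    r")"
    r"(?:[/?#]|$)"
)


def is_ipfs_gateway_url(url: str) -> bool:
    """True if the URL looks like a link through an IPFS HTTP gateway."""
    return IPFS_URL.match(url) is not None


cid_v0 = "Qm" + "a" * 44  # synthetic CID, only used to exercise the pattern
print(is_ipfs_gateway_url("https://gateway.example/ipfs/" + cid_v0))
print(is_ipfs_gateway_url("https://example.com/docs/ipfs-intro.html"))
```

The same expression could in principle be used as an entry in a regexp map, but the exact syntax accepted there should be checked against the rspamd documentation.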

What do you think? Any additional improvement potential I forgot (which is very likely)?

Cheers,
Tobias

From jose.celestino at gmail.com  Sat Mar 23 12:24:49 2024
From: jose.celestino at gmail.com (jose.celestino at gmail.com)
Date: Sat, 23 Mar 2024 12:24:49 +0000
Subject: [Rspamd-Users] Questions regarding how to increase rspamd's
 coverage on abused legitimate services/"living off trusted services" (LOTS)
In-Reply-To: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
References: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
Message-ID: <CAN25MDZyAxyQD9O3k3MFd-pFD5KvD-=r1bBGO_Kvm7Rx_aR2Dw@mail.gmail.com>

On Sat, Mar 23, 2024 at 12:08 PM Tobias Westerhever via Users
<users at lists.rspamd.com> wrote:
>
> Hello fellow rspamd users,
>
...
>
> - As attachment policies are increasingly tightened, PDF abuse has increased. Sometimes,
>   PDFs disseminated in spam campaigns include a blurred image of the lure, overlaid
>   by an IPFS gateway link. Sometimes, they directly contain JavaScript exploits, and
>   so on.
>
>   I wonder if rspamd could extract URLs from PDF attachments, and check these against
>   local rules, such as regexp patterns looking for IPFS gateway URLs. Checking all
>   these links against DNSBLs, however, is probably out of the question, given that there
>   can be hundreds in a single PDF file.

Hi,

The big issue that I'm seeing with PDFs is those encrypted with an
empty password, which rspamd skips altogether but which clients
seemingly open without trouble. You can increase the PDF_ENCRYPTED score,
but there are legitimate uses for encrypted PDFs.

That being said, rspamd can extract the URLs from PDFs and you can match
them against a list (regex) with the multimap module. If I'm not
mistaken, type = "url" doesn't return URLs carrying the content flag, so
you should use a selector like:

selector = "specific_urls({need_images = true, need_content = true,
ignore_redirected = false, limit = 50})"

From rspamd at jubileegroup.co.uk  Sat Mar 23 12:56:59 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Sat, 23 Mar 2024 12:56:59 +0000 (GMT)
Subject: [Rspamd-Users] Questions regarding how to increase rspamd's
 coverage on abused legitimate services/"living off trusted services" (LOTS)
In-Reply-To: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
References: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
Message-ID: <932ca1d7-9923-b46e-11be-c3bf743f1bf8@jubileegroup.co.uk>

Hi there,

On Sat, 23 Mar 2024, Tobias Westerhever via Users wrote:

> ...
> My ideas are as follows:
> ...
> ...
> What do you think?

Everything you've said is about looking at the message content.

As far as I'm concerned there are no "trusted services".

> Any additional improvement potential I forgot ...?

Look at the message headers.

The longer I work with mail abuse, the less I look at message content.
Yes of course there's something to be said for taking a quick look at
the content, but I tend not to get into it too deeply.  It's been my
experience, in more than a quarter of a century of fighting both spam
and malicious mail, that when you start to analyse message content in
depth (1) returns diminish much more rapidly than effort escalates and
(2) the effort we're talking about is both brain power and CPU cycles.

I find that I can make much more difference with much less effort by
looking at, for example, where the message came from rather than what
the message contains.

As far as I'm concerned, if a message has a URL or an attachment then
it's immediately suspect, and, for example, I tend to have lists of
things which won't be rejected rather than lists of things which will.
If nothing else this makes for very much shorter lists which are much
easier to manage.

My advice is don't go where you propose to go; it will be painful, it
isn't actually necessary, and ultimately you'll find that you'll be
fighting a losing battle.

But if you do decide to go there, by all means keep us posted. :)

-- 

73,
Ged.

From usenet at schani.com  Sat Mar 23 14:09:25 2024
From: usenet at schani.com (christian)
Date: Sat, 23 Mar 2024 15:09:25 +0100
Subject: [Rspamd-Users] Questions regarding how to increase rspamd's
 coverage on abused legitimate services/"living off trusted services" (LOTS)
In-Reply-To: <932ca1d7-9923-b46e-11be-c3bf743f1bf8@jubileegroup.co.uk>
References: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
 <932ca1d7-9923-b46e-11be-c3bf743f1bf8@jubileegroup.co.uk>
Message-ID: <a2c608aa-7139-454a-b016-94670578e5b4@schani.com>


On 23.03.2024 at 13:56, G.W. Haywood wrote:
> Hi there,
> 
> On Sat, 23 Mar 2024, Tobias Westerhever via Users wrote:
> 
>> ...
>> My ideas are as follows:
>> ...
>> ...
>> What do you think?
> 
> Everything you've said is about looking at the message content.
> 
> As far as I'm concerned there are no "trusted services".
> 
>> Any additional improvement potential I forgot ...?
> 
> Look at the message headers.
> 
> The longer I work with mail abuse, the less I look at message content.
> Yes of course there's something to be said for taking a quick look at
> the content, but I tend not to get into it too deeply.  It's been my
> experience, in more than a quarter of a century of fighting both spam
> and malicious mail, that when you start to analyse message content in
> depth (1) returns diminish much more rapidly than effort escalates and
> (2) the effort we're talking about is both brain power and CPU cycles.
> 
> I find that I can make much more difference with much less effort by
> looking at, for example, where the message came from rather than what
> the message contains.
> 
> As far as I'm concerned, if a message has a URL or an attachment then
> it's immediately suspect, and, for example, I tend to have lists of
> things which won't be rejected rather than lists of things which will.
> If nothing else this makes for very much shorter lists which are much
> easier to manage.
> 
> My advice is don't go where you propose to go; it will be painful, it
> isn't actually necessary, and ultimately you'll find that you'll be
> fighting a losing battle.
> 
> But if you do decide to go there, by all means keep us posted. :)
> 

I have sent an email directly to you. Has it arrived?
Christian

From usenet at schani.com  Sat Mar 23 14:11:25 2024
From: usenet at schani.com (christian)
Date: Sat, 23 Mar 2024 15:11:25 +0100
Subject: [Rspamd-Users] duplicate error in logfile
In-Reply-To: <f9597058-90ef-9c22-6944-6d81dcd4eea@jubileegroup.co.uk>
References: <e3a853ef-9795-4a3c-a593-1649c4aaceaf@schani.com>
 <f9597058-90ef-9c22-6944-6d81dcd4eea@jubileegroup.co.uk>
Message-ID: <883a7d62-3e2d-428e-b00a-7c8e08424727@schani.com>


On 23.03.2024 at 12:55, G.W. Haywood wrote:
> Hi there,
> 
> On Sat, 23 Mar 2024, christian via Users wrote:
> 
>> I have recently had around 10 "duplicate" entry warnings in my log.
>> But I can't find the duplicates in the associated map file. The term
>> only appears once. Then where is this duplicate? I deleted the redis 
>> database.
>> The problem that occurs now is that a service rspamd restart takes 
>> almost a minute.
>> ...
>> 2024-03-23 12:27:22 [warn] #172084(main) <uqpyz9>; map; 
>> rspamd_map_helper_insert_re: duplicate re entry found for map 
>> /etc/rspamd/maps.d/regex_body-HAM.map: geburtsurkunde (old value: '1', 
>> new: '1')
> 
> Regular expressions are tricky.  Taking the fact that you seem to be
> experiencing a drop in performance together with the warnings in your
> log, I suspect that you have crafted a regex which is (a) not doing
> what you think it's doing and (b) extremely inefficient, so causing
> excessive CPU usage.  It's easy to get regex ranges so badly wrong
> that you DoS yourself, especially if they use the '*' character.
> 
> It might be easier if you can post the map file, but if you don't want
> to put it on the public list then do please feel free to send it to me
> privately.  The message will probably be rejected (private mail to my
> list address) but I'll see it anyway.
> 

I have sent an email directly to you. Has it arrived?
Christian

From rspamd at jubileegroup.co.uk  Sat Mar 23 16:20:36 2024
From: rspamd at jubileegroup.co.uk (G.W. Haywood)
Date: Sat, 23 Mar 2024 16:20:36 +0000 (GMT)
Subject: [Rspamd-Users] Questions regarding how to increase rspamd's
 coverage on abused legitimate services/"living off trusted services" (LOTS)
In-Reply-To: <a2c608aa-7139-454a-b016-94670578e5b4@schani.com>
References: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
 <932ca1d7-9923-b46e-11be-c3bf743f1bf8@jubileegroup.co.uk>
 <a2c608aa-7139-454a-b016-94670578e5b4@schani.com>
Message-ID: <e1bb1462-e8d-615a-ae21-84687d87a7f@jubileegroup.co.uk>

Hi there,

On Sat, 23 Mar 2024, christian via Users wrote:
> ...
> I have sent an email directly to you. Has it arrived?

Yes, and I see that the word about which rspamd is complaining *is* in
fact in your HAM map twice.  Once with a capital letter, once without.
Of course it's in there as a German plural too, but that isn't an issue.

There are no issues with regex ranges of course, as you haven't used any.
Sorry, that was what we call a "red herring" - and I'd be surprised if
your map files were the cause of the slow startup.
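
A duplicate of this kind (the same word once capitalised, once not) is easy to miss by eye. The following Python sketch lists such entries. It assumes that the entries of the map are compared case-insensitively, which the warning suggests but which is not confirmed here, and that the pattern is the first whitespace-separated field on each line; the function name is made up for this example.

```python
from collections import defaultdict


def find_case_insensitive_duplicates(lines):
    """Map each lowercased key to its (line number, original key) entries,
    keeping only keys that occur more than once."""
    seen = defaultdict(list)
    for number, raw in enumerate(lines, start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key = line.split()[0]  # assumption: the pattern is the first field
        seen[key.lower()].append((number, key))
    return {k: v for k, v in seen.items() if len(v) > 1}


sample = ["# ham words", "Geburtsurkunde", "rechnung", "geburtsurkunde"]
for key, hits in find_case_insensitive_duplicates(sample).items():
    print(key, hits)
```

For a real map, the lines of the file (for example the regex_body-HAM.map from the warning) would be passed in place of the sample list.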

-- 

73,
Ged.

From usenet at schani.com  Sat Mar 23 17:14:19 2024
From: usenet at schani.com (christian)
Date: Sat, 23 Mar 2024 18:14:19 +0100
Subject: [Rspamd-Users] autolearn spam only works partially
Message-ID: <1cd6a9d1-7987-4f93-b2e5-e6037149aac7@schani.com>

Hello,
I have the problem that the Bayes autolearn function learns mostly
ham and only a small amount of spam; the spam/ham ratio is about 1:10. This
is despite a lot of spam coming in and being marked with add header,
rewrite subject or reject.
The log file then only says autolearn=unavailable

What could be the cause?

Christian


Server name 	Symbol 		Type 	Learns 	Users
local		BAYES_SPAM	redis	30	1
		BAYES_HAM	redis	150	1


I specified the following in statistic.conf:


   tokenizer {
     name = "osb";
   }
   cache {
   }
   new_schema = true; # Always use new schema
   store_tokens = false; # Redefine if storing of tokens is desired
   signatures = false; # Store learn signatures
   #per_user = true; # Enable per user classifier
   min_tokens = 15;
   backend = "redis";
   min_learns = 200;

   statfile {
     symbol = "BAYES_HAM";
     spam = false;
   }
   statfile {
     symbol = "BAYES_SPAM";
     spam = true;
   }
   learn_condition = 'return require("lua_bayes_learn").can_learn';

   autolearn = true;
   autolearn {
     spam_threshold = 6.0;
     junk_threshold = 4.0;
     ham_threshold = -0.5;
     check_balance = true;
     min_balance = 0.9;
   }

From dieter.schuetze at beo-doc.de  Sun Mar 24 14:22:05 2024
From: dieter.schuetze at beo-doc.de (=?UTF-8?Q?Dieter_Sch=C3=BCtze?=)
Date: Sun, 24 Mar 2024 15:22:05 +0100
Subject: [Rspamd-Users] rspamd 3.8.4 error on build
In-Reply-To: <e76fa031-a759-3e29-9d9f-d2a7d755b2de@jubileegroup.co.uk>
References: <9919beae-7c4b-471b-a463-f35b80e0b5df@beo-doc.de>
 <62d3db3f-d15d-fa14-bc11-f5e8d490df3a@jubileegroup.co.uk>
 <383382c4-d99a-4205-8346-c811aac0d7ba@beo-doc.de>
 <e76fa031-a759-3e29-9d9f-d2a7d755b2de@jubileegroup.co.uk>
Message-ID: <5019fa3f-c49f-40d1-bd8f-3ccff57db372@beo-doc.de>

   Hello,

   Found the missing one.
   It was gstreamer1.0-devel

   Thanks for the tip to take a closer look.

   BTW, I thought 3.8.4 was the latest official stable version

   On 22.03.24 at 15:58, G.W. Haywood wrote:

     Hi there,
     On Fri, 22 Mar 2024, Dieter Schütze via Users wrote:

     >   On 22.03.24 at 14:00, G.W. Haywood wrote:
     > >     On Fri, 22 Mar 2024, Dieter Schütze via Users wrote:
     > > >
     > > >   When building rspamd 3.8.4 from the sources I run into the following
     > > >   error. [ 90%] Linking CXX executable rspamd librspamd-server.so:
     > > >   error: undefined reference to 'sframe_encoder_write' ...
     > > >   ...
     > > >   Can someone give me a hint?
     > >
     > > You seem to be missing at least some of the dependencies.
     > > Did you check that you have installed all the necessary packages?
     >
     > I thought I had them all
     > ...

     If you're sure you have all the dependencies then please let us know
     1. on exactly what architecture you're building
     2. exactly how you got hold of the 3.8.4. source and
     3. exactly how you performed the build.
     FWIW I've just built rspamd using the version from github (3.9.0) with
     no problems, but I didn't try to build 3.8.4 - is there any particular
     reason you need the older version?
     P.S. Please don't top-post.

From caponecicero at gmail.com  Sun Mar 24 18:08:23 2024
From: caponecicero at gmail.com (Steve Witten)
Date: Sun, 24 Mar 2024 11:08:23 -0700
Subject: [Rspamd-Users] Recurring message in rspamd.log
In-Reply-To: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
References: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
Message-ID: <CALGB+_yj88cMUwGDEgMx0F7utvi_PcRvzV15us8D=mgfn29pzw@mail.gmail.com>

Greetings!

I have an *rspamd* installation running on *FreeBSD 14.0-RELEASE-p5*.  My
*rspamd* version is *3.8.4.*

Lately, I've been seeing lots of messages like this:

2024-03-24 10:59:18 #1155(rspamd_proxy) <jpa7ww>; lua;
lua_bayes_redis.lua:145: cannot get bayes statistics for BAYES_SPAM:
timeout while connecting the server


Is this normal?  The machine (a VPS) is pretty memory constrained...I have
rspamd and redis running on the same server.  This is next on my list to
address...

Thanks in advance.

--
Steve Witten

From caponecicero at gmail.com  Sun Mar 24 22:17:57 2024
From: caponecicero at gmail.com (Steve Witten)
Date: Sun, 24 Mar 2024 15:17:57 -0700
Subject: [Rspamd-Users] Recurring message in rspamd.log
In-Reply-To: <CALGB+_yj88cMUwGDEgMx0F7utvi_PcRvzV15us8D=mgfn29pzw@mail.gmail.com>
References: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
 <CALGB+_yj88cMUwGDEgMx0F7utvi_PcRvzV15us8D=mgfn29pzw@mail.gmail.com>
Message-ID: <CALGB+_yhW_4=_a27x-Xwd0oc390+XXbHvuChVcrJmDUdjcxt1A@mail.gmail.com>

I found some discrepancies between *${CONFDIR}/statistic.conf* and
*${CONFDIR}/local.d/classifier-bayes.conf*.  I corrected those and
restarted both redis and rspamd but to no avail...

*${CONFDIR}/statistic.conf*

classifier "bayes" {
  tokenizer {
    name = "osb";
  }
  cache {
  }
  new_schema = true; # Always use new schema
  store_tokens = false; # Redefine if storing of tokens is desired
  signatures = false; # Store learn signatures
  #per_user = true; # Enable per user classifier
  min_tokens = 11;
  backend = "redis";
  min_learns = 200;
  statfile {
    symbol = "BAYES_HAM";
    spam = false;
  }
  statfile {
    symbol = "BAYES_SPAM";
    spam = true;
  }
  learn_condition = 'return require("lua_bayes_learn").can_learn';
  # Autolearn sample
  # autolearn {
  #  spam_threshold = 6.0; # When to learn spam (score >= threshold and action is reject)
  #  junk_threshold = 4.0; # When to learn spam (score >= threshold and action is rewrite subject or add header, and has two or more positive results)
  #  ham_threshold = -0.5; # When to learn ham (score <= threshold and action is no action, and score is negative or has three or more negative results)
  #  check_balance = true; # Check spam and ham balance
  #  min_balance = 0.9; # Keep diff for spam/ham learns for at least this value
  #}
  .include(try=true; priority=1) "$LOCAL_CONFDIR/local.d/classifier-bayes.conf"
  .include(try=true; priority=10) "$LOCAL_CONFDIR/override.d/classifier-bayes.conf"
}
.include(try=true; priority=1) "$LOCAL_CONFDIR/local.d/statistic.conf"
.include(try=true; priority=10) "$LOCAL_CONFDIR/override.d/statistic.conf"

*${CONFDIR}/local.d/classifier-bayes.conf*

classifier "bayes" {
  tokenizer {
    name = "osb";
  }
  cache {
    backend = "redis";
  }
  new_schema = true; # Always use new schema
  store_tokens = false; # Redefine if storing of tokens is desired
  signatures = true; # Store learn signatures
  #per_user = true; # Enable per user classifier
  min_tokens = 11;
  backend = "redis";
  min_learns = 200;
  expire = 604800; # 7 days (60 * 60 * 24 * 7)
  statfile {
    symbol = "BAYES_HAM";
    spam = false;
  }
  statfile {
    symbol = "BAYES_SPAM";
    spam = true;
  }
  learn_condition = 'return require("lua_bayes_learn").can_learn';
  # Autolearn sample
  #
  autolearn {
    spam_threshold = 6.0; # When to learn spam (score >= threshold and action is reject)
    junk_threshold = 4.0; # When to learn spam (score >= threshold and action is rewrite subject or add header, and has two or more positive results)
    ham_threshold = -0.5; # When to learn ham (score <= threshold and action is no action, and score is negative or has three or more negative results)
    check_balance = true; # Check spam and ham balance
    min_balance = 0.9; # Keep diff for spam/ham learns for at least this value
  }
}
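
The comments on the three thresholds above can be read as a small decision rule. The Python sketch below only restates those comments; it is not the rspamd implementation, which lives in Lua and may apply further conditions. The function and class names are made up, and "positive results" and "negative results" are taken here to mean counts of symbols with positive or negative scores, which is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AutolearnThresholds:
    # Values as quoted in the configuration above.
    spam_threshold: float = 6.0
    junk_threshold: float = 4.0
    ham_threshold: float = -0.5


def autolearn_verdict(score, action, positive_results, negative_results,
                      t=AutolearnThresholds()) -> Optional[str]:
    """Return "spam", "ham" or None, following the config comments only."""
    if action == "reject" and score >= t.spam_threshold:
        return "spam"
    if (action in ("rewrite subject", "add header")
            and score >= t.junk_threshold and positive_results >= 2):
        return "spam"
    if (action == "no action" and score <= t.ham_threshold
            and (score < 0 or negative_results >= 3)):
        return "ham"
    return None  # neither rule applies, so nothing is learned


print(autolearn_verdict(7.2, "reject", 5, 0))
print(autolearn_verdict(0.81, "no action", 1, 3))
```

Under this reading, a message that scores slightly above zero with "no action" is learned as neither spam nor ham.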


--
Steve Witten

On Sun, Mar 24, 2024 at 11:08 AM Steve Witten <caponecicero at gmail.com>
wrote:

> Greetings!
>
> I have an *rspamd* installation running on *FreeBSD 14.0-RELEASE-p5*.  My
> *rspamd* version is *3.8.4.*
>
> Lately, I've been seeing lots of messages like this:
>

<snip />

From moiseev at mezonplus.ru  Mon Mar 25 06:25:28 2024
From: moiseev at mezonplus.ru (Alexander Moisseev)
Date: Mon, 25 Mar 2024 09:25:28 +0300
Subject: [Rspamd-Users] Recurring message in rspamd.log
In-Reply-To: <CALGB+_yhW_4=_a27x-Xwd0oc390+XXbHvuChVcrJmDUdjcxt1A@mail.gmail.com>
References: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
 <CALGB+_yj88cMUwGDEgMx0F7utvi_PcRvzV15us8D=mgfn29pzw@mail.gmail.com>
 <CALGB+_yhW_4=_a27x-Xwd0oc390+XXbHvuChVcrJmDUdjcxt1A@mail.gmail.com>
Message-ID: <6d17ba22-b3ce-8742-4b08-ea147f8a4ed1@mezonplus.ru>

On 25.03.2024 1:17, Steve Witten wrote:
> 
> *${CONFDIR}/statistic.conf*
> 
> classifier "bayes" {
>>    tokenizer {
>>      name = "osb";
>>    }
...
>>    #  min_balance = 0.9; # Keep diff for spam/ham learns for at least this
>> value
>>    #}
>>    .include(try=true; priority=1)
>> "$LOCAL_CONFDIR/local.d/classifier-bayes.conf"
>>    .include(try=true; priority=10)
>> "$LOCAL_CONFDIR/override.d/classifier-bayes.conf"
>> }
>> .include(try=true; priority=1) "$LOCAL_CONFDIR/local.d/statistic.conf"
>> .include(try=true; priority=10) "$LOCAL_CONFDIR/override.d/statistic.conf"
>>
> 
> *${CONFDIR}/local.d/classifier-bayes.conf*
> 
> classifier "bayes" {
>>    tokenizer {
>>      name = "osb";
>>    }
...
>> }

This is not correct.
1. Remove the 'classifier "bayes" {}' block declaration from classifier-bayes.conf, since the contents of classifier-bayes.conf are already included inside that block in statistic.conf.
2. There is no need to duplicate options in classifier-bayes.conf that are already defined in statistic.conf, as the files are merged.

Check that Redis is running, is responding to requests and matches the Rspamd configuration (# rspamadm configdump redis).
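
One way to test "running and responding" independently of Rspamd is to send Redis a PING by hand. The Python sketch below does this over either a Unix socket or TCP, using the plain inline form of the command. The function names are made up for this example, and the sketch assumes a server without a password; with authentication enabled the reply would be an error and the check would report False.

```python
import socket


def ping_on(sock: socket.socket) -> bool:
    """Send an inline PING on a connected socket and expect +PONG."""
    sock.sendall(b"PING\r\n")
    return sock.recv(64).startswith(b"+PONG")


def redis_ping(target, timeout: float = 2.0) -> bool:
    """target is a Unix socket path (str) or a (host, port) tuple."""
    family = socket.AF_UNIX if isinstance(target, str) else socket.AF_INET
    try:
        with socket.socket(family, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            sock.connect(target)
            return ping_on(sock)
    except OSError:
        return False  # refused, timed out, missing socket file, ...
```

A call such as redis_ping("/path/to/redis.sock") or redis_ping(("127.0.0.1", 6379)) should be given the same address that the configdump command above reports, and should be run as the user Rspamd runs as, so that socket permissions are tested as well.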


From caponecicero at gmail.com  Mon Mar 25 14:31:54 2024
From: caponecicero at gmail.com (Steve Witten)
Date: Mon, 25 Mar 2024 07:31:54 -0700
Subject: [Rspamd-Users] Recurring message in rspamd.log
In-Reply-To: <6d17ba22-b3ce-8742-4b08-ea147f8a4ed1@mezonplus.ru>
References: <547cc217-5775-43b9-8998-85976fd82c02@skyline.link38.eu>
 <CALGB+_yj88cMUwGDEgMx0F7utvi_PcRvzV15us8D=mgfn29pzw@mail.gmail.com>
 <CALGB+_yhW_4=_a27x-Xwd0oc390+XXbHvuChVcrJmDUdjcxt1A@mail.gmail.com>
 <6d17ba22-b3ce-8742-4b08-ea147f8a4ed1@mezonplus.ru>
Message-ID: <CALGB+_yQhTNtYTKBGX8Y3B+d0=MH+p3=bpUqUtFvKLZ+bvHj9A@mail.gmail.com>

See inline...

On Sun, Mar 24, 2024 at 11:27 PM Alexander Moisseev via Users <
users at lists.rspamd.com> wrote:

> On 25.03.2024 1:17, Steve Witten wrote:
> >
> > *${CONFDIR}/statistic.conf*
> >
> > classifier "bayes" {
> >>    tokenizer {
> >>      name = "osb";
> >>    }
>

<snip />


> This is not correct.
> 1. Remove the 'classifier "bayes" {}' block declaration from the
> classifier-bayes.conf since the classifier-bayes.conf contents is included
> inside the block in statistic.conf .
> 2. There is no need to duplicate options already defined in
> classifier-bayes.conf as files are merged.
>

This was NOT the problem at all...

> Check if Redis is running, responding to requests and match Rspamd
> configuration (# rspamadm configdump redis).
>

This was the problem.  I had been running redis as a server behind a
unix-domain socket (/var/run/redis/redis.sock) and configured rspamd
appropriately.  The socket had appropriate ownership and permissions.  I
changed redis to run behind an internet-domain socket (localhost:6329),
modified rspamd appropriately, restarted everything and the problem went
away.

Using a unix-domain socket for redis USED to work perfectly...so something
changed.  I would suggest that this is a bug.

-- 
Steve Witten

From list+rspamd at gcore.biz  Mon Mar 25 23:00:37 2024
From: list+rspamd at gcore.biz (Gerald Galster)
Date: Tue, 26 Mar 2024 00:00:37 +0100
Subject: [Rspamd-Users] autolearn spam only works partially
In-Reply-To: <1cd6a9d1-7987-4f93-b2e5-e6037149aac7@schani.com>
References: <1cd6a9d1-7987-4f93-b2e5-e6037149aac7@schani.com>
Message-ID: <2D526EBD-063C-454B-BBC1-39D2DCD3445D@gcore.biz>

> I have the problem that the statistic autolearn function only teaches Ham. Spam only in small numbers.
> SPAM/HAM ratio 1:10. Although a lot of spam comes in and is marked with add header, rewrite subject or reject. The log file then only says autolearn=unavailable
> 
> what could that be?

[...]

> 	autolearn {
> 	  spam_threshold = 6.0;
> 	  junk_threshold = 4.0;
> 	  ham_threshold = -0.5;
> 	  check_balance = true;
           ^^^^^^^^
> 	  min_balance = 0.9;
           ^^^^^^^^^

You could enable debug and look for messages like "skip learning spam, balance is not satisfied".

https://github.com/rspamd/rspamd/blob/47fe3c705a1f3495b4b0b23413b74bf6d7a40803/lualib/lua_bayes_learn.lua#L129
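
For orientation, here is a small Python sketch of how such a balance check might behave. It rests on an assumption about the linked Lua file, namely that learning one class is skipped when its learn count divided by (the other class's count + 1) exceeds 1 / min_balance. Please verify this against the source before relying on it; the function name is made up.

```python
def balance_blocks(spam_learns: int, ham_learns: int,
                   min_balance: float = 0.9):
    """Return (skip_spam, skip_ham) under the assumed ratio rule."""
    if spam_learns == 0 and ham_learns == 0:
        return (False, False)  # nothing learned yet, nothing to balance
    max_ratio = 1.0 / min_balance
    skip_spam = spam_learns / (ham_learns + 1) > max_ratio
    skip_ham = ham_learns / (spam_learns + 1) > max_ratio
    return (skip_spam, skip_ham)


# Learn counts from the statistics table in the original question.
print(balance_blocks(30, 150))
```

If the assumption holds, the debug log should show which class is actually being skipped, which is why enabling debug output is the more reliable test.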

Best regards,
Gerald