[Rspamd-Users] rspamd with clamav and clamav-unofficial-sigs from Sanesecurity

Thu Nov 25 10:22:38 UTC 2021

Hi there,

On Thu, 25 Nov 2021, Andreas Wass - Glas Gasperlmair wrote:

> Does anyone have experience with rspamd, clamav and the additional
> clamav-unofficial-sigs from Sanesecurity?
>
> Is there anything additional to configure, or does it work straight
> away after installation?
>
> I'm thinking about the following section in
> /etc/clamav-unofficial-sigs/master.conf
> # Set path to clamd.pid file (see clamd.conf for path location).
> clamd_pid="/var/run/clamav/clamd.pid"
>
> On my Debian 11 Bullseye there is no clamd.pid

First of all, ClamAV is a suite of tools which you can use in a number
of ways.  Since we're talking about spam control here I'll limit this
to use with mail systems.  Generally, in mail systems which use ClamAV
the scanning process runs as a daemon, and another daemon hands to it
the data which will be scanned.  The scanning daemon is called clamd.

It needs some configuration, more or less all of which is in a single
file which upstream (Cisco, formerly Sourcefire) calls 'clamd.conf'
but which some package maintainers in some Linux distributions may
wantonly (a) split into pieces and (b) rename, which confuses the hell
out of everybody, especially on the ClamAV mailing list.  (I don't use
packages for the things I care about, and I care about security.)

When clamd starts it reads from its configuration file the location
AND NAME of the PID file, if any.  (Of course you may have to find
some way of telling it what its configuration file is, or else make
sure that you're using the compiled-in default.)  It creates the PID
file if it can, writing into it its PID and nothing else.  When it
stops (at least if it doesn't crash) it removes the file.  The PID
file (usually 'clamd.pid' but it doesn't have to be that name) is not
essential, but it can help, for example, if another process wants to
know the PID of the scanning process in order to talk to it.

There's a daemon called 'freshclam' which you can use to keep the
ClamAV database up to date, and freshclam can signal clamd to reload
the copy of the database which clamd keeps in memory when the copy in
mass storage has been updated.  The update process for the
'unofficial' databases can do that too.  They need to know the name
and location of the PID file.  There's a freshclam configuration file.
As it happens it doesn't contain the name and location of the PID
file, but it *does* contain the location of the clamd configuration
file, which it reads to find the PID file.

It's up to you to configure the daemons and to arrange for them to
start and stop at appropriate times, and to configure the MTA to use
clamd, e.g. via the provided milter.  Otherwise, clamd will just sit
there like a bump on a log and do nothing but use a lot of memory to
no purpose.
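
To make the PID file business a bit more concrete, here is a minimal
Python sketch of what a helper process might do: find the PidFile
directive in the clamd configuration, read the PID, and send clamd the
SIGUSR2 signal, which its man page documents as "reload the
database".  The function names and the sample configuration text are
made up for illustration, and the parsing is deliberately naive (no
quoted values, no split configuration files), so treat it as a sketch
under those assumptions and not as what freshclam or the unofficial
update script actually runs.

```python
import os
import signal


def find_pid_file(conf_text):
    """Return the value of the PidFile directive in clamd.conf text, or None."""
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "PidFile":
            return parts[1].strip()
    return None


def read_pid(pid_path):
    """Read the PID that clamd wrote into its PID file."""
    with open(pid_path) as fh:
        return int(fh.read().strip())


def ask_clamd_to_reload(conf_path):
    """Send SIGUSR2 to clamd.

    Returns the PID signalled, or None if no PidFile is configured.
    conf_path is whatever your installation calls clamd.conf.
    """
    with open(conf_path) as fh:
        pid_path = find_pid_file(fh.read())
    if pid_path is None:
        return None
    pid = read_pid(pid_path)
    os.kill(pid, signal.SIGUSR2)
    return pid
```

If find_pid_file() returns None for your clamd configuration, then no
PID file has been configured there, and anything that expects one
(such as the clamd_pid setting quoted above) has nothing to read.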

Secondly, for use with an MTA, bundled with the ClamAV distribution is
another daemon called 'clamav-milter' for interfacing between an MTA
and the clamd scanning daemon.  You won't be surprised to know that
there's configuration needed for that too, nor that it's usually in
'clamav-milter.conf'.  But you don't *have* to use clamav-milter;
other milters can do the interfacing, and if rspamd does the
interfacing then you may be able to do more than just the
accept/reject/tempfail/discard kind of thing that you can do with a
simple milter, for example some kind of scoring, which will hopefully
give you more accuracy.
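
For the curious, whatever does the interfacing ends up speaking
clamd's simple socket protocol.  The sketch below, in Python, shows
the documented INSTREAM command: the client sends "zINSTREAM\0", then
the data in chunks each prefixed by a 4-byte big-endian length, then a
zero-length chunk, and clamd answers with a line such as "stream: OK"
or "stream: <signature name> FOUND".  The function names and the
socket path are assumptions for illustration (check LocalSocket in
your own clamd configuration), and this is not a description of how
rspamd itself is implemented.

```python
import socket
import struct


def instream_frames(data, chunk_size=8192):
    """Yield the byte strings that make up one INSTREAM request."""
    yield b"zINSTREAM\0"
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        # Each chunk is prefixed by its length, 4 bytes, network byte order.
        yield struct.pack("!I", len(chunk)) + chunk
    # A zero-length chunk marks the end of the stream.
    yield struct.pack("!I", 0)


def parse_reply(reply):
    """Turn a clamd reply into ('OK', None), ('FOUND', name) or ('ERROR', text)."""
    text = reply.rstrip(b"\0\n").decode("utf-8", "replace")
    _, sep, verdict = text.partition(": ")
    if not sep:
        verdict = text
    if verdict == "OK":
        return ("OK", None)
    if verdict.endswith(" FOUND"):
        return ("FOUND", verdict[:-len(" FOUND")])
    return ("ERROR", verdict)


def scan_bytes(data, socket_path="/var/run/clamav/clamd.ctl"):
    """Send data to clamd over a Unix socket (path is an assumption)."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as s:
        s.connect(socket_path)
        for frame in instream_frames(data):
            s.sendall(frame)
        reply = b""
        while not reply.endswith(b"\0"):
            part = s.recv(4096)
            if not part:
                break
            reply += part
    return parse_reply(reply)
```

Because the caller gets the signature name back and not just a yes or
no, it is free to treat a hit from one signature set differently from
a hit from another, which is where the scoring idea comes in.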

Thirdly, and this means you need to think about it, the clamd daemon
uses a LOT of memory.  About a gigabyte at minimum with just the
official database, and more if you use 'unofficial' signatures, which
come from various sources and have even more various objectives (some
look only for malware, others for spam).  When clamd reloads the
database it briefly uses about twice as much memory as it does
normally, but you can configure it not to do it that way (the
ConcurrentDatabaseReload option in clamd.conf, in ClamAV 0.103 and
later) at the cost of not being able to scan anything while it's
reloading.  A reload can take quite a while (minutes) on a slow
machine, or a few seconds on a faster one.
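
If you want to see what clamd is costing you on your own machine, a
rough Python sketch like the following reads the resident set size
from /proc (so it is Linux-only) and applies the "about twice during
a reload" rule of thumb from the paragraph above.  The function names
are invented for illustration and the doubling is only an estimate,
not a measured figure.

```python
def parse_vm_rss_kib(status_text):
    """Return the VmRSS value in KiB from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:   1048576 kB".
            return int(line.split()[1])
    return None


def clamd_rss_mib(pid):
    """Resident memory of the process with this PID, in MiB (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        kib = parse_vm_rss_kib(fh.read())
    return None if kib is None else kib / 1024


def reload_peak_estimate_mib(normal_mib, concurrent_reload=True):
    """Rule-of-thumb peak during a database reload.

    With concurrent reloading the old and new databases are briefly
    both in memory, so assume roughly double; otherwise assume no
    extra memory (and no scanning while the reload runs).
    """
    return normal_mib * 2 if concurrent_reload else normal_mib
```

On a small virtual machine it is worth doing this sum before adding
the unofficial signatures, not after the kernel has started killing
things.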

Fourthly, my experience of the signatures from Sanesecurity is very
good, but don't expect too much from ClamAV.  Search my posts on the
ClamAV users' mailing list for information about detection rates.  You
can write your own rules for clamd in a variety of ways.  My own YARA
rules catch much more than even the unofficial signatures, but then I
know a lot about the spam I see coming in.
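
As a small illustration of writing your own rules, about the simplest
kind is a hash-based signature: one line per file in a '.hdb'
database, in the form MD5:size:name.  The Python sketch below builds
such a line from a sample's bytes.  The function name and the
signature name 'Local.Example.Sample' are made up for illustration,
and a hash signature only ever matches that exact file, so it is
nowhere near as flexible as YARA rules.

```python
import hashlib


def hdb_signature(data, name):
    """Build one line for a ClamAV .hdb file: MD5:FileSize:MalwareName."""
    if not name or ":" in name:
        # The colon is the field separator, so it cannot appear in the name.
        raise ValueError("signature name must be non-empty and contain no ':'")
    return f"{hashlib.md5(data).hexdigest()}:{len(data)}:{name}"


# Example: a signature line for a (harmless) five-byte sample.
line = hdb_signature(b"hello", "Local.Example.Sample")
```

Lines like this would go into a file with an '.hdb' extension in the
ClamAV database directory, after which clamd needs a reload to pick
them up.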

Finally, I'm very aware that this might all be overwhelming at first.
When you get used to it you'll wonder what you were worried about.

HTH

-- 

73,
Ged.