[Rspamd-Users] Scan time

Srikrishnan Chitoor jvenkat74 at yahoo.com
Wed Jun 29 09:42:52 UTC 2022


Hi
  I have finally identified the problem. The server is in India, and there seems to be high latency for the DNS tests that involve RSPAMD lists, especially the fuzzy check.
  I disabled the fuzzy check module and the average scan time came down from 4.5 seconds to 0.9 seconds.
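
  For anyone wanting to check lookup latency from their own server, here is a minimal Python sketch. It only times ordinary name lookups through the system resolver; it does not reproduce what rspamd itself does. The function name and the idea of passing in the resolver are my own assumptions for illustration.

```python
import socket
import time
from typing import Callable, Dict, Iterable


def time_lookups(
    names: Iterable[str],
    resolve: Callable[[str], object] = socket.gethostbyname,
) -> Dict[str, float]:
    """Return the wall-clock seconds spent on each lookup.

    Failed lookups are timed as well, because a slow failure (for
    example a timeout) costs the scanner just as much as a slow answer.
    """
    timings: Dict[str, float] = {}
    for name in names:
        start = time.perf_counter()
        try:
            resolve(name)
        except OSError:
            # socket.gaierror is a subclass of OSError; ignore the
            # failure itself and keep only the elapsed time.
            pass
        timings[name] = time.perf_counter() - start
    return timings
```

  Calling time_lookups() with the host names your configuration actually queries, then comparing the numbers with the same run from a machine elsewhere, should show whether the delay is tied to the server's location.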

  Is there any provision to mirror the RSPAMD lists locally? We can help with the hardware if required.

  On Wednesday, June 22, 2022 at 03:26:44 PM GMT+5:30, G.W. Haywood via Users <users at lists.rspamd.com> wrote:  
 
 Hi there,

On Wed, 22 Jun 2022, Srikrishnan Chitoor via Users wrote:

> We are testing rspamd on a 4GB VPS with some custom rules and maps.
> When we send a test email, we get scan time of about 4000ms (as
> mentioned in rspamd.log) for a small test email of about 2KB size.
> Is this normal?

The VPS might be running your mail software on a top-of-the-range 5GHz
16-core CPU or it might be little better than a 4.77MHz PC.  Before
anyone can say what might or might not be normal for your system you
need to describe it in much more detail than you have.  But with that
issue ignored my answer is yes - although with yet more qualification.

The word 'scan' can mean many different things but most of the time
people use it to mean passing the message from the mail processing
system to a tool of some kind which, in addition to looking for spam,
attempts to find various types of malware.  These tools can often use
a *lot* of CPU cycles to do their work, even for a very small message,
because they're looking for a huge number (order of tens of millions)
of different threats.  The way the system is structured can make a big
difference to processing times.  For example, if the scanner needs to
read its database afresh every time it sees a new message then the
database read time can be a much bigger number than actual scan time;
but the scan can only take place after the database has loaded, so the
overall time is greater.  Most of the time the scanner will be a long-
running daemon which only reloads its database when it detects that
there has been a change in it.  You've told us nothing about all this
for your system.
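
The "reload only when changed" behaviour of a long-running daemon can
be sketched roughly as below.  This is an illustration in Python, not
how any particular scanner does it: the class name, the loader, and
the use of the file modification time as the change signal are all
assumptions.

```python
import os
from typing import Any, Callable, Optional


def read_text(path: str) -> str:
    """Stand-in loader; a real scanner would parse its signature files."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


class ReloadOnChange:
    """Keep a database in memory and reload it only when the file changes."""

    def __init__(self, path: str, loader: Callable[[str], Any] = read_text):
        self.path = path
        self._loader = loader
        self._mtime_ns: Optional[int] = None
        self._data: Any = None
        self.loads = 0  # number of (expensive) loads performed so far

    def get(self) -> Any:
        mtime_ns = os.stat(self.path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            # First call, or the file changed on disk: pay the load cost once.
            self._data = self._loader(self.path)
            self._mtime_ns = mtime_ns
            self.loads += 1
        return self._data
```

With this arrangement only the first message after a database update
pays the load time; every other message pays only the scan time.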

> Is there any way to find out which is taking lot of time?

Yes of course - for example increase the logging verbosity configured
for the scanner - but is this putting the cart before the horse?  By
that I mean what volumes (expressed in whatever units seem appropriate
to you in your circumstances) of mail do you scan, and what throughput
do you expect that your mail system can cope with?  Occasionally here
it takes 15 to 20 seconds to get all the replies from 18 DNSBL lookups
and, in addition, there are things like reverse IP and SPF lookups but
I don't care as long as the throughput is not degraded.  The milters
which hand messages off to the scanner are running at least 60 parallel
processes, so as long as the average mail throughput does not exceed
very roughly three or four messages per second there's no problem, it
won't hold up the mail queue.
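
The arithmetic behind "three or four messages per second" can be
sketched as follows.  It assumes that each of the 60 processes handles
one message at a time and that every message takes the worst case of
15 to 20 seconds; the function name is illustrative only.

```python
def max_throughput(parallel_workers: int, seconds_per_message: float) -> float:
    """Sustainable messages per second when each worker is busy for
    seconds_per_message on every message it handles."""
    if parallel_workers <= 0 or seconds_per_message <= 0:
        raise ValueError("both arguments must be positive")
    return parallel_workers / seconds_per_message


for latency in (15.0, 20.0):
    rate = max_throughput(60, latency)
    print(f"{latency:.0f} s per message -> {rate:.1f} messages/s")
# 15 s per message -> 4.0 messages/s
# 20 s per message -> 3.0 messages/s
```

As long as the average arrival rate stays below that figure, a slow
individual scan delays only its own message and not the queue.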

Scanning times don't necessarily scale with message size.  The content
type can make a big difference, especially if the scanner decides what
to look for based on the content type.  Most scanners do that, because
there's no point scanning a plain text message for the many and various
threats which consist of executable code.  Quickly looking
at our logs for this month I see that the size of messages which took
four point something seconds to scan ranges from just over 3kbytes to
just over 60kbytes.  Picking a single more or less random message size
of 130kbytes, scan times this month range from 5.4 to 20.5 seconds.

Spend some quality time with your logs.
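
As a starting point, something like the Python sketch below can pull
the scan times out of a log and summarise them.  It assumes that each
per-message log line contains a field of the form "time: 4012.5ms";
check that against your own log before relying on it.  The function
names are made up for this example.

```python
import re
import statistics
from typing import Dict, Iterable, List

# Assumed field format: "time: <number>ms".  Adjust the pattern if your
# log lines look different.
TIME_FIELD = re.compile(r"\btime: (\d+(?:\.\d+)?)ms\b")


def scan_times_ms(lines: Iterable[str]) -> List[float]:
    """Collect the scan time in milliseconds from every line that has one."""
    times: List[float] = []
    for line in lines:
        match = TIME_FIELD.search(line)
        if match:
            times.append(float(match.group(1)))
    return times


def summarise(times: List[float]) -> Dict[str, float]:
    """Return count, mean, median and maximum of the collected times."""
    if not times:
        raise ValueError("no scan times found")
    return {
        "count": len(times),
        "mean": statistics.mean(times),
        "median": statistics.median(times),
        "max": max(times),
    }
```

Running summarise(scan_times_ms(open(path))) over a day of logs, with
and without a suspect module enabled, gives a like-for-like comparison
rather than a single data point.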

If you want better answers, give more details in the questions. :)

> Found these in the log. Is it related to the problem described.
> If so, what can be done?

We don't yet know that there *is* a problem, so we don't yet know that
anything needs to be done.

Without supporting information "It took four seconds to scan a message"
is not an adequate problem description.

If you think there's a problem, you need to say why you think that.

-- 

73,
Ged.
-- 
Users mailing list
Users at lists.rspamd.com
https://lists.rspamd.com/mailman/listinfo/users
  

