[Rspamd-Users] Scan time

G.W. Haywood rspamd at jubileegroup.co.uk
Wed Jun 29 15:28:57 UTC 2022

Hello again,

On Wed, 29 Jun 2022, Srikrishnan Chitoor wrote:
> ...
> As to why we think the scan times are high: 1. We run a live system
> with qpsmtpd and scan times are much lesser (typically at around 1-2
> secs), on a system that receives around 100,000 non SPAM emails and
> rejects about 5 times more emails.
> ...

We're getting somewhere at last.

You say that you have a system which takes one or two seconds to process
an email.  You say that this system receives about 100,000 non-spam +
500,000 spam emails (I assume per day, but you do need to spell these
things out clearly) which is a total of about 600,000 emails per day.

If you multiply the number of emails per day by the time per email,
that's a total of between 600,000 and 1,200,000 seconds per day.

There are 86,400 seconds in a typical day, and yet you have _not_ said
that a scan time of one to two seconds is a problem, so why should this
be the case for scan times of 4.5 seconds?  It might be, it might not.
More information is needed.  Your existing system is probably handling
ten or fifteen simultaneous emails continuously.  It might be able to
handle thirty or forty.  I don't know.  Only you can know that.  Test.
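
To make the arithmetic above concrete, here is a rough sketch in Python.
It assumes the figures are per day and that messages arrive evenly
around the clock, which real mail never does, so treat the results as
averages only.  The function name is made up for this illustration.

```python
SECONDS_PER_DAY = 86_400

def average_concurrency(messages_per_day, seconds_per_message):
    """Average number of messages being scanned at any one moment.

    Total scan-seconds per day divided by the seconds in a day
    (arrival rate multiplied by time per message).
    """
    return messages_per_day * seconds_per_message / SECONDS_PER_DAY

# 100,000 non-spam + 500,000 rejected, ASSUMED to be per day.
messages_per_day = 100_000 + 500_000

for scan_time in (1.0, 2.0, 4.5):
    in_flight = average_concurrency(messages_per_day, scan_time)
    print(f"{scan_time} s per message -> about {in_flight:.1f} in flight")
```

On those assumptions a scan time of 4.5 seconds means an average of
about thirty messages in flight, against roughly seven to fourteen at
one to two seconds.  Whether the machine can sustain that is exactly
what needs testing.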

You cannot take a single measurement on a single message and try to
scale that to the throughput for a daily load of half a million
messages.  Most of the time, if DNS is involved, the machines are
waiting for a response to a query that they've sent.  It might take a
couple of milliseconds to send the query but the reply might not be
received for tens of seconds (or indeed it might never be received).
While they are waiting for a reply, the machines can get on with other
things - such as processing other messages.
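
A toy illustration of that point, using Python's asyncio.  It is not a
model of how any particular scanner works internally; fake_scan() is a
hypothetical stand-in that does nothing but wait, the way a scan waits
for a DNS reply, and the numbers are invented.

```python
import asyncio
import time

async def fake_scan(dns_wait):
    # Stand-in for a scan that spends nearly all of its time waiting
    # for a DNS reply.  No real lookup is performed.
    await asyncio.sleep(dns_wait)

async def scan_batch(count, dns_wait):
    # Start all the scans together and wait until every one has finished.
    start = time.perf_counter()
    await asyncio.gather(*(fake_scan(dns_wait) for _ in range(count)))
    return time.perf_counter() - start

def run_demo(count=50, dns_wait=0.1):
    # Returns (wall-clock time taken, sum of the individual scan times).
    elapsed = asyncio.run(scan_batch(count, dns_wait))
    return elapsed, count * dns_wait

if __name__ == "__main__":
    elapsed, summed = run_demo()
    print(f"sum of per-message scan times: {summed:.1f} s")
    print(f"wall-clock time for the batch: {elapsed:.2f} s")
```

The per-message time is the same for every message, yet the batch
finishes in roughly the time of one wait because the waits overlap.
That is why a per-message scan time on its own says little about
throughput.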

If you're dealing with millions of messages and you want to do testing
you need to do realistic testing.  You've given no evidence of that.

You haven't said anything which compares the scans in your existing
system and the scans in your test system.  They may be very different.

Do you have any particular scanning requirements?  It seems pointless
to compare the scan times without also comparing what the scans do for
you (or perhaps I should say what they _attempt_ to do for you, since
with email scanning that is by no means the same thing; on a good day
a mail scanner will find only four out of five of the things that it's
looking for, but on a bad day you'll need to rebuild the server).


