A.10. Adding SpamAssassin
Invoking SpamAssassin at SMTP time is commonly done in one of
two ways in Exim:

- Via the spam condition offered by Exiscan-ACL. This is the
  mechanism we will cover here.

- Via SA-Exim, another utility, written by Marc Merlin
  (<marc (at) merlins.org>) specifically for running
  SpamAssassin at SMTP time in Exim. This program operates
  through Exim's local_scan() interface, either patched
  directly into the Exim source code, or via Marc's own
  dlopen() plugin (which, by the way, is included in Debian's
  exim4-daemon-light and exim4-daemon-heavy packages).

  SA-Exim offers some other features as well, namely
  greylisting and teergrubing. However, because the scan
  happens after the message data has been received, neither
  of these two features may be as useful as they would be
  earlier in the SMTP transaction.

  SA-Exim can be found at:
  http://marc.merlins.org/linux/exim/sa.html.
A.10.1. Invoke SpamAssassin via Exiscan
Exiscan-ACL's
"spam" condition passes the
message through either SpamAssassin or Brightmail, and
triggers if these indicate that the message is junk. By
default, it connects to a SpamAssassin daemon
(spamd) running on
localhost. The host address and port can be
changed by adding a spamd_address setting in
the main section of the Exim
configuration file. For more information, see the
exiscan-acl-spec.txt file included with the
patch.
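Before relying on the spam condition, it can help to confirm that
spamd is actually listening where Exim expects it. The following is a
minimal Python sketch, not part of Exim or SpamAssassin. The function
name spamd_reachable is made up for illustration, and the defaults
assume spamd listens on localhost at its usual TCP port, 783. It only
checks that a TCP connection can be opened; it does not speak the
spamd protocol.

```python
import socket


def spamd_reachable(host: str = "127.0.0.1", port: int = 783,
                    timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration. The defaults assume spamd
    on localhost, port 783; adjust them to match spamd_address.
    """
    try:
        # Open and immediately close the connection; no data is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, host unreachable, and so on.
        return False
```

Calling spamd_reachable() with no arguments checks the default
address. If you have changed spamd_address, pass the same host and
port here.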
In our implementation, we are going to reject messages
classified as spam. However, we would like to keep a copy of
such messages in a separate mail folder, at least for the time
being. This is so that the user can periodically scan for
false positives.
Exim offers controls that can be applied
to a message that is accepted, such as
freeze. The Exiscan-ACL patch adds one more
of these controls, namely fakereject.
This causes the following SMTP response:
550-FAKEREJECT id=message-id
550-Your message has been rejected but is being kept for evaluation.
550 If it was a legit message, it may still be delivered to the target recipient(s).
We can incorporate this feature into our implementation, by
inserting the following snippet in acl_data, prior to the final
accept statement:
# Invoke SpamAssassin to obtain $spam_score and $spam_report.
# Depending on the classification, $acl_m9 is set to "ham" or "spam".
#
# If the message is classified as spam, pretend to reject it.
#
warn
  set acl_m9  = ham
  spam        = mail
  set acl_m9  = spam
  control     = fakereject
  logwrite    = :reject: Rejected spam (score $spam_score): $spam_report

# Add an appropriate X-Spam-Status: header to the message.
#
warn
  message     = X-Spam-Status: \
                ${if eq {$acl_m9}{spam}{Yes}{No}} (score $spam_score)\
                ${if def:spam_report {: $spam_report}}
  logwrite    = :main: Classified as $acl_m9 (score $spam_score)
In this example, $acl_m9 is initially set to
"ham". Then SpamAssassin is invoked as the user
mail. If the message is classified as spam,
then $acl_m9 is set to "spam",
and the FAKEREJECT response above is issued.
Finally, an X-Spam-Status: header is added to
the message. The idea is that the Mail Delivery Agent or
the recipient's Mail User Agent can use this header to
filter junk mail into a separate folder.
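To illustrate how a delivery agent or mail client might act on this
header, here is a minimal Python sketch. It is not taken from any
particular MDA: the function name folder_for and the folder names
"Junk" and "INBOX" are assumptions, and the rule names and scores in
the sample message are invented. It relies only on the header value
beginning with "Yes" or "No", as produced by the ACL above.

```python
from email import message_from_string


def folder_for(raw_message: str) -> str:
    """Choose a folder based on the X-Spam-Status: header.

    Hypothetical helper; the folder names are placeholders.
    """
    msg = message_from_string(raw_message)
    status = msg.get("X-Spam-Status", "")
    # The ACL writes "Yes (score N)..." for spam and "No (score N)..."
    # for ham, so checking the first word is enough.
    if status.strip().lower().startswith("yes"):
        return "Junk"
    return "INBOX"


# Invented sample message; scores and rule hits are illustrative only.
sample = (
    "From: someone@example.com\n"
    "Subject: Hello\n"
    "X-Spam-Status: Yes (score 7.2): HTML_MESSAGE=0.1, MISSING_DATE=1.4\n"
    "\n"
    "Message body.\n"
)

print(folder_for(sample))
```

A message without the header is treated as ham here, which is the
conservative choice for mail that bypassed the scan.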
A.10.2. Configure SpamAssassin
By default, SpamAssassin presents its report in a verbose,
table-like format, mainly suitable for inclusion in or
attachment to the message body. In our case, we want a terse
report, suitable for the X-Spam-Status:
header in the example above. To do this, we add the following
snippet in its site-specific configuration file
(/etc/spamassassin/local.cf,
/etc/mail/spamassassin/local.cf, or similar):
### Report template
clear_report_template
report "_TESTSSCORES(, )_"
Also, a Bayesian scoring
feature is built in, and is turned on by default. We normally
want to turn this off, because it requires training that will
be specific to each user, and thus is not suitable for
system-wide SMTP time filtering:
### Disable Bayesian scoring
use_bayes 0
For these changes to take effect, you have to restart the
SpamAssassin daemon (spamd).
A.10.3. User Settings and Data
Say you have a number of users that want to specify their
individual SpamAssassin preferences, such as the spam
threshold, acceptable languages and character sets,
white/blacklisted senders, and so on. Or perhaps they really
want to be able to make use of SpamAssassin's native Bayesian
scoring (though I don't see why).
As discussed in the User Settings and Data section
earlier in the document, there is a way for this to happen.
We need to limit the number of recipients we accept per
incoming mail delivery to one. We accept the first
RCPT TO: command issued by the caller, then
defer subsequent ones using a 451 SMTP
response. As with greylisting, if the caller
is a well-behaved MTA it will know how to interpret this
response, and retry later.
A.10.3.1. Tell Exim to accept only one recipient per delivery
In acl_rcpt_to, we insert the
following statement after validating the recipient address,
but before any accept statements pertaining
to unauthenticated deliveries from remote hosts to local
users (i.e. before any greylist checks, envelope signature
checks, etc.):
# Limit the number of recipients in each incoming message to one
# to support per-user settings and data (e.g. for SpamAssassin).
#
# NOTE: Every mail sent to several users at your site will be
#       delayed for 30 minutes or more per recipient. This can
#       significantly slow down the pace of discussion threads
#       involving several internal and external parties.
#
# $recipients_count is non-zero once a recipient has already been
# accepted in this transaction, which makes the condition true.
#
defer
  message     = We only accept one recipient at a time - please try later.
  condition   = $recipients_count
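The effect of this statement on a multi-recipient delivery can be
modelled with a short Python sketch. This is a simplified illustration
only, not how Exim is implemented: the function names rcpt_reply and
simulate are made up, the "250 Accepted" text is a placeholder, and
every other check in the ACL is assumed to pass.

```python
def rcpt_reply(recipients_accepted: int) -> str:
    """Reply to the next RCPT TO:, given how many were already accepted."""
    if recipients_accepted > 0:
        # Mirrors the defer statement: a temporary (4xx) failure.
        return "451 We only accept one recipient at a time - please try later."
    return "250 Accepted"


def simulate(recipients):
    """Split recipients into those accepted now and those deferred."""
    accepted, deferred = [], []
    for rcpt in recipients:
        if rcpt_reply(len(accepted)).startswith("250"):
            accepted.append(rcpt)
        else:
            deferred.append(rcpt)
    return accepted, deferred


accepted, deferred = simulate(
    ["alice@example.com", "bob@example.com", "carol@example.com"]
)
print("accepted:", accepted)
print("deferred:", deferred)
```

Only the first recipient is accepted in each transaction. A
well-behaved sending MTA queues the deferred recipients and retries
them later, so a message to three local users needs three separate
deliveries.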
A.10.3.2. Pass the recipient username to SpamAssassin
In acl_data, we modify the
spam condition given in the previous
section, so that it passes on to SpamAssassin the username
specified in the local part of the recipient address.
# Invoke SpamAssassin to obtain $spam_score and $spam_report.
# Depending on the classification, $acl_m9 is set to "ham" or "spam".
#
# We pass on the username specified in the recipient address,
# i.e. the portion before any '=' or '@' character, converted
# to lowercase. Multiple recipients should not occur, since
# we previously limited delivery to one recipient at a time.
#
# If the message is classified as spam, pretend to reject it.
#
warn
  set acl_m9  = ham
  spam        = ${lc:${extract{1}{=@}{$recipients}{$value}{mail}}}
  set acl_m9  = spam
  control     = fakereject
  logwrite    = :reject: Rejected spam (score $spam_score): $spam_report
Note that instead of using Exim's
${local_part:...} function to get the
username, we manually extracted the portion before any
"@" or "=" character. This is
because we will use the latter character in our envelope signature scheme, to
follow.
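To make the extraction concrete, here is a rough Python approximation
of what the expansion in the spam condition computes. It is a sketch
for illustration, not Exim's own code: the function name
spamd_username is made up, the signed address in the usage lines is a
hypothetical format, and falling back to "mail" when the first field
is empty is a simplifying assumption about how the expansion behaves.

```python
import re


def spamd_username(recipient: str, fallback: str = "mail") -> str:
    """Approximate ${lc:${extract{1}{=@}{$recipients}{$value}{mail}}}.

    Takes the text before the first '=' or '@' and lowercases it.
    Assumption: an empty first field yields the fallback user.
    """
    first_field = re.split(r"[=@]", recipient, maxsplit=1)[0]
    return first_field.lower() if first_field else fallback


print(spamd_username("Alice@example.com"))
print(spamd_username("alice=0123abcd@example.com"))  # hypothetical signed address
```

Both calls yield the same username, so a signed and an unsigned
address for the same user share one set of SpamAssassin preferences.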
A.10.3.3. Enable per-user settings in SpamAssassin
Let us now again look at SpamAssassin. First of all, you
may choose to remove the use_bayes 0
setting that we previously added in its site-wide
configuration file. In any case, each user will now have
the ability to decide whether to override this setting for
themselves.
If mailboxes on your system map directly to local UNIX
accounts with home directories, you are done. By default,
the SpamAssassin daemon (spamd) performs
a setuid() to the username we pass to it,
and stores user data and settings in that user's home
directory.
If this is not the case (for instance, if your mail accounts
are managed by Cyrus SASL or by another server), you need to
tell SpamAssassin where to find each user's preferences and
data files. Also, spamd needs to keep
running as a specific local user instead of attempting to
setuid() to a non-existing user.
We do these things by specifying the options passed to
spamd at startup:
- On a Debian system, edit the OPTIONS= setting in
  /etc/default/spamassassin.

- On a RedHat system, edit the SPAMDOPTIONS= setting in
  /etc/sysconfig/spamassassin.

- On other systems, look for the equivalent setting in
  whatever script or service configuration starts spamd.
The options you need are:

-u username
    Specify the user under which spamd will run (e.g. mail).

-x
    Disable configuration files in the user's home directory.

--virtual-config-dir=/var/lib/spamassassin/%u
    Specify where per-user settings and data are stored.
    "%u" is replaced with the calling username. spamd must
    be able to create or modify this directory:
# mkdir /var/lib/spamassassin
# chown -R mail:mail /var/lib/spamassassin
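As a worked example of the "%u" substitution, the following Python
sketch shows which directory each user would end up with under the
pattern above. It is an illustration only: the function name
virtual_config_dir is made up, and it models nothing beyond the "%u"
replacement described here.

```python
def virtual_config_dir(pattern: str, username: str) -> str:
    """Expand "%u" in a --virtual-config-dir pattern (illustration only)."""
    return pattern.replace("%u", username)


pattern = "/var/lib/spamassassin/%u"
for user in ("alice", "bob"):
    print(user, "->", virtual_config_dir(pattern, user))
```

Each username passed from Exim therefore maps to its own subdirectory
of /var/lib/spamassassin, which is why the spamd user needs write
access to that parent directory.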
Needless to say, after making these changes, you need to
restart spamd.