being more flexible than FEATURE(compat_check)

A user at ServerFault asked how to restrict a user to sending mail only to local addresses. Normally in sendmail, sender / recipient filtering decisions are made using FEATURE(compat_check). It offers flexibility for specific sender and recipient pairs that are entered in /etc/mail/access, but for anything more flexible you have to write your own version of the check_compat rule set.

check_compat’s workspace is a string that contains the addresses given in the MAIL FROM: and RCPT TO: commands of the SMTP dialog, separated by a $|. Whenever one works with addresses in sendmail, one has to canonify them, but since a rule set called from within another rule set always takes a single argument (the workspace), we have to use macros to store the canonified addresses before proceeding to any pattern matching. So first we have to declare the macros in our sendmail configuration:

Kput macro

The above snippet declares a map (named put) and two macros that we will use to store the canonified addresses (named put1 and put2), initialized to some non-empty bogus value. Since the workspace for check_compat has the form sender address $| recipient address, we canonify the recipient address first:

R$* $| $*               $: $1 $| $>canonify $2
R$* $| $*               $: $(put {put2} $@ $2 $) $1

So far the rule set has stored the canonified recipient address in ${put2} and returned the sender address (the last $1 in the second line) for further processing. We are therefore ready to repeat the process and store the canonified sender address in ${put1}:

R$*             $: $>canonify $1
R$*             $: $(put {put1} $@ $1 $)

Macro operations return an empty string so now we have to retrieve the addresses from the macros and reconstruct a canonified workspace for any further processing:

R$*             $: $&{put1} $| $&{put2}

This results in the workspace now being in the canonified form of:

sender < @ sender . domain . > $| recipient < @ recipient . domain . >

regardless of the many ways in which one can write an email address. This is why we need canonification in the first place: there are many ways to enter an address in MAIL FROM: and RCPT TO:, and canonification returns the address in a single format that all the other rule sets can work with.

Now if someone wants to restrict where a user sends mail based on MAIL FROM: and the recipient domain, one can add the following lines in check_compat:

# Now we can filter on sender and recipient
Ruser < @ $=w . > $| $* < @ $=w . >      $#OK
Ruser < @ $=w . > $| $*                  $#discard $: $2

The above silently discards email not directed to the local domains (Class $=w). If you want to test your rule sets (sendmail -bt) you have to keep in mind that sendmail’s test mode interprets $| as two characters, so you have to use a “translate hack”:

STranslate
R$* $$| $*    $: $1 $| $2

Now you can check check_compat by typing:

# sendmail -bt
> Translate,check_compat sender@address $| recipient@address

and watch what happens. As always, keep in mind that the left hand side of a rule is separated from the right hand side with tabs, not spaces. So do not copy-paste. Type the code instead. Next you need to rebuild your sendmail configuration and restart sendmail. In Debian, run sendmailconfig as root to do this.

My eyes hurt! Can it be done another way?

Of course! You can install MIMEDefang together with sendmail and modify filter_recipient to your liking. Depending on your operating system / distribution, you have to check whether you need to enable filter_recipient or not. In Debian you have to edit /etc/default/mimedefang and restart the MIMEDefang daemon. After enabling it, you need to add your version of filter_recipient to /etc/mail/mimedefang-filter:

sub filter_recipient {
  my ($recipient, $sender, $ip, $hostname, $first, $helo, $rcpt_mailer, $rcpt_host, $rcpt_addr) = @_;

  # Strip the surrounding angle brackets and normalize to lowercase
  $sender =~ s/^\<//;
  $sender =~ s/\>$//;
  $sender = lc $sender;
  $recipient =~ s/^\<//;
  $recipient =~ s/\>$//;
  $recipient = lc $recipient;

  # Put your conditions here

  return('CONTINUE', "ok");
}

You need to reload mimedefang-filter after editing this, so as root run (in Debian) /etc/init.d/mimedefang reload and check your logfiles for any errors.
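
As an illustration of what could replace the "Put your conditions here" comment, the following is a minimal sketch of the local-only restriction from the check_compat example. The helper name recipient_allowed, the restricted sender and the list of local domains are all hypothetical placeholders; adapt them to your site.

```perl
use strict;
use warnings;

# Hypothetical values, for illustration only.
my %restricted_sender = map { $_ => 1 } ('user@example.org');
my %local_domain      = map { $_ => 1 } ('example.org', 'mail.example.org');

# Returns true when $sender may send mail to $recipient.
# Both addresses are expected without angle brackets and in lowercase,
# which is how filter_recipient above leaves them.
sub recipient_allowed {
    my ($sender, $recipient) = @_;

    # Unrestricted senders may write to anyone.
    return 1 unless $restricted_sender{$sender};

    # Restricted senders may only write to local domains.
    my ($domain) = $recipient =~ /\@([^\@]+)$/;
    return 0 unless defined $domain;
    return $local_domain{$domain} ? 1 : 0;
}
```

Inside filter_recipient one could then return ('REJECT', "some message") when recipient_allowed($sender, $recipient) is false, or ('DISCARD', "some message") to mimic the silent discard of the rule set version.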

memcached and MIMEDefang – a cool combination

I like milter-ahead a lot. But in our particular deployment it is not the best fit, for it assumes that all the useful information for deciding whether to accept or reject email resides not on the server that it runs on, but on the servers that it queries. This is not milter-ahead’s fault. Milters have no way of expanding aliases while checking the recipient address, so the programmer has to use tricks like parsing the output of sendmail -bv user@address, thus running a second sendmail process for the same delivery. The alternative would be to hack milter-ahead to check the alias database for the existence of recipient addresses, but doing so the way sendmail reads the alias database is overly complex. One could also write an external daemon to monitor the alias database and inject entries into the (Berkeley DB) database maintained by milter-ahead, but that database is locked exclusively. And yes, exceptions could be entered in the access database, but that would mean maintaining two files for a single (and not so frequent) change in the alias files.

As I’ve blogged before, one of the reasons that I like MIMEDefang is that it gives the Postmaster a full programming language to filter stuff. By simply using md_check_against_smtp_server() a poor man’s non-caching version of milter-ahead is possible. Adding support to read the alias database (be it the text file or the hash table) is also trivial.

But what about the case of busy mail systems? You do not want to hammer your mail servers all the time with queries for which the answer is going to be constant for long periods of time. You need a caching mechanism. At first I thought of implementing such a mechanism the way milter-ahead does: by using a Berkeley DB database and some expiration mechanism, either from within MIMEDefang (retrieve the key and, if it should have expired by now, delete it, otherwise proceed as expected) or by an external “garbage collecting” daemon. But such an interface with a clean way to enter keys and values already exists and performs well: memcached. So by using Cache::Memcached within mimedefang-filter, I was able to mimic basic milter-ahead behavior (with caching).
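
The caching idea can be sketched roughly as follows. This is not the filter I actually run, only an illustration under stated assumptions: TinyCache is a hypothetical in-memory stand-in that offers the same get and set calls as Cache::Memcached, so that the sketch runs without a memcached server, and cached_verdict is a hypothetical helper name.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in exposing the same get/set calls as
# Cache::Memcached, so the sketch runs without a memcached server.
package TinyCache;
sub new { return bless { store => {} }, shift }
sub set {
    my ($self, $key, $value, $ttl) = @_;
    $self->{store}{$key} = [ $value, time + $ttl ];
    return 1;
}
sub get {
    my ($self, $key) = @_;
    my $entry = $self->{store}{$key} or return undef;
    if (time >= $entry->[1]) {          # expired: forget the entry
        delete $self->{store}{$key};
        return undef;
    }
    return $entry->[0];
}

package main;

# Return the cached verdict for $recipient, or ask $lookup (a code
# reference that queries the real mail server) and remember the answer.
sub cached_verdict {
    my ($cache, $recipient, $ttl, $lookup) = @_;
    my $verdict = $cache->get("rcpt:$recipient");
    return $verdict if defined $verdict;

    $verdict = $lookup->($recipient);
    $cache->set("rcpt:$recipient", $verdict, $ttl);
    return $verdict;
}
```

In a real filter, $cache would be a Cache::Memcached object (created with something like Cache::Memcached->new({ servers => ['127.0.0.1:11211'] })) and $lookup would wrap md_check_against_smtp_server(). The server address and the choice of TTL are assumptions to be adapted to your own setup.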

But what about the local aliases in the mail server? After all this was all the fuss that prompted the switch anyway. I wrote a Perl script that opened the alias database using the BerkeleyDB package. Two details need caution here:

  • The first one is ignoring the invalid @:@ entry in the alias database. You do not see it in the alias text file, but you will see it when you run praliases. Sendmail uses this entry in order to know whether the database is up-to-date or not. See the bat book for a longer discussion of this.
  • The second detail is that since the alias database is written by a C program, all strings are NUL-terminated. This is not the case with strings that are used as keys and values in Perl with the BerkeleyDB package. However, the Perl BerkeleyDB package provides filters to deal with this case. You need something like:
    $db->filter_fetch_key( sub { s/\0$// } );
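
Both cautions can be sketched without a database at hand. In the sketch below the raw pairs are passed in as a plain hash, standing in for what a BerkeleyDB handle would return without any filters installed; strip_nul and clean_aliases are hypothetical helper names. With a real handle, filter_fetch_key and filter_fetch_value can do the stripping for you.

```perl
use strict;
use warnings;

# Strip the trailing NUL that the C side stores with every string.
sub strip_nul {
    my ($string) = @_;
    $string =~ s/\0$//;
    return $string;
}

# Turn raw key/value pairs, as read from the alias database, into a
# clean Perl hash, skipping sendmail's internal "@" marker entry
# (the one praliases shows as @:@).
sub clean_aliases {
    my (%raw) = @_;
    my %aliases;
    while (my ($key, $value) = each %raw) {
        my $name = strip_nul($key);
        next if $name eq '@';
        $aliases{$name} = strip_nul($value);
    }
    return %aliases;
}
```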

And then there’s the issue of making such a script a daemon. One can go the traditional way, use a daemonizer on steroids or simply use Proc::Daemon::Init and be done with it.

memcached comes in handy for storing key-value pairs in many system administration tasks, and I think I’m going to use it a lot more in mail filtering stuff.

check_compat vs MIMEDefang

We have a user who wishes to have messages sent from a particular sender discarded by our mail servers. The natural choice for such blocks seems to be FEATURE(compat_check). In fact we had a number of other users with similar requests that were serviced this way. The problem in this case was that the xyzw (hostname) part of the sender address was neither constant nor drawn from a predictable, finite set. Naturally I thought that a local version of the check_compat rule set would suffice, since $* matches all possible such hostnames. But according to the bat book this cannot be done while also using FEATURE(compat_check):

Note that although with V8.12 and later you can still write your own check_compat rule set, doing so has been made unnecessary by the FEATURE(compat_check) (§7.5.7 on page 288). But also note that, as of V8.12, you cannot both declare the FEATURE(compat_check) and use this check_compat rule set.

Since I did not wish to tamper with our sendmail configuration this time, MIMEDefang came to the rescue: filter_recipient is called with both the sender and the recipient as arguments, and that took care of it. But again, had I chosen to write this in sendmail’s language, it might have looked ugly, but it would also have been a one-liner (ugly but elegant in its own way).
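
A check of this kind, called from a filter routine that sees both addresses (such as filter_recipient), might look roughly like the sketch below. The real addresses are not shown here, so everything is a placeholder: the user part noisy, the domain example.net, and the helper name should_discard are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical sender pattern: user "noisy" at any host under example.net.
my $blocked_sender = qr/^noisy\@(?:[a-z0-9-]+\.)*example\.net$/;

# True when mail from $sender to $recipient should be discarded.
# Only mail addressed to $protected_recipient is affected.
sub should_discard {
    my ($sender, $recipient, $protected_recipient) = @_;
    return 0 unless lc($recipient) eq lc($protected_recipient);
    return lc($sender) =~ $blocked_sender ? 1 : 0;
}
```

The regular expression plays the role that $* would have played in the rule set: it accepts any hostname in front of the fixed part of the domain.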

Poor man’s milter-ahead

I have blogged before that the reason that I like MIMEDefang is that it gives the Postmaster a Perl interpreter (a programming language that is) and a library of functions that can be used to filter and manipulate incoming and outgoing email.

Of the functions available, I believe that md_check_against_smtp_server() deserves special mention, since it can be used to quickly implement a poor man’s milter-ahead or milter-sender. Of course milter-ahead implements many features (caching among others), but with some effort most (if not all) of the functionality can be implemented within mimedefang-filter.
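
A minimal sketch of the idea follows. To keep it runnable outside MIMEDefang, the probe is passed in as a code reference. As far as I recall from mimedefang-filter(5), md_check_against_smtp_server() takes the sender, the recipient, a HELO name and the server to query (plus an optional port) and returns a list whose first element is a verdict such as CONTINUE, REJECT or TEMPFAIL; check this against the man page of your version. The helper name verify_recipient is hypothetical.

```perl
use strict;
use warnings;

# Ask an upstream server about a recipient through $probe, a code
# reference assumed to follow the calling convention of
# md_check_against_smtp_server():
# ($sender, $recipient, $helo, $server) in, ($verdict, $message, ...) out.
sub verify_recipient {
    my ($probe, $sender, $recipient, $helo, $server) = @_;
    my ($verdict, $message) = $probe->($sender, $recipient, $helo, $server);

    # Pass rejections and temporary failures straight back to the client.
    return ($verdict, $message)
        if $verdict eq 'REJECT' or $verdict eq 'TEMPFAIL';

    # Anything else is treated as "recipient exists".
    return ('CONTINUE', 'ok');
}
```

Inside filter_recipient this could be called as verify_recipient(\&md_check_against_smtp_server, $sender, $recipient, 'filter.example.org', 'backend.example.org'), where both host names are placeholders.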

Then again, milter-ahead does not cost much (€90) even for small organizations, so the not-invented-here syndrome can be suppressed.

Why I like running MIMEDefang

I had a talk with a friend the other day and he told me that he does not run MIMEDefang on his systems. Well I do and since most people run MIMEDefang just to be able to run ClamAV and SpamAssassin from one place, I want to explain why. Because when you have a hammer it is better to have more than two nails.

I like running MIMEDefang because it gives me a Perl interpreter at hand and a set of handy routines to manipulate every message that passes through the mail server (header and body). So for any weird idea that I have and I need a proof of concept, I have a full programming language on my email server waiting to run it. And if I have performance issues, well I can always (re)write it in C.

I really think it pays off to invest some time in learning how to change stuff in mimedefang-filter(5) and how to write your own versions of the filter_* routines.

MIMEDefang and virii

OK, so you use MIMEDefang together with ClamAV[*] to check incoming messages for viral content. But given that an infected machine will bomb you with a great many messages, why should you scan every message it sends within a given time window? This is what I came up with:

The default mimedefang-filter(5) has the following check which discards viral messages:

if ($FoundVirus) {
  md_graphdefang_log('virus', $VirusName, $RelayAddr);
  md_syslog('warning', "Discarding because of virus $VirusName");
  return action_discard();
}

Changing it to:

if ($FoundVirus) {
  # OK log $RelayAddr
  # If you are on a Debian-like system you have to put
  # use DB_File in /etc/mail/
  # otherwise you have to put it somewhere in mimedefang-filter
  my %vbl;
  my $now;
  tie %vbl, 'DB_File', "/var/cache/local/virbl/virbl.db", O_CREAT|O_RDWR, 0644, $DB_BTREE or die;
  $now = time;
  $vbl{$RelayAddr} = $now;
  untie %vbl;
  md_graphdefang_log('virus', $VirusName, $RelayAddr);
  md_syslog('warning', "Discarding because of virus $VirusName");
  return action_discard();
}

logs $RelayAddr (the IP address of the infected machine) together with a timestamp in a BerkeleyDB B-Tree. In our example this is /var/cache/local/virbl/virbl.db. You have to make this file writeable by the user that runs MIMEDefang on your system. And now using the following code one can block this IP address prior to inspecting the message content:

# .db is appended by sendmail automagically
Kvirbl btree -a.FOUND /var/cache/local/virbl/virbl

# Always remember:  In sendmail the LHS and the RHS of a rule are
# separated with tabs and not spaces.  So do not copy-paste this fragment,
# type it.
R$*                     $: $&{client_addr}
R$*                     $: $(virbl $1 $: $1.NOTFOUND $)
# The next line broken in two for readability
R$* . FOUND             $#error $@ 5.7.1 $: You have sent us mail containing
           a virus and are blocked from our systems for an hour.

So now you need an expiration process. How long shall these IP addresses remain in your database? I keep them for one hour. It seems to be a reasonable default. A simple expiry script is the following Perl snippet:

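use strict;
use warnings;
use DB_File;

# Usage: script database threshold-in-seconds
my $db        = shift or die "usage: $0 database threshold\n";
my $threshold = shift or die "usage: $0 database threshold\n";

# Open read-write, since expired keys are deleted below
tie my %d, 'DB_File', $db, O_RDWR, 0644, $DB_BTREE
        or die "cannot open $db: $!\n";
my $now = time;
foreach my $i (keys %d) {
        # The stored timestamp is a string; Perl converts it to a number here
        my $diff = $now - $d{$i};
        delete $d{$i} if $diff > $threshold;
}
untie %d;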

You can run this script from cron every ten minutes or so. I’ve written my expiry program in C and run it every two minutes. If you also want to do this, you have to remember that the Perl snippet in mimedefang-filter that logs $RelayAddr and the timestamp stores the timestamp as a string and not as an integer.

[*] There exist many HOWTOs on how to set up MIMEDefang to work with ClamAV. Just use Google.