view xml/dnsbl.in @ 75:1142e46be550

start coding on new config syntax
author carl
date Wed, 13 Jul 2005 23:04:14 -0700
parents fb8afa205293
children 81f1e400e8ab

<html>

<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title>DNSBL Sendmail milter - Version 5.0</title>
</head>

<body>

<center>Introduction</center>
<p>This milter is released under the GNU General Public License (GPL)
version 2, which is included in the LICENSE file in the distribution and
is also available at
<a href="http://www.gnu.org/licenses/gpl.html">http://www.gnu.org/licenses/gpl.html</a>.

<p>Consider the case of a mail server that is acting as secondary MX for
a collection of clients, each of which has a collection of mail domains.
Each client may use their own collection of DNSBLs on their primary mail
server.  We present here a mechanism whereby the backup mail server can
use the correct set of DNSBLs for each recipient for each message.  As a
side-effect, it gives us the ability to customize the set of DNSBLs on a
per-recipient basis, so that fred@example.com could use SPEWS and the
SBL, while all other users @example.com use only the SBL.

<p>This milter will also decode (uuencode, base64, mime, html entity,
url encodings) and scan for HTTP and HTTPS URLs and bare hostnames in
the body of the mail.  If any of those host names have A or NS records
on the SBL (or a single configurable DNSBL), the mail will be rejected
unless previously whitelisted.  This milter also counts the number of
invalid HTML tags, and can reject mail if that count exceeds your
specified limit.
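
<p>As a rough illustration of the URL scanning described above, here is a
minimal Python sketch that undoes html entity and url encodings and then
collects host names from HTTP and HTTPS URLs.  It is not the milter's
actual code.  The names extract_hosts, URL_PATTERN and the default limit
of 20 are assumptions made for this example, and uuencode, base64 and
mime decoding and bare hostname detection are left out.

<pre>
```python
import html
import re
from urllib.parse import unquote, urlsplit

# Hypothetical pattern: an http:// or https:// URL up to whitespace or a quote.
URL_PATTERN = re.compile(r"https?://[^\s\"'>]+", re.IGNORECASE)


def extract_hosts(body, limit=20):
    """Return up to `limit` distinct lower-cased host names found in URLs."""
    # Undo html entity encoding first, then url (percent) encoding.
    decoded = unquote(html.unescape(body))
    hosts = []
    for match in URL_PATTERN.finditer(decoded):
        if len(hosts) >= limit:
            break
        try:
            host = urlsplit(match.group(0)).hostname
        except ValueError:
            continue  # malformed URL, skip it
        if host and host not in hosts:
            hosts.append(host)
    return hosts
```
</pre>

<p>Each returned host name would then be resolved, and its A or NS
records checked against the configured DNSBL.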

<p>The DNSBL milter reads a text configuration file (dnsbl.conf) on
startup, and whenever the config file (or any of the referenced include
files) is changed.  The entire configuration file is case insensitive.

<hr> <center>DCC Issues</center>
<p>If you are also using the <a
href="http://www.rhyolite.com/anti-spam/dcc/">DCC</a> milter, there are
a few considerations.  You may need to whitelist senders from the DCC
bulk detector, or from the DNS based lists.  Those are two very
different reasons for whitelisting.  The former is done thru the DCC
whiteclnt config file, the latter is done thru the DNSBL milter config
file.

<p>You may want to blacklist some specific senders or sending domains.
This could be done thru the DCC (either on a global basis, or for a
specific single recipient).  We prefer to do such blacklisting via the
DNSBL milter config, since it can be done for a collection of recipient
mail domains.  The DCC approach has the feature that you can capture the
entire message in the DCC log files.  The DNSBL milter approach has the
feature that the mail is rejected earlier (at RCPT TO time), and the
sending machine just gets a generic "550 5.7.1 no such user" message.

<p>The DCC whiteclnt file can be included in the DNSBL milter config by
the dcc_to and dcc_from statements.  This will import the (env_to,
env_from, and substitute mail_host) entries from the DCC config into the
DNSBL config.  This allows using the DCC config as the single point for
white/blacklisting.

<p>Consider the case where you have multiple clients, each with their
own mail servers, and each running their own DCC milters.  Each client
is using the DCC facilities for envelope from/to white/blacklisting.
Presumably you can use rsync or scp to fetch copies of your clients' DCC
whiteclnt files on a regular basis.  Your mail server, acting as a
backup MX for your clients, can use the DNSBL milter, and include those
client DCC config files.  The envelope from/to white/blacklisting will
be appropriately tagged and used only for the domains controlled by each
of those clients.

<hr> <center>Definitions</center>

<p>CONTEXT - a collection of parameters that defines the filtering
context to be used for a collection of envelope recipient addresses.
The context includes such things as the list of DNSBLs to be used, and
the various content filtering parameters.

<p>DNSBL - a named DNS based blocking list is defined by a dns suffix
(e.g. sbl-xbl.spamhaus.org) and a message string that is used to
generate the "550 5.7.1" smtp error return code.  The names of these
DNSBLs will be used to define the DNSBL-LISTs.
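
<p>To make the role of the dns suffix concrete, the following Python
sketch builds the name that is looked up for a client: the octets of the
client IPv4 address are reversed and the dns suffix is appended.  The
function name dnsbl_query_name is an assumption made for this example.
It is not taken from the milter source.

<pre>
```python
import ipaddress


def dnsbl_query_name(client_ip, suffix):
    """Build the DNS name to look up for an IPv4 client on a DNSBL."""
    # IPv4Address raises ValueError for anything that is not a valid IPv4 address.
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + suffix.strip(".")
```
</pre>

<p>For client 192.168.4.45 and suffix sbl-xbl.spamhaus.org this gives
45.4.168.192.sbl-xbl.spamhaus.org.  If that name has an A record, the
client is on the list.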

<p>DNSBL-LIST - a named list of DNSBLs that will be used for specific
recipients or recipient domains.

<p>The envelope to email address is used to find an initial filtering context.
That context then uses the envelope from email address to find the final
filtering context.  The envelope from email address is then checked in that
final context to see if we should whitelist or blacklist the message.  When
mail is received for a recipient, the following steps are applied in order:

<ol>

<li>If the client has authenticated with sendmail, the mail is accepted,
the dns lists are not checked, and the body content is not scanned.

<li>The envelope to email address is used to find an initial filtering
context. We first look for a context that specified the full email address
in the env_to statement. If that is not found, we look for a context that
specified the entire domain name of the envelope recipient in the env_to
statement. If that is not found, we look for a context that specified the
user@ part of the envelope recipient in the env_to statement. If that is not
found, we use the first top level context defined in the config file.

<li>The initial filtering context may redirect to a child context based
on the values in the initial context's env_from statement.  We look for
[1) the full envelope from email address, 2) the domain name part of the
envelope from address, 3) the user@ part of the envelope from address]
in that context's env_from statement, with values that point to a child
context.  If such an entry is found, we switch to that filtering
context.

<li>We look up [1) the full envelope from email address, 2) the domain
name part of the envelope from address, 3) the user@ part of the
envelope from address] in the filtering context env_from statement.
That results in one of (white, black, unknown, inherit).

<li>If the answer is black, mail to this recipient is rejected with "no
such user", and the dns lists are not checked.

<li>If the answer is white, mail to this recipient is accepted and the
dns lists are not checked.

<li>If the answer is unknown, we don't reject yet, but the dns lists
will be checked, and the content may be scanned.

<li>If the answer is inherit, we repeat the envelope from search in the
parent context.

<li>The dns lists specified in the filtering context are checked and the
mail is rejected if any list has an A record for the standard dns based
lookup scheme (reversed octets of the client followed by the dns
suffix).

<li>If the mail has not been accepted or rejected yet, the body content
is optionally scanned for HTTP URLs (after base64, mime and html entity
decoding), and the first &lt;configurable&gt; host names are checked for
their presence on the SBL.  If any host name is on the SBL, and it is
not on the "ignore" list, the mail is rejected.  If we are doing body
content scanning, we also scan for excessive bad html tags, and if a
&lt;configurable&gt; limit is exceeded, the mail is rejected.

</ol>
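
<p>The envelope lookup steps above can be sketched in Python as follows.
This is only an illustration under stated assumptions, not the milter's
implementation.  The Context class and all function names are
hypothetical, the contexts list is assumed to start with the first top
level context, addresses are assumed to contain an @, and a sender that
is not listed is assumed to give unknown.

<pre>
```python
def address_keys(address):
    """Lookup keys in priority order: full address, domain, then user@."""
    address = address.lower()  # the configuration is case insensitive
    user, _, domain = address.rpartition("@")
    return [address, domain, user + "@"]


class Context:
    """Hypothetical stand-in for one filtering context."""

    def __init__(self, name, parent=None, env_to=(), env_from=None):
        self.name = name
        self.parent = parent
        self.env_to = set(env_to)        # recipient keys handled here
        self.env_from = env_from or {}   # sender key to status or child Context


def find_initial(contexts, rcpt):
    """Pick the initial context from the envelope to address."""
    for key in address_keys(rcpt):
        for ctx in contexts:
            if key in ctx.env_to:
                return ctx
    return contexts[0]  # assumed: first top level context in the config file


def find_final(ctx, sender):
    """Follow an env_from entry that points to a child context, if any."""
    for key in address_keys(sender):
        target = ctx.env_from.get(key)
        if isinstance(target, Context):
            return target
    return ctx


def sender_status(ctx, sender):
    """Return white, black or unknown, walking up parents on inherit."""
    while ctx is not None:
        status = "unknown"  # assumed default when the sender is not listed
        for key in address_keys(sender):
            value = ctx.env_from.get(key)
            if isinstance(value, str):
                status = value
                break
        if status != "inherit":
            return status
        ctx = ctx.parent
    return "unknown"


def decide(contexts, sender, rcpt, authenticated=False):
    """Return accept, reject, or check-dnsbl for one recipient."""
    if authenticated:
        return "accept"  # authenticated clients skip all checks
    ctx = find_final(find_initial(contexts, rcpt), sender)
    status = sender_status(ctx, sender)
    if status == "black":
        return "reject"
    if status == "white":
        return "accept"
    return "check-dnsbl"  # dns lists and optional content scan come next
```
</pre>

<p>The outer loop over keys in find_initial matters: a context that names
the full recipient address wins over any context that only names the
recipient domain, regardless of the order of the contexts.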

<hr> <center>Sendmail access vs. DNSBL</center>
<p>With the standard sendmail.mc dnsbl FEATURE, the dnsbl checks may be
suppressed by entries in the /etc/mail/access database.  For example,
suppose you control a /18 of address space, and have allocated some /24s
to some clients.  You have access entries like

<pre>
192.168.4   OK
192.168.17  OK
</pre>

<p>to allow those clients to smarthost thru your mail server.  Now if
one of those clients happens to get infected with a virus that turns a
machine into an open proxy, and their 192.168.4.45 lands on the SBL-XBL,
you will still wind up allowing that infected machine to smarthost thru
your mail servers.

<p>With this DNSBL milter, the sendmail access database cannot override
the dnsbl checks, so that machine won't be able to send mail to or thru
your smarthost mail server (unless the virus/proxy can use smtp-auth).

<p>Using the standard sendmail features, you would add access entries to
allow hosts on your local network to relay thru your mail server.  Those
OK entries in the sendmail access database will override all the dnsbl
checks.  With this DNSBL milter, you will need to have the local users
authenticate with smtp-auth to get the same effect.  You might find <a
href="http://www.lists.dartmouth.edu/IRIA/knowledge_base/linuxinfo/sendmail-ssl-how-to.htm">
these directions</a> helpful for setting up smtp-auth if you are on RH
Linux.

<hr> <center>Installation and configuration</center>
<p>Usage:  Note that this has ONLY been tested on Linux, specifically
RedHat Linux.  In particular, this milter makes no attempt to understand
IPv6.  Your mileage will vary.  You will need at a minimum a C++
compiler with a minimally thread safe STL implementation.  The
distribution includes a test.cpp program.  If it fails this milter won't
work.  If it passes, this milter might work.

<p>Fetch <a href="http://www.five-ten-sg.com/util/dnsbl.tar.gz">dnsbl.tar.gz</a>
and run

<pre>
tar xfvz dnsbl.tar.gz
bash install.bash
</pre>

<p>Read and understand the contents of that install.bash script before you
run it.  It may not be suitable for your system.  Modify your
sendmail.mc by removing all the "FEATURE(dnsbl" lines and adding the
following line, then rebuild the .cf file.

<pre>
INPUT_MAIL_FILTER(`dnsbl', `S=local:/var/run/dnsbl/dnsbl.sock, F=T, T=C:30s;S:5m;R:5m;E:5m')
</pre>

<p>Read the sample <a
href="http://www.five-ten-sg.com/dnsbl.conf">/etc/dnsbl/dnsbl.conf</a>
file and modify it to fit your configuration.  You can test your
configuration files, and see a readable internal dump of them on stdout
with

<pre>
cd /etc/dnsbl
/usr/sbin/dnsbl -c
</pre>

<p>You can check a specific envelope from/to pair with

<pre>
cd /etc/dnsbl
from="$1" # or your from address
to="$2"   # or your to address
/usr/sbin/dnsbl -e "$from"'|'"$to"
</pre>

<hr> <center>Performance issues</center>

<p>Consider a high-volume, high-performance machine running sendmail.
Each sendmail process can do its own dns resolution.  Typically, such
dns resolver libraries are not thread safe, and so must be protected by
some sort of mutex in a threaded environment.  When we add a milter to
sendmail, we now have a collection of sendmail processes, and a
collection of milter threads.

<p>We will be doing a lot of dns lookups per mail message, and at least
some of those will take many tens of seconds.  If all this dns work is
serialized inside the milter, we have an upper limit of about 25K mail
messages per day.  That is clearly not sufficient for many sites.
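
<p>The arithmetic behind such a limit can be sketched in Python.  The
function names are made up for this example, and the average dns time per
message is inferred from the 25K figure rather than measured: 25K
messages per day corresponds to an average of about 3.5 seconds of
serialized dns work per message.

<pre>
```python
SECONDS_PER_DAY = 24 * 60 * 60


def serialized_daily_limit(avg_dns_seconds):
    """Messages per day when only one message's dns work runs at a time."""
    return SECONDS_PER_DAY / avg_dns_seconds


def implied_avg_seconds(messages_per_day):
    """Average serialized dns time per message implied by a daily limit."""
    return SECONDS_PER_DAY / messages_per_day
```
</pre>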

<p>Since we want to do parallel dns resolution across those milter
threads, we add another collection of dns resolver processes.  Each
sendmail process is talking to a milter thread over a socket, and each
milter thread is talking to a dns resolver process over another socket.

<p>Suppose we are processing 20 messages per second, and each message
requires 20 seconds of dns work.  Then we will have 400 sendmail
processes, 400 milter threads, and 400 dns resolver processes.  Of
course that steady state is very unlikely to happen.
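
<p>The steady state count above is the arrival rate multiplied by the
time each message spends in dns work (Little's law).  A minimal Python
sketch, with a function name chosen only for this example:

<pre>
```python
def concurrent_messages(arrival_rate_per_second, dns_seconds_per_message):
    """Average number of messages in flight at steady state."""
    return arrival_rate_per_second * dns_seconds_per_message
```
</pre>

<p>With 20 messages per second and 20 seconds per message this gives 400
messages in flight, each holding one sendmail process, one milter thread
and one dns resolver process.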

<pre>
$Id$
</pre>
</body>
</html>