comparison xml/dnsbl.in @ 108:1c7677042b78

move to autoconf/automake/docbook
author carl
date Sun, 18 Dec 2005 12:05:05 -0800
parents 586d5b58040a
children d0dad5610980
107:eeaaecda4acc 108:1c7677042b78
1 <html> 1 <reference>
2 2 <title>@PACKAGE@ Sendmail milter - Version @VERSION@</title>
3 <head> 3 <partintro>
4 <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"> 4 <title>Packages</title>
5 <title>DNSBL Sendmail milter - Version 5.10</title> 5 <para>The various source and binary packages are available at <ulink
6 </head> 6 url="http://www.five-ten-sg.com/@PACKAGE@/packages">http://www.five-ten-sg.com/@PACKAGE@/packages</ulink>
7 7 The most recent documentation is available at <ulink
8 <center>Introduction</center> 8 url="http://www.five-ten-sg.com/@PACKAGE@/">http://www.five-ten-sg.com/@PACKAGE@/</ulink>
9 <p>This milter is released under the GPL license version 2 included in 9 </para>
10 the LICENSE file in the distribution, and also available at 10
11 <a href="http://www.gnu.org/licenses/gpl.html">http://www.gnu.org/licenses/gpl.html</a> 11 </partintro>
12 12
13 <p>Consider the case of a mail server that is acting as secondary MX for 13 <refentry id="@PACKAGE@.1">
14 a collection of clients, each of which has a collection of mail domains. 14 <refentryinfo>
15 Each client may use their own collection of DNSBLs on their primary mail 15 <date>2005-12-18</date>
16 server. We present here a mechanism whereby the backup mail server can 16 </refentryinfo>
17 use the correct set of DNSBLs for each recipient for each message. As a 17
18 side-effect, it gives us the ability to customize the set of DNSBLs on a 18 <refmeta>
19 per-recipient basis, so that fred@example.com could use SPEWS and the 19 <refentrytitle>@PACKAGE@</refentrytitle>
20 SBL, where all other users @example.com use only the SBL. 20 <manvolnum>1</manvolnum>
21 21 <refmiscinfo>@PACKAGE@ @VERSION@</refmiscinfo>
22 <p>This milter can also verify the envelope from/recipient pairs with 22 </refmeta>
23 the primary MX server. This allows the backup mail servers to properly 23
24 reject mail sent to invalid addresses. Otherwise, the backup mail 24 <refnamediv id='name.1'>
25 servers will accept that mail, and then generate a bounce message when 25 <refname>@PACKAGE@</refname>
26 the message is forwarded to the primary server (and rejected there with 26 <refpurpose>a sendmail milter with per-user dnsbl filtering</refpurpose>
27 no such user). 27 </refnamediv>
28 28
29 <p>This milter will also decode (uuencode, base64, mime, html entity, 29 <refsynopsisdiv id='synopsis.1'>
30 url encodings) and scan for HTTP and HTTPS URLs and bare hostnames in 30 <title>Synopsis</title>
31 the body of the mail. If any of those host names have A or NS records 31 <cmdsynopsis>
32 on the SBL (or a single configurable DNSBL), the mail will be rejected 32 <command>@PACKAGE@</command>
33 unless previously whitelisted. This milter also counts the number of 33 <arg><option>-c</option></arg>
34 invalid HTML tags, and can reject mail if that count exceeds your 34 <arg><option>-s</option></arg>
35 specified limit. 35 <arg><option>-d <replaceable class="parameter">n</replaceable></option></arg>
36 36 <arg><option>-e <replaceable class="parameter">from|to</replaceable></option></arg>
37 <p>The DNSBL milter reads a text configuration file (dnsbl.conf) on 37 <arg><option>-r <replaceable class="parameter">local-domain-socket</replaceable></option></arg>
38 startup, and whenever the config file (or any of the referenced include 38 <arg><option>-p <replaceable class="parameter">sendmail-socket</replaceable></option></arg>
39 files) is changed. The entire configuration file is case insensitive. 39 <arg><option>-t <replaceable class="parameter">timeout</replaceable></option></arg>
40 If the configuration cannot be loaded due to a syntax error, the milter 40 </cmdsynopsis>
41 will log the error and quit. If the configuration cannot be reloaded 41 </refsynopsisdiv>
42 after being modified, the milter will log the error and send an email to 42
43 root from dnsbl@$hostname. You probably want to add dnsbl@$hostname 43 <refsect1 id='options.1'>
44 to your /etc/mail/virtusertable since otherwise sendmail will reject 44 <title>Options</title>
45 that message. 45 <variablelist>
46 46 <varlistentry>
47 <hr> <center>DCC Issues</center> 47 <term>-c</term>
48 <p>If you are also using the <a 48 <listitem>
49 href="http://www.rhyolite.com/anti-spam/dcc/">DCC</a> milter, there are 49 <para>
50 a few considerations. You may need to whitelist senders from the DCC 50 Load the configuration file, print a canonical form
51 bulk detector, or from the DNS based lists. Those are two very 51 of the configuration on stdout, and exit.
52 different reasons for whitelisting. The former is done thru the DCC 52 </para>
53 whiteclnt config file, the latter is done thru the DNSBL milter config 53 </listitem>
54 file. 54 </varlistentry>
55 55 <varlistentry>
56 <p>You may want to blacklist some specific senders or sending domains. 56 <term>-s</term>
57 This could be done thru either the DCC (on a global basis, or for a 57 <listitem>
58 specific single recipient). We prefer to do such blacklisting via the 58 <para>
59 DNSBL milter config, since it can be done for a collection of recipient 59 Stress test the configuration loading code by repeating
60 mail domains. The DCC approach has the feature that you can capture the 60 the load/free cycle in an infinite loop.
61 entire message in the DCC log files. The DNSBL milter approach has the 61 </para>
62 feature that the mail is rejected earlier (at RCPT TO time), and the 62 </listitem>
63 sending machine just gets a generic "550 5.7.1 no such user" message. 63 </varlistentry>
64 64 <varlistentry>
65 <p>The DCC whiteclnt file can be included in the DNSBL milter config by 65 <term>-d <replaceable class="parameter">n</replaceable></term>
66 the dcc_to and dcc_from statements. This will import the (env_to, 66 <listitem>
67 env_from, and substitute mail_host) entries from the DCC config into the 67 <para>
68 DNSBL config. This allows using the DCC config as the single point for 68 Set the debug level to <replaceable class="parameter">n</replaceable>.
69 white/blacklisting. 69 </para>
70 70 </listitem>
71 <p>Consider the case where you have multiple clients, each with their 71 </varlistentry>
72 own mail servers, and each running their own DCC milters. Each client 72 <varlistentry>
73 is using the DCC facilities for envelope from/to white/blacklisting. 73 <term>-e <replaceable class="parameter">from|to</replaceable></term>
74 Presumably you can use rsync or scp to fetch copies of your clients' DCC 74 <listitem>
75 whiteclnt files on a regular basis. Your mail server, acting as a 75 <para>
76 backup MX for your clients, can use the DNSBL milter, and include those 76 Print the results of looking up the from and to addresses in the
77 client DCC config files. The envelope from/to white/blacklisting will 77 current configuration. The | character is used to separate the from and to
78 be appropriately tagged and used only for the domains controlled by each 78 addresses in the argument to the -e switch.
79 of those clients. 79 </para>
80 80 </listitem>
81 <hr> <center>Definitions</center> 81 </varlistentry>
82 82 <varlistentry>
83 <p>CONTEXT - a collection of parameters that defines the filtering 83 <term>-r <replaceable class="parameter">local-domain-socket</replaceable></term>
84 context to be used for a collection of envelope recipient addresses. 84 <listitem>
85 The context includes such things as the list of DNSBLs to be used, and 85 <para>
86 the various content filtering parameters. 86 Set the local socket used for the connection to our own dns resolver processes.
87 87 </para>
88 <p>DNSBL - a named DNS based blocking list is defined by a dns suffix 88 </listitem>
89 (e.g. sbl-xbl.spamhaus.org) and a message string that is used to 89 </varlistentry>
90 generate the "550 5.7.1" smtp error return code. The names of these 90 <varlistentry>
91 DNSBLs will be used to define the DNSBL-LISTs. 91 <term>-p <replaceable class="parameter">sendmail-socket</replaceable></term>
92 92 <listitem>
93 <p>DNSBL-LIST - a named list of DNSBLs that will be used for specific 93 <para>
94 recipients or recipient domains. 94 Set the socket used for the milter connection to sendmail. This is either
95 95 "inet:port@ip-address" or "local:local-domain-socket-file-name".
96 <hr> <center>Filtering Procedure</center> 96 </para>
97 97 </listitem>
98 <p>If the client has authenticated with sendmail, the mail is accepted, 98 </varlistentry>
99 the filtering contexts are not used, the dns lists are not checked, and 99 <varlistentry>
100 the body content is not scanned. Otherwise, we follow these steps for 100 <term>-t <replaceable class="parameter">timeout</replaceable></term>
101 each recipient. 101 <listitem>
102 102 <para>
103 <ol> 103 Set the timeout in seconds used for communication with sendmail.
104 104 </para>
105 <li>The envelope to email address is used to find an initial filtering 105 </listitem>
106 context. We first look for a context that specified the full email 106 </varlistentry>
107 address in the env_to statement. If that is not found, we look for a 107 </variablelist>
108 context that specified the entire domain name of the envelope recipient 108 </refsect1>
109 in the env_to statement. If that is not found, we look for a context 109
110 that specified the user@ part of the envelope recipient in the env_to 110 <refsect1>
111 statement. If that is not found, we use the first top level context 111 <title>Usage</title>
112 defined in the config file. 112 <para><command>@PACKAGE@</command> -c</para>
113 113 <para><command>@PACKAGE@</command> -s</para>
114 <br><br><li>The initial filtering context may redirect to a child 114 <para><command>@PACKAGE@</command> -d 2</para>
115 context based on the values in the initial context's env_from statement. 115 <para><command>@PACKAGE@</command> -e'someone@aol.com|localname@mydomain.tld'</para>
116 We look for [1) the full envelope from email address, 2) the domain name 116 <para><command>@PACKAGE@</command> -d 10 -r /var/run/dnsbl/dnsbl.resolver.sock -p local:/var/run/dnsbl/dnsbl.sock</para>
117 part of the envelope from address, 3) the user@ part of the envelope 117 </refsect1>
118 from address] in that context's env_from statement, with values that 118
119 point to a child context. If such an entry is found, we switch to that 119 <refsect1 id='introduction.1'>
120 child filtering context. 120 <title>Introduction</title>
121 121 <para>
122 <br><br><li>We look up [1) the full envelope from email address, 2) the 122 Consider the case of a mail server that is acting as secondary MX for a
123 domain name part of the envelope from address, 3) the user@ part of the 123 collection of clients, each of which has a collection of mail domains.
124 envelope from address] in the filtering context env_from statement. 124 Each client may use their own collection of DNSBLs on their primary mail
125 That results in one of (white, black, unknown, inherit). 125 server. We present here a mechanism whereby the backup mail server can
126 126 use the correct set of DNSBLs for each recipient for each message. As a
127 <br><br><li>If the answer is black, mail to this recipient is rejected 127 side-effect, it gives us the ability to customize the set of DNSBLs on a
128 with "no such user", and the dns lists are not checked. 128 per-recipient basis, so that fred@example.com could use SPEWS and the
129 129 SBL, where all other users @example.com use only the SBL.
130 <br><br><li>If the answer is white, mail to this recipient is accepted 130 </para>
131 and the dns lists are not checked. 131 <para>
132 132 This milter can also verify the envelope from/recipient pairs with the
133 <br><br><li>If the answer is unknown, we don't reject yet, but the dns 133 primary MX server. This allows the backup mail servers to properly
134 lists will be checked, and the content may be scanned. 134 reject mail sent to invalid addresses. Otherwise, the backup mail
135 135 servers will accept that mail, and then generate a bounce message when
136 <br><br><li>If the answer is inherit, we repeat the envelope from search 136 the message is forwarded to the primary server (and rejected there with
137 in the parent context. 137 no such user).
138 138 </para>
139 <br><br><li>The dns lists specified in the filtering context are checked 139 <para>
140 and the mail is rejected if any list has an A record for the standard 140 This milter will also decode (uuencode, base64, mime, html entity, url
141 dns based lookup scheme (reversed octets of the client followed by the 141 encodings) and scan for HTTP and HTTPS URLs and bare hostnames in the
142 dns suffix). 142 body of the mail. If any of those host names have A or NS records on
143 143 the SBL (or a single configurable DNSBL), the mail will be rejected
144 <br><br><li>If the mail has not been accepted or rejected yet, we look 144 unless previously whitelisted. This milter also counts the number of
145 for a verification context, which is the closest ancestor of the 145 invalid HTML tags, and can reject mail if that count exceeds your
146 filtering context that both specifies a verification host, and which 146 specified limit.
147 covers the envelope to address. If we find such a verification context, 147 </para>
148 and the verification host is not our own hostname, we open an smtp 148 <para>
149 conversation with that verification host. The current envelope from and 149 The DNSBL milter reads a text configuration file (dnsbl.conf) on
150 recipient to values are passed to that verification host. If we receive 150 startup, and whenever the config file (or any of the referenced include
151 a 5xy response to those commands, we reject the current recipient with "no 151 files) is changed. The entire configuration file is case insensitive.
152 such user". 152 If the configuration cannot be loaded due to a syntax error, the milter
153 153 will log the error and quit. If the configuration cannot be reloaded
154 <br><br><li>If the mail has not been accepted or rejected yet, and the 154 after being modified, the milter will log the error and send an email to
155 filtering context enables content filtering, and this is the first such 155 root from dnsbl@$hostname. You probably want to add dnsbl@$hostname
156 recipient in this smtp transaction, we set the content filtering 156 to your /etc/mail/virtusertable since otherwise sendmail will reject
157 parameters from this context, and enable content filtering for the body 157 that message.
158 of this message. 158 </para>
159 159 </refsect1>
160 </ol> 160
161 161 <refsect1 id='todo.1'>
162 <p>If content filtering is enabled for this body, the mail text is 162 <title>DCC Issues</title>
163 decoded (uuencode, base64, mime, html entity, url encodings), scanned 163 <para>
164 for HTTP and HTTPS URLs, and the first &lt;configurable&gt; host names 164 If you are also using the <ulink
165 are checked for their presence on the single &lt;configurable&gt; DNSBL. 165 url="http://www.rhyolite.com/anti-spam/dcc/">DCC</ulink> milter, there
166 The only known list that is suitable for this purpose is the SBL. If 166 are a few considerations. You may need to whitelist senders from the
167 any of those host names are on that DNSBL (or have nameservers that are 167 DCC bulk detector, or from the DNS based lists. Those are two very
168 on that list), and it is not on the &lt;configurable&gt; ignore list, 168 different reasons for whitelisting. The former is done thru the DCC
169 the mail is rejected. We also scan for excessive bad html tags, and if 169 whiteclnt config file, the latter is done thru the DNSBL milter config
170 a &lt;configurable&gt; limit is exceeded, the mail is rejected. 170 file.
171 171 </para>
172 <hr> <center>Sendmail access vs. DNSBL</center> 172 <para>
173 <p>With the standard sendmail.mc dnsbl FEATURE, the dnsbl checks may be 173 You may want to blacklist some specific senders or sending domains.
174 suppressed by entries in the /etc/mail/access database. For example, 174 This could be done thru either the DCC (on a global basis, or for a
175 suppose you control a /18 of address space, and have allocated some /24s 175 specific single recipient). We prefer to do such blacklisting via the
176 to some clients. You have access entries like 176 DNSBL milter config, since it can be done for a collection of recipient
177 177 mail domains. The DCC approach has the feature that you can capture the
178 <pre> 178 entire message in the DCC log files. The DNSBL milter approach has the
179 192.168.4 OK 179 feature that the mail is rejected earlier (at RCPT TO time), and the
180 192.168.17 OK 180 sending machine just gets a generic "550 5.7.1 no such user" message.
181 </pre> 181 </para>
182 182 <para>
183 <p>to allow those clients to smarthost thru your mail server. Now if 183 The DCC whiteclnt file can be included in the DNSBL milter config by the
184 one of those clients happens to get infected with a virus that turns a 184 dcc_to and dcc_from statements. This will import the (env_to, env_from,
185 machine into an open proxy, and their 192.168.4.45 lands on the SBL-XBL, 185 and substitute mail_host) entries from the DCC config into the DNSBL
186 you will still wind up allowing that infected machine to smarthost thru 186 config. This allows using the DCC config as the single point for
187 your mail servers. 187 white/blacklisting.
188 188 </para>
189 <p>With this DNSBL milter, the sendmail access database cannot override 189 <para>
190 the dnsbl checks, so that machine won't be able to send mail to or thru 190 Consider the case where you have multiple clients, each with their own
191 your smarthost mail server (unless the virus/proxy can use smtp-auth). 191 mail servers, and each running their own DCC milters. Each client is
192 192 using the DCC facilities for envelope from/to white/blacklisting.
193 <p>Using the standard sendmail features, you would add access entries to 193 Presumably you can use rsync or scp to fetch copies of your clients' DCC
194 allow hosts on your local network to relay thru your mail server. Those 194 whiteclnt files on a regular basis. Your mail server, acting as a
195 OK entries in the sendmail access database will override all the dnsbl 195 backup MX for your clients, can use the DNSBL milter, and include those
196 checks. With this DNSBL milter, you will need to have the local users 196 client DCC config files. The envelope from/to white/blacklisting will
197 authenticate with smtp-auth to get the same effect. You might find <a 197 be appropriately tagged and used only for the domains controlled by each
198 href="http://www.ists.dartmouth.edu/classroom/sendmail-ssl-how-to.php"> 198 of those clients.
199 these directions</a> helpful for setting up smtp-auth if you are on RH 199 </para>
200 Linux. 200 </refsect1>
201 201
202 <hr> <center>Installation and configuration</center> 202 <refsect1 id='todo.1'>
203 <p>Usage: Note that this has ONLY been tested on Linux, specifically 203 <title>Definitions</title>
204 RedHat Linux. In particular, this milter makes no attempt to understand 204 <para>
205 IPv6. Your mileage will vary. You will need at a minimum a C++ 205 CONTEXT - a collection of parameters that defines the filtering context
206 compiler with a minimally thread safe STL implementation. The 206 to be used for a collection of envelope recipient addresses. The
207 distribution includes a test.cpp program. If it fails this milter won't 207 context includes such things as the list of DNSBLs to be used, and the
208 work. If it passes, this milter might work. 208 various content filtering parameters.
209 209 </para>
210 Fetch <a href="http://www.five-ten-sg.com/util/dnsbl.tar.gz">dnsbl.tar.gz</a> 210 <para>
211 and 211 DNSBL - a named DNS based blocking list is defined by a dns suffix (e.g.
212 212 sbl-xbl.spamhaus.org) and a message string that is used to generate the
213 <pre> 213 "550 5.7.1" smtp error return code. The names of these DNSBLs will be
214 tar xfvz dnsbl.tar.gz 214 used to define the DNSBL-LISTs.
215 bash install.bash 215 </para>
216 </pre> 216 <para>
217 217 DNSBL-LIST - a named list of DNSBLs that will be used for specific
218 Read and understand the contents of that install.bash script before you 218 recipients or recipient domains.
219 run it. It may not be suitable for your system. Modify your 219 </para>
220 sendmail.mc by removing all the "FEATURE(dnsbl" lines, add the following 220 </refsect1>
221 line in your sendmail.mc and rebuild the .cf file 221
222 222 <refsect1 id='todo.1'>
223 <pre> 223 <title>Filtering Procedure</title>
224 INPUT_MAIL_FILTER(`dnsbl', `S=local:/var/run/dnsbl/dnsbl.sock, F=T, T=C:30s;S:5m;R:5m;E:5m') 224 <para>
225 </pre> 225 If the client has authenticated with sendmail, the mail is accepted, the
226 226 filtering contexts are not used, the dns lists are not checked, and the
227 Read the sample <a 227 body content is not scanned. Otherwise, we follow these steps for each
228 href="http://www.five-ten-sg.com/dnsbl/dnsbl.conf">/etc/dnsbl/dnsbl.conf</a> 228 recipient.
229 file and modify it to fit your configuration. You can test your 229 </para>
230 configuration files, and see a readable internal dump of them on stdout 230 <orderedlist>
231 with 231 <listitem>
232 232 The envelope to email address is used to find an initial filtering
233 <pre> 233 context. We first look for a context that specified the full email
234 cd /etc/dnsbl 234 address in the env_to statement. If that is not found, we look for a
235 /usr/sbin/dnsbl -c 235 context that specified the entire domain name of the envelope recipient
236 </pre> 236 in the env_to statement. If that is not found, we look for a context
237 237 that specified the user@ part of the envelope recipient in the env_to
238 You can check a specific envelope from/to pair with 238 statement. If that is not found, we use the first top level context
239 239 defined in the config file.
240 <pre> 240 </listitem>
241 cd /etc/dnsbl 241 <listitem>
242 from="$1" # or your from address 242 The initial filtering context may redirect to a child context based on
243 to="$2" # or your to address 243 the values in the initial context's env_from statement. We look for [1)
244 /usr/sbin/dnsbl -e "$from"'|'"$to" 244 the full envelope from email address, 2) the domain name part of the
245 </pre> 245 envelope from address, 3) the user@ part of the envelope from address]
246 246 in that context's env_from statement, with values that point to a child
247 <hr> <center>Performance issues</center> 247 context. If such an entry is found, we switch to that child filtering
248 248 context.
249 <p>Consider a high volume high performance machine running sendmail. 249 </listitem>
250 Each sendmail process can do its own dns resolution. Typically, such 250 <listitem>
251 dns resolver libraries are not thread safe, and so must be protected by 251 We look up [1) the full envelope from email address, 2) the domain name
252 some sort of mutex in a threaded environment. When we add a milter to 252 part of the envelope from address, 3) the user@ part of the envelope
253 sendmail, we now have a collection of sendmail processes, and a 253 from address] in the filtering context env_from statement. That results
254 collection of milter threads. 254 in one of (white, black, unknown, inherit).
255 255 </listitem>
256 <p>We will be doing a lot of dns lookups per mail message, and at least 256 <listitem>
257 some of those will take many tens of seconds. If all this dns work is 257 If the answer is black, mail to this recipient is rejected with "no such
258 serialized inside the milter, we have an upper limit of about 25K mail 258 user", and the dns lists are not checked.
259 messages per day. That is clearly not sufficient for many sites. 259 </listitem>
260 260 <listitem>
261 <p>Since we want to do parallel dns resolution across those milter 261 If the answer is white, mail to this recipient is accepted and the dns
262 threads, we add another collection of dns resolver processes. Each 262 lists are not checked.
263 sendmail process is talking to a milter thread over a socket, and each 263 </listitem>
264 milter thread is talking to a dns resolver process over another socket. 264 <listitem>
265 265 If the answer is unknown, we don't reject yet, but the dns lists will be
266 <p>Suppose we are processing 20 messages per second, and each message 266 checked, and the content may be scanned.
267 requires 20 seconds of dns work. Then we will have 400 sendmail 267 </listitem><listitem>
268 processes, 400 milter threads, and 400 dns resolver processes. Of 268 If the answer is inherit, we repeat the envelope from search in the
269 course that steady state is very unlikely to happen. 269 parent context.
270 270 </listitem>
271 <hr> <center>Rejected Ideas</center> 271 <listitem>
272 272 The dns lists specified in the filtering context are checked and the
273 <p>The following ideas have been considered and rejected. 273 mail is rejected if any list has an A record for the standard dns based
274 274 lookup scheme (reversed octets of the client followed by the dns
275 <p>Add max_recipients for each mail domain to the configuration. 275 suffix).
276 Recipients in excess of that limit will be rejected, and all the 276 </listitem>
277 recipients in that domain will be removed if there are some other 277 <listitem>
278 whitelisted recipients. Current spammers *very* rarely send more than 278 If the mail has not been accepted or rejected yet, we look for a
279 ten recipients in a single smtp transaction, so this won't stop 279 verification context, which is the closest ancestor of the filtering
280 any significant amount of spam. 280 context that both specifies a verification host, and which covers the
281 281 envelope to address. If we find such a verification context, and the
282 <p>Add poison addresses to the configuration. If any recipient is 282 verification host is not our own hostname, we open an smtp conversation
283 poison, all recipients are rejected even if they would be whitelisted, 283 with that verification host. The current envelope from and recipient to
284 and the data is rejected if sent. I have a collection of spam trap 284 values are passed to that verification host. If we receive a 5xy
285 addresses that would be suitable for such use. Based on my log files, 285 response to those commands, we reject the current recipient with "no such
286 any mail to those spam trap addresses is rejected based on either dnsbl 286 user".
287 lookups or the DCC. So this won't result in blocking any additional 287 </listitem>
288 spam. 288 <listitem>
289 289 If the mail has not been accepted or rejected yet, and the filtering
290 <p>Add an option to only allow one recipient if the return path is 290 context enables content filtering, and this is the first such recipient
291 empty. Based on my log files, there is no mail that violates this 291 in this smtp transaction, we set the content filtering parameters from
292 check. 292 this context, and enable content filtering for the body of this message.
293 293 </listitem>
294 <p>Reject the mail if the envelope from domain name contains any MX 294 </orderedlist>
295 records pointing to 127.0.0.0/8. I don't see any significant amount of spam 295 <para>
296 sent with such domain names. 296 If content filtering is enabled for this body, the mail text is decoded
297 297 (uuencode, base64, mime, html entity, url encodings), scanned for HTTP
298 <hr> <center>Future work</center> 298 and HTTPS URLs, and the first &lt;configurable&gt; host names are
299 299 checked for their presence on the single &lt;configurable&gt; DNSBL.
300 <p>The following ideas are under consideration. 300 The only known list that is suitable for this purpose is the SBL. If
301 301 any of those host names are on that DNSBL (or have nameservers that are
302 <p>Add a per-context option to reject mail if the number of digits in 302 on that list), and it is not on the &lt;configurable&gt; ignore list,
303 the reverse dns client name exceeds some threshold. 303 the mail is rejected. We also scan for excessive bad html tags, and if
304 304 a &lt;configurable&gt; limit is exceeded, the mail is rejected.
305 <pre> 305 </para>
306 $Id$ 306 </refsect1>
307 </pre> 307
308 </body> 308 <refsect1>
309 </html> 309 <title>Sendmail access vs. DNSBL</title>
310 <para>
311 With the standard sendmail.mc dnsbl FEATURE, the dnsbl checks may be
312 suppressed by entries in the /etc/mail/access database. For example,
313 suppose you control a /18 of address space, and have allocated some /24s
314 to some clients. You have access entries like
315 <screen>
316 192.168.4 OK
317 192.168.17 OK
318 </screen>
319 </para>
320 <para>
321 to allow those clients to smarthost thru your mail server. Now if one
322 of those clients happens to get infected with a virus that turns a machine
323 into an open proxy, and their 192.168.4.45 lands on the SBL-XBL, you
324 will still wind up allowing that infected machine to smarthost thru your
325 mail servers.
326 </para>
327 <para>
328 With this DNSBL milter, the sendmail access database cannot override the
329 dnsbl checks, so that machine won't be able to send mail to or thru your
330 smarthost mail server (unless the virus/proxy can use smtp-auth).
331 </para>
332 <para>
333 Using the standard sendmail features, you would add access entries to
334 allow hosts on your local network to relay thru your mail server. Those
335 OK entries in the sendmail access database will override all the dnsbl
336 checks. With this DNSBL milter, you will need to have the local users
337 authenticate with smtp-auth to get the same effect. You might find
338 <ulink
339 url="http://www.ists.dartmouth.edu/classroom/sendmail-ssl-how-to.php">
340 these directions</ulink> helpful for setting up smtp-auth if you are on
341 RH Linux.
342 </para>
343 </refsect1>
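
The lookup behind "their 192.168.4.45 lands on the SBL-XBL" can be illustrated with a small Python sketch. It only builds the DNS name that the standard lookup scheme queries (reversed octets of the client followed by the dns suffix); it does not perform the query. The function name `dnsbl_query_name` is hypothetical and is not part of the milter, which is written in C++.

```python
import ipaddress


def dnsbl_query_name(client_ip: str, suffix: str) -> str:
    """Return the DNS name to query for an IPv4 client on a DNSBL.

    Hypothetical helper for illustration only. IPv4 only, matching the
    milter's stated lack of IPv6 support.
    """
    # IPv4Address validates the input and normalizes it to dotted-quad form.
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    # Reverse the octets and append the list's dns suffix.
    return ".".join(reversed(octets)) + "." + suffix.strip(".")


if __name__ == "__main__":
    print(dnsbl_query_name("192.168.4.45", "sbl-xbl.spamhaus.org"))
    # prints 45.4.168.192.sbl-xbl.spamhaus.org
```

If that name resolves to an A record, the client is treated as listed. With the milter, an OK entry for 192.168.4 in the sendmail access database does not suppress this check.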
344
345 <refsect1>
346 <title>Installation and configuration</title>
347 <para>
348 This is a standard GNU autoconf/automake installation, so the normal
349 <screen>
350 ./configure
351 make
352 su
353 make install
354 </screen>
355 works. "make chkconfig" will set up the init.d runlevel scripts.
356 </para>
357 <para>
358 Note that this has ONLY been tested on Linux, specifically RedHat Linux.
359 In particular, this milter makes no attempt to understand IPv6. Your
360 mileage will vary. You will need at a minimum a C++ compiler with a
361 minimally thread safe STL implementation. The distribution includes a
362 test.cpp program. If it fails this milter won't work. If it passes,
363 this milter might work.
364 </para>
365 <para>
366 Modify your sendmail.mc by removing all the "FEATURE(dnsbl" lines, add
367 the following line in your sendmail.mc and rebuild the .cf file
368 </para>
369 <para>
370 <screen>
371 INPUT_MAIL_FILTER(`dnsbl', `S=local:/var/run/dnsbl/dnsbl.sock, F=T, T=C:30s;S:5m;R:5m;E:5m')
372 </screen>
373 </para>
374 <para>
375 Modify the default <citerefentry>
376 <refentrytitle>@PACKAGE@.conf</refentrytitle> <manvolnum>5</manvolnum>
377 </citerefentry> configuration.
378 </para>
379 </refsect1>
380
381 <refsect1 id='todo.1'>
382 <title>Performance Issues</title>
383 <para>
384 Consider a high volume high performance machine running sendmail. Each
385 sendmail process can do its own dns resolution. Typically, such dns
386 resolver libraries are not thread safe, and so must be protected by some
387 sort of mutex in a threaded environment. When we add a milter to
388 sendmail, we now have a collection of sendmail processes, and a
389 collection of milter threads.
390 </para>
391 <para>
392 We will be doing a lot of DNS lookups per mail message, and at least
393 some of those will take many tens of seconds. If all this DNS work is
394 serialized inside the milter, we have an upper limit of about 25K mail
395 messages per day. That is clearly not sufficient for many sites.
396 </para>
397 <para>
398 Since we want to do parallel DNS resolution across those milter threads,
399 we add another collection of DNS resolver processes. Each sendmail
400 process is talking to a milter thread over a socket, and each milter
401 thread is talking to a DNS resolver process over another socket.
402 </para>
403 <para>
404 Suppose we are processing 20 messages per second, and each message
405 requires 20 seconds of DNS work. Then we will have 400 sendmail
406 processes, 400 milter threads, and 400 DNS resolver processes. Of
407 course that steady state is very unlikely to happen.
408 </para>
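<para>
As a rough check on the figures in this section, the sketch below works
through the arithmetic in C++. It is illustrative only and is not part
of the milter. The function names are invented for this example, and
the 3.5 second average is an assumption chosen because it lands close
to the 25K messages per day quoted above.
</para>
<literallayout class="monospaced"><![CDATA[
#include <cstdio>

// Messages per day when all DNS work is serialized, given the average
// number of seconds of DNS work per message (hypothetical helper).
double serialized_daily_limit(double dns_seconds_per_message) {
    return 86400.0 / dns_seconds_per_message;
}

// Steady-state number of messages in flight: arrival rate multiplied
// by the time each message spends waiting on DNS (Little's law).
double messages_in_flight(double messages_per_second,
                          double dns_seconds_per_message) {
    return messages_per_second * dns_seconds_per_message;
}

int main() {
    std::printf("serialized limit: %.0f messages/day\n",
                serialized_daily_limit(3.5));
    std::printf("messages in flight: %.0f\n",
                messages_in_flight(20.0, 20.0));
    return 0;
}
]]></literallayout>
<para>
Each message in flight occupies one sendmail process, one milter thread,
and one DNS resolver process, which is where the three counts of 400
come from.
</para>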
409 </refsect1>
410
411
412 <refsect1 id='rejected.1'>
413 <title>Rejected Ideas</title>
414 <para>
415 The following ideas have been considered and rejected.
416 </para>
417 <para>
418 Add max_recipients for each mail domain to the configuration.
419 Recipients in excess of that limit will be rejected, and all the
420 recipients in that domain will be removed if there are some other
421 whitelisted recipients. Current spammers *very* rarely send to more than
422 ten recipients in a single SMTP transaction, so this won't stop any
423 significant amount of spam.
424 </para>
425 <para>
426 Add poison addresses to the configuration. If any recipient is
427 poison, all recipients are rejected even if they would be whitelisted,
428 and the data is rejected if sent. I have a collection of spam trap
429 addresses that would be suitable for such use. Based on my log files,
430 any mail to those spam trap addresses is rejected based on either dnsbl
431 lookups or the DCC. So this won't result in blocking any additional
432 spam.
433 </para>
434 <para>
435 Add an option to only allow one recipient if the return path is
436 empty. Based on my log files, there is no mail that violates this
437 check.
438 </para>
439 <para>
440 Reject the mail if the envelope from domain name has any MX
441 records pointing to 127.0.0.0/8. I don't see any significant amount of
442 spam sent with such domain names.
443 </para>
444 </refsect1>
445
446 <refsect1 id='todo.1'>
447 <title>TODO</title>
448 <para>
449 The following ideas are under consideration.
450 </para>
451 <para>
452 Add a per-context option to reject mail if the number of digits in
453 the reverse DNS client name exceeds some threshold.
454 </para>
455 </refsect1>
456
457 <refsect1>
458 <title>Configuration</title>
459 <para>
460 The configuration file is documented in <citerefentry>
461 <refentrytitle>@PACKAGE@.conf</refentrytitle> <manvolnum>5</manvolnum>
462 </citerefentry>. Any change to the config file, or any file included
463 from that config file, will cause it to be reloaded within three
464 minutes.
465 </para>
466 </refsect1>
467
468 <refsect1>
469 <title>Copyright</title>
470 <para>
471 Copyright (C) 2005 by 510 Software Group &lt;carl@five-ten-sg.com&gt;
472 </para>
473 <para>
474 This program is free software; you can redistribute it and/or modify it
475 under the terms of the GNU General Public License as published by the
476 Free Software Foundation; either version 2, or (at your option) any
477 later version.
478 </para>
479 <para>
480 You should have received a copy of the GNU General Public License along
481 with this program; see the file COPYING. If not, please write to the
482 Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
483 </para>
484 </refsect1>
485
486 <refsect1><title>Version</title>
487 <para>
488 $Id$
489 </para>
490 </refsect1>
491 </refentry>
492
493
494 <refentry id="@PACKAGE@.conf.5">
495 <refentryinfo>
496 <date>2005-12-18</date>
497 </refentryinfo>
498
499 <refmeta>
500 <refentrytitle>@PACKAGE@.conf</refentrytitle>
501 <manvolnum>5</manvolnum>
502 <refmiscinfo>@PACKAGE@ @VERSION@</refmiscinfo>
503 </refmeta>
504
505 <refnamediv id='name.5'>
506 <refname>@PACKAGE@.conf</refname>
507 <refpurpose>configuration file for @PACKAGE@</refpurpose>
508 </refnamediv>
509
510 <refsynopsisdiv id='synopsis.5'>
511 <title>Synopsis</title>
512 <cmdsynopsis>
513 <command>@PACKAGE@.conf</command>
514 </cmdsynopsis>
515 </refsynopsisdiv>
516
517 <refsect1 id='description.5'>
518 <title>Description</title>
519 <para>The <command>@PACKAGE@.conf</command> configuration file is
520 specified by this partial BNF description.</para>
521
522 <literallayout class="monospaced"><![CDATA[
523 CONFIG = {CONTEXT ";"}+
524 CONTEXT = "context" NAME "{" {STATEMENT}+ "}"
525 STATEMENT = (DNSBL | DNSBLLIST | CONTENT | ENV-TO | VERIFY | CONTEXT | ENV-FROM) ";"
526
527 DNSBL = "dnsbl" NAME DNSPREFIX ERROR-MSG
528
529 DNSBLLIST = "dnsbl_list" {NAME}+
530
531 CONTENT = "content" ("on" | "off") "{" {CONTENT-ST}+ "}"
532 CONTENT-ST = (FILTER | IGNORE | TLD | HTML-TAGS | HTML-LIMIT | HOST-LIMIT) ";"
533 FILTER = "filter" DNSPREFIX ERROR-MSG
534 IGNORE = "ignore" "{" {HOSTNAME [";"]}+ "}"
535 TLD = "tld" "{" {TLD [";"]}+ "}"
536 HTML-TAGS = "html_tags" "{" {HTMLTAG [";"]}+ "}"
537 ERROR-MSG = string containing exactly two %s replacement tokens for the client ip address
538
539 HTML-LIMIT = "html_limit" ("on" INTEGER ERROR-MSG | "off")
540
541 HOST-LIMIT = "host_limit" ("on" INTEGER ERROR-MSG | "off" | "soft" INTEGER)
542
543 ENV-TO = "env_to" "{" {(TO-ADDR | DCC-TO)}+ "}"
544 TO-ADDR = ADDRESS [";"]
545 DCC-TO = "dcc_to" ("ok" | "many") "{" DCCINCLUDEFILE "}" ";"
546
547 VERIFY = "verify" HOSTNAME
548
549 ENV-FROM = "env_from" [DEFAULT] "{" {(FROM-ADDR | DCC-FROM)}+ "}"
550 FROM-ADDR = ADDRESS VALUE [";"]
551 DCC-FROM = "dcc_from" "{" DCCINCLUDEFILE "}" ";"
552 DEFAULT = ("white" | "black" | "unknown" | "inherit" | "")
553 ADDRESS = (USER@ | DOMAIN | USER@DOMAIN)
554 VALUE = ("white" | "black" | "unknown" | "inherit" | CHILD-CONTEXT-NAME)]]></literallayout>
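<para>
To illustrate the ERROR-MSG rule, here is a minimal C++ sketch that
replaces each %s token with the client ip address. It is not taken from
the milter source: expand_error_msg is a hypothetical helper, and
192.0.2.99 is a documentation address used only as an example.
</para>
<literallayout class="monospaced"><![CDATA[
#include <cstdio>
#include <string>

// Replace every "%s" token in msg with the client ip address.
// A message without any token is returned unchanged.
std::string expand_error_msg(const std::string &msg,
                             const std::string &client_ip) {
    std::string out;
    std::string::size_type pos = 0;
    for (;;) {
        std::string::size_type hit = msg.find("%s", pos);
        if (hit == std::string::npos) {
            out.append(msg, pos, std::string::npos);  // copy the tail
            break;
        }
        out.append(msg, pos, hit - pos);  // text before the token
        out.append(client_ip);            // the substitution itself
        pos = hit + 2;                    // skip past "%s"
    }
    return out;
}

int main() {
    const std::string msg =
        "Mail from %s rejected - sbl; see "
        "http://www.spamhaus.org/query/bl?ip=%s";
    std::puts(expand_error_msg(msg, "192.0.2.99").c_str());
    return 0;
}
]]></literallayout>
<para>
The sketch scans for the tokens by hand instead of passing a string from
the configuration file to printf as a format, so a malformed message
cannot cause a read of missing arguments.
</para>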
555 </refsect1>
556
557 <refsect1 id='sample.5'>
558 <title>Sample</title>
559 <literallayout class="monospaced"><![CDATA[
560 context sample {
561 dnsbl local blackholes.five-ten-sg.com "Mail from %s rejected - local; see http://www.five-ten-sg.com/blackhole.php?%s";
562 dnsbl sbl sbl-xbl.spamhaus.org "Mail from %s rejected - sbl; see http://www.spamhaus.org/query/bl?ip=%s";
563 dnsbl xbl xbl.spamhaus.org "Mail from %s rejected - xbl; see http://www.spamhaus.org/query/bl?ip=%s";
564 dnsbl dul dul.dnsbl.sorbs.net "Mail from %s rejected - dul; see http://www.sorbs.net/lookup.shtml?%s";
565 dnsbl_list local sbl dul;
566
567 content on {
568 filter sbl-xbl.spamhaus.org "Mail containing %s rejected - sbl; see http://www.spamhaus.org/query/bl?ip=%s";
569 ignore { include "hosts-ignore.conf"; };
570 tld { include "tld.conf"; };
571 html_tags { include "html-tags.conf"; };
572 html_limit on 20 "Mail containing excessive bad html tags rejected";
573 html_limit off;
574 host_limit on 20 "Mail containing excessive host names rejected";
575 host_limit soft 20;
576 };
577
578 env_to {
579 # child contexts are not allowed to specify recipient addresses outside these domains
580 # leave this outer global context env_to empty to allow arbitrary recipients in child contexts
581 mydomain.com;
582 customer1.com;
583 customer1a.com;
584 customer1b.com;
585 customer2.com;
586 customer2a.com;
587 customer2b.com;
588 };
589
590 context whitelist {
591 content off {};
592 env_to {
593 # dcc_to ok { include "/var/dcc/whitecommon"; }; # copy the dcc OK values (env_to) into this context
594 };
595 env_from white {}; # white forces all unmatched from addresses (everyone in this case) to be whitelisted
596 # so all mail TO these env_to addresses is accepted
597 };
598
599 context abuse {
600 dnsbl_list xbl;
601 content off {};
602 env_to {
603 abuse@; # no content filtering on abuse reports
604 postmaster@; # ""
605 };
606 env_from unknown {}; # ignore all parent white/black listing
607 };
608
609 context minimal {
610 dnsbl_list sbl dul;
611 content on {};
612 env_to {
613 sales@mydomain.com;
614 };
615 };
616
617 context blacklist {
618 env_to {
619 dcc_to many { include "/var/dcc/whitecommon"; }; # copy the dcc MANY values (env_to) into this context
620 old-employee@mydomain.com;
621 };
622 env_from black {}; # black forces all unmatched from addresses (everyone in this case) to be blacklisted
623 # so all mail TO these env_to addresses is rejected
624 };
625
626 context vp { # special context for the vp
627 env_to {
628 vp@mydomain.com;
629 };
630 env_from inherit {
631 nai.com black; # the vp does not like nai
632 yahoo.com unknown; # override parent context blacklisting
633 mother@spammyisp.com white; # suppress dnsbl checking
634 };
635 };
636
637 context customer1 {
638 dnsbl_list sbl dul;
639 env_to {
640 customer1.com;
641 customer1a.com;
642 customer1b.com;
643 };
644
645 verify mail.customer1.com;
646
647 context customer1a {
648 env_to {
649 customer1a.com;
650 };
651 env_from black { # blacklist everything
652 first@acceptable.com unknown; # except these specific envelope senders
653 second@another.com unknown;
654 yahoo.com inherit; # delegate to the parent
655 };
656 };
657
658 env_from { # default value of the default is inherit
659 yahoo.com black; # no mail from yahoo
660 first@yahoo.com unknown; # except this one
661 };
662 };
663
664 context customer2 {
665 dnsbl_list sbl;
666 env_to {
667 customer2.com;
668 customer2a.com;
669 customer2b.com;
670 };
671 };
672
673 env_from unknown {
674 dcc_from { include "/var/dcc/whitecommon"; }; # copy the dcc OK/MANY values (env_from, substitute mail_host) into this context
675 abuse@ abuse; # replies to abuse reports use the abuse context
676 yahoo.com black; # don't take mail from yahoo
677 spammer@example.com black;
678 };
679 };]]></literallayout>
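<para>
The DNSPREFIX in each dnsbl statement names the DNS zone that is
queried. By the usual DNSBL convention, which is assumed here since this
page does not spell out the lookup, the octets of the client IPv4
address are reversed and prepended to that zone. The following C++
sketch shows the name construction only; dnsbl_query_name is a
hypothetical helper and 192.0.2.99 is a documentation address.
</para>
<literallayout class="monospaced"><![CDATA[
#include <cstdio>
#include <string>

// Build the DNS name to query for an IPv4 address in a DNSBL zone.
// Returns an empty string if ip is not a plain dotted quad.
std::string dnsbl_query_name(const std::string &ip,
                             const std::string &zone) {
    unsigned a, b, c, d;
    char extra;
    // Exactly four fields must convert; a fifth means trailing junk.
    if (std::sscanf(ip.c_str(), "%u.%u.%u.%u%c",
                    &a, &b, &c, &d, &extra) != 4)
        return "";
    if (a > 255 || b > 255 || c > 255 || d > 255)
        return "";
    char buf[16];  // "255.255.255.255" plus the terminating NUL
    std::snprintf(buf, sizeof buf, "%u.%u.%u.%u", d, c, b, a);
    return std::string(buf) + "." + zone;
}

int main() {
    std::puts(dnsbl_query_name("192.0.2.99",
                               "sbl-xbl.spamhaus.org").c_str());
    return 0;
}
]]></literallayout>
<para>
For the sbl entry in the sample above, this yields the name
99.2.0.192.sbl-xbl.spamhaus.org. Under the same convention, an address
is treated as listed when that name resolves to an A record.
</para>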
680 </refsect1>
681
682 <refsect1><title>Version</title>
683 <para>
684 $Id$
685 </para>
686 </refsect1>
687
688 </refentry>
689 </reference>