annotate xml/dnsbl.in @ 98:91c27c00048f
tokenizer errors now go thru syslog to be visible during config file reloads in normal operation
author | carl |
---|---|
date | Thu, 22 Sep 2005 21:57:08 -0700 |
parents | 53a2fbe3f761 |
children | f8963ddf7143 |
rev | line source |
---|---|
94 | 1 <html> |
2 | |
3 <head> | |
4 <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"> | |
98 | 5 <title>DNSBL Sendmail milter - Version 5.6</title> |
94 | 6 </head> |
7 | |
8 <center>Introduction</center> | |
9 <p>This milter is released under the GPL license version 2 included in | |
10 the LICENSE file in the distribution, and also available at | |
11 <a href="http://www.gnu.org/licenses/gpl.html">http://www.gnu.org/licenses/gpl.html</a>. | |
12 | |
13 <p>Consider the case of a mail server that is acting as secondary MX for | |
14 a collection of clients, each of which has a collection of mail domains. | |
15 Each client may use their own collection of DNSBLs on their primary mail | |
16 server. We present here a mechanism whereby the backup mail server can | |
17 use the correct set of DNSBLs for each recipient for each message. As a | |
18 side-effect, it gives us the ability to customize the set of DNSBLs on a | |
19 per-recipient basis, so that fred@example.com could use SPEWS and the | |
20 SBL, where all other users @example.com use only the SBL. | |
21 | |
22 <p>This milter can also verify the envelope from/recipient pairs with | |
23 the primary MX server. This allows the backup mail servers to properly | |
24 reject mail sent to invalid addresses. Otherwise, the backup mail | |
25 servers will accept that mail, and then generate a bounce message when | |
26 the message is forwarded to the primary server (and rejected there with | |
27 no such user). | |
28 | |
29 <p>This milter will also decode (uuencode, base64, mime, html entity, | |
30 url encodings) and scan for HTTP and HTTPS URLs and bare hostnames in | |
31 the body of the mail. If any of those host names have A or NS records | |
32 on the SBL (or a single configurable DNSBL), the mail will be rejected | |
33 unless previously whitelisted. This milter also counts the number of | |
34 invalid HTML tags, and can reject mail if that count exceeds your | |
35 specified limit. | |
36 | |
37 <p>The DNSBL milter reads a text configuration file (dnsbl.conf) on | |
38 startup, and whenever the config file (or any of the referenced include | |
39 files) is changed. The entire configuration file is case insensitive. | |
40 If the configuration cannot be loaded due to a syntax error, the milter | |
41 will log the error and quit. If the configuration cannot be reloaded | |
42 after being modified, the milter will log the error and send an email to | |
43 root from dnsbl@$hostname. You probably want to add dnsbl@$hostname | |
44 to your /etc/mail/virtusertable since otherwise sendmail will reject | |
45 that message. | |
46 | |
47 <hr> <center>DCC Issues</center> | |
48 <p>If you are also using the <a | |
49 href="http://www.rhyolite.com/anti-spam/dcc/">DCC</a> milter, there are | |
50 a few considerations. You may need to whitelist senders from the DCC | |
51 bulk detector, or from the DNS based lists. Those are two very | |
52 different reasons for whitelisting. The former is done thru the DCC | |
53 whiteclnt config file, the latter is done thru the DNSBL milter config | |
54 file. | |
55 | |
56 <p>You may want to blacklist some specific senders or sending domains. | |
57 This could be done thru the DCC (on a global basis, or for a | |
58 specific single recipient). We prefer to do such blacklisting via the | |
59 DNSBL milter config, since it can be done for a collection of recipient | |
60 mail domains. The DCC approach has the feature that you can capture the | |
61 entire message in the DCC log files. The DNSBL milter approach has the | |
62 feature that the mail is rejected earlier (at RCPT TO time), and the | |
63 sending machine just gets a generic "550 5.7.1 no such user" message. | |
64 | |
65 <p>The DCC whiteclnt file can be included in the DNSBL milter config by | |
66 the dcc_to and dcc_from statements. This will import the (env_to, | |
67 env_from, and substitute mail_host) entries from the DCC config into the | |
68 DNSBL config. This allows using the DCC config as the single point for | |
69 white/blacklisting. | |
70 | |
71 <p>Consider the case where you have multiple clients, each with their | |
72 own mail servers, and each running their own DCC milters. Each client | |
73 is using the DCC facilities for envelope from/to white/blacklisting. | |
74 Presumably you can use rsync or scp to fetch copies of your clients' DCC | |
75 whiteclnt files on a regular basis. Your mail server, acting as a | |
76 backup MX for your clients, can use the DNSBL milter, and include those | |
77 client DCC config files. The envelope from/to white/blacklisting will | |
78 be appropriately tagged and used only for the domains controlled by each | |
79 of those clients. | |
80 | |
81 <hr> <center>Definitions</center> | |
82 | |
83 <p>CONTEXT - a collection of parameters that defines the filtering | |
84 context to be used for a collection of envelope recipient addresses. | |
85 The context includes such things as the list of DNSBLs to be used, and | |
86 the various content filtering parameters. | |
87 | |
88 <p>DNSBL - a named DNS based blocking list is defined by a dns suffix | |
89 (e.g. sbl-xbl.spamhaus.org) and a message string that is used to | |
90 generate the "550 5.7.1" smtp error return code. The names of these | |
91 DNSBLs will be used to define the DNSBL-LISTs. | |
92 | |
93 <p>DNSBL-LIST - a named list of DNSBLs that will be used for specific | |
94 recipients or recipient domains. | |
95 | |
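A minimal sketch of how a DNSBL's dns suffix turns into an actual query name under the standard lookup scheme (reversed octets of the client address followed by the suffix). This is an illustration in Python, not the milter's own C++ code; the function name `dnsbl_query_name` is hypothetical, and no DNS query is performed here.

```python
import ipaddress


def dnsbl_query_name(client_ip: str, suffix: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets, then the DNSBL suffix."""
    # IPv4 only, matching the milter's stated lack of IPv6 support.
    # Raises ValueError if client_ip is not a valid IPv4 address.
    addr = ipaddress.IPv4Address(client_ip)
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{suffix.strip('.')}"


if __name__ == "__main__":
    print(dnsbl_query_name("192.168.4.45", "sbl-xbl.spamhaus.org"))
```

An A record for the resulting name means the client address is listed; a lookup that returns no such record means it is not.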
96 <hr> <center>Filtering Procedure</center> | |
97 | |
98 <p>If the client has authenticated with sendmail, the mail is accepted, | |
99 the filtering contexts are not used, the dns lists are not checked, and | |
100 the body content is not scanned. Otherwise, we follow these steps for | |
101 each recipient. | |
102 | |
103 <ol> | |
104 | |
105 <li>The envelope to email address is used to find an initial filtering | |
106 context. We first look for a context that specified the full email | |
107 address in the env_to statement. If that is not found, we look for a | |
108 context that specified the entire domain name of the envelope recipient | |
109 in the env_to statement. If that is not found, we look for a context | |
110 that specified the user@ part of the envelope recipient in the env_to | |
111 statement. If that is not found, we use the first top level context | |
112 defined in the config file. | |
113 | |
114 <br><br><li>The initial filtering context may redirect to a child | |
115 context based on the values in the initial context's env_from statement. | |
116 We look for [1) the full envelope from email address, 2) the domain name | |
117 part of the envelope from address, 3) the user@ part of the envelope | |
118 from address] in that context's env_from statement, with values that | |
119 point to a child context. If such an entry is found, we switch to that | |
120 child filtering context. | |
121 | |
122 <br><br><li>We look up [1) the full envelope from email address, 2) the | |
123 domain name part of the envelope from address, 3) the user@ part of the | |
124 envelope from address] in the filtering context env_from statement. | |
125 That results in one of (white, black, unknown, inherit). | |
126 | |
127 <br><br><li>If the answer is black, mail to this recipient is rejected | |
128 with "no such user", and the dns lists are not checked. | |
129 | |
130 <br><br><li>If the answer is white, mail to this recipient is accepted | |
131 and the dns lists are not checked. | |
132 | |
133 <br><br><li>If the answer is unknown, we don't reject yet, but the dns | |
134 lists will be checked, and the content may be scanned. | |
135 | |
136 <br><br><li>If the answer is inherit, we repeat the envelope from search | |
137 in the parent context. | |
138 | |
139 <br><br><li>The dns lists specified in the filtering context are checked | |
140 and the mail is rejected if any list has an A record for the standard | |
141 dns based lookup scheme (reversed octets of the client followed by the | |
142 dns suffix). | |
143 | |
95 | 144 <br><br><li>If the mail has not been accepted or rejected yet, we look |
145 for a verification context, which is the closest ancestor of the | |
146 filtering context that both specifies a verification host, and which | |
147 covers the envelope to address. If we find such a verification context, | |
148 and the verification host is not our own hostname, we open an smtp | |
149 conversation with that verification host. The current envelope from and | |
150 recipient to values are passed to that verification host. If we receive | |
151 a 5xy response to those commands, we reject the current recipient with "no | |
152 such user". | |
94 | 153 |
154 <br><br><li>If the mail has not been accepted or rejected yet, and the | |
155 filtering context enables content filtering, and this is the first such | |
156 recipient in this smtp transaction, we set the content filtering | |
157 parameters from this context, and enable content filtering for the body | |
158 of this message. | |
159 | |
160 </ol> | |
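The context search order in the steps above (full address, then domain, then user@, then a fallback) and the handling of an inherit answer can be sketched as follows. This is a simplified Python model, not the milter's implementation: the `Context` class, its `default` field, and the function names are assumptions made for illustration, the redirect to a child context is omitted, and an inherit answer that reaches the top level is treated as unknown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Context:
    name: str
    parent: Optional["Context"] = None
    # Maps an envelope-from key to white, black, unknown or inherit.
    env_from: Dict[str, str] = field(default_factory=dict)
    # Answer used when no key matches (an assumption of this sketch).
    default: str = "unknown"


def candidate_keys(address: str) -> List[str]:
    """Keys in the documented order: full address, domain, then user@."""
    address = address.lower()  # the configuration is case insensitive
    user, _, domain = address.rpartition("@")
    return [address, domain, user + "@"]


def find_initial_context(env_to: str, by_env_to: Dict[str, Context],
                         first_top_level: Context) -> Context:
    """Pick the initial filtering context for an envelope recipient."""
    for key in candidate_keys(env_to):
        if key in by_env_to:
            return by_env_to[key]
    return first_top_level


def resolve_env_from(env_from: str, context: Context) -> str:
    """Return white, black or unknown, following inherit up to the parent."""
    ctx: Optional[Context] = context
    while ctx is not None:
        answer = ctx.default
        for key in candidate_keys(env_from):
            if key in ctx.env_from:
                answer = ctx.env_from[key]
                break
        if answer != "inherit":
            return answer
        ctx = ctx.parent
    return "unknown"


if __name__ == "__main__":
    top = Context("main", env_from={"spammer.example": "black"})
    fred = Context("fred", parent=top, default="inherit",
                   env_from={"friend@example.org": "white"})
    by_env_to = {"example.com": top, "fred@example.com": fred}
    ctx = find_initial_context("fred@example.com", by_env_to, top)
    print(ctx.name, resolve_env_from("someone@spammer.example", ctx))
```

In this model a black answer corresponds to rejecting the recipient with "no such user", white to accepting without dns list checks, and unknown to continuing with the dns lists.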
161 | |
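The recipient verification step can be approximated with Python's standard smtplib, as a rough sketch of the smtp conversation described above rather than the milter's own code. The function names, port 25, and the 30 second timeout are assumptions; connection errors are not handled and would propagate to the caller.

```python
import smtplib


def is_permanent_failure(code: int) -> bool:
    """True for any 5xy smtp reply code."""
    return 500 <= code <= 599


def recipient_rejected(verify_host: str, env_from: str, env_to: str,
                       timeout: float = 30.0) -> bool:
    """Ask the verification host whether it would refuse this from/to pair."""
    # The context manager sends QUIT when the block exits.
    with smtplib.SMTP(verify_host, 25, timeout=timeout) as smtp:
        smtp.helo()
        code, _ = smtp.mail(env_from)   # MAIL FROM
        if is_permanent_failure(code):
            return True
        code, _ = smtp.rcpt(env_to)     # RCPT TO
        return is_permanent_failure(code)
```

A True result corresponds to rejecting the current recipient with "no such user"; 4xy replies are deliberately not treated as a rejection in this sketch.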
162 <p>If content filtering is enabled for this body, the mail text is | |
163 decoded (uuencode, base64, mime, html entity, url encodings), scanned | |
164 for HTTP and HTTPS URLs, and the first &lt;configurable&gt; host names | |
165 are checked for their presence on the single &lt;configurable&gt; DNSBL. | |
166 The only known list that is suitable for this purpose is the SBL. If | |
167 any of those host names is on that DNSBL (or has nameservers that are | |
168 on that list), and it is not on the &lt;configurable&gt; ignore list, | |
169 the mail is rejected. We also scan for excessive bad html tags, and if | |
170 a &lt;configurable&gt; limit is exceeded, the mail is rejected. | |
171 | |
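A simplified Python sketch of the host name extraction described above. It handles only html entity and url decoding (the uuencode, base64 and mime decoding stages and the bare hostname scan are left out), and the regular expression, the function name `extract_hosts`, and the choice that ignored hosts do not count toward the limit are all assumptions of this sketch.

```python
import html
import re
from typing import FrozenSet, List
from urllib.parse import unquote, urlsplit

# Rough pattern for HTTP and HTTPS URLs; stops at whitespace, quotes and angle brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def extract_hosts(body: str, limit: int,
                  ignore: FrozenSet[str] = frozenset()) -> List[str]:
    """Return up to `limit` distinct host names found in URLs in the body."""
    text = unquote(html.unescape(body))  # undo html entity, then url encoding
    hosts: List[str] = []
    for match in URL_RE.finditer(text):
        try:
            host = urlsplit(match.group(0)).hostname or ""
        except ValueError:
            continue  # malformed URL, for example an unterminated IPv6 literal
        if host and host not in ignore and host not in hosts:
            hosts.append(host)
            if len(hosts) >= limit:
                break
    return hosts


if __name__ == "__main__":
    sample = ('Click <a href="http://Spam.Example.com/buy">here</a> '
              'or https://ok.example.org/x')
    print(extract_hosts(sample, limit=5, ignore=frozenset({"ok.example.org"})))
```

Each returned host name would then be checked against the configured DNSBL, along with the addresses of its nameservers.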
172 <hr> <center>Sendmail access vs. DNSBL</center> | |
173 <p>With the standard sendmail.mc dnsbl FEATURE, the dnsbl checks may be | |
174 suppressed by entries in the /etc/mail/access database. For example, | |
175 suppose you control a /18 of address space, and have allocated some /24s | |
176 to some clients. You have access entries like | |
177 | |
178 <pre> | |
179 192.168.4 OK | |
180 192.168.17 OK | |
181 </pre> | |
182 | |
183 <p>to allow those clients to smarthost thru your mail server. Now if | |
184 one of those clients happens to get infected with a virus that turns a | |
185 machine into an open proxy, and their 192.168.4.45 lands on the SBL-XBL, | |
186 you will still wind up allowing that infected machine to smarthost thru | |
187 your mail servers. | |
188 | |
189 <p>With this DNSBL milter, the sendmail access database cannot override | |
190 the dnsbl checks, so that machine won't be able to send mail to or thru | |
191 your smarthost mail server (unless the virus/proxy can use smtp-auth). | |
192 | |
193 <p>Using the standard sendmail features, you would add access entries to | |
194 allow hosts on your local network to relay thru your mail server. Those | |
195 OK entries in the sendmail access database will override all the dnsbl | |
196 checks. With this DNSBL milter, you will need to have the local users | |
197 authenticate with smtp-auth to get the same effect. You might find <a | |
198 href="http://www.ists.dartmouth.edu/classroom/sendmail-ssl-how-to.php"> | |
199 these directions</a> helpful for setting up smtp-auth if you are on RH | |
200 Linux. | |
201 | |
202 <hr> <center>Installation and configuration</center> | |
203 <p>Usage: Note that this has ONLY been tested on Linux, specifically | |
204 RedHat Linux. In particular, this milter makes no attempt to understand | |
205 IPv6. Your mileage will vary. You will need at a minimum a C++ | |
206 compiler with a minimally thread safe STL implementation. The | |
207 distribution includes a test.cpp program. If it fails this milter won't | |
208 work. If it passes, this milter might work. | |
209 | |
210 Fetch <a href="http://www.five-ten-sg.com/util/dnsbl.tar.gz">dnsbl.tar.gz</a> | |
211 and | |
212 | |
213 <pre> | |
214 tar xfvz dnsbl.tar.gz | |
215 bash install.bash | |
216 </pre> | |
217 | |
218 Read and understand the contents of that install.bash script before you | |
219 run it. It may not be suitable for your system. Modify your | |
220 sendmail.mc by removing all the "FEATURE(dnsbl" lines, then add the following | |
221 line to your sendmail.mc and rebuild the .cf file: | |
222 | |
223 <pre> | |
224 INPUT_MAIL_FILTER(`dnsbl', `S=local:/var/run/dnsbl/dnsbl.sock, F=T, T=C:30s;S:5m;R:5m;E:5m') | |
225 </pre> | |
226 | |
227 Read the sample <a | |
228 href="http://www.five-ten-sg.com/dnsbl.conf">/etc/dnsbl/dnsbl.conf</a> | |
229 file and modify it to fit your configuration. You can test your | |
230 configuration files, and see a readable internal dump of them on stdout | |
231 with | |
232 | |
233 <pre> | |
234 cd /etc/dnsbl | |
235 /usr/sbin/dnsbl -c | |
236 </pre> | |
237 | |
238 You can check a specific envelope from/to pair with | |
239 | |
240 <pre> | |
241 cd /etc/dnsbl | |
242 from="$1" # or your from address | |
243 to="$2" # or your to address | |
244 /usr/sbin/dnsbl -e "$from"'|'"$to" | |
245 </pre> | |
246 | |
247 <hr> <center>Performance issues</center> | |
248 | |
249 <p>Consider a high volume high performance machine running sendmail. | |
250 Each sendmail process can do its own dns resolution. Typically, such | |
251 dns resolver libraries are not thread safe, and so must be protected by | |
252 some sort of mutex in a threaded environment. When we add a milter to | |
253 sendmail, we now have a collection of sendmail processes, and a | |
254 collection of milter threads. | |
255 | |
256 <p>We will be doing a lot of dns lookups per mail message, and at least | |
257 some of those will take many tens of seconds. If all this dns work is | |
258 serialized inside the milter, we have an upper limit of about 25K mail | |
259 messages per day. That is clearly not sufficient for many sites. | |
260 | |
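The serialized upper limit can be checked with simple arithmetic. The sketch below, in Python, works backwards from the 25K figure to the average dns time per message that it implies; that average is derived here and is not stated in the text, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def serialized_messages_per_day(dns_seconds_per_message: float) -> float:
    """Throughput when every message's dns work runs one after another."""
    return SECONDS_PER_DAY / dns_seconds_per_message


def implied_seconds_per_message(messages_per_day: float) -> float:
    """Average dns time per message implied by a serialized daily limit."""
    return SECONDS_PER_DAY / messages_per_day


if __name__ == "__main__":
    # About 3.5 seconds of dns work per message yields roughly 25K per day.
    print(implied_seconds_per_message(25_000))
```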
261 <p>Since we want to do parallel dns resolution across those milter | |
262 threads, we add another collection of dns resolver processes. Each | |
263 sendmail process is talking to a milter thread over a socket, and each | |
264 milter thread is talking to a dns resolver process over another socket. | |
265 | |
266 <p>Suppose we are processing 20 messages per second, and each message | |
267 requires 20 seconds of dns work. Then we will have 400 sendmail | |
268 processes, 400 milter threads, and 400 dns resolver processes. Of | |
269 course that steady state is very unlikely to happen. | |
270 | |
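The figure of 400 follows from multiplying the arrival rate by the time each message spends in the system (Little's law). A small Python illustration, with a hypothetical function name:

```python
def steady_state_in_flight(messages_per_second: float,
                           seconds_per_message: float) -> float:
    """Average number of messages in progress at steady state."""
    return messages_per_second * seconds_per_message


if __name__ == "__main__":
    # 20 messages per second, each needing 20 seconds of dns work.
    print(steady_state_in_flight(20, 20))
```

Each in-flight message occupies one sendmail process, one milter thread, and one dns resolver process, which is where the three counts of 400 come from.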
271 <hr> <center>Rejected Ideas</center> | |
272 | |
273 <p>The following ideas have been considered and rejected. | |
274 | |
275 <p>Add max_recipients for each mail domain to the configuration. | |
276 Recipients in excess of that limit will be rejected, and all the | |
277 recipients in that domain will be removed if there are some other | |
278 whitelisted recipients. Current spammers *very* rarely send more than | |
279 ten recipients in a single smtp transaction, so this won't stop | |
280 any significant amount of spam. | |
281 | |
282 <p>Add poison addresses to the configuration. If any recipient is | |
283 poison, all recipients are rejected even if they would be whitelisted, | |
284 and the data is rejected if sent. I have a collection of spam trap | |
285 addresses that would be suitable for such use. Based on my log files, | |
286 any mail to those spam trap addresses is rejected based on either dnsbl | |
287 lookups or the DCC. So this won't result in blocking any additional | |
288 spam. | |
289 | |
290 <p>Add an option to only allow one recipient if the return path is | |
291 empty. Based on my log files, there is no mail that violates this | |
292 check. | |
293 | |
294 <p>Reject the mail if the envelope from domain name contains any MX | |
295 records pointing to 127.0.0.0/8. I don't see any significant amount of spam | |
296 sent with such domain names. | |
297 | |
298 | |
299 <pre> | |
300 $Id$ | |
301 </pre> | |
302 </body> | |
303 </html> |