Mercurial > dnsbl
annotate xml/dnsbl.in @ 87:7a432c2b473f
add multiple debug syslog levels, remove duplicate dnsbl definitions
author | carl |
---|---|
date | Tue, 19 Jul 2005 22:55:07 -0700 |
parents | db85c53e3d90 |
children | 7245c45cef7a |
rev | line source |
---|---|
0 | 1 <html> |
2 | |
3 <head> | |
4 <meta http-equiv="Content-Type" content="text/html; charset=windows-1252"> | |
87 | 5 <title>DNSBL Sendmail milter - Version 5.1</title> |
0 | 6 </head> |
7 | |
12 | 8 <center>Introduction</center> |
0 | 9 <p>This milter is released under the GPL license version 2 included in |
10 the LICENSE file in the distribution, and also available at | |
11 <a href="http://www.gnu.org/licenses/gpl.html">http://www.gnu.org/licenses/gpl.html</a>. | |
12 | |
12 | 13 <p>Consider the case of a mail server that is acting as secondary MX for |
14 a collection of clients, each of which has a collection of mail domains. | |
15 Each client may use their own collection of DNSBLs on their primary mail | |
16 server. We present here a mechanism whereby the backup mail server can | |
17 use the correct set of DNSBLs for each recipient for each message. As a | |
0 | 18 side-effect, it gives us the ability to customize the set of DNSBLs on a |
19 per-recipient basis, so that fred@example.com could use SPEWS and the | |
20 SBL, where all other users @example.com use only the SBL. | |
21 | |
68 | 22 <p>This milter will also decode (uuencode, base64, mime, html entity, |
23 url encodings) and scan for HTTP and HTTPS URLs and bare hostnames in | |
24 the body of the mail. If any of those host names have A or NS records | |
25 on the SBL (or a single configurable DNSBL), the mail will be rejected | |
34 | 26 unless previously whitelisted. This milter also counts the number of |
27 invalid HTML tags, and can reject mail if that count exceeds your | |
28 specified limit. | |
11 | 29 |
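<p>A minimal sketch of the invalid tag count, assuming that "invalid" means "tag name not in a known list". The KNOWN_TAGS set, the BadTagCounter class and the function names are illustrative assumptions; the milter's real tag list and counting rules are not given in this document.

```python
from html.parser import HTMLParser

# Assumption: a small illustrative set of accepted tag names.
KNOWN_TAGS = {
    "html", "head", "title", "body", "p", "a", "br", "hr", "center",
    "pre", "ol", "ul", "li", "b", "i", "font", "table", "tr", "td",
    "img", "div", "span",
}


class BadTagCounter(HTMLParser):
    """Counts opening tags whose name is not in KNOWN_TAGS."""

    def __init__(self):
        super().__init__()
        self.bad = 0

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag not in KNOWN_TAGS:
            self.bad += 1


def count_bad_tags(body):
    parser = BadTagCounter()
    parser.feed(body)
    parser.close()
    return parser.bad


def exceeds_limit(body, limit):
    """True when the number of unknown tags is above the configured limit."""
    return count_bad_tags(body) > limit
```

With this sketch, a body that hides words behind made-up tags such as <zqx> would raise the count, while ordinary markup would not.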
6 | 30 <p>The DNSBL milter reads a text configuration file (dnsbl.conf) on |
31 startup, and whenever the config file (or any of the referenced include | |
32 files) is changed. The entire configuration file is case insensitive. | |
0 | 33 |
59
510a511ad554
Add resolver processes to allow better performance on busy machines
carl
parents:
57
diff
changeset
|
34 <hr> <center>DCC Issues</center> |
0 | 35 <p>If you are also using the <a |
36 href="http://www.rhyolite.com/anti-spam/dcc/">DCC</a> milter, there are | |
37 a few considerations. You may need to whitelist senders from the DCC | |
38 bulk detector, or from the DNS based lists. Those are two very | |
39 different reasons for whitelisting. The former is done thru the DCC | |
40 whiteclnt config file, the latter is done thru the DNSBL milter config | |
5 | 41 file. |
0 | 42 |
43 <p>You may want to blacklist some specific senders or sending domains. | |
44 This could be done thru the DCC (on a global basis, or for a | |
45 specific single recipient). We prefer to do such blacklisting via the | |
13 | 46 DNSBL milter config, since it can be done for a collection of recipient |
47 mail domains. The DCC approach has the feature that you can capture the | |
0 | 48 entire message in the DCC log files. The DNSBL milter approach has the |
49 feature that the mail is rejected earlier (at RCPT TO time), and the | |
50 sending machine just gets a generic "550 5.7.1 no such user" message. | |
51 | |
75 | 52 <p>The DCC whiteclnt file can be included in the DNSBL milter config by |
53 the dcc_to and dcc_from statements. This will import the (env_to, | |
54 env_from, and substitute mail_host) entries from the DCC config into the | |
55 DNSBL config. This allows using the DCC config as the single point for | |
56 white/blacklisting. | |
5 | 57 |
58 <p>Consider the case where you have multiple clients, each with their | |
59 own mail servers, and each running their own DCC milters. Each client | |
60 is using the DCC facilities for envelope from/to white/blacklisting. | |
6 | 61 Presumably you can use rsync or scp to fetch copies of your clients' DCC |
5 | 62 whiteclnt files on a regular basis. Your mail server, acting as a |
63 backup MX for your clients, can use the DNSBL milter, and include those | |
75 | 64 client DCC config files. The envelope from/to white/blacklisting will |
65 be appropriately tagged and used only for the domains controlled by each | |
66 of those clients. | |
5 | 67 |
59 | 68 <hr> <center>Definitions</center> |
75 | 69 |
70 <p>CONTEXT - a collection of parameters that defines the filtering | |
71 context to be used for a collection of envelope recipient addresses. | |
72 The context includes such things as the list of DNSBLs to be used, and | |
73 the various content filtering parameters. | |
74 | |
0 | 75 <p>DNSBL - a named DNS based blocking list is defined by a dns suffix |
76 (e.g. sbl-xbl.spamhaus.org) and a message string that is used to | |
77 generate the "550 5.7.1" smtp error return code. The names of these | |
78 DNSBLs will be used to define the DNSBL-LISTs. | |
79 | |
80 <p>DNSBL-LIST - a named list of DNSBLs that will be used for specific | |
81 recipients or recipient domains. | |
82 | |
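<p>The two definitions above can be pictured as a pair of lookup tables. This is only a sketch: the Dnsbl class, the expand_list helper, the list name and the message text are hypothetical, and the real dnsbl.conf syntax is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dnsbl:
    name: str      # name used to refer to this DNSBL in a DNSBL-LIST
    suffix: str    # dns suffix, e.g. sbl-xbl.spamhaus.org
    message: str   # text used to build the "550 5.7.1" error


def expand_list(list_name, dnsbl_lists, dnsbls):
    """Return the Dnsbl objects named by one DNSBL-LIST, in order."""
    return [dnsbls[name] for name in dnsbl_lists[list_name]]


# Hypothetical example data.
dnsbls = {
    "sbl": Dnsbl("sbl", "sbl-xbl.spamhaus.org", "rejected, listed on sbl-xbl"),
}
dnsbl_lists = {"normal": ["sbl"]}
```

A filtering context would then refer to a DNSBL-LIST by name, and the list in turn refers to DNSBLs by name.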
76 | 83 <hr> <center>Filtering Procedure</center> |
84 | |
85 <p>If the client has authenticated with sendmail, the mail is accepted, | |
86 the dns lists are not checked, and the body content is not scanned. | |
87 Otherwise, we follow these steps for each recipient. | |
0 | 88 |
89 <ol> | |
90 | |
75 | 91 <li>The envelope to email address is used to find an initial filtering |
76 | 92 context. We first look for a context that specified the full email |
93 address in the env_to statement. If that is not found, we look for a | |
94 context that specified the entire domain name of the envelope recipient | |
95 in the env_to statement. If that is not found, we look for a context | |
96 that specified the user@ part of the envelope recipient in the env_to | |
97 statement. If that is not found, we use the first top level context | |
98 defined in the config file. | |
0 | 99 |
76 | 100 <br><br><li>The initial filtering context may redirect to a child |
101 context based on the values in the initial context's env_from statement. | |
102 We look for [1) the full envelope from email address, 2) the domain name | |
103 part of the envelope from address, 3) the user@ part of the envelope | |
104 from address] in that context's env_from statement, with values that | |
105 point to a child context. If such an entry is found, we switch to that | |
106 child filtering context. | |
75 | 107 |
76 | 108 <br><br><li>We look up [1) the full envelope from email address, 2) the |
109 domain name part of the envelope from address, 3) the user@ part of the | |
75 | 110 envelope from address] in the filtering context env_from statement. |
111 That results in one of (white, black, unknown, inherit). | |
112 | |
76 | 113 <br><br><li>If the answer is black, mail to this recipient is rejected |
114 with "no such user", and the dns lists are not checked. | |
75 | 115 |
76 | 116 <br><br><li>If the answer is white, mail to this recipient is accepted |
117 and the dns lists are not checked. | |
0 | 118 |
76 | 119 <br><br><li>If the answer is unknown, we don't reject yet, but the dns |
120 lists will be checked, and the content may be scanned. | |
0 | 121 |
76 | 122 <br><br><li>If the answer is inherit, we repeat the envelope from search |
123 in the parent context. | |
0 | 124 |
76 | 125 <br><br><li>The dns lists specified in the filtering context are checked |
126 and the mail is rejected if any list has an A record for the standard | |
127 dns based lookup scheme (reversed octets of the client followed by the | |
128 dns suffix). | |
129 | |
130 <br><br><li>If the mail has not been accepted or rejected yet, and the | |
131 filtering context enables content filtering, and this is the first such | |
132 recipient in this smtp transaction, we set the content filtering parameters | |
133 from this context, and enable content filtering for this body. | |
11 | 134 |
0 | 135 </ol> |
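<p>The lookup order in the steps above can be sketched as follows. This is a simplified illustration, not the milter's code: the Context class and its fields are hypothetical, the redirect to a child context is left out, and two behaviours are assumptions (an env_from with no matching key falls back to a per-context default, and "inherit" at a top level context is treated as unknown).

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Context:
    name: str
    parent: Optional["Context"] = None
    # Maps a key (full address, domain, or "user@") to one of
    # "white", "black", "unknown", "inherit".
    env_from: Dict[str, str] = field(default_factory=dict)
    default: str = "inherit"  # assumption: used when no key matches


def keys_for(address):
    """Keys in search order: full address, domain name, user@ part."""
    address = address.lower()  # the configuration is case insensitive
    user, _, domain = address.partition("@")
    return [address, domain, user + "@"]


def find_context(env_to, rcpt_to, first_top_level):
    """Pick the initial filtering context for one envelope recipient."""
    for key in keys_for(rcpt_to):
        if key in env_to:
            return env_to[key]
    return first_top_level


def sender_status(context, mail_from):
    """Resolve the envelope from status, walking up on "inherit"."""
    ctx = context
    while ctx is not None:
        status = ctx.default
        for key in keys_for(mail_from):
            if key in ctx.env_from:
                status = ctx.env_from[key]
                break
        if status != "inherit":
            return status
        ctx = ctx.parent
    return "unknown"  # assumption: nothing left to inherit from
```

A black result would reject the recipient with "no such user", a white result would accept it, and an unknown result would move on to the dns list checks.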
136 | |
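<p>The "reversed octets of the client followed by the dns suffix" lookup used in the dns list step can be sketched like this. The function names and the resolve hook are assumptions made for illustration; system_resolve uses the standard library resolver and is only one possible way to ask for A records.

```python
import ipaddress
import socket


def dnsbl_query_name(client_ip, suffix):
    """Build the DNSBL query name for an IPv4 client address."""
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + suffix.strip(".")


def system_resolve(name):
    """Return the A records for name, or an empty list if there are none."""
    try:
        return socket.gethostbyname_ex(name)[2]
    except OSError:
        return []


def is_listed(client_ip, suffix, resolve=system_resolve):
    """True when the DNSBL has an A record for this client."""
    return bool(resolve(dnsbl_query_name(client_ip, suffix)))
```

For the client 192.168.4.45 and the suffix sbl-xbl.spamhaus.org, the name that gets looked up is 45.4.168.192.sbl-xbl.spamhaus.org.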
76 | 137 <p>If content filtering is enabled for this body, the mail text is |
138 decoded (uuencode, base64, mime, html entity, url encodings), scanned | |
139 for HTTP and HTTPS URLs, and the first &lt;configurable&gt; host names | |
140 are checked for their presence on the single &lt;configurable&gt; DNSBL. | |
141 The only known list that is suitable for this purpose is the SBL. If | |
142 any of those host names are on that DNSBL (or have nameservers that are | |
143 on that list), and that host name is not on the &lt;configurable&gt; ignore list, | |
144 the mail is rejected. We also scan for excessive bad html tags, and if | |
145 a &lt;configurable&gt; limit is exceeded, the mail is rejected. | |
146 | |
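<p>A rough sketch of the decode and scan step described above, covering only base64, html entity and url decoding. It is not the milter's scanner: the function names are hypothetical, uuencode and mime handling are omitted, bare hostnames are not detected, and skipping ignored hosts before counting toward the limit is an assumption.

```python
import base64
import html
import re
from urllib.parse import unquote, urlsplit

# http:// and https:// URLs, up to whitespace, quotes or angle brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def decode_base64_part(payload):
    """Decode one base64 body part to text (latin-1 accepts any byte)."""
    return base64.b64decode(payload).decode("latin-1")


def hosts_to_check(text, max_hosts, ignore=frozenset()):
    """Return up to max_hosts distinct host names found in URLs in text."""
    # Undo html entity encoding first, then url (%XX) encoding.
    text = unquote(html.unescape(text))
    hosts = []
    for match in URL_RE.finditer(text):
        if len(hosts) >= max_hosts:
            break
        try:
            host = urlsplit(match.group(0)).hostname  # already lower case
        except ValueError:
            continue  # malformed URL, skip it
        if not host or host in ignore or host in hosts:
            continue
        hosts.append(host)
    return hosts
```

Each returned host name would then be checked against the single configured DNSBL, together with its nameservers.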
59 | 147 <hr> <center>Sendmail access vs. DNSBL</center> |
12 | 148 <p>With the standard sendmail.mc dnsbl FEATURE, the dnsbl checks may be |
149 suppressed by entries in the /etc/mail/access database. For example, | |
150 suppose you control a /18 of address space, and have allocated some /24s | |
151 to some clients. You have access entries like | |
0 | 152 |
12 | 153 <pre> |
154 192.168.4 OK | |
155 192.168.17 OK | |
156 </pre> | |
157 | |
158 <p>to allow those clients to smarthost thru your mail server. Now if | |
13 | 159 one of those clients happens to get infected with a virus that turns a |
160 machine into an open proxy, and their 192.168.4.45 lands on the SBL-XBL, | |
161 you will still wind up allowing that infected machine to smarthost thru | |
162 your mail servers. | |
12 | 163 |
164 <p>With this DNSBL milter, the sendmail access database cannot override | |
165 the dnsbl checks, so that machine won't be able to send mail to or thru | |
15 | 166 your smarthost mail server (unless the virus/proxy can use smtp-auth). |
167 | |
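<p>The difference between the two behaviours can be sketched as two small decision functions. This is a deliberately simplified model with hypothetical names: it only looks at OK access entries matched on octet boundaries, and it ignores the milter's own envelope from whitelisting.

```python
def access_ok(client_ip, access_entries):
    """True when an OK access key equals the IP or is a dotted prefix of it."""
    return any(
        client_ip == key or client_ip.startswith(key + ".")
        for key, value in access_entries.items()
        if value == "OK"
    )


def sendmail_feature_rejects(client_ip, access_entries, on_dnsbl):
    """Standard dnsbl FEATURE: an OK access entry suppresses the check."""
    return on_dnsbl and not access_ok(client_ip, access_entries)


def milter_rejects(authenticated, on_dnsbl):
    """This milter: only smtp-auth bypasses the dnsbl check."""
    return on_dnsbl and not authenticated


# The access entries from the example above.
access_entries = {"192.168.4": "OK", "192.168.17": "OK"}
```

In this model the infected 192.168.4.45 is let through by the sendmail FEATURE but rejected by the milter unless it authenticates.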
168 <p>Using the standard sendmail features, you would add access entries to | |
169 allow hosts on your local network to relay thru your mail server. Those | |
170 OK entries in the sendmail access database will override all the dnsbl | |
171 checks. With this DNSBL milter, you will need to have the local users | |
172 authenticate with smtp-auth to get the same effect. You might find <a | |
81 | 173 href="http://www.ists.dartmouth.edu/classroom/sendmail-ssl-how-to.php"> |
15 | 174 these directions</a> helpful for setting up smtp-auth if you are on RH |
175 Linux. | |
12 | 176 |
59 | 177 <hr> <center>Installation and configuration</center> |
178 <p>Usage: Note that this has ONLY been tested on Linux, specifically | |
179 RedHat Linux. In particular, this milter makes no attempt to understand | |
180 IPv6. Your mileage will vary. You will need at a minimum a C++ | |
181 compiler with a minimally thread safe STL implementation. The | |
182 distribution includes a test.cpp program. If it fails this milter won't | |
183 work. If it passes, this milter might work. | |
0 | 184 |
185 Fetch <a href="http://www.five-ten-sg.com/util/dnsbl.tar.gz">dnsbl.tar.gz</a> | |
186 and | |
187 | |
188 <pre> | |
189 tar xfvz dnsbl.tar.gz | |
190 bash install.bash | |
191 </pre> | |
192 | |
193 Read and understand the contents of that install.bash script before you | |
194 run it. It may not be suitable for your system. Modify your | |
195 sendmail.mc by removing all the "FEATURE(dnsbl" lines and adding the | |
196 following line, then rebuild the .cf file: | |
197 | |
198 <pre> | |
50 | 199 INPUT_MAIL_FILTER(`dnsbl', `S=local:/var/run/dnsbl/dnsbl.sock, F=T, T=C:30s;S:5m;R:5m;E:5m') |
0 | 200 </pre> |
201 | |
202 Read the sample <a | |
44 | 203 href="http://www.five-ten-sg.com/dnsbl.conf">/etc/dnsbl/dnsbl.conf</a> |
6 | 204 file and modify it to fit your configuration. You can test your |
13 | 205 configuration files, and see a readable internal dump of them on stdout |
6 | 206 with |
207 | |
208 <pre> | |
44 | 209 cd /etc/dnsbl |
210 /usr/sbin/dnsbl -c | |
6 | 211 </pre> |
212 | |
75 | 213 You can check a specific envelope from/to pair with |
214 | |
215 <pre> | |
216 cd /etc/dnsbl | |
217 from="$1" # or your from address | |
218 to="$2" # or your to address | |
219 /usr/sbin/dnsbl -e "$from"'|'"$to" | |
220 </pre> | |
221 | |
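<p>If you want to check many from/to pairs, the command above can be wrapped in a small script. The sketch below only relies on the -e option and the from|to argument format shown above; the function names are hypothetical, and the output is returned as text without any parsing because its format is not described here.

```python
import subprocess


def check_command(mail_from, rcpt_to, binary="/usr/sbin/dnsbl"):
    """Build the argument list for one envelope from/to check."""
    return [binary, "-e", mail_from + "|" + rcpt_to]


def check_pair(mail_from, rcpt_to, config_dir="/etc/dnsbl"):
    """Run the check from the config directory and return its stdout."""
    result = subprocess.run(
        check_command(mail_from, rcpt_to),
        cwd=config_dir,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids the shell quoting needed around the | character in the shell version.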
59 | 222 <hr> <center>Performance issues</center> |
223 | |
224 <p>Consider a high volume high performance machine running sendmail. | |
225 Each sendmail process can do its own dns resolution. Typically, such | |
226 dns resolver libraries are not thread safe, and so must be protected by | |
227 some sort of mutex in a threaded environment. When we add a milter to | |
228 sendmail, we now have a collection of sendmail processes, and a | |
229 collection of milter threads. | |
0 | 230 |
59 | 231 <p>We will be doing a lot of dns lookups per mail message, and at least |
232 some of those will take many tens of seconds. If all this dns work is | |
233 serialized inside the milter, we have an upper limit of about 25K mail | |
234 messages per day. That is clearly not sufficient for many sites. | |
0 | 235 |
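<p>The serialized limit can be read as simple arithmetic on the seconds in a day. The sketch below uses hypothetical function names; the average of roughly 3.5 seconds per message is inferred from the 25K figure and is not a measurement reported in this document.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def max_messages_per_day(dns_seconds_per_message):
    """Upper bound on daily volume when all dns work is serialized."""
    return SECONDS_PER_DAY / dns_seconds_per_message


def implied_seconds_per_message(messages_per_day):
    """Average serialized dns time per message implied by a daily limit."""
    return SECONDS_PER_DAY / messages_per_day
```

For example, implied_seconds_per_message(25000) is about 3.46, and a message needing a full 20 seconds of serialized dns work would cap the rate at max_messages_per_day(20), which is 4320 per day.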
59 | 236 <p>Since we want to do parallel dns resolution across those milter |
237 threads, we add another collection of dns resolver processes. Each | |
238 sendmail process is talking to a milter thread over a socket, and each | |
239 milter thread is talking to a dns resolver process over another socket. | |
6 | 240 |
59 | 241 <p>Suppose we are processing 20 messages per second, and each message |
242 requires 20 seconds of dns work. Then we will have 400 sendmail | |
243 processes, 400 milter threads, and 400 dns resolver processes. Of | |
244 course that steady state is very unlikely to happen. | |
245 |
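<p>The figure of 400 is the arrival rate multiplied by the time each message spends in the system (Little's law). A one-line sketch, with a hypothetical function name:

```python
def concurrent_in_flight(messages_per_second, seconds_per_message):
    """Average number of messages in progress at steady state."""
    return messages_per_second * seconds_per_message


# 20 messages per second, each needing 20 seconds of dns work.
in_flight = concurrent_in_flight(20, 20)
```

Each message in flight holds one sendmail process, one milter thread and one dns resolver process, which gives the three counts of 400 above.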
76 | 246 <hr> <center>Rejected Ideas</center> |
247 | |
248 <p>The following ideas have been considered and rejected. | |
249 | |
250 <p>Add max_recipients for each mail domain to the configuration. | |
251 Recipients in excess of that limit will be rejected, and all the | |
252 recipients in that domain will be removed if there are some other | |
253 whitelisted recipients. Current spammers *very* rarely send to more than | |
254 ten recipients in a single smtp transaction, so this won't stop | |
255 any significant amount of spam. | |
256 | |
257 <p>Add poison addresses to the configuration. If any recipient is | |
258 poison, all recipients are rejected even if they would be whitelisted, | |
259 and the data is rejected if sent. I have a collection of spam trap | |
260 addresses that would be suitable for such use. Based on my log files, | |
261 any mail to those spam trap addresses is rejected based on either dnsbl | |
262 lookups or the DCC. So this won't result in blocking any additional | |
263 spam. | |
264 | |
265 <p>Add an option to only allow one recipient if the return path is | |
266 empty. Based on my log files, there is no mail that violates this | |
267 check. | |
268 | |
269 <p>Reject the mail if the envelope from domain name contains any MX | |
270 records pointing to 127.0.0.0/8. I don't see any significant amount of spam | |
271 sent with such domain names. | |
272 | |
273 | |
59 | 274 <pre> |
2 | 275 $Id$ |
4 | 276 </pre> |
0 | 277 </body> |
278 </html> |