annotate xml/dnsbl.in @ 24:2e23b7184d2b

start coding for bad html tag detection
author carl
date Wed, 19 May 2004 21:40:50 -0700
parents b8f5fa3dd5b8
children 43a4f6b3e668
rev   line source
0
96a9758165cd Initial revision
carl
parents:
diff changeset
1 <html>
96a9758165cd Initial revision
carl
parents:
diff changeset
2
96a9758165cd Initial revision
carl
parents:
diff changeset
3 <head>
96a9758165cd Initial revision
carl
parents:
diff changeset
4 <meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
96a9758165cd Initial revision
carl
parents:
diff changeset
5 <title>DNSBL Sendmail milter</title>
96a9758165cd Initial revision
carl
parents:
diff changeset
6 </head>
96a9758165cd Initial revision
carl
parents:
diff changeset
7
12
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
8 <center>Introduction</center>
0
96a9758165cd Initial revision
carl
parents:
diff changeset
9 <p>This milter is released under version 2 of the GPL, included in
96a9758165cd Initial revision
carl
parents:
diff changeset
10 the LICENSE file in the distribution, and also available at
96a9758165cd Initial revision
carl
parents:
diff changeset
11 <a href="http://www.gnu.org/licenses/gpl.html">http://www.gnu.org/licenses/gpl.html</a>
96a9758165cd Initial revision
carl
parents:
diff changeset
12
12
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
13 <p>Consider the case of a mail server that is acting as secondary MX for
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
14 a collection of clients, each of which has a collection of mail domains.
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
15 Each client may use their own collection of DNSBLs on their primary mail
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
16 server. We present here a mechanism whereby the backup mail server can
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
17 use the correct set of DNSBLs for each recipient for each message. As a
0
96a9758165cd Initial revision
carl
parents:
diff changeset
18 side-effect, it gives us the ability to customize the set of DNSBLs on a
96a9758165cd Initial revision
carl
parents:
diff changeset
19 per-recipient basis, so that fred@example.com could use SPEWS and the
96a9758165cd Initial revision
carl
parents:
diff changeset
20 SBL, where all other users @example.com use only the SBL.
96a9758165cd Initial revision
carl
parents:
diff changeset
21
16
2ae8d953f1d0 add scanning for bare hostnames
carl
parents: 15
diff changeset
22 <p>This milter will also decode (base64, mime, html entity) and scan for
19
b8f5fa3dd5b8 fix problems in the state transitions causing impossible states
carl
parents: 16
diff changeset
23 HTTP and HTTPS URLs and bare hostnames in the body of the mail. If any
b8f5fa3dd5b8 fix problems in the state transitions causing impossible states
carl
parents: 16
diff changeset
24 of those host names have A records on the SBL (or a single configurable
24
2e23b7184d2b start coding for bad html tag detection
carl
parents: 19
diff changeset
25 list), the mail will be rejected unless previously whitelisted. This
2e23b7184d2b start coding for bad html tag detection
carl
parents: 19
diff changeset
26 milter also counts the number of invalid HTML tags, and can reject mail
2e23b7184d2b start coding for bad html tag detection
carl
parents: 19
diff changeset
27 if that count exceeds your specified limit.
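
<p>As a rough illustration of the body scan described above, the following Python sketch decodes a body, expands HTML entities, and collects the distinct host names found in HTTP and HTTPS URLs. It is not the milter's own code (the milter is written in C++); the function name <code>extract_hosts</code>, the <code>is_base64</code> flag, and the regular expression are assumptions made for this example, and bare hostnames are not handled. The limit of 20 host names is taken from the processing steps listed later in this document.

```python
import base64
import html
import re
from urllib.parse import urlsplit

# Hypothetical pattern: an http/https URL runs until whitespace, a quote,
# or an angle bracket.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def extract_hosts(body: str, is_base64: bool = False, limit: int = 20) -> list[str]:
    """Return up to `limit` distinct host names from HTTP/HTTPS URLs in body."""
    if is_base64:
        body = base64.b64decode(body).decode("utf-8", errors="replace")
    # Expand entities such as &#46; so obfuscated URLs become visible.
    text = html.unescape(body)
    hosts: list[str] = []
    for match in URL_RE.finditer(text):
        try:
            host = urlsplit(match.group(0)).hostname  # lower-cased by urlsplit
        except ValueError:
            continue  # malformed URL, e.g. an unbalanced IPv6 bracket
        if host and host not in hosts:
            hosts.append(host)
            if len(hosts) == limit:
                break
    return hosts
```

Each returned host name would then be resolved and its addresses checked against the configured list.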
11
2c206836b4cc integration work on url scanner
carl
parents: 6
diff changeset
28
6
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
29 <p>The DNSBL milter reads a text configuration file (dnsbl.conf) on
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
30 startup, and whenever the config file (or any of the referenced include
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
31 files) is changed. The entire configuration file is case insensitive.
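
<p>This document does not say how the milter notices that a file has changed. One common approach, sketched here in Python purely as an assumption, is to remember the modification time of the main config file and every include file and compare on each check. The class name <code>ConfigWatcher</code> is hypothetical.

```python
import os


class ConfigWatcher:
    """Track modification times for a config file and its include files."""

    def __init__(self, paths):
        self.paths = list(paths)
        self.stamps = self._snapshot()

    def _snapshot(self):
        stamps = {}
        for path in self.paths:
            try:
                stamps[path] = os.stat(path).st_mtime_ns
            except FileNotFoundError:
                stamps[path] = None  # a missing file counts as a distinct state
        return stamps

    def changed(self):
        """Return True once after any watched file is modified, added, or removed."""
        current = self._snapshot()
        if current != self.stamps:
            self.stamps = current
            return True
        return False
```

A caller would poll <code>changed()</code> periodically and re-read the whole configuration when it returns True.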
0
96a9758165cd Initial revision
carl
parents:
diff changeset
32
12
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
33 <hr>
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
34 <center>DCC Issues</center>
0
96a9758165cd Initial revision
carl
parents:
diff changeset
35 <p>If you are also using the <a
96a9758165cd Initial revision
carl
parents:
diff changeset
36 href="http://www.rhyolite.com/anti-spam/dcc/">DCC</a> milter, there are
96a9758165cd Initial revision
carl
parents:
diff changeset
37 a few considerations. You may need to whitelist senders from the DCC
96a9758165cd Initial revision
carl
parents:
diff changeset
38 bulk detector, or from the DNS based lists. Those are two very
96a9758165cd Initial revision
carl
parents:
diff changeset
39 different reasons for whitelisting. The former is done thru the DCC
96a9758165cd Initial revision
carl
parents:
diff changeset
40 whiteclnt config file, the latter is done thru the DNSBL milter config
5
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
41 file.
0
96a9758165cd Initial revision
carl
parents:
diff changeset
42
96a9758165cd Initial revision
carl
parents:
diff changeset
43 <p>You may want to blacklist some specific senders or sending domains.
96a9758165cd Initial revision
carl
parents:
diff changeset
44 This can be done thru the DCC (on a global basis, or for a
96a9758165cd Initial revision
carl
parents:
diff changeset
45 specific single recipient). We prefer to do such blacklisting via the
13
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
46 DNSBL milter config, since it can be done for a collection of recipient
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
47 mail domains. The DCC approach has the feature that you can capture the
0
96a9758165cd Initial revision
carl
parents:
diff changeset
48 entire message in the DCC log files. The DNSBL milter approach has the
96a9758165cd Initial revision
carl
parents:
diff changeset
49 feature that the mail is rejected earlier (at RCPT TO time), and the
96a9758165cd Initial revision
carl
parents:
diff changeset
50 sending machine just gets a generic "550 5.7.1 no such user" message.
96a9758165cd Initial revision
carl
parents:
diff changeset
51
5
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
52 <p>There is an option to reference the DCC whiteclnt file (via an
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
53 include_dcc line) in the DNSBL milter config. This will import the
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
54 (env_to, env_from, and substitute mail_host) entries from the DCC config
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
55 into the DNSBL config. This allows using the DCC config as the single
13
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
56 point for white/blacklisting. When used in this manner, the whitelist
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
57 env_to entries from the DCC config become global whitelist entries in
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
58 the DNSBL config.
5
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
59
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
60 <p>Consider the case where you have multiple clients, each with their
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
61 own mail servers, and each running their own DCC milters. Each client
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
62 is using the DCC facilities for envelope from/to white/blacklisting.
6
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
63 Presumably you can use rsync or scp to fetch copies of your clients' DCC
5
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
64 whiteclnt files on a regular basis. Your mail server, acting as a
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
65 backup MX for your clients, can use the DNSBL milter, and include those
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
66 client DCC config files. The envelope to white/blacklisting will be
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
67 global for your system, but the envelope from white/blacklisting will be
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
68 appropriately tagged and used only for the domains controlled by each of
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
69 those clients.
793ac9cc114d updates to use dcc conf files
carl
parents: 4
diff changeset
70
12
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
71 <hr>
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
72 <center>Definitions</center>
0
96a9758165cd Initial revision
carl
parents:
diff changeset
73 <p>DNSBL - a named DNS based blocking list is defined by a dns suffix
96a9758165cd Initial revision
carl
parents:
diff changeset
74 (e.g. sbl-xbl.spamhaus.org) and a message string that is used to
96a9758165cd Initial revision
carl
parents:
diff changeset
75 generate the "550 5.7.1" smtp error return code. The names of these
96a9758165cd Initial revision
carl
parents:
diff changeset
76 DNSBLs will be used to define the DNSBL-LISTs.
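
<p>To make the role of the dns suffix concrete: the standard DNSBL lookup reverses the octets of the client IPv4 address and appends the suffix, then asks for an A record. The Python sketch below shows that construction. The function names are hypothetical and this is not the milter's C++ code; <code>is_listed</code> treats any successful A lookup as a listing, which is the behaviour this document describes.

```python
import ipaddress
import socket


def dnsbl_query_name(client_ip: str, suffix: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets followed by the suffix."""
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + suffix


def is_listed(client_ip: str, suffix: str) -> bool:
    """Return True if the list publishes an A record for this client address."""
    try:
        socket.gethostbyname(dnsbl_query_name(client_ip, suffix))
        return True
    except socket.gaierror:
        return False  # no A record (or lookup failure): treat as not listed
```

For example, a client at 192.168.4.45 checked against sbl-xbl.spamhaus.org is looked up as 45.4.168.192.sbl-xbl.spamhaus.org.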
96a9758165cd Initial revision
carl
parents:
diff changeset
77
96a9758165cd Initial revision
carl
parents:
diff changeset
78 <p>DNSBL-LIST - a named list of DNSBLs that will be used for specific
96a9758165cd Initial revision
carl
parents:
diff changeset
79 recipients or recipient domains.
96a9758165cd Initial revision
carl
parents:
diff changeset
80
96a9758165cd Initial revision
carl
parents:
diff changeset
81 <p>ENVELOPE-FROM-MAP - a named collection of mappings (key->value pairs)
96a9758165cd Initial revision
carl
parents:
diff changeset
82 from envelope-from values to the WHITE, BLACK, or DEFAULT keywords. The
96a9758165cd Initial revision
carl
parents:
diff changeset
83 names of these maps will be used for specific recipients or recipient
96a9758165cd Initial revision
carl
parents:
diff changeset
84 domains.
96a9758165cd Initial revision
carl
parents:
diff changeset
85
96a9758165cd Initial revision
carl
parents:
diff changeset
86 <p>The configuration file maps each recipient (or recipient domain) to
96a9758165cd Initial revision
carl
parents:
diff changeset
87 two names (a named DNSBL-LIST, and a named ENVELOPE-FROM-MAP). If the
96a9758165cd Initial revision
carl
parents:
diff changeset
88 recipient is not found in the configuration, the named DEFAULT
96a9758165cd Initial revision
carl
parents:
diff changeset
89 dnsbl-list and DEFAULT envelope-from-map will be used. When mail is
96a9758165cd Initial revision
carl
parents:
diff changeset
90 received for that recipient,
96a9758165cd Initial revision
carl
parents:
diff changeset
91
96a9758165cd Initial revision
carl
parents:
diff changeset
92 <ol>
96a9758165cd Initial revision
carl
parents:
diff changeset
93
96a9758165cd Initial revision
carl
parents:
diff changeset
94 <li>If the client has authenticated with sendmail, the mail is accepted
96a9758165cd Initial revision
carl
parents:
diff changeset
95 and the dns lists are not checked.
96a9758165cd Initial revision
carl
parents:
diff changeset
96
96a9758165cd Initial revision
carl
parents:
diff changeset
97 <li>If either of those two names is BLACK, mail to this recipient is rejected with "no
96a9758165cd Initial revision
carl
parents:
diff changeset
98 such user", and the dns lists are not checked.
96a9758165cd Initial revision
carl
parents:
diff changeset
99
96a9758165cd Initial revision
carl
parents:
diff changeset
100 <li>If the envelope-from-map name is WHITE, mail to this recipient is
96a9758165cd Initial revision
carl
parents:
diff changeset
101 accepted and the dns lists are not checked.
96a9758165cd Initial revision
carl
parents:
diff changeset
102
96a9758165cd Initial revision
carl
parents:
diff changeset
103 <li>If the envelope-from-map exists, the map is checked for the presence
96a9758165cd Initial revision
carl
parents:
diff changeset
104 of the sender. A WHITE or BLACK answer is definitive and the dns lists
96a9758165cd Initial revision
carl
parents:
diff changeset
105 are not checked.
96a9758165cd Initial revision
carl
parents:
diff changeset
106
96a9758165cd Initial revision
carl
parents:
diff changeset
107 <li>If the dnsbl-list name is WHITE, the dns lists are not checked and
96a9758165cd Initial revision
carl
parents:
diff changeset
108 the mail is accepted. Otherwise, the dns lists are checked and the mail
96a9758165cd Initial revision
carl
parents:
diff changeset
109 is rejected if any list has an A record for the standard dns based
96a9758165cd Initial revision
carl
parents:
diff changeset
110 lookup scheme (reversed octets of the client followed by the dns
96a9758165cd Initial revision
carl
parents:
diff changeset
111 suffix).
96a9758165cd Initial revision
carl
parents:
diff changeset
112
11
2c206836b4cc integration work on url scanner
carl
parents: 6
diff changeset
113 <li>If the mail has not been accepted or rejected yet, the body content
2c206836b4cc integration work on url scanner
carl
parents: 6
diff changeset
114 is scanned for HTTP URLs (after base64, mime and html entity decoding),
2c206836b4cc integration work on url scanner
carl
parents: 6
diff changeset
115 and the first 20 host names are checked for their presence on the SBL.
2c206836b4cc integration work on url scanner
carl
parents: 6
diff changeset
116 If any host name is on the SBL, the mail is rejected.
2c206836b4cc integration work on url scanner
carl
parents: 6
diff changeset
117
0
96a9758165cd Initial revision
carl
parents:
diff changeset
118 </ol>
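
<p>The steps above, up to the point where the dns lists are consulted, can be sketched in Python as follows. This is an illustration under stated assumptions, not the milter's implementation: the data layout (plain dictionaries), the function names, and the rule that a full recipient address takes precedence over its domain are all hypothetical. Map names in <code>from_maps</code> are assumed to be stored upper-cased and senders lower-cased, reflecting the case-insensitive config file. The body scan in the last step is not modelled.

```python
from enum import Enum


class Verdict(Enum):
    ACCEPT = "accept"
    REJECT = "reject"
    CHECK_DNSBL = "check dns lists"


def lookup_recipient(recipients, rcpt):
    """Map a recipient to (dnsbl-list name, envelope-from-map name)."""
    rcpt = rcpt.lower()
    domain = rcpt.rpartition("@")[2]
    for key in (rcpt, domain):  # assumed precedence: address, then domain
        if key in recipients:
            return recipients[key]
    return ("DEFAULT", "DEFAULT")


def decide(authenticated, dnsbl_list, from_map, from_maps, sender):
    """Apply the accept/reject rules that precede the dns list checks."""
    dnsbl_list, from_map = dnsbl_list.upper(), from_map.upper()
    if authenticated:                          # client used smtp-auth
        return Verdict.ACCEPT
    if "BLACK" in (dnsbl_list, from_map):      # either name is BLACK
        return Verdict.REJECT
    if from_map == "WHITE":                    # map name is WHITE
        return Verdict.ACCEPT
    # Look the sender up in the named map; unknown senders fall through.
    answer = from_maps.get(from_map, {}).get(sender.lower(), "DEFAULT")
    if answer == "WHITE":
        return Verdict.ACCEPT
    if answer == "BLACK":
        return Verdict.REJECT
    if dnsbl_list == "WHITE":                  # list name is WHITE
        return Verdict.ACCEPT
    return Verdict.CHECK_DNSBL                 # caller now queries each DNSBL
```

A caller would first resolve the recipient with <code>lookup_recipient</code>, pass the two names to <code>decide</code>, and only query the DNSBLs when the result is <code>CHECK_DNSBL</code>.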
96a9758165cd Initial revision
carl
parents:
diff changeset
119
12
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
120 <hr>
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
121 <center>Sendmail access vs. DNSBL</center>
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
122 <p>With the standard sendmail.mc dnsbl FEATURE, the dnsbl checks may be
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
123 suppressed by entries in the /etc/mail/access database. For example,
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
124 suppose you control a /18 of address space, and have allocated some /24s
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
125 to some clients. You have access entries like
0
96a9758165cd Initial revision
carl
parents:
diff changeset
126
12
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
127 <pre>
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
128 192.168.4 OK
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
129 192.168.17 OK
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
130 </pre>
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
131
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
132 <p>to allow those clients to smarthost thru your mail server. Now if
13
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
133 one of those clients happens to get infected with a virus that turns a
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
134 machine into an open proxy, and their 192.168.4.45 lands on the SBL-XBL,
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
135 you will still wind up allowing that infected machine to smarthost thru
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
136 your mail servers.
12
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
137
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
138 <p>With this DNSBL milter, the sendmail access database cannot override
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
139 the dnsbl checks, so that machine won't be able to send mail to or thru
15
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
140 your smarthost mail server (unless the virus/proxy can use smtp-auth).
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
141
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
142 <p>Using the standard sendmail features, you would add access entries to
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
143 allow hosts on your local network to relay thru your mail server. Those
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
144 OK entries in the sendmail access database will override all the dnsbl
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
145 checks. With this DNSBL milter, you will need to have the local users
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
146 authenticate with smtp-auth to get the same effect. You might find <a
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
147 href="http://www.ists.dartmouth.edu/IRIA/knowledge_base/linuxinfo/sendmail-ssh-how-to.htm">
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
148 these directions</a> helpful for setting up smtp-auth if you are on RH
6a21f7a3b002 add reference to starttls directions for rh8
carl
parents: 14
diff changeset
149 Linux.
12
6ac6d6b822ce fix memory leak with duplicate url host names,
carl
parents: 11
diff changeset
150
13
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
151 <hr> <center>Installation and configuration</center> <p>Usage: Note
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
152 that this has ONLY been tested on Linux, specifically RedHat Linux. In
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
153 particular, this milter makes no attempt to understand IPv6. Your
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
154 mileage will vary. You will need at a minimum a C++ compiler with a
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
155 minimally thread safe STL implementation. The distribution includes a
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
156 test.cpp program. If it fails this milter won't work. If it passes,
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
157 this milter might work.
0
96a9758165cd Initial revision
carl
parents:
diff changeset
158
96a9758165cd Initial revision
carl
parents:
diff changeset
159 Fetch <a href="http://www.five-ten-sg.com/util/dnsbl.tar.gz">dnsbl.tar.gz</a>
96a9758165cd Initial revision
carl
parents:
diff changeset
160 and
96a9758165cd Initial revision
carl
parents:
diff changeset
161
96a9758165cd Initial revision
carl
parents:
diff changeset
162 <pre>
96a9758165cd Initial revision
carl
parents:
diff changeset
163 tar xfvz dnsbl.tar.gz
96a9758165cd Initial revision
carl
parents:
diff changeset
164 bash install.bash
96a9758165cd Initial revision
carl
parents:
diff changeset
165 </pre>
96a9758165cd Initial revision
carl
parents:
diff changeset
166
96a9758165cd Initial revision
carl
parents:
diff changeset
167 Read and understand the contents of that install.bash script before you
96a9758165cd Initial revision
carl
parents:
diff changeset
168 run it. It may not be suitable for your system. Modify your
96a9758165cd Initial revision
carl
parents:
diff changeset
169 sendmail.mc by removing all the "FEATURE(dnsbl" lines, add the following
96a9758165cd Initial revision
carl
parents:
diff changeset
170 line in your sendmail.mc and rebuild the .cf file
96a9758165cd Initial revision
carl
parents:
diff changeset
171
96a9758165cd Initial revision
carl
parents:
diff changeset
172 <pre>
14
443aa0e8c6fa changes suggested by Nigel Horne
carl
parents: 13
diff changeset
173 INPUT_MAIL_FILTER(`dnsbl', `S=local:/var/run/dnsbl.sock, F=T, T=C:30s;S:2m;R:2m;E:5m')
0
96a9758165cd Initial revision
carl
parents:
diff changeset
174 </pre>
96a9758165cd Initial revision
carl
parents:
diff changeset
175
96a9758165cd Initial revision
carl
parents:
diff changeset
176 Read the sample <a
96a9758165cd Initial revision
carl
parents:
diff changeset
177 href="http://www.five-ten-sg.com/dnsbl.conf">var/dnsbl/dnsbl.conf</a>
6
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
178 file and modify it to fit your configuration. You can test your
13
2752e512fd32 finish documentation
carl
parents: 12
diff changeset
179 configuration files, and see a readable internal dump of them on stdout
6
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
180 with
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
181
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
182 <pre>
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
183 cd /var/dnsbl
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
184 ./dnsbl -c
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
185 </pre>
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
186
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
187 <pre>
0
96a9758165cd Initial revision
carl
parents:
diff changeset
188
96a9758165cd Initial revision
carl
parents:
diff changeset
189
6
cea50d98a6cf start work on content url scanner
carl
parents: 5
diff changeset
190
2
9bcd5ef11279 no message
carl
parents: 0
diff changeset
191 $Id$
4
15a7e942adec updates to use dcc conf files
carl
parents: 2
diff changeset
192 </pre>
0
96a9758165cd Initial revision
carl
parents:
diff changeset
193 </body>
96a9758165cd Initial revision
carl
parents:
diff changeset
194 </html>