diff xml/routeflapper.in @ 0:48d06780cf77

initial version
author Carl Byington <carl@five-ten-sg.com>
date Tue, 13 May 2008 14:03:10 -0700
parents
children 47f787af96c1
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/xml/routeflapper.in	Tue May 13 14:03:10 2008 -0700
@@ -0,0 +1,337 @@
+<reference>
+    <title>@PACKAGE@</title>
+    <partintro>
+        <title>Packages</title>
+
+        <para>Source and binary packages are available at <ulink
+        url="http://www.five-ten-sg.com/@PACKAGE@/packages/">http://www.five-ten-sg.com/@PACKAGE@/packages/</ulink>.
+        The most recent documentation is available at <ulink
+        url="http://www.five-ten-sg.com/@PACKAGE@/">http://www.five-ten-sg.com/@PACKAGE@/</ulink>.
+        </para>
+
+        <para>A <ulink
+        url="http://www.selenic.com/mercurial/wiki/">Mercurial</ulink> source
+        code repository for this project is available at <ulink
+        url="http://hg.five-ten-sg.com/@PACKAGE@/">http://hg.five-ten-sg.com/@PACKAGE@/</ulink>.
+        </para>
+
+    </partintro>
+
+    <refentry id="@PACKAGE@.1">
+        <refentryinfo>
+            <date>2008-04-12</date>
+        </refentryinfo>
+
+        <refmeta>
+            <refentrytitle>@PACKAGE@</refentrytitle>
+            <manvolnum>1</manvolnum>
+            <refmiscinfo>@PACKAGE@ @VERSION@</refmiscinfo>
+        </refmeta>
+
+        <refnamediv id='name.1'>
+            <refname>@PACKAGE@</refname>
+            <refpurpose>detects suspicious routes</refpurpose>
+        </refnamediv>
+
+        <refsynopsisdiv id='synopsis.1'>
+            <title>Synopsis</title>
+            <cmdsynopsis>
+                <command>@PACKAGE@</command>
+                <arg><option>-c</option></arg>
+                <arg><option>-d <replaceable class="parameter">n</replaceable></option></arg>
+            </cmdsynopsis>
+        </refsynopsisdiv>
+
+        <refsect1 id='description.1'>
+            <title>Description</title>
+
+            <para><command>@PACKAGE@</command> is a daemon that monitors BGP
+            updates and SMTP connections to discover whether SMTP connections are
+            coming from IP addresses whose best route is suspicious.</para>
+
+            <para>The <citerefentry> <refentrytitle>@PACKAGE@.conf</refentrytitle>
+            <manvolnum>5</manvolnum> </citerefentry> file specifies the syslog files
+            to be monitored, and the regular expressions (<citerefentry>
+            <refentrytitle>regex</refentrytitle> <manvolnum>7</manvolnum>
+            </citerefentry>) to be applied to new lines in those files.  </para>
+
+            <para>Although this description refers to syslog files, any ASCII text
+            file can be used, so long as some other process appends lines to that
+            file and the lines containing BGP updates can be matched
+            with a regular expression.</para>
+
+            <para>Syslog files are normally rotated by logrotate.
+            <command>@PACKAGE@</command> detects rotation and handles it by
+            closing the old file and opening the newly created file.</para>
+        </refsect1>
+
+        <refsect1 id='options.1'>
+            <title>Options</title>
+            <variablelist>
+                <varlistentry>
+                    <term>-c</term>
+                    <listitem>
+                        <para>
+                            Load the configuration file, print a canonical form
+                            of the configuration on stdout, and exit.
+                        </para>
+                    </listitem>
+                </varlistentry>
+                <varlistentry>
+                    <term>-d <replaceable class="parameter">n</replaceable></term>
+                    <listitem>
+                        <para>
+                            Set the debug level to <replaceable class="parameter">n</replaceable>.
+                        </para>
+                    </listitem>
+                </varlistentry>
+            </variablelist>
+        </refsect1>
+
+        <refsect1 id='usage.1'>
+            <title>Usage</title>
+            <para><command>@PACKAGE@</command> -d 2</para>
+        </refsect1>
+
+        <refsect1 id='configuration.1'>
+            <title>Configuration</title>
+            <para>
+                The configuration file is documented in <citerefentry>
+                <refentrytitle>@PACKAGE@.conf</refentrytitle> <manvolnum>5</manvolnum>
+                </citerefentry>.  Any change to the config file will cause it to be
+                reloaded within three minutes.
+            </para>
+        </refsect1>
+
+        <refsect1 id='introduction.1'>
+            <title>Introduction</title>
+            <para>
+                Consider the hypothetical case of a spammer who is connected via a
+                provider that does not filter BGP routing announcements. The spammer
+                then has several ways to announce IP address space to be used for
+                sending spam. Note that we only consider cases where the spammer
+                simply wants to use some IP address space anonymously. This is very
+                different from the case where the attacker wants to use some specific
+                address space belonging to another organization in order to impersonate
+                some service provided by that other organization.
+            </para>
+
+            <para>
+                They can announce a more specific route, for example a /24, inside a
+                larger block. For example, consider 169.232.0.0/16. If the spammer
+                pokes around, they can probably find an unused /24 in there. So they
+                announce 169.232.240.0/24 and then send spam from that block. There
+                are two problems with this scheme. First, the announcement of such a
+                smaller block may be filtered out by many BGP routers, reducing their
+                reachability to their spam targets. Second, they may have made a
+                mistake, and that /24 is actually in use by some UCLA service that
+                will notice their hijack.
+            </para>
+
+            <para>
+                They can announce a less specific route, for example a /16, covering
+                some individual smaller blocks. For example, they could announce
+                52.129.0.0/16.  The spammer could then avoid the four existing
+                announcements inside that block, and instead spam from
+                52.129.128.0/17. That gives them 32K IP addresses to work with. The
+                advantage here is that their announcement of a large block won't be
+                filtered out by as many (if any) BGP routers, giving them better reachability
+                to their spam targets. And they know they won't interfere with any
+                existing use of that address space, since there was no previous BGP
+                announcement of that /17 or any subset of it.
+            </para>
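+
+            <para>
+                The prefix arithmetic in these two schemes can be checked with a
+                short Python sketch. It is only an illustration and not part of
+                <command>@PACKAGE@</command>; it uses the standard
+                <literal>ipaddress</literal> module, and all variable names are
+                made up for the example.
+            </para>
+
+            <literallayout class="monospaced"><![CDATA[
```python
import ipaddress

# Scheme 1: a more specific /24 announced inside an existing /16.
covering = ipaddress.ip_network("169.232.0.0/16")
more_specific = ipaddress.ip_network("169.232.240.0/24")
inside = more_specific.subnet_of(covering)

# Scheme 2: a less specific /16 announced over smaller existing blocks.
# The spammer then sends from the upper /17 half of that /16.
less_specific = ipaddress.ip_network("52.129.0.0/16")
halves = list(less_specific.subnets(prefixlen_diff=1))
spam_block = halves[1]

print(inside)                    # True
print(spam_block)                # 52.129.128.0/17
print(spam_block.num_addresses)  # 32768, the "32K" in the text
```
+]]></literallayout>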
+
+            <para>
+                Or they can simply announce a prefix that is not assigned to anyone.
+                For example, they could simply start announcing 185.10.0.0/16. This
+                has many of the same advantages as the previous scheme, but some BGP
+                routers may be configured to drop such bogon announcements, again
+                potentially reducing their reachability to their spam targets.
+            </para>
+
+            <para>
+                In each of these cases, the spammer can use BGP to announce some
+                address space, then send spam from those addresses, and then withdraw
+                the route announcement. This would make it difficult for the recipient of
+                such spam to determine who actually sent it.
+            </para>
+
+            <para>
+                In a paper from 2006 published at <ulink
+                url="http://www-static.cc.gatech.edu/~feamster/publications/p396-ramachandran.pdf">
+                http://www-static.cc.gatech.edu/~feamster/publications/p396-ramachandran.pdf
+                </ulink>, Ramachandran and Feamster claim evidence for the statement
+                that spammers are using such short-lived bogus BGP route announcements
+                to send spam from hijacked parts of the IPv4 address space.
+            </para>
+
+            <para>
+                The question is, are spammers actually doing this today, or is this
+                just a hypothetical spam tactic that they could use in the future?  To
+                help answer that question, this package monitors BGP announcements,
+                classifies some of them as suspicious, and logs instances of SMTP
+                connections from suspicious prefixes.
+            </para>
+
+            <para>
+                We track the history of the AS adjacency graph by computing the union
+                of all adjacent AS pairs over all the announced prefixes. For example,
+                137.169.0.0/16 is currently announced here with an AS path of '22298
+                19080 3549 6517 14981', so we add (22298,19080), (19080,3549),
+                (3549,6517) and (6517,14981) as valid adjacent AS pairs.
+                We also track the history of the origin AS for each announced prefix. Both
+                the origin AS and the adjacent AS pairs are tracked via the following
+                algorithm, which runs every hour.
+            </para>
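+
+            <para>
+                As a minimal sketch of the pair extraction described above, the
+                following Python fragment splits an AS path into adjacent pairs and
+                collects them in a set. The names <literal>adjacent_pairs</literal>
+                and <literal>valid_pairs</literal> are hypothetical and are not taken
+                from the <command>@PACKAGE@</command> source. The sketch does nothing
+                special for paths that repeat an AS number.
+            </para>
+
+            <literallayout class="monospaced"><![CDATA[
```python
def adjacent_pairs(as_path):
    """Return the adjacent AS pairs of a space-separated AS path string."""
    hops = [int(asn) for asn in as_path.split()]
    # Pair every hop with the hop that follows it.
    return list(zip(hops, hops[1:]))


# Union of adjacent pairs over all announced prefixes (one prefix here).
valid_pairs = set()
valid_pairs.update(adjacent_pairs("22298 19080 3549 6517 14981"))

for pair in sorted(valid_pairs):
    print(pair)
```
+]]></literallayout>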
+
+            <para>
+                First, every origin count of every prefix is multiplied by 0.99
+                (prefix[*] *= 0.99), which exponentially decays the existing counts.
+                Then, for each prefix that is currently announced, the count for its
+                current origin is incremented (prefix[current.origin]++).
+                The decay factor of 0.99 gives the counts a half-life of about 69 hours.
+                The same is done with the hourly counts for each observed adjacent AS pair.
+            </para>
+
+            <para>
+                A prefix announcement is suspicious if the prefix[origin] count is less
+                than 3.0, or if the AS path contains any adjacent AS pair with a count
+                less than 3.0.
+            </para>
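+
+            <para>
+                The hourly decay and the threshold test can be sketched in Python as
+                follows. This is an illustration under stated assumptions, not the
+                daemon's code: the function names and the dictionary layout are
+                hypothetical, unseen keys are assumed to start at a count of zero,
+                and the origin AS is taken to be the last AS in the path.
+            </para>
+
+            <literallayout class="monospaced"><![CDATA[
```python
import math

DECAY = 0.99     # hourly decay factor from the text
THRESHOLD = 3.0  # counts below this value are suspicious


def hourly_update(counts, observed):
    """Decay every stored count, then add 1 for each key seen this hour."""
    for key in counts:
        counts[key] *= DECAY
    for key in observed:
        counts[key] = counts.get(key, 0.0) + 1.0


def is_suspicious(origin_counts, pair_counts, prefix, as_path):
    """Apply the origin test and the adjacent-pair test to one announcement."""
    origin = as_path[-1]  # assumption: the origin AS is the last hop
    if origin_counts.get((prefix, origin), 0.0) < THRESHOLD:
        return True
    pairs = zip(as_path, as_path[1:])
    return any(pair_counts.get(pair, 0.0) < THRESHOLD for pair in pairs)


# Number of hourly runs after which an untouched count has halved.
half_life = math.log(0.5) / math.log(DECAY)
print(round(half_life, 1))  # 69.0
```
+]]></literallayout>
+
+            <para>
+                Under these assumptions, a newly seen origin has counts of 1.0, 1.99
+                and 2.9701 after its first three hourly runs, so it stays suspicious
+                until the fourth run raises the count to about 3.94.
+            </para>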
+
+            <para>
+                <ulink url="http://phas.netsec.colostate.edu/">PHAS</ulink> is another
+                system that attempts to detect address space hijacking, but it is not
+                correlated with SMTP connections or spam attempts.
+            </para>
+
+            <para>
+                <ulink url="http://cs.unm.edu/~karlinjf/IAR/index.php">IAR</ulink> is
+                another system that attempts to detect address space hijacking, but it
+                is not correlated with SMTP connections or spam attempts. IAR uses
+                methods detailed in <ulink
+                url="http://www.cs.unm.edu/~treport/tr/06-06/pgbgp3.pdf">PGBGP</ulink>
+                to detect suspicious routes. One problem with PGBGP, as applied to our
+                hypothetical spammer, is that it primarily looks for
+                hijacks where the attacker actually wants some specific IP address
+                space, either for a denial of service or to impersonate the actual
+                owner. Our hypothetical spammer does not care about that; they only
+                care about sending spam anonymously. In particular, PGBGP ignores
+                super-prefix hijacks, which seem likely to be the preferred
+                method for our hypothetical spammer. However, the PGBGP paper does provide
+                useful data on the timescale required to avoid flagging most of the normal AS
+                origin changes.
+            </para>
+        </refsect1>
+
+        <refsect1 id='todo.1'>
+            <title>TODO</title>
+            <para>
+                None.
+            </para>
+        </refsect1>
+
+        <refsect1 id='copyright.1'>
+            <title>Copyright</title>
+            <para>
+                Copyright (C) 2008 by 510 Software Group &lt;carl@five-ten-sg.com&gt;
+            </para>
+            <para>
+                This program is free software; you can redistribute it and/or modify it
+                under the terms of the GNU General Public License as published by the
+                Free Software Foundation; either version 3, or (at your option) any
+                later version.
+            </para>
+            <para>
+                You should have received a copy of the GNU General Public License along
+                with this program; see the file COPYING.  If not, please write to the
+                Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
+            </para>
+        </refsect1>
+
+        <refsect1 id='version.1'>
+            <title>Version</title>
+            <para>
+                @VERSION@
+            </para>
+        </refsect1>
+    </refentry>
+
+
+    <refentry id="@PACKAGE@.conf.5">
+        <refentryinfo>
+            <date>2008-04-12</date>
+        </refentryinfo>
+
+        <refmeta>
+            <refentrytitle>@PACKAGE@.conf</refentrytitle>
+            <manvolnum>5</manvolnum>
+            <refmiscinfo>@PACKAGE@ @VERSION@</refmiscinfo>
+        </refmeta>
+
+        <refnamediv id='name.5'>
+            <refname>@PACKAGE@.conf</refname>
+            <refpurpose>configuration file for @PACKAGE@</refpurpose>
+        </refnamediv>
+
+        <refsynopsisdiv id='synopsis.5'>
+            <title>Synopsis</title>
+            <cmdsynopsis>
+                <command>@PACKAGE@.conf</command>
+            </cmdsynopsis>
+        </refsynopsisdiv>
+
+        <refsect1 id='description.5'>
+            <title>Description</title>
+            <para>The <command>@PACKAGE@.conf</command> configuration file is
+            specified by this partial BNF description. The entire config file
+            is case-sensitive. All the keywords are lower case.
+            </para>
+
+            <literallayout class="monospaced"><![CDATA[
+CONFIG    := {FILE}+
+FILE      := "file" FILENAME "{" PATTERN+ "};"
+PATTERN   := PATH | ANNOUNCE | WITHDRAW | IP
+PATH      := "path"     REGEX "{" INDEXPATH         "}" ";"
+ANNOUNCE  := "announce" REGEX "{" INDEXVAL INDEXLEN "}" ";"
+WITHDRAW  := "withdraw" REGEX "{" INDEXVAL INDEXLEN "}" ";"
+IP        := "ip"       REGEX "{" INDEXIP           "}" ";"
+INDEXPATH := "index_path"   REGEX-INTEGER-VALUE ";"
+INDEXVAL  := "index_value"  REGEX-INTEGER-VALUE ";"
+INDEXLEN  := "index_length" REGEX-INTEGER-VALUE ";"
+INDEXIP   := "index_ip"     REGEX-INTEGER-VALUE ";"]]></literallayout>
+        </refsect1>
+
+        <refsect1 id='sample.5'>
+            <title>Sample</title>
+            <literallayout class="monospaced"><![CDATA[
+file "/var/log/bgp" {
+    path " rcvd UPDATE w.* path (([0-9]| )*[0-9])" {
+        index_path 1;
+    };
+    announce " rcvd (([0-9]|\.)*)/([0-9]*)$" {
+        index_value  1;
+        index_length 3;
+    };
+    withdraw " rcvd UPDATE about (([0-9]|\.)*)/([0-9]*) -- withdrawn" {
+        index_value  1;
+        index_length 3;
+    };
+};
+
+file "/var/log/maillog" {
+    ip "NOQUEUE: connect from.* \[(.*)\]" {
+        index_ip 1;
+    };
+};]]></literallayout>
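+
+            <para>
+                The following Python sketch shows how the index values in the sample
+                appear to select capture groups of each regular expression. It rests
+                on several assumptions. The log lines are made-up strings written
+                only so that the sample patterns match; they are not real daemon
+                output. Python's <literal>re</literal> module stands in for the
+                regex(7) engine, which is safe here only because these patterns use
+                constructs common to both. Reading each index as a capture group
+                number is an interpretation of the sample.
+            </para>
+
+            <literallayout class="monospaced"><![CDATA[
```python
import re

# Patterns copied from the sample configuration.
PATH = re.compile(r" rcvd UPDATE w.* path (([0-9]| )*[0-9])")
ANNOUNCE = re.compile(r" rcvd (([0-9]|\.)*)/([0-9]*)$")
WITHDRAW = re.compile(r" rcvd UPDATE about (([0-9]|\.)*)/([0-9]*) -- withdrawn")
IP = re.compile(r"NOQUEUE: connect from.* \[(.*)\]")

# Hypothetical lines, invented for this example only.
path_line = "bgpd: peer rcvd UPDATE w/ attr: path 22298 19080 3549 6517 14981"
announce_line = "bgpd: peer rcvd 137.169.0.0/16"
withdraw_line = "bgpd: peer rcvd UPDATE about 137.169.0.0/16 -- withdrawn"
ip_line = "NOQUEUE: connect from host.example [192.0.2.25]"

# index_path 1: group 1 holds the whole AS path.
as_path = PATH.search(path_line).group(1)

# index_value 1 and index_length 3: group 1 is the prefix, group 3 its length.
m = ANNOUNCE.search(announce_line)
announced = (m.group(1), int(m.group(3)))
m = WITHDRAW.search(withdraw_line)
withdrawn = (m.group(1), int(m.group(3)))

# index_ip 1: group 1 is the connecting IP address.
client_ip = IP.search(ip_line).group(1)

print(as_path)    # 22298 19080 3549 6517 14981
print(announced)  # ('137.169.0.0', 16)
print(withdrawn)  # ('137.169.0.0', 16)
print(client_ip)  # 192.0.2.25
```
+]]></literallayout>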
+        </refsect1>
+
+        <refsect1 id='version.5'>
+            <title>Version</title>
+            <para>
+                @VERSION@
+            </para>
+        </refsect1>
+
+    </refentry>
+</reference>