view xml/routeflapper.in @ 1:47f787af96c1

update documentation to match code
author Carl Byington <carl@five-ten-sg.com>
date Tue, 13 May 2008 15:46:53 -0700

<reference>
    <title>@PACKAGE@</title>
    <partintro>
        <title>Packages</title>

        <para>Source and binary packages are available at <ulink
        url="http://www.five-ten-sg.com/@PACKAGE@/packages/">http://www.five-ten-sg.com/@PACKAGE@/packages/</ulink>.
        The most recent documentation is available at <ulink
        url="http://www.five-ten-sg.com/@PACKAGE@/">http://www.five-ten-sg.com/@PACKAGE@/</ulink>.
        </para>

        <para>A <ulink
        url="http://www.selenic.com/mercurial/wiki/">Mercurial</ulink> source
        code repository for this project is available at <ulink
        url="http://hg.five-ten-sg.com/@PACKAGE@/">http://hg.five-ten-sg.com/@PACKAGE@/</ulink>.
        </para>

    </partintro>

    <refentry id="@PACKAGE@.1">
        <refentryinfo>
            <date>2008-05-13</date>
        </refentryinfo>

        <refmeta>
            <refentrytitle>@PACKAGE@</refentrytitle>
            <manvolnum>1</manvolnum>
            <refmiscinfo>@PACKAGE@ @VERSION@</refmiscinfo>
        </refmeta>

        <refnamediv id='name.1'>
            <refname>@PACKAGE@</refname>
            <refpurpose>detects suspicious routes</refpurpose>
        </refnamediv>

        <refsynopsisdiv id='synopsis.1'>
            <title>Synopsis</title>
            <cmdsynopsis>
                <command>@PACKAGE@</command>
                <arg><option>-c</option></arg>
                <arg><option>-d <replaceable class="parameter">n</replaceable></option></arg>
            </cmdsynopsis>
        </refsynopsisdiv>

        <refsect1 id='description.1'>
            <title>Description</title>

            <para><command>@PACKAGE@</command> is a daemon that monitors BGP
            updates and SMTP connections to discover whether SMTP connections are
            coming from IP addresses whose best route is suspicious.</para>

            <para>The <citerefentry> <refentrytitle>@PACKAGE@.conf</refentrytitle>
            <manvolnum>5</manvolnum> </citerefentry> file specifies the syslog files
            to be monitored, and the regular expressions (<citerefentry>
            <refentrytitle>regex</refentrytitle> <manvolnum>7</manvolnum>
            </citerefentry>) to be applied to new lines in those files.  </para>

            <para>Although this discussion focuses on syslog files, any ASCII text
            file can be used, so long as some other process appends lines to that
            file and the lines containing BGP updates can be matched
            by a regular expression.</para>

            <para>Considering syslog files in particular, these are normally rotated
            via logrotate. <command>@PACKAGE@</command> properly detects and
            handles this case by closing the old file and opening the newly
            created file.</para>
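
            <para>
                The following Python sketch shows one common way to detect such a
                rotation: compare the device and inode of the open handle with
                those of the file currently found at the same path. It is an
                illustration only and not the daemon's actual implementation. The
                function names <literal>was_rotated</literal> and
                <literal>reopen_if_rotated</literal> are hypothetical, and the
                sketch assumes a POSIX file system.
            </para>

            <literallayout class="monospaced"><![CDATA[
```python
import os


def was_rotated(handle, path):
    """Return True if `path` now names a different file than `handle`."""
    try:
        on_disk = os.stat(path)
    except FileNotFoundError:
        # The old file was moved away and the new one does not exist yet,
        # so keep reading from the old handle for now.
        return False
    opened = os.fstat(handle.fileno())
    return (on_disk.st_dev, on_disk.st_ino) != (opened.st_dev, opened.st_ino)


def reopen_if_rotated(handle, path):
    """Close the old handle and open the newly created file if rotated."""
    if was_rotated(handle, path):
        handle.close()
        handle = open(path, "r")
    return handle
```
]]></literallayout>

            <para>
                A caller would invoke <literal>reopen_if_rotated</literal> each
                time it reaches the end of the file it is currently reading.
            </para>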
        </refsect1>

        <refsect1 id='options.1'>
            <title>Options</title>
            <variablelist>
                <varlistentry>
                    <term>-c</term>
                    <listitem>
                        <para>
                            Load the configuration file, print a canonical form
                            of the configuration on stdout, and exit.
                        </para>
                    </listitem>
                </varlistentry>
                <varlistentry>
                    <term>-d <replaceable class="parameter">n</replaceable></term>
                    <listitem>
                        <para>
                            Set the debug level to <replaceable class="parameter">n</replaceable>.
                        </para>
                    </listitem>
                </varlistentry>
            </variablelist>
        </refsect1>

        <refsect1 id='usage.1'>
            <title>Usage</title>
            <para><command>@PACKAGE@</command> -d 2</para>
        </refsect1>

        <refsect1 id='configuration.1'>
            <title>Configuration</title>
            <para>
                The configuration file is documented in <citerefentry>
                <refentrytitle>@PACKAGE@.conf</refentrytitle> <manvolnum>5</manvolnum>
                </citerefentry>.  Any change to the config file will cause it to be
                reloaded within three minutes.
            </para>
        </refsect1>

        <refsect1 id='introduction.1'>
            <title>Introduction</title>
            <para>
                Consider the hypothetical case of a spammer who is connected via a
                provider that does not filter BGP routing announcements. The spammer
                then has several ways to announce IP address space to be used for
                sending spam. Note that we only consider cases where the spammer
                simply wants to use some IP address space anonymously. This is very
                different from the case where the attacker wants to use some specific
                address space belonging to another organization in order to impersonate
                some service provided by that other organization.
            </para>

            <para>
                They can announce a more specific route, for example a /24, inside a
                larger block. For example, consider 169.232.0.0/16. If the spammer
                pokes around, they can probably find an unused /24 in there. So they
                announce 169.232.240.0/24 and then send spam from that block. There
                are two problems with this scheme. First, the announcement of such a
                small block may be filtered out by many BGP routers, reducing the
                spammer's reachability to their spam targets. Second, they may have made a
                mistake, and that /24 is actually in use by some UCLA service that
                will notice the hijack.
            </para>

            <para>
                They can announce a less specific route, for example a /16, covering
                some individual smaller blocks. For example, they could announce
                52.129.0.0/16.  The spammer could then avoid the four existing
                announcements inside that block, and instead spam from
                52.129.128.0/17. That gives them 32K IP addresses to work with. The
                advantage here is that their announcement of a large block won't be
                filtered out by as many (if any) BGP routers, giving them better reachability
                to their spam targets. And they know they won't interfere with any
                existing use of that address space, since there was no previous BGP
                announcement of that /17 or any subset of it.
            </para>
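
            <para>
                The prefix arithmetic in the two examples above can be checked with
                Python's standard <literal>ipaddress</literal> module. This is only
                a sketch for verifying the numbers; the variable names are
                illustrative and nothing here is part of the package.
            </para>

            <literallayout class="monospaced"><![CDATA[
```python
import ipaddress

# More specific route: a /24 carved out of the covering /16.
covering = ipaddress.ip_network("169.232.0.0/16")
more_specific = ipaddress.ip_network("169.232.240.0/24")
inside = more_specific.subnet_of(covering)

# Less specific route: the upper /17 half of an announced /16.
less_specific = ipaddress.ip_network("52.129.0.0/16")
upper_half = ipaddress.ip_network("52.129.128.0/17")
covered = upper_half.subnet_of(less_specific)
addresses = upper_half.num_addresses  # 2 ** (32 - 17)

print(inside, covered, addresses)
```
]]></literallayout>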

            <para>
                Or they can simply announce a prefix that is not assigned to anyone.
                For example, they could simply start announcing 185.10.0.0/16. This
                has many of the same advantages as the previous scheme, but some BGP
                routers may be configured to drop such bogon announcements, again
                potentially reducing their reachability to their spam targets.
            </para>

            <para>
                In each of these cases, the spammer can use BGP to announce some
                address space, then send spam from those addresses, and then withdraw
                the route announcement. This would make it difficult for the recipient of
                such spam to determine who actually sent it.
            </para>

            <para>
                In a paper from 2006 published at <ulink
                url="http://www-static.cc.gatech.edu/~feamster/publications/p396-ramachandran.pdf">
                http://www-static.cc.gatech.edu/~feamster/publications/p396-ramachandran.pdf
                </ulink>, Ramachandran and Feamster claim evidence for the statement
                that spammers are using such short-lived bogus BGP route announcements
                to send spam from hijacked parts of the IPv4 address space.
            </para>

            <para>
                The question is, are spammers actually doing this today, or is this
                just a hypothetical spam tactic that they could use in the future?  To
                help answer that question, this package monitors BGP announcements,
                classifies some of them as suspicious, and logs instances of SMTP
                connections from suspicious prefixes.
            </para>

            <para>
                We track the history of the AS adjacency graph by computing the union
                of all adjacent AS pairs over all the announced prefixes. For example,
                137.169.0.0/16 is currently announced here with an AS path of '22298
                19080 3549 6517 14981', so we add (22298,19080) (19080,3549)
                (3549,6517) and (6517,14981) as valid adjacent AS pairs.
                We also track the history of the origin AS for each announced prefix. Both
                the origin AS and the AS adjacency pairs are tracked via the following
                algorithm that runs every hour.
            </para>
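
            <para>
                As a sketch of how the adjacent pairs and the origin AS can be
                derived from an AS path, the following Python fragment reproduces
                the example above. The function names
                <literal>adjacent_pairs</literal> and <literal>origin_as</literal>
                are hypothetical and are not taken from the package source.
            </para>

            <literallayout class="monospaced"><![CDATA[
```python
def adjacent_pairs(as_path):
    """Split an AS path string into its list of adjacent AS pairs."""
    hops = [int(asn) for asn in as_path.split()]
    return list(zip(hops, hops[1:]))


def origin_as(as_path):
    """The origin AS is the last (rightmost) AS in the path."""
    return int(as_path.split()[-1])


path = "22298 19080 3549 6517 14981"
pairs = adjacent_pairs(path)
origin = origin_as(path)
print(pairs, origin)
```
]]></literallayout>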

            <para>
                Every hour, the origin counts of each prefix are first multiplied by
                0.99 (prefix[*] *= 0.99) so that they decay exponentially. Then, for
                each prefix that is currently announced, the count for its current
                origin is incremented (prefix[current.origin]++).
                The decay factor of 0.99 gives the counts a half-life of about 69 hours.
                The same is done with the hourly counts for each observed adjacent AS pair.
            </para>
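
            <para>
                The hourly update can be sketched in Python as follows. This is a
                minimal illustration of the decay-then-increment step for a single
                prefix, assuming the counts are kept in a dictionary keyed by
                origin AS. The names <literal>hourly_update</literal> and
                <literal>DECAY</literal> are hypothetical; the package's own data
                structures may differ.
            </para>

            <literallayout class="monospaced"><![CDATA[
```python
import math
from collections import defaultdict

DECAY = 0.99  # hourly decay factor from the text


def hourly_update(counts, current_origin):
    """Decay every origin count for one prefix, then credit the current origin.

    `counts` maps origin AS -> decayed hourly count for a single prefix.
    `current_origin` is None when the prefix is not currently announced.
    """
    for origin in counts:
        counts[origin] *= DECAY
    if current_origin is not None:
        counts[current_origin] += 1.0


# Number of hours for a count to fall to half of its value.
half_life_hours = math.log(0.5) / math.log(DECAY)

counts = defaultdict(float)
for _ in range(3):
    hourly_update(counts, 14981)
```
]]></literallayout>

            <para>
                Under these assumptions a newly seen origin reaches a count of
                1.0, 1.99, and 2.9701 after its first three hourly updates, and
                3.940399 after the fourth.
            </para>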

            <para>
                A prefix announcement is suspicious if the prefix[origin] count is less
                than 3.0, or if the AS path contains any adjacent AS pair with a count
                less than 3.0.
            </para>
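
            <para>
                A minimal Python sketch of this rule follows. It assumes the decayed
                counts are available as dictionaries and that an origin or pair that
                has never been seen has a count of zero. The function name
                <literal>is_suspicious</literal> and its parameters are
                hypothetical.
            </para>

            <literallayout class="monospaced"><![CDATA[
```python
THRESHOLD = 3.0  # counts below this value are treated as suspicious


def is_suspicious(as_path, origin_counts, pair_counts):
    """Apply the rule above to one prefix announcement.

    `as_path` is a list of AS numbers ending with the origin AS.
    `origin_counts` maps origin AS -> decayed count for this prefix.
    `pair_counts` maps (AS, AS) -> decayed count for that adjacency.
    """
    origin = as_path[-1]
    if origin_counts.get(origin, 0.0) < THRESHOLD:
        return True
    return any(
        pair_counts.get(pair, 0.0) < THRESHOLD
        for pair in zip(as_path, as_path[1:])
    )
```
]]></literallayout>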

            <para>
                <ulink url="http://phas.netsec.colostate.edu/">PHAS</ulink> is another
                system that attempts to detect address space hijacking, but it is not
                correlated with SMTP connections or spam attempts.
            </para>

            <para>
                <ulink url="http://cs.unm.edu/~karlinjf/IAR/index.php">IAR</ulink> is
                another system that attempts to detect address space hijacking, but it
                is not correlated with SMTP connections or spam attempts. IAR uses
                methods detailed in <ulink
                url="http://www.cs.unm.edu/~treport/tr/06-06/pgbgp3.pdf">PGBGP</ulink>
                to detect suspicious routes. One problem with applying PGBGP to our
                hypothetical spammer problem is that PGBGP primarily looks for
                hijacks where the attacker actually wants some specific IP address
                space, either for a denial of service or to impersonate the actual
                owner. Our hypothetical spammer does not care about that; they only
                care about sending spam anonymously. In particular, PGBGP ignores
                super-prefix hijacks, which seem likely to be the preferred
                method for our hypothetical spammer. However, the PGBGP paper does provide
                useful data on the timescale required to avoid flagging most normal AS
                origin changes.
            </para>
        </refsect1>

        <refsect1 id='todo.1'>
            <title>TODO</title>
            <para>
                None.
            </para>
        </refsect1>

        <refsect1 id='copyright.1'>
            <title>Copyright</title>
            <para>
                Copyright (C) 2008 by 510 Software Group &lt;carl@five-ten-sg.com&gt;
            </para>
            <para>
                This program is free software; you can redistribute it and/or modify it
                under the terms of the GNU General Public License as published by the
                Free Software Foundation; either version 3, or (at your option) any
                later version.
            </para>
            <para>
                You should have received a copy of the GNU General Public License along
                with this program; see the file COPYING.  If not, please write to the
                Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
            </para>
        </refsect1>

        <refsect1 id='version.1'>
            <title>Version</title>
            <para>
                @VERSION@
            </para>
        </refsect1>
    </refentry>


    <refentry id="@PACKAGE@.conf.5">
        <refentryinfo>
            <date>2008-05-13</date>
        </refentryinfo>

        <refmeta>
            <refentrytitle>@PACKAGE@.conf</refentrytitle>
            <manvolnum>5</manvolnum>
            <refmiscinfo>@PACKAGE@ @VERSION@</refmiscinfo>
        </refmeta>

        <refnamediv id='name.5'>
            <refname>@PACKAGE@.conf</refname>
            <refpurpose>configuration file for @PACKAGE@</refpurpose>
        </refnamediv>

        <refsynopsisdiv id='synopsis.5'>
            <title>Synopsis</title>
            <cmdsynopsis>
                <command>@PACKAGE@.conf</command>
            </cmdsynopsis>
        </refsynopsisdiv>

        <refsect1 id='description.5'>
            <title>Description</title>
            <para>The <command>@PACKAGE@.conf</command> configuration file is
            specified by this partial BNF description. The entire config file
            is case sensitive. All the keywords are lowercase.
            </para>

            <literallayout class="monospaced"><![CDATA[
CONFIG    := {FILE}+
FILE      := "file" FILENAME "{" PATTERN+ "}" ";"
PATTERN   := RESET | PATH | ANNOUNCE | WITHDRAW | IP
RESET     := "reset"    REGEX "{"  "}" ";"
PATH      := "path"     REGEX "{" INDEXPATH         "}" ";"
ANNOUNCE  := "announce" REGEX "{" INDEXVAL INDEXLEN "}" ";"
WITHDRAW  := "withdraw" REGEX "{" INDEXVAL INDEXLEN "}" ";"
IP        := "ip"       REGEX "{" INDEXIP           "}" ";"
INDEXPATH := "index_path"   REGEX-INTEGER-VALUE ";"
INDEXVAL  := "index_value"  REGEX-INTEGER-VALUE ";"
INDEXLEN  := "index_length" REGEX-INTEGER-VALUE ";"
INDEXIP   := "index_ip"     REGEX-INTEGER-VALUE ";"]]></literallayout>
        </refsect1>

        <refsect1 id='sample.5'>
            <title>Sample</title>
            <literallayout class="monospaced"><![CDATA[
file "/var/log/bgp" {
    reset "ADJCHANGE: neighbor .* Up" {};
    path " rcvd UPDATE w.* path (([0-9]| )*[0-9])" {
        index_path 1;
    };
    announce " rcvd (([0-9]|\.)*)/([0-9]*)$" {
        index_value  1;
        index_length 3;
    };
    withdraw " rcvd UPDATE about (([0-9]|\.)*)/([0-9]*) -- withdrawn" {
        index_value  1;
        index_length 3;
    };
};

file "/var/log/maillog" {
    ip "NOQUEUE: connect from.* \[(.*)\]" {
        index_ip 1;
    };
};]]></literallayout>
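
            <para>
                To show how the index values select capture groups, the following
                Python sketch applies the four regular expressions from the sample
                above to some log lines. The log lines are invented here purely to
                match the patterns and are not taken from a real router or mail
                server. Python's <literal>re</literal> module is used as a
                stand-in for the regex library used by the daemon, which may
                differ in detail.
            </para>

            <literallayout class="monospaced"><![CDATA[
```python
import re

# Patterns copied from the sample configuration above.
PATH = re.compile(r" rcvd UPDATE w.* path (([0-9]| )*[0-9])")
ANNOUNCE = re.compile(r" rcvd (([0-9]|\.)*)/([0-9]*)$")
WITHDRAW = re.compile(r" rcvd UPDATE about (([0-9]|\.)*)/([0-9]*) -- withdrawn")
IP = re.compile(r"NOQUEUE: connect from.* \[(.*)\]")

# Hypothetical log lines, invented for this example.
path_line = ("BGP(0): 10.0.0.1 rcvd UPDATE w/ attr: origin i, "
             "path 22298 19080 3549 6517 14981")
announce_line = "BGP(0): 10.0.0.1 rcvd 137.169.0.0/16"
withdraw_line = "BGP(0): 10.0.0.1 rcvd UPDATE about 137.169.0.0/16 -- withdrawn"
mail_line = "sendmail[1234]: NOQUEUE: connect from mail.example.com [192.0.2.25]"

# index_path 1: group 1 holds the AS path.
as_path = PATH.search(path_line).group(1)

# index_value 1 and index_length 3: groups 1 and 3 hold prefix and length.
m = ANNOUNCE.search(announce_line)
announced = (m.group(1), int(m.group(3)))
m = WITHDRAW.search(withdraw_line)
withdrawn = (m.group(1), int(m.group(3)))

# index_ip 1: group 1 holds the connecting IP address.
client_ip = IP.search(mail_line).group(1)

print(as_path, announced, withdrawn, client_ip)
```
]]></literallayout>

            <para>
                Group 2 in the announce and withdraw patterns is the inner
                repetition group, which is why the prefix length is taken from
                group 3.
            </para>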
        </refsect1>

        <refsect1 id='version.5'>
            <title>Version</title>
            <para>
                @VERSION@
            </para>
        </refsect1>

    </refentry>
</reference>