Detecting IP Address Filters

Until recently IP network operators were encouraged to set up so-called “bogon address filters” at the edge of their networks. These filters were intended to discard all incoming traffic where the source address in the IP header was from a block of addresses that was known to be unallocated. The inference was that a matching packet was either an unintentional leak from some privately addressed network domain or was generated using source address spoofing. In either case there is no point in delivering the packet, since it comes from a demonstrably fictitious source.

The initial use of these bogon filters was to mark those /8 address blocks in the IPv4 address space that were held by the IANA. Some tools were subsequently developed (and deployed in more limited contexts) to also mark as bogons those address blocks held as unallocated by the Regional Internet Registries (RIRs), and other tools were developed to mark address blocks that were consistently used to originate spam or malicious traffic, based on collected statistics and other heuristics.

A persistent issue with the use of these bogon filters is that they can lose currency. When addresses change from “unallocated” to “in-use” (such as through an IANA allocation action, as happened in the past, or an RIR assignment or allocation of addresses to a Local Internet Registry), or when an address block already in use changes hands, the rationale for maintaining the entry in a bogon filter almost certainly changes.
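
The loss of currency described above can be illustrated with a minimal sketch. The filter contents are hypothetical: 10.0.0.0/8 stands in for a private address block, and 103.0.0.0/8 stands in for a /8 that was unallocated when the filter was written but has since been allocated. Neither list is taken from any real operator's configuration.

```python
import ipaddress

# Hypothetical filter contents, for illustration only: one private block and
# one /8 that was still unallocated when this filter was last edited.
STALE_BOGONS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("103.0.0.0/8"),
]

# The same list after a (hypothetical) refresh that removes the 103/8 entry.
CURRENT_BOGONS = [
    net for net in STALE_BOGONS
    if net != ipaddress.ip_network("103.0.0.0/8")
]


def is_filtered(source, bogons=STALE_BOGONS):
    """Return True if a packet with this source address would be discarded."""
    addr = ipaddress.ip_address(source)
    return any(addr in net for net in bogons)
```

With the stale list, `is_filtered("103.246.136.1")` is true even though the address is now in use; with the refreshed list it is false, while private addresses continue to be discarded.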

Another major issue with bogon filters is that, in general, they are not centrally managed. Many ISPs use local procedures to maintain their bogon filters. So when an address block changes hands, how can the new address holder work out which bogon filters list the address block, and how can they get those filters changed?

Some of the Regional Internet Registries, notably the RIPE NCC, have been operating a “Debogon Project” where addresses from selected address blocks are advertised, and end users can test reachability into these address blocks via conventional use of ping and traceroute. Additional active tests can be carried out from the announced blocks to known test networks and devices, performing a test in the opposite direction.

The approach described here is somewhat different, in that it does not rely on the use of cooperating users, nor on the deployment of probes or other network edge devices. Instead, we have used online advertisement placement to co-opt end users to undertake basic reachability tests as a background test in their browsers. The remainder of the document describes the methodology used in these tests and the results.

Reachability Test Setup

The approach used here is to employ the Internet at large to perform the reachability tests. The basic method is to get the client’s browser to fetch a number of 1×1 pixel image files under the control of a script that was obtained from a known working IP address. All image files, except the final image, are sourced from domain names that map to IP addresses drawn from the address blocks being tested. The final image is sourced from the same address block as that from which the experiment was originally delivered, so it acts as an end marker to indicate that the user has not interrupted the browser’s actions midway through the tests.

This test methodology can be implemented in a number of ways, including in JavaScript or in Flash code embedded in a web page. However, with such an approach the clients who undertake the reachability test are restricted to those who visit a web page that has the test included in the source of the page. As we wanted to test a collection of clients from across the entire Internet, we were interested in a way to present these tests more broadly than would be possible with a small number of appropriately equipped web sites.

The approach taken here is to use Google’s Image Advertisements as the vehicle for presenting the reachability tests to end client browsers. Embedded in the ad material is a segment of Flash code. The Flash code is executed by the client’s browser whenever the ad is delivered to the client, and the code will execute irrespective of whether the ad is “clicked” or not. The Flash code fetches a control file from the experiment controller, using a control IP address. This control file includes a number of image URLs, which form the core of the experiment. An example of the control URL and the generated file is shown in Figure 1.

    
    http://www.a.rqa.rand.apnic.net/measureipv6.cgi
    
    rqaa	http://t2.u2980881285.s1326250419.i1110.v1010.a.rqa.rand.apnic.net/1x1.png?
                t2.u2980881285.s1326250419.i1110.v1010.rqaa
    rqab	http://t2.u2980881285.s1326250419.i1110.v1010.b.rqa.rand.apnic.net/1x1.png?
                t2.u2980881285.s1326250419.i1110.v1010.rqab
    rqac	http://t2.u2980881285.s1326250419.i1110.v1010.c.rqa.rand.apnic.net/1x1.png?
                t2.u2980881285.s1326250419.i1110.v1010.rqac
    results	http://results.c.rqa.rand.apnic.net/1x1.png?t2.u2980881285.s1326250419.
                i1110.v1010&r=
    
        Figure 1. Example of an Experiment Control File
    

In this example there are two address reachability tests being conducted (“rqaa” and “rqab”), and a control test (“rqac”). The parameters to the test include the identification of the test version, the time the test was conducted and a random number that allows us to recombine the individual experiments back into a single test set when performing analysis of the web server’s log files. The random number in the domain name also prevents pre-cached experiments from being invoked, since every test-set has a different unique DNS name using this method.
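
The label structure can be sketched as follows. This is an illustration only, not the code behind measureipv6.cgi: the function names are hypothetical, the `u` field is assumed to carry the random number, and the `s` field is assumed to be a Unix timestamp (in Figure 2 it matches the logged request time when read that way). The `t`, `i` and `v` fields are copied through without interpretation.

```python
import random
import time
from datetime import datetime, timezone

DOMAIN = "rqa.rand.apnic.net"


def make_label(now=None, rng=random):
    """Build a per-client label such as t2.u2980881285.s1326250419.i1110.v1010."""
    unique = rng.randrange(2**32)  # assumed: 'u' is the random number
    stamp = int(time.time() if now is None else now)  # assumed: 's' is Unix time
    return f"t2.u{unique}.s{stamp}.i1110.v1010"


def image_url(label, test):
    """URL of one test image; `test` is 'a', 'b' or 'c'."""
    return f"http://{label}.{test}.{DOMAIN}/1x1.png?{label}.rqa{test}"


def parse_query(query):
    """Split 't2.u...s...i...v....rqab' into a dict of fields plus the test name."""
    *fields, test = query.split(".")
    parsed = {field[0]: field[1:] for field in fields}
    parsed["test"] = test
    return parsed


def issued_at(fields):
    """Interpret the 's' field as a UTC timestamp (an assumption, see above)."""
    return datetime.fromtimestamp(int(fields["s"]), tz=timezone.utc)
```

Because the label is part of the DNS name, each client resolves and fetches names that no cache has seen before, and the same label in the query string lets the log records of one client be grouped back into a single test set.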

The images are 1×1 pixel images that are fetched asynchronously from page rendering and are not included in the displayed page. This minimizes any additional delay on the page view from the testing, which largely takes place without any visible impact on the user’s browser. The same mechanism is employed by Google Analytics and other page view collection methods.

In this experiment three tests use three wildcard domain names, namely *.a.rqa.rand.apnic.net, *.b.rqa.rand.apnic.net, and *.c.rqa.rand.apnic.net. These domain names map to the following IP addresses, from within the associated network address blocks:

  Test   Test Address     Address Prefix
  rqaa   110.76.136.1     110.76.136.0/22
  rqab   103.246.136.1    103.246.136.0/22
  rqac   203.133.248.12   203.133.248.0/24 (control network)

The server environment was configured such that these test address prefixes were being advertised as reachable from this server, and the server itself was configured with secondary IP addresses that match the three rqa addresses.

The Apache web server of the system was configured with a wildcard virtual server host of the form *.rqa.rand.apnic.net, so that all the images are served from the same server.

The Apache web server logs were used for reachability analysis.

If a client successfully performs the entire test then the web logs will contain a sequence of records, an example of which is shown in Figure 2.

    
    110.67.xxx.xxx - - [02/Jan/2012:00:01:38 +0000] "GET /crossdomain.xml
    	HTTP/1.1" 200 726 "-"
    110.67.xxx.xxx - - [02/Jan/2012:00:01:38 +0000] "GET /measureipv6.cgi?&hash=3103003790
     	HTTP/1.1" 200 140
     	"http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgICQro9QEN" 
    110.67.xxx.xxx - - [02/Jan/2012:00:01:39 +0000] "GET /crossdomain.xml
    	HTTP/1.1" 200 726 "-" 
    110.67.xxx.xxx - - [02/Jan/2012:00:01:39 +0000] "GET /1x1.png?t2.u940792609.
    	s1325462498.i1110.v1010.rqab 
    	HTTP/1.1" 200 157 
    	"http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgICQro9QEN" 
    110.67.xxx.xxx - - [02/Jan/2012:00:01:40 +0000] "GET /crossdomain.xml 
    	HTTP/1.1" 200 726 "-"
    110.67.xxx.xxx - - [02/Jan/2012:00:01:41 +0000] "GET /1x1.png?t2.u940792609.
    	s1325462498.i1110.v1010.rqac
    	HTTP/1.1" 200 157
    	"http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgICQro9QEN"
    110.67.xxx.xxx - - [02/Jan/2012:00:01:41 +0000] "GET /crossdomain.xml 
    	HTTP/1.1" 200 726 "-"
    110.67.xxx.xxx - - [02/Jan/2012:00:01:41 +0000] "GET /1x1.png?t2.u940792609.
    	s1325462498.i1110.v1010.rqaa
     	HTTP/1.1" 200 157
    	"http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgICQro9QEN" 
    
        Figure 2. The Web Server's Log of a Reachability Test
    

Because the ad itself is served from a remote site, every fetch from the control server is in fact two fetches: the first retrieves the crossdomain policy file (crossdomain.xml) to determine whether fetches from this site are permitted, and the second retrieves the data file itself.

In the example shown here the experiment generated 8 fetches. The first pair of fetches retrieved the list of URIs to fetch (measureipv6.cgi). The next 3 pairs of fetches performed the reachability test itself.

What we are looking for are origin AS’s where there is a systematic failure to retrieve one or both of the first two images, yet a consistent ability to fetch from the control address.
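
The per-client part of this analysis can be sketched as below. It is a minimal illustration, assuming each log record sits on a single line in the format of Figure 2 (the figure wraps records for display); the function names and the result strings are hypothetical and do not describe the scripts actually used.

```python
import re
from collections import defaultdict

# Matches successful-looking image fetches; crossdomain.xml and the control
# file requests do not match and are therefore ignored.
LINE = re.compile(
    r'^(?P<client>\S+) \S+ \S+ \[[^\]]+\] '
    r'"GET /1x1\.png\?(?P<label>\S+)\.(?P<test>rqa[abc]) HTTP/[\d.]+" '
    r'(?P<status>\d{3})'
)


def collect(lines):
    """Map (client, label) -> set of tests whose image was served with a 200."""
    seen = defaultdict(set)
    for line in lines:
        m = LINE.match(line)
        if m and m["status"] == "200":
            seen[(m["client"], m["label"])].add(m["test"])
    return seen


def classify(tests):
    """Summarise one test set; the control image marks a completed run."""
    if "rqac" not in tests:
        return "incomplete"
    missing = {"rqaa", "rqab"} - set(tests)
    if not missing:
        return "reachable"
    return "unreachable:" + ",".join(sorted(missing))
```

A test set that contains the control image but lacks one of the test images is the case of interest; sets without the control image are treated as interrupted runs and are left out of the comparison.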

Reachability Test Results

The test was run over 22 days. During that time 513,141 individual client IP addresses performed the reachability test. These IP addresses were sourced from 9,665 originating Autonomous Systems.

In order to eliminate potential problems with individual tests not completing (due to the client moving to a different webpage before the test was complete, for example), we have filtered out all originating AS’s where there are only 1 or 2 individual tests performed. This left 507,121 tests, spanning 4,839 originating AS’s.
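
The aggregation by origin AS can be sketched as follows. Several things here are assumptions: the mapping from client address to origin AS is taken as already done, only tests that reached the control target are counted, and “systematic failure” is read as “no test from that AS reached the target”. The function name is hypothetical.

```python
from collections import defaultdict

MIN_TESTS = 3  # origin AS's with only 1 or 2 tests are discarded


def filtering_ases(results, min_tests=MIN_TESTS):
    """Find origin AS's that never reached target 'a' and/or 'b'.

    results: iterable of (origin_as, reached) pairs, where `reached` is the
    set of targets ('a', 'b', 'c') fetched in one test.
    Returns {origin_as: (blocked_targets, number_of_tests)}.
    """
    per_as = defaultdict(list)
    for origin_as, reached in results:
        if "c" in reached:  # assumed: only completed tests are counted
            per_as[origin_as].append(set(reached))

    flagged = {}
    for origin_as, tests in per_as.items():
        if len(tests) < min_tests:
            continue
        blocked = {t for t in ("a", "b") if not any(t in r for r in tests)}
        if blocked:
            flagged[origin_as] = (blocked, len(tests))
    return flagged
```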

3 AS’s were unable to reach the “a” and “b” targets (indicating some form of filter on both the 110.76.136.0/22 and 103.246.136.0/22 address blocks). These AS’s were:

     AS   Number of Tests   Country   AS Description
  23771                26   CN        SXBCTV-AP SXBCTV ,Internet Service Provider
  38283                 3   CN        CHINANET-SCIDC-AS-AP CHINANET SiChuan Telecom Internet Data Center
  45143                50   SG        SINGTELMOBILE-AS-AP SINGTEL MOBILE INTERNET SERVICE PROVIDER

There was no origin AS where more than 2 individual experiments were able to reach the “b” target but unable to reach the “a” target.

16 AS’s were able to reach the “a” target, but unable to reach the “b” target (the prefix 103.246.136.0/22 appears to be filtered). These AS’s were:

      AS   Number of Tests   Country   AS Description
    4835                10   CN        CHINANET-IDC-SN China Telecom (Group)
    6983                 9   US        ITCDELTA - ITC Deltacom
    9051                17   LB        IncoNet Data Management sal
   14868                55   BR        Companhia Paranaense de Energia - COPEL
   19332                 7   MX        Marcatel Com, S.A. de C.V.
   23252                 3   CA        IK - WTC Communications
   25248                77   CZ        BLUETONE-AS Ceske Radiokomunikace a.s.
   28142                 4   BR        DIGITAL DESIGN - SERVICOS DE INFORMATICA LTDA
   28378                 4   MX        TV Rey de Occidente, S.A. de C.V.
   28678                 3   PL        KOSMAN-EDU-AS Technical University of Koszalin
   39566                12   PL        TRUSTNET-PL-AS trustnet.pl / smarthost.pl hosting datacenter
   41088                13   CZ        CZNSYS N-SYS s.r.o.
   43153                 9   PL        SFERANET-AS SferaNET Sp. z o.o.
   44651                12   HU        COMUNIQUE Com.unique Telekommunikacios Szolgaltato Kft.
   49289                11   PL        NITKA-NET ELPRO - Elektronika Profesjonalna Waldemar Nitka
  197100                 4   ES        Esystel Servicios Multimedia, SL

If one were to assume that this is a representative sample of reachability by origin AS, then of the total population of 39,908 AS’s some 66 AS’s could be assumed to be filtering 103.246.136.0/22, and 12 AS’s could be assumed to be filtering both 110.76.136.0/22 and 103.246.136.0/22.
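
The arithmetic behind this extrapolation can be sketched as a simple proportional scaling. The quoted estimates appear to correspond to scaling the flagged counts by the 9,665 originating AS’s seen in the full sample; that reading is inferred from the numbers rather than stated in the text, and the function name is illustrative.

```python
def extrapolate(flagged_ases, sampled_ases, total_ases):
    """Scale a count observed in the sample up to the whole AS population."""
    return round(flagged_ases / sampled_ases * total_ases)


TOTAL_ASES = 39_908    # total AS population quoted in the text
SAMPLED_ASES = 9_665   # originating AS's seen during the test

filtering_b_only = extrapolate(16, SAMPLED_ASES, TOTAL_ASES)   # 103.246.136.0/22 only
filtering_a_and_b = extrapolate(3, SAMPLED_ASES, TOTAL_ASES)   # both prefixes
```

Using the 4,839 AS’s that remained after discarding small samples as the denominator instead would roughly double both estimates, so the result is sensitive to which sample size is taken as representative.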

Consideration of Experimental Results

The experiment raises a number of questions, which we will briefly discuss here.

The first question is, can these experimental results be verified by other means?

A simple means of validating the result is to use ping with the source address option. For a small number of addresses chosen at random from the client set, we were able to confirm that end systems that responded to ping packets sourced from the “control” address of 203.133.248.12 were unresponsive to pings that used the tested addresses as the source address. Not all addresses respond to ping probes, due to firewalls and filters, so this is by no means a precise form of validation.

    
    $ ping 194.126.xx.xx
    PING 194.126.xx.xx (194.126.xx.xx): 56 data bytes
    64 bytes from 194.126.xx.xx: icmp_seq=0 ttl=41 time=372.199 ms
    64 bytes from 194.126.xx.xx: icmp_seq=1 ttl=41 time=372.317 ms
    64 bytes from 194.126.xx.xx: icmp_seq=2 ttl=41 time=372.270 ms
    ^C
    --- 194.126.xx.xx ping statistics ---
    3 packets transmitted, 3 packets received, 0.0% packet loss
    round-trip min/avg/max/stddev = 372.199/372.262/372.317/0.049 ms
    
    
    $ ping -S 103.246.136.1 194.126.xx.xx
    PING 194.126.xx.xx (194.126.xx.xx) from 103.246.136.1: 56 data bytes
    ^C
    --- 194.126.xx.xx ping statistics ---
    4 packets transmitted, 0 packets received, 100.0% packet loss
    
        Figure 3.  Example of ping probes
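
The comparison shown in Figure 3 could be scripted along the following lines. This is a sketch under stated assumptions: it uses the BSD-style `-S` source address option that appears in the figure (other ping implementations may spell this option differently), it must run on a host that has the test address configured, and the function names are hypothetical.

```python
import subprocess


def ping_command(target, source=None, count=4):
    """Build a BSD-style ping command line; -S selects the source address."""
    cmd = ["ping", "-c", str(count)]
    if source is not None:
        cmd += ["-S", source]
    return cmd + [target]


def responds(target, source=None, count=4, timeout=30):
    """Return True if ping exits with status 0, i.e. a reply was received."""
    try:
        done = subprocess.run(ping_command(target, source, count),
                              capture_output=True, timeout=timeout)
    except (subprocess.TimeoutExpired, OSError):
        return False
    return done.returncode == 0


def looks_filtered(target, test_source):
    """True when the target answers normally but not from the test address."""
    return responds(target) and not responds(target, source=test_source)
```

A target that does not answer ping at all yields no information either way, which is why `looks_filtered` requires a successful reply to the unmodified probe first.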
    

The second question is where the client is positioned in the inter-AS routing space. In this case we are interested in establishing whether a number of the AS’s that show evidence of a bogon filter share a common AS path segment, or whether there is no common path segment.

 

           AS Path from Server to Client
      AS   Common Upstream Path   AS Transit Path
   23771   4608 1221 4637         4134 4837 18118 23771
   38283   4608 1221 4637         4134 38283
   45143   4608 1221 4637         7473 9506 45143
    4835   4608 1221 4637         4134 4835
    6983   4608 1221 4637         2828 6983
    9051   4608 1221 4637         3356 42020 9051
   14868   4608 1221 4637         3549 14868
   19332   4608 1221 4637         174 19332
   23252   4608 1221 4637         3356 23252
   25248   4608 1221 4637         3549 5588 25248
   28142   4608 1221 4637         3549 14868 28142
   28378   4608 1221 4637         3356 19332 28378
   28678   4608 1221 4637         3356 8501 28678
   39566   4608 1221 4637         6453 24724 15694 39566
   41088   4608 1221 4637         174 41088
   43153   4608 1221 4637         3320 20804 43153
   44651   4608 1221 4637         3356 8928 47169 44651
   49289   4608 1221 4637         3549 5588 8246 49289
  197100   4608 1221 4637         174 197100

There is no clear evidence of the systematic deployment of packet filters in the AS transit paths to these origin AS’s.
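
One way to examine the path table is to strip the segment that every path shares and then count how the remaining paths spread across transit AS’s. The sketch below does this with the paths listed above; the variable and function names are illustrative, and the code reports the spread without itself deciding where a filter sits.

```python
from collections import Counter

# Origin AS followed by the full AS path from the server, as tabulated above.
TABLE = """
23771  4608 1221 4637 4134 4837 18118 23771
38283  4608 1221 4637 4134 38283
45143  4608 1221 4637 7473 9506 45143
4835   4608 1221 4637 4134 4835
6983   4608 1221 4637 2828 6983
9051   4608 1221 4637 3356 42020 9051
14868  4608 1221 4637 3549 14868
19332  4608 1221 4637 174 19332
23252  4608 1221 4637 3356 23252
25248  4608 1221 4637 3549 5588 25248
28142  4608 1221 4637 3549 14868 28142
28378  4608 1221 4637 3356 19332 28378
28678  4608 1221 4637 3356 8501 28678
39566  4608 1221 4637 6453 24724 15694 39566
41088  4608 1221 4637 174 41088
43153  4608 1221 4637 3320 20804 43153
44651  4608 1221 4637 3356 8928 47169 44651
49289  4608 1221 4637 3549 5588 8246 49289
197100 4608 1221 4637 174 197100
"""

PATHS = [[int(asn) for asn in row.split()[1:]]
         for row in TABLE.strip().splitlines()]


def common_prefix(paths):
    """Return the leading AS path segment shared by every path."""
    prefix = []
    for hops in zip(*paths):
        if len(set(hops)) != 1:
            break
        prefix.append(hops[0])
    return prefix


COMMON = common_prefix(PATHS)
# First AS after the shared segment, counted across all listed origin AS's.
FIRST_TRANSIT = Counter(path[len(COMMON)] for path in PATHS)
```

The shared segment is the same for every path, including paths to AS’s that showed no filtering, so it cannot be the location of the filter. Beyond it, the 19 paths fan out across 8 different next-hop AS’s, none of which carries more than 5 of them.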

Conclusions

There is no obvious evidence that either of the tested prefixes (110.76.136.0/22 and 103.246.136.0/22) is the subject of widespread bogon filtering in the Internet.

There was some observed evidence of filtering of these prefixes: of a total of 507,450 tests, some 329 tests, or 0.065% of the test population, showed evidence of filtering.

There is also some evidence that the prefix 103.246.136.0/22 has a slightly greater level of filtering than the 110.76.136.0/22 prefix. This could be due to the fact that 110.0.0.0/8 was allocated to APNIC in November 2008, while 103.0.0.0/8 was allocated to APNIC in February 2011. This may be the result of irregularly maintained bogon filters where the filter has not been updated to reflect the current IANA address allocation status.

The experimental technique of using advertisement placement to perform a large number of client-based reachability tests appears to be a relatively effective test methodology. It does not require investment in deployment of a large set of test units across the Internet, and because it uses a client-initiated test methodology it bypasses the limitations of NAT traversal that are associated with server-side initiated reachability tests. The ad placement approach also circumvents the common forms of packet filtering of ICMP, as all the test traffic sits within TCP port 80 transactions.

There is further work to be done to consider how representative this set of tested clients is in terms of the entire population of Internet clients, but as an initial exercise in showing how advertisement placement can be used as a useful mechanism to support some forms of network measurement, this has been a highly informative exercise.