Detecting IP Address Filters

Until recently IP network operators were encouraged to set up so-called “bogon address filters” at the edge of their networks. These filters were intended to discard all incoming traffic where the source address in the IP header was from a block of addresses that was known to be unallocated. The inference was that a matching packet was either an unintentional leak from some privately addressed network domain or was generated using source address spoofing. In either case there is no point in delivering the packet, since it comes from a demonstrably fictitious source.

The initial use of these bogon filters was to mark those /8 address blocks in the IPv4 address space that were held by the IANA. Some tools were subsequently developed (and deployed in more limited contexts) to also mark as bogons those address blocks held as unallocated by the Regional Internet Registries (RIRs), and other tools were developed to mark address blocks that were consistently used to originate spam or malicious traffic, based on collected statistics and other heuristics.

A persistent issue with the use of these bogon filters is that they can lose currency. When addresses change from “unallocated” to “in-use” (such as an IANA allocation action, as happened in the past, or a RIR assignment or allocation of addresses to a Local Internet Registry) or when an address block already in use changes hands, then the rationale for maintaining the entry in a bogon filter almost certainly changes.
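The loss of currency can be illustrated with a minimal Python sketch. The filter lists below are hypothetical and are not taken from any operator's configuration; 103.246.136.1 is one of the test addresses used later in this article, and 240.0.0.0/4 is included only as an example of an entry that would not need to change.

```python
import ipaddress

def is_bogon(source, bogon_blocks):
    """Return True if the source address falls inside any listed bogon block."""
    addr = ipaddress.ip_address(source)
    return any(addr in ipaddress.ip_network(block) for block in bogon_blocks)

# Hypothetical filter written while 103.0.0.0/8 was still held by the IANA.
stale_filter = ["103.0.0.0/8", "240.0.0.0/4"]
# The same hypothetical filter after the 103.0.0.0/8 entry has been withdrawn.
updated_filter = ["240.0.0.0/4"]

print(is_bogon("103.246.136.1", stale_filter))    # dropped by the stale filter
print(is_bogon("103.246.136.1", updated_filter))  # accepted once updated
```

A network still running the stale list would silently discard traffic from the newly allocated block, which is the condition the tests described below try to detect.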

Another major issue with bogon filters is that, in general, they are not centrally managed. Many ISPs use local procedures to maintain their bogon filters. So when an address block changes hands, how can the new address holder work out where there are bogon filters that list the address block, and how can they get those filters changed?

Some of the Regional Internet Registries, notably the RIPE NCC, have been operating a “Debogon Project” where addresses from selected address blocks are advertised, and end users can test reachability into these address blocks via conventional use of ping and traceroute. Additional active tests can be carried out from the announced blocks to known test networks and devices, performing a test in the opposite direction.

The approach described here is somewhat different, in that it does not rely on the use of cooperating users, nor on the deployment of probes or other network edge devices. Instead, we have used online advertisement placement to co-opt end users to undertake basic reachability tests as a background test in their browsers. The remainder of the document describes the methodology used in these tests and the results.

Reachability Test Setup

The approach used here is to employ the Internet at large to perform the reachability tests. The basic method is to get the client’s browser to fetch a number of 1×1 pixel image files under the control of a script that was obtained from a known working IP address. All image files, except the final image, are sourced from domain names that map to IP addresses drawn from the address blocks being tested. The final image is sourced from the same address block as that from which the experiment was originally delivered, so it acts as an end marker to indicate that the user has not interrupted the browser’s actions midway through the tests.

This test methodology can be implemented in a number of ways, including in JavaScript or in Flash code embedded into a web page. However, with such an approach the clients who undertake the reachability test are restricted to those who visit a web page that has the test included in its source. As we wanted to test a collection of clients from across the entire Internet, we were interested in a way to present these tests more broadly than would be possible with a small number of appropriately equipped web sites.

The approach taken here is to use Google’s Image Advertisements as the vehicle for presenting the reachability tests to end client browsers. Embedded in the ad material is a segment of Flash code. The Flash code is executed by the client’s browser whenever the ad is delivered to the client, and the code will execute irrespective of whether the ad is “clicked” or not. The Flash code fetches a control file from the experiment controller, using a control IP address. This control file includes a number of image URLs, which form the core of the experiment. An example of the control URL and the generated file is shown in Figure 1.


http://www.a.rqa.rand.apnic.net/measureipv6.cgi

rqaa	http://t2.u2980881285.s1326250419.i1110.v1010.a.rqa.rand.apnic.net/1x1.png?
            t2.u2980881285.s1326250419.i1110.v1010.rqaa
rqab	http://t2.u2980881285.s1326250419.i1110.v1010.b.rqa.rand.apnic.net/1x1.png?
            t2.u2980881285.s1326250419.i1110.v1010.rqab
rqac	http://t2.u2980881285.s1326250419.i1110.v1010.c.rqa.rand.apnic.net/1x1.png?
            t2.u2980881285.s1326250419.i1110.v1010.rqac
results	http://results.c.rqa.rand.apnic.net/1x1.png?t2.u2980881285.s1326250419.
            i1110.v1010&r=

    Figure 1. Example of an Experiment Control File

In this example there are two address reachability tests being conducted (“rqaa” and “rqab”), and a control test (“rqac”). The parameters to the test include the identification of the test version, the time the test was conducted and a random number that allows us to recombine the individual experiments back into a single test set when performing analysis of the web server’s log files. The random number in the domain name also prevents pre-cached experiments from being invoked, since every test-set has a different unique DNS name using this method.
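As a minimal sketch of how such a label could be split back into its parameters during log analysis, the Python below treats each dot-separated field as a one-letter key followed by a value. The function name `parse_label` is hypothetical, and the meaning attached to each key is an assumption. The one interpretation that can be checked against the article is `s`: the value 1325462498 that appears in the log of Figure 2 converts to 02 Jan 2012 00:01:38 UTC, which matches the log timestamps, so `s` appears to be the time of the test in Unix seconds.

```python
from datetime import datetime, timezone

def parse_label(label):
    """Split 't2.u298...s132...i1110.v1010.rqaa' into (parameters, test name)."""
    *fields, test = label.split(".")
    # Each field is assumed to be a one-letter key followed by its value.
    params = {field[0]: field[1:] for field in fields}
    return params, test

params, test = parse_label("t2.u2980881285.s1326250419.i1110.v1010.rqaa")
print(test, params)

# Assumption: 's' is a Unix timestamp in seconds.
when = datetime.fromtimestamp(int(params["s"]), tz=timezone.utc)
print(when.isoformat())
```

Everything before the final field is shared by all fetches in one test set, so that part of the label can serve as the key for recombining the individual fetches.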

The images are 1×1 pixel images. They are fetched asynchronously from page rendering and are not included in the displayed page. This minimizes any additional delay on the page view from the testing, so that the test largely takes place without any visible impact on the user’s browser. The same mechanism is employed by Google Analytics and other page view collection methods.

In this experiment three tests use three wildcard domain names, namely *.a.rqa.rand.apnic.net, *.b.rqa.rand.apnic.net, and *.c.rqa.rand.apnic.net. These domain names map to the following IP addresses, from within the associated network address blocks:

Test	Address	Address Prefix
rqaa	110.76.136.1	110.76.136.0/22
rqab	103.246.136.1	103.246.136.0/22
rqac	203.133.248.12	203.133.248.0/24 (control network)

The server environment was configured such that these test address prefixes were being advertised as reachable from this server, and the server itself was configured with secondary IP addresses that match the three rqa addresses.

The Apache web server of the system was configured with a wildcard virtual server host of the form *.rqa.rand.apnic.net, so that all the images are served from the same server.

The Apache web server logs were used for reachability analysis.

If a client successfully performs the entire test then the web logs will contain a sequence of records, an example of which is shown in Figure 2.


110.67.xxx.xxx - - [02/Jan/2012:00:01:38 +0000] "GET /crossdomain.xml
	HTTP/1.1" 200 726 "-"
110.67.xxx.xxx - - [02/Jan/2012:00:01:38 +0000] "GET /measureipv6.cgi?&hash=3103003790
 	HTTP/1.1" 200 140
 	"http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgICQro9QEN" 
110.67.xxx.xxx - - [02/Jan/2012:00:01:39 +0000] "GET /crossdomain.xml
	HTTP/1.1" 200 726 "-" 
110.67.xxx.xxx - - [02/Jan/2012:00:01:39 +0000] "GET /1x1.png?t2.u940792609.
	s1325462498.i1110.v1010.rqab 
	HTTP/1.1" 200 157 
	"http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgICQro9QEN" 
110.67.xxx.xxx - - [02/Jan/2012:00:01:40 +0000] "GET /crossdomain.xml 
	HTTP/1.1" 200 726 "-"
110.67.xxx.xxx - - [02/Jan/2012:00:01:41 +0000] "GET /1x1.png?t2.u940792609.
	s1325462498.i1110.v1010.rqac
	HTTP/1.1" 200 157
	"http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgICQro9QEN"
110.67.xxx.xxx - - [02/Jan/2012:00:01:41 +0000] "GET /crossdomain.xml 
	HTTP/1.1" 200 726 "-"
110.67.xxx.xxx - - [02/Jan/2012:00:01:41 +0000] "GET /1x1.png?t2.u940792609.
	s1325462498.i1110.v1010.rqaa
 	HTTP/1.1" 200 157
	"http://pagead2.googlesyndication.com/pagead/imgad?id=CICAgICQro9QEN" 

    Figure 2. The Web Server's Log of a Reachability Test

Because the ad itself is served from a remote site, all fetches from the control server are in fact two fetches, the first for the crossdomain policy file to determine if fetches from this site are permitted, and the second being the data file itself.

In the example shown here the experiment generated 8 fetches. The first pair of fetches retrieved the list of URIs to fetch (measureipv6.cgi). The next 3 pairs of fetches performed the reachability test itself.

What we are looking for are origin AS’s where there is a systematic failure to retrieve one or both of the first two images, yet a consistent ability to fetch from the control address.

Reachability Test Results

The test was run over 22 days. During that time 513,141 individual client IP addresses performed the reachability test. These IP addresses were sourced from 9,665 originating Autonomous Systems.

In order to eliminate potential problems with individual tests not completing (due to the client moving to a different webpage before the test was complete, for example), we have filtered out all originating AS’s where only 1 or 2 individual tests were performed. This left 507,121 tests, spanning 4,839 originating AS’s.
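The per-AS step can be sketched in Python as below. Several things here are assumptions: the mapping from client address to origin AS is taken as already done, "systematic failure" is read as a target never being reached by any test from that AS, only tests that reached the control are counted, and the AS numbers in the sample are from the range reserved for documentation. The function name `filtering_candidates` is hypothetical.

```python
from collections import defaultdict

def filtering_candidates(results, min_tests=3):
    """results: iterable of (origin_asn, set of test names fetched).

    Returns {asn: [targets never reached]} for AS's with at least
    min_tests tests that all reached the control address ('rqac').
    """
    by_as = defaultdict(list)
    for asn, tests in results:
        if "rqac" in tests:  # ignore tests that never reached the control
            by_as[asn].append(tests)

    flagged = {}
    for asn, runs in by_as.items():
        if len(runs) < min_tests:  # drop AS's with only 1 or 2 tests
            continue
        blocked = [t for t in ("rqaa", "rqab")
                   if not any(t in run for run in runs)]
        if blocked:
            flagged[asn] = blocked
    return flagged

sample = (
    [(64500, {"rqaa", "rqac"})] * 3             # never reaches "b"
    + [(64501, {"rqaa", "rqab", "rqac"})] * 3   # reaches everything
    + [(64502, {"rqac"})] * 2                   # too few tests to judge
)
print(filtering_candidates(sample))
```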

3 AS’s were unable to reach either the “a” or the “b” target (indicating some form of filter on both the 110.76.136.0/22 and 103.246.136.0/22 address blocks). These AS’s were:

AS	Number of Tests	Country	AS Description
23771	26	CN	SXBCTV-AP SXBCTV ,Internet Service Provider
38283	3	CN	CHINANET-SCIDC-AS-AP CHINANET SiChuan Telecom Internet Data Center
45143	50	SG	SINGTELMOBILE-AS-AP SINGTEL MOBILE INTERNET SERVICE PROVIDER

No origin AS with more than 2 individual experiments was able to reach the “b” target but unable to reach the “a” target.

16 AS’s were able to reach the “a” target, but unable to reach the “b” target (the prefix 103.246.136.0/22 appears to be filtered). These AS’s were:

AS	Number of Tests	Country	AS Description
4835	10	CN	CHINANET-IDC-SN China Telecom (Group)
6983	9	US	ITCDELTA – ITC Deltacom
9051	17	LB	IncoNet Data Management sal
14868	55	BR	Companhia Paranaense de Energia – COPEL
19332	7	MX	Marcatel Com, S.A. de C.V.
23252	3	CA	IK – WTC Communications
25248	77	CZ	BLUETONE-AS Ceske Radiokomunikace a.s.
28142	4	BR	DIGITAL DESIGN – SERVICOS DE INFORMATICA LTDA
28378	4	MX	TV Rey de Occidente, S.A. de C.V.
28678	3	PL	KOSMAN-EDU-AS Technical University of Koszalin
39566	12	PL	TRUSTNET-PL-AS trustnet.pl / smarthost.pl hosting datacenter
41088	13	CZ	CZNSYS N-SYS s.r.o.
43153	9	PL	SFERANET-AS SferaNET Sp. z o.o.
44651	12	HU	COMUNIQUE Com.unique Telekommunikacios Szolgaltato Kft.
49289	11	PL	NITKA-NET ELPRO – Elektronika Profesjonalna Waldemar Nitka
197100	4	ES	Esystel Servicios Multimedia, SL

If one were to assume that this is a representative sample of reachability by origin AS, then of the total population of 39,908 AS’s some 66 AS’s could be assumed to be filtering 103.246.136.0/22, and 12 AS’s could be assumed to be filtering both 110.76.136.0/22 and 103.246.136.0/22.
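The arithmetic behind these estimates is not spelled out in the text. The Python sketch below shows one reading that reproduces both figures: scaling the observed counts by the ratio of the total AS population to the 9,665 originating AS's seen in the test. That choice of denominator is an inference from the numbers, not a stated method.

```python
observed_as = 9_665   # originating AS's seen during the test
total_as = 39_908     # total AS population quoted above
b_only = 16           # AS's reaching "a" but not "b"
both = 3              # AS's reaching neither "a" nor "b"

def extrapolate(count):
    """Scale an observed AS count up to the whole AS population."""
    return round(count * total_as / observed_as)

print(extrapolate(b_only), extrapolate(both))
```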

Consideration of Experimental Results

The experiment raises a number of questions, which we will briefly discuss here.

The first question is, can these experimental results be verified by other means?

A simple means of validating the result is to use ping with the source address option. For a small number of addresses chosen at random from the client set, we were able to confirm that end systems that were responsive to ping packets sourced from the “control” address of 203.133.248.12 were unresponsive to pings that used the tested addresses as a source address. Not all addresses are responsive to ping probes, due to firewalls and filters, so this is by no means a precise form of validation.


$ ping 194.126.xx.xx
PING 194.126.xx.xx (194.126.xx.xx): 56 data bytes
64 bytes from 194.126.xx.xx: icmp_seq=0 ttl=41 time=372.199 ms
64 bytes from 194.126.xx.xx: icmp_seq=1 ttl=41 time=372.317 ms
64 bytes from 194.126.xx.xx: icmp_seq=2 ttl=41 time=372.270 ms
^C
--- 194.126.xx.xx ping statistics ---
3 packets transmitted, 3 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 372.199/372.262/372.317/0.049 ms


$ ping -S 103.246.136.1 194.126.xx.xx
PING 194.126.xx.xx (194.126.xx.xx) from 103.246.136.1: 56 data bytes
^C
--- 194.126.xx.xx ping statistics ---
4 packets transmitted, 0 packets received, 100.0% packet loss

    Figure 3.  Example of ping probes

The second question concerns where the client may be positioned in the inter-AS routing space. Here we are interested in establishing whether the AS’s that show evidence of a bogon filter share a common AS path segment.

AS		AS Path from Server to Client
	Common Upstream Path				AS Transit Path
23771		4608	1221	4637	4134	4837	18118	23771
38283		4608	1221	4637	4134	38283
45143		4608	1221	4637	7473	9506	45143
4835		4608	1221	4637	4134	4835
6983		4608	1221	4637	2828	6983
9051		4608	1221	4637	3356	42020	9051
14868		4608	1221	4637	3549	14868
19332		4608	1221	4637	174	19332
23252		4608	1221	4637	3356	23252
25248		4608	1221	4637	3549	5588	25248
28142		4608	1221	4637	3549	14868	28142
28378		4608	1221	4637	3356	19332	28378
28678		4608	1221	4637	3356	8501	28678
39566		4608	1221	4637	6453	24724	15694	39566
41088		4608	1221	4637	174	41088
43153		4608	1221	4637	3320	20804	43153
44651		4608	1221	4637	3356	8928	47169	44651
49289		4608	1221	4637	3549	5588	8246	49289
197100		4608	1221	4637	174	197100

There is no clear evidence of systematic deployment of packet filters in the AS transit paths to these origin AS’s.
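The comparison of paths can be reproduced with a short Python sketch over the table above. It finds the path segment shared by every entry and then counts the first AS after that segment. The helper name `common_prefix` is hypothetical.

```python
from collections import Counter

# AS paths from the server to each origin AS, copied from the table above.
PATHS = {
    23771: [4608, 1221, 4637, 4134, 4837, 18118, 23771],
    38283: [4608, 1221, 4637, 4134, 38283],
    45143: [4608, 1221, 4637, 7473, 9506, 45143],
    4835: [4608, 1221, 4637, 4134, 4835],
    6983: [4608, 1221, 4637, 2828, 6983],
    9051: [4608, 1221, 4637, 3356, 42020, 9051],
    14868: [4608, 1221, 4637, 3549, 14868],
    19332: [4608, 1221, 4637, 174, 19332],
    23252: [4608, 1221, 4637, 3356, 23252],
    25248: [4608, 1221, 4637, 3549, 5588, 25248],
    28142: [4608, 1221, 4637, 3549, 14868, 28142],
    28378: [4608, 1221, 4637, 3356, 19332, 28378],
    28678: [4608, 1221, 4637, 3356, 8501, 28678],
    39566: [4608, 1221, 4637, 6453, 24724, 15694, 39566],
    41088: [4608, 1221, 4637, 174, 41088],
    43153: [4608, 1221, 4637, 3320, 20804, 43153],
    44651: [4608, 1221, 4637, 3356, 8928, 47169, 44651],
    49289: [4608, 1221, 4637, 3549, 5588, 8246, 49289],
    197100: [4608, 1221, 4637, 174, 197100],
}

def common_prefix(paths):
    """Return the leading AS sequence shared by every path."""
    prefix = []
    for hops in zip(*paths):
        if len(set(hops)) != 1:
            break
        prefix.append(hops[0])
    return prefix

shared = common_prefix(PATHS.values())
first_transit = Counter(path[len(shared)] for path in PATHS.values())
print(shared)
print(first_transit.most_common())
```

The only segment shared by all paths is the server's own upstream, and beyond it the paths spread across eight different transit AS's, which fits the reading that the filters sit at or near the origin AS's rather than in a shared transit network.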

Conclusions

There is no obvious evidence that either of the tested prefixes (110.76.136.0/22 and 103.246.136.0/22) is the subject of widespread bogon filtering in the Internet.

There was some observed evidence of filtering of these prefixes: of a total of 507,450 tests, some 329 tests, or 0.065% of the test population, showed evidence of filtering.
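These figures can be checked against the two result tables with a few lines of Python. The count of 329 is the sum of the per-AS test counts listed there. The total of 507,450 used here equals the 507,121 tests reported earlier plus these 329, although the text does not say how the two figures relate.

```python
# Per-AS test counts from the two result tables.
both_filtered = [26, 3, 50]
b_filtered = [10, 9, 17, 55, 7, 3, 77, 4, 4, 3, 12, 13, 9, 12, 11, 4]

affected = sum(both_filtered) + sum(b_filtered)
total_tests = 507_450
share = 100 * affected / total_tests

print(affected, f"{share:.3f}%")
```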

There is also some evidence that the prefix 103.246.136.0/22 has a slightly greater level of filtering than the 110.76.136.0/22 prefix. This could be due to the fact that 110.0.0.0/8 was allocated to APNIC in November 2008, while 103.0.0.0/8 was allocated to APNIC in February 2011. This may be the result of irregularly maintained bogon filters where the filter has not been updated to reflect the current IANA address allocation status.

The experimental technique of using advertisement placement to perform a large number of client-based reachability tests appears to be a relatively effective test methodology. It does not require investment in deployment of a large set of test units across the Internet, and because it uses a client-initiated test methodology it bypasses the NAT traversal limitations that are associated with server-side initiated reachability tests. The ad placement approach also circumvents the common forms of packet filtering of ICMP, as all the test traffic sits within TCP port 80 transactions.

There is further work to be done to consider how representative this set of tested clients is in terms of the entire population of Internet clients, but as an initial exercise in showing how advertisement placement can be used as a useful mechanism to support some forms of network measurement, this has been a highly informative exercise.
