How Big is that Network?

How “big” is a network? How many customers are served by an Internet Service Provider?

While some network operators openly publish such numbers, others regard them as commercially sensitive information. A number of techniques are used to estimate the relative size of each Service Provider from public information sources, including the number of IP addresses announced by the network, the number of transit customers who use the network, and so on. However, the widespread use of NATs in IPv4, the varying IPv6 address plans used by IPv6 service providers, and the varying use of Autonomous Systems (ASes) by retail Service Providers add considerable uncertainty to such indirect measurement exercises.

The approach described here uses data generated by an online advertisement placement system as the basic input to a measurement process. In this case Google’s Ad delivery network has been used in a long-running, non-targeted advertisement placement program aimed at collecting a very large set of users’ IP addresses over an extended period. We use this to measure the deployment of IPv6, the extent of use of DNSSEC validation in the Internet, and the adoption of various other technologies. Using data from the BGP routing system, each user IP address gathered from the ad placement can be mapped to an originating AS number.
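
The mapping from an IP address to an originating AS can be sketched as a longest-prefix match against a table of announced prefixes. The following is a minimal illustration only, not the tooling used for this report: the function names, the (prefix, origin AS) input format, and the sample routes (documentation prefixes and private-use AS numbers) are all assumptions made for the example.

```python
import ipaddress

def build_table(routes):
    """Build a lookup table from (prefix, origin_asn) pairs.

    The pair format is a hypothetical simplification of a BGP table dump.
    """
    table = [(ipaddress.ip_network(prefix), asn) for prefix, asn in routes]
    # Most-specific prefixes first, so the first match is the longest match.
    table.sort(key=lambda entry: entry[0].prefixlen, reverse=True)
    return table

def origin_as(table, address):
    """Return the origin AS of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(address)
    for network, asn in table:
        if network.version == addr.version and addr in network:
            return asn
    return None

# Illustrative routes only: documentation prefixes and reserved AS numbers.
sample_routes = [
    ("192.0.2.0/24", 64500),
    ("192.0.2.128/25", 64503),
    ("198.51.100.0/24", 64501),
    ("2001:db8::/32", 64502),
]
sample_table = build_table(sample_routes)
```

A linear scan is used here for clarity; a table the size of the full BGP routing table would normally call for a radix tree or similar structure.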

If the advertisement placement strategy were such that each part of the Internet was targeted uniformly for ad placement, irrespective of location, then these counts of advertisement impressions per origin AS would be a good indicator of the relative size of each AS in terms of the population of customers served by each AS. However, this is not the case: we have observed that advertisements are placed with different levels of relative intensity in different countries, and to compile a uniform estimate of the customer population served by each AS we need to compensate for this.

The data set used to normalise the original ad impression numbers on a country-by-country basis is the estimate of Internet Users per country, published by the ITU-T (http://www.itu.int/net4/itu-d/icteye/). We assume that the Google ad placement process is uniformly distributed within each country, and assume that the ITU-T estimates of user population are an acceptable estimate. This then allows us to estimate the relative size of each AS in terms of the estimated population of users served by each AS.
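
As a rough sketch of this normalisation, under the two assumptions stated above, each AS can be assigned a share of its country's estimated user population in proportion to its share of that country's ad impressions. The function name, the input dictionaries, and the sample figures below are hypothetical and chosen only to illustrate the arithmetic; they are not taken from the report.

```python
from collections import defaultdict

def estimate_as_users(impressions, country_users):
    """Estimate users per AS from ad impressions and per-country user counts.

    impressions:   {(country_code, asn): impression_count}
    country_users: {country_code: estimated_internet_users}
    Both input shapes are assumptions made for this sketch.
    """
    # Total impressions observed in each country.
    country_totals = defaultdict(int)
    for (cc, _asn), count in impressions.items():
        country_totals[cc] += count

    estimates = {}
    for (cc, asn), count in impressions.items():
        total = country_totals[cc]
        if total == 0 or cc not in country_users:
            continue  # no basis for an estimate in this country
        # The AS's share of the country's impressions, scaled to its users.
        estimates[(cc, asn)] = country_users[cc] * count / total
    return estimates

# Invented sample figures with placeholder country codes and AS numbers.
sample_impressions = {("AA", 64500): 300, ("AA", 64501): 100, ("BB", 64502): 50}
sample_users = {"AA": 1_000_000, "BB": 200_000}
sample_estimates = estimate_as_users(sample_impressions, sample_users)
```

In this invented example, AS 64500 accounts for three quarters of the impressions seen in country "AA" and so is assigned three quarters of that country's estimated users.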

This is an estimate of customer populations per AS; it makes a number of critical assumptions, and has a number of weaknesses. This approach assumes that each AS is located within a single country, and that its customers are also located in that same country. While this is so in many cases, there are a number of cases where large retail service providers span a number of countries with a single AS. This approach also does not use secure connections to the measurement server. While care has been taken to use unique URLs in the measurement, it still admits the use of web proxy middleware, and the measurement approach is biased towards overcounting in networks that use web proxy services. This is particularly a problem when the web proxy is located in a different AS than the end customers. The instrumentation in the ad is not accessible on all forms of mobile devices, and this approach tends to undercount the customers in service networks with high populations of mobile users.

As well as an Internet-wide report, there are also views on a country-by-country basis, but here the assumptions behind the measurement add to the uncertainty levels, particularly the assumption that ASes and their customers are located in a single country. When we look at individual countries, multi-national ASes and customer networks tend to skew this form of measurement.

This report is accessible at http://stats.labs.apnic.net/aspop.