NANOG 63: The DDoS Conversation

In some ways the details of any specific DDoS attack are less material than the larger picture. The Internet has always had this latent vulnerability: the aggregate I/O capacity of the edge is massively larger than that of the interior, and the combined output of multiple edges has always exceeded the input capacity of any single edge.
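
Some illustrative arithmetic makes the scale of this asymmetry clear (the figures below are hypothetical, chosen purely to show the orders of magnitude involved):

    # Illustrative arithmetic only: these figures are hypothetical, chosen
    # to show the scale of the edge/interior imbalance described above.
    edge_hosts = 100_000        # cooperating (or compromised) edge devices
    edge_uplink_mbps = 10       # modest per-device upstream capacity
    victim_link_mbps = 10_000   # a well-provisioned 10Gbps victim link

    aggregate_mbps = edge_hosts * edge_uplink_mbps
    print(f"aggregate edge output: {aggregate_mbps:,} Mbps")   # 1,000,000 Mbps
    print(f"victim link capacity:  {victim_link_mbps:,} Mbps") # 10,000 Mbps
    print(f"oversubscription:      {aggregate_mbps // victim_link_mbps}x")  # 100x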

A DDoS attack exploits this imbalance, driving multiple edges in a way that saturates either an interior network component or a selected victim edge device.

The attacker has a variety of exploits here, and addressing one weakness simply allows others to be exploited. The current vogue is the massive bandwidth attack using simple query/response UDP protocols where the response is larger than the query, sent with a spoofed source address. Send enough queries to enough servers at a fast enough rate and the victim is overwhelmed with useless traffic. The DNS UDP query protocol has been used in this manner, as has the NTP time protocol. You would’ve thought that folk would not expose their SNMP ports, and at the very least would use decent protection, but evidently SNMP is exploitable, as are chargen and similar services. The mantra we repeat to ourselves is that we could stop this entire form of attack by isolating the query packets, and the signature of these packets is a spoofed source address. So if we all performed egress filtering using BCP 38 and prevented such packets from leaving every network, then the network would again be pristine. Yes? Not really.
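
The check BCP 38 asks of every network is conceptually trivial. A minimal sketch, assuming a hypothetical forwarding hook and example prefixes: a packet may leave a network only if its source address belongs to a prefix that network legitimately originates.

    from ipaddress import ip_address, ip_network

    # Hypothetical: the prefixes this network legitimately originates.
    OUR_PREFIXES = [ip_network("192.0.2.0/24"), ip_network("198.51.100.0/24")]

    def egress_permitted(src_addr: str) -> bool:
        """Forward an outbound packet only if its source address is ours."""
        src = ip_address(src_addr)
        return any(src in prefix for prefix in OUR_PREFIXES)

    # A reflection query spoofing a victim's address never leaves the network:
    assert egress_permitted("192.0.2.17")        # genuine local source: forwarded
    assert not egress_permitted("203.0.113.5")   # spoofed source: dropped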

TCP has its own issues. SYN flooding is now a quite venerable attack vector, but this form of attack can still be effective in blocking legitimate connection attempts if the incoming packet rate is sufficiently high, and SYN flooding is possible using source address spoofing as well. A variant of this is for the attacking system to complete the connection handshake and then hold the connection open, consuming the resources of the server. Because the handshake must be completed, this variant cannot rely on spoofed addresses: it requires taking over a large set of compromised end hosts and orchestrating them all to perform the connection. Unfortunately these zombie armies have been a persistent “feature” of the Internet for many years.
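
A classic defence against the half-open state exhaustion of SYN flooding, not raised above but worth sketching, is the SYN cookie: the server encodes the connection parameters into its initial sequence number instead of allocating state on each SYN. A minimal sketch, with a hypothetical key and a much-simplified encoding:

    import hashlib
    import hmac
    import time

    SECRET = b"hypothetical-server-key"

    def syn_cookie(src: str, sport: int, dst: str, dport: int) -> int:
        """Derive a 32-bit initial sequence number encoding the connection."""
        slot = int(time.time()) >> 6    # coarse 64-second timestamp slots
        msg = f"{src}:{sport}>{dst}:{dport}@{slot}".encode()
        return int.from_bytes(hmac.new(SECRET, msg, hashlib.sha256).digest()[:4], "big")

    def cookie_valid(isn: int, src: str, sport: int, dst: str, dport: int) -> bool:
        """Statelessly validate a returned ACK against recent timestamp slots."""
        now = int(time.time()) >> 6
        for slot in (now, now - 1):
            msg = f"{src}:{sport}>{dst}:{dport}@{slot}".encode()
            if isn == int.from_bytes(hmac.new(SECRET, msg, hashlib.sha256).digest()[:4], "big"):
                return True
        return False

    # The server holds no state at SYN time; a returned ACK either carries
    # a cookie the server can recompute, or it is discarded.
    isn = syn_cookie("198.51.100.7", 50000, "192.0.2.80", 80)
    assert cookie_valid(isn, "198.51.100.7", 50000, "192.0.2.80", 80)

Note that this defends only the spoofed-SYN variant; the completed-handshake attack described above deliberately sidesteps it, which is why it demands real compromised hosts.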

So the attacks get larger and larger. What happens then?

If you can’t stop these attacks then you have to absorb them.

Which means deploying massive capacity in your network and across your servers in a way that can absorb the attack traffic and still remain responsive to “genuine” traffic. With significant levels of investment in infrastructure capacity this is a viable approach, but it has the effect of raising the ante. If you operate a server whose availability is in some sense critical, then you can no longer host it on your own local edge infrastructure, and picking the cheapest hosting solution in some cloud or other is also a high-risk path. If you want resiliency then you have little choice but to use a hosting provider who has made major investments in capacity and skills, and has the capability to provide the service in the face of such attacks. These attacks are creating differentiated online neighbourhoods: those who can afford premium services can purchase effective protection from such virulent attacks, while those who cannot afford such highly resilient service platforms operate in a far more exposed mode, without any real form of effective protection.

But of course this differentiation in hosting is also apparent in the ISP industry itself. Smaller local providers are being squeezed by the same means: in order to survive such attacks they are being forced to purchase “scrubbing” services from far larger providers, who are in a position to absorb the incoming traffic and pass on only the so-called “genuine” traffic.

The result is, of course, further pressure to concentrate resources within the industry: the larger providers with capacity and skills can respond to this onslaught of attack traffic, while the smaller providers simply cannot. They are marginalised, and ultimately will be squeezed out if there is no effective way to ameliorate such attacks without resorting to a “big iron” solution.

The picture is not entirely bleak, or at least not yet. There have been some attempts to provide an attack feedback loop, in which a victim can pass outward a profile of the attack traffic, and neighbouring networks can filter transit traffic based on this profile and pass the filter onward in turn. One of the more promising approaches is to co-opt BGP and, instead of flooding reachability information, use it to propagate a filter profile of the attack traffic (BGP Flow Specification, RFC 5575).
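
As an illustration of the idea, here is a minimal sketch modelling a simplified subset of the RFC 5575 match components (the rule, addresses and port numbers are hypothetical, and this is a conceptual model rather than the wire format): the victim announces a traffic profile rather than a prefix, and each network that accepts it installs the profile as a filter.

    from dataclasses import dataclass
    from ipaddress import ip_address, ip_network

    @dataclass
    class FlowSpecRule:
        """A simplified subset of the RFC 5575 match components."""
        dst_prefix: str   # destination-prefix component
        protocol: int     # IP protocol component (17 = UDP)
        src_port: int     # source-port component
        action: str       # traffic action, e.g. "discard" or a rate limit

    # Hypothetical rule a victim might propagate during an NTP reflection
    # attack: discard UDP traffic sourced from port 123 toward the victim.
    rule = FlowSpecRule("203.0.113.10/32", 17, 123, "discard")

    def matches(r: FlowSpecRule, dst: str, proto: int, sport: int) -> bool:
        return (ip_address(dst) in ip_network(r.dst_prefix)
                and proto == r.protocol
                and sport == r.src_port)

    assert matches(rule, "203.0.113.10", 17, 123)      # attack traffic: discarded
    assert not matches(rule, "203.0.113.10", 6, 443)   # normal traffic: passes

The attraction of this approach is that the filter is as narrowly scoped as the victim can describe it, and it is pushed toward the sources rather than applied only at the saturated edge.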