The math behind Flash Worms

From Stuart Staniford, David Moore, Vern Paxson, & Nicholas Weaver’s “The Top Speed of Flash Worms” [PDF] (29 October 2004):

Flash worms follow a precomputed spread tree using prior knowledge of all systems vulnerable to the worm’s exploit. In previous work we suggested that a flash worm could saturate one million vulnerable hosts on the Internet in under 30 seconds [18]. We grossly over-estimated.

In this paper, we revisit the problem in the context of single packet UDP worms (inspired by Slammer and Witty). Simulating a flash version of Slammer, calibrated by current Internet latency measurements and observed worm packet delivery rates, we show that a worm could saturate 95% of one million vulnerable hosts on the Internet in 510 milliseconds. A similar worm using a TCP based service could 95% saturate in 1.3 seconds. …

Since Code Red in July 2001 [11], worms have been of great interest in the security research community. This is because worms can spread so fast that existing signature-based anti-virus and intrusion-prevention defenses risk being irrelevant; signatures cannot be manually generated fast enough …

The premise of a flash worm is that a worm releaser has somehow acquired a list of vulnerable addresses, perhaps by stealthy scanning of the target address space or perhaps by obtaining a database of parties to the vulnerable protocol. The worm releaser, in advance, computes an efficient spread tree and encodes it in the worm. This allows the worm to be far more efficient than a scanning worm; it does not make large numbers of wild guesses for every successful infection. Instead, it successfully infects on most attempts. This makes it less vulnerable to containment defenses based on looking for missed connections [7, 16, 24], or too many connections [20, 25]. …

A difficulty for the flash worm releaser is a lack of robustness if the list of vulnerable addresses is imperfect. Since it is assembled in advance, and networks constantly change, the list is likely to be more-or-less out of date by the time of use. This has two effects. Firstly, a certain proportion of actually vulnerable and reachable machines may not be on the list, thus preventing the worm from saturating as fully as otherwise possible. More seriously, some addresses on the list may not be vulnerable. If such nodes are near the base of the spread tree, they may prevent large numbers of vulnerable machines from being infected by the worm. Very deep spread trees are particularly prone to this. Thus in thinking about flash worms, we need to explore the issue of robustness as well as speed. …

The Slammer worm [10, 22] of January 2003 was the fastest scanning worm to date by far and is likely close to the lower bound on the size of a worm. Data on observed Slammer infections (and on those of the similar Witty worm) provide us with estimates for packet rate and minimum code size in future flash worms. Slammer infected Microsoft’s SQL server. A single UDP packet served as exploit and worm and required no acknowledgment. The size of the data was 376 bytes, giving a 404 byte IP packet. This consisted of the following sections:

• IP header
• UDP header
• Data to overflow buffer and gain control
• Code to find the addresses of needed functions.
• Code to initialize a UDP socket
• Code to seed the pseudo-random number generator
• Code to generate a random address
• Code to copy the worm to the address via the socket …
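
The step from 376 bytes of data to a 404 byte IP packet quoted above can be checked with a little arithmetic. The sketch below assumes a plain 20 byte IPv4 header (no options) and the fixed 8 byte UDP header; the constant and function names are illustrative, not from the paper.

```python
# Header sizes: IPv4 without options is 20 bytes, UDP is always 8 bytes.
IP_HEADER_BYTES = 20
UDP_HEADER_BYTES = 8

# Data size quoted in the excerpt above.
SLAMMER_DATA_BYTES = 376


def ip_packet_size(data_bytes: int) -> int:
    """Total IP packet size for a single UDP datagram carrying data_bytes."""
    return IP_HEADER_BYTES + UDP_HEADER_BYTES + data_bytes


print(ip_packet_size(SLAMMER_DATA_BYTES))  # 404
```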

In this paper, we assume that the target vulnerable population is N = 1000000 (one million hosts, somewhat larger than the 360,000 infected by Code Red [11]). Thus in much less than a second, the initial host can directly infect a first generation of roughly 5,000 – 50,000 intermediate nodes, leaving each of those with only 20-200 hosts to infect to saturate the population. There would be no need for a third layer in the tree.
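
The 20-200 range follows from dividing the remaining population among the first generation. A minimal sketch, assuming the hosts are split evenly across intermediate nodes (the function name is hypothetical):

```python
import math


def second_level_fanout(population: int, first_generation: int) -> int:
    """Hosts each intermediate node must reach in a two-level spread tree.

    Assumes the hosts not in the first generation are divided evenly
    among the intermediate nodes, rounding up.
    """
    remaining = population - first_generation
    return math.ceil(remaining / first_generation)


N = 1_000_000
for first_generation in (5_000, 50_000):
    print(first_generation, second_level_fanout(N, first_generation))
# 5000 199
# 50000 19
```

With 5,000 intermediate nodes each one covers about 200 hosts, and with 50,000 each covers about 20, which matches the range in the excerpt.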

This implies that the address list for the intermediate hosts can fit in the same packet as the worm; 200 addresses only consumes 800 bytes. A flash version of Slammer need only be slightly different than the original: the address list of nodes to be infected would be carried immediately after the end of the code, and the final loop could traverse that list sending out packets to infect it (instead of generating pseudo-random addresses). …
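
The 800 byte figure is simply 200 IPv4 addresses at 4 bytes each. The sketch below also estimates the resulting packet size under the assumption that the worm code stays at Slammer's 376 bytes and that the headers are a 20 byte IPv4 header plus an 8 byte UDP header; both are simplifications for illustration, not figures from the paper.

```python
IPV4_ADDRESS_BYTES = 4      # one IPv4 address
SLAMMER_DATA_BYTES = 376    # data size quoted earlier in the excerpt
HEADER_BYTES = 20 + 8       # assumed IPv4 (no options) + UDP headers


def address_list_bytes(address_count: int) -> int:
    """Bytes needed to carry address_count raw IPv4 addresses."""
    return address_count * IPV4_ADDRESS_BYTES


def flash_packet_bytes(address_count: int) -> int:
    """Rough IP packet size: headers + unchanged worm code + address list."""
    return HEADER_BYTES + SLAMMER_DATA_BYTES + address_list_bytes(address_count)


print(address_list_bytes(200))   # 800
print(flash_packet_bytes(200))   # 1204
```

Under these assumptions the packet stays below the common 1500 byte Ethernet MTU, which is consistent with the claim that the list fits in the same packet as the worm.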

The graph indicates clearly that such flash worms can indeed be extraordinarily fast, infecting 95% of hosts in 510ms, and 99% in 1.2s. There is a long tail at the end due to the long tail in Internet latency data; some parts of the Internet are poorly connected and take a few seconds to reach. …

Can these results be extended to TCP services? If so, then our results are more grave; TCP offers worm writers a wealth of additional services to exploit. In this section we explore these issues. We conclude that top-speed propagation is viable for TCP worms, too, at the cost of an extra round-trip in latency to establish the connection and double the bandwidth if we want to quickly recover from loss. …

We believe a TCP worm could be written to be not much larger than Slammer. In addition to that 404 bytes, it needs a few more ioctl calls to set up a low level socket to send crafted SYN packets, and to set up a separate thread to listen for SYN-ACKs and send out copies of the worm. We estimate 600 bytes total. Such a worm could send out SYNs at line rate, confident that the SYN-ACKs would come back slower due to latency spread. The initial node can maintain a big enough buffer for the SYN-ACKs and the secondary nodes only send out a small number of SYNs. Both will likely be limited by the latency of the SYN-ACKs returning rather than the small amount of time required to deliver all the worms at their respective line rates.

To estimate the performance of such a small TCP flash worm, we repeated the Monte Carlo simulation we performed for the UDP worm with the latency increased by a factor of three for the handshake and the outbound delivery rates adjusted for 40 byte SYN packets. The results are shown in Figure 6. This simulation predicts 95% compromise after 1.3s, and 99% compromise after 3.3s. Thus TCP flash worms are a little slower than UDP ones because of the handshake latency, but can still be very fast. …
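
The shape of such a Monte Carlo estimate can be sketched as a toy timing model of a two-level tree. This is not the authors' simulation: the lognormal latency distribution, the packet rates, and the population size below are placeholder assumptions rather than the measured data the paper used, so the output will not reproduce the 510ms or 1.3s figures. The sketch only scales latency by three for the TCP case and leaves any rate adjustment to the caller.

```python
import math
import random


def simulate_infection_times(n_intermediate, fanout, root_pps, node_pps,
                             latency_scale=1.0, seed=0):
    """Sorted infection times (seconds) for a two-level spread tree.

    ASSUMPTION: one-way latency is lognormal with a median near 0.1 s;
    the paper used measured Internet latencies instead.
    """
    rng = random.Random(seed)

    def latency():
        return latency_scale * rng.lognormvariate(-2.3, 0.6)

    times = []
    for i in range(n_intermediate):
        # The root sends packets back to back at root_pps packets/second.
        t_parent = (i + 1) / root_pps + latency()
        times.append(t_parent)
        for j in range(fanout):
            # Each intermediate node then sends at node_pps packets/second.
            times.append(t_parent + (j + 1) / node_pps + latency())
    times.sort()
    return times


def time_to_fraction(times, fraction):
    """Time by which the given fraction of hosts has been reached."""
    index = max(0, math.ceil(fraction * len(times)) - 1)
    return times[index]


# Illustrative parameters only: 2,000 intermediates with 19 children each.
udp = simulate_infection_times(2_000, 19, root_pps=100_000, node_pps=1_000)
tcp = simulate_infection_times(2_000, 19, root_pps=100_000, node_pps=1_000,
                               latency_scale=3.0)
for name, times in (("udp", udp), ("tcp", tcp)):
    print(name, round(time_to_fraction(times, 0.95), 3),
          round(time_to_fraction(times, 0.99), 3))
```

Because both runs use the same seed, every host's latency draw is tripled in the TCP case, so the TCP percentiles are always later than the UDP ones; how much later depends entirely on the assumed distribution.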

It appears that the optimum solution for the attacker – considering the plausible near-term worm defenses – is for a flash worm author to simply ignore the defenses and concentrate on making the worm as fast and reliable as possible, rather than slowing the worm to avoid detection. Any system behind a fully working defense can simply be considered as resistant, which the worm author counters by using the resiliency mechanisms outlined in the previous sections, combined with optimizing for minimum infection time.

Thus, for the defender, the current best hope is to keep the list of vulnerable addresses out of the hands of the attacker. …

The fastest worm seen in the wild so far was Slammer [10]. That was a random scanning worm, but saturated over 90% of vulnerable machines in under 10 minutes, and appears to have mainly been limited by bandwidth. The early exponential spread had an 8.5s time constant.

In this paper, we performed detailed analysis of how long a flash worm might take to spread on the contemporary Internet. These analyses use simulations based on actual data about Internet latencies and observed packet delivery rates by worms. Flash worms can complete their spread extremely quickly – with most infections occurring in much less than a second for single packet UDP worms and only a few seconds for small TCP worms. Anyone designing worm defenses needs to bear these time factors in mind.