An overview of Flash Worms

From Stuart Staniford, Gary Grim, & Roelof Jonkman’s “Flash Worms: Thirty Seconds to Infect the Internet” (Silicon Defense: 16 August 2001):

In a recent very ingenious analysis, Nick Weaver at UC Berkeley proposed the possibility of a Warhol Worm that could spread across the Internet and infect all vulnerable servers in less than 15 minutes (much faster than the hours or days seen in worm infections to date, such as Code Red).

In this note, we observe that there is a variant of the Warhol strategy that could plausibly be used and that could result in all vulnerable servers on the Internet being infected in less than thirty seconds (possibly significantly less). We refer to this as a Flash Worm, or flash infection. …

For the well-funded three-letter agency with an OC12 connection to the Internet, we believe a scan of the entire Internet address space can be conducted in a little less than two hours (we estimate about 750,000 SYN packets per second can be fit down the 622 Mbps of an OC12, allowing for ATM/AAL framing of the 40-byte TCP segments). The return traffic will be smaller in size than the outbound. Faster links could scan even faster. …

Given that an attacker has the determination and foresight to assemble a list of all or most Internet connected addresses with the relevant service open, a worm can spread most efficiently by simply attacking addresses on that list. There are about 12 million web servers on the Internet (according to Netcraft), so the size of that particular address list would be 48MB, uncompressed. …
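
Both estimates are easy to sanity-check. A minimal sketch in Python (the two-ATM-cells-per-SYN framing is our assumption; the excerpt does not spell out the overhead arithmetic):

    # Rough check of the paper's numbers. Assumption (ours): with AAL5 +
    # LLC/SNAP overhead, a 40-byte TCP SYN occupies two 53-byte ATM cells.
    OC12_BPS = 622.08e6                     # OC12 line rate, bits/sec
    bits_per_syn = 2 * 53 * 8               # two ATM cells per SYN

    print(f"{OC12_BPS / bits_per_syn:,.0f} SYNs/sec")   # ~734,000, near the cited 750,000
    print(f"{2**32 / 750_000 / 3600:.2f} hours")        # ~1.59: 'a little less than two hours'
    print(f"{12e6 * 4 / 1e6:.0f} MB")                   # 12M addresses x 4 bytes = 48 MB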

In conclusion, we argue that a small worm that begins with a list including all likely vulnerable addresses, and that has initial knowledge of some vulnerable sites with high-bandwidth links, can infect almost all vulnerable servers on the Internet in less than thirty seconds.

The Vitruvian Triad & the Urban Triad

From Andrés Duany’s “Classic Urbanism”:

From time to time there appears a concept of exceptional longevity. In architecture, the pre-eminent instance is the Vitruvian triad of Firmitas, Utilitas, and Venustas. This Roman epigram was propelled into immortality by Sir Henry Wotton’s felicitous translation as Commodity, Firmness and Delight.

It has thus passed down the centuries and remains authoritative, even if not always applied in practice. Commodity: that a building must accommodate its program; Firmness: that it must stand up to the natural elements, among them gravity; Delight: that it must be satisfying to the eye. This, with the aberrant exception of the tiny current avant-garde, remains the ideal of architecture. …

Let me propose the urban triad of Function, Disposition and Configuration as categories that would both describe and “test” the urban performance of a building.

Function describes the use to which the building lends itself, towards the ideal of mixed-use. In urbanism, a first cut at the range of function may include: exclusively residential, primarily residential, primarily commercial, or exclusively commercial. The middle two are the best in urban performance, although the extremes have justification in the urban-to-rural transect. An elaboration should probably differentiate the function at the all-important sidewalk level from the function above.

Disposition describes the location of the building on its lot or site. This may range from a building placed across the frontage of its lot, creating a most urban condition, to the rural condition of the building freestanding in the center of its site. Perhaps the easiest way to categorize the disposition of the building is by describing it by its yards: the rearyard building has the building along the frontage, the courtyard building internalizes the space and is just as urban, the sideyard building is the zero-lot-line or “Charleston single house” type, and the edgeyard building is a freestanding object closest to the rural edge of the transect.

The third component of the urban triad is Configuration. This describes the massing and height of a building and, for those who believe that harmony is a tool of urbanism, the architectural syntax and constructional tectonic. It can be argued that the surface of a building is a tool of urbanism no less than its form. Silence of expression is required to achieve the “wall” that defines public space, and it reserves the exalted configuration for differentiating the public building. Harmony in the architectural language is the secret of mixed-use. People seem not to mind variation of function as long as the container looks similar. It is certainly a concern of urbanism.

Social network analysis by the NSA

From John Diamond and Leslie Cauley’s “Pre-9/11 records help flag suspicious calling” (USA TODAY: 22 May 2006):

Armed with details of billions of telephone calls, the National Security Agency used phone records linked to the Sept. 11, 2001 attacks to create a template of how phone activity among terrorists looks, say current and former intelligence officials who were briefed about the program. …

The “call detail records” are the electronic information that is logged automatically each time a call is initiated. For more than 20 years, local and long-distance companies have used call detail records to figure out how much to charge each other for handling calls and to determine problems with equipment.

In addition to the number from which a call is made, the detail records are packed with information. Also included: the number called; the route a call took to reach its final destination; the time, date and place where a call started and ended; and the duration of the call. The records also note whether the call was placed from a cellphone or from a traditional “land line.” …

Calls coming into the country from Pakistan, Afghanistan or the Middle East, for example, are flagged by NSA computers if they are followed by a flood of calls from the number that received the call to other U.S. numbers.
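
As a toy illustration of that flagging rule (our sketch only; the record fields, regions, window, and burst threshold are invented, not the NSA's):

    # Flag a number that receives a call from a watched region and then
    # makes a burst of outbound calls shortly afterward. All names and
    # thresholds here are invented for illustration.
    from dataclasses import dataclass

    @dataclass
    class CallRecord:
        caller: str      # originating number
        callee: str      # number called
        origin: str      # where the call started
        start: float     # epoch seconds

    WATCHED = {"Pakistan", "Afghanistan"}
    WINDOW, BURST = 3600.0, 5        # 5+ outbound calls within an hour

    def flagged_numbers(records: list[CallRecord]) -> set[str]:
        hits = set()
        for r in records:
            if r.origin not in WATCHED:
                continue
            followups = [s for s in records
                         if s.caller == r.callee
                         and 0 < s.start - r.start <= WINDOW]
            if len(followups) >= BURST:
                hits.add(r.callee)
        return hits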

The spy agency then checks the numbers against databases of phone numbers linked to terrorism, the officials say. Those include numbers found during searches of computers or cellphones that belonged to terrorists.

It is not clear how much terrorist activity, if any, the data collection has helped to find.

How to grade or judge water

From Gideon Lewis-Kraus’s “The Water Rush” (Oxford American):

On the tables in front of us are pink “trial” judging sheets. Across the top runs a series of boxes for water numbers, and down the side is the set of criteria we’ll be using. Arthur goes through the criteria one by one, and explains what to look for.

The first criterion is Appearance, which is rated on a scale from zero to five. Good is colorless; bad is cloudy. Self-explanatory, so Arthur moves along quickly to Odor, which is also based on five possible points. The box on the sheet has one example of a positive descriptor on the left side—in this case, “none”—and a row of possible characterizations of water odor on the right side: chlorine, plastic, sulfur, chemical, musty. Next on the list is Flavor, rated out of ten points; the left side of the box reads “clean” and the right side has the identical list of identifiers provided for Odor, plus “salty.” Mouthfeel is back down to a five-point criterion, and the relevant distinction is “refreshing/stale.” There’s a five-point box for Aftertaste (this one on a spectrum from “thirst-quenching” to “residue”), and finally we come to Overall Impressions.

Overall Impressions is scored out of fourteen points, which makes the total available points for each entrant an eyebrow-raising forty-nine. The fourteen-point scale is provided to us on an attached sheet. It was developed by a food scientist at UC Berkeley named William Bruvold. In the ’60s, he pioneered experiments in the acceptability levels of total dissolved solids in water, and he used his students as subjects; he incrementally increased the turbidity of the sample until the water came to resemble Turkish coffee and his students refused to drink it. Out of these experiments came this scale, which Arthur tantalizingly referred to the day I met him in Santa Barbara. Arthur seems a bit sheepish about the language of the document.

The fourteen-point scale, in its entirety, reads exactly as follows (all formatting original):

1. This water has a TERRIBLE, STRONG TASTE. I can’t stand it in my mouth.

2. This water has a TERRIBLE TASTE. I would never drink it.

3. This water has a REAL BAD TASTE. I don’t think I would ever drink it.

4. This water has a REAL BAD TASTE. I would drink it only in a serious emergency.

5. This water has a BAD TASTE. I could not accept it as my everyday drinking water, but I could drink it in an emergency.

6. This water has a BAD TASTE. I don’t think I could accept it as my everyday drinking water.

7. This water has a FAIRLY BAD TASTE. I think I could accept it as my everyday drinking water.

8. This water has a MILD BAD TASTE. I could accept it as my everyday drinking water.

9. This water has an OFF TASTE. I could accept it as my everyday drinking water.

10. This water seems to have a MILD OFF TASTE. I would be satisfied to have it as my everyday drinking water.

11. This water seems to have a LITTLE TASTE. I would be satisfied to have it as my everyday drinking water.

12. This water has NO SPECIAL TASTE at all. I would be happy to have it for my everyday drinking water.

13. This water TASTES GOOD. I would be happy to have it for my everyday drinking water.

14. This water tastes REAL GOOD. I would be very happy to have it for my everyday drinking water.

Matching identities across databases, anonymously

From MIT Technology Review’s “Blindfolding Big Brother, Sort of”:

In 1983, entrepreneur Jeff Jonas founded Systems Research and Development (SRD), a firm that provided software to identify people and determine who was in their circle of friends. In the early 1990s, the company moved to Las Vegas, where it worked on security software for casinos. Then, in January 2005, IBM acquired SRD and Jonas became chief scientist in the company’s Entity Analytic Solutions group.

His newest technology, which allows entities such as government agencies to match an individual found in one database to that same person in another database, is getting a lot of attention from governments, banks, health-care providers, and, of course, privacy advocates. Jonas claims that his technology is as good at protecting privacy as it is at finding important information. …

JJ: The technique that we have created allows the bank to anonymize its customer data. When I say “anonymize,” I mean it changes the name and address and date of birth, or whatever data they have about an identity, into a numeric value that is nonhuman readable and nonreversible. You can’t run the math backwards and compute from the anonymized value what the original input value was. …

Here’s the scenario: The government has a list of people we should never let into the country. It’s a secret. They don’t want people in other countries to know. And the government tends to not share this list with corporate America. Now, if you have a cruise line, you want to make sure you don’t have people getting on your boat who shouldn’t even be in the United States in the first place. Prior to the U.S. Patriot Act, the government couldn’t go and subpoena 100,000 records every day from every company. Usually, the government would have to go to a cruise line and have a subpoena for a record. Section 215 [of the Patriot Act] allows the government to go to a business entity and say, “We want all your records.” Now, the Fourth Amendment, which is “search and seizure,” has a legal test called “reasonable and particular.” Some might argue that if a government goes to a cruise line and says, “Give us all your data,” it is hard to envision that this would be reasonable and particular.

But what other solution do they have? There was no other solution. Our Anonymous Resolution technology would allow a government to take its secret list and anonymize it, allow a cruise line to anonymize its passenger list, and then when there’s a match it would tell the government: “record 123.” So they’d look it up and say, “My goodness, it’s Majed Moqed.” And it would tell them which record to subpoena from which organization. Now it’s back to reasonable and particular. …
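
A minimal sketch of the one-way matching idea, using a keyed hash over canonicalized identity data (this illustrates the concept; it is not IBM's actual Anonymous Resolution algorithm, and the key and records are invented):

    # Both parties transform records with the same keyed hash; only the
    # digests are compared, so neither side sees the other's raw data.
    # Real entity resolution handles far more variability than case and
    # spacing, which is the hard part Jonas describes below.
    import hashlib, hmac

    SHARED_KEY = b"agreed-out-of-band"           # invented for this sketch

    def anonymize(name: str) -> str:
        canonical = " ".join(name.lower().split()).encode()
        return hmac.new(SHARED_KEY, canonical, hashlib.sha256).hexdigest()

    watchlist  = {anonymize("Majed Moqed"): "secret list entry"}
    passengers = {anonymize("  MAJED  MOQED "): "record 123"}

    for digest, record in passengers.items():
        if digest in watchlist:
            print(f"match: subpoena {record}")   # -> match: subpoena record 123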

TR: How is this based on earlier work you did for Las Vegas casinos?

JJ: The ability to figure out if two people are the same despite all the natural variability of how people express their identity is something we really got a good understanding of while assisting the gaming industry. We also learned how people try to fabricate fake identities and how they try to evade systems. It was learning how to do that at high speed that opened the door to make this next thing possible. Had we not solved that in the 1990s, we would not have been able to conjure up a method to do anonymous resolution.

Problems with fingerprints for authentication

From lokedhs’ “There is much truth in what you say”:

The problem with fingerprints is that they are an inherently insecure means of authentication, for two reasons:

Firstly, you can’t change it if it leaks out. A password or a credit card number can be easily changed and the damage minimised in case of an information leak. Doing this with a fingerprint is much harder.

Secondly, the fingerprint is very hard to keep secret. Your body has this annoying ability to leave copies of your identification token all over the place, very easy for anyone to pick up.

A technical look at the Morris Worm of 1988

From Donn Seeley’s “The Internet Worm of 1988: A Tour of the Worm“:

November 3, 1988 is already coming to be known as Black Thursday. System administrators around the country came to work on that day and discovered that their networks of computers were laboring under a huge load. If they were able to log in and generate a system status listing, they saw what appeared to be dozens or hundreds of “shell” (command interpreter) processes. If they tried to kill the processes, they found that new processes appeared faster than they could kill them. Rebooting the computer seemed to have no effect; within minutes after starting up again, the machine was overloaded by these mysterious processes.

… The worm had taken advantage of lapses in security on systems that were running 4.2 or 4.3 BSD UNIX or derivatives like SunOS. These lapses allowed it to connect to machines across a network, bypass their login authentication, copy itself and then proceed to attack still more machines. The massive system load was generated by multitudes of worms trying to propagate the epidemic. …

The worm consists of a 99-line bootstrap program written in the C language, plus a large relocatable object file that comes in VAX and Sun-3 flavors. …

The activities of the worm break down into the categories of attack and defense. Attack consists of locating hosts (and accounts) to penetrate, then exploiting security holes on remote systems to pass across a copy of the worm and run it. The worm obtains host addresses by examining the system tables /etc/hosts.equiv and /.rhosts, user files like .forward and .rhosts, dynamic routing information produced by the netstat program, and finally randomly generated host addresses on local networks. It ranks these by order of preference, trying a file like /etc/hosts.equiv first because it contains names of local machines that are likely to permit unauthenticated connections. Penetration of a remote system can be accomplished in any of three ways. The worm can take advantage of a bug in the finger server that allows it to download code in place of a finger request and trick the server into executing it. The worm can use a “trap door” in the sendmail SMTP mail service, exercising a bug in the debugging code that allows it to execute a command interpreter and download code across a mail connection. If the worm can penetrate a local account by guessing its password, it can use the rexec and rsh remote command interpreter services to attack hosts that share that account. In each case the worm arranges to get a remote command interpreter which it can use to copy over, compile and execute the 99-line bootstrap. The bootstrap sets up its own network connection with the local worm and copies over the other files it needs, and using these pieces a remote worm is built and the infection procedure starts over again. …

When studying a tricky program like this, it’s just as important to establish what the program does not do as what it does do. The worm does not delete a system’s files: it only removes files that it created in the process of bootstrapping. The program does not attempt to incapacitate a system by deleting important files, or indeed any files. It does not remove log files or otherwise interfere with normal operation other than by consuming system resources. The worm does not modify existing files: it is not a virus. The worm propagates by copying itself and compiling itself on each system; it does not modify other programs to do its work for it. Due to its method of infection, it can’t count on sufficient privileges to be able to modify programs. The worm does not install trojan horses: its method of attack is strictly active, it never waits for a user to trip over a trap. Part of the reason for this is that the worm can’t afford to waste time waiting for trojan horses; it must reproduce before it is discovered. Finally, the worm does not record or transmit decrypted passwords: except for its own static list of favorite passwords, the worm does not propagate cracked passwords on to new worms nor does it transmit them back to some home base. This is not to say that the accounts that the worm penetrated are secure merely because the worm did not tell anyone what their passwords were, of course; if the worm can guess an account’s password, certainly others can too. The worm does not try to capture superuser privileges: while it does try to break into accounts, it doesn’t depend on having particular privileges to propagate, and never makes special use of such privileges if it somehow gets them. The worm does not propagate over uucp or X.25 or DECNET or BITNET: it specifically requires TCP/IP. The worm does not infect System V systems unless they have been modified to use Berkeley network programs like sendmail, fingerd and rexec.

A short explanation of moral rights in IP

From Betsy Rosenblatt’s “Moral Rights Basics“:

The term “moral rights” is a translation of the French term “droit moral,” and refers … to the ability of authors to control the eventual fate of their works. An author is said to have the “moral right” to control her work. … Moral rights protect the personal and reputational, rather than purely monetary, value of a work to its creator.

The scope of a creator’s moral rights is unclear, and differs with cultural conceptions of authorship and ownership, but may include the creator’s right to receive or decline credit for her work, to prevent her work from being altered without her permission, to control who owns the work, to dictate whether and in what way the work is displayed, and/or to receive resale royalties. Under American law, moral rights receive protection through judicial interpretation of several copyright, trademark, privacy, and defamation statutes, and through 17 U.S.C. §106A, known as the Visual Artists Rights Act of 1990 (VARA). VARA applies exclusively to visual art. In Europe and elsewhere, moral rights are more broadly protected by ordinary copyright law.

In the United States, the term “moral rights” typically refers to the right of an author to prevent revision, alteration, or distortion of her work, regardless of who owns the work. Moral rights as outlined in VARA also allow an author of a visual work to avoid being associated with works that are not entirely her own, and to prevent the defacement of her works. …

Under VARA, moral rights automatically vest in the author of a “work of visual art.” For the purposes of VARA, visual art includes paintings, drawings, prints, sculptures, and photographs, existing in a single copy or a limited edition of 200 signed and numbered copies or fewer. In order to be protected, a photograph must have been taken for exhibition purposes only. VARA only protects works of “recognized stature;” posters, maps, globes, motion pictures, electronic publications, and applied art are among the categories of visual works explicitly excluded from VARA protection. …

Moral rights are not transferable, and end only with the life of the author. Even if the author has conveyed away a work or her copyright in it, she retains the moral rights to the work under VARA. Authors may, however, waive their moral rights if they do so in writing.

What constitutes infringement of moral rights?

VARA grants two rights to authors of visual works: the right of attribution, and the right of integrity. The right of attribution allows an author to prevent misattribution of a work, and to require that the authorship of the work not be disclosed (i.e. remain anonymous). The right of integrity bars intentional distortion, mutilation, or other modification of a work if that distortion is likely to harm the author’s reputation, and prevents the destruction of any work of recognized stature.

Fundamentalism as limited reading

From Douglas Rushkoff’s “Faith = Illness: Why I’ve had it with religious tolerance“:

When religions are practiced, as they are by a majority of those in developed nations today, as a kind of nostalgic little ritual – a community event or an excuse to get together and not work – they don’t really screw anything up too badly. But when they radically alter our ability to contend with reality, cope with difference, or implement the most basic ethical provisions, they must be stopped. …

As I’ve always understood them, and as I try to convey them in my comic book, the stories in the Bible are less significant because they happened at some moment in history than because their underlying dynamics seem to be happening in all moments. We are all Cain, struggling with our feelings about a sibling who seems to be more blessed than we are. We are always escaping the enslaved mentality of Egypt and the idolatry we practiced there. We are all Mordechai, bristling against the pressure to bow in subservience to our bosses.

But true believers don’t have this freedom. Whether it’s because they need the Bible to prove a real estate claim in the Middle East, because they don’t know how to relate to something that didn’t really happen, or because they require the threat of an angry super-being who sees all in order to behave like good children, true believers – what we now call fundamentalists – are not in a position to appreciate the truth and beauty of the Holy Scriptures. No, the multi-dimensional document we call the Bible is not available to them because, for them, all those stories have to be accepted as historical truth.

Bruce Schneier on steganography

From Bruce Schneier’s “Steganography: Truths and Fictions“:

Steganography is the science of hiding messages in messages. … In the computer world, it has come to mean hiding secret messages in graphics, pictures, movies, or sounds. …
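
The best-known computer technique is least-significant-bit (LSB) embedding. A minimal sketch (ours for illustration; it operates on a raw byte buffer standing in for pixel data, not a real image format):

    # Hide a message in the lowest bit of each cover byte, then recover it.
    def embed(cover: bytearray, message: bytes) -> bytearray:
        bits = [(b >> i) & 1 for b in message for i in range(8)]
        assert len(bits) <= len(cover), "cover too small"
        out = bytearray(cover)
        for i, bit in enumerate(bits):
            out[i] = (out[i] & 0xFE) | bit       # overwrite the low bit only
        return out

    def extract(stego: bytearray, n_bytes: int) -> bytes:
        bits = [stego[i] & 1 for i in range(n_bytes * 8)]
        return bytes(sum(bits[b * 8 + i] << i for i in range(8))
                     for b in range(n_bytes))

    cover = bytearray(range(256)) * 2            # stand-in for pixel data
    assert extract(embed(cover, b"hi"), 2) == b"hi"

Flipping only the low-order bits changes each byte by at most one, which is why the alteration is invisible to the eye, and why, as Schneier goes on to argue, detection hinges on traffic patterns and reference images rather than on the pixels themselves.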

The point of steganography is to hide the existence of the message, to hide the fact that the parties are communicating anything other than innocuous photographs. This only works when it can be used within existing communications patterns. I’ve never sent or received a GIF in my life. If someone suddenly sends me one, it won’t take a rocket scientist to realize that there’s a steganographic message hidden somewhere in it. If Alice and Bob already regularly exchange files that are suitable to hide steganographic messages, then an eavesdropper won’t know which messages — if any — contain the messages. If Alice and Bob change their communications patterns to hide the messages, it won’t work. An eavesdropper will figure it out.

… Don’t use the sample image that came with the program when you downloaded it; your eavesdropper will quickly recognize that one. Don’t use the same image over and over again; your eavesdropper will look for the differences between them that indicate the hidden message. Don’t use an image that you’ve downloaded from the net; your eavesdropper can easily compare the image you’re sending with the reference image you downloaded.

Developing nations stand up to US/UN bullying on copyright

From “Statement by India at the Inter-Sessional Intergovernmental Meeting on a Development Agenda For WIPO, April 11-13, 2005” (emphasis added):

“Development,” in WIPO’s terminology, means increasing a developing country’s capacity to provide protection to the owners of intellectual property rights. This is quite the opposite of what developing countries understand when they refer to the ‘development dimension’. The document presented by the Group of Friends of Development corrects this misconception – that the development dimension means technical assistance.

The real “development” imperative is ensuring that the interest of Intellectual Property owners is not secured at the expense of the users of IP, of consumers at large, and of public policy in general. …

The legal monopoly granted to IP owners is an exceptional departure from the general principle of competitive markets as the best guarantee for securing the interest of society. The rationale for the exception is not that extraction of monopoly profits by the innovator is, in and of itself, good for society and so needs to be promoted. Rather, it is that, properly controlled, such a monopoly, by providing an incentive for innovation, might produce sufficient benefits for society to compensate for the immediate loss to consumers as a result of the existence of a monopoly market instead of a competitive market. Monopoly rights, then, granted to IP holders are a special incentive that needs to be carefully calibrated by each country, in the light of its own circumstances, taking into account the overall costs and benefits of such protection. …

The current emphasis of Technical Assistance on implementation and enforcement issues is misplaced. IP Law enforcement is embedded in the framework of all law enforcement in the individual countries. It is unrealistic, and even undesirable to expect that the enforcement of IP laws will be privileged over the enforcement of other laws in the country. Society faces a considerable challenge to effectively protect, and resolve disputes over, physical property. To expect that the police, the lawyers and the courts should dedicate a sizable part of society’s enforcement resources for protecting intangible intellectual property, is unrealistic. …

In conclusion, it is important that developed countries and WIPO acknowledge that IP protection is an important policy instrument for developing countries, one that needs to be used carefully. While the claimed benefits of strong IP protection for developing countries are a matter of debate – and nearly always in the distant future – such protection invariably entails substantial real and immediate costs for these countries. In formulating its IP policy, therefore, each country needs to have sufficient flexibility so that the cost of IP protection does not outweigh the benefits.

Copyright stupidity: arguments & numbers

From the Financial Times’ “James Boyle: Deconstructing stupidity”:

Thomas Macaulay told us copyright law is a tax on readers for the benefit of writers, a tax that shouldn’t last a day longer than necessary. …

Since only about 4 per cent of copyrighted works more than 20 years old are commercially available, this locks up 96 per cent of 20th century culture to benefit 4 per cent. The harm to the public is huge, the benefit to authors, tiny. …

We need to deconstruct the culture of IP stupidity, to understand it so we can change it. But this is a rich and complex stupidity, like a fine Margaux. I can only review a few flavours.

Maximalism: The first thing to realize is that many decisions are driven by honest delusion, not corporate corruption. The delusion is maximalism: the more intellectual property rights we create, the more innovation. This is clearly wrong; rights raise the cost of innovation inputs (lines of code, gene sequences, data). Do their monopolistic and anti-competitive effects outweigh their incentive effects? That’s the central question, but many of our decision makers seem never to have thought of it.

The point was made by an exchange inside the Committee that shaped Europe’s ill-starred Database Directive. It was observed that the US, with no significant property rights over unoriginal compilations of data, had a much larger database industry than Europe, which already had significant “sweat of the brow” protection in some countries. Europe has strong rights, the US weak. The US is winning.

Did this lead the committee to wonder for a moment whether Europe should weaken its rights? No. Their response was that this showed we had to make the European rights much stronger. …

Authorial Romance: Part of the delusion depends on the idea that inventors and artists create from nothing. Who needs a public domain of accessible material if one can create out of thin air? But in most cases this simply isn’t true; artists, scientists and technologists build on the past. …

An Industry Contract: Who are the subjects of IP? They used to be companies. You needed a printing press or a factory to trigger the landmines of IP. The law was set up as a contract between industry groups. This was a cosy arrangement, but it is no longer viable. The citizen-publishers of cyberspace, the makers of free software, the scientists of distributed data-analysis are all now implicated in the IP world. The decision-making structure has yet to adjust. …

Fundamentally, though, the views I have criticised here are not merely stupidity. They constitute an ideology, a worldview, like flat earth-ism. …

The Witty Worm was special

From CAIDA’s “The Spread of the Witty Worm“:

On Friday March 19, 2004 at approximately 8:45pm PST, an Internet worm began to spread, targeting a buffer overflow vulnerability in several Internet Security Systems (ISS) products, including ISS RealSecure Network, RealSecure Server Sensor, RealSecure Desktop, and BlackICE. The worm takes advantage of a security flaw in these firewall applications that was discovered earlier this month by eEye Digital Security. Once the Witty worm infects a computer, it deletes a randomly chosen section of the hard drive, over time rendering the machine unusable. The worm’s payload contained the phrase “(^.^) insert witty message here (^.^)” so it came to be known as the Witty worm.

While the Witty worm is only the latest in a string of self-propagating remote exploits, it distinguishes itself through several interesting features:

  • Witty was the first widely propagated Internet worm to carry a destructive payload.
  • Witty was started in an organized manner with an order of magnitude more ground-zero hosts than any previous worm.
  • Witty represents the shortest known interval between vulnerability disclosure and worm release — it began to spread the day after the ISS vulnerability was publicized.
  • Witty spread through a host population in which every compromised host was doing something proactive to secure their computers and networks.
  • Witty spread through a population almost an order of magnitude smaller than that of previous worms, demonstrating the viability of worms as an automated mechanism to rapidly compromise machines on the Internet, even in niches without a software monopoly. …

Once Witty infects a host, the host sends 20,000 packets by generating packets with a random destination IP address, a random size between 796 and 1307 bytes, and a random destination port. The worm payload of 637 bytes is padded with data from system memory to fill this random size, and a packet is sent out from source port 4000. After sending 20,000 packets, Witty seeks to a random point on the hard disk and writes 65 KB of data from the beginning of iss-pam1.dll to the disk. After closing the disk, the worm repeats this process until the machine is rebooted or until the worm permanently crashes the machine.

Witty Worm Spread

With previous Internet worms, including Code-Red, Nimda, and SQL Slammer, a few hosts were seeded with the worm and proceeded to spread it to the rest of the vulnerable population. The spread is slow early on, then accelerates dramatically as the number of infected machines spewing worm packets to the rest of the Internet rises. Eventually, as the victim population becomes saturated, the spread of the worm slows because there are few vulnerable machines left to compromise. Plotted on a graph, this worm growth appears as an S-shaped exponential growth curve called a sigmoid.

At 8:45:18pm PST on March 19, 2004, the network telescope received its first Witty worm packet. In contrast to previous worms, we observed 110 hosts infected in the first ten seconds, and 160 at the end of 30 seconds. The chances of a single instance of the worm infecting 110 machines so quickly are vanishingly small — worse than 10^-607. This rapid onset indicates that the worm used either a hitlist or previously compromised vulnerable hosts to start the worm. …

After the sharp rise in initial coordinated activity, the Witty worm followed a normal exponential growth curve for a pathogen spreading in a fixed population. Witty reached its peak after approximately 45 minutes, at which point the majority of vulnerable hosts had been infected. After that time, the churn caused by dynamic addressing causes the IP address count to inflate without any additional Witty infections. At the peak of the infection, Witty hosts flooded the Internet with more than 90Gbits/second of traffic (more than 11 million packets per second). …
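
A toy logistic model reproduces that shape and shows what the hitlist start buys the attacker (our sketch, with invented parameters, not CAIDA's measurements):

    # Discrete logistic ("sigmoid") growth: each infected host finds, on
    # average, one new victim per tick while uninfected targets remain.
    N = 12_000                       # vulnerable population, order of Witty's

    def ticks_to_half(seeds: int) -> int:
        infected, t = float(seeds), 0
        while infected < N / 2:
            infected += infected * (1 - infected / N)
            t += 1
        return t

    for seeds in (1, 110):
        print(f"{seeds:>3} seed(s): {ticks_to_half(seeds)} ticks to 50%")
    # ~110 ground-zero hosts shift the whole curve earlier by roughly
    # log2(110), i.e. about seven doubling times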

The vulnerable host population pool for the Witty worm was quite different from that of previous virulent worms. Previous worms have lagged several weeks behind publication of details about the remote-exploit bug, and large portions of the victim populations appeared to not know what software was running on their machines, let alone take steps to make sure that software was up to date with security patches. In contrast, the Witty worm infected a population of hosts that were proactive about security — they were running firewall software. The Witty worm also started to spread the day after information about the exploit and the software upgrades to fix the bug were available. …

By infecting firewall devices, Witty proved particularly adept at thwarting security measures and successfully infecting hosts on internal networks. …

The Witty worm incorporates a number of dangerous characteristics. It is the first widely spreading Internet worm to actively damage infected machines. It was started from a large set of machines simultaneously, indicating the use of a hit list or a large number of compromised machines. Witty demonstrated that any minimally deployed piece of software with a remotely exploitable bug can be a vector for wide-scale compromise of host machines without any action on the part of a victim. The practical implications of this are staggering; with minimal skill, a malevolent individual could break into thousands of machines and use them for almost any purpose with little evidence of the perpetrator left on most of the compromised hosts.

Why are some people really good at some things?

From Stephen J. Dubner & Steven D. Levitt’s “A Star Is Made” (The New York Times):

Anders Ericsson, a 58-year-old psychology professor at Florida State University, … is the ringleader of what might be called the Expert Performance Movement, a loose coalition of scholars trying to answer an important and seemingly primordial question: When someone is very good at a given thing, what is it that actually makes him good? …

In other words, whatever innate differences two people may exhibit in their abilities to memorize, those differences are swamped by how well each person “encodes” the information. And the best way to learn how to encode information meaningfully, Ericsson determined, was a process known as deliberate practice.

Deliberate practice entails more than simply repeating a task – playing a C-minor scale 100 times, for instance, or hitting tennis serves until your shoulder pops out of its socket. Rather, it involves setting specific goals, obtaining immediate feedback and concentrating as much on technique as on outcome. …

Their work, compiled in the “Cambridge Handbook of Expertise and Expert Performance,” a 900-page academic book that will be published next month, makes a rather startling assertion: the trait we commonly call talent is highly overrated. Or, put another way, expert performers – whether in memory or surgery, ballet or computer programming – are nearly always made, not born. And yes, practice does make perfect. …

Ericsson’s research suggests a third cliché as well: when it comes to choosing a life path, you should do what you love – because if you don’t love it, you are unlikely to work hard enough to get very good. Most people naturally don’t like to do things they aren’t “good” at. So they often give up, telling themselves they simply don’t possess the talent for math or skiing or the violin. But what they really lack is the desire to be good and to undertake the deliberate practice that would make them better. …

Ericsson has noted that most doctors actually perform worse the longer they are out of medical school. Surgeons, however, are an exception. That’s because they are constantly exposed to two key elements of deliberate practice: immediate feedback and specific goal-setting.

Clay Shirky on why the Semantic Web will fail

From Clay Shirky’s “The Semantic Web, Syllogism, and Worldview“:

What is the Semantic Web good for?

The simple answer is this: The Semantic Web is a machine for creating syllogisms. A syllogism is a form of logic, first described by Aristotle, where “…certain things being stated, something other than what is stated follows of necessity from their being so.” [Organon]

The canonical syllogism is:

Humans are mortal
Greeks are human
Therefore, Greeks are mortal

with the third statement derived from the previous two.

The Semantic Web is made up of assertions, e.g. “The creator of shirky.com is Clay Shirky.” Given the two statements

– Clay Shirky is the creator of shirky.com
– The creator of shirky.com lives in Brooklyn

you can conclude that I live in Brooklyn, something you couldn’t know from either statement on its own. From there, other expressions that include Clay Shirky, shirky.com, or Brooklyn can be further coupled.
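
The join being described is trivial to mechanize. A toy sketch (plain Python dictionaries, not real RDF):

    # Two assertions share the description "the creator of shirky.com";
    # resolving it lets a machine derive a fact stated nowhere directly.
    creator_of = {"shirky.com": "Clay Shirky"}      # assertion 1
    creator_lives_in = {"shirky.com": "Brooklyn"}   # assertion 2

    person = creator_of["shirky.com"]
    derived = (person, "lives in", creator_lives_in["shirky.com"])
    print(derived)    # ('Clay Shirky', 'lives in', 'Brooklyn')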

The Semantic Web specifies ways of exposing these kinds of assertions on the Web, so that third parties can combine them to discover things that are true but not specified directly. This is the promise of the Semantic Web — it will improve all the areas of your life where you currently use syllogisms.

Which is to say, almost nowhere. …

Despite their appealing simplicity, syllogisms don’t work well in the real world, because most of the data we use is not amenable to such effortless recombination. As a result, the Semantic Web will not be very useful either. …

In the real world, we are usually operating with partial, inconclusive or context-sensitive information. When we have to make a decision based on this information, we guess, extrapolate, intuit, we do what we did last time, we do what we think our friends would do or what Jesus or Joan Jett would have done, we do all of those things and more, but we almost never use actual deductive logic. …

Syllogisms sound stilted in part because they traffic in absurd absolutes. …

There is a list of technologies that are actually political philosophy masquerading as code, a list that includes Xanadu, Freenet, and now the Semantic Web. The Semantic Web’s philosophical argument — the world should make more sense than it does — is hard to argue with. The Semantic Web, with its neat ontologies and its syllogistic logic, is a nice vision. However, like many visions that project future benefits but ignore present costs, it requires too much coordination and too much energy to effect in the real world, where deductive logic is less effective and shared worldview is harder to create than we often want to admit.

The structure & meaning of the URL as key to the Web’s success

From Clay Shirky’s “The Semantic Web, Syllogism, and Worldview“:

The systems that have succeeded at scale have made simple implementation the core virtue, up the stack from Ethernet over Token Ring to the web over gopher and WAIS. The most widely adopted digital descriptor in history, the URL, regards semantics as a side conversation between consenting adults, and makes no requirements in this regard whatsoever: sports.yahoo.com/nfl/ is a valid URL, but so is 12.0.0.1/ftrjjk.ppq. The fact that a URL itself doesn’t have to mean anything is essential — the Web succeeded in part because it does not try to make any assertions about the meaning of the documents it contained, only about their location.

Thoughts on tagging/folksonomy

From Ulises Ali Mejias’ “A del.icio.us study: Bookmark, Classify and Share: A mini-ethnography of social practices in a distributed classification community“:

This principle of distribution is at work in socio-technical systems that allow users to collaboratively organize a shared set of resources by assigning classifiers, or tags, to each item. The practice is coming to be known as free tagging, open tagging, ethnoclassification, folksonomy, or faceted hierarchy (henceforth referred to in this study as distributed classification) …

One important feature of systems such as these is that they do not impose a rigid taxonomy. Instead, they allow users to assign whatever classifiers they choose. Although this might sound counter-productive to the ultimate goal of organizing content, in practice it seems to work rather well, although it does present some drawbacks. For example, most people will probably classify pictures of cats by using the tag ‘cats.’ But what happens when some individuals use ‘cat’ or ‘feline’ or ‘meowmeow’ …

It seems that while most people might not be motivated to contribute to a pre-established system of classification that may not meet their needs, or to devise new and complex taxonomies of their own, they are quite happy to use distributed systems of classification that are quick and able to accommodate their personal (and ever changing) systems of classification. …

But distributed classification does not accrue benefits only to the individual. It is a very social endeavor in which the community as a whole can benefit. Jon Udell describes some of the individual and social possibilities of this method of classification:

These systems offer lots of ways to visualize and refine the tag space. It’s easy to know whether a tag you’ve used is unique or, conversely, popular. It’s easy to rename a tag across a set of items. It’s easy to perform queries that combine tags. Armed with such powerful tools, people can collectively enrich shared data. (Udell 2004) …

Set this [an imposed taxonomy] against the idea of allowing a user to add tags to any given document in the corpus. Like Del.icio.us, there needn’t be a pre-defined hierarchy or lexicon of terms to use; one can simply lean on the power of ethnoclassification to build that lexicon dynamically. As such, it will dynamically evolve as usages change and shift, even as needs change and shift. (Williams, 2004)

The primary benefit of free tagging is that we know the classification makes sense to users… For a content creator who is uploading information into such a system, being able to freely list subjects, instead of choosing from a pre-approved “pick list,” makes tagging content much easier. This, in turn, makes it more likely that users will take time to classify their contributions. (Merholz, 2004)

Folksonomies work best when a number of users all describe the same piece of information. For instance, on del.icio.us, many people have bookmarked wikipedia (http://del.icio.us/url/bca8b85b54a7e6c01a1bcfaf15be1df5), each with a different set of words to describe it. Among the various tags used, del.icio.us shows that reference, wiki, and encyclopedia are the most popular. (Wikipedia entry for folksonomy, retrieved December 15, 2004 from http://en.wikipedia.org/wiki/Folksonomy)
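
The aggregation described in that example is straightforward to sketch (illustrative only, not del.icio.us's actual implementation):

    # Count every user's tags for one URL and surface the most popular.
    from collections import Counter

    bookmarks = [                                # (user, tags) for one URL
        ("alice", ["reference", "wiki", "encyclopedia"]),
        ("bob",   ["wiki", "research", "reference"]),
        ("carol", ["encyclopedia", "reference", "free"]),
    ]

    tag_counts = Counter(t for _, tags in bookmarks for t in tags)
    print(tag_counts.most_common(3))
    # [('reference', 3), ('wiki', 2), ('encyclopedia', 2)]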

Of course, this approach is not without its potential problems:

With no one controlling the vocabulary, users develop multiple terms for identical concepts. For example, if you want to find all references to New York City on Del.icio.us, you’ll have to look through “nyc,” “newyork,” and “newyorkcity.” You may also encounter the inverse problem — users employing the same term for disparate concepts. (Merholz, 2004) …

But as Clay Shirky remarks, this solution might diminish some of the benefits that we can derive from folksonomies:

Synonym control is not as wonderful as is often supposed, because synonyms often aren’t. Even closely related terms like movies, films, flicks, and cinema cannot be trivially collapsed into a single word without loss of meaning, and of social context … (Shirky, 2004) …

The choice of tags [in the entire del.icio.us system] follows something resembling the Zipf or power law curve often seen in web-related traffic. Just six tags (python, delicious/del.icio.us, programming, hacks, tools, and web) account for 80% of all the tags chosen, and a long tail of 58 other tags make up the remaining 20%, with most occurring just once or twice … In the del.icio.us community, the rich get richer and the poor stay poor via http://del.icio.us/popular. Links noted by enough users within a short space of time get listed here, and many del.icio.us users use it to keep up with the zeitgeist. (Biddulph, 2004) …

Bring down the cell network with SMS spam

From John Schwartz’s “Text Hackers Could Jam Cellphones, a Paper Says“:

Malicious hackers could take down cellular networks in large cities by inundating their popular text-messaging services with the equivalent of spam, said computer security researchers, who will announce the findings of their research today.

Such an attack is possible, the researchers say, because cellphone companies provide the text-messaging service to their networks in a way that could allow an attacker who jams the message system to disable the voice network as well.

And because the message services are accessible through the Internet, cellular networks are open to the denial-of-service attacks that occur regularly online, in which computers send so many messages or commands to a target that the rogue data blocks other machines from connecting.

By pushing 165 messages a second into the network, said Patrick D. McDaniel, a professor of computer science and engineering at Pennsylvania State University and the lead researcher on the paper, “you can congest all of Manhattan.”

Also see http://www.smsanalysis.org/.

Bruce Schneier on phishing

From Bruce Schneier’s “Phishing“:

Phishing, for those of you who have been away from the Internet for the past few years, is when an attacker sends you an e-mail falsely claiming to be a legitimate business in order to trick you into giving away your account info — passwords, mostly. When this is done by hacking DNS, it’s called pharming. …

In general, two Internet trends affect all forms of identity theft. The widespread availability of personal information has made it easier for a thief to get his hands on it. At the same time, the rise of electronic authentication and online transactions — you don’t have to walk into a bank, or even use a bank card, in order to withdraw money now — has made that personal information much more valuable. …

The newest variant, called “spear phishing,” involves individually targeted and personalized e-mail messages that are even harder to detect. …

It’s not that financial institutions suffer no losses. Because of something called Regulation E, they already pay most of the direct costs of identity theft. But the costs in time, stress, and hassle are entirely borne by the victims. And in one in four cases, the victims have not been able to completely restore their good name.

In economics, this is known as an externality: It’s an effect of a business decision that is not borne by the person or organization making the decision. Financial institutions have no incentive to reduce those costs of identity theft because they don’t bear them. …

If there’s one general precept of security policy that is universally true, it is that security works best when the entity that is in the best position to mitigate the risk is responsible for that risk.
