license

These are their brilliant plans to save magazines?

From Jeremy W. Peters’ “In Magazine World, a New Crop of Chiefs” (The New York Times: 28 November 2010):

“This is the changing of the guard from an older school to a newer school,” said Justin B. Smith, president of the Atlantic Media Company. The changes, he added, were part of an inevitable evolution in publishing that was perhaps long overdue. “It is quite remarkable that it took until 2010, 15 years after the arrival of the Internet, for a new generation of leaders to emerge.”

At Time, the world’s largest magazine publisher, Mr. Griffin said he wanted to reintroduce the concept of “charging a fair price, and charging consumers who are interested in the product.” In other words, consumers can expect to pay more. “We spent a tremendous amount of money creating original content, original journalism, fact-checking, sending reporters overseas to cover wars,” he said. “You name it. What we’ve got to do as a business is get fair value for that.” Supplementing that approach, Mr. Griffin said, will be new partnerships within Time Warner, Time Inc.’s parent company, that allow magazines to take advantage of the vast film and visual resources at their disposal. One such partnership in the planning stages, he said, is a deal between a major cosmetics company and InStyle to broadcast from the red carpets of big Hollywood events like the Academy Awards and the Screen Actors Guild Awards.

But one thing Mr. Harty said the company was examining: expanding its licensed products. The company already pulls in more than a billion dollars a year selling products with a Better Homes and Gardens license at Wal-Mart stores. It is now planning to sell plants and bulbs with the magazine’s imprimatur directly to consumers. “We have relationships with all these consumers,” Mr. Harty said. “How can we figure out how to sell them goods and services? We believe that’s a key.”

How the Madden NFL videogame was developed

From Patrick Hruby’s “The Franchise: The inside story of how Madden NFL became a video game dynasty” (ESPN: 22 July 2010):

1982

Harvard grad and former Apple employee Trip Hawkins founds video game maker Electronic Arts, in part to create a football game; one year later, the company releases “One-on-One: Dr. J vs. Larry Bird,” the first game to feature licensed sports celebrities. Art imitates life.

1983-84

Hawkins approaches former Oakland Raiders coach and NFL television analyst John Madden to endorse a football game. Madden agrees, but insists on realistic game play with 22 on-screen players, a daunting technical challenge.

1988-90

EA releases the first Madden football game for the Apple II home computer; a subsequent Sega Genesis home console port blends the Apple II game’s realism with control pad-heavy, arcade-style action, becoming a smash hit.

madden-nfl-covers-sm.jpg

You can measure the impact of “Madden” through its sales: as many as 2 million copies in a single week, 85 million copies since the game’s inception and more than $3 billion in total revenue. You can chart the game’s ascent, shoulder to shoulder, alongside the $20 billion-a-year video game industry, which is either co-opting Hollywood (see “Tomb Raider” and “Prince of Persia”) or topping it (opening-week gross of “Call of Duty: Modern Warfare 2”: $550 million; “The Dark Knight”: $204 million).

Some of the pain was financial. Just as EA brought its first games to market in 1983, the home video game industry imploded. In a two-year span, Coleco abandoned the business, Intellivision went from 1,200 employees to five and Atari infamously dumped thousands of unsold game cartridges into a New Mexico landfill. Toy retailers bailed, concluding that video games were a Cabbage Patch-style fad. Even at EA — a hot home computer startup — continued solvency was hardly assured.

In 1988, “John Madden Football” was released for the Apple II computer and became a modest commercial success.

THE STAKES WERE HIGH for a pair of upstart game makers, with a career-making opportunity and a $100,000 development contract on the line. In early 1990, Troy Lyndon and Mike Knox of San Diego-based Park Place Productions met with Hawkins to discuss building a “Madden” game for Sega’s upcoming home video game console, the Genesis. …

Because the game that made “Madden” a phenomenon wasn’t the initial Apple II release, it was the Genesis follow-up, a surprise smash spawned by an entirely different mindset. Hawkins wanted “Madden” to play out like the NFL. Equivalent stats. Similar play charts. Real football.

In 1990, EA had a market cap of about $60 million; three years later, that number swelled to $2 billion.

In 2004, EA paid the NFL a reported $300 million-plus for five years of exclusive rights to teams and players. The deal was later extended to 2013. Just like that, competing games went kaput. The franchise stands alone, triumphant, increasingly encumbered by its outsize success.

Hawkins left EA in the early 1990s to spearhead 3D0, an ill-fated console maker that became a doomed software house. An icy rift between the company and its founder ensued.

What Google’s book settlement means

Google Book Search
Image via Wikipedia

From Robert Darnton’s “Google & the Future of Books” (The New York Review of Books: 12 February 2009):

As the Enlightenment faded in the early nineteenth century, professionalization set in. You can follow the process by comparing the Encyclopédie of Diderot, which organized knowledge into an organic whole dominated by the faculty of reason, with its successor from the end of the eighteenth century, the Encyclopédie méthodique, which divided knowledge into fields that we can recognize today: chemistry, physics, history, mathematics, and the rest. In the nineteenth century, those fields turned into professions, certified by Ph.D.s and guarded by professional associations. They metamorphosed into departments of universities, and by the twentieth century they had left their mark on campuses…

Along the way, professional journals sprouted throughout the fields, subfields, and sub-subfields. The learned societies produced them, and the libraries bought them. This system worked well for about a hundred years. Then commercial publishers discovered that they could make a fortune by selling subscriptions to the journals. Once a university library subscribed, the students and professors came to expect an uninterrupted flow of issues. The price could be ratcheted up without causing cancellations, because the libraries paid for the subscriptions and the professors did not. Best of all, the professors provided free or nearly free labor. They wrote the articles, refereed submissions, and served on editorial boards, partly to spread knowledge in the Enlightenment fashion, but mainly to advance their own careers.

The result stands out on the acquisitions budget of every research library: the Journal of Comparative Neurology now costs $25,910 for a year’s subscription; Tetrahedron costs $17,969 (or $39,739, if bundled with related publications as a Tetrahedron package); the average price of a chemistry journal is $3,490; and the ripple effects have damaged intellectual life throughout the world of learning. Owing to the skyrocketing cost of serials, libraries that used to spend 50 percent of their acquisitions budget on monographs now spend 25 percent or less. University presses, which depend on sales to libraries, cannot cover their costs by publishing monographs. And young scholars who depend on publishing to advance their careers are now in danger of perishing.

The eighteenth-century Republic of Letters had been transformed into a professional Republic of Learning, and it is now open to amateurs—amateurs in the best sense of the word, lovers of learning among the general citizenry. Openness is operating everywhere, thanks to “open access” repositories of digitized articles available free of charge, the Open Content Alliance, the Open Knowledge Commons, OpenCourseWare, the Internet Archive, and openly amateur enterprises like Wikipedia. The democratization of knowledge now seems to be at our fingertips. We can make the Enlightenment ideal come to life in reality.

What provoked these jeremianic- utopian reflections? Google. Four years ago, Google began digitizing books from research libraries, providing full-text searching and making books in the public domain available on the Internet at no cost to the viewer. For example, it is now possible for anyone, anywhere to view and download a digital copy of the 1871 first edition of Middlemarch that is in the collection of the Bodleian Library at Oxford. Everyone profited, including Google, which collected revenue from some discreet advertising attached to the service, Google Book Search. Google also digitized an ever-increasing number of library books that were protected by copyright in order to provide search services that displayed small snippets of the text. In September and October 2005, a group of authors and publishers brought a class action suit against Google, alleging violation of copyright. Last October 28, after lengthy negotiations, the opposing parties announced agreement on a settlement, which is subject to approval by the US District Court for the Southern District of New York.[2]

The settlement creates an enterprise known as the Book Rights Registry to represent the interests of the copyright holders. Google will sell access to a gigantic data bank composed primarily of copyrighted, out-of-print books digitized from the research libraries. Colleges, universities, and other organizations will be able to subscribe by paying for an “institutional license” providing access to the data bank. A “public access license” will make this material available to public libraries, where Google will provide free viewing of the digitized books on one computer terminal. And individuals also will be able to access and print out digitized versions of the books by purchasing a “consumer license” from Google, which will cooperate with the registry for the distribution of all the revenue to copyright holders. Google will retain 37 percent, and the registry will distribute 63 percent among the rightsholders.

Meanwhile, Google will continue to make books in the public domain available for users to read, download, and print, free of charge. Of the seven million books that Google reportedly had digitized by November 2008, one million are works in the public domain; one million are in copyright and in print; and five million are in copyright but out of print. It is this last category that will furnish the bulk of the books to be made available through the institutional license.

Many of the in-copyright and in-print books will not be available in the data bank unless the copyright owners opt to include them. They will continue to be sold in the normal fashion as printed books and also could be marketed to individual customers as digitized copies, accessible through the consumer license for downloading and reading, perhaps eventually on e-book readers such as Amazon’s Kindle.

After reading the settlement and letting its terms sink in—no easy task, as it runs to 134 pages and 15 appendices of legalese—one is likely to be dumbfounded: here is a proposal that could result in the world’s largest library. It would, to be sure, be a digital library, but it could dwarf the Library of Congress and all the national libraries of Europe. Moreover, in pursuing the terms of the settlement with the authors and publishers, Google could also become the world’s largest book business—not a chain of stores but an electronic supply service that could out-Amazon Amazon.

An enterprise on such a scale is bound to elicit reactions of the two kinds that I have been discussing: on the one hand, utopian enthusiasm; on the other, jeremiads about the danger of concentrating power to control access to information.

Google is not a guild, and it did not set out to create a monopoly. On the contrary, it has pursued a laudable goal: promoting access to information. But the class action character of the settlement makes Google invulnerable to competition. Most book authors and publishers who own US copyrights are automatically covered by the settlement. They can opt out of it; but whatever they do, no new digitizing enterprise can get off the ground without winning their assent one by one, a practical impossibility, or without becoming mired down in another class action suit. If approved by the court—a process that could take as much as two years—the settlement will give Google control over the digitizing of virtually all books covered by copyright in the United States.

Google alone has the wealth to digitize on a massive scale. And having settled with the authors and publishers, it can exploit its financial power from within a protective legal barrier; for the class action suit covers the entire class of authors and publishers. No new entrepreneurs will be able to digitize books within that fenced-off territory, even if they could afford it, because they would have to fight the copyright battles all over again. If the settlement is upheld by the court, only Google will be protected from copyright liability.

Google’s record suggests that it will not abuse its double-barreled fiscal-legal power. But what will happen if its current leaders sell the company or retire? The public will discover the answer from the prices that the future Google charges, especially the price of the institutional subscription licenses. The settlement leaves Google free to negotiate deals with each of its clients, although it announces two guiding principles: “(1) the realization of revenue at market rates for each Book and license on behalf of the Rightsholders and (2) the realization of broad access to the Books by the public, including institutions of higher education.”

What will happen if Google favors profitability over access? Nothing, if I read the terms of the settlement correctly. Only the registry, acting for the copyright holders, has the power to force a change in the subscription prices charged by Google, and there is no reason to expect the registry to object if the prices are too high. Google may choose to be generous in it pricing, and I have reason to hope it may do so; but it could also employ a strategy comparable to the one that proved to be so effective in pushing up the price of scholarly journals: first, entice subscribers with low initial rates, and then, once they are hooked, ratchet up the rates as high as the traffic will bear.

Taxi driver party lines

8th Ave .....Midtown Manhattan
Creative Commons License photo credit: 708718

From Annie Karni’s “Gabbing Taxi Drivers Talking on ‘Party Lines’” (The New York Sun: 11 January 2007):

It’s not just wives at home or relatives overseas that keep taxi drivers tied up on their cellular phones during work shifts. Many cabbies say that when they are chatting on duty, it’s often with their cab driver colleagues on group party lines. Taxi drivers say they use conference calls to discuss directions and find out about congested routes to avoid. They come to depend on one another as first responders, reacting faster even than police to calls from drivers in distress. Some drivers say they participate in group prayers on a party line.

It is during this morning routine, waiting for the first shuttle flights to arrive from Washington and Boston, where many friendships between cabbies are forged and cell phone numbers are exchanged, Mr. Sverdlov said. Once drivers have each other’s numbers, they can use push-to-talk technology to call large groups all at once.

Mr. Sverdlov said he conferences with up to 10 cabbies at a time to discuss “traffic, what’s going on, this and that, and where do cops stay.” He estimated that every month, he logs about 20,000 talking minutes on his cell phone.

While civilian drivers are allowed to use hands-free devices to talk on cell phones while behind the wheel, the Taxi & Limousine Commission imposed a total cell phone ban for taxi drivers on duty in 1999. In 2006, the Taxi & Limousine Commission issued 1,049 summonses for phone use while on duty, up by almost 69% from the 621 summonses it issued the previous year. Drivers caught chatting while driving are fined $200 and receive two-point penalties on their licenses.

Drivers originally from countries like Israel, China, and America, who are few and far between, say they rarely chat on the phone with other cab drivers because of the language barrier. For many South Asians and Russian drivers, however, conference calls that are prohibited by the Taxi & Limousine Commission are mainstays of cabby life.

Some facts about GPL 2 & GPL 3

From Liz Laffan’s “GPLv2 vs GPLv3: The two seminal open source licenses, their roots, consequences and repercussions” (VisionMobile: September 2007):

From a licensing perspective, the vast majority (typically 60-70%) of all open source projects are licensed under the GNU Public License version 2 (GPLv2).

GPLv3 was published in July 2007, some 16 years following the creation of GPLv2. The purpose of this new license is to address some of the areas identified for improvement and clarification in GPLv2 – such as patent indemnity, internationalisation and remedies for inadvertent license infringement (rather than the previous immediate termination effect). The new GPLv3 license is nearly double the length of the GPLv2 …

GPLv3 differs to GPLv2 in several important ways. Firstly it provides more clarity on patent licenses and attempts to clarify what is meant by both a distribution and derivative works. Secondly it revokes the immediate termination of license clause in favour of licensee opportunities to ‘fix’ any violations within a given time-period. In addition there are explicit ‘Additional Terms’ which permits users to choose from a fixed set of alternative terms which can modify the standard GPLv3 terms. These are all welcome, positive moves which should benefit all users of the GPLv3 license.

Nonetheless there are three contentious aspects of GPLv3 that have provoked much discussion in the FOSS community and could deter adoption of GPLv3 by more circumspect users and organisations.

Open source & patents

From Liz Laffan’s “GPLv2 vs GPLv3: The two seminal open source licenses, their roots, consequences and repercussions” (VisionMobile: September 2007):

Cumulatively patents have been doubling practically every year since 1990. Patents are now probably the most contentious issue in software-related intellectual property rights.

However we should also be aware that software written from scratch is as likely to infringe patents as FOSS covered software – due mainly to the increasing proliferation of patents in all software technologies. Consequently the risk of patent infringement is largely comparable whether one chooses to write one’s own software or use software covered by the GPLv2; one will most likely have to self-indemnify against a potential patent infringement claim in both cases.

The F.U.D. (fear, uncertainty and doubt) that surrounds patents in FOSS has been further heightened by two announcements, both instigated by Microsoft. Firstly in November 2006 Microsoft and Novell1 entered into a cross- licensing patent agreement where Microsoft gave Novell assurances that it would not sue the company or its customers if they were to be found infringing Microsoft patents in the Novell Linux distribution. Secondly in May 2007 Microsoft2 restated (having alluded to the same in 2004) that FOSS violates 235 Microsoft patents. Unfortunately, the Redmond giant did not state which patents in particular were being infringed and nor have they initiated any actions against a user or distributor of Linux.

The FOSS community have reacted to these actions by co-opting the patent system and setting up the Patent Commons (http://www.patentcommons.org). This initiative, managed by the Linux Foundation, coordinates and manages a patent commons reference library, documenting information about patent- related pledges in support of Linux and FOSS that are provided by large software companies. Moreover, software giants such as IBM and Nokia have committed not to assert patents against the Linux kernel and other FOSS projects. In addition, the FSF have strengthened the patent clause of GPLv3…

Microsoft Exchange is expensive

From Joel Snyder’s “Exchange: Should I stay or should I go?” (Network World: 9 March 2009):

There are many ways to buy Exchange, depending on how many users you need, but the short answer is that none of them cost less than about $75 per user and can run up to $140 per user for the bundles that include Exchange and Windows Server and user licenses for both of those as well as Forefront, Microsoft’s antispam/antivirus service. …

If you really want to make a case for cost, you could also claim that Exchange requires a $90 Outlook license for each user, a Windows XP or Vista license for each user, and more expensive hardware than a similar open source platform might require.

Bruce Schneier on wholesale, constant surveillance

From Stephen J. Dubner’s interview with Bruce Schneier in “Bruce Schneier Blazes Through Your Questions” (The New York Times: 4 December 2007):

There’s a huge difference between nosy neighbors and cameras. Cameras are everywhere. Cameras are always on. Cameras have perfect memory. It’s not the surveillance we’ve been used to; it’s wholesale surveillance. I wrote about this here, and said this: “Wholesale surveillance is a whole new world. It’s not ‘follow that car,’ it’s ‘follow every car.’ The National Security Agency can eavesdrop on every phone call, looking for patterns of communication or keywords that might indicate a conversation between terrorists. Many airports collect the license plates of every car in their parking lots, and can use that database to locate suspicious or abandoned cars. Several cities have stationary or car-mounted license-plate scanners that keep records of every car that passes, and save that data for later analysis.

“More and more, we leave a trail of electronic footprints as we go through our daily lives. We used to walk into a bookstore, browse, and buy a book with cash. Now we visit Amazon, and all of our browsing and purchases are recorded. We used to throw a quarter in a toll booth; now EZ Pass records the date and time our car passed through the booth. Data about us are collected when we make a phone call, send an e-mail message, make a purchase with our credit card, or visit a Web site.”

What’s happening is that we are all effectively under constant surveillance. No one is looking at the data most of the time, but we can all be watched in the past, present, and future. And while mining this data is mostly useless for finding terrorists (I wrote about that here), it’s very useful in controlling a population.

An analysis of Google’s technology, 2005

From Stephen E. Arnold’s The Google Legacy: How Google’s Internet Search is Transforming Application Software (Infonortics: September 2005):

The figure Google’s Fusion: Hardware and Software Engineering shows that Google’s technology framework has two areas of activity. There is the software engineering effort that focuses on PageRank and other applications. Software engineering, as used here, means writing code and thinking about how computer systems operate in order to get work done quickly. Quickly means the sub one-second response times that Google is able to maintain despite its surging growth in usage, applications and data processing.

Google is hardware plus software

The other effort focuses on hardware. Google has refined server racks, cable placement, cooling devices, and data center layout. The payoff is lower operating costs and the ability to scale as demand for computing resources increases. With faster turnaround and the elimination of such troublesome jobs as backing up data, Google’s hardware innovations give it a competitive advantage few of its rivals can equal as of mid-2005.

How Google Is Different from MSN and Yahoo

Google’s technologyis simultaneously just like other online companies’ technology, and very different. A data center is usually a facility owned and operated by a third party where customers place their servers. The staff of the data center manage the power, air conditioning and routine maintenance. The customer specifies the computers and components. When a data center must expand, the staff of the facility may handle virtually all routine chores and may work with the customer’s engineers for certain more specialized tasks.

Before looking at some significant engineering differences between Google and two of its major competitors, review this list of characteristics for a Google data center.

1. Google data centers – now numbering about two dozen, although no one outside Google knows the exact number or their locations. They come online and automatically, under the direction of the Google File System, start getting work from other data centers. These facilities, sometimes filled with 10,000 or more Google computers, find one another and configure themselves with minimal human intervention.

2. The hardware in a Google data center can be bought at a local computer store. Google uses the same types of memory, disc drives, fans and power supplies as those in a standard desktop PC.

3. Each Google server comes in a standard case called a pizza box with one important change: the plugs and ports are at the front of the box to make access faster and easier.

4. Google racks are assembled for Google to hold servers on their front and back sides. This effectively allows a standard rack, normally holding 40 pizza box servers, to hold 80.

5. A Google data center can go from a stack of parts to online operation in as little as 72 hours, unlike more typical data centers that can require a week or even a month to get additional resources online.

6. Each server, rack and data center works in a way that is similar to what is called “plug and play.” Like a mouse plugged into the USB port on a laptop, Google’s network of data centers knows when more resources have been connected. These resources, for the most part, go into operation without human intervention.

Several of these factors are dependent on software. This overlap between the hardware and software competencies at Google, as previously noted, illustrates the symbiotic relationship between these two different engineering approaches. At Google, from its inception, Google software and Google hardware have been tightly coupled. Google is not a software company nor is it a hardware company. Google is, like IBM, a company that owes its existence to both hardware and software. Unlike IBM, Google has a business model that is advertiser supported. Technically, Google is conceptually closer to IBM (at one time a hardware and software company) than it is to Microsoft (primarily a software company) or Yahoo! (an integrator of multiple softwares).

Software and hardware engineering cannot be easily segregated at Google. At MSN and Yahoo hardware and software are more loosely-coupled. Two examples will illustrate these differences.

Microsoft – with some minor excursions into the Xbox game machine and peripherals – develops operating systems and traditional applications. Microsoft has multiple operating systems, and its engineers are hard at work on the company’s next-generation of operating systems.

Several observations are warranted:

1. Unlike Google, Microsoft does not focus on performance as an end in itself. As a result, Microsoft gets performance the way most computer users do. Microsoft buys or upgrades machines. Microsoft does not fiddle with its operating systems and their subfunctions to get that extra time slice or two out of the hardware.

2. Unlike Google, Microsoft has to support many operating systems and invest time and energy in making certain that important legacy applications such as Microsoft Office or SQLServer can run on these new operating systems. Microsoft has a boat anchor tied to its engineer’s ankles. The boat anchor is the need to ensure that legacy code works in Microsoft’s latest and greatest operating systems.

3. Unlike Google, Microsoft has no significant track record in designing and building hardware for distributed, massively parallelised computing. The mice and keyboards were a success. Microsoft has continued to lose money on the Xbox, and the sudden demise of Microsoft’s entry into the home network hardware market provides more evidence that Microsoft does not have a hardware competency equal to Google’s.

Yahoo! operates differently from both Google and Microsoft. Yahoo! is in mid-2005 a direct competitor to Google for advertising dollars. Yahoo! has grown through acquisitions. In search, for example, Yahoo acquired 3721.com to handle Chinese language search and retrieval. Yahoo bought Inktomi to provide Web search. Yahoo bought Stata Labs in order to provide users with search and retrieval of their Yahoo! mail. Yahoo! also owns AllTheWeb.com, a Web search site created by FAST Search & Transfer. Yahoo! owns the Overture search technology used by advertisers to locate key words to bid on. Yahoo! owns Alta Vista, the Web search system developed by Digital Equipment Corp. Yahoo! licenses InQuira search for customer support functions. Yahoo has a jumble of search technology; Google has one search technology.

Historically Yahoo has acquired technology companies and allowed each company to operate its technology in a silo. Integration of these different technologies is a time-consuming, expensive activity for Yahoo. Each of these software applications requires servers and systems particular to each technology. The result is that Yahoo has a mosaic of operating systems, hardware and systems. Yahoo!’s problem is different from Microsoft’s legacy boat-anchor problem. Yahoo! faces a Balkan-states problem.

There are many voices, many needs, and many opposing interests. Yahoo! must invest in management resources to keep the peace. Yahoo! does not have a core competency in hardware engineering for performance and consistency. Yahoo! may well have considerable competency in supporting a crazy-quilt of hardware and operating systems, however. Yahoo! is not a software engineering company. Its engineers make functions from disparate systems available via a portal.

The figure below provides an overview of the mid-2005 technical orientation of Google, Microsoft and Yahoo.

2005 focuses of Google, MSN, and Yahoo

The Technology Precepts

… five precepts thread through Google’s technical papers and presentations. The following snapshots are extreme simplifications of complex, yet extremely fundamental, aspects of the Googleplex.

Cheap Hardware and Smart Software

Google approaches the problem of reducing the costs of hardware, set up, burn-in and maintenance pragmatically. A large number of cheap devices using off-the-shelf commodity controllers, cables and memory reduces costs. But cheap hardware fails.

In order to minimize the “cost” of failure, Google conceived of smart software that would perform whatever tasks were needed when hardware devices fail. A single device or an entire rack of devices could crash, and the overall system would not fail. More important, when such a crash occurs, no full-time systems engineering team has to perform technical triage at 3 a.m.

The focus on low-cost, commodity hardware and smart software is part of the Google culture.

Logical Architecture

Google’s technical papers do not describe the architecture of the Googleplex as self-similar. Google’s technical papers provide tantalizing glimpses of an approach to online systems that makes a single server share features and functions of a cluster of servers, a complete data center, and a group of Google’s data centers.

The collections of servers running Google applications on the Google version of Linux is a supercomputer. The Googleplex can perform mundane computing chores like taking a user’s query and matching it to documents Google has indexed. Further more, the Googleplex can perform side calculations needed to embed ads in the results pages shown to user, execute parallelized, high-speed data transfers like computers running state-of-the-art storage devices, and handle necessary housekeeping chores for usage tracking and billing.

When Google needs to add processing capacity or additional storage, Google’s engineers plug in the needed resources. Due to self-similarity, the Googleplex can recognize, configure and use the new resource. Google has an almost unlimited flexibility with regard to scaling and accessing the capabilities of the Googleplex.

In Google’s self-similar architecture, the loss of an individual device is irrelevant. In fact, a rack or a data center can fail without data loss or taking the Googleplex down. The Google operating system ensures that each file is written three to six times to different storage devices. When a copy of that file is not available, the Googleplex consults a log for the location of the copies of the needed file. The application then uses that replica of the needed file and continues with the job’s processing.

Speed and Then More Speed

Google uses commodity pizza box servers organized in a cluster. A cluster is group of computers that are joined together to create a more robust system. Instead of using exotic servers with eight or more processors, Google generally uses servers that have two processors similar to those found in a typical home computer.

Through proprietary changes to Linux and other engineering innovations, Google is able to achieve supercomputer performance from components that are cheap and widely available.

… engineers familiar with Google believe that read rates may in some clusters approach 2,000 megabytes a second. When commodity hardware gets better, Google runs faster without paying a premium for that performance gain.

Another key notion of speed at Google concerns writing computer programs to deploy to Google users. Google has developed short cuts to programming. An example is Google’s creating a library of canned functions to make it easy for a programmer to optimize a program to run on the Googleplex computer. At Microsoft or Yahoo, a programmer must write some code or fiddle with code to get different pieces of a program to execute simultaneously using multiple processors. Not at Google. A programmer writes a program, uses a function from a Google bundle of canned routines, and lets the Googleplex handle the details. Google’s programmers are freed from much of the tedium associated with writing software for a distributed, parallel computer.

Eliminate or Reduce Certain System Expenses

Some lucky investors jumped on the Google bandwagon early. Nevertheless, Google was frugal, partly by necessity and partly by design. The focus on frugality influenced many hardware and software engineering decisions at the company.

Drawbacks of the Googleplex

The Laws of Physics: Heat and Power 101

In reality, no one knows. Google has a rapidly expanding number of data centers. The data center near Atlanta, Georgia, is one of the newest deployed. This state-of-the-art facility reflects what Google engineers have learned about heat and power issues in its other data centers. Within the last 12 months, Google has shifted from concentrating its servers at about a dozen data centers, each with 10,000 or more servers, to about 60 data centers, each with fewer machines. The change is a response to the heat and power issues associated with larger concentrations of Google servers.

The most failure prone components are:

  • Fans.
  • IDE drives which fail at the rate of one per 1,000 drives per day.
  • Power supplies which fail at a lower rate.

Leveraging the Googleplex

Google’s technology is one major challenge to Microsoft and Yahoo. So to conclude this cursory and vastly simplified look at Google technology, consider these items:

1. Google is fast anywhere in the world.

2. Google learns. When the heat and power problems at dense data centers surfaced, Google introduced cooling and power conservation innovations to its two dozen data centers.

3. Programmers want to work at Google. “Google has cachet,” said one recent University of Washington graduate.

4. Google’s operating and scaling costs are lower than most other firms offering similar businesses.

5. Google squeezes more work out of programmers and engineers by design.

6. Google does not break down, or at least it has not gone offline since 2000.

7. Google’s Googleplex can deliver desktop-server applications now.

8. Google’s applications install and update without burdening the user with gory details and messy crashes.

9. Google’s patents provide basic technology insight pertinent to Google’s core functionality.

Debt collection business opens up huge security holes

From Mark Gibbs’ “Debt collectors mining your secrets” (Network World: 19 June 2008):

[Bud Hibbs, a consumer advocate] told me any debt collection company has access to an incredible amount of personal data from hundreds of possible sources and the motivation to mine it.

What intrigued me after talking with Hibbs was how the debt collection business works. It turns out pretty much anyone can set up a collections operation by buying a package of bad debts for around $40,000, hiring collectors who will work on commission, and applying for the appropriate city and state licenses. Once a company is set up it can buy access to Axciom and Experian and other databases and start hunting down defaulters.

So, here we have an entire industry dedicated to buying, selling and mining your personal data that has been derived from who knows where. Even better, because the large credit reporting companies use a lot of outsourcing for data entry, much of this data has probably been processed in India or Pakistan where, of course, the data security and integrity are guaranteed.

Hibbs points out that, with no prohibitions on sending data abroad and with the likes of, say, the Russian mafia being interested in the personal information, the probability of identity theft from these foreign data centers is enormous.

George Clinton and the sample troll

From Tim Wu’s “On Copyright’s Authorship Policy” (Internet Archive: 2007):

On May 4, 2001, a one-man corporation named Bridgeport Music, Inc. launched over 500 counts of copyright infringement against more than 800 different artists and labels.1 Bridgeport Music has no employees, and other than copyrights, no reported assets.2 Technically, Bridgeport is a “catalogue company.” Others call it a “sample troll.”

Bridgeport is the owner of valuable copyrights, including many of funk singer George Clinton’s most famous songs – songs which are sampled in a good amount of rap music.3 Bridgeport located every sample of Clinton’s and other copyrights it owned, and sued based on the legal position that any sampling of a sound recording, no matter how minimal or unnoticeable, is still an infringement.

During the course of Bridgeport’s campaign, it has won two important victories. First, the Sixth Circuit, the appellate court for Nashville adopted Bridgeport’s theory of infringement. In Bridgeport Music, Inc. v. Dimension Films,4 the defendants sampled a single chord from the George Clinton tune “Get Off Your Ass and Jam,” changed the pitch, and looped the sound. Despite the plausible defense that one note is but a de minimus use of the work, the Sixth Circuit ruled for Bridgeport and created a stark rule: any sampling, no matter how minimal or undetectable, is a copyright infringement. Said the court in Bridgeport, “Get a license or do not sample. We do not see this as stifling creativity in any significant way.”5 In 2006 Bridgeport convinced a district court to enjoin the sales of the bestselling Notorious B.I.G. album, Ready to Die, for “illegal sampling.”6 A jury then awarded Bridgeport more than four million dollars in damages.7

The Bridgeport cases have been heavily criticized, and taken as a prime example of copyright’s excesses.8 Yet the deeper problem with the Bridgeport litigation is not necessarily a problem of too much copyright. It can be equally concluded that the ownership of the relevant rights is the root of the problem. George Clinton, the actual composer and recording artist, takes a much different approach to sampling. “When hip-hop came out,” said Clinton in an interview with journalist Rick Karr, “I was glad to hear it, especially when it was our songs – it was a way to get back on the radio.”9 Clinton accepts sampling of his work, and has released a three CD collection of his sounds for just that purpose.10 The problem is that he doesn’t own many of his most important copyrights. Instead, it is Bridgeport, the one-man company, that owns the rights to Clinton’s work. In the 1970s Bridgeport, through its owner Armen Boladian, managed to seize most of George Clinton’s copyrights and many other valuable rights. In at least a few cases, Boladian assigned the copyrights to Bridgeport by writing a contract and then faking Clinton’s signature.11 As Clinton puts it “he just stole ‘em.”12 With the copyrights to Clinton’s songs in the hands of Bridgeport – an entity with no vested interest in the works beyond their sheer economic value – the targeting of sampling is not surprising.

1 Tim Wu, Jay-Z Versus the Sample Troll, Slate Magazine, Nov. 16, 2006, http://www.slate.com/id/2153961/.

2 See Bridgeport Music, Inc.’s corporate entity details, Michigan Department of Labor & Economic Growth, available at http://www.dleg.state.mi.us/bcs_corp/dt_corp.asp?id_nbr=190824&name_entity=BRIDGEPORT%20MUSIC,%20INC (last visited Mar. 18, 2007).

3 See Wu, supra note 1.

4 410 F.3d 792 (6th Cir. 2005).

5 Id. at 801.

6 Jeff Leeds, Judge Freezes Notorious B.I.G. Album, N.Y. Times, Mar. 21, 2006, at E2.

7 Id.

8 See, e.g., Matthew R. Broodin, Comment, Bridgeport Music, Inc. v. Dimension Films: The Death of the Substantial Similarity Test in Digital Samping Copyright Infringemnt Claims—The Sixth Circuit’s Flawed Attempt at a Bright Line Rule, 6 Minn. J. L. Sci. & Tech. 825 (2005); Jeffrey F. Kersting, Comment, Singing a Different Tune: Was the Sixth Circuit Justified in Changing the Protection of Sound Recordings in Bridgeport Music, Inc. v. Dimension Films?, 74 U. Cin. L. Rev. 663 (2005) (answering the title question in the negative); John Schietinger, Note, Bridgeport Music, Inc. v. Dimension Films: How the Sixth Circuit Missed a Beat on Digital Music Sampling, 55 DePaul L. Rev. 209 (2005).

9 Interview by Rick Karr with George Clinton, at the 5th Annual Future of Music Policy Summit, Wash. D.C. (Sept. 12, 2005), video clip available at http://www.tvworldwide.com/showclip.cfm?ID=6128&clip=2 [hereinafter Clinton Interview].

10 George Clinton, Sample Some of Disc, Sample Some of D.A.T., Vols. 1-3 (1993-94).

11 Sound Generator, George Clinton awarded Funkadelic master recordings (Jun. 6, 2005), http://www.soundgenerator.com/news/showarticle.cfm?articleid=5555.

12 Clinton Interview, supra note 9.

George Clinton and the sample troll

From Tim Wu’s “On Copyright’s Authorship Policy” (Internet Archive: 2007):

On May 4, 2001, a one-man corporation named Bridgeport Music, Inc. launched over 500 counts of copyright infringement against more than 800 different artists and labels.1 Bridgeport Music has no employees, and other than copyrights, no reported assets.2 Technically, Bridgeport is a “catalogue company.” Others call it a “sample troll.”

Bridgeport is the owner of valuable copyrights, including many of funk singer George Clinton’s most famous songs – songs which are sampled in a good amount of rap music.3 Bridgeport located every sample of Clinton’s and other copyrights it owned, and sued based on the legal position that any sampling of a sound recording, no matter how minimal or unnoticeable, is still an infringement.

During the course of Bridgeport’s campaign, it has won two important victories. First, the Sixth Circuit, the appellate court for Nashville adopted Bridgeport’s theory of infringement. In Bridgeport Music, Inc. v. Dimension Films,4 the defendants sampled a single chord from the George Clinton tune “Get Off Your Ass and Jam,” changed the pitch, and looped the sound. Despite the plausible defense that one note is but a de minimus use of the work, the Sixth Circuit ruled for Bridgeport and created a stark rule: any sampling, no matter how minimal or undetectable, is a copyright infringement. Said the court in Bridgeport, “Get a license or do not sample. We do not see this as stifling creativity in any significant way.”5 In 2006 Bridgeport convinced a district court to enjoin the sales of the bestselling Notorious B.I.G. album, Ready to Die, for “illegal sampling.”6 A jury then awarded Bridgeport more than four million dollars in damages.7

The Bridgeport cases have been heavily criticized, and taken as a prime example of copyright’s excesses.8 Yet the deeper problem with the Bridgeport litigation is not necessarily a problem of too much copyright. It can be equally concluded that the ownership of the relevant rights is the root of the problem. George Clinton, the actual composer and recording artist, takes a much different approach to sampling. “When hip-hop came out,” said Clinton in an interview with journalist Rick Karr, “I was glad to hear it, especially when it was our songs – it was a way to get back on the radio.”9 Clinton accepts sampling of his work, and has released a three CD collection of his sounds for just that purpose.10 The problem is that he doesn’t own many of his most important copyrights. Instead, it is Bridgeport, the one-man company, that owns the rights to Clinton’s work. In the 1970s Bridgeport, through its owner Armen Boladian, managed to seize most of George Clinton’s copyrights and many other valuable rights. In at least a few cases, Boladian assigned the copyrights to Bridgeport by writing a contract and then faking Clinton’s signature.11 As Clinton puts it “he just stole ‘em.”12 With the copyrights to Clinton’s songs in the hands of Bridgeport – an entity with no vested interest in the works beyond their sheer economic value – the targeting of sampling is not surprising.

1 Tim Wu, Jay-Z Versus the Sample Troll, Slate Magazine, Nov. 16, 2006, http://www.slate.com/id/2153961/.

2 See Bridgeport Music, Inc.’s corporate entity details, Michigan Department of Labor & Economic Growth, available at http://www.dleg.state.mi.us/bcs_corp/dt_corp.asp?id_nbr=190824&name_entity=BRI DGEPORT%20MUSIC,%20INC (last visited Mar. 18, 2007).

3 See Wu, supra note 1.

4 410 F.3d 792 (6th Cir. 2005).

5 Id. at 801.

6 Jeff Leeds, Judge Freezes Notorious B.I.G. Album, N.Y. Times, Mar. 21, 2006, at E2.

7 Id.

8 See, e.g., Matthew R. Broodin, Comment, Bridgeport Music, Inc. v. Dimension Films: The Death of the Substantial Similarity Test in Digital Samping Copyright Infringemnt Claims—The Sixth Circuit’s Flawed Attempt at a Bright Line Rule, 6 Minn. J. L. Sci. & Tech. 825 (2005); Jeffrey F. Kersting, Comment, Singing a Different Tune: Was the Sixth Circuit Justified in Changing the Protection of Sound Recordings in Bridgeport Music, Inc. v. Dimension Films?, 74 U. Cin. L. Rev. 663 (2005) (answering the title question in the negative); John Schietinger, Note, Bridgeport Music, Inc. v. Dimension Films: How the Sixth Circuit Missed a Beat on Digital Music Sampling, 55 DePaul L. Rev. 209 (2005).

9 Interview by Rick Karr with George Clinton, at the 5th Annual Future of Music Policy Summit, Wash. D.C. (Sept. 12, 2005), video clip available at http://www.tvworldwide.com/showclip.cfm?ID=6128&clip=2 [hereinafter Clinton Interview].

10 George Clinton, Sample Some of Disc, Sample Some of D.A.T., Vols. 1-3 (1993-94).

11 Sound Generator, George Clinton awarded Funkadelic master recordings (Jun. 6, 2005), http://www.soundgenerator.com/news/showarticle.cfm?articleid=5555.

12 Clinton Interview, supra note 9.

Types of open source licenses

From Eric Steven Raymond’s “Varieties of Open-Source Licensing” (The Art of Unix Programming: 19 September 2003):

MIT or X Consortium License

The loosest kind of free-software license is one that grants unrestricted rights to copy, use, modify, and redistribute modified copies as long as a copy of the copyright and license terms is retained in all modified versions. But when you accept this license you do give up the right to sue the maintainers. …

BSD Classic License

The next least restrictive kind of license grants unrestricted rights to copy, use, modify, and redistribute modified copies as long as a copy of the copyright and license terms is retained in all modified versions, and an acknowledgment is made in advertising or documentation associated with the package. Grantee has to give up the right to sue the maintainers. … Note that in mid-1999 the Office of Technology Transfer of the University of California rescinded the advertising clause in the BSD license. …

Artistic License

The next most restrictive kind of license grants unrestricted rights to copy, use, and locally modify. It allows redistribution of modified binaries, but restricts redistribution of modified sources in ways intended to protect the interests of the authors and the free-software community. …

General Public License

The GNU General Public License (and its derivative, the Library or “Lesser” GPL) is the single most widely used free-software license. Like the Artistic License, it allows redistribution of modified sources provided the modified files bear “prominent notice”.

The GPL requires that any program containing parts that are under GPL be wholly GPLed. (The exact circumstances that trigger this requirement are not perfectly clear to everybody.)

These extra requirements actually make the GPL more restrictive than any of the other commonly used licenses. …

Mozilla Public License

The Mozilla Public License supports software that is open source, but may be linked with closed-source modules or extensions. It requires that the distributed software (“Covered Code”) remain open, but permits add-ons called through a defined API to remain closed. …