technology

What happens to IP when it’s easy to copy anything?

From Bruce Sterling’s “2009 Will Be a Year of Panic” (Seed: 29 January 2009):

Let’s consider seven other massive reservoirs of potential popular dread. Any one of these could erupt, shattering the fragile social compact we maintain with one another in order to believe things contrary to fact.

2. Intellectual property. More specifically, the fiat declaration that properties that are easy to reproduce shouldn’t be reproduced.

Declaring that “information wants to be free” is an ideological stance. A real-world situation where information can’t be anything but free, where digital information cannot be monetized, is bizarre and deeply scary. No banker or economist anywhere has the ghost of a clue what to do under such conditions.

Intellectual property made sense and used to work rather well when conditions of production favored it. Now they don’t. If it’s simple to copy just one single movie, some gray area of fair use can be tolerated. If it becomes easy to copy a million movies with one single button-push, this vast economic superstructure is reduced to rags. Our belief in this kind of “property” becomes absurd.

To imagine that real estate is worthless is strange, though we’ve somehow managed to do that. But our society is also built on the supposed monetary worth of unreal estate. In fact, the planet’s most advanced economies are optimized to create pretty much nothing else. The ultimate global consequences of this situation’s abject failure would rank with the collapse of Communism.


Give CLEAR your info, watch CLEAR lose your info

From “Missing SFO Laptop With Sensitive Data Found” (CBS5: 5 August 2008):

The company that runs a fast-pass security prescreening program at San Francisco International Airport said Tuesday that it found a laptop containing the personal information of 33,000 people more than a week after it apparently went missing.

The Transportation Security Administration announced late Monday that it had suspended new enrollments to the program, known as Clear, after the unencrypted computer was reported stolen at SFO.

The laptop was found Tuesday morning in the same company office where it supposedly had gone missing on July 26, said spokeswoman Allison Beer.

“It was not in an obvious location,” said Beer, who said an investigation was under way to determine whether the computer was actually stolen or had just been misplaced.

The laptop contained personal information on applicants to the program, including names, addresses and birth dates, and in some cases driver’s license, passport or green card numbers, the company said.

The laptop did not contain Social Security numbers, credit card numbers or fingerprint or iris images used to verify identities at the checkpoints, Beer said.

In a statement, the company said the information on the laptop, which was originally reported stolen from its locked office, “is secured by two levels of password protection.” Beer called the fact that the personal information itself was not encrypted “a mistake” that the company would fix.
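The distinction Beer draws matters in practice: password protection only gates access to a running system, while encryption renders the stored data unreadable without a key even if the hardware is stolen. Here is a minimal sketch of encrypting a record at rest, using the third-party Python cryptography package (purely illustrative, and not anything CLEAR actually deployed):

```python
# Rough illustration of encrypting data at rest (the third-party
# "cryptography" package; not anything CLEAR deployed). Without the
# key, the stored bytes are useless even if the laptop is stolen.
from cryptography.fernet import Fernet

key = Fernet.generate_key()          # kept separately from the laptop
vault = Fernet(key)

record = b"Jane Doe, 123 Main St, passport X1234567"
stored = vault.encrypt(record)       # what actually sits on disk

print(stored[:40], b"...")           # ciphertext: unreadable without the key
print(vault.decrypt(stored))         # readable only with the key
```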


How ARP works

From Chris Sanders’ “Packet School 201 – Part 1 (ARP)” (Completely Full of I.T.: 23 December 2007):

The basic idea behind ARP is for a machine to broadcast its IP address and MAC address to all of the clients in its broadcast domain in order to find out the MAC address associated with a particular IP address. Basically put, it looks like this:

Computer A – “Hey everybody, my IP address is XX.XX.XX.XX, and my MAC address is XX:XX:XX:XX:XX:XX. I need to send something to whoever has the IP address XX.XX.XX.XX, but I don’t know what their hardware address is. Will whoever has this IP address please respond back with their MAC address?”

All of the other computers that receive the broadcast will simply ignore it; however, the one that does have the requested IP address will send its MAC address to Computer A. With this information in hand, the exchange of data can begin.

Computer B – “Hey Computer A. I am who you are looking for with the IP address of XX.XX.XX.XX. My MAC address is XX:XX:XX:XX:XX:XX.”

One of the best ways I’ve seen this concept described is through the limousine driver analogy. If you have ever flown, then chances are when you get off of a plane, you have seen a limo driver standing with a sign bearing someone’s last name. Here, the driver knows the name of the person he is picking up, but doesn’t know what they look like. The driver holds up the sign so that everyone can see it. All of the people getting off of the plane see the sign, and if it isn’t them, they simply ignore it. The person whose name is on the sign, however, sees it, approaches the driver, and identifies himself.
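To make the exchange concrete, here is a minimal sketch of the same “who-has” broadcast using Python and the Scapy packet library. The target IP is a placeholder for a host on your own network, and sending raw packets requires root privileges:

```python
# A minimal sketch of an ARP "who-has" request using Scapy.
# Assumes Scapy is installed (pip install scapy) and that you run it
# with root privileges on a network you are allowed to probe.
from scapy.all import ARP, Ether, srp

TARGET_IP = "192.168.1.50"  # placeholder: the IP whose MAC we want

# Computer A's broadcast: "Will whoever has this IP please respond?"
request = Ether(dst="ff:ff:ff:ff:ff:ff") / ARP(pdst=TARGET_IP)

# srp() sends at layer 2 and collects replies; hosts that don't own
# TARGET_IP simply ignore the broadcast.
answered, _ = srp(request, timeout=2, verbose=False)

# Computer B's reply carries its MAC address (hwsrc).
for _, reply in answered:
    print(f"{reply.psrc} is at {reply.hwsrc}")
```

When a reply comes back, its hwsrc field is exactly the MAC address Computer B volunteers in the dialogue above.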


ODF compared & contrasted with OOXML

From Sam Hiser’s “Achieving Openness: A Closer Look at ODF and OOXML” (ONLamp.com: 14 June 2007):

An open, XML-based standard for displaying and storing data files (text documents, spreadsheets, and presentations) offers a new and promising approach to data storage and document exchange among office applications. A comparison of the two XML-based formats–OpenDocument Format (“ODF”) and Office Open XML (“OOXML”)–across widely accepted “openness” criteria has revealed substantial differences, including the following:

  • ODF is developed and maintained in an open, multi-vendor, multi-stakeholder process that protects against control by a single organization. OOXML is less open in its development and maintenance, despite being submitted to a formal standards body, because control of the standard ultimately rests with one organization.
  • ODF is the only openly available standard, published fully in a document that is freely available and easy to comprehend. This openness is reflected in the number of competing applications in which ODF is already implemented. Unlike ODF, OOXML’s complexity, extraordinary length, technical omissions, and single-vendor dependencies combine to make alternative implementation unattractive as well as legally and practically impossible.
  • ODF is the only format unencumbered by intellectual property rights (IPR) restrictions on its use in other software, as certified by the Software Freedom Law Center. Conversely, many elements designed into the OOXML formats but left undefined in the OOXML specification require behaviors upon document files that only Microsoft Office applications can provide. This makes data inaccessible and breaks work group productivity whenever alternative software is used.
  • ODF offers interoperability with ODF-compliant applications on most of the common operating system platforms. OOXML is designed to operate fully within the Microsoft environment only. Though it will work elegantly across the many products in the Microsoft catalog, OOXML ignores accepted standards and best practices regarding its use of XML.

Overall, a comparison of both formats reveals significant differences in their levels of openness. While ODF is revealed as sufficiently open across all four key criteria, OOXML shows relative weakness in each criterion and exhibits fundamental flaws that undermine its candidacy as a global standard.
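One practical consequence of ODF’s openness is how little code it takes to read a document. An ODF file is just a ZIP archive of documented XML parts, so the sketch below (my own illustration, not from Hiser’s article; example.odt is a placeholder filename) extracts a document’s text with nothing but Python’s standard library:

```python
# A rough sketch: read the text of an ODF word-processing file (.odt)
# using only the standard library. ODF documents are ZIP archives whose
# main content lives in content.xml. "example.odt" is a placeholder.
import zipfile
import xml.etree.ElementTree as ET

TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

with zipfile.ZipFile("example.odt") as odf:
    root = ET.fromstring(odf.read("content.xml"))

# Every paragraph is a <text:p> element; itertext() flattens nested spans.
for para in root.iter(f"{{{TEXT_NS}}}p"):
    print("".join(para.itertext()))
```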


The future of security

From Bruce Schneier’s “Security in Ten Years” (Crypto-Gram: 15 December 2007):

Bruce Schneier: … The nature of the attacks will be different: the targets, tactics and results. Security is both a trade-off and an arms race, a balance between attacker and defender, and changes in technology upset that balance. Technology might make one particular tactic more effective, or one particular security technology cheaper and more ubiquitous. Or a new emergent application might become a favored target.

By 2017, people and organizations won’t be buying computers and connectivity the way they are today. The world will be dominated by telcos, large ISPs and systems integration companies, and computing will look a lot like a utility. Companies will be selling services, not products: email services, application services, entertainment services. We’re starting to see this trend today, and it’s going to take off in the next 10 years. Where this affects security is that by 2017, people and organizations won’t have a lot of control over their security. Everything will be handled at the ISPs and in the backbone. The free-wheeling days of general-use PCs will be largely over. Think of the iPhone model: You get what Apple decides to give you, and if you try to hack your phone, they can disable it remotely. We techie geeks won’t like it, but it’s the future. The Internet is all about commerce, and commerce won’t survive any other way.

Marcus Ranum: … Another trend I see getting worse is government IT know-how. At the rate outsourcing has been brain-draining the federal workforce, by 2017 there won’t be a single government employee who knows how to do anything with a computer except run PowerPoint and Web surf. Joking aside, the result is that the government’s critical infrastructure will be almost entirely managed from the outside. The strategic implications of such a shift have scared me for a long time; it amounts to a loss of control over data, resources and communications.

Bruce Schneier: … I’m reminded of the post-9/11 anti-terrorist hysteria — we’ve confused security with control, and instead of building systems for real security, we’re building systems of control. Think of ID checks everywhere, the no-fly list, warrantless eavesdropping, broad surveillance, data mining, and all the systems to check up on scuba divers, private pilots, peace activists and other groups of people. These give us negligible security, but put a whole lot of control in the government’s hands.

That’s the problem with any system that relies on control: Once you figure out how to hack the control system, you’re pretty much golden. So instead of a zillion pesky worms, by 2017 we’re going to see fewer but worse super worms that sail past our defenses.


My new book – Google Apps Deciphered – is out!

I’m really proud to announce that my 5th book is now out & available for purchase: Google Apps Deciphered: Compute in the Cloud to Streamline Your Desktop. My other books include:

(I’ve also contributed to two others: Ubuntu Hacks: Tips & Tools for Exploring, Using, and Tuning Linux and Microsoft Vista for IT Security Professionals.)

Google Apps Deciphered is a guide to setting up Google Apps, migrating to it, customizing it, and using it to improve productivity, communications, and collaboration. I walk you through each leading component of Google Apps individually, and then show you exactly how to make them work together on the Web or by integrating them with your favorite desktop apps. I provide practical insights on Google Apps programs for email, calendaring, contacts, wikis, word processing, spreadsheets, presentations, video, and even Google’s new web browser Chrome. My aim was to collect and present tips and tricks I’ve gained by using and setting up Google Apps for clients, family, and friends.

Here’s the table of contents:

  • 1: Choosing an Edition of Google Apps
  • 2: Setting Up Google Apps
  • 3: Migrating Email to Google Apps
  • 4: Migrating Contacts to Google Apps
  • 5: Migrating Calendars to Google Apps
  • 6: Managing Google Apps Services
  • 7: Setting Up Gmail
  • 8: Things to Know About Using Gmail
  • 9: Integrating Gmail with Other Software and Services
  • 10: Integrating Google Contacts with Other Software and Services
  • 11: Setting Up Google Calendar
  • 12: Things to Know About Using Google Calendar
  • 13: Integrating Google Calendar with Other Software and Services
  • 14: Things to Know About Using Google Docs
  • 15: Integrating Google Docs with Other Software and Services
  • 16: Setting Up Google Sites
  • 17: Things to Know About Using Google Sites
  • 18: Things to Know About Using Google Talk
  • 19: Things to Know About Using Start Page
  • 20: Things to Know About Using Message Security and Recovery
  • 21: Things to Know About Using Google Video
  • Appendix A: Backing Up Google Apps
  • Appendix B: Dealing with Multiple Accounts
  • Appendix C: Google Chrome: A Browser Built for Cloud Computing

If you want to know more about Google Apps and how to use it, then I know you’ll enjoy and learn from Google Apps Deciphered. You can read about and buy the book at Amazon (http://www.amazon.com/Google-Apps-Deciphered-Compute-Streamline/dp/0137004702) for $26.39. If you have any questions or comments, don’t hesitate to contact me at scott at granneman dot com.


A single medium, with a single search engine, & a single info source

From Nicholas Carr’s “All hail the information triumvirate!” (Rough Type: 22 January 2009):

Today, another year having passed, I did the searches [on Google] again. And guess what:

World War II: #1
Israel: #1
George Washington: #1
Genome: #1
Agriculture: #1
Herman Melville: #1
Internet: #1
Magna Carta: #1
Evolution: #1
Epilepsy: #1

Yes, it’s a clean sweep for Wikipedia.

The first thing to be said is: Congratulations, Wikipedians. You rule. Seriously, it’s a remarkable achievement. Who would have thought that a rag-tag band of anonymous volunteers could achieve what amounts to hegemony over the results of the most popular search engine, at least when it comes to searches for common topics?

The next thing to be said is: what we seem to have here is evidence of a fundamental failure of the Web as an information-delivery service. Three things have happened, in a blink of history’s eye: (1) a single medium, the Web, has come to dominate the storage and supply of information, (2) a single search engine, Google, has come to dominate the navigation of that medium, and (3) a single information source, Wikipedia, has come to dominate the results served up by that search engine. Even if you adore the Web, Google, and Wikipedia – and I admit there’s much to adore – you have to wonder if the transformation of the Net from a radically heterogeneous information source to a radically homogeneous one is a good thing. Is culture best served by an information triumvirate?

It’s hard to imagine that Wikipedia articles are actually the very best source of information for all of the many thousands of topics on which they now appear as the top Google search result. What’s much more likely is that the Web, through its links, and Google, through its search algorithms, have inadvertently set into motion a very strong feedback loop that amplifies popularity and, in the end, leads us all, lemminglike, down the same well-trod path – the path of least resistance. You might call this the triumph of the wisdom of the crowd. I would suggest that it would be more accurately described as the triumph of the wisdom of the mob. The former sounds benign; the latter, less so.
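Carr’s feedback loop is essentially a rich-get-richer dynamic: rank drives clicks and links, which in turn drive rank. The toy simulation below (my own illustration, with invented parameters) shows how even a small initial edge locks in once the top result captures most of the clicks:

```python
# Toy model of the popularity feedback loop (all numbers are invented,
# purely illustrative): whichever page currently ranks #1 gets the
# lion's share of clicks, and those clicks keep it ranked #1.
import random

random.seed(1)
clicks = {"wikipedia": 105, "next_best_source": 100}  # a tiny initial edge
TOP_RESULT_CLICK_RATE = 0.9   # assumed share of users who click result #1

for _ in range(10_000):
    leader = max(clicks, key=clicks.get)
    follower = min(clicks, key=clicks.get)
    chosen = leader if random.random() < TOP_RESULT_CLICK_RATE else follower
    clicks[chosen] += 1

for name, n in clicks.items():
    print(f"{name}: {n / sum(clicks.values()):.1%} of clicks")
```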


A definition of cloud computing

From Darryl K. Taft’s “Predictions for the Cloud in 2009” (eWeek: 29 December 2008):

[Peter] Coffee, who is now director of platform research at Salesforce.com, said, “I’m currently using a simple reference model for what a ‘cloud computing’ initiative should try to provide. I’m borrowing from the famous Zero-One-Infinity rule, canonically defined in The Jargon File…”

He continued, “It seems to me that a serious effort at delivering cloud benefits pursues the following ideals—perhaps never quite reaching them, but clearly having them as goals within theoretical possibility: Zero—On-premise[s] infrastructure, acquisition cost, adoption cost and support cost. One—Coherent software environment—not a ‘stack’ of multiple products from different providers. This avoids the chaos of uncoordinated release cycles or deferred upgrades. Infinity—Scalability in response to changing need, integratability/interoperability with legacy assets and other services, and customizability/programmability from data, through logic, up into the user interface without compromising robust multitenancy.”


DIY genetic engineering

From Marcus Wohlsen’s “Amateurs are trying genetic engineering at home” (AP: 25 December 2008):

Now, tinkerers are working at home with the basic building blocks of life itself.

Using homemade lab equipment and the wealth of scientific knowledge available online, these hobbyists are trying to create new life forms through genetic engineering — a field long dominated by Ph.D.s toiling in university and corporate laboratories.

In her San Francisco dining room lab, for example, 31-year-old computer programmer Meredith L. Patterson is trying to develop genetically altered yogurt bacteria that will glow green to signal the presence of melamine, the chemical that turned Chinese-made baby formula and pet food deadly.

Many of these amateurs may have studied biology in college but have no advanced degrees and are not earning a living in the biotechnology field. Some proudly call themselves “biohackers” — innovators who push technological boundaries and put the spread of knowledge before profits.

In Cambridge, Mass., a group called DIYbio is setting up a community lab where the public could use chemicals and lab equipment, including a used freezer, scored for free off Craigslist, that drops to 80 degrees below zero, the temperature needed to keep many kinds of bacteria alive.

Patterson, the computer programmer, wants to insert the gene for fluorescence into yogurt bacteria, applying techniques developed in the 1970s.

She learned about genetic engineering by reading scientific papers and getting tips from online forums. She ordered jellyfish DNA for a green fluorescent protein from a biological supply company for less than $100. And she built her own lab equipment, including a gel electrophoresis chamber, or DNA analyzer, which she constructed for less than $25, versus more than $200 for a low-end off-the-shelf model.


Many layers of cloud computing, or just one?

From Nicholas Carr’s “Further musings on the network effect and the cloud” (Rough Type: 27 October 2008):

I think O’Reilly did a nice job of identifying the different layers of the cloud computing business – infrastructure, development platform, applications – and I think he’s right that they’ll have different economic and competitive characteristics. One thing we don’t know yet, though, is whether those layers will in the long run exist as separate industry sectors or whether they’ll collapse into a single supply model. In other words, will the infrastructure suppliers also come to dominate the supply of apps? Google and Microsoft are obviously trying to play across all three layers, while Amazon so far seems content to focus on the infrastructure business and Salesforce is expanding from the apps layer to the development platform layer. The degree to which the layers remain, or don’t remain, discrete business sectors will play a huge role in determining the ultimate shape, economics, and degree of consolidation in cloud computing.

Let me end on a speculative note: There’s one layer in the cloud that O’Reilly failed to mention, and that layer is actually on top of the application layer. It’s what I’ll call the device layer – encompassing all the various appliances people will use to tap the cloud – and it may ultimately come to be the most interesting layer. A hundred years ago, when Tesla, Westinghouse, Insull, and others were building the cloud of that time – the electric grid – companies viewed the effort in terms of the inputs to their business: in particular, the power they needed to run the machines that produced the goods they sold. But the real revolutionary aspect of the electric grid was not the way it changed business inputs – though that was indeed dramatic – but the way it changed business outputs. After the grid was built, we saw an avalanche of new products outfitted with electric cords, many of which were inconceivable before the grid’s arrival. The real fortunes were made by those companies that thought most creatively about the devices that consumers would plug into the grid. Today, we’re already seeing hints of the device layer – of the cloud as output rather than input. Look at the way, for instance, that the little old iPod has shaped the digital music cloud.


Business models for software

From Brian D’s “The benefits of a monthly recurring revenue model in tough economic times” (37 Signals: 18 December 2008):

At 37signals we sell our web-based products using the monthly subscription model. We also give people a 30-day free trial up front before we bill them for their first month.

We think this model works best all the time, but we believe it works especially well in tough times. When times get tough people obviously look to spend less, but understanding how they spend less has a lot to do with which business models work better than others.

There are lots of business models for software. Here are a few of the most popular:

* Freeware
* Freeware, ad supported
* One-off pay up front, get upgrades free
* One-off pay up front, pay for upgrades
* Subscription (recurring annual)
* Subscription (recurring monthly)
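To see why the recurring models at the bottom of that list tend to hold up better in a downturn, here is a back-of-the-envelope comparison with invented numbers (mine, not 37signals’): one-off revenue collapses the moment new purchases dry up, while subscription revenue declines only as fast as existing customers churn:

```python
# Back-of-the-envelope comparison of one-off vs. monthly-subscription
# revenue when new sales suddenly drop. All figures are invented.
ONE_OFF_PRICE = 99.0       # pay once, upgrades free
MONTHLY_PRICE = 24.0       # recurring monthly subscription
NEW_CUSTOMERS = [100] * 6 + [20] * 6   # demand falls sharply at month 7
MONTHLY_CHURN = 0.03       # 3% of subscribers cancel each month

subscribers = 0.0
for month, new in enumerate(NEW_CUSTOMERS, start=1):
    one_off_revenue = new * ONE_OFF_PRICE
    subscribers = subscribers * (1 - MONTHLY_CHURN) + new
    subscription_revenue = subscribers * MONTHLY_PRICE
    print(f"month {month:2d}: one-off ${one_off_revenue:8,.0f}   "
          f"subscription ${subscription_revenue:8,.0f}")
```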


An analysis of Google’s technology, 2005

From Stephen E. Arnold’s The Google Legacy: How Google’s Internet Search is Transforming Application Software (Infonortics: September 2005):

The figure Google’s Fusion: Hardware and Software Engineering shows that Google’s technology framework has two areas of activity. There is the software engineering effort that focuses on PageRank and other applications. Software engineering, as used here, means writing code and thinking about how computer systems operate in order to get work done quickly. Quickly means the sub one-second response times that Google is able to maintain despite its surging growth in usage, applications and data processing.

Google is hardware plus software

The other effort focuses on hardware. Google has refined server racks, cable placement, cooling devices, and data center layout. The payoff is lower operating costs and the ability to scale as demand for computing resources increases. With faster turnaround and the elimination of such troublesome jobs as backing up data, Google’s hardware innovations give it a competitive advantage few of its rivals can equal as of mid-2005.

How Google Is Different from MSN and Yahoo

Google’s technology is simultaneously just like other online companies’ technology, and very different. A data center is usually a facility owned and operated by a third party where customers place their servers. The staff of the data center manage the power, air conditioning and routine maintenance. The customer specifies the computers and components. When a data center must expand, the staff of the facility may handle virtually all routine chores and may work with the customer’s engineers for certain more specialized tasks.

Before looking at some significant engineering differences between Google and two of its major competitors, review this list of characteristics for a Google data center.

1. Google data centers – now numbering about two dozen, although no one outside Google knows the exact number or their locations – come online and, under the direction of the Google File System, automatically start getting work from other data centers. These facilities, sometimes filled with 10,000 or more Google computers, find one another and configure themselves with minimal human intervention.

2. The hardware in a Google data center can be bought at a local computer store. Google uses the same types of memory, disc drives, fans and power supplies as those in a standard desktop PC.

3. Each Google server comes in a standard case called a pizza box with one important change: the plugs and ports are at the front of the box to make access faster and easier.

4. Google’s racks are assembled to its own specifications so that they hold servers on both their front and back sides. This effectively allows a standard rack, normally holding 40 pizza box servers, to hold 80.

5. A Google data center can go from a stack of parts to online operation in as little as 72 hours, unlike more typical data centers that can require a week or even a month to get additional resources online.

6. Each server, rack and data center works in a way that is similar to what is called “plug and play.” Like a mouse plugged into the USB port on a laptop, Google’s network of data centers knows when more resources have been connected. These resources, for the most part, go into operation without human intervention.

Several of these factors are dependent on software. This overlap between the hardware and software competencies at Google, as previously noted, illustrates the symbiotic relationship between these two different engineering approaches. At Google, from its inception, Google software and Google hardware have been tightly coupled. Google is not a software company nor is it a hardware company. Google is, like IBM, a company that owes its existence to both hardware and software. Unlike IBM, Google has a business model that is advertiser supported. Technically, Google is conceptually closer to IBM (at one time a hardware and software company) than it is to Microsoft (primarily a software company) or Yahoo! (an integrator of multiple softwares).

Software and hardware engineering cannot be easily segregated at Google. At MSN and Yahoo hardware and software are more loosely-coupled. Two examples will illustrate these differences.

Microsoft – with some minor excursions into the Xbox game machine and peripherals – develops operating systems and traditional applications. Microsoft has multiple operating systems, and its engineers are hard at work on the company’s next-generation of operating systems.

Several observations are warranted:

1. Unlike Google, Microsoft does not focus on performance as an end in itself. As a result, Microsoft gets performance the way most computer users do. Microsoft buys or upgrades machines. Microsoft does not fiddle with its operating systems and their subfunctions to get that extra time slice or two out of the hardware.

2. Unlike Google, Microsoft has to support many operating systems and invest time and energy in making certain that important legacy applications such as Microsoft Office or SQLServer can run on these new operating systems. Microsoft has a boat anchor tied to its engineers’ ankles. The boat anchor is the need to ensure that legacy code works in Microsoft’s latest and greatest operating systems.

3. Unlike Google, Microsoft has no significant track record in designing and building hardware for distributed, massively parallelised computing. The mice and keyboards were a success. Microsoft has continued to lose money on the Xbox, and the sudden demise of Microsoft’s entry into the home network hardware market provides more evidence that Microsoft does not have a hardware competency equal to Google’s.

Yahoo! operates differently from both Google and Microsoft. Yahoo! is in mid-2005 a direct competitor to Google for advertising dollars. Yahoo! has grown through acquisitions. In search, for example, Yahoo acquired 3721.com to handle Chinese language search and retrieval. Yahoo bought Inktomi to provide Web search. Yahoo bought Stata Labs in order to provide users with search and retrieval of their Yahoo! mail. Yahoo! also owns AllTheWeb.com, a Web search site created by FAST Search & Transfer. Yahoo! owns the Overture search technology used by advertisers to locate key words to bid on. Yahoo! owns Alta Vista, the Web search system developed by Digital Equipment Corp. Yahoo! licenses InQuira search for customer support functions. Yahoo has a jumble of search technology; Google has one search technology.

Historically Yahoo has acquired technology companies and allowed each company to operate its technology in a silo. Integration of these different technologies is a time-consuming, expensive activity for Yahoo. Each of these software applications requires servers and systems particular to each technology. The result is that Yahoo has a mosaic of operating systems, hardware and systems. Yahoo!’s problem is different from Microsoft’s legacy boat-anchor problem. Yahoo! faces a Balkan-states problem.

There are many voices, many needs, and many opposing interests. Yahoo! must invest in management resources to keep the peace. Yahoo! does not have a core competency in hardware engineering for performance and consistency. Yahoo! may well have considerable competency in supporting a crazy-quilt of hardware and operating systems, however. Yahoo! is not a software engineering company. Its engineers make functions from disparate systems available via a portal.

The figure below provides an overview of the mid-2005 technical orientation of Google, Microsoft and Yahoo.

[Figure: 2005 focuses of Google, MSN, and Yahoo]

The Technology Precepts

… five precepts thread through Google’s technical papers and presentations. The following snapshots are extreme simplifications of complex, yet extremely fundamental, aspects of the Googleplex.

Cheap Hardware and Smart Software

Google approaches the problem of reducing the costs of hardware, set up, burn-in and maintenance pragmatically. A large number of cheap devices using off-the-shelf commodity controllers, cables and memory reduces costs. But cheap hardware fails.

In order to minimize the “cost” of failure, Google conceived of smart software that would perform whatever tasks were needed when hardware devices fail. A single device or an entire rack of devices could crash, and the overall system would not fail. More important, when such a crash occurs, no full-time systems engineering team has to perform technical triage at 3 a.m.

The focus on low-cost, commodity hardware and smart software is part of the Google culture.

Logical Architecture

Google’s technical papers do not describe the architecture of the Googleplex as self-similar. Google’s technical papers provide tantalizing glimpses of an approach to online systems that makes a single server share features and functions of a cluster of servers, a complete data center, and a group of Google’s data centers.

The collection of servers running Google applications on the Google version of Linux is a supercomputer. The Googleplex can perform mundane computing chores like taking a user’s query and matching it to documents Google has indexed. Furthermore, the Googleplex can perform side calculations needed to embed ads in the results pages shown to users, execute parallelized, high-speed data transfers like computers running state-of-the-art storage devices, and handle necessary housekeeping chores for usage tracking and billing.

When Google needs to add processing capacity or additional storage, Google’s engineers plug in the needed resources. Due to self-similarity, the Googleplex can recognize, configure and use the new resource. Google has an almost unlimited flexibility with regard to scaling and accessing the capabilities of the Googleplex.

In Google’s self-similar architecture, the loss of an individual device is irrelevant. In fact, a rack or a data center can fail without data loss or taking the Googleplex down. The Google operating system ensures that each file is written three to six times to different storage devices. When a copy of that file is not available, the Googleplex consults a log for the location of the copies of the needed file. The application then uses that replica of the needed file and continues with the job’s processing.
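Arnold describes this behavior only at a high level; the sketch below is a toy illustration of the idea (not Google’s actual code or file system API): a read consults the log of replica locations and simply falls through to the next copy when a device is unavailable:

```python
# Toy illustration (not Google's actual GFS code) of reading a file
# that has been written to several storage devices: if one replica's
# device is down, the lookup log points the reader at the next copy.
replica_log = {
    "crawl/segment-0042": ["rack07/disk3", "rack21/disk1", "rack54/disk6"],
}
failed_devices = {"rack07/disk3"}   # pretend this device just crashed

def read_file(name: str) -> str:
    for device in replica_log[name]:          # consult the log of copies
        if device not in failed_devices:
            return f"read {name} from {device}"
    raise IOError(f"all replicas of {name} are unavailable")

print(read_file("crawl/segment-0042"))   # falls through to rack21/disk1
```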

Speed and Then More Speed

Google uses commodity pizza box servers organized in a cluster. A cluster is a group of computers that are joined together to create a more robust system. Instead of using exotic servers with eight or more processors, Google generally uses servers that have two processors similar to those found in a typical home computer.

Through proprietary changes to Linux and other engineering innovations, Google is able to achieve supercomputer performance from components that are cheap and widely available.

… engineers familiar with Google believe that read rates may in some clusters approach 2,000 megabytes a second. When commodity hardware gets better, Google runs faster without paying a premium for that performance gain.

Another key notion of speed at Google concerns writing computer programs to deploy to Google users. Google has developed shortcuts to programming. An example is Google’s library of canned functions, which makes it easy for a programmer to optimize a program to run on the Googleplex computer. At Microsoft or Yahoo, a programmer must write some code or fiddle with code to get different pieces of a program to execute simultaneously using multiple processors. Not at Google. A programmer writes a program, uses a function from a Google bundle of canned routines, and lets the Googleplex handle the details. Google’s programmers are freed from much of the tedium associated with writing software for a distributed, parallel computer.
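Google’s internal library is proprietary (MapReduce is the best-known published example of the style), but the flavor of “call a canned routine and let the infrastructure fan the work out” can be sketched with Python’s standard library. The word-count job here is just a stand-in, not Google’s API:

```python
# A rough stand-in for the "canned routine" style described above
# (this uses Python's standard library, not Google's internal API):
# the programmer writes an ordinary function and hands the fan-out
# across processors to a pre-built helper.
from concurrent.futures import ProcessPoolExecutor
from collections import Counter

def count_words(document: str) -> Counter:
    return Counter(document.split())

documents = ["the quick brown fox", "the lazy dog", "the fox again"]

if __name__ == "__main__":
    with ProcessPoolExecutor() as pool:          # canned parallel map
        partial_counts = pool.map(count_words, documents)
    totals = sum(partial_counts, Counter())      # combine the results
    print(totals.most_common(3))
```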

Eliminate or Reduce Certain System Expenses

Some lucky investors jumped on the Google bandwagon early. Nevertheless, Google was frugal, partly by necessity and partly by design. The focus on frugality influenced many hardware and software engineering decisions at the company.

Drawbacks of the Googleplex

The Laws of Physics: Heat and Power 101

In reality, no one knows. Google has a rapidly expanding number of data centers. The data center near Atlanta, Georgia, is one of the newest deployed. This state-of-the-art facility reflects what Google engineers have learned about heat and power issues in its other data centers. Within the last 12 months, Google has shifted from concentrating its servers at about a dozen data centers, each with 10,000 or more servers, to about 60 data centers, each with fewer machines. The change is a response to the heat and power issues associated with larger concentrations of Google servers.

The most failure prone components are:

  • Fans.
  • IDE drives which fail at the rate of one per 1,000 drives per day.
  • Power supplies which fail at a lower rate.

Leveraging the Googleplex

Google’s technology is one major challenge to Microsoft and Yahoo. So to conclude this cursory and vastly simplified look at Google technology, consider these items:

1. Google is fast anywhere in the world.

2. Google learns. When the heat and power problems at dense data centers surfaced, Google introduced cooling and power conservation innovations to its two dozen data centers.

3. Programmers want to work at Google. “Google has cachet,” said one recent University of Washington graduate.

4. Google’s operating and scaling costs are lower than most other firms offering similar businesses.

5. Google squeezes more work out of programmers and engineers by design.

6. Google does not break down, or at least it has not gone offline since 2000.

7. Google’s Googleplex can deliver desktop-server applications now.

8. Google’s applications install and update without burdening the user with gory details and messy crashes.

9. Google’s patents provide basic technology insight pertinent to Google’s core functionality.


Richard Stallman on proprietary software

From Richard Stallman’s “Transcript of Richard Stallman at the 4th international GPLv3 conference; 23rd August 2006” (FSF Europe: 23 August 2006):

I hope to see all proprietary software wiped out. That’s what I aim for. That would be a world in which our freedom is respected. A proprietary program is a program that is not free. That is to say, a program that does not respect the user’s essential rights. That’s evil. A proprietary program is part of a predatory scheme where people who don’t value their freedom are drawn into giving it up in order to gain some kind of practical convenience. And then once they’re there, it’s harder and harder to get out. Our goal is to rescue people from this.


Richard Stallman on the 4 freedoms

From Richard Stallman’s “Transcript of Richard Stallman at the 4th international GPLv3 conference; 23rd August 2006” (FSF Europe: 23 August 2006):

Specifically, this refers to four essential freedoms, which are the definition of Free Software.

Freedom zero is the freedom to run the program, as you wish, for any purpose.

Freedom one is the freedom to study the source code and then change it so that it does what you wish.

Freedom two is the freedom to help your neighbour, which is the freedom to distribute, including publication, copies of the program to others when you wish.

Freedom three is the freedom to help build your community, which is the freedom to distribute, including publication, your modified versions, when you wish.

These four freedoms make it possible for users to live an upright, ethical life as a member of a community and enable us individually and collectively to have control over what our software does and thus to have control over our computing.


The NSA and threats to privacy

From James Bamford’s “Big Brother Is Listening” (The Atlantic: April 2006):

This legislation, the 1978 Foreign Intelligence Surveillance Act, established the FISA court—made up of eleven judges handpicked by the chief justice of the United States—as a secret part of the federal judiciary. The court’s job is to decide whether to grant warrants requested by the NSA or the FBI to monitor communications of American citizens and legal residents. The law allows the government up to three days after it starts eavesdropping to ask for a warrant; every violation of FISA carries a penalty of up to five years in prison. Between May 18, 1979, when the court opened for business, until the end of 2004, it granted 18,742 NSA and FBI applications; it turned down only four outright.

Such facts worry Jonathan Turley, a George Washington University law professor who worked for the NSA as an intern while in law school in the 1980s. The FISA “courtroom,” hidden away on the top floor of the Justice Department building (because even its location is supposed to be secret), is actually a heavily protected, windowless, bug-proof installation known as a Sensitive Compartmented Information Facility, or SCIF.

It is true that the court has been getting tougher. From 1979 through 2000, it modified only two out of 13,087 warrant requests. But from the start of the Bush administration, in 2001, the number of modifications increased to 179 out of 5,645 requests. Most of those—173—involved what the court terms “substantive modifications.”

Contrary to popular perception, the NSA does not engage in “wiretapping”; it collects signals intelligence, or “sigint.” In contrast to the image we have from movies and television of an FBI agent placing a listening device on a target’s phone line, the NSA intercepts entire streams of electronic communications containing millions of telephone calls and e-mails. It runs the intercepts through very powerful computers that screen them for particular names, telephone numbers, Internet addresses, and trigger words or phrases. Any communications containing flagged information are forwarded by the computer for further analysis.

Names and information on the watch lists are shared with the FBI, the CIA, the Department of Homeland Security, and foreign intelligence services. Once a person’s name is in the files, even if nothing incriminating ever turns up, it will likely remain there forever. There is no way to request removal, because there is no way to confirm that a name is on the list.

In December of 1997, in a small factory outside the southern French city of Toulouse, a salesman got caught in the NSA’s electronic web. Agents working for the NSA’s British partner, the Government Communications Headquarters, learned of a letter of credit, valued at more than $1.1 million, issued by Iran’s defense ministry to the French company Microturbo. According to NSA documents, both the NSA and the GCHQ concluded that Iran was attempting to secretly buy from Microturbo an engine for the embargoed C-802 anti-ship missile. Faxes zapping back and forth between Toulouse and Tehran were intercepted by the GCHQ, which sent them on not just to the NSA but also to the Canadian and Australian sigint agencies, as well as to Britain’s MI6. The NSA then sent the reports on the salesman making the Iranian deal to a number of CIA stations around the world, including those in Paris and Bonn, and to the U.S. Commerce Department and the Customs Service. Probably several hundred people in at least four countries were reading the company’s communications.

Such events are central to the current debate involving the potential harm caused by the NSA’s warrantless domestic eavesdropping operation. Even though the salesman did nothing wrong, his name made its way into the computers and onto the watch lists of intelligence, customs, and other secret and law-enforcement organizations around the world. Maybe nothing will come of it. Maybe the next time he tries to enter the United States or Britain he will be denied, without explanation. Maybe he will be arrested. As the domestic eavesdropping program continues to grow, such uncertainties may plague innocent Americans whose names are being run through the supercomputers even though the NSA has not met the established legal standard for a search warrant. It is only when such citizens are turned down while applying for a job with the federal government—or refused when seeking a Small Business Administration loan, or turned back by British customs agents when flying to London on vacation, or even placed on a “no-fly” list—that they will realize that something is very wrong. But they will never learn why.

General Michael Hayden, director of the NSA from 1999 to 2005 and now principal deputy director of national intelligence, noted in 2002 that during the 1990s, e-communications “surpassed traditional communications. That is the same decade when mobile cell phones increased from 16 million to 741 million—an increase of nearly 50 times. That is the same decade when Internet users went from about 4 million to 361 million—an increase of over 90 times. Half as many land lines were laid in the last six years of the 1990s as in the whole previous history of the world. In that same decade of the 1990s, international telephone traffic went from 38 billion minutes to over 100 billion. This year, the world’s population will spend over 180 billion minutes on the phone in international calls alone.”

Intercepting communications carried by satellite is fairly simple for the NSA. The key conduits are the thirty Intelsat satellites that ring the Earth, 22,300 miles above the equator. Many communications from Europe, Africa, and the Middle East to the eastern half of the United States, for example, are first uplinked to an Intelsat satellite and then downlinked to AT&T’s ground station in Etam, West Virginia. From there, phone calls, e-mails, and other communications travel on to various parts of the country. To listen in on that rich stream of information, the NSA built a listening post fifty miles away, near Sugar Grove, West Virginia. Consisting of a group of very large parabolic dishes, hidden in a heavily forested valley and surrounded by tall hills, the post can easily intercept the millions of calls and messages flowing every hour into the Etam station. On the West Coast, high on the edge of a bluff overlooking the Okanogan River, near Brewster, Washington, is the major commercial downlink for communications to and from Asia and the Pacific. Consisting of forty parabolic dishes, it is reportedly the largest satellite antenna farm in the Western Hemisphere. A hundred miles to the south, collecting every whisper, is the NSA’s western listening post, hidden away on a 324,000-acre Army base in Yakima, Washington. The NSA posts collect the international traffic beamed down from the Intelsat satellites over the Atlantic and Pacific. But each also has a number of dishes that appear to be directed at domestic telecommunications satellites.

Until recently, most international telecommunications flowing into and out of the United States traveled by satellite. But faster, more reliable undersea fiber-optic cables have taken the lead, and the NSA has adapted. The agency taps into the cables that don’t reach our shores by using specially designed submarines, such as the USS Jimmy Carter, to attach a complex “bug” to the cable itself. This is difficult, however, and undersea taps are short-lived because the batteries last only a limited time. The fiber-optic transmission cables that enter the United States from Europe and Asia can be tapped more easily at the landing stations where they come ashore. With the acquiescence of the telecommunications companies, it is possible for the NSA to attach monitoring equipment inside the landing station and then run a buried encrypted fiber-optic “backhaul” line to NSA headquarters at Fort Meade, Maryland, where the river of data can be analyzed by supercomputers in near real time.

Tapping into the fiber-optic network that carries the nation’s Internet communications is even easier, as much of the information transits through just a few “switches” (similar to the satellite downlinks). Among the busiest are MAE East (Metropolitan Area Ethernet), in Vienna, Virginia, and MAE West, in San Jose, California, both owned by Verizon. By accessing the switch, the NSA can see who’s e-mailing with whom over the Internet cables and can copy entire messages. Last September, the Federal Communications Commission further opened the door for the agency. The 1994 Communications Assistance for Law Enforcement Act required telephone companies to rewire their networks to provide the government with secret access. The FCC has now extended the act to cover “any type of broadband Internet access service” and the new Internet phone services—and ordered company officials never to discuss any aspect of the program.

The National Security Agency was born in absolute secrecy. Unlike the CIA, which was created publicly by a congressional act, the NSA was brought to life by a top-secret memorandum signed by President Truman in 1952, consolidating the country’s various military sigint operations into a single agency. Even its name was secret, and only a few members of Congress were informed of its existence—and they received no information about some of its most important activities. Such secrecy has lent itself to abuse.

During the Vietnam War, for instance, the agency was heavily involved in spying on the domestic opposition to the government. Many of the Americans on the watch lists of that era were there solely for having protested against the war. … Even so much as writing about the NSA could land a person a place on a watch list.

For instance, during World War I, the government read and censored thousands of telegrams—the e-mail of the day—sent hourly by telegraph companies. Though the end of the war brought with it a reversion to the Radio Act of 1912, which guaranteed the secrecy of communications, the State and War Departments nevertheless joined together in May of 1919 to create America’s first civilian eavesdropping and code-breaking agency, nicknamed the Black Chamber. By arrangement, messengers visited the telegraph companies each morning and took bundles of hard-copy telegrams to the agency’s offices across town. These copies were returned before the close of business that day.

A similar tale followed the end of World War II. In August of 1945, President Truman ordered an end to censorship. That left the Signal Security Agency (the military successor to the Black Chamber, which was shut down in 1929) without its raw intelligence—the telegrams provided by the telegraph companies. The director of the SSA sought access to cable traffic through a secret arrangement with the heads of the three major telegraph companies. The companies agreed to turn all telegrams over to the SSA, under a plan code-named Operation Shamrock. It ran until the government’s domestic spying programs were publicly revealed, in the mid-1970s.

Frank Church, the Idaho Democrat who led the first probe into the National Security Agency, warned in 1975 that the agency’s capabilities

“could be turned around on the American people, and no American would have any privacy left, such [is] the capability to monitor everything: telephone conversations, telegrams, it doesn’t matter. There would be no place to hide. If this government ever became a tyranny, if a dictator ever took charge in this country, the technological capacity that the intelligence community has given the government could enable it to impose total tyranny, and there would be no way to fight back, because the most careful effort to combine together in resistance to the government, no matter how privately it is done, is within the reach of the government to know. Such is the capacity of this technology.”


George Clinton and the sample troll

From Tim Wu’s “On Copyright’s Authorship Policy” (Internet Archive: 2007):

On May 4, 2001, a one-man corporation named Bridgeport Music, Inc. launched over 500 counts of copyright infringement against more than 800 different artists and labels.1 Bridgeport Music has no employees, and other than copyrights, no reported assets.2 Technically, Bridgeport is a “catalogue company.” Others call it a “sample troll.”

Bridgeport is the owner of valuable copyrights, including many of funk singer George Clinton’s most famous songs – songs which are sampled in a good amount of rap music.3 Bridgeport located every sample of Clinton’s and other copyrights it owned, and sued based on the legal position that any sampling of a sound recording, no matter how minimal or unnoticeable, is still an infringement.

During the course of Bridgeport’s campaign, it has won two important victories. First, the Sixth Circuit, the appellate court for Nashville, adopted Bridgeport’s theory of infringement. In Bridgeport Music, Inc. v. Dimension Films,4 the defendants sampled a single chord from the George Clinton tune “Get Off Your Ass and Jam,” changed the pitch, and looped the sound. Despite the plausible defense that one note is but a de minimis use of the work, the Sixth Circuit ruled for Bridgeport and created a stark rule: any sampling, no matter how minimal or undetectable, is a copyright infringement. Said the court in Bridgeport, “Get a license or do not sample. We do not see this as stifling creativity in any significant way.”5 In 2006 Bridgeport convinced a district court to enjoin the sales of the bestselling Notorious B.I.G. album, Ready to Die, for “illegal sampling.”6 A jury then awarded Bridgeport more than four million dollars in damages.7

The Bridgeport cases have been heavily criticized, and taken as a prime example of copyright’s excesses.8 Yet the deeper problem with the Bridgeport litigation is not necessarily a problem of too much copyright. It can be equally concluded that the ownership of the relevant rights is the root of the problem. George Clinton, the actual composer and recording artist, takes a much different approach to sampling. “When hip-hop came out,” said Clinton in an interview with journalist Rick Karr, “I was glad to hear it, especially when it was our songs – it was a way to get back on the radio.”9 Clinton accepts sampling of his work, and has released a three CD collection of his sounds for just that purpose.10 The problem is that he doesn’t own many of his most important copyrights. Instead, it is Bridgeport, the one-man company, that owns the rights to Clinton’s work. In the 1970s Bridgeport, through its owner Armen Boladian, managed to seize most of George Clinton’s copyrights and many other valuable rights. In at least a few cases, Boladian assigned the copyrights to Bridgeport by writing a contract and then faking Clinton’s signature.11 As Clinton puts it “he just stole ‘em.”12 With the copyrights to Clinton’s songs in the hands of Bridgeport – an entity with no vested interest in the works beyond their sheer economic value – the targeting of sampling is not surprising.

1 Tim Wu, Jay-Z Versus the Sample Troll, Slate Magazine, Nov. 16, 2006, http://www.slate.com/id/2153961/.

2 See Bridgeport Music, Inc.’s corporate entity details, Michigan Department of Labor & Economic Growth, available at http://www.dleg.state.mi.us/bcs_corp/dt_corp.asp?id_nbr=190824&name_entity=BRIDGEPORT%20MUSIC,%20INC (last visited Mar. 18, 2007).

3 See Wu, supra note 1.

4 410 F.3d 792 (6th Cir. 2005).

5 Id. at 801.

6 Jeff Leeds, Judge Freezes Notorious B.I.G. Album, N.Y. Times, Mar. 21, 2006, at E2.

7 Id.

8 See, e.g., Matthew R. Broodin, Comment, Bridgeport Music, Inc. v. Dimension Films: The Death of the Substantial Similarity Test in Digital Sampling Copyright Infringement Claims—The Sixth Circuit’s Flawed Attempt at a Bright Line Rule, 6 Minn. J. L. Sci. & Tech. 825 (2005); Jeffrey F. Kersting, Comment, Singing a Different Tune: Was the Sixth Circuit Justified in Changing the Protection of Sound Recordings in Bridgeport Music, Inc. v. Dimension Films?, 74 U. Cin. L. Rev. 663 (2005) (answering the title question in the negative); John Schietinger, Note, Bridgeport Music, Inc. v. Dimension Films: How the Sixth Circuit Missed a Beat on Digital Music Sampling, 55 DePaul L. Rev. 209 (2005).

9 Interview by Rick Karr with George Clinton, at the 5th Annual Future of Music Policy Summit, Wash. D.C. (Sept. 12, 2005), video clip available at http://www.tvworldwide.com/showclip.cfm?ID=6128&clip=2 [hereinafter Clinton Interview].

10 George Clinton, Sample Some of Disc, Sample Some of D.A.T., Vols. 1-3 (1993-94).

11 Sound Generator, George Clinton awarded Funkadelic master recordings (Jun. 6, 2005), http://www.soundgenerator.com/news/showarticle.cfm?articleid=5555.

12 Clinton Interview, supra note 9.


Microsoft’s programmers, evaluated by an engineer

From John Wharton’s “The Origins of DOS” (Microprocessor Report: 3 October 1994):

In August of 1981, soon after Microsoft had acquired full rights to 86-DOS, Bill Gates visited Santa Clara in an effort to persuade Intel to abandon a joint development project with DRI and endorse MS-DOS instead. It was I – the Intel applications engineer then responsible for iRMX-86 and other 16-bit operating systems – who was assigned the task of performing a technical evaluation of the 86-DOS software. It was I who first informed Gates that the software he just bought was not, in fact, fully compatible with CP/M 2.2. At the time I had the distinct impression that, until then, he’d thought the entire OS had been cloned.

The strong impression I drew 13 years ago was that Microsoft programmers were untrained, undisciplined, and content merely to replicate other people’s ideas, and that they did not seem to appreciate the importance of defining operating systems and user interfaces with an eye to the future.


The life cycle of a botnet client

From Chapter 2: Botnets Overview of Craig A. Schiller’s Botnets: The Killer Web App (Syngress: 2007):

What makes a botnet a botnet? In particular, how do you distinguish a botnet client from just another hacker break-in? First, the clients in a botnet must be able to take actions on the client without the hacker having to log into the client’s operating system (Windows, UNIX, or Mac OS). Second, many clients must be able to act in a coordinated fashion to accomplish a common goal with little or no intervention from the hacker. If a collection of computers meets these criteria, it is a botnet.

The life of a botnet client, or botclient, begins when it has been exploited. A prospective botclient can be exploited via malicious code that a user is tricked into running; attacks against unpatched vulnerabilities; backdoors left by Trojan worms or remote access Trojans; and password guessing and brute force access attempts. In this section we’ll discuss each of these methods of exploiting botnets.

Rallying and Securing the Botnet Client

Although the order in the life cycle may vary, at some point early in the life of a new botnet client it must call home, a process called “rallying.” When rallying, the botnet client initiates contact with the botnet Command and Control (C&C) Server. Currently, most botnets use IRC for Command and Control.

Rallying is the term given for the first time a botnet client logs in to a C&C server. The login may use some form of encryption or authentication to limit the ability of others to eavesdrop on the communications. Some botnets are beginning to encrypt the communicated data.

At this point the new botnet client may request updates. The updates could be updated exploit software, an updated list of C&C server names, IP addresses, and/or channel names. This will assure that the botnet client can be managed and can be recovered should the current C&C server be taken offline.

The next order of business is to secure the new client from removal. The client can request the location of the latest anti-antivirus (Anti-A/V) tool from the C&C server. The newly controlled botclient would download this software and execute it to remove the A/V tool, hide from it, or render it ineffective.

Shutting off the A/V tool may raise suspicions if the user is observant. Some botclients will run a dll that neuters the A/V tool. With an Anti-A/V dll in place the A/V tool may appear to be working normally except that it never detects or reports the files related to the botnet client. It may also change the Hosts file and LMHosts file so that attempts to contact an A/V vendor for updates will not succeed. Using this method, attempts to contact an A/V vendor can be redirected to a site containing malicious code or can yield a “website or server not found” error.
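A simple defensive check for the Hosts-file trick described above is to look for entries that mention security vendors’ domains, since a clean machine normally has none. Here is a minimal sketch (the vendor list and file paths are illustrative assumptions, not an exhaustive detector):

```python
# Defensive sketch: flag Hosts-file entries that redirect well-known
# antivirus update domains, a common symptom of the tampering described
# above. The domain list is illustrative, not exhaustive.
import platform
from pathlib import Path

AV_DOMAINS = ("symantec.com", "mcafee.com", "kaspersky.com", "avast.com")

hosts_path = Path(r"C:\Windows\System32\drivers\etc\hosts"
                  if platform.system() == "Windows" else "/etc/hosts")

for line in hosts_path.read_text(errors="ignore").splitlines():
    entry = line.split("#", 1)[0].strip()        # drop comments
    if entry and any(domain in entry for domain in AV_DOMAINS):
        print(f"suspicious hosts entry: {entry}")
```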

One tool, hidden32.exe, is used to hide applications that have a GUI interface from the user. Its use is simple; the botherder creates a batch file that executes hidden32 with the name of the executable to be hidden as its parameter. Another stealthy tool, HideUserv2, adds an invisible user to the administrator group.

Waiting for Orders and Retrieving the Payload

Once secured, the botnet client will listen to the C&C communications channel.

The botnet client will then request the associated payload. The payload is the term I give the software representing the intended function of this botnet client.


Cheating, security, & theft in virtual worlds and online games

From Federico Biancuzzi’s interview with security researchers Greg Hoglund & Gary McGraw, authors of Exploiting Online Games, in “Real Flaws in Virtual Worlds” (SecurityFocus: 20 December 2007):

The more I dug into online game security, the more interesting things became. There are multiple threads intersecting in our book: hackers who cheat in online games and are not detected can make tons of money selling virtual items in the middle market; the law says next to nothing about cheating in online games, so doing so is really not illegal; the kinds of technological attacks and exploits that hackers are using to cheat in online games are an interesting bellwether; software is evolving to look very much like massively distributed online games look today with thick clients and myriad time and state related security problems. [Emphasis added]

In Brazil, a criminal gang even kidnapped a star MMORPG player in order to take away his character, and its associated virtual wealth.

The really interesting thing about online game security is that the attackers are in most cases after software running on their own machine, not software running on somebody else’s box. That’s a real change. Interestingly, the laws we have developed in computer security don’t have much to say about cheating in a game or hacking software on your own PC.
