The growth in data & the problem of storage

From Technology Review‘s “The Fading Memory of the State“:

Tom Hawk, general manager for enterprise storage at IBM, says that in the next three years, humanity will generate more data–from websites to digital photos and video–than it generated in the previous 1,000 years. … In 1996, companies spent 11 percent of their IT budgets on storage, but that figure will likely double to 22 percent in 2007, according to International Technology Group of Los Altos, CA.

… the Pentagon generates tens of millions of images from personnel files each year; the Clinton White House generated 38 million e-mail messages (and the current Bush White House is expected to generate triple that number); and the 2000 census returns were converted into more than 600 million TIFF-format image files, some 40 terabytes of data. A single patent application can contain a million pages, plus complex files like 3-D models of proteins or CAD drawings of aircraft parts. All told, NARA expects to receive 347 petabytes … of electronic records by 2022.

Currently, the Archives holds only a trivial number of electronic records. Stored on steel racks in NARA’s [National Archives and Records Administration] 11-year-old facility in College Park, the digital collection adds up to just five terabytes. Most of it consists of magnetic tapes of varying ages, many of them holding a mere 200 megabytes apiece–about the size of 10 high-resolution digital photographs.