
Communications of the ACM

News

The Future of Data Storage



Over the decades, computer storage has encompassed a variety of technologies, including punch cards, floppy disks, tape, hard drives, and flash technology. In every instance, the objective is the same: keep data accessible and available for the future. Successive advances in speed and capacity have helped today's sophisticated computing frameworks take shape. However, despite these gains, a simple but sobering fact emerges: "Tape remains a popular and preferred way to back up data," explains Robert Grass, a professor of chemistry at ETH Zurich (the Swiss Federal Institute of Technology) and a leading expert on nanotechnology.

Consider: When a software bug destroyed the email boxes of Gmail users in 2011, Google turned to tape to restore the data. The company spent more than 30 hours painstakingly recreating the accounts. Other companies and government organizations have encountered similar circumstances—and many continue to rely on tape. The reason? Tape remains inexpensive, the data on a tape remains accessible longer than on other media, and tape is remarkably easy to use and manage, while offering security benefits. "It is not an accident that tape remains in use," Grass says.

Engineers continue to eke out further performance and capacity gains from hard drives and flash storage, and researchers are developing next-generation technologies such as DNA storage, crystal etching techniques, and molecular storage that could hold massive amounts of data on a small object for hundreds of thousands of years or longer. Even so, tape continues to march on, and on, and on. In 2017, for example, tape manufacturers shipped more than 1 million petabytes of Linear Tape-Open (LTO), a widely used magnetic tape data storage technology. This is about five times the volume of tape shipped in 2008.

Says Grass, "Tape may seem old-fashioned, and even obsolete, but from a purely economic point of view, it is the most cost-effective and efficient way to store data." As a result, "It isn't going to disappear anytime soon."


Recorded History

The evolution of tape has been nothing less than remarkable. In 1951, computing pioneers John Adam Presper "Pres" Eckert Jr. and John Mauchly introduced the world's first tape storage system: the UNISERVO tape drive for the UNIVAC computer. The device, which relied on 1/2-inch metal tape, was heavy and slow. It recorded data to eight channels at a density of 128 bits per inch. Moving at 100 inches per second, the tape delivered a practical transfer rate of about 7,200 characters per second. In contrast, today's tape devices transfer data at speeds as high as 800 megabytes per second, while hard drives deliver a write speed of about 50 to 120 megabytes per second, and solid-state drives (SSDs) write data at rates of 200 to 550 megabytes per second.
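The figures in this comparison can be checked with a few lines of arithmetic. The sketch below assumes one character is recorded per position along the tape (so the nominal UNISERVO rate is density times speed), uses decimal units, and picks a 1-terabyte payload purely for illustration; none of those assumptions comes from the text, and the device names in the dictionary are just labels.

```python
# UNISERVO figures quoted in the text.
DENSITY_PER_INCH = 128       # recording density along the tape
SPEED_INCHES_PER_S = 100     # tape speed
PRACTICAL_CHARS_PER_S = 7_200

# Assumption: one character per position along the tape, so the
# nominal rate is simply density multiplied by speed.
nominal_chars_per_s = DENSITY_PER_INCH * SPEED_INCHES_PER_S
efficiency = PRACTICAL_CHARS_PER_S / nominal_chars_per_s

MEGABYTE = 10**6   # decimal units (assumption)
TERABYTE = 10**12


def seconds_to_write(total_bytes, rate_mb_per_s):
    """Time to stream total_bytes at a sustained rate given in MB/s."""
    return total_bytes / (rate_mb_per_s * MEGABYTE)


# Write speeds quoted in the text, in megabytes per second.
modern_rates_mb_per_s = {
    "tape (peak)": 800,
    "hard drive (low)": 50,
    "hard drive (high)": 120,
    "SSD (low)": 200,
    "SSD (high)": 550,
}

print(f"UNISERVO nominal rate: {nominal_chars_per_s:,} characters/s")
print(f"Practical rate as a share of nominal: {efficiency:.0%}")
for name, rate in modern_rates_mb_per_s.items():
    minutes = seconds_to_write(TERABYTE, rate) / 60
    print(f"{name:>18}: {minutes:6.1f} minutes per terabyte")
```

These are streaming rates only; the sketch ignores seek and load times, which matter a great deal for tape in practice.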

By the 1970s, tape reels, cartridges, and cassettes had become the de facto way to back up and store data for both personal computers and enterprise systems. The limited capacity of floppy disk drives, and space and cost limitations imposed by early hard disk drives, kept tape at the forefront. Yet, even when disk technologies advanced exponentially and other storage technologies emerged, such as flash storage, the demand for tape didn't subside. "It has remained the standard for some very good reasons," says Reinhard Heckel, an assistant professor in the Department of Electrical and Computer Engineering at Rice University.

There are a number of technical and practical reasons tape has refused to fade into history. First and foremost, the medium is considerably less expensive than disk or flash storage. Part of the reason is that, unlike disks, one tape machine can accommodate an unlimited number of tape drives or cartridges. An analysis conducted by BackupWorks.com indicates that equivalent levels of backup on tape versus disk result in about 4x cost savings for devices.

There are other cost benefits as well. These include a more than 2x savings in operational costs, and upward of 10x savings in total power and cooling costs. Today, a robotic tape library can contain upward of 278 petabytes of data; the same data stored on CDs would require almost 400 million discs.
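The CD comparison is easy to reproduce. This minimal sketch assumes a 700 MB disc and decimal petabytes; neither figure is stated in the article, so treat the disc capacity as an adjustable parameter.

```python
PETABYTE = 10**15                  # decimal bytes (assumption)
CD_CAPACITY_BYTES = 700 * 10**6    # assumed 700 MB per disc, not from the article


def discs_needed(total_bytes, disc_bytes=CD_CAPACITY_BYTES):
    """Number of discs required to hold total_bytes, rounded up."""
    return -(-total_bytes // disc_bytes)   # ceiling division on integers


library_bytes = 278 * PETABYTE     # tape library capacity quoted in the text
cd_count = discs_needed(library_bytes)
print(f"{cd_count:,} CDs")
```

Under those assumptions the count lands just under 400 million discs, in line with the figure in the text.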

Tape also delivers efficiency advantages. All devices, when they write bits to storage, occasionally produce an "unrecoverable bit error," which occurs when the device writes a "1" instead of a "0" or vice versa. Error-correction methods do not make the problem go away completely. The result for a commonly used tape format such as LTO-7/8 is a bit error rate of 1:10^18, which is approximately one error for every 125 petabytes (PB). Other enterprise-class and consumer drives perform at an error rate of about 1:10^16, which translates to an error roughly every 1.25 petabytes. Although error-correction codes for storage technologies have improved over the years, tape is about 100 times more accurate than the best hard drives, and about 10 times better than the latest solid-state drives (SSDs).
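Converting a "1 in 10^n" bit error rate into a data volume is a matter of dividing by eight bits per byte. The sketch below assumes the rate is quoted per bit and that units are decimal (1 PB = 10^15 bytes); the helper names are illustrative, not part of any storage vendor's tooling.

```python
def bytes_per_expected_error(exponent):
    """Bytes processed per one expected unrecoverable bit error,
    for an error rate of 1 bit in 10**exponent bits."""
    return 10**exponent // 8


def human(n_bytes):
    """Format a byte count using decimal units (assumption: 1 KB = 1000 B)."""
    units = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB"]
    value = float(n_bytes)
    for unit in units:
        if value < 1000 or unit == units[-1]:
            return f"{value:g} {unit}"
        value /= 1000


for exponent in (16, 17, 18):
    volume = bytes_per_expected_error(exponent)
    print(f"1 error per 10^{exponent} bits: one error per {human(volume)}")
```

Each additional power of ten in the exponent multiplies the volume of data per expected error by ten, which is why a two-step gap in the exponent corresponds to a hundredfold difference in accuracy.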

"The challenge with any storage technology is to reduce error codes," Heckel says. In a practical sense, this means that tape systems are more dependable than other technology solutions. The greater the unrecoverable bit error rate, the greater the risk of loss of data, along with other errors and problems, including a system seeing two bad drives simultaneously.




Yet, there's still another consideration. Modern tape cartridges fail about five orders of magnitude less often than hard drives, and tapes in storage require no moving mechanical parts.

Finally, tape offers the added appeal of creating an air-gapped environment when it is not in use. This makes a tape library highly secure, as long as it is kept physically protected.


Beyond Tape

Tape is not ideal for every situation. Recovery from a tape backup can be slow and somewhat cumbersome. Finding specific files can prove vexing. If incorrectly stored, tapes can succumb to environmental damage or become demagnetized.

There's also a bigger problem with all current storage technologies. A typical hard drive will operate only about three to five years before failing. Portable disk storage technologies such as CDs and DVDs generally hold data for 10 to 25 years, while flash storage—which includes drives, cards, and SSDs—degrades with use, rather than with age. This means that the more a user writes and rewrites to the device, essentially using it for its intended purpose, the greater the risk of failure. Future improvements in hard drives or flash technology are likely to produce only marginal performance gains, Heckel points out. However, tape, stored under ideal conditions, can last 30 to 50 years—and perhaps even longer. Although none of these technologies can compare to the lifespan of paper stored under ideal conditions (about 500 years), tape emerges as a clear winner.

Yet there's still another long-term challenge: many existing storage technologies are butting up against physical and logical limits. It is increasingly difficult to add speed and capacity through more heads, platters, or microchips. A handful of technologies may help boost the power and scale of hard drives, for example. These include heat-assisted magnetic recording (HAMR) and microwave-assisted magnetic recording (MAMR). Both of these methods allow smaller regions of a disk to be magnetized, resulting in higher capacity. However, these approaches also boost costs, and result in density scaling gains of only about 15%.

Researchers are now exploring other technologies that could one day replace disks, tapes, and flash memory—or at least supplement them for specific uses. Underpinning this is the fact that up to 90% of the data generated by computers and other digital systems is never accessed again; it simply lies idle, consuming ever-growing mountains of storage media or servers.

Likewise, there is the issue of hard drive capacity. "The problem with today's systems is that they deliver no more than one terabyte per square inch," says Karthik V. Raman, a former IBM research scientist who now leads a research team at the Tata Institute of Fundamental Research (TIFR) in India.

The net effect is that current storage technologies, especially disk-based servers and systems, consume massive amounts of physical space when they involve large numbers of devices and media, such as tapes or disks. Yet even tape produces an enormous volume of physical objects. It is estimated that humans produce approximately 2.5 quintillion bytes of data each day and, overall, nearly 3 zettabytes of data exist in the digital world. All these bytes require increasingly large data centers that consume massive amounts of energy, along with other resources.
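To put the two estimates on the same scale, the sketch below converts them to decimal units and asks how long the daily rate would take to add up to the stated total. The constant-rate assumption is purely illustrative and is not a claim made in the article.

```python
EXABYTE = 10**18     # one quintillion bytes
ZETTABYTE = 10**21

DAILY_BYTES = 25 * 10**17      # 2.5 quintillion bytes per day, from the text
TOTAL_BYTES = 3 * ZETTABYTE    # "nearly 3 zettabytes", from the text

daily_exabytes = DAILY_BYTES / EXABYTE

# Assumption: the daily rate is held constant, which real data growth is not.
days_to_total = TOTAL_BYTES / DAILY_BYTES
years_to_total = days_to_total / 365

print(f"Daily production: {daily_exabytes} EB")
print(f"Days to reach the total at that rate: {days_to_total:,.0f}")
print(f"Roughly {years_to_total:.1f} years")
```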




However, Raman points out, "Creating new types of storage with greater capacity doesn't solve the problem by itself. There's a need to develop better ways to direct and redirect data for faster processing." Heckel and Grass, for instance, have focused on using DNA as a data storage mechanism. The idea, first presented by George Church, a molecular biologist and geneticist at Harvard Medical School, involves writing data to DNA material, which could conceivably store that data for hundreds of thousands, or even millions, of years. Church says the purpose of DNA storage "isn't to reinvent the hard drive, it's to introduce a medium that is ideal for archiving and long-term storage."

Others are taking a different tack for keeping data intact for long periods of time. For example, Peter G. Kazansky, a professor at the University of Southampton in the U.K., has developed a method that uses an ultrafast short-pulse laser to etch data into the bulk of silica material. "A single disk using this technology could store 360 terabytes of data, compared to a Blu-ray disk that can store about 45 gigabytes," he says. Moreover, the data would potentially stay on the disk for approximately 14 billion years. The project has caught the eye of Microsoft, which is working to produce a commercially viable version of the technology within a decade (and its Project Silica is focusing on ways to use the technology in the cloud).


Tape Prevails

Remarkably, all storage devices and use cases eventually lead back to tape, at least for the foreseeable future. While tape is not as flexible or convenient as hard drives, SSDs, and other media, it remains cost-effective and highly reliable. What's more, advancements in tape continue to outpace other storage technologies. In 2017, IBM and Sony announced a new magnetic tape system capable of storing 201 gigabits of data per square inch. At that density, a single palm-sized cartridge has a theoretical capacity of 330 terabytes. The world's largest hard drives, on the other hand, require twice the physical space, but hold only 12 terabytes. The most advanced SSDs hold about 60 terabytes.

Many experts say the practical and cost advantages tape has over hard drives and other storage technologies will likely grow over the next several years. Tape won't ever threaten hard drives and SSDs for dominance, but it will remain at the center of storage and provide a strong insurance policy for the likes of Google. Mark Lantz, manager of Advanced Tape Technologies at IBM Research Zurich, noted in an August 2018 IEEE Spectrum article that researchers continue to boost the density and capacity of tape, and the trend will continue for some time. "Tape may be one of the last information technologies to follow a Moore's Law," he wrote.

To be sure, tape remains viable and valuable—and the situation is not likely to change anytime soon. Concludes Grass, "Other emerging technologies will eventually change the way data is stored. But for archival data storage, tape is the technology to beat."

Further Reading

Blawat, M., Gaedke, K., Hütter, I., Chen, X., Turczyk, B., Inverso, S., Pruitt, B.W., and Church, G.M.
Forward Error Correction for DNA Data Storage. Procedia Computer Science, Volume 80, 2016, pp. 1011–1022. https://www.sciencedirect.com/science/article/pii/S1877050916308742?via%3Dihub

Zhang, J., Čerkauskaitė, A., Drevinskas, R., Patel, A., Beresna, M., and Kazansky, P.G.
Current Trends in Multi-Dimensional Optical Data Storage Technology. Asia Communications and Photonics Conference 2016, Wuhan, China, November 2-5, 2016. https://doi.org/10.1364/ACPC.2016.AF1J.4

Zhang, J., Čerkauskaitė, A., Drevinskas, R., Patel, A., Beresna, M., and Kazansky, P.G.
Eternal 5D Data Storage by Ultrafast Laser Writing in Glass. Proc. SPIE 9736, Laser-based Micro- and Nanoprocessing X, 97360U (4 March 2016). https://doi.org/10.1117/12.2220600

Heckel, R., Mikutis, G., and Grass, R.N.
A Characterization of the DNA Data Storage Channel, eprint arXiv:1803.03322. March 2018. https://arxiv.org/abs/1803.03322v1


Author

Samuel Greengard is an author and journalist based in West Linn, OR, USA.


©2019 ACM  0001-0782/19/04

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from [email protected] or fax (212) 869-0481.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2019 ACM, Inc.


 
