Communications of the ACM

Archival Perspectives on the Emerging Digital Library

Although archives are often housed in libraries, the archival and library communities and institutions have long been distinct entities with related, but independent missions, perspectives, and self images. Libraries mainly acquire, preserve, arrange, describe, and make available published information. Archival repositories conduct all these functions, albeit in very different ways, with unpublished and unique materials of either organizational or personal origin having enduring value. Despite many similarities, libraries and archives function as parallel institutions, occasionally drawing on each other's theory and practice, but largely remain independent with their own traditions and literatures.

Digital library collections promise to contain both library and archival materials: published and unpublished, commercially available and institutionally unique, bound and collected materials, all in large quantities. We already see digital libraries such as ibiblio at the University of North Carolina at Chapel Hill soliciting personal materials that users/participants digitize and donate. The Valley of the Shadow Project at the University of Virginia collects personal papers and municipal records documenting life during the Civil War. Certainly much of the material digitized for the Library of Congress American Memory Project, considered to be the U.S.'s National Digital Library, is archival in nature.

Given the hybrid archive-library mixture of content that will be the digital library, it is time to apply relevant aspects of both librarianship and archivy, along with a strong technological base, to the digital library's design and management. Archival theory and practice promise to be particularly important in three areas: what to save and what to digitize; how to save it; and how to provide access to it.

What to save; what to digitize. Appraisal theory and practice, along with the life cycle of records, can facilitate the retention of materials of enduring value. While archivists are known as great savers, in reality they are highly skilled selectors, generally retaining no more than 5% of the original bulk of any collection. Librarians, who primarily deal with published material, have a much easier selection task, as their materials have already undergone the rigors of publication and review.

Archivists deal with massive institutional and personal collections in which they must determine what is of enduring value, both to the creator and to posterity, and remove everything else so researchers can find what is useful over time. Such winnowing must be done while maintaining the context of the collection as a whole through the selection of items and their arrangement and description. Appraisal is the most intellectually challenging and critical aspect of archival work, for saving everything, especially in an era of documentary abundance, means finding nothing. Deaccessioning unique materials, however, means losing them forever. Appraisal strategies are also important in deciding which print materials warrant the expense of digitization.

When various professions, communities, and organizations have not taken an active role in preserving their own history, archivists have developed documentation strategies to ensure that a representative sample of materials from these domains is preserved for the future.1 Sometimes this has involved something as simple as conducting oral histories with community members and soliciting papers, but in other cases it has required mapping an entire profession and taking a strategic approach to record acquisition. Digital libraries that wish to solicit quality participant contributions can well look to the work of archivists in donor relations, acquisitions, appraisal, and document authenticity.

How to save it. Archival theory will be essential in developing models of long-term intellectual preservation of authentic and reliable digital objects for the governmental, commercial, and cultural heritage sectors. Significant work within the archival community, exploring the fundamental nature of evidentiary electronic records, promises insight into how the digital library can ensure the authenticity of its data.2 The InterPARES project, an international collaborative effort, is currently the most important example of work in this area.3 Archivists are also exploring physical preservation of digital objects in light of technological obsolescence and data and media degradation. Proposed approaches include data migration and software emulation.

Archivists are also at the forefront of determining best practices for the digitization process, as many of the digitization projects funded to date have involved archival materials. In the absence of established standards in this field, Kenney and Rieger's landmark work, Moving Theory into Practice: Digital Imaging for Libraries and Archives, discusses the benchmarking process developed at Cornell University Libraries.4

How to provide access to it. Archives have developed effective and efficient techniques to deal with massive quantities of information and information containers that will surely reside in the digital library. It is quite common for record groups in governmental archives to have millions of documents in their domains; many manuscript collections will also contain photos, books, clippings, audio and video files, and more. Collective and hierarchical arrangement and description of materials, based on the provenance of data and reflected in collection-finding aids that preserve the context, structure, and diversity of archival objects, provide access to materials otherwise inaccessible due to their bulk. Where indexing is impossible and where the sheer extent of full-text electronic files might well prohibit effective retrieval, archival arrangement and description provide entry points into collections for researchers. The Encoded Archival Description (EAD) document type definition,5 a standard developed within the archival community, now allows for the presentation of finding aids on the Web that preserves this hierarchy and allows the linking of digital documents to this framework.
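The hierarchical description outlined above can be illustrated with a small sketch. It builds a simplified, EAD-style tree (collection, series, file, item) with Python's standard xml.etree.ElementTree module, links one digital object to the item level, and then recovers that object's context by walking down from the collection. The element names (archdesc, dsc, c, did, unittitle, dao) follow EAD usage, but the fragment is not validated against the EAD DTD; the collection titles, the example.org URL, and the helper functions add_component, build_finding_aid, and context_path are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET


def add_component(parent, level, title, href=None):
    """Append one descriptive component (series, file, item) under parent."""
    component = ET.SubElement(parent, "c", level=level)
    did = ET.SubElement(component, "did")
    ET.SubElement(did, "unittitle").text = title
    if href is not None:
        # Link a digitized object to this level of the hierarchy.
        ET.SubElement(did, "dao", href=href)
    return component


def build_finding_aid():
    """Build a tiny, hypothetical finding aid: collection > series > file > item."""
    archdesc = ET.Element("archdesc", level="collection")
    did = ET.SubElement(archdesc, "did")
    ET.SubElement(did, "unittitle").text = "Example Family Papers"
    dsc = ET.SubElement(archdesc, "dsc")
    series = add_component(dsc, "series", "Correspondence")
    folder = add_component(series, "file", "Letters, 1861")
    add_component(
        folder,
        "item",
        "Letter, 4 July 1861",
        href="https://example.org/objects/letter-0001",  # hypothetical URL
    )
    return archdesc


def context_path(root, href):
    """Return the titles from the collection down to the object linked at href."""

    def walk(node, trail):
        here = trail
        did = node.find("did")
        if did is not None:
            here = trail + [did.findtext("unittitle")]
            dao = did.find("dao")
            if dao is not None and dao.get("href") == href:
                return here
        for child in node:
            if child.tag in ("dsc", "c"):
                found = walk(child, here)
                if found is not None:
                    return found
        return None

    return walk(root, [])


if __name__ == "__main__":
    aid = build_finding_aid()
    print(" > ".join(context_path(aid, "https://example.org/objects/letter-0001")))
```

The point of the sketch is that a single digitized letter is never described in isolation: its meaning is carried by the chain of collection, series, and file in which it sits, which is the context a finding aid preserves.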

These are just a few ways in which archival perspectives can inform the digital library.6 While much material in archives and manuscript repositories remains in paper format, archivists, unlike most other information professionals, are grappling with the challenges of electronic records that will become tomorrow's heritage and its historical and legal evidence. The digital library has much to gain from incorporating archival theory and practice into its vision for the preservation and provision of the world's most important, enduring information.

Helen R. Tibbo is the Frances Carroll McColl Term Professor in the School of Information and Library Science at the University of North Carolina at Chapel Hill.

1 See J. Krizack's Documentation Planning for the U.S. Health Care System. Johns Hopkins Press, Baltimore, MD, 1994.

2 See, for example, and

3 International Project on Permanent Authentic Records in Electronic Systems (InterPARES);

4 Kenney, A. and Rieger, O. Moving Theory into Practice: Digital Imaging for Libraries and Archives. RLG, Mountain View, CA, 2000.

5 EAD is a document type definition of the Standard Generalized Markup Language (SGML).

6 For an extended discussion of the value of the archival perspective for the digital library, see A. Gilliland-Swetland's "Enduring Paradigm, New Opportunities: The Value of the Archival Perspective in the Digital Environment;"

©2001 ACM  0002-0782/01/0500  $5.00

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

The Digital Library is published by the Association for Computing Machinery. Copyright © 2001 ACM, Inc.

