Digital Village: Value-Added Publishing

  1. Introduction
  2. Information Delivery in the Gilded Age of Computing
  3. Feedback, Interactivity, and Support
  4. Author
  5. Sidebar: Potential Metalevel VAP Enhancements

Without question, electronic publishing is one of the hottest topics in computing. Groups worldwide want to know how to do it well, how to advertise it effectively, how to enhance the capabilities of electronic publishing to include emerging multimedia technologies, and, most of all, how to make money at it. In the future it will be increasingly important for successful publishers to add value to publications over and above the original content.

This column outlines some of the fundamental issues connected with the addition of value to electronic publications. Some of these issues have already been translated into products and services, while others have not.

Information Delivery in the Gilded Age of Computing

While the term electronic publishing takes on a variety of different meanings in different settings, one core principle holds true across all domains: electronic publication involves the distribution of digital documents. In its simplest form, electronic publishing may amount to little more than a "porting" of printed information over to the digital networks via scanning, OCR technology, and so forth. Augmented with some very basic accounting software, many publishing sites are in the business of serving up static HTML versions of their publications via the Web. In its more complex forms, however, electronic publishing will redefine itself in the light of available computer and network technologies. Our present goal is to outline the ways in which this redefinition may be achieved.

Although most of the critical technologies needed for electronic publishing have existed for decades, it has only been in the last few years that traditional, in-print publishers have taken it seriously. There were two basic reasons for this delay—one technological and one pragmatic. On the technology side, the primary intended venue for electronic publishing, the World-Wide Web, lacked two essential capabilities. First, it lacked secure HTTP transactions until about 1995. Without secure transactions, selling via credit cards would entail excessive risk due to digital eavesdroppers, packet sniffers, and other network nematodes. At the same time, there were no widespread standards for, and implementations of, electronic billing systems. Means had to be developed to charge in millicent amounts and accumulate charges until they reached cost-effective invoicing limits. These two pieces of technology were in place (in a variety of different forms, in fact) around mid-decade, thereby making the simpler forms of electronic publishing possible.
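
The billing requirement described above (charge in millicent amounts, then invoice only once the total is worth collecting) can be sketched in a few lines. This is a minimal illustration, not a description of any actual billing system: the class name `MicroAccount`, the per-view price, and the $5.00 threshold are all assumptions made for the example.

```python
from dataclasses import dataclass, field

MILLICENTS_PER_CENT = 1000  # one millicent is 1/1000 of a cent


@dataclass
class MicroAccount:
    """Accumulates tiny charges and raises an invoice once billing is worthwhile."""

    invoice_threshold: int  # in millicents; a hypothetical cost-effective cutoff
    balance: int = 0        # unbilled charges, in millicents
    invoices: list = field(default_factory=list)

    def charge(self, amount: int) -> bool:
        """Record a charge in millicents; return True if it triggered an invoice."""
        if amount < 0:
            raise ValueError("charge must be non-negative")
        self.balance += amount
        if self.balance >= self.invoice_threshold:
            self.invoices.append(self.balance)  # bill everything accumulated so far
            self.balance = 0
            return True
        return False


# Hypothetical pricing: a quarter of a cent per page view, invoiced at $5.00.
account = MicroAccount(invoice_threshold=500 * MILLICENTS_PER_CENT)
for _ in range(2000):
    account.charge(250)
print(account.invoices, account.balance)
```

Amounts are kept as integers so that very small charges accumulate without floating-point rounding error.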

On the pragmatic side, no one knew (in fact, it could be argued that it’s still unknown) how to develop a sound business plan for electronic publishing. While it was widely assumed that adding electronic publishing products would irrevocably change the economics of publishing, few felt comfortable in speculating whether this would ultimately be good or bad for the industry. Many publishers jumped on the electronic publishing bandwagon for the worst of reasons: they were afraid of being left out of the future markets. In so doing, they packaged intellectual property basically the same way as Gutenberg did except for the addition of digital delivery mechanisms.

That is the primary cause of the popularity of digitizing anything and everything in print; herd mentality dictates that if you don’t have a good plan, do what everyone else does or thinks they should do.

In any case, all of the essential hooks for electronic publications are now in place. Advanced publishers can solicit, edit, produce, and distribute electronic publications with not so much as a single piece of paper changing hands (not including signed copyright transfer forms and contracts, of course). The Web and the Internet have forever changed the face of publishing. But is this for good or ill?

The biggest misconception about electronic publishing is that its value lies in the ability to disseminate digital information over computer networks in a manner analogous to physical distribution of hard copy. There seems to be a tacit faith in a twisted variation of Metcalfe’s Law (the value of the Internet increases with the square of the number of nodes) to the effect that the value of electronic publishing increases with the square of the number of documents on the Internet. While this sounds good, it’s likely to be false. It is more likely that the value of electronic publishing varies inversely with the square of the number of documents.

This misconception has driven virtually every publisher into some form of electronic commerce. Nowhere is this more obvious than with academic and scholarly publications. Seen as a way of mitigating slumping sales and an annual 5%–10% downturn in subscriptions, electronic offerings are thought to hold out the greatest promise of revenue growth, even if only a modest 5%–10% a year. But this reasoning ignores the fact that the decline of the academic publishing industry is inextricably linked to the overall economy, the widespread perception that there is already too much information available for most personal bandwidths, and the perception that only a small percentage of the information in the typical publication is relevant. Readers are, therefore, "voting" with their pocketbooks by canceling subscriptions. Publishers worldwide are assuming electronic publishing is the silver bullet that will save the day. Some publishers point to the networks’ ability to lower overhead and production costs, to support a wider variety of advertising and marketing venues (for instance, broadcasting, narrowcasting, and personal-casting), and to increase margins by dealing directly with the reader rather than with distributors and middlemen, as signs that electronic publication will provide new opportunities for publishers seeking to turn their fortunes around. In other words, some publishers are working under the assumption that the decline in interest in scholarly and technical publications can be reversed if only those publications could be produced and marketed more cheaply in electronic form. It just won’t work that way; the publications avoided in hardcopy will be avoided in electronic form as well.

If the digitization of things publishable won’t get us far, what will? The payoff in electronic publishing in the future will be the deployment of new technologies for the integration of digital documents into the network fabric of associated ideas, texts, times, and people. Publishers will need to be more than just the providers of digital documents from their digital warehouses. They will also need to connect a document with its contexts. Thus, a digital document could be tightly integrated into the cybersphere of all related documents in a way that traditional publishing cannot permit. Such publishing could provide not just the documents, but their connections to other data sources, as well as other valuable information. This is the essence of "value-added" publishing.

Value-Added Publishing (VAP) is a natural extension of traditional publishing with the additional feature that the publication vehicles and venues accept input from, and react to, additional networked media that have previously been integrated and assimilated. The challenges of VAP are likely to lie in such areas as:

  • Content enhancement
  • The encouragement of synergy between and among information providers, information consumers, and the resources they share
  • The addition of interactivity and feedback loops to traditional delivery systems
  • A reorientation of both the information provider and information consumer toward the "process" of publishing, rather than a focus on the individual products and services
  • Metalevel analyses and intelligent restructuring of document collections
  • Ad hoc document quality ranking and recommending systems

As one can see from this partial list of services, VAP must use a more advanced set of computational and network tools than its early electronic publishing ancestors did. We illustrate these points by expanding on some of the aforementioned categories.

Content enhancement. One convenient way of viewing electronic publishing is as the exchange of information between an information provider and an information consumer via an intervening computing network infrastructure. While the content of a document is central to this exchange, it is not necessarily paramount since its value is utilitarian rather than intrinsic. That is, the value of the content is not independent of the ability of people to read it, view it, use it, reference it, and so forth. From the point of view of information retrieval, information that cannot be found or used is worthless.

Content enhancement involves enriching the semantic and syntactic content of a document. The enhancement of semantic (alt., conceptual, deep) content can be thought of as an attempt to extract more meaning from the documents. A report, summary, extract, abstract, translation, or "gist" by an intelligent agent would be considered a semantic enhancement in this sense, as would results reported by natural language understanding and translation systems, and the automated inclusion of new hyperlinks.

The enhancement of syntactic (alt., grammatical, tag-based) content, on the other hand, would affect the way documents are structured, indexed, taxonomized and linked within the intervening network and computer resources. An example of enhancing syntactic content would be adding structure to documents for the benefit of helper agents, search engines, indexing tools, data mining, and warehousing applications.
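
As a rough illustration of syntactic enhancement, the sketch below assumes each document has already been split into named fields (here `title` and `body`, both hypothetical) and builds a field-aware inverted index that a search or indexing tool could consult. The function names and sample documents are invented for the example.

```python
import re
from collections import defaultdict


def build_field_index(documents):
    """Map each (field, term) pair to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, fields in documents.items():
        for field_name, text in fields.items():
            # Lowercase and split on anything that is not a letter or digit.
            for term in re.findall(r"[a-z0-9]+", text.lower()):
                index[(field_name, term)].add(doc_id)
    return index


def search(index, field_name, term):
    """Return the sorted ids of documents whose given field contains the term."""
    return sorted(index.get((field_name, term.lower()), set()))


# Hypothetical structured documents.
docs = {
    "doc1": {"title": "Value-Added Publishing",
             "body": "Metadata adds value to documents."},
    "doc2": {"title": "Network Billing",
             "body": "Publishing requires billing systems."},
}
index = build_field_index(docs)
print(search(index, "title", "publishing"))
print(search(index, "body", "publishing"))
```

Because the structure is explicit, a query can distinguish a term in a title from the same term in running text, which an unstructured scan of the page could not do.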

Value-added metadata. While content enrichment of electronic publications is the holy grail of VAP, it is at the same time the most difficult strategy to implement. Some problems (complete natural language understanding, for one) are intractable given the current state of the computationalists’ art. Adding value through metadata, while less ambitious, holds out much greater promise in the short term.

Metadata is information "about" an electronic document, resource, or the operation of a computer system. For example, "confidence indicators" might provide useful information about a document or resource. We would expect that knowing an electronic publication had won a Pulitzer Prize would increase the credibility of the author and the value of the document (at least as an object of study), as would favorable reviews by the leading authorities on the subject. The imprimatur of a publication might also be relevant, as some electronic publishers might be known to have higher standards than others.
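
One way to picture such confidence indicators is as a small metadata record kept alongside the document. The sketch below is only an illustration: the record fields, the indicator labels, and the notion of a "trusted imprint" list are assumptions introduced here, not an established scheme.

```python
from dataclasses import dataclass, field


@dataclass
class ConfidenceMetadata:
    """Hypothetical metadata record describing a document, not its content."""

    awards: list = field(default_factory=list)  # e.g. prizes the work has won
    favorable_reviews: int = 0                  # reviews by recognized authorities
    imprint: str = ""                           # the publisher's imprint


def confidence_indicators(meta, trusted_imprints):
    """Derive human-readable confidence indicators from a metadata record."""
    indicators = []
    if meta.awards:
        indicators.append("award-winning")
    if meta.favorable_reviews > 0:
        indicators.append("favorably reviewed")
    if meta.imprint in trusted_imprints:
        indicators.append("trusted imprint")
    return indicators


# Invented sample data for illustration.
meta = ConfidenceMetadata(awards=["Pulitzer Prize"], favorable_reviews=2,
                          imprint="Example Press")
print(confidence_indicators(meta, trusted_imprints={"Example Press"}))
```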

Similarly, recommender systems assign assessments or recommendations to documents and resources that are as reliable as the confidence one has in the recommender system. Helper agents, brokerage systems, flash lists, and so forth also provide metalevel value in their evaluation and recommendation of documents.
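
The point that a recommendation is only as reliable as its recommender can be made concrete with a trust-weighted average. This is a sketch under assumed conventions: scores and trust weights both lie between 0 and 1, and the recommender names are invented.

```python
def weighted_score(recommendations, trust):
    """Combine (recommender, score) pairs, weighting each score by trust.

    Recommenders missing from `trust` get weight 0. Returns None when no
    trusted recommender has rated the document.
    """
    total_weight = sum(trust.get(name, 0.0) for name, _ in recommendations)
    if total_weight == 0:
        return None
    weighted_sum = sum(trust.get(name, 0.0) * score
                       for name, score in recommendations)
    return weighted_sum / total_weight


# Hypothetical trust levels and ratings.
trust = {"editorial_board": 1.0, "anonymous_poll": 0.25}
ratings = [("editorial_board", 0.8), ("anonymous_poll", 0.4)]
print(weighted_score(ratings, trust))  # approximately 0.72
```

The highly trusted source dominates the result, so the combined score sits much closer to 0.8 than to 0.4.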

Revision control systems, which collect metalevel information about various versions of a document, add value by helping create stability and continuity in network documents. In these systems, versions of documents are indexed in such a way that any particular version may be retrieved, with or without its predecessors or ancestors.
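
A toy version of such a system, assuming whole-text snapshots rather than the deltas a real revision control tool would store, might look like the following. The class and method names are hypothetical.

```python
class RevisionStore:
    """Keeps every committed version of each document, numbered from 1."""

    def __init__(self):
        self._versions = {}  # doc_id -> list of texts; list index is version - 1

    def commit(self, doc_id, text):
        """Store a new version and return its version number."""
        history = self._versions.setdefault(doc_id, [])
        history.append(text)
        return len(history)

    def get(self, doc_id, version, with_predecessors=False):
        """Return one version, or that version preceded by all earlier ones."""
        history = self._versions[doc_id]
        if not 1 <= version <= len(history):
            raise KeyError(f"{doc_id} has no version {version}")
        if with_predecessors:
            return history[:version]
        return history[version - 1]


store = RevisionStore()
store.commit("column", "first draft")
store.commit("column", "second draft")
store.commit("column", "published text")
print(store.get("column", 2))
print(store.get("column", 2, with_predecessors=True))
```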

The sidebar illustrates the types of enhancements that might result from the judicious collection and use of metalevel information about electronic offerings.

Feedback, Interactivity, and Support

Content-based and metadata-based value-adding are two of the four strategies for building value in electronic publications. We add to the list two more components: (1) feedback-based value-adding and (2) interactive value-adding. Services of this type collect data from users that reflects their perceptions of their experience. Out of that collective experience might come useful comments, identifications of "hot" documents by some measure of use, average rankings of sites, group interactions, and so forth that will speak to the issue of the perceived value of content.
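
Two of the feedback measures mentioned above, "hot" documents by use and average rankings, reduce to simple aggregation over collected user data. The sketch below assumes an access log that is just a list of document ids and ratings given as (document, stars) pairs; both formats are invented for illustration.

```python
from collections import Counter, defaultdict


def hot_documents(access_log, top_n=3):
    """Return the ids of the most frequently accessed documents."""
    counts = Counter(access_log)
    return [doc_id for doc_id, _ in counts.most_common(top_n)]


def average_ratings(ratings):
    """Average the star ratings given to each document."""
    totals = defaultdict(lambda: [0, 0])  # doc_id -> [sum of stars, count]
    for doc_id, stars in ratings:
        totals[doc_id][0] += stars
        totals[doc_id][1] += 1
    return {doc_id: total / count for doc_id, (total, count) in totals.items()}


# Hypothetical usage data.
access_log = ["a", "b", "a", "c", "a", "b"]
ratings = [("a", 4), ("a", 5), ("b", 3)]
print(hot_documents(access_log, top_n=2))
print(average_ratings(ratings))
```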

To this we must also add support-based value-adding—technologies that may not directly add value to a document, but that support the addition of value by other means. In other words, they are necessary conditions for the deployment of a VAP system. This might include database technologies, statistical and clustering tools, revision control system software, editing tools, information customization clients, and so forth.

Electronic publishing in the next century will be fundamentally different from what it is today. I predict the most successful early applications of VAP will be such things as:

  • Publications with limited commercial appeal
  • Publications with narrow audience appeal
  • Digital digests (i.e., personalized magazines assembled from many sources)
  • Focused retrieval publications (personalized encyclopedias)
  • Home-grown, personal publications
  • Interactive publications
  • Public interest/public awareness publications
  • Reference materials

Electronic publishing will evolve as developers and researchers are inspired to take more extensive advantage of computing and network technology, and slowly but inexorably move away from the notion that the paramount value of a document is its content. Additional enhancements such as those outlined in this column will establish the importance of the role of the digital or cyberspace context of information.

Many of these thoughts have evolved as a result of my serving nearly six years on the ACM Publications Board. By continually revisiting the questions of what we were doing and why we were doing it, this conceptual overview of the future of electronic publications began to take shape. The launch point was my belief (controversial, as it turned out) that ACM should move away from the policy of holding copyrights for its publications (www.acm.org/pubs/copyright_policy/). I remain convinced that trying to fix one version of an electronic publication as definitive and copyrightable will prove as difficult as trying to paint falling leaves. In my view, electronic publications of the future will resemble filmstrips: each frame will incorporate some improvement, alteration, or reference which (in the ideal case) will have more value than its predecessor. In this sense, Ted Nelson’s notion of transpublishing is much like many layers of intersecting filmstrips, each of which has one cell that aligns with the cells of others.

Communications of the ACM (CACM) is now a fully Open Access publication.