
Technical Opinion: Semantic Ambiguity: Babylon, Rosetta or Beyond?


Establishing true understanding has been a challenge to human communication throughout history. Not understanding each other was the divine punishment for the people of Babylon. Unsurprisingly, considerable efforts have been made to overcome this problem. The Rosetta Stone is a very early manifestation of these efforts. In the age of electronic communication, the challenge not only remains, but is multiplied by computer-to-human and computer-to-computer interaction options. With current Semantic Web approaches mainly addressing computer-to-human communication, computer-to-computer interaction in many respects remains a complex and arduous field. The implementation costs of EDI projects, with often significant manual effort needed to disambiguate reference tables for message conversion, illustrate that this particularly applies to electronic business communication.

Is there a remedy for business data equivocality? And if so, is the Rosetta Stone's methodological approach—a static translation table—an appropriate strategy to prevent a business communication Babylon today?

Some will argue that the problem of mismatches and misunderstandings in electronic business transactions will vanish over time. If inter-company information flows were based upon one and the same standard, semantic referencing would not be necessary at all. And if this condition were stable, the problem would be solved. And isn't there evidence for this development? Look at the convergence of ASC X12 and UN/EDIFACT messages and the move towards core components. See how UN/CEFACT adopted XML-based message types. Will standards convergence, in the long run, not supersede semantic referencing?

Without wanting to either discredit or discourage ongoing projects and initiatives—no, it will not. Universal standards convergence is an illusion. We will have to cope with more than one e-business standard permanently, and these standards will keep changing—creating not only syntax management trouble but, remarkably more challenging, a lasting situation of semantic variety, which, in turn, is a sustaining source of potential mismatch and misunderstandings—that is, semantic ambiguity—in electronic business transactions.

Why will semantic ambiguity prevail? I will briefly discuss three major reasons: first, new technologies; second, changing business processes; third, the globalized economy.

Although it might be expected that advances in technology and e-business application development bring with them a unification of standards, the opposite is the case. Before the advent of commercial use of the Internet in the early 1990s, many expected that within a few years EDIFACT would become the general standard for EDI communications. With the Internet, the Web and the Semantic Web, layers as well as scenarios of business communication multiplied, calling for new tools and standards. RosettaNet, ebXML, RDF Schema and OWL, to name a few, all contributed to standardization and created a whole new range of application and process possibilities. But note: they also turned classical, EDIFACT-based EDI into less and less the standard scenario. Entirely novel development areas—today, for instance, creating community-based applications for what is now commonly called the Web 2.0—demand new types of standards, like Microformats.1

The development of new business process scenarios also contributes to the growing number of standards. First, with the rising number of companies participating in inter-organizational process chains, electronic markets and business transactions, once-secluded in-house standards increasingly come into contact with other companies' information structures and content and have to be matched. Furthermore, changing process coordination needs give rise to new standards. Time-critical advanced planning solutions for supply chain management, for instance, call for a tight coupling of shop-floor automation systems with ERP applications, driving, among others, the development of the B2MML language. Web Services and Semantic Web Services further widen the options for business application communication and advanced process designs—but also necessitate standards like WS-BPEL or SWSL. For some emerging business application areas, such as electronic negotiations, electronic tendering or electronic auctioning, only very few standards are available today—many wait to be developed. Finally, with the evolution of the business process paradigm and the growing importance of process integration, the description of business processes has itself become a relevant object of standardization.

Within the globalized economy, business transactions of any company increasingly involve partners from other countries. In some of these countries, specific national or international standards are well established, exposing companies to the need to deal with the particular standards their business partners from abroad use. Additionally, some large vendors still follow the idea that establishing their own norms as an industry standard will grant them a competitive edge in the world market. This results in a number of would-be industry standards for emerging application areas. Even if many of those vanish over time, some winning standards last. Finally, the globalized economy drives the multiplication of standards on the national level. Evolving major players in the world economy—especially China—have discovered that setting a standard may be a powerful means to gain economic standing and influence.2 The fact that countries like China, India and Russia play a growing role in the global economy thus may well have an influence on the number of standards worldwide.

How can we achieve semantic interoperability under these circumstances? Classical approaches are comprehensive and static. Comprehensive, because they try to create either a universal super-standard by merging or, at least, to build reference tables between preferably all standards. Static, because the super-standard or reference tables, once created, remain fixed—like the Rosetta Stone—until their next version is delivered. But such approaches work satisfactorily neither in present nor in future scenarios. Because of the sheer number of existing and evolving standards driven by the forces described above, a comprehensive system is destined to fail. Because of the dynamic nature of these changes, a static scheme will equally fall short.
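To illustrate the limitation in miniature (a sketch with hypothetical table entries; the segment names serve only as examples), a static, Rosetta-style reference table amounts to a fixed dictionary: any term it does not list, such as one introduced by a newer standard version, simply cannot be translated until the next table release.

```python
# A static, Rosetta-style reference table: its content is fixed at release time.
STATIC_TABLE = {
    ("EDIFACT", "NAD"): ("X12", "N1"),   # example: name-and-address segments
    ("EDIFACT", "DTM"): ("X12", "DTM"),  # example: date/time segments
}

def translate(standard, term):
    """Look up a term in the fixed table; unlisted terms cannot be resolved."""
    return STATIC_TABLE.get((standard, term))

print(translate("EDIFACT", "NAD"))  # ('X12', 'N1')
print(translate("EDIFACT", "MOA"))  # None: a term outside the table fails to map
```

Every gap of this kind must wait for a new table version—exactly the static behavior the approaches criticized above share.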

In my view, a successful approach to manage lasting semantic ambiguity will show the following three characteristics: it will be dynamic, evolutionary, and community-based.

A dynamic method will not follow the ideal of one super-standard. It will accept the co-existence of a multitude of standards, with references between them evolving dynamically, and thus allow permanent adaptation to changing standards. An evolutionary approach discards the idea of a complete initial reference base. In a world in which it is uncertain where the next standard will originate, this is a plain necessity. Moreover, if the complete initial building of a knowledge base is not intended, ramp-up costs and thus entry barriers for companies wishing to participate in an electronic exchange are significantly lower, which is particularly important for small and medium-sized companies. A community-based method relies on growing its knowledge base by means of user interaction and feedback. As experiences with automated ontology alignment tools show, domain expert knowledge is still needed to raise matching and mapping quality to an acceptable level.3 Communities can supply a platform for such quality improvement at acceptable cost.
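As a minimal sketch of these three characteristics (all class, method and segment names are illustrative, not taken from any cited project), such a reference store could start empty, accept candidate mappings between arbitrary standards as they appear, and re-rank them through community feedback:

```python
from collections import defaultdict

class ReferenceBase:
    """Evolving store of term mappings between e-business standards.

    Mappings are added incrementally (evolutionary), may link any pair of
    standards rather than a single super-standard (dynamic), and are ranked
    by user feedback instead of being fixed once and for all (community-based).
    """

    def __init__(self):
        # (source_standard, source_term) -> {(target_standard, target_term): score}
        self._mappings = defaultdict(dict)

    def propose(self, src, src_term, dst, dst_term):
        """Register a candidate mapping with a neutral initial score."""
        self._mappings[(src, src_term)].setdefault((dst, dst_term), 0)

    def feedback(self, src, src_term, dst, dst_term, accepted):
        """Community feedback raises or lowers a mapping's score."""
        key = (dst, dst_term)
        current = self._mappings[(src, src_term)].get(key, 0)
        self._mappings[(src, src_term)][key] = current + (1 if accepted else -1)

    def lookup(self, src, src_term, dst):
        """Return candidate target terms for one standard, best-rated first."""
        candidates = [
            (term, score)
            for (std, term), score in self._mappings[(src, src_term)].items()
            if std == dst
        ]
        return sorted(candidates, key=lambda c: -c[1])

base = ReferenceBase()
base.propose("EDIFACT", "NAD", "X12", "N1")  # two candidate mappings for
base.propose("EDIFACT", "NAD", "X12", "N2")  # a name-and-address segment
base.feedback("EDIFACT", "NAD", "X12", "N1", accepted=True)
print(base.lookup("EDIFACT", "NAD", "X12"))  # [('N1', 1), ('N2', 0)]
```

The point of the sketch is the workflow, not the scoring scheme: imperfect references can be proposed cheaply at any time, and the business users who know the semantics best gradually push the better ones to the top.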

A large amount of research and development work still waits to be carried out in this field. The ORBI project is a rare example where methods and application components that support a dynamic, evolutionary, community-based approach are analyzed and developed.4,5 Its general objective is to build tools for an ontologies-based integration of e-business processes and information flows.

Clearly, such an approach has weaknesses, especially in the starting phase. With its knowledge base being small, references supplied by the system might be erroneous and incomplete. But with a growing knowledge base, fast-rising quality improvement can be expected to occur—as is the case with many community-based Web 2.0 projects. In a world of electronic collaboration options evolving swiftly, the approach relies on the millions of experts out there: the business users at their workplace, knowing their specific business and its semantics best.


    1. Allsopp, J. Microformats—Empowering Your Markup for Web 2.0. Apress, Berkeley, CA, 2007.

    2. Suttmeier, R. P., Yao, X., Tan, A. Z. Standards of Power? Technology, Institutions, and Politics in the Development of China's National Standards Strategy. NBR Special Report 10 (June 2006).

    3. Zhdanova, A. V., de Bruijn, J., Zimmermann, K., Scharffe, F. Ontology Alignment Solution v2.0. EU IST Esperonto project deliverable D1.4 V2.0, Dec. 15, 2004; accessed May 1, 2007.

    4. ORBI Project; accessed Aug. 1, 2007.

    5. Rebstock, M., Fengel, J., Paulheim, H. Ontologies-based Business Integration. Springer, Heidelberg/New York, 2008.

