Recent work on ontology development has mainly focused on building domain ontology in interorganizational contexts for the purposes of building a Semantic Web. While it is generally accepted that domain ontology can significantly improve knowledge management practices within organizations, very little evidence is available on the opportunities and challenges that organizations face in building ontologies. Many organizations consider knowledge management the key to sustained competitive advantage; however, they are often unsure how best to define the knowledge unit of interest, and they struggle to store relevant knowledge artifacts efficiently so that retrieval can be fast and relevant. Practice often takes a reactive and incremental approach: groups build reports and other documents and some time later aggregate them into a file repository; the task of tagging documents for effective retrieval is often an afterthought. Even then, the task seems unfinished. Problems relating to knowledge retrieval include superfluous and overwhelming amounts of information, the dynamic nature of information integration, and disparate and fragmented data sources.
This article extracts experiences from a domain ontology development project at Intel Corporation and presents some important implications for researchers and practitioners involved in building organizationally focused ontology development efforts. On the basis of extensive interviews with domain experts and using a hybrid (top-down and bottom-up) modeling approach, we developed a specific ontology as an enabler to integrating knowledge management practices and processes. The project arose out of a need to better utilize past information in performing failure analysis and failure identification (FA/FI) on integrated chips at a large semiconductor manufacturing firm. The initial goal of the project was to make better use of the existing knowledge base on failure causes and analysis tasks to increase throughput. The ontology development was based on workflow reports, which were considered first, and then further refined by archival documents.
A knowledge unit is a coarse set of information elements bound together by structure, assumptions, justifications, and process. These characteristics provide a perspective that typically does not exist with mere information or data elements.
Defining the Knowledge Unit. Consider how this knowledge unit is manifested within the practical context of failure analysis of chip manufacturing in the semiconductor industry. FA/FI initiates when chips in lot samples fail standard quality assurance (QA) testing. Failed chips are sent to various labs for detailed analysis to determine the chemical or structural nature of the failure. A lab may use expensive and sophisticated imaging, such as Transmission Electron Microscopy (TEM), to determine the chemical and/or structural anomalies of a chip sample. From one or more investigations, a root cause of the anomaly is determined that indicates either a design flaw or a production incident.
From the time of the failure to the final determination of root cause, personnel of various organizations and analytical roles generate reports and documentation. Each organization may have its own vocabulary, distinct from other manufacturing process and quality assurance groups. This somewhat autonomous nature among groups may lead to homonym or synonym discrepancies. For example, one group uses the term defect to indicate a condition influencing the failure while another lab only uses the term to mean an anomalous condition specifically relating to an electrical stress failure. In another example, one lab uses the term root cause to indicate the condition influencing the failure, such as the existence of an open void; another group uses the same term to indicate the source of the anomaly, such as an unusual breach in production where an extraneous particle resulted in the anomaly.
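The vocabulary discrepancies described above can be surfaced mechanically once each group's usage is written down. The following is a minimal sketch, not part of the original project: the group names, the `GLOSSARIES` structure, and the wording of the meanings are assumptions made for illustration, paraphrasing the two examples in the text.

```python
from collections import defaultdict

# Hypothetical glossaries: group -> {term: meaning}. The meanings paraphrase
# the examples in the text; the group names are invented for illustration.
GLOSSARIES = {
    "group_a": {"defect": "condition influencing the failure"},
    "lab_b": {"defect": "anomaly tied to an electrical stress failure"},
    "lab_c": {"root cause": "condition influencing the failure"},
    "group_d": {"root cause": "source of the anomaly"},
}


def find_homonyms(glossaries):
    """Return terms that carry more than one meaning across groups."""
    meanings_by_term = defaultdict(set)
    for terms in glossaries.values():
        for term, meaning in terms.items():
            meanings_by_term[term].add(meaning)
    return {t: sorted(m) for t, m in meanings_by_term.items() if len(m) > 1}


def find_synonyms(glossaries):
    """Return meanings that are labeled with more than one term across groups."""
    terms_by_meaning = defaultdict(set)
    for terms in glossaries.values():
        for term, meaning in terms.items():
            terms_by_meaning[meaning].add(term)
    return {m: sorted(t) for m, t in terms_by_meaning.items() if len(t) > 1}


if __name__ == "__main__":
    print("Homonyms:", find_homonyms(GLOSSARIES))
    print("Synonyms:", find_synonyms(GLOSSARIES))
```

In this toy data both "defect" and "root cause" are reported as homonyms, and the two terms are reported as synonyms for "condition influencing the failure". In practice the meanings would have to be normalized by domain experts before such a comparison is meaningful.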
Reports are stored electronically within file clusters for each organization, loosely interconnected by identifiers within them for problem description, wafer, lot numbers, and other identifiers. Reports may include images that clarify the nature of the problem, or these images may be located in separate files. A final report on root cause may be developed from one report or an aggregation of intermediate reports from several analysis labs. The competitive nature of semiconductor manufacturing encourages new process innovations. However, failure analysis and resolution is expensive; the analysis equipment and analysts involved are sophisticated. Redundant activity or an inefficient process places unnecessary cost burdens on manufacturing. Therefore, an appropriate calibration of activity coordination against innovation aggressiveness needs to consider the need for shared vocabulary, access and reuse, speed and relevancy, and definition of the knowledge unit.
Why Ontology Is the Answer. Ontology is the basic structure or armature around which a knowledge base can be built [3, 5]. However, ontology is not a programmatic representation; it addresses domain conceptualizations, free of technical requirements. Ontology is not merely vocabulary, nor is it merely taxonomy (see Figure 1). An ontology is characterized by its mechanism and its content. A simple ontology may include a hierarchy of concepts bound by subsumption relationships. More complex ontologies include axioms that enrich the relationships, concepts, and constraints needed to fully bind the intended interpretation. The knowledge ontology consists of a conceptual model, a thesaurus, and a set of expanded attributes and axioms. Its concern is for the appropriate representation of content, which may later be augmented using specification languages such as UML, RDF, BNF, or formal logic.
The conceptual model represents the metadata encompassing the critical set of concepts and their relationships. The concepts are formed from the control vocabulary. Within our project, we built the control vocabulary from archival workflow data and reports, as well as from semi-structured interviews. A thesaurus complements the conceptual model by documenting the various names and labels attached to the things in the model. Semantic understanding is enhanced by including not only synonyms, but also pseudo-synonyms and acronyms. A correctly constructed knowledge thesaurus improves a user's chances of finding the correct target. Attributes describe the characteristics of the things in the knowledge ontology. The knowledge ontology considers three classes of attributes relating to content, construction, and evolution. Content attributes describe what the thing is, such as size, shape, composition, and analytical characterizations. Construction attributes relate to how the thing came into being: they may include descriptions of the creation method, the owner (organization and/or individual), and the location (or lab). Evolution attributes characterize the dynamic or static considerations, such as version or revision identifier, time, and relevancy (for example, currently applicable or archival). Axioms provide the rules of constraint refining how the things may act or interact.
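The four parts of the knowledge ontology (conceptual model, thesaurus, attributes, and axioms) can be pictured as a small data structure. The sketch below is an illustration under stated assumptions, not the representation used in the project: the class names `Thing` and `KnowledgeOntology`, the example labels, and the example axiom are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """One concept in the conceptual model, with the three attribute classes."""
    name: str
    content: dict = field(default_factory=dict)       # what the thing is
    construction: dict = field(default_factory=dict)  # how it came into being
    evolution: dict = field(default_factory=dict)     # version, time, relevancy


@dataclass
class KnowledgeOntology:
    things: dict = field(default_factory=dict)         # name -> Thing
    relationships: list = field(default_factory=list)  # (subject, relation, object)
    thesaurus: dict = field(default_factory=dict)      # lowercase label -> name
    axioms: list = field(default_factory=list)         # (description, predicate)

    def add_thing(self, thing, labels=()):
        """Register a thing plus any synonyms, pseudo-synonyms, or acronyms."""
        self.things[thing.name] = thing
        self.thesaurus[thing.name.lower()] = thing.name
        for label in labels:
            self.thesaurus[label.lower()] = thing.name

    def resolve(self, label):
        """Map a user-supplied label to its canonical name, or None if unknown."""
        return self.thesaurus.get(label.lower())

    def violations(self):
        """Return the descriptions of all axioms that do not currently hold."""
        return [desc for desc, holds in self.axioms if not holds(self)]


# Hypothetical content for illustration only.
onto = KnowledgeOntology()
onto.add_thing(
    Thing("Report",
          construction={"owner": "TEM lab", "method": "TEM"},
          evolution={"version": "1", "relevancy": "current"}),
    labels=("TEM report", "analysis report"),
)
onto.axioms.append((
    "every thing records an owner",
    lambda o: all("owner" in t.construction for t in o.things.values()),
))
```

Here `onto.resolve("tem report")` maps the alternate label back to `Report`, and `onto.violations()` lists any axiom that the current contents break. A specification language such as RDF could later replace this ad hoc structure without changing the content it captures.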
Defining the Knowledge Lens. A critical step in any knowledge management project rests in characterizing the knowledge unit. Knowledge includes considerations for process, assumptions, and justifications that are only partially subsumed within vocabulary and taxonomy. Additionally, knowledge, unlike data and information, requires a focusing mechanism, a knowledge lens. A knowledge lens synthesizes convergent, legitimate perspectives of the desired knowledge while suppressing the irrelevant. Information and data are relatively flat in the sense that the associative understanding they intend to convey is fairly consistent among disparate contexts. Knowledge, however, is incredibly multidimensional, as well as context and socially dependent. One of the benefits of knowledge ontology is that it clarifies and refines the perspective and intended associative meaning of the specific knowledge unit.
Without the knowledge lens, an individual seeking knowledge sees various types and levels of information, which are often disparate or ambiguous. As the individual focuses with the knowledge lens, the elements start to organize and irrelevant elements are discarded. As the lens is focused further, its inherent structure and multidimensionality allow the knowledge unit to be rotated and realigned to accommodate and understand situations from a new perspective. Thus, the lens allows individuals and organizations to gather, retain, and utilize increasingly complex types of knowledge. In using the ontology, the target knowledge element is framed by a series of presumed concepts and terminology interrelated to provide meaning. The knowledge lens (see Figure 2) is a general solution facilitating knowledge discovery, and eventually knowledge sharing, that enables the filtering and selection of knowledge artifacts from a larger set of possibilities. Using a variety of metadata (outermost circle) and taxonomy (middle circle, which incorporates ontology and can comprise multiple small layers depending on the function or work) helps knowledge workers, through appropriate tool support, to find what they want in a structured and efficient manner.
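One way to read the lens is as a filter that combines metadata constraints with a position in the taxonomy. The sketch below is only an interpretation of that idea: the taxonomy in `PARENT`, the artifact records, and the function names are hypothetical and do not come from the project.

```python
# Hypothetical taxonomy, stored as child -> parent (subsumption) links.
PARENT = {
    "TEM analysis": "Structural analysis",
    "Structural analysis": "Failure analysis",
    "Electrical test": "Failure analysis",
}

# Hypothetical knowledge artifacts with a taxonomy category and metadata.
ARTIFACTS = [
    {"id": "r1", "category": "TEM analysis",
     "metadata": {"lab": "TEM", "status": "current"}},
    {"id": "r2", "category": "Electrical test",
     "metadata": {"lab": "QA", "status": "current"}},
    {"id": "r3", "category": "TEM analysis",
     "metadata": {"lab": "TEM", "status": "archival"}},
]


def is_under(category, ancestor, parent=PARENT):
    """True if `category` equals `ancestor` or is subsumed by it."""
    while category is not None:
        if category == ancestor:
            return True
        category = parent.get(category)
    return False


def apply_lens(artifacts, metadata, category):
    """Keep artifacts that match every metadata constraint and fall under `category`."""
    return [
        a for a in artifacts
        if all(a["metadata"].get(k) == v for k, v in metadata.items())
        and is_under(a["category"], category)
    ]


if __name__ == "__main__":
    focused = apply_lens(ARTIFACTS, {"status": "current"}, "Structural analysis")
    print([a["id"] for a in focused])
```

Tightening the metadata constraints or moving the category deeper into the taxonomy narrows the result set, which is the "focusing" behavior the text describes.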
There are different ways to design a knowledge ontology. Ontological development may be approached from a variety of orientations: inspirational, inductive, deductive, synthetic, and collaborative. Increasingly, hybridization and integration of styles is strongly evidenced in the literature. The actual process of ontological development aligns with basic application development, as can be seen in Table 1, along the major concepts of design, development, integration, validation and feedback, and iteration. We will consider these processes using the actual experience gained from our project.
Design. The process of design includes framing the problem statement, defining the scope, developing success and acceptance criteria, investigating tasks and business area goals, and analyzing use cases. The focus for this effort was the analysis lab where images and textual files formed reports of structural and chemical analysis. TEM provides the highest precision of visual analysis with regard to structure and chemical composition and is capable of identifying the numerous layers in a wafer in addition to differentiating chemical compositions based on their density. TEM analysis reports number in the thousands: to refine the analysis to a workable set, a recently validated process was chosen. The task team addressed problem analysis from the perspective of the TEM analysis lab with regard to the chosen process, and set out to develop ontology to represent this perspective, with the following design criteria as the guiding principles:
Develop. Ontological development identifies and extracts the control vocabulary and, by incorporating relevant relationships, develops conceptual models. These models can be expanded with additional attributes and axioms to characterize the models further and embed rules relating to their use. As our investigation progressed, we found more structure within the images and textual reports than was initially anticipated.
In constructing ontology, one of the main challenges becomes "where to start?" In identifying the main elements for the ontology, we looked at the workflow of reports first. These elements revealed entities, relationships, and attributes to be included within the ontology. We extracted the entities and relationships to develop a graphical, conceptual model. To facilitate understanding, the conceptual model was reduced to a series of sub-models organized around the primary elements extracted from the reports. These sub-models depicted subsumptive and interelement relationships. For example, Figure 3 represents how an anomaly (the characterization of a problem) relates to other elements, such as the wafer it was found in, the failure it was manifested within, the root cause of the anomaly, and the report attached to the anomaly. Further, the anomaly itself consists of structural and material properties, and is of a certain type. The graphical nature of the model benefits a wider range of users in that it is easier and quicker to understand than either textual or computational representation.
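The anomaly sub-model described above can be written down as subject-relation-object triples. This is a minimal sketch for illustration: the element names follow the text, but the relation names (`found_in`, `documented_by`, and so on) are assumptions, since the labels used in Figure 3 are not reproduced here.

```python
# Triples for the anomaly sub-model. Relation names are hypothetical.
TRIPLES = [
    ("Anomaly", "found_in", "Wafer"),
    ("Anomaly", "manifested_within", "Failure"),
    ("Anomaly", "has_root_cause", "RootCause"),
    ("Anomaly", "documented_by", "Report"),
    ("Anomaly", "has_property", "StructuralProperty"),
    ("Anomaly", "has_property", "MaterialProperty"),
    ("Anomaly", "is_of_type", "AnomalyType"),
]


def elements(triples):
    """Return every element that appears as a subject or an object."""
    found = set()
    for subject, _, obj in triples:
        found.add(subject)
        found.add(obj)
    return found


def neighbors(triples, element):
    """Return {relation: [objects]} for the triples whose subject is `element`."""
    related = {}
    for subject, relation, obj in triples:
        if subject == element:
            related.setdefault(relation, []).append(obj)
    return related


if __name__ == "__main__":
    for relation, objects in neighbors(TRIPLES, "Anomaly").items():
        print(relation, "->", ", ".join(objects))
```

A flat triple list like this is easy to review with domain experts and can be split into sub-models simply by grouping triples around each primary element.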
As sub-models were developed, interviews relating to use cases for problem analysis were conducted with various lab managers, analysts, and customers of the TEM lab. Terminology refinement and augmentation was incorporated into model development through a process resulting in the iterative production of multiple versions of the ontology.
Integrate. Ontology integration starts with the integration of each individual's "knowledge lens" or perspective and expands to the organization or interorganization in conformance with the project's scope. Within the project, we addressed the task of intraorganizational integration by developing a formal interview structure, initiating formal interviews, and incorporating new knowledge elements within the ontology model. Revisions were reviewed with participants repeatedly. Additional extensions to the model in the form of attribute expansion and rules will complete the ontology; however, the model was evaluated as sufficiently robust to provide immediate value and opportunity for subsumption within one or more of the available search and browsing tools.
Validation and Feedback. The final model representing the ontology consisted of eight sub-models of moderate complexity. The sub-models were validated qualitatively² among the participating users and among additional users as adequately representative of the TEM focus. The ontology was further validated by applying the models to constructed queries and manually examining how the ontology subsumed each query. Of particular interest were incorporating any necessary, yet missing, elements and ensuring the taxonomy efficiently facilitated the query.
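The manual query check described above amounts to asking which terms of a constructed query the ontology already covers and which are missing. The following sketch shows one way to automate that bookkeeping; it was not part of the project, and the `THESAURUS` entries and the sample query are hypothetical.

```python
# Hypothetical thesaurus: lowercase label -> canonical concept name.
THESAURUS = {
    "anomaly": "Anomaly",
    "defect": "Anomaly",  # assumed synonym, for illustration only
    "wafer": "Wafer",
    "root cause": "RootCause",
    "report": "Report",
}


def coverage(query_terms, thesaurus):
    """Split query terms into those the ontology covers and those it lacks."""
    covered, missing = {}, []
    for term in query_terms:
        canonical = thesaurus.get(term.lower())
        if canonical is None:
            missing.append(term)
        else:
            covered[term] = canonical
    return covered, missing


if __name__ == "__main__":
    covered, missing = coverage(["Defect", "wafer", "lot"], THESAURUS)
    print("Covered:", covered)
    print("Missing (candidates for the next iteration):", missing)
```

Terms that land in `missing` are candidates for the "necessary, yet missing, elements" the validation step looks for; whether they belong in the ontology remains a judgment for the domain experts.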
The ontology was encoded using an off-the-shelf product and evaluated for utility by comparing it with a similar environment without the ontology (it should be noted that the tools were unable to represent anything but the most general level of the taxonomy provided by our ontology). Most participating lab analysts preferred the tool in which more of the ontology was embedded over the simple free text search tool, primarily because it provided more functionality including browse, search, and subcategory creation. In the presence of both browse and search options, most users preferred the search capability to the browse, because they could gain access to the desired documents or files much faster. From initial impressions and qualitative feedback, the process of knowledge discovery was noticeably improved by placing structure on the information.
A final validation of this effort occurred with the ontology champion (who was a manager within the organization) embarking on a presentation circuit with the various groups and locations for which the ontology and ontological process were deemed to have value. This face-to-face interaction allowed the ontology proponent to address specific questions of the individual groups, to solicit support, and to highlight the management value of this approach. In summary, by approaching validation in these three ways we find that, while unfamiliar with the notion of ontology, users can instantly see value in this ontology-enhanced environment.
Iterate process. Iteration should be occurring within most, if not all, of the process steps as interviews and investigations reveal opportunities for improvement and clarity. It should be emphasized that analysis of the validation and feedback step should be expected to produce subsequent opportunities for refinement. Particularly as ontology development expands to an interorganizational project, integration of disparate views and iteration will expand the richness and applicability of the ontology.
Each organization must address certain constraints within its own unique objectives (see Table 2). These practical considerations are realities for which acceptance and integration, not avoidance, must be addressed.
Resource constraints may impact the scope and completion efforts in developing the ontology. As knowledge management practices increase in firms, so do the efforts required for managing knowledge ownership. Excessive security blocks knowledge sharing; inadequate security allows strategic knowledge to leak to competitors. With goal prioritization and coordination, the application of the ontology should be considered with respect to other factors such as conformance to existing tools, tool customization, and evaluation of the inter- and intraorganizational value derived from the ontology. Support and maintenance constraints may affect goal prioritization and coordination. Ontology development should not be considered a static exercise; as the organization expands its knowledge, the ontology should be adjusted. The following observations are derived from the test and evaluation phases of our project.
First, to fully exploit the capability of the search tool, it should be closely integrated with workflow systems. In fact, in our user organization, the refinement of the workflow system development was proceeding in parallel with the current project.
Second, even if the scope of the search tool is constrained within a departmental level of an organizational structure, the plan for the development of an ontology-embedded search tool should be a corporatewide effort. Specifically, think of knowledge sharing among different knowledge islands in advance. For instance, the project champion of this project successfully drew the cooperation and attention from the corporate IT departments, which ensured the success of the project.
Third, user support is a definite necessity. This type of project requires consultants or researchers to become heavily dependent on domain experts of user organizations.
Fourth, existing taxonomy-based information retrieval tools do not have sophisticated access control methods for archived document access. Therefore, the test environment should be carefully examined for the information security needs of the organization.
Last but not least, search tools can help in the initial extraction of control vocabulary from document repositories. However, it is important to delay the implementation of ontology-enabled search tools until the ontology has been fully tested.
Knowledge management success is enhanced when applying a knowledge lens in an ontological manner. We found that the concept of ontology is embraced by non-IS practitioners when the focus of the ontological development emphasizes content, independently of programmatic formalisms. Ontology development is enhanced by starting with a specific knowledge perspective, which we term the knowledge lens. From this knowledge lens, control vocabulary is extracted and, by adopting a top-down and bottom-up perspective, the conceptual model is developed by applying relationships, attributes, and axioms. Knowledge management requires a continuous system of interaction and iteration with the knowledge owners to validate existing knowledge. Such iteration also allows for additional knowledge to be contributed as the knowledge lens becomes more apparent to all participants. The resulting ontology becomes useful as a foundation for interorganizational communication and ontology expansion, and also for training and intraorganizational value.
3. Kim, H.M. and Fox, M.S. Towards a data model for quality management Web services: An ontology of measurement for enterprise modeling. In A. Banks Pidduck, et al., Eds., CAISE 2002, LNCS 2348, (2002), 230–244.
² In our project, the visual representation of the sub-models allowed refinement to occur through a qualitative process, that is, through interaction and discussion. A more formal process, such as a review board, might be necessary under circumstances where conflict among groups could not be resolved through more expeditious procedures.
©2004 ACM 0001-0782/04/1100 $5.00
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.