
Evaluating Modeling Techniques Based on Models of Learning

To compare modeling techniques, combine grammar-based and cognitive-based approaches and test domain understanding.

While there are many techniques for systems analysis, there are few methods or tools for comparatively evaluating them. Here, we address the principles we have been using, suggesting a combined theoretical and empirical approach for assessing technique performance.

Users of modeling techniques (typically systems analysts and software engineers) construct understanding from models using prior experience, so the information represented is not necessarily information understood. Consequently, the usefulness of any technique should be evaluated based on its ability to represent, communicate, and develop understanding of the domain. We therefore advocate using problem-solving tasks requiring model users to reason about the domain as part of their technique evaluation.

A clear definition of requirements is often critical to IS project success. Industry studies have found high failure rates, mainly for the following reasons: lack of user input; incomplete or unclear requirements; and changing requirements [9]. While it may be obvious that developing clear conceptual requirements is important for success, it is not apparent how the task is best accomplished. This problem was noted in [3], which suggested that the most difficult part of building software is “the specification, design, and testing of this conceptual construct, not the labor of representing it and testing the fidelity of the representation. We still make syntax errors, to be sure; but they are fuzz compared to the conceptual errors in most systems.”

The job of communicating about the application domain is aided by systems analysis modeling techniques. Techniques are routinely created for modeling business operations in terms of structure, activities, goals, processes, information, and business rules. Formal modeling techniques are not unique to IS projects. When constructing buildings, for example, scale models and blueprints are common, and movies follow a storyboard and a script. In each case, the models communicate what needs to be accomplished to the project’s stakeholders. Building these conceptualizations can be viewed as a process whereby the people directly responsible reason and communicate about a domain in order to improve their common understanding of it. From this perspective, requirements development can be viewed as a process of accumulating valid information and communicating it clearly to others. This makes the process of requirements development analogous to the process of learning, and it is from this perspective that we approach the evaluation of alternative modeling techniques.

Which technique is right for a given project? Given their importance in IS projects, it is not surprising that many modeling techniques have been proposed [8]. Practitioners must often choose which one(s) to use with little or no comparative information on their performance. At the same time, new techniques continue to be developed. The abundance of techniques (and lack of comparative measures) has produced a need to compare their effectiveness [12]. Our discussion of specific approaches for such comparisons rests on three guiding principles:

  • Anchor the measurement of performance to theory;
  • Address issues related to a particular technique’s ability to represent information, as well as model users’ ability to understand the representation; and
  • View the analysis and modeling of a domain as a process of knowledge construction and learning.

Here, we present a framework for empirical comparisons of modeling techniques in systems analysis based on Mayer’s model of the learning process, as described in [7]. Following this approach, the knowledge acquired by a user of the systems analysis model can be evaluated using a problem-solving task, where the solutions are not directly represented in the model. These tasks measure the understanding the viewer of the model has developed of the domain being represented [6].

Modeling grammars are tools for creating representations of domains. In the context of systems analysis, a modeling grammar is a set of constructs (typically with graphical representations) and rules for combining the constructs into meaningful statements about the domain [11, 12]. Hence, grammars (and their graphic representations) are used in two tasks: creating a model that represents a domain, and obtaining information about the domain by viewing a model “written” with the grammar. We term these two tasks “writing,” or representing, a model and “reading,” or interpreting, a model. In the writing process, a systems analyst reasons about a domain and formalizes this reasoning using the grammar (and the symbols used for representing grammar constructs). In a reading task, the model user (typically a software engineer) creates a mental representation of the domain based on the grammar’s rules, along with the mapping of the symbols to grammar constructs and the information contained in the model. Note that the outcome of writing is explicit, usually in the form of a graphical model. The outcome of reading is implicit, or tacit, as it is hidden in the reader’s mind.
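
To make the distinction concrete, here is a minimal sketch of a toy modeling grammar: two constructs plus one combination rule, with “writing” producing an explicit model object. The construct names (Entity, Relationship) and the rule are our own illustrative assumptions, not a grammar from the literature.

```python
# A minimal, illustrative modeling grammar: a set of constructs plus a rule
# for combining them into statements about a domain.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Entity:            # grammar construct: a thing in the domain
    name: str

@dataclass(frozen=True)
class Relationship:      # grammar construct: an association between things
    name: str
    source: Entity
    target: Entity

@dataclass
class Model:             # the explicit outcome of "writing"
    entities: set = field(default_factory=set)
    relationships: list = field(default_factory=list)

    def add(self, construct):
        """Grammar rule: a relationship may only connect entities
        that are already part of the model."""
        if isinstance(construct, Entity):
            self.entities.add(construct)
        elif isinstance(construct, Relationship):
            if construct.source not in self.entities or construct.target not in self.entities:
                raise ValueError("rule violated: relationship refers to an unknown entity")
            self.relationships.append(construct)
        return self

# "Writing": an analyst formalizes reasoning about a bus-tour domain.
trip, segment = Entity("Trip"), Entity("RouteSegment")
model = Model().add(trip).add(segment).add(Relationship("consists_of", trip, segment))
```

The model object is the explicit artifact; the understanding a “reader” builds from it remains in the reader’s mind, which is why reading can only be assessed indirectly.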


Basis for Comparison

What is the most effective basis for comparing techniques: theory or phenomenon? It is important to consider the objective in any such comparison. If it is simply to find which technique performs better, then it might suffice to set up empirical tests and focus on observations of phenomena related to their use. However, observations alone cannot explain the differences between techniques or what aspects of a technique contribute to its efficacy. Thus, to seek more general principles for the effective design of modeling techniques, software developers need a theory to be able to hypothesize why differences might exist. The focus in early comparisons of modeling techniques was on empirical observations [1, 10]. More recent work has incorporated theoretical considerations as part of the comparison of grammars [2, 4, 5, 11]. Theoretical considerations can be used to both guide empirical work and suggest how to create more effective grammars.




Addressing techniques via their grammars enables a theoretical approach to evaluating their effectiveness. Specifically, grammars can be evaluated by comparing them to a modeling benchmark. Such a benchmark contains a set of basic constructs expressing what should be modeled by the grammar. A benchmark that is a set of standard generic modeling constructs is called a metamodel, or a model of a model. A model of a specific domain is an instantiation of the metamodel. More specifically, a metamodel can be defined as a model describing those aspects of a model that are independent of the particular domain (universe of discourse) being described in a specific case [8]. For example, the Unified Modeling Language (UML) is defined by a metamodel specifying the elements of UML and their relationships; the UML metamodel is itself described using UML constructs.
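
A deliberately simplified sketch of the metamodel/model distinction: the classes below play the role of a metamodel (they state what any model may contain), and a specific domain model is an instantiation of them. The class names are illustrative assumptions, not the UML metamodel.

```python
# Illustrative sketch: a metamodel describes the elements any model may contain;
# a specific domain model is an instantiation of the metamodel.
class ModelElement:              # metamodel level: any named element of a model
    def __init__(self, name):
        self.name = name

class Classifier(ModelElement):  # metamodel level: "a kind of thing" in a domain
    pass

class Association(ModelElement): # metamodel level: a link between classifiers
    def __init__(self, name, ends):
        super().__init__(name)
        self.ends = ends         # the classifiers the association connects

# Model level: a bus-company domain instantiates the metamodel's concepts.
trip = Classifier("Trip")
route_segment = Classifier("RouteSegment")
organized_into = Association("organizedInto", ends=(trip, route_segment))
```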

If the benchmark is based on a set of beliefs of what might exist and happen in the modeled domain, it is called an ontology. The notions of metamodel and ontology are not completely different. A metamodel is usually based on some beliefs about what might exist in the modeled domains. However, such beliefs are often not explicated. An ontology can be general (as in Bunge’s ontological model in [12]) or specific to a domain (such as ontologies constructed for the medical and manufacturing domains).

The ability of a grammar to create models that capture information about the modeled domain is called expressiveness. The expressiveness of a grammar can be evaluated by examining the mapping between the benchmark concepts (based on an ontology or on a metamodel) and the grammar’s constructs. When a generic metamodel is used as a benchmark, the grammar constructs are usually specializations of the metamodel’s concepts [8]. An examination of the mapping can reveal, in principle, deficiencies in the grammar [11]. For example, a benchmark concept with no matching grammar construct means the grammar is incomplete. If more than one benchmark concept maps to the same grammar construct, models created with the grammar might lack clarity.
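
The following is a minimal sketch of how such a mapping analysis might be carried out mechanically, assuming benchmark concepts and grammar constructs can be named as plain strings; the concept and construct names are illustrative, not drawn from any particular ontology or grammar.

```python
# Illustrative sketch: evaluate a grammar's expressiveness by examining the
# mapping from benchmark concepts (ontology or metamodel) to grammar constructs.
from collections import defaultdict

def evaluate_mapping(benchmark_concepts, mapping):
    """mapping: benchmark concept -> grammar construct (or None if unmapped)."""
    # Incompleteness: a benchmark concept with no matching grammar construct.
    unmapped = [c for c in benchmark_concepts if mapping.get(c) is None]

    # Potential lack of clarity: several concepts map to the same construct.
    by_construct = defaultdict(list)
    for concept, construct in mapping.items():
        if construct is not None:
            by_construct[construct].append(concept)
    overloaded = {k: v for k, v in by_construct.items() if len(v) > 1}

    return {"incomplete_for": unmapped, "unclear_constructs": overloaded}

# Hypothetical benchmark and mapping, for illustration only.
benchmark = ["thing", "property", "state", "event"]
mapping = {"thing": "entity", "property": "attribute",
           "state": "attribute", "event": None}
print(evaluate_mapping(benchmark, mapping))
# -> 'event' has no construct (incompleteness); 'attribute' stands for both
#    'property' and 'state' (potential lack of clarity).
```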

Since a benchmark-based evaluation method requires only objective knowledge of the constructs in a modeling grammar, it can be termed a grammar-based approach. A grammar-based approach involves three main steps: utilize a recognized benchmark—an ontology or a metamodel—for evaluating a grammar; establish clear differences among alternative grammars based on the benchmark; and highlight the implications of these differences by generating predictions on the performance of the grammars. The first two might require considerable effort but can be accomplished independently of any consideration of the people involved in the requirements process. In contrast, the third—predicting the effect of a grammar’s expressiveness on the effectiveness of its use by individuals—requires understanding the cognitive effects of models. Such effects are not easily predicted. Even if we know a certain grammar is ontologically expressive, we may have no theoretical line of reasoning to establish that it is also preferable with respect to developing an individual’s personal understanding. If a logical link cannot be established between the features of a grammar and human understanding, the third step can be resolved only through direct empirical observation.

This theoretical analysis evaluates a grammar based on its ability to generate models that are good representations. Grammars can also be compared based on the information conveyed to the viewer of a model. While one modeling grammar may be highly expressive and hence superior in a grammar-based evaluation, a representation created with that grammar might be overly complicated, leading to difficulties in developing domain understanding by the person viewing the model. In short, information represented is not necessarily the same as information understood.

When “reading” a model, the outcome is implicit; it is in the reader’s mind. Thus, the evaluation of a grammar on the basis of its ability to convey information should take into account cognitive-based considerations. Specifically, it should recognize the difference between a representation and the resulting cognitive model developed by the viewer. It follows that grammar evaluations should account for the viewer’s information-processing activity. Since cognitive processes can be evaluated only by observing tasks performed by humans, such evaluations are necessarily empirical.

Summing up our approach to evaluating modeling techniques, we recommend following three steps:

  • Establish whether there are differences in grammars among the techniques;
  • Suggest why these differences might lead to differences in grammar performance; and
  • Perform observations to assess grammar performance.

The first two can be based on theoretical considerations. The third requires cognitive approaches to explain performance differences and establish empirical procedures to measure whether or not differences exist.

Grammar-based and cognitive-based approaches should be considered complementary and not mutually exclusive. Grammar-based approaches can identify differences and generate predictions regarding grammar efficacy. Cognitive-based approaches can suggest ways to observe the effects of grammar differences and test the predictions. A strictly grammar-based approach, with no consideration of actual performance, will not lead to convincing arguments that grammatical differences matter. Similarly, developing cognitive-based approaches without establishing why differences among grammars exist would hinder a modeler’s ability to understand why certain grammars might or might not be advantageous. We therefore propose that grammar-based and cognitive-based approaches be combined to create methods for comparing modeling techniques.


Framework for Comparison

Given these considerations, we use Mayer’s framework of learning [7] for reasoning about empirical comparisons. It assumes that model viewers actively organize and integrate model information with their own previous experience.

We therefore recommend examining three antecedents of knowledge construction: content; presentation; and model viewer characteristics. The content represents the domain information to be communicated. The presentation method is the way content is presented, including grammar, symbols, and/or media. Model viewer characteristics are attributes of the viewer prior to viewing the content. They include knowledge of and experience with the domain and the modeling techniques used to present information. This framework is outlined in the figure here.

The three antecedents influence knowledge construction, a cognitive process, not directly observable, in which the sense-making activity is hypothesized to occur. The results are encoded into the model viewer’s long-term memory, forming the basis for new understanding. Mayer calls this new knowledge the “learning outcome”; it modifies the model viewer’s characteristics, as shown in the figure, and is observed indirectly through learning-performance tasks. Subjects who perform better on these tasks are assumed to have developed a better understanding of the material being presented.

The framework in the figure highlights a set of constructs for researchers to consider in developing empirical comparisons of modeling techniques. For example, recognition of the three antecedents suggests that empirical designs must control some antecedents while studying others. Researchers can therefore focus on differences in content, presentation, and model viewer characteristics when considering comparisons. Differences among grammars are manifested in the “presentation.” Researchers therefore need to control for the amount of content across treatment groups, as well as for individual characteristics (such as prior domain knowledge and knowledge of modeling techniques).
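
The sketch below illustrates, under our own assumptions about variable names and group structure, what such control might look like operationally: presentation is the randomized treatment, content is held constant, and viewer characteristics are measured so they can be checked or used as covariates.

```python
# Illustrative sketch (our own, not from the article): vary the presentation,
# hold the content constant, and record model-viewer characteristics.
import random

PRESENTATIONS = ["er_optional", "er_mandatory_subtypes"]  # treatments under study
CONTENT = "bus_company_case"                              # identical for all groups

def assign_participants(participants, seed=1):
    """Randomly but evenly assign participants to presentation treatments."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    design = []
    for i, p in enumerate(shuffled):
        design.append({
            "participant": p["id"],
            "presentation": PRESENTATIONS[i % len(PRESENTATIONS)],  # studied antecedent
            "content": CONTENT,                                     # controlled antecedent
            "prior_domain_knowledge": p["domain_score"],            # measured characteristic
            "modeling_experience": p["modeling_score"],             # measured characteristic
        })
    return design

# Example use with made-up participants:
people = [{"id": i, "domain_score": 3, "modeling_score": 2} for i in range(8)]
print(assign_participants(people))
```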


Evaluating Model Efficacy

Previous studies (such as [1]) evaluating modeling techniques have employed comprehension tests comprising questions about elements in the (usually graphical) model. Model comprehension is a necessary condition for understanding domain content. However, comprehension by the model viewer does not necessarily imply the viewer understands the domain being represented. Additional processing may be required for understanding [7]. Therefore, it is important to test domain understanding, not just comprehension of elements in the model.

Testing understanding. Mayer described his procedure for assessing learning via understanding in [7]. The related experiments typically include two treatment groups: one provided with a text description and a graphical model (the “model” group), the other with only a text description (the “control” group). After being viewed by participants, the materials are removed; the participants then complete a comprehension task, followed by a problem-solving task.

In the experiments described in [7], the comprehension task included questions regarding attributes of items in the description and their relationships. For example, participants were given information on the braking system of a car. Comprehension questions included: What are the components of a braking system? and What is the function of a brake pad? Participants were then given a problem-solving task with questions requiring them to go beyond the original description, such as: What could be done to make brakes more reliable? and What could be done to reduce the distance needed to stop? To answer, participants had to use the mental models they had developed to go beyond the information provided directly in the model. Performance on such tasks indicated the level of understanding they had developed. Surprisingly, participants who viewed graphical models provided more and better answers, even though the answers were not provided directly in the model.

Empirical comparisons. As summarized in the table here, problem-solving tasks have been used to compare modeling techniques [2, 4, 5]; they revealed significant differences in answers where no significant differences were found in comprehension tasks. For example, [2] compared two versions of entity-relationship diagrams describing how a bus company organizes its tours. One used optional properties; the other used mandatory properties with subtypes. A theoretical (ontology-based) analysis indicated a preference for mandatory properties. No difference between the two modeling alternatives was found in the answers to comprehension questions, including: Can a trip be made up of more than one route segment? and Can the same daily route segment be associated with two different trip numbers?
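
As a rough, hypothetical illustration of the contrast (our own sketch, not the diagrams or schemas used in [2]), the optional-property alternative attaches attributes that may be absent to a single entity, while the mandatory-property alternative moves such attributes into subtypes where they always apply:

```python
# Hypothetical illustration only -- not the actual models from [2].
from dataclasses import dataclass
from typing import Optional

# Alternative 1: optional properties. One entity type; some attributes may be absent.
@dataclass
class TripWithOptionalProperties:
    trip_number: int
    charter_client: Optional[str] = None    # populated only for charter trips
    scheduled_route: Optional[str] = None   # populated only for scheduled trips

# Alternative 2: mandatory properties with subtypes. Each subtype carries only
# attributes that always apply to its instances.
@dataclass
class Trip:
    trip_number: int

@dataclass
class CharterTrip(Trip):
    charter_client: str                     # mandatory for every charter trip

@dataclass
class ScheduledTrip(Trip):
    scheduled_route: str                    # mandatory for every scheduled trip
```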

Bus-route problem solving included such questions as: All the seats on the bus have been taken, yet there is a passenger waiting to board the bus; what could have happened to cause this problem? Researchers observed differences in the answers: participants viewing models with mandatory properties produced significantly more correct answers.

A similar pattern was found in [5], comparing object-oriented (OO) and structured analysis methods, and in [4], comparing UML diagram decompositions. Note that significant differences in problem-solving tasks were observed even though the answers to the questions were not provided directly in the diagrams. Since the materials were taken away before participants were asked the questions, we can conclude that the differences reflect differences in participants’ cognition.
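
The analyses behind such comparisons are standard between-group tests. The sketch below, with made-up scores and a hand-rolled Welch t statistic, only illustrates the shape of the comparison; it does not reproduce the data or statistics reported in [2, 4, 5].

```python
# Illustrative comparison of problem-solving scores from two treatment groups,
# using fabricated example data (not results from the cited studies).
from math import sqrt
from statistics import mean, stdev

mandatory_group = [7, 8, 6, 9, 7, 8, 7, 9]   # hypothetical scores
optional_group = [5, 6, 6, 7, 5, 6, 7, 5]    # hypothetical scores

def welch_t(a, b):
    """Welch's t statistic for two independent samples."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))

print(f"mean difference: {mean(mandatory_group) - mean(optional_group):.2f}")
print(f"Welch t: {welch_t(mandatory_group, optional_group):.2f}")
```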


Conclusion

We have addressed the key principles for empirically evaluating systems analysis modeling techniques, focusing on three main points:

Theoretical grammar-based and empirical cognitive-based approaches should be combined. A combined approach predicts and explains what differences might exist among techniques and tests whether these differences matter.

Information represented is not necessarily information understood. Model “readers” construct personal understanding from models using their prior experience. Modelers should therefore evaluate a modeling technique on its ability to represent, communicate, and promote an understanding of the domain.

Modeling techniques should be evaluated using tasks that require reasoning about the domain. To this end, we advocate using problem-solving tasks. Model users are not simply empty vessels to be filled with model information but knowledge constructors who learn about the domain by viewing models and integrating them with prior experience.


Figures

UF1 Figure. Modeling as knowledge construction.

Back to Top

Tables

UT1 Table. Studies involving problem-solving measures.

References

    1. Batra, D., Hoffer, J., and Bostrom, R. Comparing representations with relational and EER models. Commun. ACM 33, 2 (Feb. 1990), 126–139.

    2. Bodart, F., Patel, A., Sim, M., and Weber, R. Should optional properties be used in conceptual modeling? A theory and three empirical tests. Info. Syst. Res. 12, 4 (Dec. 2001), 384–405.

    3. Brooks, F. The Mythical Man-Month: Essays on Software Engineering, Anniversary Edition. Addison-Wesley, Reading, MA, 1998.

    4. Burton-Jones, A. and Meso, P. How good are these UML diagrams? An empirical test of the Wand and Weber Good Decomposition Model. In Proceedings of the 23rd International Conference on Information Systems 2002, L. Applegate, R. Galliers, and J. DeGross, Eds. (Barcelona, Spain, Dec. 15–18, 2002).

    5. Gemino, A. Empirical Comparison of System Analysis Techniques. Ph.D. Thesis, University of British Columbia, Vancouver, BC, June 1998.

    6. Mayer, R. Multimedia Learning. Cambridge University Press, New York, 2001.

    7. Mayer, R. Models for understanding. Rev. Edu. Res. 59, 1 (spring 1989), 43–64.

    8. Oei, J., van Hemmen, L., Falkenberg, E., and Brinkkemper, S. The Meta Model Hierarchy: A Framework for Information Systems Concepts and Techniques. Tech. Rep. No. 92-17, Department of Informatics, Faculty of Mathematics and Informatics, Katholieke Universiteit, Nijmegen, The Netherlands, July 1992, 1–30.

    9. Standish Group International, Inc. Chaos 1994: Standish Group Report on Information System Development. Yarmouth, MA, 1995; see www.pm2go.com/sample_research/chaos_1994_1.php.

    10. Vessey, I. and Conger, S. Requirements specification: Learning object, process, and data methodologies. Commun. ACM 37, 5 (May 1994), 102–113.

    11. Wand, Y. and Weber, R. Information systems and conceptual modeling: A research agenda. Info. Syst. Res. 13, 4 (Dec. 2002), 203–223.

    12. Wand, Y. and Weber, R. On the ontological expressiveness of information systems analysis and design grammars. J. Info. Syst. 3 (1993), 217–237.
