What UML should be

Be Clear, Clean, Concise

UML needs to focus on its foundations.

The Clear, Clean, Concise (3C) UML2 proposal makes the language easier to understand and enables it to describe a broader range of systems, from Web agents and services to entire business communities. System designers can also use 3C concepts independently to make more sense of UML1, UML2, and any other modeling language.

3C establishes a set of fundamental modeling concepts, stipulates how they apply to systems, and uses them to redefine the plethora of existing UML concepts. 3C applies the method used in geometry, physics, and other sciences in which a few "primitive" terms (such as point and line) permit the definition of an indefinitely large number of other terms (such as parallelogram). A modeling language lets a designer express concepts about a system—some tied directly to system phenomena (such as objects) and others building on top of them (such as abstractions). A model element is a linguistic construct representing some specific system(s) built on these concepts; for example, an OrderNumberIssuer object is a linguistic construct representing part of an e-commerce system.

UML1 defines 144 distinct concepts. The U2P proposal is even more complex. The 3C UML infrastructure would replace this complexity with 15 primitive concepts and allow them to be extended indefinitely. This method maintains perfect consistency with 3C’s core 15: individuals (objects, actions, and associations), abstraction (combination and view), type, specification, role, template, time, place, possibility, and change. They can be used to define such UML concepts as attribute, behavior, classifier, use case, interface, and collaboration, as well as many others (such as process, activity, and data flow).

Here, we explain five of the 3C primitives and introduce two defined concepts—inheritance and subtyping—showing how primitive concepts are used to define additional concepts.

Individuals. Individuals are modeling concepts representing phenomena that can be observed in a system, in the same way a door in a blueprint represents a door in a building. Similarly, a model of a Java object represents that Java object as part of a runtime system. The 3C UML proposal[1] partitions individuals into three fundamental sorts: objects, actions, and associations. UML1 does not make clear the categories of individuals it recognizes, while some UML2 proposals banish individuals from the modeling language altogether. Try to imagine a language in which only generalities are allowed; it would be impossible to verify any of its statements, since verification depends on observations about individuals.

Object. An object represents some system phenomenon the modeler wants to regard as stable over some indefinite time and as occupying a place; for example, people, corporations, automated systems, programs, software objects, database tables, and rows in a table can all be represented by objects.


Action. An action represents a phenomenon that might be observed to happen in a system; for example, the installation of an application server on a hardware node, the instantiation of a Java object from a Java class, the insertion of a row in a table, or the trade of a stock might all be represented by actions. An action represents the event itself, not the record of an event, which is represented by an object.

Types. Types classify individuals, which are then instances of the type; one individual may be of many types, including, say, father, Californian, and programmer.
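The idea that one individual can be an instance of many types can be sketched in Python. This is only an illustration under a stated assumption: a type is treated here as a membership test over individuals. The names Individual, Type, and types_of, and the "facts" used to describe Sam, are hypothetical and are not part of the 3C proposal.

```python
from dataclasses import dataclass, field
from typing import Callable


# Hypothetical names for illustration; not taken from the 3C proposal.
@dataclass(frozen=True)
class Individual:
    """A modeled individual, described by a few observable facts."""
    name: str
    facts: frozenset = field(default_factory=frozenset)


@dataclass(frozen=True)
class Type:
    """A type classifies individuals; assumed here to be a membership test."""
    name: str
    classifies: Callable[[Individual], bool]


def types_of(individual, types):
    """Return the names of every type the individual is an instance of."""
    return [t.name for t in types if t.classifies(individual)]


FATHER = Type("Father", lambda i: "has_child" in i.facts)
CALIFORNIAN = Type("Californian", lambda i: "lives_in_california" in i.facts)
PROGRAMMER = Type("Programmer", lambda i: "writes_programs" in i.facts)
PILOT = Type("Pilot", lambda i: "flies_aircraft" in i.facts)

sam = Individual(
    "Sam",
    frozenset({"has_child", "lives_in_california", "writes_programs"}),
)

# Sam is classified by three of the four types at once.
print(types_of(sam, [FATHER, CALIFORNIAN, PROGRAMMER, PILOT]))
```

Nothing in the sketch forces an individual into a single type, which is the point of the father, Californian, and programmer example.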

Specifications. Specifications are descriptions of the phenomena modeled by elements in terms of UML’s graphical, natural, and formal sublanguages. However, specifications, as a matter of simple logic, cannot be (as the UML1 specification says they are) the same as the elements themselves. In the blueprint, the architect points and says, "This door is solid oak." But this particular specification of the door is not the door in the blueprint, which represents, rather than specifies, the actual door in the building. Instead, a model element can have a specification; for example, a specification for the object type EquilateralTriangle might be "a triangle with three equal sides," but how can EquilateralTriangle be its own specification? The logical inconsistency of this UML1 position is also evidenced in other ways: The same type can have multiple specifications; for example, one cell phone system might specify CellLocation with polar coordinates, while another specifies it with Cartesian coordinates. Each specification applies to exactly the same locations.
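The CellLocation example can be sketched in Python as two different specifications that describe the same location. This is a sketch under stated assumptions: the class names CartesianSpec and PolarSpec, the to_point method, and the same_location helper are hypothetical, and the choice of a flat plane with angles in radians is made only for illustration.

```python
import math
from dataclasses import dataclass


# Hypothetical names for illustration; the article names only CellLocation.
@dataclass(frozen=True)
class CartesianSpec:
    """One specification of a cell location: x and y offsets on a plane."""
    x: float
    y: float

    def to_point(self):
        return (self.x, self.y)


@dataclass(frozen=True)
class PolarSpec:
    """Another specification of a cell location: distance and angle (radians)."""
    r: float
    theta: float

    def to_point(self):
        return (self.r * math.cos(self.theta), self.r * math.sin(self.theta))


def same_location(spec_a, spec_b, tol=1e-9):
    """Two specifications describe the same location if they resolve to the same point."""
    (ax, ay), (bx, by) = spec_a.to_point(), spec_b.to_point()
    return math.isclose(ax, bx, abs_tol=tol) and math.isclose(ay, by, abs_tol=tol)


cartesian = CartesianSpec(3.0, 4.0)
polar = PolarSpec(5.0, math.atan2(4.0, 3.0))

# Different descriptions, one location: neither description *is* the location.
print(same_location(cartesian, polar))
```

The two classes differ in every attribute, yet both apply to the same location, which is why the sketch keeps the specification separate from the thing it specifies.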

Inheritance. Inheritance is a reuse relationship between specifications; for example, a specification of Rectangle can inherit from a specification of Square, since the sideLength attribute of Square can be reused, adding a heightLength attribute to get a Rectangle specification.

Subtyping. Subtyping, on the other hand, is a relationship between types; for example, the type Square is a subtype of the type Rectangle, since every square is a rectangle. Because UML1 and the U2P, 2U, and OPM proposals for UML2 do not distinguish elements from their specifications, they confuse the concepts of inheritance and subtyping. In this example, inheritance holds in one direction (the Rectangle specification inherits from the Square specification) and subtyping in the opposite direction (the type Square is a subtype of the type Rectangle). UML1 and those proposals use only inheritance but either presume or require that subtyping march together with inheritance. This forced correspondence expresses the judgment of the language designer concerning best modeling practices. 3C believes UML should instead permit many different practices for many different purposes.
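The opposite directions of inheritance and subtyping in the Square and Rectangle example can be sketched in Python. This is a minimal illustration under stated assumptions: a specification is reduced to a tuple of attribute names, a type is reduced to a membership test on (width, height) pairs, and the subtype check runs only over a small finite sample. The names Specification, is_square, is_rectangle, and is_subtype are hypothetical; only sideLength and heightLength come from the text.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical sketch; not the 3C proposal's own formalization.
@dataclass(frozen=True)
class Specification:
    """A description made of attribute names, optionally reusing a parent's."""
    name: str
    own_attributes: tuple
    inherits_from: Optional["Specification"] = None

    def attributes(self):
        inherited = self.inherits_from.attributes() if self.inherits_from else ()
        return inherited + self.own_attributes


# Inheritance (reuse between specifications): Rectangle reuses Square's sideLength.
square_spec = Specification("Square", ("sideLength",))
rectangle_spec = Specification("Rectangle", ("heightLength",), inherits_from=square_spec)


# Subtyping (a relationship between types), assumed here to be membership
# tests on shapes given as (width, height) pairs.
def is_rectangle(shape):
    width, height = shape
    return width > 0 and height > 0


def is_square(shape):
    return is_rectangle(shape) and shape[0] == shape[1]


def is_subtype(sub, sup, sample):
    """Check, over a finite sample only, that every instance of sub is an instance of sup."""
    return all(sup(s) for s in sample if sub(s))


shapes = [(1, 1), (2, 3), (4, 4), (5, 2)]

print(rectangle_spec.attributes())                  # Rectangle spec reuses Square spec
print(is_subtype(is_square, is_rectangle, shapes))  # Square is a subtype of Rectangle
print(is_subtype(is_rectangle, is_square, shapes))  # but not the other way around
```

Because the specification hierarchy and the type relationship are kept in separate structures, nothing forces them to run in the same direction.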

Language Abstraction Vs. Modeling

UML1 (as well as the U2P, 2U, and OPM proposals) seems, without being explicit, to make UML an abstraction of today’s class-oriented programming languages, including C++, Java, and C#. However, programming languages and their code cannot be modeled by other languages in the sense that a system can be modeled in a language. Instead, in its most common and best use, UML provides a higher-level language in which details of the program are suppressed. By adding these details and performing a few transformations, source code is produced, an essential capability that must be preserved.

A source code abstraction expressed in UML helps programmers plan their code before they fill in the details. A system model expressed in UML helps designers determine whether the source code will satisfy the system’s requirements.

When UML is used for modeling, both the designer and the modeler must focus on the modeled subject, not on specifications. UML1 specifications, as well as some of their inspirations, such as Jacobson’s Object-Oriented Software Engineering, define model elements as specifications of things in systems, not as representations of them; for example, "A class is a specification of …" and "A use case is a specification of …". If all the UML concepts are some kind of specification, no basic concepts are available from which UML can build its specifications. That is, the UML specification is a collection of definitions of different kinds of specifications, not different kinds of systems phenomena.

Should UML be a language for specifying systems or a language for specifying specifications? If this approach to defining modeling concepts were applied to physics, an atom would not be something found in matter but a kind of specification.

Moreover, to understand a system model, rather than a program abstraction, primitive modeling concepts must be independent of system boundaries, which can be drawn more or less narrowly. UML (even the current version) can be used in a fractal manner at any level of magnification; for example, entire enterprises can be treated as objects; so, too, can the automated systems within an enterprise, as well as the components within a system and the software objects within a component. Therefore, the same thing may be inside a system at one level of focus and outside a smaller system at another level.

Concepts that are relative to system boundaries (such as actor) should be defined in terms of boundary-independent concepts (such as role and object). This insight represents one of the great economies of the 3C proposal: The language learner need not start over for things inside and outside the system or treat things of different size inside the system as having entirely independent natures. For example, in 3C an interface to a programming language object is the same sort of thing as a port of a component.
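Defining a boundary-relative concept in terms of boundary-independent ones can be sketched in Python. This is one possible reading, under stated assumptions: an actor is taken to be any object outside a chosen boundary that plays a role toward an object inside it. The names Obj, RolePlay, and actors, and the Customer, WebShop, and Billing objects, are hypothetical and serve only as illustration.

```python
from dataclasses import dataclass


# Hypothetical sketch; the 3C proposal's own definition of actor may differ.
@dataclass(frozen=True)
class Obj:
    """A boundary-independent object: an enterprise, a system, a component, ..."""
    name: str


@dataclass(frozen=True)
class RolePlay:
    """An object playing a named role in an interaction with another object."""
    player: Obj
    role: str
    counterpart: Obj


def actors(system_parts, role_plays):
    """Derive actors for a chosen boundary: objects outside the boundary that
    play a role toward some object inside it."""
    return {
        rp.player.name
        for rp in role_plays
        if rp.player not in system_parts and rp.counterpart in system_parts
    }


customer = Obj("Customer")
web_shop = Obj("WebShop")
billing = Obj("Billing")

role_plays = [
    RolePlay(customer, "buyer", web_shop),
    RolePlay(web_shop, "invoice requester", billing),
]

# Wide boundary: the shop and its billing component form one system.
print(actors({web_shop, billing}, role_plays))
# Narrow boundary: billing alone is the system, so the shop is now outside it.
print(actors({billing}, role_plays))
```

The objects and role plays never change between the two calls; only the boundary does, and the set of actors follows from it.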

Consider the following example showing how 3C concepts can be applied to understanding UML1. Many systems analysts find standard use-case doctrine confusing; for example, only some of the actions comprising a use case are represented by other use cases included in this first use case. Determining why is difficult because use-case inclusion is an inheritance relationship between use-case specifications, not a relationship between the action of the use case as a whole and the actions making up the use case. This relationship of parts to wholes between the actions (called steps) in a use case and the action of the use case as a whole is an example of a 3C "abstraction by combination." Each step in the use case is part of the use-case action, though not every step has an independent specification the use-case writer might wish to reuse. This 3C understanding of what is going on with use cases is not formal doctrine and would be vigorously attacked by most use-case gurus. But we have surely found it easier to explain and learn, and it actually helps the learner produce use-case models regarded by experts as correct.
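The distinction between the steps that make up a use-case action and the subset of steps that have a reusable specification can be sketched in Python. This is an informal illustration, not use-case doctrine: the Action class, the included_use_cases helper, and the Place Order example with its step names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


# Hypothetical sketch of "abstraction by combination" applied to a use case.
@dataclass
class Action:
    """An action; a composite action is a combination of its steps."""
    name: str
    steps: list = field(default_factory=list)
    # Name of an independently reusable specification, if the writer gave one.
    reusable_spec: Optional[str] = None


def included_use_cases(use_case):
    """Steps that would appear as included use cases: only those with a reusable spec."""
    return [s.reusable_spec for s in use_case.steps if s.reusable_spec]


place_order = Action("Place Order", steps=[
    Action("Identify customer", reusable_spec="Log In"),
    Action("Choose items"),
    Action("Pay", reusable_spec="Process Payment"),
    Action("Confirm order"),
])

# Every step is part of the whole action ...
print([s.name for s in place_order.steps])
# ... but only some steps correspond to included use cases.
print(included_use_cases(place_order))
```

All four steps are parts of the Place Order action, while only two of them carry a specification worth reusing, which matches the observation that inclusion covers only some of the steps.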

Conclusion

Redefining the UML specification using the stable infrastructure provided by 3C will make understanding UML concepts easier for software designers and business modelers, facilitating clearer communication among them. This foundation of primitive terms will also make UML easier to learn, use, and automate, while extending its shelf life. It will be more adaptable to programming languages not yet designed and modeling styles not yet invented. It will also be applicable to the entire range of problems confronting the systems engineer, from business process design to network topology planning.

UML has been a boon to software development. UML1 works; UML2 will work even better. As with many good programs, the trick to UML’s utility is that nobody reads the specs, just the result. But at a certain level of complexity, such a trick becomes an obstacle. UML was created by incorporating the concepts of many interests. Further growth, without first clarifying foundations, will lead UML toward the obsolescence of scope-bloated systems not factored into components via a robust logical model.

[1] 3C UML is based on the ISO Reference Model of Open Distributed Processing, model theory, scientific methodology, and natural language semantics (see www.cuml.org/references).

Communications of the ACM (CACM) is now a fully Open Access publication.