What UML should be

Evolution, Not Revolution

Any consideration of altering UML must account for its current user base and its potential role as the keystone of a new model-based method of software development.

Since its introduction by the Object Management Group in 1997, the Unified Modeling Language has been adopted by both practitioners and researchers at a rate exceeding even OMG’s most optimistic predictions. This popularity has confirmed the genuine need for such a standard to serve as a communication medium for both humans and software tools. However, the same forces have also created strong pressure to improve the effectiveness of UML and provide it with a multitude of new features.

To specify what the next major release of UML should look like, we first need a clear understanding of its purpose. UML was conceived as a general-purpose language for modeling object-oriented software applications. The reason we construct models of systems, whether of software or of buildings, is the same: to understand the suitability of a proposed solution before going to the trouble and expense of actually constructing it. As the complexity of software programs increases, challenging our comprehension, the need for modeling software becomes more apparent and more compelling.

To serve its purpose, a modeling language has to allow concise expression of the essential aspects of the software system being designed while omitting irrelevant detail. The basic constructs provided by the language must map closely to the problem domain. In contrast, programming languages, including those classified as "high level," such as C++ and Java, tend to be much closer to the underlying computing technology. In a sense, the distinction between modeling and programming languages is similar to the distinction between assembly languages and high-level programming languages; it represents a further abstraction step away from the underlying implementation technology. This suggests the possibility of "programming" our applications using modeling languages instead of programming languages.

Using modeling languages to specify software is a primary motivation for OMG’s Model Driven Architecture (MDA) initiative, a plan for a series of industry standards based on the premise that, instead of programs, the primary artifacts of software development will be models created with modeling languages, including UML. With suitable technology, the corresponding programs can be generated automatically from the models, either fully or in large part. When fully realized, MDA will have a revolutionary effect on the way software is developed, comparable in scope to the revolution in productivity and reliability resulting from the introduction of the compiler.
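As a rough illustration of generating programs from models, the following minimal sketch treats a small in-memory model as the primary artifact and derives a skeletal Java class from it. The `Attribute` and `ModelClass` types and the `generate_java` function are hypothetical names invented for this example; they are not part of UML, MDA, or any OMG specification, and a real generator would work from a far richer model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Attribute:
    name: str
    type_name: str  # name of a target-language type, e.g. "int" (assumption)


@dataclass
class ModelClass:
    name: str
    attributes: List[Attribute] = field(default_factory=list)


def generate_java(model_class: ModelClass) -> str:
    """Emit a skeletal Java class declaration for one model element."""
    lines = [f"public class {model_class.name} {{"]
    for attr in model_class.attributes:
        lines.append(f"    private {attr.type_name} {attr.name};")
    lines.append("}")
    return "\n".join(lines)


# The model, not the generated text, is what the developer edits.
account = ModelClass(
    "Account",
    [Attribute("balance", "int"), Attribute("owner", "String")],
)
print(generate_java(account))
```

Even in this toy form, changing the model and regenerating replaces hand-editing the program text, which is the shift in primary artifact that MDA proposes.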

Any practical consideration of key capabilities must account for the reality that UML, in its current form, is widely deployed and supported by numerous commercial and other tools. Hence, it is our view that progress is best served by building on this foundation, evolving UML rather than replacing it with something different. This evolution must occur in a number of different areas.

Precise definition. The original UML specifications, culminating with UML1, were based on input from a relatively large team of experts from different companies and with diverse backgrounds. It was inevitable that such diversity would result in ambiguities and semantic overlaps, which have since prompted many questions about the specific meaning of certain UML concepts. A primary requirement for the future UML2 specification is to eliminate these ambiguities and overlaps by providing a precise definition of the semantics of each UML concept, including, where applicable, its dynamic (runtime) semantics. Increased precision is crucial for supporting the MDA objectives, especially construction of executable models and the ability to automatically generate programs from these models.

Consolidated semantic foundation. The effort to create a more precise definition of UML modeling concepts is much easier if the semantic foundations used to define them are themselves precise and clear. In the case of UML2, this foundation will be represented by the UML infrastructure, which consists of elementary UML abstractions, including Class, Association, and Instance. These core abstractions should then be combined and extended in various ways to produce the more complex modeling concepts of the full-fledged UML. The entire infrastructure must be concise and properly factored to eliminate semantic overlap and ambiguity. Yet it must also be expressive enough to serve as a base for UML, as well as for other modeling languages.

Support for multiple standard languages. The MDA initiative envisions a spectrum of different modeling languages for different purposes, including UML and the Common Warehouse Metamodel (CWM). All of them are to be defined using the standard OMG Meta-Object Facility (MOF), a specialized modeling language for defining languages. Since the UML infrastructure already incorporates all the necessary constructs for defining other modeling languages, it is convenient to base the MOF on a subset of the infrastructure. This requires restructuring UML such that the MOF subset is clearly delineated.

Family of languages. A central tenet of UML is that there should be a single unified general-purpose modeling language consolidating the essential concepts of the object paradigm. On the other hand, different application domains require different domain-specific extensions of common concepts. As experience with the PL/I language demonstrates, cramming all useful concepts from diverse domains into a single language is not always practical. Hence, for languages, including UML, intended to cover a range of application domains, it is crucial to avoid the infamous "language bloat" syndrome while retaining all the advantages of a standard.

The current version of UML (1.4) offers a partial solution to this dilemma. It is a relatively compact language that can be specialized further, directly by users and standards bodies, whether for individual modeling or for technology domains. Such dialects of the general standard are called profiles; they typically add domain-specific specializations of the standard UML concepts while preserving the semantics of the general standard. This capability means that tools supporting the general standard can be used to manipulate models based on profiles. It also means that practitioners with knowledge of the general standard can use that knowledge even when working with profiles.
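The property that tools built for the general standard keep working on profile-based models can be sketched in a few lines. This is only an illustration under simplifying assumptions: `Stereotype`, `Element`, and `names_of_kind` are hypothetical names, and the "entity" stereotype is an invented example rather than part of any published profile.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stereotype:
    name: str     # domain-specific label, e.g. "entity" (invented example)
    extends: str  # the standard concept it specializes, e.g. "Class"


@dataclass
class Element:
    kind: str  # a standard concept such as "Class"
    name: str
    stereotype: Optional[Stereotype] = None

    def __post_init__(self) -> None:
        # A stereotype may only specialize the concept it was declared for,
        # so a stereotyped element remains a valid instance of that concept.
        if self.stereotype is not None and self.stereotype.extends != self.kind:
            raise ValueError(
                f"stereotype {self.stereotype.name!r} cannot be applied to {self.kind}"
            )


def names_of_kind(model: List[Element], kind: str) -> List[str]:
    """A 'generic tool' query: it inspects only standard concepts."""
    return [element.name for element in model if element.kind == kind]


entity = Stereotype("entity", extends="Class")
model = [
    Element("Class", "Customer", stereotype=entity),  # profile-specific
    Element("Class", "Logger"),                       # plain standard element
]
print(names_of_kind(model, "Class"))
```

The query never looks at the stereotype, yet it handles the profiled element correctly, because the profile only narrows a standard concept and does not redefine it.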

In addition, to protect users of UML who do not need a full set of its features, the language must be partitioned into a set of separate and crisply defined compliance points. This partitioning will enable both users and tool builders to consider only the subset of the standard that interests them.


Feature improvements. Modern software systems can be extremely complex. Among current users of UML, one major complaint is that some of its features do not scale to industrial-size problems. This deficiency requires some adjustment to the existing modeling features; we single out the following two:

Modeling the structure of component-based software systems. This feature is especially important for modeling software architectures and component technologies, including CORBA components, EJB, and .NET. It includes the ability to recursively decompose objects into structures of interconnected finer-grain objects, to define the rules (protocols) governing interactions between such parts, and to define multiple access points (ports) on objects.
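The three abilities named here (recursive decomposition, protocols, and ports) can be sketched together as a small data structure. This is a hedged illustration, not the notation UML2 will adopt: `Protocol`, `Part`, `connect`, and `depth` are hypothetical names, and a protocol is reduced to a named set of permitted messages.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Protocol:
    name: str
    messages: FrozenSet[str]  # messages permitted across a connection


@dataclass
class Part:
    name: str
    ports: Dict[str, Protocol] = field(default_factory=dict)  # access points
    parts: List["Part"] = field(default_factory=list)         # finer-grain objects
    connectors: List[Tuple[str, str, str, str]] = field(default_factory=list)

    def connect(self, a: "Part", port_a: str, b: "Part", port_b: str) -> None:
        """Join ports of two direct sub-parts that share a protocol."""
        if not any(p is a for p in self.parts) or not any(p is b for p in self.parts):
            raise ValueError("connectors may join direct sub-parts only")
        if a.ports[port_a] != b.ports[port_b]:
            raise ValueError("ports are governed by different protocols")
        self.connectors.append((a.name, port_a, b.name, port_b))

    def depth(self) -> int:
        """Number of decomposition levels, counting this part as one."""
        return 1 + max((p.depth() for p in self.parts), default=0)


billing = Protocol("Billing", frozenset({"charge", "refund"}))
ui = Part("ui", ports={"out": billing})
core = Part("core", ports={"in": billing})
system = Part("system", parts=[ui, core])
system.connect(ui, "out", core, "in")
print(system.connectors, system.depth())
```

Because a `Part` contains `Part`s, the same structure describes a system, its subsystems, and their components, which is what makes this style of decomposition scale.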

Extended modeling of complex behavior. This feature includes the ability to hierarchically compose and combine individual behavior specifications; it also means removing some of the current constraints on activity graphs stemming from their definition in the context of state machines.
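Hierarchical composition of behavior specifications can be pictured as behaviors that contain other behaviors. The sketch below is one possible reading, using invented names (`Behavior`, `Action`, `Sequence`, `Repeat`); it does not reproduce UML state machines or activity graphs, and it records executed action names in a list purely so the effect of composition is visible.

```python
from dataclasses import dataclass
from typing import List


class Behavior:
    """Base type for behavior specifications in this sketch (hypothetical)."""

    def run(self, trace: List[str]) -> None:
        raise NotImplementedError


@dataclass
class Action(Behavior):
    name: str

    def run(self, trace: List[str]) -> None:
        # A leaf behavior: record that it executed.
        trace.append(self.name)


@dataclass
class Sequence(Behavior):
    steps: List[Behavior]

    def run(self, trace: List[str]) -> None:
        # A composite behavior: run each contained behavior in order.
        for step in self.steps:
            step.run(trace)


@dataclass
class Repeat(Behavior):
    body: Behavior
    times: int

    def run(self, trace: List[str]) -> None:
        # A composite behavior: run the contained behavior a fixed number of times.
        for _ in range(self.times):
            self.body.run(trace)


# "login" is specified once and then reused inside a larger behavior.
login = Sequence([Action("prompt"), Action("verify")])
session = Sequence([login, Repeat(Action("serve"), 2), Action("logout")])

trace: List[str] = []
session.run(trace)
print(trace)
```

Each composite is itself a `Behavior`, so specifications can be nested to any depth and reused as building blocks.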

Although these additions will undoubtedly increase the girth of UML somewhat, they are necessary for coping with real-world problems. As real-world complexity grows, successful technologies inevitably become more sophisticated in response to increasingly sophisticated requirements.

Formal graphical syntax and diagram interchange. UML is often referred to as a "visual" language, even though UML1 lacks a formal graphical syntax. However, such syntax is required for the interchange of graphical information captured in diagrams used with UML models. This need can be addressed through a standardized metamodel of UML diagrams and an expanded XML Metadata Interchange (XMI) format enabling the interchange of diagrams.

Migration for UML users. As noted earlier, there is already a significant community of UML users and tool vendors. The UML2 specification must therefore, at a minimum, explicitly define the means for upgrading (automatically where possible) models based on prior versions of the standard. Moreover, modelers who prefer not to use the new features should be able to ignore them.

Conclusion

Revisions of popular languages are necessarily accomplished through gradual and prudent change—evolution, not revolution. UML2 must consolidate UML’s existing features rather than add a multitude of new and changed ones. This consolidation must yield greater precision and better modularization, both crucial to achieving the objectives of MDA.
