UML2 Must Enable a Family of Languages

UML2 needs to take advantage of new infrastructure to enable first-class specialization and variation of the UML superstructure language.

Ever since the Object Management Group standardized the UML Profile RFP in 1997, DSTC and I have argued for including first-class extension mechanisms in UML. Now that the RFP for UML2 mandates a shared core infrastructure with OMG’s Meta-Object Facility (MOF2) and a first-class extension mechanism, we see even less reason to restrict modelers to UML’s existing primitive tagging mechanisms (profiles) for model extension. A new UML that fulfills this requirement could facilitate a powerful new toolset implementing UML’s superstructure (including Classes, Use Cases, and Activities) yet be flexible enough to cope with adaptations (such as OMG’s Common Warehouse Metamodel and Enterprise Distributed Object Computing standards), as well as future models reusing and extending UML.

UML1 took most of the best ideas from the graphical OO modeling languages of the early 1990s, along with some earlier abstractions, including state machines, and sought to make them better defined and more coherent. At the time of its standardization in 1997, UML’s power derived from the ability of its class modeling language to describe high-level concepts in terms of classes of objects and their properties and relationships while directly modeling programming language artifacts in OO languages, including Smalltalk and C++.

The ability to model classes in OO programming languages is still useful with next-generation OO programming languages, including Java and C#. But the UML primitives of Class, Attribute, Operation, and Association are increasingly augmented by other concepts not native to UML, including Containers, Factories, Transactional Boundaries, Assemblies, and Publications and Subscriptions. For example, in a discussion among attendees at a recent meeting of OMG’s Analysis and Design Task Force, several mechanisms were proposed for modeling Enterprise JavaBeans (EJB) and Common Object Request Broker Architecture (CORBA) components in UML; each had its own strengths and weaknesses, but all required innovative uses of UML that didn’t quite match the semantics defined by UML for the model elements being used.

Although UML2, which could be adopted by OMG as early as 2003, will probably incorporate primitives for the current generation of software engineering concepts, it will be several years too late for the modelers of recent paradigms of application development, including application servers, loosely coupled messaging, and Web services. By the time UML products actually support the new standard, there will doubtless be new trends in software function and development that mismatch, to various degrees, the modeling concepts in the new version of UML.

UML’s other major strength is as a high-level concept modeling language or, to call it what I use it for, a metalanguage. In spite of general misunderstanding of the relationship between UML and OMG’s unifying modeling framework, the MOF, the UML kernel is isomorphic in its expressive capabilities to the MOF [1]. Indeed, the UML2 infrastructure (kernel) and MOF2 (Core) RFPs explicitly call for the unification of these two specifications. As a metalanguage, the UML kernel allows purpose-built modeling languages to cleanly express concepts needed to model new computing paradigms.

Various recent OMG RFPs have called for the reuse of UML to model application implementation styles even newer than OO programming. The people submitting proposals to these RFPs exhaustively attempted to reuse UML and, in spite of enormous political pressure for reuse, resorted to using MOF to create their own languages. For example, the OMG data warehousing standard—the Common Warehouse Metamodel—found it necessary to create a class model parallel to the one defined by UML. The OMG members standardizing the Enterprise Application Integration (EAI) "UML profile," a language for modeling message-based application integration, created their own metamodel that does not reuse UML. Meanwhile, the Enterprise Distributed Object Computing (EDOC) "UML profile" created a metamodel of the concepts representing high-level designs of component-based and event- and process-driven applications, then shoehorned the language into the most structurally interconnected, or permissive, part of the UML metamodel in order to satisfy a notional requirement for UML tool reusability. However, based on the diagrammatic-form output from mainstream UML tools, I doubt anyone will ever use the profile form of the EDOC model.

Of the first four OMG RFPs requesting a UML profile, only the one for CORBA resulted in an adopted standard using the UML extension mechanisms. It models the CORBA Interface Definition Language, an OO interface language closely matched to UML Classes. However, a number of small mismatches, some simply in terminology, combined with the inadequate profile mechanism available in UML1, make the profile definition inelegant and its use impractical without specialized tool support.

On the other hand, requiring system designers to start from scratch every time they want to define a language for modeling design instances for a novel system means much wasted effort, even when the metalanguage is sufficiently expressive. However, some concepts are common to most modeling languages:

Naming. Named elements and namespaces;

Structure. Containers, ports, connectors, and relationships;

Reuse. Inheritance, renaming, overriding, and composition; and

Refinement. Relationships for transformation, reification, and life cycle.

The best expression of new modeling constructs reuses such common elements, along with data types, patterns, and component libraries. This is no easy task, as noted in the recent literature on reusing OO code. Especially difficult is defining hierarchies of classes with enough generality to be reusable, without making them so general they do not capture the specific needs of application modelers at the most derived level. In software engineering, designing for reuse typically requires more effort than project timelines allow; two or more reuses might be needed before an optimal form is identified. Given that design of the UML1 metamodel was generally not concerned with reuse of abstract structure, it seems the only way to achieve a better result in UML2 is to use the existing metamodels—CWM, EAI, and EDOC—as test cases for UML2 metamodel reusability.

The activity graph model in UML1 provides an object lesson in reuse mismatching. Activity graphs superficially resemble state machines in that each is a directed graph able to nest other graphs in its nodes. But the decision by the UML1 specification submitters to make activity graphs a specialization of state machines imposed severe constraints on activity graphs, which were not readily understood by users of the activity graph notation or, it seems, by tool vendors trying to implement it. As a result, almost all published activity graphs with more than a trivial number of activity states are not well formed.

In the years since UML1 standardization, a number of changes have been made by the various OMG UML revision task forces to the metaclasses describing state machines; however, they are not intended for use by the state chart model but for alleviating constraints imposed on their subtypes in the activity graph model. Having to change the state chart model to facilitate different semantics in the derived activity graph model should have set off alarm bells with OO modelers about the meaning of specialization relationships. Unfortunately, the changes have not alleviated the constraints or provided the activity graph model the flexibility it needs.

Given this hindsight, a more sensible approach for UML2 is to specify a relatively unconstrained common (abstract) metamodel for graphs on which state machines and activity graphs can place their own constraints.
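
One way to picture such a shared abstract graph metamodel is the minimal Python sketch below. The class names (Graph, StateMachine, ActivityGraph) and the specific well-formedness rules are illustrative assumptions, not taken from the UML specification. The point is only that the base class imposes almost nothing, and each sibling language adds its own constraints rather than inheriting the other's.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str = "node"  # hypothetical kinds: "initial", "state", "action", "fork"


@dataclass
class Edge:
    source: str
    target: str


@dataclass
class Graph:
    """Relatively unconstrained base: edges must only reference known nodes."""
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, name, kind="node"):
        self.nodes[name] = Node(name, kind)

    def add_edge(self, source, target):
        self.edges.append(Edge(source, target))

    def violations(self):
        return [f"edge {e.source}->{e.target} references an unknown node"
                for e in self.edges
                if e.source not in self.nodes or e.target not in self.nodes]


class StateMachine(Graph):
    """Adds its own (assumed) rule: exactly one initial state."""

    def violations(self):
        problems = super().violations()
        initials = [n for n in self.nodes.values() if n.kind == "initial"]
        if len(initials) != 1:
            problems.append(
                f"expected exactly one initial state, found {len(initials)}")
        return problems


class ActivityGraph(Graph):
    """Sibling of StateMachine, not a subtype.

    Assumed rule: every fork needs at least two outgoing edges.
    """

    def violations(self):
        problems = super().violations()
        for n in self.nodes.values():
            if n.kind == "fork":
                out = sum(1 for e in self.edges if e.source == n.name)
                if out < 2:
                    problems.append(
                        f"fork {n.name} has {out} outgoing edge(s), needs at least 2")
        return problems


activity = ActivityGraph()
activity.add_node("split", "fork")
activity.add_node("ship", "action")
activity.add_node("bill", "action")
activity.add_edge("split", "ship")
activity.add_edge("split", "bill")
print(activity.violations())  # the state machine's initial-state rule does not apply
```

Because ActivityGraph derives from Graph rather than from StateMachine, changing a state machine rule in this sketch cannot disturb activity graphs, which is the property the UML1 specialization lacked.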

Choosing Metamodels

Some application developers view OMG’s Model Driven Architecture (MDA) as an exercise in modeling in UML, marking the models with concepts from platforms and transforming the model elements into appropriate platform code, stubs, and deployment descriptors. I view it as an exercise in choosing appropriate standard or custom metamodels that abstract details away from platform specifications; software architects can then model platform-independent application architectures.

Metamodels will be built on the forthcoming UML2 infrastructure and include (but will not be limited to) the standard UML2 superstructure. Platform experts will define transformations from classes of these platform-independent models to produce more platform-specific models. Such transformations will have to provide placeholders for the details missing in the abstraction.
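
A transformation of this kind might look like the following Python sketch. Everything in it is hypothetical: the PlatformIndependentClass and EjbEntityBean shapes, the to_ejb function, the type mapping, and the placeholder convention are assumptions made for illustration, not part of MDA or any OMG standard. It shows a platform-independent element being mapped to a more platform-specific one, with explicit placeholders for the details the abstraction left out.

```python
from dataclasses import dataclass

PLACEHOLDER = "<TODO: supply platform detail>"  # hypothetical marker


@dataclass
class PlatformIndependentClass:
    name: str
    attributes: dict  # attribute name -> abstract type name


@dataclass
class EjbEntityBean:
    """Hypothetical platform-specific element; its slots are illustrative only."""
    bean_name: str
    field_types: dict
    jndi_name: str = PLACEHOLDER
    transaction_policy: str = PLACEHOLDER


# Assumed mapping from abstract type names to Java type names.
TYPE_MAP = {"String": "java.lang.String", "Integer": "int"}


def to_ejb(pim: PlatformIndependentClass) -> EjbEntityBean:
    """Map a platform-independent class to a platform-specific description."""
    field_types = {attr: TYPE_MAP.get(abstract_type, PLACEHOLDER)
                   for attr, abstract_type in pim.attributes.items()}
    return EjbEntityBean(bean_name=pim.name + "Bean", field_types=field_types)


def missing_details(psm: EjbEntityBean) -> list:
    """List every slot that still holds a placeholder."""
    slots = {"jndi_name": psm.jndi_name,
             "transaction_policy": psm.transaction_policy}
    slots.update({f"field_types.{k}": v for k, v in psm.field_types.items()})
    return sorted(k for k, v in slots.items() if v == PLACEHOLDER)


customer = PlatformIndependentClass(
    "Customer", {"name": "String", "balance": "Money"})
bean = to_ejb(customer)
print(missing_details(bean))
```

Keeping the placeholders explicit, rather than guessing defaults, makes it easy to report exactly which platform decisions remain open after the transformation.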

Once UML2 is available, I expect new application architecture metamodels will be standardized by specializing UML where appropriate and extending the UML infrastructure as needed. Likewise, all the platform-specific models will be instances of standard platform metamodels (such as those defined for EJB and message-oriented middleware in OMG’s EDOC and EAI standards).

The next generation of UML tools will support the first-class metamodel extension mechanisms mandated by the UML2 infrastructure RFP. They will provide built-in libraries of common graphical display elements (such as container boxes, name/value slots, ports, connectors, and expression fields) and graphical realization models describing how the models should be rendered. They will allow metamodelers to add abstract syntaxes for new languages using the core Class and Association constructs and specify (possibly multiple) concrete syntaxes using graphical realization models, as per the built-in UML2 superstructure. Given the known behavior of library base classes (such as Containers, Namespaces, Ports, and Connectors) and their default graphical depictions, new languages based on these common UML2-based abstract metaclasses will also have GUIs.
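
The idea that a new language inherits a usable depiction from library base classes can be sketched in Python as follows. The class names echo the base classes mentioned above, but the render methods, the text-based output, and the MessageQueue and Channel metaclasses are purely hypothetical stand-ins for a real graphical realization model.

```python
class NamedElement:
    def __init__(self, name):
        self.name = name

    def render(self):
        # Default depiction: just the name.
        return self.name


class Container(NamedElement):
    def __init__(self, name):
        super().__init__(name)
        self.contents = []

    def add(self, element):
        self.contents.append(element)
        return element

    def render(self):
        # Default depiction: a box listing the rendered contents.
        inner = ", ".join(e.render() for e in self.contents)
        return f"[{self.name}: {inner}]"


class Connector(NamedElement):
    def __init__(self, name, source, target):
        super().__init__(name)
        self.source, self.target = source, target

    def render(self):
        # Default depiction: a labeled arrow between two named elements.
        return f"{self.source.name} --{self.name}--> {self.target.name}"


# A new (hypothetical) messaging language defined by specialization only.
class MessageQueue(Container):
    pass  # inherits the default container depiction unchanged


class Channel(Connector):
    def render(self):
        # Optional override: a language-specific concrete syntax.
        return f"{self.source.name} ==[{self.name}]==> {self.target.name}"


orders = MessageQueue("orders")
orders.add(NamedElement("OrderPlaced"))
billing = MessageQueue("billing")
print(orders.render())
print(Channel("forward", orders, billing).render())
```

In this sketch MessageQueue gets a depiction without defining one, while Channel supplies an alternative concrete syntax for the same abstract structure.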

Insist on Extension Requirements

It is therefore incumbent on the voters participating in the OMG UML2 process to insist on fulfillment of the mandatory requirements for first-class extension of the UML metamodel—before voting to adopt any proposal. Allowing an inferior UML1 style of extension mechanism to persist in UML2 represents a substantial barrier to the success of OMG’s MDA. Moreover, it will splinter the new metamodels of architectural paradigms and platforms needed for MDA from the UML superstructure, preventing them from being close siblings sharing semantics, structure, notation, and tools.

[1] The main reason the MOF, rather than the UML kernel, is used for metamodeling is that it includes a standard mapping to OMG's Interface Definition Language for programmatic access to model repositories; it also includes a standard mapping to XML for streaming model interchange. Moreover, in its guise as the Java Metadata Interface, MOF includes native Java interfaces for the same purpose.
