Computing Applications Technical opinion

Dataless Objects Considered Harmful

Novice programmers should be taught not only the value of faithfully representing objects but, even more important, multiple programming paradigms as well.

While I owe apologies to Edsger Dijkstra, a leading member of the founding generation of computer science, for mimicking the title of his influential 1968 Communications article [1], my intent here is consistent with his spirit of promoting disciplined programming, whether it is procedure-oriented (PO) or object-oriented (OO).

As suggested in [4], in order to learn OO programming naturally, as opposed to procedurally, the "Hello, World" program (conventionally used to introduce programming in textbooks) should be changed to the following:

class HelloWorld {
   public static void printHello( ) {
      System.out.println("Hello, World");
   }
}
class UseHello {
   public static void main(String[ ] args) {
      HelloWorld myHello = new HelloWorld( );
      myHello.printHello( );
   }
}

However, this proposed OO revision is, in fact, harmful to beginning programmers, because it misleads them about what objects are truly about.

The intent of the revision is to show students from day one how an object is instantiated and how its behaviors are used. But the fundamental problem of the class HelloWorld is its dataless nature, meaning no instance data is encapsulated in the class. How would another object, say, yourHello (HelloWorld yourHello = new HelloWorld( )) be any different from myHello? It wouldn’t (other than a different memory reference). In fact, myHello.printHello( ) can be replaced by HelloWorld.printHello( ) with no object needed. The revised "Hello, World" program sends a misleading message to novices about the kind of classes they will be expected to create later on—which are, in essence, different from HelloWorld. These classes would encapsulate not only the methods but, more important, the data (or instance attributes) the methods act on as well. Without instance data being encapsulated, a class is simply a container for holding (static) methods that, while possibly helping improve the code’s organization from a maintenance perspective, provides no essential benefit to problem solving. On the other hand, a PO program is easily turned into an OO program by grouping (static) methods into classes and even into subclasses if the programmer so chooses, regardless of whether there are "Is-A" relationships among the classes.
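
The point can be checked with a minimal sketch. It assumes a variant of HelloWorld whose method returns the greeting instead of printing it, so the result can be compared directly; the names hello and DatalessDemo are hypothetical and appear only in this illustration.

```java
// Dataless class: no instance fields, only a static method.
// Assumption: hello() returns the text rather than printing it.
class HelloWorld {
    public static String hello() {
        return "Hello, World";
    }
}

class DatalessDemo {
    public static void main(String[] args) {
        HelloWorld myHello = new HelloWorld();
        HelloWorld yourHello = new HelloWorld();

        // The two references point at different objects...
        System.out.println(myHello != yourHello);

        // ...but the objects carry no state, and the call needs neither of them.
        System.out.println(HelloWorld.hello());
    }
}
```

Nothing about myHello or yourHello can influence the output, which is why the class teaches little about what an object is.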

If teachers want novice programmers to learn about objects, a faithful representation is required, as in the following example:

class Message {
   String messageBody;
   public void setMessage(String newBody) {
      messageBody = newBody;
   }
   public String getMessage( ) {
      return messageBody;
   }
   public void printMessage( ) {
      System.out.println(messageBody);
   }
}
public class MyFirstProgram {
   public static void main(String[ ] args) {
      Message mine = new Message ( );
      mine.setMessage("Hello, World");
      Message yours = new Message ( );
      yours.setMessage("This is my first program!");
      mine.printMessage( );
      System.out.println(yours.getMessage( ) + "—" + mine.getMessage( ) );
   }
}

Although this program is a bit more involved than the "Hello, World" revision cited earlier (even though it introduces no essentially new constructs), it conveys much more about object-based programming, as outlined in the following characteristics:

  • Abstract data type. A class specifies an abstract data type encapsulating both data and operations performed on that data;
  • Operations construction. Operations are implemented using a language construct called (in Java) method. While the signature (or parameter list) of a method may vary significantly, methods are characterized in two ways: value-returning and void;
  • Instance attributes. Without its instance attributes being specified, an object is meaningless; that is, the method setMessage must be called before any other method is invoked, though constructors are normally used to initialize values of instance data;
  • Classes for different purposes. Whereas the class Message is used to specify a blueprint for making concrete messages (objects), the other class (MyFirstProgram) is used simply to satisfy a Java language requirement that the "main" method be wrapped in a class;
  • Main method. The main method, for which the syntax may change from one language to another, is where the program flow is defined (in this case, where messages are processed); and
  • Reusability. The class Message is reusable for making new messages (objects), and methods are generally reusable.
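
The remark about constructors in the list above can be sketched as follows. This is one possible variant of Message, not the article's listing: the constructor and the class name ConstructorDemo are assumptions added for illustration, and the field is made private here to stress encapsulation.

```java
class Message {
    private String messageBody;   // instance data, hidden from other classes

    // Assumed addition: a constructor, so no object exists without a body.
    public Message(String body) {
        messageBody = body;
    }

    public void setMessage(String newBody) {
        messageBody = newBody;
    }

    public String getMessage() {
        return messageBody;
    }

    public void printMessage() {
        System.out.println(messageBody);
    }
}

class ConstructorDemo {
    public static void main(String[] args) {
        // Each object is meaningful from the moment it is created.
        Message mine = new Message("Hello, World");
        Message yours = new Message("This is my first program!");
        mine.printMessage();
        yours.printMessage();
    }
}
```

With a constructor, the rule that setMessage must be called first no longer has to be remembered by the caller.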

Moreover, the Message example is easily extended to address inheritance (for example, an email message or a memo can be derived by adding more relevant attributes to a message that includes only a message body) and polymorphism (for example, an email or a memo can have its own version of printMessage).
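
A rough sketch of such an extension follows. The subclasses EmailMessage and Memo, their attributes, and the helper method format are all hypothetical; format is introduced so that each subclass can change what printMessage writes while the text stays easy to inspect.

```java
// Base class: the article's Message plus a hypothetical format() helper
// that builds the text printMessage() writes out.
class Message {
    private String messageBody;

    public void setMessage(String newBody) { messageBody = newBody; }
    public String getMessage() { return messageBody; }

    public String format() { return messageBody; }

    public void printMessage() { System.out.println(format()); }
}

// Inheritance: an email is a message with one more attribute.
class EmailMessage extends Message {
    private String recipient;

    public void setRecipient(String newRecipient) { recipient = newRecipient; }

    // Polymorphism: the email supplies its own version of the output.
    @Override
    public String format() { return "To: " + recipient + "\n" + getMessage(); }
}

// A memo adds a subject line instead.
class Memo extends Message {
    private String subject;

    public void setSubject(String newSubject) { subject = newSubject; }

    @Override
    public String format() { return "MEMO re: " + subject + "\n" + getMessage(); }
}

class PolymorphismDemo {
    public static void main(String[] args) {
        EmailMessage email = new EmailMessage();
        email.setRecipient("you@example.com");
        email.setMessage("Hello, World");

        Memo memo = new Memo();
        memo.setSubject("First program");
        memo.setMessage("This is my first program!");

        // The same call on a Message reference runs the subclass's version.
        Message[] all = { email, memo };
        for (Message m : all) {
            m.printMessage();
        }
    }
}
```

Overriding printMessage directly in each subclass, as the text describes, would work equally well; routing through format is only a convenience of this sketch.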

The author of [4] proposed his version of "Hello, World" out of concern that novices who start with procedural programs would later have to unlearn PO habits, a popular justification for teaching objects-first in university programming courses. However, teaching dataless objects fundamentally defeats the notion of teaching objects-first. Aside from the fact that programs using only dataless objects are, in essence, procedural, novices must still learn PO programming.

There are at least four main reasons for doing so. First, PO programming is not simply about writing procedures or functions but about the stepwise refinement of functional decomposition and disciplined practices toward programming. In solving sophisticated business problems, objects (whether local or remote) don’t communicate in a program without a main procedure, whereby the business process is faithfully and structurally executed. A main procedure is not only the client of the various business objects, it can also be (and usually is) the client of the methods that address the functional decomposition of a business process, regardless of whether these methods are standalone or grouped in dataless classes. On the other hand, for programming-in-the-small, or coding the details of a design, the idea of structured programming remains applicable to designing class-member methods with an appropriate level of cohesion, coupling, and reusability.

Second, a real-world problem may be inherently PO. For instance, despite the fact that matrices and vectors can be simulated with abstract data types, most scientific computation problems are PO, whereby a solution process is decomposed into procedures that communicate with one another based on the same set of input data. The result is many intermediate variables that don’t correspond to separate entities of the application domain. A solution (or part of its decomposition) is not generally an object, since other solutions may differ in every detail in terms of how they work; hence, it’s practically impossible for a programmer to define a class that would allow instantiating any of the possible solutions.
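
As an illustration only, a deliberately small numeric task can stand in for a real scientific computation. In the sketch below, the task (standardizing a data set) and every name are assumptions; what matters is that the solution is a chain of procedures over the same input, with intermediate values that are not domain entities.

```java
class Standardize {
    // Step 1: arithmetic mean of the input.
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    // Step 2: population variance, reusing the mean computed in step 1.
    static double variance(double[] x, double mean) {
        double sumSq = 0.0;
        for (double v : x) {
            sumSq += (v - mean) * (v - mean);
        }
        return sumSq / x.length;
    }

    // Step 3: z-scores. Assumes a non-empty input with non-zero spread.
    static double[] standardize(double[] x) {
        double m = mean(x);                      // intermediate value
        double sd = Math.sqrt(variance(x, m));   // intermediate value
        double[] z = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            z[i] = (x[i] - m) / sd;
        }
        return z;
    }

    public static void main(String[] args) {
        double[] z = standardize(new double[] {1.0, 2.0, 3.0});
        System.out.println(java.util.Arrays.toString(z));
    }
}
```

The class Standardize exists only because Java requires one; m and sd are steps in a calculation, not objects worth modeling.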

Third, inheritance can be computationally costly. Therefore, even for problems that can be represented either way (PO or OO), developers often choose the PO approach because of performance or complexity concerns, as often seen in real-time applications.

Fourth, if object orientation is in fact revolutionary in terms of the way software developers think when designing a program, it is an evolutionary paradigm in terms of what they actually do. In such languages as Pascal and C, heterogeneous data is encapsulated separately from the procedures that act on the data. In SIMULA 67, a class looks more like a single procedure, so an instance of a class is in fact an activation of the procedure without needing activation record deallocation when the control is transferred. These encapsulation options were united in later languages (such as C++ and Java) that allow users to define abstract data types in a much broader sense with inheritance and polymorphism as the core features of the paradigm supported by the language.
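
The contrast between the two styles can be approximated in Java. The account example and all names below are hypothetical: AccountRecord and AccountProcedures imitate the record-plus-procedures arrangement of Pascal or C, while Account unites data and operations in one unit.

```java
// Record style: the data is declared on its own, with no behavior.
class AccountRecord {
    double balance;
}

// The procedures that act on the record live elsewhere.
class AccountProcedures {
    static void deposit(AccountRecord account, double amount) {
        account.balance += amount;
    }
}

// Class style: the same data and operation, encapsulated together.
class Account {
    private double balance;

    public void deposit(double amount) {
        balance += amount;
    }

    public double getBalance() {
        return balance;
    }
}

class EncapsulationDemo {
    public static void main(String[] args) {
        AccountRecord record = new AccountRecord();
        AccountProcedures.deposit(record, 50.0);

        Account account = new Account();
        account.deposit(50.0);

        // Both styles end up holding the same balance.
        System.out.println(record.balance == account.getBalance());
    }
}
```

What the program does is the same in both cases; only the place where data and procedures are declared has moved, which is the sense in which the change is evolutionary.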

As an OO language designer, the author of [3] demonstrated the deep rootedness of the design of an OO language (Eiffel in his case) as part of the revolution that structured programming brought to software construction. Given the history of evolutionary development of OO programming, learning PO programming (in addition to its practical value) gives novices an opportunity to appreciate the way OO solves problems in terms of simulating real-world objects, as well as the way these objects communicate, hence facilitating (not hampering) the learning process.

University teachers were still not convinced in the mid-1990s (nearly 30 years after OO first appeared in SIMULA 67) that OO should be taught to, at least, upper-level students. These teachers feared that the complex nature of OO might not justify the effort, should OO be replaced later by something better. Today, we face another extreme: learning programming exclusively via objects and the OO paradigm. Yet many advocates of teaching objects-first (along with textbooks) often fail to identify the (real-world) problems for which solutions involving OO would demonstrate OO’s superiority over PO.

Instead, novice programmers have seen OO-dressed-up classical programming examples with no real advantages, twisted design structures meant solely to accommodate the needs of inheritance or polymorphism, and insufficient coverage of basic algorithms and structured programming skills demanded in more advanced courses.

It should be obvious that learning real-world business or scientific problem solving through programming must allow multiple paradigms, as one paradigm (or a mix of paradigms) may be a better choice than others in overall design decision making. In a real-world example, a 2004 study [2] suggested that the separation of data and actions—contrary to the philosophy of the OO paradigm—may be better suited to many business applications than other problem-solving paradigms. However, until we agree on teaching approaches that would give students in beginning programming classes balanced exposure to commonly used programming paradigms, many of these students will continue to struggle and be confused.

References

    1. Dijkstra, E. Go To statement considered harmful. Commun. ACM 11, 3 (Mar. 1968), 147–148.

    2. Kambayashi, Y. and Ledgard, H. The separation principles: A programming paradigm. IEEE Software 21, 2 (Mar.–Apr. 2004), 78–87.

    3. Meyer, B. From structured programming to object-oriented design: The road to Eiffel. Structured Program. 10, 1 (1989), 19–39.

    4. Westfall, R. Hello, World considered harmful. Commun. ACM 44, 10 (Oct. 2001), 129–130.

