Computer science is both a science and an art. Its scientific aspects range from the theory of computation and algorithmic studies to code design and program architecture. Yet, when it comes time for implementation, there is a combination of artistic flair, nuanced style, and technical prowess that separates good code from great code.
Like art, code is simultaneously subjective and non-subjective. The non-subjective aspects of coding include "hard" ideas that must be followed to create good code: design patterns, project structures, the use of common libraries, and so on. Although these concepts lay the foundation for developing high-quality, maintainable code, it is the nuances of a programmer's technique and tools (alignment, naming, use of white space, use of context, syntax highlighting, and IDE choice) that truly make code clear, maintainable, and understandable, while also giving code the ability to clearly communicate intent, function, and usage.
This separation between good and great code occurs because every person has an affinity for his or her own particular coding style based on his or her own good (or bad) habits and preferences. Anyone can write code within a design pattern or using certain "hard" techniques, but it takes a great programmer to fill in the details of the code in a way that is clear, concise, and understandable. This is important because just as every person may draw a unique meaning or experience from a single piece of artwork, every developer or reader of code may infer different meanings from the code depending on naming and other conventions, despite the architecture and design of the code.
From another angle, programming may also be seen as a form of "encryption." In various ways the programmer devises a solution to a problem and then encrypts the solution in terms of a program and its support files. Months or years later, when a change is called for, a new programmer must decrypt the solution. This is usually not an enviable task, which can mainly be blamed on a failure of clear communication during the initial "encryption" of the project. Decrypting information is simple when the necessary key is present. So, too, is understanding old code when special attention has been paid to what the code itself communicates.
To address this issue, some works have defined a single coding standard for an entire programming language [7], while others have acquiesced to accepting naming conventions as long as they are consistent [6]. Beautiful code has been defined in general terms as readable, focused, testable, and elegant [1]. The more extreme case is the invention of an entire programming language built around a concrete set of ideals, such as Ruby or Python. Ruby emphasizes brevity, simplicity, flexibility, and balance [4]. The principles behind Python are clear in the Zen of Python [5], where the focus lies on beauty, simplicity, readability, and reliability.
Our approach to this issue has been to develop a system of coding guidelines (available online [3]). While these guidelines come from an educational environment, they are designed to be useful to practitioners as well. The guidelines rest on a few broad principles that capture fundamentals of communication and elevate the notion of coding conventions to a higher level. The use of these conventions will also improve the sustainability of a code base. This article looks at these underlying principles.
One area not considered here is the use of syntax highlighting or IDEs. While either one may make code more readable (because of syntax highlighting or code folding, among others) and easier to manage (for example, quickly looking up or refactoring functions and/or variables), our guidelines have been developed to be IDE and color neutral. They are meant to reflect foundational principles that are important when writing code in any setting. Also, while IDEs can help improve readability and understanding in some ways, the features found in these tools are not standard (consider the different features found in Visual Studio, Eclipse, and VIM, for example). Likewise, syntax highlighting varies greatly among environments and may easily be changed to match personal preference. The goal of the following principles is to build a foundation for good programming that is independent of the programming IDE.
In a recent ACM Queue article, Poul-Henning Kamp [2] makes the fascinating point that much of the style of programming languages stems from the ASCII character set and typewriter-based terminals. Programming languages make no use of the graphical properties and options of modern devices. While code must be written with the clarity of good English grammar, it is not English text. Instead it is more like math and tables.
This is a far-reaching principle. First, it speaks directly to the use of fonts. Do not use a variable-width (proportional) font for program code, as code is not text. Fixed-width fonts (for example, Courier and Data Gothic) look appealing and allow easy alignment of code. Proportional (variable-width) fonts prevent proper alignment, and even more importantly, do not "look like" code.
While one should continue to think of a program as a sequence of actions or as an algorithm at a high level, each section of code should also be thought of as a presentation of a chart, table, or menu. In figures 1, 2, and 3 notice the use of vertical alignment to show symmetry. This is a powerful method of communication.
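The original figures are not reproduced here. As a minimal sketch of the table-like layout described above, the following Python fragment aligns a set of parallel entries vertically so that their symmetry is visible at a glance. The names `UNIT_SECONDS` and `to_seconds` are hypothetical and chosen only for illustration.

```python
# Hypothetical example: parallel entries laid out as a small table.
# The extra spaces are deliberate; they right-align the numeric column.
UNIT_SECONDS = {
    "second":     1,
    "minute":    60,
    "hour":    3600,
    "day":    86400,
}


def to_seconds(amount, unit):
    """Convert an amount of the given unit into seconds."""
    return amount * UNIT_SECONDS[unit]
```

Read as a table, the block makes a missing or mistyped row easy to spot, which is the communicative benefit the alignment is meant to provide.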
In the case when a long line of code spills into multiple lines, we suggest breaking and realigning the code. For example, instead of

use

or

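The listings that belong to this passage are not reproduced here. As a hypothetical illustration of breaking and realigning a long line, the sketch below shows one long call (kept as a comment) and the same call broken so that each argument sits on its own aligned row. The function `total_price` and its parameters are assumptions made up for this example.

```python
def total_price(unit_price, quantity, tax_rate, shipping_cost):
    """Return the final price of an order (hypothetical helper)."""
    return unit_price * quantity * (1 + tax_rate) + shipping_cost


# One long line that spills past the margin:
# order_total = total_price(unit_price=19.99, quantity=3, tax_rate=0.07, shipping_cost=4.50)

# The same call, broken after each argument and realigned under the first one:
order_total = total_price(unit_price=19.99,
                          quantity=3,
                          tax_rate=0.07,
                          shipping_cost=4.50)
```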
A programmer creates a name for something with full knowledge of its use, and often many names make sense when one knows what the name represents. Thus, the programmer has this problem: creating a name based on a concept. The true challenge, however, is precisely the opposite: inferring the concept based on the name! This is the problem that the program reader has.
Consider the simple name
sputn
taken from the common C++ header file <iostream.h>. An inexperienced or unfamiliar programmer may suddenly be mentally barraged with a bout of questions such as: Is it an integer? A pointer? An array or a structure? A method or a variable? Does sp stand for saved pointer? Is sput an operation to be done n times? Do you pronounce it sputn or s-putn or sput-n or s-put-n?
We advocate basing names on conventional English usage: in particular, simple, informal, abbreviated English usage. Consider the following more specific guidelines:
Some examples of this broad principle are shown in Figure 4.
There is an interesting but small issue when considering examples such as:

While countFiles is a good name, it is not an optimal name since it is a verb. Verbs should be reserved for procedure calls that have an effect on variables. For functions that have no side effects on variables, use a noun or noun phrase. One does not usually say

but rather

We suggest that

is a slight improvement. More importantly, this enforces the general rule that verbs denote procedures, and nouns or adjectives denote functions.
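A minimal sketch of the verb/noun rule follows. The names `file_count` and `remove_backups` are hypothetical and do not come from the original listings; the convention that a trailing `/` marks a directory and a trailing `~` marks a backup file is likewise an assumption made for the example.

```python
def file_count(names):
    """Noun phrase: a function with no side effects; it only returns a value."""
    # Assumption: entries ending in "/" are directories, everything else is a file.
    return sum(1 for name in names if not name.endswith("/"))


def remove_backups(names):
    """Verb phrase: a procedure that changes its argument in place."""
    # Assumption: entries ending in "~" are editor backup files.
    names[:] = [name for name in names if not name.endswith("~")]
```

At the call site, `n = file_count(names)` reads like an expression, while `remove_backups(names)` reads like a command, so the reader can tell which call changes state without opening either definition.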
All other things being equal, shorter programs are always better. As an example, local variables that are used as index variables may be named i, j, k, and so on. An array index used on every line of a loop need not be named any more elaborately than i. Using index or elementNumber obscures the details of the computation through excessive description. A variable that is rarely used may deserve a long name: for example, MaxPhysicalAddr. When variable names are long, especially if there are many of them, it quickly becomes difficult to see what's going on. A variable name can often be shortened by relying on the context in which it is used. For example, in a stack implementation the variable can be named Store rather than StackStore.
Major variables (objects) that are used frequently should be especially short, as seen in the examples in Figure 5. For major variables that are used throughout the program, a single letter may encourage program clarity.
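The following sketch illustrates both points under stated assumptions: a hypothetical `Stack` class whose storage is called `store` because the enclosing class already supplies the context, and a hypothetical `dot` function whose loop index is simply `i`. Neither comes from the original figures.

```python
class Stack:
    """Hypothetical stack; inside it, `store` needs no `stack_` prefix."""

    def __init__(self):
        self.store = []          # the class name already says whose store this is

    def push(self, item):
        self.store.append(item)

    def pop(self):
        return self.store.pop()


def dot(a, b):
    """Dot product of two equal-length sequences."""
    # Written with an explicit index to illustrate the naming point;
    # an index used on every line of the loop is just `i`.
    total = 0
    for i in range(len(a)):
        total += a[i] * b[i]
    return total
```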
While written and spoken communication may reach a high level of clarity, it is often left wanting of meaning if not accompanied by the personal touch of nonverbal cues and tendencies. An individual's body language helps clarify the spoken word. In a similar sense, the programmer relies on white space (what is not said directly) in the code to communicate logic, intent, and understanding.
An example is the use of blank lines between conceptually different sections of code. Blank lines should improve readability as they separate logically different segments of the code and thus provide the literary equivalent of a section break. Appropriate places to use blank lines include:
Consider the code listing in Figure 6. Individual blank spaces should also be used to show the logical structure within a single statement. Strategic blank spaces within a line simplify the parsing done by the human reader. At a minimum, blank spaces should be included after the commas in argument lists and around the assignment operator "=" and the redirection operators "<<" and ">>".
On the other hand, blank spaces should not be used for unary operators such as unary minus (-), address of (&), indirection (*), member access (.), increment (++), and decrement (--).
Also, if it makes sense, put two to three statements on one line. This practice has the effect of simplifying the code, but it must be used with discretion and only where it is sensible to do so.
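As a small sketch of these white-space guidelines (the functions `bounding_box` and `mirrored` are hypothetical), the fragment below uses blank lines to separate conceptually different steps, spaces after commas and around `=`, no space after a unary minus, and two short parallel statements per line where that makes the symmetry clearer.

```python
def bounding_box(points):
    """Return (left, bottom, right, top) for a list of (x, y) points."""
    xs = [x for x, y in points]
    ys = [y for x, y in points]

    # Two short, parallel statements per line; the columns show the symmetry.
    left   = min(xs);   right = max(xs)
    bottom = min(ys);   top   = max(ys)

    return left, bottom, right, top


def mirrored(points):
    """Reflect points across the y-axis; note the unary minus has no space."""
    return [(-x, y) for x, y in points]
```

Whether several statements belong on one line remains a judgment call; here the pairing is used only because the statements are short and mirror each other.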
The case statement used in Figure 1 brings up a general point: very simple decision statement structures can be tersely presented, showing the alternative code simply, and, if possible, without braces, as in the example in Figure 7.
It is not uncommon for simple conditions to be mutually exclusive, creating a kind of generalized case statement. This, as is common practice, can be printed as a chain, as in Figure 8.
Of course, it may be that the structures are truly nested, and then one must use either nested spacing or functions to indicate the alternatives. Again, the general point is to let the structure drive the layout, not the syntax of the programming language.
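The terse decision structure and the chain of mutually exclusive conditions described above might look like the following sketch. The function `grade` and its thresholds are hypothetical; Python has no braces to omit, so the terseness shows up as one alternative per line with the actions aligned in a column.

```python
def grade(score):
    """Map a numeric score to a letter grade (hypothetical thresholds)."""
    if   score >= 90: return "A"
    elif score >= 80: return "B"
    elif score >= 70: return "C"
    elif score >= 60: return "D"
    else:             return "F"
```

The layout follows the structure of the decision (five exclusive alternatives) rather than the default one-statement-per-line shape the syntax would otherwise suggest.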
In the brace wars, we do not take a strong stand on the various preferences shown in Figure 9, but we do feel strongly that the indent is vital, as it is the indent that shows the structure.
The ability to communicate clearly is an issue that is faced in all facets of the human experience. Programmers must achieve a level of clarity, continuity, and beauty when writing code. This means focusing on the code and its clarity, balance, and symmetry, not on its length or comments. While this concept does not advocate the removal of comments or negate their use and importance in appropriate situations, it does suggest that programmers must use comments wisely and judiciously. The focus should be on developing code that, for the most part, clearly communicates intent and functionality. This practice will automatically reduce the need for many comments.
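To make the point about comments concrete, here is a hypothetical before-and-after sketch. Both functions compute the same result; the first leans on comments that restate the code, while the second relies on naming and a compact expression. The names `f` and `positive_sum` are invented for this illustration.

```python
# Comment-heavy version: every comment restates what the line already says.
def f(l):
    # initialize the result to zero
    r = 0
    # loop over every element
    for x in l:
        # add the element to the result if it is positive
        if x > 0:
            r = r + x
    # return the result
    return r


# Reworked version: the name and the expression carry the intent.
def positive_sum(values):
    return sum(v for v in values if v > 0)
```

A comment would still be appropriate in the second version if, for instance, there were a non-obvious reason for excluding zero or negative values.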
Although the guidelines presented here are used in an educational setting, they also have merit in industrial environments. Students who are educated using these guidelines will most likely use them (or some variant) as they enter industry. To demonstrate this, we have developed an example that applies these guidelines to two very different styles. The first is the Unix style. It is terse, often making use of vowel deletion, and is often found in realistic applications such as operating-system code. This is not to imply that all or most system programmers use this style, only that it is not unusual. Figure 10 shows a small example of this style.
We call the second style the textbook style, as illustrated in Figure 11. Again, this in no way means to imply that all or most textbooks use this style, only that the style in the example is not unusual. In this style the focus is on learning. This means that there is frequent commenting, and the code is well spread out. For the purposes of learning and understanding the details of a language, this style can be excellent. From a practical perspective or for any program of some scale, this style does not work well as it can be overwhelming to use or to read. Moreover, this style makes it difficult to see the overall design, as if one is stuck under the trees and cannot see the forest around.
Figure 12 is a rework of the function in figures 10 and 11, using the guidelines discussed here to make a smooth transition between academic and practical code. This figure shows a balance of both styles, relying more directly on the code itself to communicate intent and functionality clearly. Compared with the textbook style, the resultant code is shorter and more compact while still clearly communicating meaning, intent, and functionality. When compared with the Unix style, the code is slightly longer, but the meaning, intent, and functionality are clearer than the original code.
Figure 13 illustrates the guidelines presented here in another setting. This is a function taken from a complex program (10,000 lines) related to power-system reliability and energy use regarding PHEVs (plug-in hybrid electric vehicles). The program makes numerous calculations related to the effect that such vehicles will have on the current power grid and the effect on generation and transmission systems. This program attempts to evaluate the reliability of power systems by developing a model for reliability evaluation using a Monte Carlo simulation.
While the previous examples show the merit of the guidelines presented here, one argument against such guidelines is that making changes to keep a certain coding style intact is time consuming, particularly when a version-control system is used. In the face of a time-sensitive project or a project that most likely will not be updated or maintained in the future, the effort may not be worthwhile. Typical cases include class projects, a Ph.D. thesis, or a temporary application.
If, however, the codebase in question has a long lifespan or will be updated and maintained by others (for example, an operating system, server, interactive Web site, or other useful application), then almost any changes to improve readability are important, and the time should be taken to ensure the readability and maintainability of the code. This should be a matter of pride, as well as an essential function of one's job.
The authors would like to thank David Marcus and Poul-Henning Kamp for their insightful comments while completing this work, as well as the software engineering students who have contributed to these guidelines over the years.
Related articles
on queue.acm.org
Beautiful Code Exists, if You Know Where to Look
George Neville-Neil
http://queue.acm.org/detail.cfm?id=1454458
Software Development with Code Maps
Robert DeLine, Gina Venolia and Kael Rowan
http://queue.acm.org/detail.cfm?id=1831329
Reading, Writing, and Code
Diomidis Spinellis
http://queue.acm.org/detail.cfm?id=957782
1. Heusser, M. Beautiful code. Dr. Dobb's (Aug. 2005); http://www.ddj.com/184407802.
2. Kamp, P-H. Sir, please step away from the ASR-33! ACM Queue 8, 10 (2010); http://queue.acm.org/detail.cfm?id=1871406.
3. Ledgard, H. Professional coding guidelines. Unpublished report, University of Toledo, 2011; http://www.eng.utoledo.edu/eecs/faculty_web/hledgard/softe/upload/.
4. Molina, M. What makes code beautiful. Ruby Hoedown, 2007.
5. Peters, T. The Zen of Python. PEP (Python Enhancement Proposals). Aug. 20, 2004; http://www.python.org/dev/peps/pep-0020/.
6. Reed, D. Sometimes style really does matter. J. Computing Sciences in Colleges 25, 5 (2010), 180-187.
7. Sun Developer Network. Code conventions for the Java programming language, 1999; http://java.sun.com/docs/codeconv/.
Figure 1. Use of vertical alignment to show symmetry.
Figure 2. Example of cluttered presentation.
Figure 4. Examples of basing names on conventional English usage.
Figure 5. Keeping names short and simple.
Figure 6. Example of code that uses white space well.
Figure 7. Decision statement structure, tersely presented.
Figure 8. Case statement presented as a chain.
Figure 9. Examples of K&R, ANSI, and Whitesmiths coding styles.
Figure 10. Example of a systems-programming coding style.
Figure 11. Example of a textbook coding style.
Figure 12. Example of a coding style using the guidelines presented here.
Figure 13. Realistic and complex example of code following the guidelines presented here.
©2011 ACM 0001-0782/11/1200 $10.00
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and full citation on the first page. Copyright for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers, or to redistribute to lists, requires prior specific permission and/or fee. Request permission to publish from permissions@acm.org or fax (212) 869-0481.
The Digital Library is published by the Association for Computing Machinery. Copyright © 2011 ACM, Inc.
Hi, you are not taking your twitter communication serious, are you?
The article is extremely inconsistent with regard to the examples.
The guidelines described here are NOT applied to the examples.
A very good example is Fig. 13: http://deliveryimages.acm.org/10.1145/2050000/2043191/figs/f13.jpg
I hope it is just a technical problem like tabs vs. spaces and an editing process that is not used to handling source code.
Please consider fixing this article, it is just embarrassing.
Thanks!
Thanks for the comments and reading of the paper! There are some errors in the examples, and we appreciate you bringing this to our attention. We will see if we can get them corrected.
--Robert C. Green
The errors in the figures were very distracting, e.g., Fig. 1 was a disaster:
-- several duplicate case labels
-- "case default" is invalid syntax
-- System.out.printIn is not defined
Whoever wrote this gets an "F".
All writers in Communications should compile their examples before submitting -- it's an old rule that still holds.
Thanks for your article!
I would have liked to see how this relates to Knuth's Literate Programming... Any ideas on that?
While I’m not very familiar with literate programming, I believe there is definitely some overlap with what we have presented. For example, consider this excerpt from Knuth’s “Literate Programming”:
"The practitioner of literate programming can be regarded as an essayist, whose main concern is with exposition and excellence of style. Such an author, with thesaurus in hand, chooses the names of variables carefully and explains what each variable means. He or she strives for a program that is comprehensible…"
This is a statement that could have easily been included in this paper and clearly works well with what we have presented. This is particularly true regarding the guideline “Let Simple English be Your Guide”.
One difference appears to be our guideline of “Focus on the Code, Not the Comments.” Literate programming seems to be focused deeply on comments and documentation instead of letting the code speak for itself through its design.
Again, this is a cursory comparison and it would be interesting to perform a deeper and more thorough comparison with our guidelines.
-Rob Green
Thanks a lot for your inspiring article. Every developer should strive for better code quality and readable variable/class/method names. My gold standard when it comes to code quality is "Clean Code: A Handbook of Agile Software Craftsmanship (Robert C. Martin)". There are a number of differences to your suggestions, for example Robert C. Martin does not recommend the horizontal alignment of code, because he thinks the real problem is the length of the lists - and I agree with that, although I used to align my code for years (it's still useful with certain programming languages or assembler code, when you can't avoid long lists).