Teaching Real-World Programming

In this post, I describe a ubiquitous style of programming that, to my knowledge, has never been formally taught in the classroom.

In most programming classes, students write programs in a single language (e.g., Java, Python) and its standard library; they might use a well-documented third-party library for, say, graphics. Students fill in skeleton code templates provided by instructors or, at most, write a few chunks of code "from scratch." Specifications and interfaces are clearly defined, and assignments are graded using automated test suites to verify conformance to specs.

What I just described is necessary for introducing beginners to basic programming and software engineering concepts. But it bears little resemblance to the sorts of programming that these students must later do in the real world.

Over my past decade of programming, I’ve built research prototypes, extended open-source software projects, shipped products at startups, and engaged in formal software engineering practices at large companies. Regardless of setting, here are the typical steps that my colleagues and I take when starting a new project:

1. Forage: Find existing snippets of code to build my project upon. This might include code that I wrote in the past or that colleagues sent to me in various stages of bit-rot. If I’m lucky, then I can find a software library that does some of what I want; if I’m really lucky, then it will come with helpful documentation. Almost nobody starts coding a real-world project "from scratch" anymore; modern programmers usually scavenge parts from existing projects.

2. Tinker: Play with these pieces of existing code to assess their capabilities and limitations. This process involves compiling and running the code on various inputs, inserting "print" statements to get a feel for when certain lines execute and with what values, and then tweaking the code to see how its behavior changes and when it breaks.

(Now loop between steps 1 and 2 until I’m satisfied with my choice of building blocks for my project. Then move on to step 3.)

3. Weld: Try to attach ("weld") pieces of existing code to one another. I might spend a lot of time getting the pieces compiled and linked together due to missing or conflicting dependencies. Impedance mismatches are inevitable: Chances are, the code I have just welded together were never designed to "play nicely" with one another or to suit the particular needs of my project.

4. Grow: Hack up some hard-coded examples of my new code interfacing with existing "welded" code. At this point, my newborn code is sloppy and not at all abstracted, but that’s okay — I just want to get things working as quickly as possible. In the process, I debug lots of idiosyncratic interactions at the seams between my code and external code. Wrestling with corner cases becomes part of my daily routine.

5. Doubt: When implementing a new feature, I often ask myself, "Do I need to code this part up all by myself, or is there some idiomatic way to accomplish my goal using the existing code base or libraries?" I don’t want to reinvent the wheel, but it can be hard to figure out whether existing code can be molded to do what I want. If I’m lucky, then I can ask the external code’s authors for help; but I try not to get my hopes up because they probably didn’t design their code with my specific use case in mind. The gulf of execution is often vast: Conceptually simple features take longer than expected to implement.

6. Refactor: Notice patterns and redundancies in my code and then create abstractions to generalize, clean up, and modularize it. As I gradually refactor, the interfaces between my code and external code start to feel cleaner, and I also develop better intuitions for where to next abstract. Eventually I end up "sanding down" most of the rough edges between the code snippets that I started with in step 4.

(Now repeat steps 4 through 6 until my project is completed.)

I don’t have a good name for this style of programming, so I’d appreciate any suggestions. The closest is Opportunistic Programming, a term that my colleagues and I used in our CHI 2009 paper where we studied the information foraging habits of web programmers. Also, I coined the term Research Programming in my Ph.D. dissertation, but the aforementioned six-step process is widespread outside of research labs as well. (A reader suggested the term bricolage.)

Students currently pick up these hands-on programming skills not in formal CS courses, but rather through research projects, summer internships, and hobby hacking.

One argument is that the status quo is adequate: CS curricula should focus on teaching theory, algorithm design, problem decomposition, and engineering methodologies. After all, "CS != Programming," right?

But a counterargument is that instructors should directly address how real-world programming — the most direct applications of CS — is often a messy and ad-hoc endeavor; modern-day programming is more of a craft and empirical science rather than a collection of mathematically-beautiful formalisms.

How might instructors accomplish this goal? Perhaps via project-based curricula, peer tutoring, pair programming, one-on-one mentorship, or pedagogical code reviews. A starting point is to think about how to teach more general intellectual concepts in situ as students encounter specific portions of the six-step process described in this post. For example, what can "code welding" teach students about API design? What can refactoring teach students about modularity and testing? What can debugging teach students about the scientific method?

My previous CACM post, "Teaching Programming To A Highly Motivated Beginner," describes one attempt at this style of hands-on instruction. However, it’s still unclear how to scale up this one-off experience to a classroom (or department) full of students. The main challenge is striking a delicate balance between exposing students to the nitty-gritty of real-world programming while also teaching them powerful and generalizable CS principles along the way.