Communications of the ACM

BLOG@CACM

Teaching Programming the Way It Works Outside the Classroom


Philip Guo


http://cacm.acm.org/blogs/blog-cacm/159263-teaching-real-world-programming/fulltext
January 7, 2013

In this post, I describe a ubiquitous style of programming that, to my knowledge, has never been formally taught in the classroom.

In most programming classes, students write programs in a single language (e.g., Java, Python) and its standard library; they might use a well-documented third-party library for, say, graphics. Students fill in skeleton code templates provided by instructors or, at most, write a few chunks of code "from scratch." Specifications and interfaces are clearly defined, and assignments are graded using automated test suites to verify conformance to specs.

What I just described is necessary for introducing beginners to basic programming and software engineering concepts. But it bears little resemblance to the sorts of programming that these students must later do in the real world.

Over my past decade of programming, I have built research prototypes, extended open-source software projects, shipped products at startups, and engaged in formal software engineering practices at large companies. Regardless of setting, here are the typical steps my colleagues and I take when starting a new project:

  1. Forage: Find existing snippets of code to build my project upon. This might include code I wrote in the past or that colleagues sent to me in various stages of bit-rot. If I am lucky, I can find a software library that does some of what I want; if I am really lucky, then it will come with helpful documentation. Almost nobody starts coding a real-world project "from scratch" anymore; modern programmers usually scavenge parts from existing projects.
  2. Tinker: Play with these pieces of existing code to assess their capabilities and limitations. This process involves compiling and running the code on various inputs, inserting "print" statements to get a feel for when certain lines execute and with what values, and then tweaking the code to see how its behavior changes and when it breaks. (Now loop between steps 1 and 2 until I am satisfied with my choice of building blocks for my project. Then move on to step 3.)
  3. Weld: Try to attach ("weld") pieces of existing code to one another. I might spend a lot of time getting the pieces compiled and linked together due to missing or conflicting dependencies. Impedance mismatches are inevitable: Chances are, the code pieces I have just welded together were never designed to "play nicely" with one another, or to suit the particular needs of my project.
  4. Grow: Hack up some hard-coded examples of my new code interfacing with existing "welded" code. At this point, my newborn code is sloppy and not at all abstracted, but that is okay; I just want to get things working as quickly as possible. In the process, I debug lots of idiosyncratic interactions at the seams between my code and external code. Wrestling with corner cases becomes part of my daily routine.
  5. Doubt: When implementing a new feature, I often ask myself, "Do I need to code this part up all by myself, or is there some idiomatic way to accomplish my goal using the existing code base or libraries?" I do not want to reinvent the wheel, but it can be hard to figure out whether existing code can be molded to do what I want. If I am lucky, I can ask the external code's authors for help; but I try not to get my hopes up because they probably did not design their code with my specific use case in mind. The gulf of execution is often vast: conceptually simple features take longer than expected to implement.
  6. Refactor: Notice patterns and redundancies in my code and then create abstractions to generalize, clean up, and modularize it. As I gradually refactor, the interfaces between my code and external code start to feel cleaner, and I also develop better intuitions for where to next abstract. Eventually I end up "sanding down" most of the rough edges between the code snippets that I started with in step 4.

(Now, repeat steps 4 through 6 until my project is completed.)
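
To make the steps above concrete, here is a minimal sketch in Python. It is not taken from any real project: both "foraged" snippets and every function name below are hypothetical, invented only to show a type mismatch at the seam between two pieces of code and the small adapter that falls out of refactoring.

```python
# Hypothetical "foraged" snippet A: parses "key = value" lines into a dict.
# Note that every value comes back as a string.
def parse_config(text):
    pairs = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


# Hypothetical "foraged" snippet B: expects a list of (name, seconds)
# tuples in which seconds is already a number.
def total_seconds(timings):
    return sum(seconds for _name, seconds in timings)


# Tinker: a throwaway probe such as print(parse_config("build = 12.5"))
# reveals that the values are strings, so A's output cannot be fed to B.

# Weld and Grow: a first attempt might hard-code the conversion, e.g.
#   cfg = parse_config(text)
#   timings = [("build", float(cfg["build"])), ("test", float(cfg["test"]))]

# Refactor: once the pattern is clear, pull the seam into one adapter.
def config_to_timings(pairs):
    """Convert snippet A's dict of strings into snippet B's list of tuples."""
    return [(name, float(value)) for name, value in pairs.items()]


def total_from_text(text):
    """The welded pipeline: parse, adapt, then sum."""
    return total_seconds(config_to_timings(parse_config(text)))


if __name__ == "__main__":
    sample = "# timings in seconds\nbuild = 12.5\ntest = 30\n"
    print(total_from_text(sample))  # prints 42.5
```

The adapter is deliberately tiny; the point is only that the mismatch (strings versus numbers) shows up while tinkering, gets patched by hand while growing, and is isolated in one place while refactoring.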

I do not have a good name for this style of programming, so I would appreciate any suggestions. The closest is Opportunistic Programming, a term my colleagues and I used in our CHI 2009 paper where we studied the information foraging habits of web programmers. Also, I coined the term Research Programming in my Ph.D. dissertation, but the aforementioned six-step process is widespread outside of research labs as well. (A reader suggested the term bricolage.)

Students currently pick up these hands-on programming skills not in formal CS courses, but rather through research projects, summer internships, and hobby hacking.

One argument is that the status quo is adequate: CS curricula should focus on teaching theory, algorithm design, problem decomposition, and engineering methodologies. After all, "CS != Programming," right?

But a counterargument is that instructors should directly address how real-world programming, the most direct application of CS, is often a messy and ad hoc endeavor; modern-day programming is more of a craft and empirical science than a collection of mathematically beautiful formalisms.

How might instructors accomplish this goal? Perhaps via project-based curricula, peer tutoring, pair programming, one-on-one mentorship, or pedagogical code reviews. A starting point is to think about how to teach more general intellectual concepts in situ as students encounter specific portions of the six-step process described in this post. For example, what can "code welding" teach students about API design? What can refactoring teach students about modularity and testing? What can debugging teach students about the scientific method?

My previous CACM post, "Teaching Programming To A Highly Motivated Beginner," describes one attempt at this style of hands-on instruction. However, it is still unclear how to scale up this one-off experience to a classroom (or department) full of students. The main challenge is striking a delicate balance between exposing students to the nitty-gritty of real-world programming and teaching them powerful and generalizable CS principles along the way.

Please post your thoughts as comments or email me at philip@pgbovine.net.

Readers' comments:

Over the last year, I've spent a fair number of cycles thinking about the disconnect between how people actually build code and how we teach programming in classrooms. This article hits on some of the same points I've thought about, especially with regard to the fact that programming today is increasingly about information foraging and composition of existing solutions. The process is defined by experimentation and iteration.

It's also more of a social task, with online resources and peers providing increasing amounts of support. I've been thinking about these factors as I start my new job, slinging code for the Googs. As with many large companies there is a ton of existing code, and I have set up some pair programming sessions with folks on my team. These folks have a ton more experience with the tools that I will be using than I do. They can point me to resources I would have a hard time finding myself.

I wonder if coursework can mirror reality here. One idea that might work is the creation of a new course that paired undergrads across years, perhaps sophomores and seniors. The seniors should have more experience working on a project through coursework and internships.

One option is to provide open-ended projects based on the skills the seniors bring to the table. A senior with app programming experience might be asked to create a new Android or iPhone game, whereas one who had been working on building systems might be asked to create a large-scale data processing app on top of Hadoop. Another option might be to have a two-stage course focused on a specific project, separated between years. Current sophomores would have to come back in two years and share their knowledge.

Of course this is one of many solutions, but I think we can do a better job of teaching the social aspects of building software.
        Kayur Patel

I think it is difficult to mirror real-world programming scenarios in the classroom, because the classroom isn't the real world. But that doesn't mean that a programming course for beginners has to be structured along short, isolated programming exercises using a single programming language. For a few years now we have been running a first-year programming course under the headline "object-oriented development of web applications." Of course, during this class the students program mainly in a single language, in this case Smalltalk. But because a web application is being developed, they have to learn to use a web framework (Seaside), HTML, and CSS from the beginning. Later on, JavaScript is added.

During the whole course, each student develops one single web application. The development is guided by a series of online lectures. For the first steps unit tests are provided by the instructors, but during further progress of the project the students have to develop their own unit tests.

At the beginning of the second semester, the students have to build teams with three members. They learn to develop software cooperatively and to merge components developed independently.

As the complete course content is presented by online lectures, the instructors can play a role as coaches instead of lecturers. This allows them to adjust to the students' heterogeneous prior knowledge.

Of course, not all real-world programming issues can be anticipated this way, but we think this kind of programming course allows many more typical problems to be covered than a conventional one does.
        Johannes Brauer

Author

Philip Guo is a visiting research scientist at edX. In the fall, he will join the Massachusetts Institute of Technology (MIT) Computer Science and Artificial Intelligence Laboratory (CSAIL) as a postdoctoral scholar.


©2013 ACM  0001-0782/13/08



Comments


Franklin Chen

I agree that programming as it exists in the real world is not taught and that this is a problem. I think this is a symptom of a larger problem: yes, programming is not computer science, just as building robots is not linear algebra, calculus, or physics. Problem is, we expect people who build robots to have had a decade of education, before arriving in college at all, to have mastered some basic math and science. But computer science, the "nuts and bolts" of programming, is not taught K-12. If all the low-level stuff of algorithms, data structures, and so forth could be taken for granted, then there could be an entire discipline revolving around programming that students could begin working on upon arrival in college. Imagine the difficulty of coming up with an undergrad curriculum for would-be physicists and roboticists if students arrived in college without knowing how to add. That's what we face today in anything involving computation.


Anonymous

You're forgetting one thing - in the real world you also have a purpose, a goal for what you are trying to create or a problem you are trying to solve. You also have a wealth of background and tacit knowledge about the constraints and affordances of programming. If you just had students mimic the actions of how programming is done in the real world, they would be doing it without understanding the reason for it.

Search for "problem-based learning" for more information on types of approaches in education that have students work on real-world, messy problems.


Anonymous

This style of programming goes by the name "code quilting". Part of the difficulty with moving this workflow method into the classroom is that it requires schools to redefine "cheating."


William Billingsley

We've had a course running for a couple of years now that seeks to address these issues, as well as the inherently collaborative nature of software engineering. We've published papers on it both at ICSE in 2012 and at the ACM's ITiCSE conference this year.

It's a second year course that in its first two iterations had approximately 70 students working on a common code-base: tinkering, extending, integrating their work, etc. This year we have 170 students. And we're on a path towards opening the course to the world.

(This also means that rather than have students work on small greenfield exercises, they're working on a project that in terms of numbers-of-programmers is possibly larger than some of them will work on in their first post-degree jobs.)

It's a follow-on to a more introductory programming course -- we expect students to come in to our course knowing the basic syntax of the language.

Billingsley, W. and Steel, J. 2013. A comparison of two iterations of a software studio course based on continuous integration. ACM Conference on Innovation and Technology in Computer Science Education (ITiCSE), 213-218.

Süß, J.G. and Billingsley, W. 2012. Using continuous integration of code and content to teach software engineering with limited resources. International Conference on Software Engineering (ICSE), 1175-1184.


Gail Murphy

The ubiquity of a wide variety of frameworks and libraries and the growth of resources such as Stack Overflow have changed the way in which a lot of software is developed. At UBC, we have been bringing this changing approach into our second-year curriculum in Computer Science. We have been immersing students in Java by showing them how to read and take apart existing Java code. We spend time in lecture showing how to navigate through code, refactor code, and test hypotheses about how the code might work. By the end of the course, the students augment an existing Android application by integrating new features based on widespread web services, such as the Yelp API. With this approach, we have been able to achieve several of the steps you outline. This approach assumes more of a top-down learning style where students are able to operate without understanding all of the details of the software they write. We are continually trying to introduce new approaches into the course to balance out the needs of those with a more bottom-up learning style who would like to understand the ramifications of every statement they write. There are challenges not only in making the new style of software development more systematic and explainable to others but also in meeting the needs of the variety of learners one finds in a large classroom. Thanks for raising this important issue!

