In this post, I describe a ubiquitous style of programming that, to my knowledge, has never been formally taught in the classroom.
In most programming classes, students write programs in a single language (e.g., Java, Python) and its standard library; they might use a well-documented third-party library for, say, graphics. Students fill in skeleton code templates provided by instructors or, at most, write a few chunks of code "from scratch." Specifications and interfaces are clearly defined, and assignments are graded using automated test suites to verify conformance to specs.
What I just described is necessary for introducing beginners to basic programming and software engineering concepts. But it bears little resemblance to the sorts of programming that these students must later do in the real world.
Over my past decade of programming, I’ve built research prototypes, extended open-source software projects, shipped products at startups, and engaged in formal software engineering practices at large companies. Regardless of setting, here are the typical steps that my colleagues and I take when starting a new project:
1. Forage: Find existing snippets of code to build my project upon. This might include code that I wrote in the past or that colleagues sent to me in various stages of bit-rot. If I’m lucky, then I can find a software library that does some of what I want; if I’m really lucky, then it will come with helpful documentation. Almost nobody starts coding a real-world project "from scratch" anymore; modern programmers usually scavenge parts from existing projects.
2. Tinker: Play with these pieces of existing code to assess their capabilities and limitations. This process involves compiling and running the code on various inputs, inserting "print" statements to get a feel for when certain lines execute and with what values, and then tweaking the code to see how its behavior changes and when it breaks.
(Now loop between steps 1 and 2 until I'm satisfied with my choice of building blocks for my project. Then move on to step 3.)
3. Weld: Try to attach ("weld") pieces of existing code to one another. I might spend a lot of time getting the pieces compiled and linked together due to missing or conflicting dependencies. Impedance mismatches are inevitable: Chances are, the pieces of code I have just welded together were never designed to "play nicely" with one another or to suit the particular needs of my project.
4. Grow: Hack up some hard-coded examples of my new code interfacing with existing "welded" code. At this point, my newborn code is sloppy and not at all abstracted, but that’s okay -- I just want to get things working as quickly as possible. In the process, I debug lots of idiosyncratic interactions at the seams between my code and external code. Wrestling with corner cases becomes part of my daily routine.
5. Doubt: When implementing a new feature, I often ask myself, "Do I need to code this part up all by myself, or is there some idiomatic way to accomplish my goal using the existing code base or libraries?" I don’t want to reinvent the wheel, but it can be hard to figure out whether existing code can be molded to do what I want. If I’m lucky, then I can ask the external code's authors for help; but I try not to get my hopes up because they probably didn’t design their code with my specific use case in mind. The gulf of execution is often vast: Conceptually simple features take longer than expected to implement.
6. Refactor: Notice patterns and redundancies in my code and then create abstractions to generalize, clean up, and modularize it. As I gradually refactor, the interfaces between my code and external code start to feel cleaner, and I also develop better intuitions for where to abstract next. Eventually I end up "sanding down" most of the rough edges between the code snippets that I started with in step 4.
(Now repeat steps 4 through 6 until my project is completed.)
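To make steps 4 and 6 a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and made up for illustration: legacy_parse_record stands in for a piece of foraged code that I cannot easily change, and the semicolon-separated record format is an assumption. The first function is the sloppy hard-coded attempt from the Grow step; the later ones show one possible way to sand down that seam during the Refactor step.

```python
# Hypothetical "foraged" code: imagine it came from an old project and
# cannot easily be changed. It returns positional fields as strings.
def legacy_parse_record(line):
    name, age, city = line.split(";")
    return name.strip(), age.strip(), city.strip()


# Step 4 (Grow): a hard-coded first attempt, just to get something working.
def grow_first_attempt():
    fields = legacy_parse_record("Ada; 36; London")
    return {"name": fields[0], "age": int(fields[1]), "city": fields[2]}


# Step 6 (Refactor): the field order and type conversions kept reappearing,
# so they are pulled into one adapter at the seam with the foraged code.
FIELD_CONVERTERS = (("name", str), ("age", int), ("city", str))


def parse_person(line):
    raw_fields = legacy_parse_record(line)
    return {key: convert(value)
            for (key, convert), value in zip(FIELD_CONVERTERS, raw_fields)}


def load_people(lines):
    # Skip blank lines, a corner case the foraged parser does not handle.
    return [parse_person(line) for line in lines if line.strip()]


if __name__ == "__main__":
    print(load_people(["Ada; 36; London", "", "Alan; 41; Wilmslow"]))
```

The refactored version does the same work as the hard-coded one, but the knowledge about field order and types now lives in a single place, so the rest of my code never touches the foraged function directly.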
I don’t have a good name for this style of programming, so I’d appreciate any suggestions. The closest is Opportunistic Programming, a term that my colleagues and I used in our CHI 2009 paper where we studied the information foraging habits of web programmers. Also, I coined the term Research Programming in my Ph.D. dissertation, but the aforementioned six-step process is widespread outside of research labs as well. (A reader suggested the term bricolage.)
Students currently pick up these hands-on programming skills not in formal CS courses, but rather through research projects, summer internships, and hobby hacking.
One argument is that the status quo is adequate: CS curricula should focus on teaching theory, algorithm design, problem decomposition, and engineering methodologies. After all, "CS != Programming," right?
But a counterargument is that instructors should directly address how real-world programming (the most direct application of CS) is often a messy and ad hoc endeavor; modern-day programming is more of a craft and empirical science than a collection of mathematically beautiful formalisms.
How might instructors accomplish this goal? Perhaps via project-based curricula, peer tutoring, pair programming, one-on-one mentorship, or pedagogical code reviews. A starting point is to think about how to teach more general intellectual concepts in situ as students encounter specific portions of the six-step process described in this post. For example, what can "code welding" teach students about API design? What can refactoring teach students about modularity and testing? What can debugging teach students about the scientific method?
My previous CACM post, "Teaching Programming To A Highly Motivated Beginner," describes one attempt at this style of hands-on instruction. However, it's still unclear how to scale up this one-off experience to a classroom (or department) full of students. The main challenge is striking a delicate balance between exposing students to the nitty-gritty of real-world programming and teaching them powerful and generalizable CS principles along the way.
Please post your thoughts as comments or email me at [email protected].
Over the last year, I've spent a fair number of cycles thinking about the disconnect between how people actually build code and how we teach programming in classrooms. This article hits on some of the same points that I've thought about, especially with regard to the fact that programming today is increasingly about information foraging and composition of existing solutions. The process is defined by experimentation and iteration.
It's also more of a social task, with online resources and peers providing increasing amounts of support. I've been thinking about these factors as I start my new job, slinging code for the Googs. As with many large companies, there is a ton of existing code, and I've set up some pair programming sessions with folks on my team. These folks have a ton more experience with the tools that I will be using than I do. They can point me to resources that I would have a hard time finding myself.
I wonder if coursework can mirror reality here. One idea that might work is the creation of a new course that pairs undergrads across years, perhaps sophomores and seniors. The seniors should have more experience working on projects through coursework and internships.
One option is to provide open-ended projects based on the skills the seniors bring to the table. A senior with app programming experience might be asked to create a new Android or iPhone game, whereas one who had been working on building systems might be asked to create a large-scale data processing app on top of Hadoop. Another option might be to have a two-stage course focused on a specific project, separated between years. Current sophomores would have to come back in two years and share their knowledge.
Of course this is one of many solutions, but I think we can do a better job of teaching the social aspects of building software.
During the whole course each student develops one single web application. The development is guided by a series of online lectures. For the first steps, unit tests are provided by the instructors, but as the project progresses the students have to develop their own unit tests.
At the beginning of the second semester the students have to form teams of three members. They learn to develop software cooperatively and to merge components developed independently.
As the complete course content is presented through online lectures, the instructors can act as coaches instead of lecturers. This allows them to adjust to the students' heterogeneous prior knowledge.
Of course, not all real-world programming issues can be anticipated this way, but we think this kind of programming course allows many more typical problems to be covered than during a conventional one.
My experience programming has been like that: creating, reusing, testing, all mingled together with self-criticism based on all those concepts you learn at school: big O, maintainability, patterns, data structures, algorithms, etc. Then, you iterate over this to refine your code and make it the best you can.
I talked about software bricolage in my first-year programming course today: http://patricklam.ca/ece155, Lecture 12. I think it's very useful for students to at least hear about "programming beyond the classroom", even if they don't get to practice it for a while. (Our students have a fourth-year design project, so they will get to do some bricolage for sure.)
Great article! I used to be a full-time coder, and before that I was a designer. Now I run a team of coders because, using tips like these, I found ways to work smarter than most other development teams. It's great that you have shared this, so here are some things we do a little differently. Maybe we can make a beautiful kluge process for programmers and CS alike.
1) Assess. I know you probably do this also, but I think that this is an essential part of the coding process: assessing the project and coming up with a high-level description, process flows, wireframes, and I/O diagrams. There is no point getting the materials to build without having a good idea of what you are building...
2) Forage. Like you, we are not bothered that a code base is not "ours". We actually like the idea of using MIT or GPL code that we can feed back to the community and have updated outside of billable hours. The strange part about this is that we do not just look for code in our language, but in any language we are competent in reading, so we have actually done full source ports of C# to Python and PHP. We have also converted some of my very old VB into C#, Java and C++ apps.
3) Adapt. Now here is an area where we probably differ. We like to create adapters for the functionality we use, so we might have some C code within a C++ project. We know that the code will not be very OOP, so we create a separate object rather than re-code it. This keeps the original code to its original purpose and adds our access to that code through a specialized object or set of objects that conform to our needs and business logic.
4) Record. We believe in starting a new repository for each project we use and updating the changes we make, storing the parts of the system(s) separately. There is nothing worse than working on a project that does ABCDE... perfectly and then when project FGHI comes in, you have to start again. This kinda rests on 3) to work properly, but hey-ho...
5) Integrate. This is like welding, but we try to keep as much separate as possible, so we just term it integrating, so that nobody goes building all-singing, all-dancing objects...
7) Re-factor or improve our additions. The base code should work fine, with our adapters sitting over the front of any code we rest upon.
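As a rough illustration of the adapter idea in step 3 above, here is a minimal Python sketch (Python rather than C and C++ purely to keep it short). The counter_* functions are hypothetical stand-ins for borrowed procedural code that stays untouched, and the rule that an empty average is None is an assumed business rule, not something from any real library.

```python
# Hypothetical procedural code adopted unchanged (a stand-in for the
# "C code within a C++ project" case). It works on a plain dict.
def counter_new():
    return {"count": 0, "total": 0.0}


def counter_add(state, value):
    state["count"] += 1
    state["total"] += value


def counter_mean(state):
    return state["total"] / state["count"]


class RunningAverage:
    """Adapter: exposes the borrowed functions through our own interface."""

    def __init__(self):
        self._state = counter_new()

    def add(self, value):
        counter_add(self._state, float(value))
        return self  # returning self allows chained calls

    @property
    def mean(self):
        # Assumed business rule: an empty average is None, instead of the
        # ZeroDivisionError the borrowed code would raise.
        if self._state["count"] == 0:
            return None
        return counter_mean(self._state)


if __name__ == "__main__":
    print(RunningAverage().add(2).add(4).mean)
```

The borrowed functions keep their original purpose, and anything specific to our project lives in the adapter, so swapping out the borrowed code later only means rewriting one small class.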
My personal approach is "Code now and re-factor later". I only found out after a few years of coding that my design patterns have well known names. The most important thing is that one needs to be proud of his code and constantly improve his skills.
Well thought out. While this is generally true, when you use old (bit-rotted) code segments, it is because you are aware of the functionality of code you have previously written; you adapt those "patterns" to the functionality that you need and adapt or mold the interfaces to meet the requirements (assuming there is a design or architecture to the software you are creating). Also, if you think in terms of design, analyzing design elements within step 2 prevents the Weld and Grow steps for code that does not contain the design elements you need.
I used to say that what you learned in college was getting your tool set, and programming in the real world was learning how to use that tool set.
You never took one of my programming classes.
I started programming when you had to build your own "libraries," since there was no internet to steal (er, find) code from, apart from six ways to program "Hello World." I have found the best model is the "Little Red Book" method. It comes from Mao's quote, "The guerrilla must move amongst the people as a fish swims in the sea." To me it means you have to understand not only what the result is but also the path that has to be taken to get there, and recognize the waypoints. That way, no matter what code you steal (er, borrow) or whether you have to create the code on your own, the goal is reached by all parts at the same time so that Program, Customer and Paycheck (PCP) are in harmony.
Chances are that somebody right out of school will be assigned to some sort of maintenance task. So the task of learning a codebase (with or without design documentation) and having to fix a few bugs in it? That would be useful - and real world - experience. From there move on to adding a feature, maybe with the side-task of having to update the design documentation.