Dear KV,
I just signed on to a new project and started watching commits on the project’s GitLab. While many of the commits seem rational, I noticed one of the developers was first committing large chunks of code and then following up by commenting out small bits of the file, with the commit message “Silence warning.” No one else seemed to notice or comment on this, so I decided to ask the developer what kinds of warnings were being silenced. The reply was equally obscure: “Oh, it’s just the compiler not understanding the code properly.” I decided to run a small test of my own, so I checked out a version of the code without the lines commented out and ran it through the build system. Each and every warning actually made quite a bit of sense. Since I’m new to the project, I didn’t want to go storming into this person’s cubicle to demand he fix the warnings, but I was also confused about why he might think this was a proper way to work. Do developers often work around warnings or other errors in this way?
Forewarned If Not Forearmed
Dear Forewarned,
Let me commend your restraint in not storming into this person’s cubicle and, perhaps, setting it and the developer alight, figuratively speaking of course. I doubt I would have had the same level of restraint without being physically restrained. I am told screaming at developers is a poor way to motivate them, but this kind of behavior definitely warrants the use of strong words, words I am not, alas, allowed to use here. But I commend to you George Carlin’s “Seven Words You Can Never Say on Television”[1] as a good starting point. If you find that too strong, you can use my tried-and-true phrase, “What made you think …,” which needs to be said in a way that makes it clear you are quite sure the listener did not, in fact, think at all.
Once upon a time compilers were notoriously poor at finding and flagging warnings and errors. I suspect there are readers old enough to have seen unhelpful messages such as, “Too many errors on one line (make fewer),” as well as remembering compilers where a single missing character would result in pages of error output, all of which was either misleading or wrong.
There is a lesson here for both tool writers and tool users. If you write a tool that cries wolf too often then the users of that tool, in the absence of a new and better tool, will simply ignore the warnings and errors you output. Between warnings and errors, the latter are easier to get right, because the tool can, and should, stop processing the input and indicate immediately what the problem was. Communicating the problem is your next challenge. The error message I mentioned here came from a real, for-pay product sold by a company that went on to make quite a lot of money—it was not generated by some toy compiler created by a second-year college student. Looking back through previous Kode Vicious columns you will find plenty of commentary on how to write good log messages, but for tool writers, in particular those who write tools for other engineers, there are a couple of key points to keep in mind.
The first point is to be specific. Say exactly what was wrong with the input you were trying to process. The more specific your message, the easier it is for the user of the tool to address the problem and move on. Given that computer languages are complex beasts, being specific is not always easy, as the input received may have sent your compiler off into some very odd corners of its internal data structures, but you must try to maintain enough state about the compilation process to be able to make the warning or error specific.
The second point is even simpler: tell the consumer exactly where, down to the character in the file if possible, the error occurs. Older compilers thought the line was enough, but if you are looking at a function prototype with five arguments, and one of them is wrong, it is best if your tool says exactly which one is causing the issue, rather than making the rest of us guess. A blind guess on five arguments gives you a 20% chance, and if you think tool users do not have to guess blindly very often, then you are one of those engineers who never have to deal with random bits of other people’s code.
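To make the two points concrete, here is a minimal sketch of a diagnostic that says what is wrong and exactly where. It is an illustration only, not how Clang or any other real compiler is implemented; the `Diagnostic` class, the `render` function, and the file name `proto.h` are all invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Diagnostic:
    """A hypothetical diagnostic record; not taken from any real compiler."""
    filename: str
    line: int      # 1-based line number
    column: int    # 1-based column number
    severity: str  # "error" or "warning"
    message: str


def render(diag: Diagnostic, source: str) -> str:
    """Format a diagnostic with the offending line and a caret under the column."""
    text = source.splitlines()[diag.line - 1]
    header = (f"{diag.filename}:{diag.line}:{diag.column}: "
              f"{diag.severity}: {diag.message}")
    # Assumes the line contains no tabs, so one space equals one column.
    caret = " " * (diag.column - 1) + "^"
    return "\n".join([header, text, caret])


# A prototype with five arguments, one of which is wrong.
source = "int add(int a, int b, flaot c, int d, int e);\n"
column = source.index("flaot") + 1
report = render(
    Diagnostic("proto.h", 1, column, "error", "unknown type name 'flaot'"),
    source,
)
print(report)
```

The caret removes the one-in-five guess: the reader sees the bad argument without having to inspect the other four.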
If you want a good example of a tool that tries to adhere to the two points I have laid out, I recommend you look at Clang and the LLVM compiler suite. Their errors and warnings are clearer and better targeted than any I have used thus far. The system is not perfect, but it beats other compilers I have used (such as gcc).
If you are a tool consumer you had better be quite sure of your knowledge of the underlying system so you can say, with better than 90% probability, that a warning you receive is a false positive. Some readers may not know this, but we programmers have a bit of an issue with hubris. We think we are modeling in our heads what the code is doing, and sometimes what we have in our heads is, indeed, a valid model. That being said, be prepared to be humbled by the tools you are using. Good tools, written by good tool writers, embody the knowledge of people who have spent years, and in some cases decades, studying exactly what the meaning of a code construct is and ought to be. Think of the compiler as an automated guru who is pointing you to a higher quality of code. There are certainly false gurus in the world, so it pays to pick a good one, because the false ones will surely lead you into a world of programming pain.
KV
Dear KV,
I saw your response to Acquisitive in the June 2015 Communications.[3] I liked your response, but would have liked to see you address the business side.
Once the acquisition is completed, then Acquisitive’s company owns the software and assumes all of the associated business risks. So my due diligence on the code would have included ensuring the code in question was actually written by the engineers at the other company or that it was free and open source software where the engineers were in compliance with the associated open source license. There is a risk that one or more of the engineers brought the code from a previous employer or downloaded it from some online source where the ownership of the code was uncertain. In short, management’s request of Acquisitive should be seen not only as checking the functionality and quality of the code, but also protecting the company against litigation over the associated IP.
Moving up in an organization comes with the need to understand the business and management issues of that organization. Management’s request of Acquisitive might also be seen as a test of whether he has the right business instincts to move higher than the “architect” role to which he was promoted. Someone with a good tech background and strong business knowledge becomes a candidate for CTO or other senior roles.
Business and Management
Dear Business,
You are quite right to point out the issues related to the provenance of the software that Acquisitive has to review, and that this ought to be on the list when reviewing code that will be reused in a commercial or even an open-source context. The number of developers who do not understand source code licensing is, unfortunately, quite large, which I have discovered mostly by asking people why they chose a particular license for their projects. Often the answer is either “I did a search for open source” or “Oh, I thought license X was a good default.” There are books on this topic, as I’m sure you know, such as Lindberg,[2] but it is very difficult to get developers to read about, let alone understand, the issues addressed in those books. But for those who want to be, or find themselves thrust into, the role of Acquisitive, this type of knowledge is as important as the ability to understand the quality of acquired code. Anyone who thinks working through a ton of bad code is problematic has not been deposed by a set of lawyers prior to a court case. I am told it is a bit like being a soccer goal tender, but instead of the players (lawyers) kicking a ball at you, they are kicking you instead.
From a practical standpoint, I would expect Acquisitive to ask for the complete commit log for all the code in question. Rational developers—and there are some—will actually put in a code comment when they import a foreign library. They may even notify their management and legal teams, if they have them, about the fact they are using code from some other place. Very few large systems are cut from whole cloth, so the likelihood a system being reviewed contains no outside code is relatively small. Asking the legal team for a list of systems that have been vetted and imported should also be on Acquisitive’s checklist, although it does require talking to lawyers, which I am sure he is inclined to do.
Harking back to the theme of the original letter, even with these pieces of information in hand, Acquisitive should not trust what he was told by others. Spot-checking the code for connections to systems or libraries that are not called out is laborious and time-consuming, but, at least in the case of open source code, not insurmountable. Some well-targeted searches of commonly used APIs in the code will often sniff out places where code might have been appropriated. Many universities now use systems to check their students’ code for cheating, and the same types of systems can be used to check corporate code for similar types of cheats.
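As a rough illustration of how such a similarity check can work, the sketch below compares two pieces of code by their overlapping runs of tokens. This is a deliberately simple stand-in, not the method used by any particular university or commercial system; the function names and the sample snippets are made up for the example, and real checkers do considerably more work to survive renamed variables and reordered code.

```python
import re


def shingles(code: str, k: int = 5) -> set:
    """Return the set of k-token windows (shingles) found in a piece of code."""
    # Crude tokenizer: identifiers, numbers, then any single non-space character.
    tokens = re.findall(r"[A-Za-z_]\w*|\d+|\S", code)
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def similarity(a: str, b: str, k: int = 5) -> float:
    """Jaccard similarity of two shingle sets: 0.0 (disjoint) to 1.0 (identical)."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


# Hypothetical snippets: code under review, a known open source fragment that
# differs only in layout, and an unrelated fragment.
reviewed = "for (i = 0; i < n; i++) { sum += buf[i]; }"
known = "for (i = 0;  i < n;  i++)\n{\n    sum += buf[i];\n}"
unrelated = "return strcmp(a->name, b->name);"

print(similarity(reviewed, known))
print(similarity(reviewed, unrelated))
```

Because the comparison works on tokens rather than raw text, reformatting a borrowed function does not hide it. A high score is only a reason to look more closely at a file, not proof that anything was appropriated.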
A basic understanding of copyright and licensing can go a long way, at least in asking the correct questions. In open source we have two major types of licenses, those that control the sharing of code and those that do not. The GPL family of licenses is of the controlled type; depending on the version of the license (LGPL, GPLv2, and GPLv3) the programmer using the code may have certain responsibilities to share changes and fixes they make to the code they import. The BSD family of licenses does not require the programmer using the code to share anything with the originator of the code, and is used only to prevent the originator from being sued. It is also important to verify that the license you see in the file has not been changed. There have been cases of projects changing licenses in derived code, and this has caused a number of problems for various people. A reasonable description of common open source licenses is kept at opensource.org (http://opensource.org/licenses), and I would expect Acquisitive to have looked that over at least a few times during the review.
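One way to start checking whether a license in a file has been changed is to compare what each file header declares against what the upstream project declares. The sketch below assumes the code under review carries `SPDX-License-Identifier` tags, which many codebases do not; the file paths, headers, and the `upstream` record are all hypothetical, and in practice the upstream record would have to be gathered by hand from the original projects.

```python
import re

# Matches a tag such as "SPDX-License-Identifier: BSD-2-Clause" in a comment.
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def declared_license(header: str):
    """Return the license expression declared in a file header, or None."""
    match = SPDX_TAG.search(header)
    if match is None:
        return None
    # Drop a trailing C comment terminator and surrounding whitespace.
    return match.group(1).rstrip("*/ \t")


def audit(headers: dict, upstream: dict) -> dict:
    """Return paths whose tag is missing or differs from the upstream record."""
    findings = {}
    for path, header in headers.items():
        found = declared_license(header)
        expected = upstream.get(path)
        if found is None:
            findings[path] = "no license tag; review by hand"
        elif expected is not None and found != expected:
            findings[path] = f"tagged {found}, upstream says {expected}"
    return findings


# Hypothetical file headers from the code under review.
headers = {
    "lib/zcompress.c": "/* SPDX-License-Identifier: BSD-2-Clause */\n#include <stdio.h>\n",
    "lib/fastsort.c": "// SPDX-License-Identifier: MIT\n",
    "src/glue.c": "/* written in a hurry */\n",
}
# Hypothetical record of what the original projects declare.
upstream = {"lib/zcompress.c": "BSD-2-Clause", "lib/fastsort.c": "GPL-2.0-only"}

findings = audit(headers, upstream)
for path, problem in sorted(findings.items()):
    print(f"{path}: {problem}")
```

A script like this only produces a list of questions. Deciding what a mismatch means is still a job for the lawyer mentioned below.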
Lastly, I am not a lawyer, but when I deal with these topics I make sure I have one on my side I trust, because the last thing I want to do is bring a knife to a gun fight.
KV
Related articles
on queue.acm.org
Commitment Issues
George Neville-Neil
http://queue.acm.org/detail.cfm?id=1721964
Making Sense of Revision-control Systems
Bryan O’Sullivan
http://queue.acm.org/detail.cfm?id=1595636
20 Obstacles to Scalability
Sean Hull
http://queue.acm.org/detail.cfm?id=2512489