Software Drift

Dear KV,

I have been porting some code between two related—but divergent—open source projects. The project I am porting to was forked from another project many years ago, and it seems the names of the APIs have changed and no one is tracking what is going on in the other project. This has led to APIs for exactly the same type of function having different names, even if what they do is the same. As you can imagine, this is maddening and means that the first part of “porting” is more like mechanical translation from one name to another. I could see this happening with proprietary software that is developed in secret, but both of these projects are developed publicly with long-term mailing lists and public repositories. Surely, there has to be a better way.

Porting Parted Paths

Dear PPP,

I see you have fallen into the common trap we all fall into when building software: We assume people are not the problem and everything has a nice, neat, technical solution. Alas, that is not the case. In fact, in open source—even more than in corporate development, where one can use the carrot of cash and the stick of firing people—people can go their own way. A fork often develops in an open source project because of personal disagreements, which are, of course, always couched in technical terms. Alice believes she is designing or implementing the software in a better, stronger, faster way than Bob, and, of course, Bob thinks he is doing the better job. And so they part ways, and fork.

Now, sometimes forks, which are a lot like divorces, are messy and sometimes they are amicable, but as anyone who has ever gone through a breakup knows, even the most amicable among them usually go through a cooling-off period. Even though you have agreed “to be friends,” you do not contact or see the other person for a while. So too, with forks in software.

Thus, we get software drift. Alice adds a feature but does not tell Bob, and why should she? Bob can read the mailing list as well as any other Tom, Dick, or Mary, and the repository is open. Why should Alice tell Bob anything? Bob, of course, feels the same way. So now the codebases drift apart, slowly or quickly but inevitably and inexorably.

Since the systems have a common parent, they probably work in the same technical domain, and therefore the features and fixes that will be added are probably similar. KV happens to have an example case at hand: Two operating systems that diverged before they added SMP (symmetric multiprocessing) support. When an operating system adds SMP to an existing kernel, the first thing we think of is locks, those handy-dandy little performance killers that we have all been sprinkling around our code since the end of Dennard scaling.

There are many types of locks—just look in any operating system textbook—but the most basic of them is the mutex. You would think that this very simple type of lock would have a very simple API that one system could easily copy from the other. Why make the APIs different when what they express is the same? I do not know. Ask Bob and Alice, because you can see what they did here:

Alice's Mutex

void
mtx_init(struct mtx *mutex, const char *name, const char *type, int opts);

void
mtx_destroy(struct mtx *mutex);

void
mtx_lock(struct mtx *mutex);

void
mtx_unlock(struct mtx *mutex);

Bob's Mutex

void
mutex_init(kmutex_t *mtx, kmutex_type_t type, int ipl);

void
mutex_destroy(kmutex_t *mtx);

void
mutex_enter(kmutex_t *mtx);

void
mutex_exit(kmutex_t *mtx);

I will not address the “Who got there first?” question, but whoever did, maybe, just maybe, it might have been a good idea for Alice and Bob to cooperate and use similar names for the APIs. What is ironic—wait, not ironic, angering—about all this is that even the argument lists of three of the four APIs listed here are the same.
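To make the "mechanical translation" concrete, here is a rough sketch of the kind of shim a porter might end up writing: Bob's names layered on top of Alice's. Everything beyond the eight prototypes above is an assumption made for illustration. The struct mtx body is a userspace spinlock stand-in so the sketch compiles outside a kernel, KMUTEX_TYPE_PLACEHOLDER and the "ported" lock name are made up, and the type and ipl arguments are simply dropped, which a real port could not get away with.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for Alice's mutex. NOT the real kernel structure. */
struct mtx {
    atomic_bool locked;
    const char *name;
};

/* Stand-in implementations of Alice's API, using a simple spin loop. */
static void
mtx_init(struct mtx *mutex, const char *name, const char *type, int opts)
{
    (void)type;                 /* ignored by this stand-in */
    (void)opts;                 /* ignored by this stand-in */
    mutex->name = name;
    atomic_store(&mutex->locked, false);
}

static void
mtx_destroy(struct mtx *mutex)
{
    /* Destroying a held lock would be a bug in the caller. */
    assert(!atomic_load(&mutex->locked));
}

static void
mtx_lock(struct mtx *mutex)
{
    /* Spin until we are the one who flips the flag from false to true. */
    while (atomic_exchange(&mutex->locked, true))
        ;
}

static void
mtx_unlock(struct mtx *mutex)
{
    assert(atomic_load(&mutex->locked));
    atomic_store(&mutex->locked, false);
}

/* Hypothetical shim: Bob's names expressed in terms of Alice's. */
typedef struct mtx kmutex_t;
typedef int kmutex_type_t;              /* placeholder type */
#define KMUTEX_TYPE_PLACEHOLDER 0       /* made-up constant */

static inline void
mutex_init(kmutex_t *mtx, kmutex_type_t type, int ipl)
{
    (void)type;                 /* a real port must map these two */
    (void)ipl;                  /* arguments onto Alice's opts    */
    mtx_init(mtx, "ported", NULL, 0);
}

static inline void
mutex_destroy(kmutex_t *mtx)
{
    mtx_destroy(mtx);
}

static inline void
mutex_enter(kmutex_t *mtx)
{
    mtx_lock(mtx);
}

static inline void
mutex_exit(kmutex_t *mtx)
{
    mtx_unlock(mtx);
}

/* Code written against Bob's names now runs on Alice's lock. */
static int
shim_demo(void)
{
    kmutex_t lock;
    int counter = 0;

    mutex_init(&lock, KMUTEX_TYPE_PLACEHOLDER, 0);
    mutex_enter(&lock);
    counter++;                  /* the critical section */
    mutex_exit(&lock);
    mutex_destroy(&lock);
    return counter;
}
```

Three of the four wrappers are pure renames, which is the whole complaint. Only the init call needs any thought, and that is where a shim like this would hide real differences between the two systems.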

Now, just because we are having so much fun, let’s compare what the respective manual pages say about these functions:

The mtx_lock() function acquires an MTX_DEF mutual exclusion lock on behalf of the currently running kernel thread. If another kernel thread is holding the mutex, the caller will be disconnected from the CPU until the mutex is available (i.e., it will block).

Acquire a mutex. If the mutex is already held, the caller will block and not return until the mutex is acquired.

I was going to reorder these statements to make it a bit more difficult to guess which line was from which system, but left them in matching order.

“Ah, but wait!” I hear you cry. “Maybe Alice’s mutex is better than Bob’s!” or vice versa. The underlying performance characteristics actually do not matter here. We are talking about the semantic load of remembering two names for the same thing, and who wants that extra bit of cognitive load sprinkled throughout the codebase? Not KV! If a rose by any other name would still smell just as sweet, then a mutex by another name is still a basic locking primitive that doesn’t need to be named twice.

To top all this off, there is a third fork of the same codebase that also has its own, uniquely named set of mutex functions, but I am not going to copy and paste more annoying code into this column. I will simply commend all readers to look before they code and not splatter the world with nine billion names for mutexes or any other API.
