A while ago I was asked to audit a number of software projects under way at an insurance company. These projects were to be the basis for several later developments, so any schedule slippage would likely have a significant and cascading effect on future plans and projects. People on the projects were concerned about their committed schedules, budgets, and progress to date.
The development teams presented me with a large stack of paper consisting of requirements specifications, architecture documents, development plans, and status reports for four key IT projects. The task was to wade through the pile and then come out with a verdict: would the projects work? Did they have sufficient resources? Had they overcommitted? Were they likely to overrun in cost, in schedule, in both? Would they get a thumbs up or a thumbs down?
This is a highly competent company with visionary leaders and staff. Indeed, requesting independent validation of plans and performance is a characteristic of a high-performing company. In this particular case, people suspected there might be some issues with the project estimates and the resulting commitments. The audit would give them an outsider’s view of the projects and maybe identify where there might be issues. Perhaps they were hoping I could just give them a clean bill of health and allay any concerns.
It is a challenge to be given a mound of data on a set of large and complicated projects and render any kind of meaningful opinion on them in just a couple of days. It is easy to get overwhelmed with detail to the point where you cannot see the woods for the trees and the trees for the leaves. To make sense of this kind of situation, I try to run a set of project estimates, much as if I were evaluating the project for early feasibility and commitment. I use a variety of estimation techniques to "model" the projects. This modeling activity consists of setting up estimation frameworks so they match, as closely as possible, the performance basis of the projects, using such things as expected scope and productivity levels and ranges. Since these projects had actually started, we also had the luxury of having some in-flight progress metrics available. The metrics were basic, consisting mostly of expended effort to date with some staffing levels, but they did provide some touchpoints against which to test the original assumptions.
So I plugged through the piles of paper, collected more data in interviews, tested assumptions with members of the teams, and built models of the projects. I checked the behavior of the models against what we knew about the projects, and then built more models. I compared the results with the commitments and plans, trying to find the model(s) that "best fit" what appeared to be happening.
And I found out some interesting things.
Most estimation processes are highly parametric. That is, they require key aspects of the product, the project, and the organization to be defined as a set of parameterized values. The discipline of trying to define values for these parameters is actually quite an effective method for forcing a high-level understanding of the projects. For instance, if you can reasonably scope a project by defining its size and its complexity, it is possible to avoid digging into the enormous detail of project plans and task lists. This approach does require some knowledge of the product, but it can keep that knowledge at a manageable and, for those short of both time and context, usable level.
Yield to Productivity
Another aspect that must be characterized is the behavior of the development environment in terms of its ability to build the system. There are many facets to a company’s capacity to build software. They relate to the competence of the people involved, the way the people are organized, the tools they use, and how the organization itself works. The word most commonly used to describe this collection of capabilities is "productivity." Personally, I am not a fan of the word in this context, since I think it strongly implies a manufacturing paradigm, and developing software is not manufacturing. However, the phrase I would prefer to use, "the rate of cooperative consistent operational knowledge acquisition, transcription, and validation," might be a little unwieldy. So I guess I’ll stick to "productivity."
This productivity consists of the sum effect of many factors operating in the development environment. According to Barry Boehm, the most significant of these factors are the personnel attributes such as personnel capability and experience. Given that software development is primarily a cooperative learning activity, this seems perfectly reasonable. However, there are many other attributes of the environment that need to be characterized. These include the effect of tools, processes, documentation levels, and risk management. Factors outside the direct area of development also come into play. The volatility or stability of functional requirements and the relationship with the end-user customer are two that clearly have an effect on productivity almost independent of development capability.
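Parametric cost models make these factors concrete by turning them into numbers. As a rough sketch, here are the organic-mode equations from Boehm's basic COCOMO, with a capability-style effort multiplier applied in the manner of the intermediate model; the 32 KLOC size and the 0.86 multiplier are illustrative assumptions, not figures from the audited projects:

```python
# A parametric estimate in the style of Boehm's COCOMO.
# The organic-mode equations are from basic COCOMO; the sample
# size and effort multiplier below are hypothetical.

def cocomo_estimate(kloc, effort_multipliers=()):
    """Return (effort in person-months, schedule in months) for an organic-mode project."""
    effort = 2.4 * (kloc ** 1.05)      # nominal effort, person-months
    for m in effort_multipliers:       # cost drivers, e.g. personnel capability
        effort *= m
    schedule = 2.5 * (effort ** 0.38)  # calendar months
    return effort, schedule

# A hypothetical 32 KLOC project with a strong team (multiplier 0.86
# reduces effort below nominal).
effort, schedule = cocomo_estimate(32, effort_multipliers=[0.86])
print(f"{effort:.0f} person-months over {schedule:.0f} months")
```

The useful property here is exactly the one described above: a handful of parameters, each of which forces a judgment about the product or the environment, stands in for thousands of lines of task-level planning detail.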
The Model is the Message
One could argue that a major part of the job of any manager engaged in the business of software is to understand the product and understand the development environment that is tasked to build that product. Creating estimation models requires the explicit characterization of these two entities in a controlled, defensible, and disciplined manner. So it is not too big a step to assert that, since understanding these things is fundamental to managing the development of software, a good way to achieve this is to go build estimation models.
An estimation model is a high-level categorization of many of the more pertinent factors that govern software development. Once built, we can use these models to figure out what is most likely to happen when we actually try to develop the system. If the model is set up in a way that these factors can be easily manipulated, we can use it as a kind of simulator. We can, for instance, process what might happen if we added or removed people from the project. We could forecast the effect of adding or removing functions or installing new tools. We could run multiple models varying key parameters and assumptions and then match the models’ outputs against reality as it unfolds to see which model is the closest fit.
This is what I did with the projects I was asked to assess and evaluate.
Risk as Probability
There are techniques and tools available that allow us to calculate the level of risk on a project. The simplest way of describing the risk is as a probability. Some estimation tools, such as SLIM-Estimate, have this capability built in. Other tools and techniques usually require some manipulation of the input data to achieve a similar result. The probability describes the likelihood of achieving a specific target (say, a schedule goal of 20 months) as a straight percentage. This is an important consideration in any estimate, since estimates are by their very nature probabilistic.
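The idea can be sketched in a few lines. Assuming, purely for illustration, that schedule outcomes are normally distributed around the median estimate (SLIM and similar tools use their own internal models), the probability of hitting a target falls straight out of the distribution:

```python
from math import erf, sqrt

def schedule_probability(target_months, mean_months, sd_months):
    """P(finish <= target), modeling the schedule outcome as normally
    distributed -- an illustrative assumption, not the SLIM model."""
    z = (target_months - mean_months) / sd_months
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # normal CDF via erf

# A project whose median (50%) outcome is 22 months, with a 3-month spread:
# the chance of hitting a 20-month goal is well under half.
print(f"{schedule_probability(20, 22, 3):.0%}")  # prints "25%"
```

The point is not the particular distribution but the framing: a schedule commitment is a point drawn from a distribution of possible outcomes, and it carries a probability whether or not anyone states it.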
Few companies seem to use this approach. In my experience, most organizations develop project estimates based on a planning paradigm. The commitment estimate is created from a detailed task-based work breakdown structure and the summary numbers of schedule, cost, and staff that are used to resource the project are just that; they are summaries of a plan.
Even though its probability may not be explicitly stated, a planning-based estimate, unless "risk contingency" has been explicitly (or surreptitiously) applied, is a "50%" estimate. Any underestimation of work will cause the project to run over and any overestimation will cause it to run under. A 50% estimate is based on what we expect to happen given what we know at the time of estimating and it becomes invalidated when something we don’t expect to happen, happens. For a 50% estimate to end up being "accurate" one of two things must occur:
- Nothing unexpected happens.
- The "bad" unexpected things that slow the project down are canceled out by the equally unexpected "good" things that speed it up. In  I identified this as a "lucky" (as opposed to an "accurate") estimate.
So a 50% plan will "fail" the moment that one bad thing happens to the project, unless it is offset by one good thing. In practice, companies often resort to the unpaid overtime degree of freedom to help manage risk. In doing so, they make the development team work long hours in an attempt to counteract the effect of realized risk. Sometimes estimators will bury risk management resources (aka "contingency" or "pad") in the lower-level planning tasks. They do this simply as a natural defensive reaction to their lack of knowledge and the prospect of lots of unpaid overtime. This is a very rational tactic—if we don’t quite know how big or complicated a task is, it is always safer if we allocate more resources and take more time. It is the suspected presence of this hidden risk reserve that causes some management to peremptorily slash estimates when they land on their desks.
So if we routinely tackle projects at 50% probability, or even at higher levels of probability due to the effect of embedded contingency, we should expect that half of all our projects would be successful. But this is not the case.
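The arithmetic behind the "50%" claim can be checked with a small simulation. If every task's actual duration varies symmetrically around its estimate (a hypothetical error model: each task comes in anywhere from 30% under to 30% over), the summed plan overruns about half the time, exactly as the argument above predicts:

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

def simulate_plan(task_estimates, trials=100_000):
    """Fraction of trials in which the project overruns its planned total,
    when each task's actual duration varies symmetrically (+/-30%)
    around its estimate -- a hypothetical, unbiased error model."""
    planned_total = sum(task_estimates)
    overruns = 0
    for _ in range(trials):
        actual = sum(t * random.uniform(0.7, 1.3) for t in task_estimates)
        if actual > planned_total:
            overruns += 1
    return overruns / trials

# Ten tasks estimated at 10 days each: the overrun rate comes out near 50%.
print(simulate_plan([10] * 10))
```

Skewing the error model toward the pessimistic side, which is what realized risk does in practice, pushes the overrun rate above one half, which is the pattern most organizations actually observe.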
Four and 20%
With my insurance client’s four projects, my best-fit assessment put the probability of schedule success at between 15% and 25%, averaging around 20%. This calculation made for some interesting discussions. The combined probability of achieving all four projects on time is below 0.2% (assuming the projects are independent variables). That’s less than one chance in five hundred. These were some of the most important projects in the company and it appeared they were taking a gamble that made the spin of a roulette wheel look like a sure thing. Even allowing for significant inaccuracy in the available data, risk calculation, and variable dependencies, getting these projects finished on time was clearly a really long shot.
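The combined-probability arithmetic is simple: assuming independence, the chance that all four projects succeed is the product of the individual probabilities. A minimal sketch, using the 15% to 25% range from the assessment:

```python
# Combined probability that several independent projects all hit their
# schedules: the product of the individual probabilities. The 20% average
# and 25% optimistic figure are from the assessment described above.

def combined_probability(probs):
    """Probability that all projects succeed, assuming independence."""
    result = 1.0
    for p in probs:
        result *= p
    return result

average_case = combined_probability([0.20] * 4)  # all four at the 20% average
best_case = combined_probability([0.25] * 4)     # all four at the optimistic end

print(f"average case: {average_case:.4f}")  # 0.0016, i.e. 0.16% -- under 1 in 500
print(f"best case:    {best_case:.4f}")     # 0.0039, still under 1 in 250
```

Even granting every project the top of its probability range, the portfolio as a whole remains a very long shot.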
By way of making the point, I did some calculations based on the expected costs of the projects and then asked: "What is the cost of risk on these projects?"
The premium quoted in all life insurance policies contains what is called a "mortality cost." This is essentially the calculated cost of carrying the risk of insurance. For older people in ill health, this cost of risk is naturally rather high. For people engaged in dangerous professions, such as firefighters, there is an increased likelihood of injury or fatality and therefore a higher mortality cost.
We see the same thing in the stock market. Higher-risk investments must be compensated by a greater payback, otherwise no one would ever take the risk. So when I asked what was the cost of risk on these insurance projects, I was asking: what is their mortality cost?
The Insurance Business
The insurance business is interesting in many ways. It has been around for a long time—Lloyd’s of London has been insuring risk since at least 1688. It is also a business where they don’t really sell anything, except perhaps a promise. There is no "product" in the sense of something tangible that one actually purchases, owns, and uses. Indeed, most of us hope to defer the "use" of a life insurance policy for as long as possible. The insurance industry is also highly metrics driven and has been so for over 100 years. Actuaries and underwriters spend most of their time trying to figure out, sometimes in excruciating detail, just exactly how much risk they are taking and what is the cost of that risk.
But very few companies seem to explicitly calculate the cost of risk on their software projects, and the insurance industry appears to be no exception. This company had committed to a set of critical projects with no real quantifiable measure of the risk they were undertaking.
With some probing, it appeared that people thought the probability of success of these projects was rather high—certainly over 50%—though nobody could fix on what the risk level really was, how they would figure it out, or what threshold would be acceptable.
This situation is not unique by any means, not even within the insurance and finance world. But it is a little ironic that a business whose primary function and skill is precisely calculating the cost of risk does so for just about everything it undertakes.
Except software projects.