Author: Tony Welsh

Tony Welsh, president of Barbecana, Inc., was one of the founders of Welcom and was personally responsible for the development of the algorithms in Open Plan (time analysis, resource scheduling, and risk analysis), as well as being heavily involved in the design of the company's earned value tool, Cobra. Welcom was sold to Deltek in March 2006. Tony recently formed Barbecana and developed Full Monte, a Microsoft Project add-in which performs cost and schedule risk analysis using Monte Carlo simulation. Barbecana has just released version 2.0 of Full Monte, which includes new features such as unlimited correlations between task durations, visual feedback on entered distributions and correlations at the time of data entry, export of reports to CSV format, and sophisticated new sensitivity analysis capabilities. Full Monte is available at Barbecana.com, where you can also download a free 30-day trial.

Designated Drivers

There has been a lot of talk this year about the use of drivers in schedule risk analysis. The idea is that the correlations between task durations are the result of one or more outside influences (the drivers) and that their influence on task duration can and should be modeled directly. As a software vendor, I have considered supporting drivers and probably will do so in the future, but I do have some concerns.

Before going into these concerns, I should explain the idea as I understand it. The sampled duration for a task is modified by one or more randomly generated multiplicative factors representing the drivers. The value of each driver is drawn from a distribution roughly centered on 1. For example, if a task is affected by two drivers, these might be represented by two triangular distributions, T(0.7, 1.0, 1.2) and T(0.9, 1.0, 1.4). On a particular iteration, we might sample 1.04 and 0.98, so the value sampled for the task duration is multiplied by 1.04 times 0.98, or about 1.02. (A small sketch of this scheme appears at the end of this section.)

While I can see some circumstances where this idea might be useful, my concerns relate to its inappropriate use as a substitute for correlations. It is not easy to know what the final distribution of the task duration is, or how it is correlated with other tasks tied to one or more of the same drivers. Proponents say that real people do not think in terms of correlation coefficients and so do not need to know what these are, but that leads to new problems.

First, one has to estimate the distribution of some fairly abstract quantities: instead of estimating how long a task will take, one has to estimate the distribution not of the factor itself (say, the weather) but of the effect of that factor on task durations. This is such an abstract concept that its value is not knowable even after the event. Second, once you have determined that the driver effect ranges from, say, 0.8 to 1.3, this applies to all tasks tied to that driver, so it is impossible to model the situation where two tasks are affected by the same outside influence but to different degrees. (You can make the correlation coefficients different, assuming you care about these, but only by changing one or both of the standard deviations in ways you may not want.) Third, it is impossible to model negative correlations.

I have also come across a fallacy resulting from a misunderstanding about what exactly is meant by two tasks being influenced by the same factor. It is important to understand that this means the two tasks are both influenced by a particular realization of that factor. That is, if on a particular Monte Carlo iteration the sampled value is 1.15, then this factor applies to all tasks tied to that driver. So you cannot model a factor like estimation uncertainty using a driver, as I have actually heard suggested, because the uncertainty could go either way on each individual task; the fact that all estimates are subject to this source of uncertainty does not make them correlated. I fear that such misunderstandings will result in bad models, and hence bad results.

Drivers also require more data entry than directly specifying correlations, and very much more if one wants to do what-if comparisons of different degrees of correlation. (Changing the degree of correlation between two tasks while leaving their distributions approximately the same requires changing at least six parameters, whereas the same exercise using correlations requires changing only one or two parameters and leaves the distributions exactly the same.)
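To make the mechanics concrete, here is a minimal sketch in Python of the sampling scheme described above. The base duration is invented, the two triangular distributions are the illustrative values from the example, and nothing here reflects any particular vendor's implementation.

    import random

    # The two illustrative drivers from the text: T(0.7, 1.0, 1.2) and T(0.9, 1.0, 1.4).
    # Each tuple is (low, mode, high) for a triangular distribution.
    drivers = [(0.7, 1.0, 1.2), (0.9, 1.0, 1.4)]

    base_duration = 10.0  # sampled task duration in days, before driver effects

    # One Monte Carlo iteration: multiply the duration by one factor per driver.
    duration = base_duration
    for low, mode, high in drivers:
        # Note random.triangular's argument order: (low, high, mode).
        duration *= random.triangular(low, high, mode)

    print(f"Adjusted duration for this iteration: {duration:.2f} days")

The key point is that on each iteration the same realized factor multiplies every task tied to that driver, which is what induces the correlation between them.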
One advantage of drivers is that they can be applied to events which may not happen at all. You can assign a probability of a driver being active, which allows drivers to model improbable but potentially catastrophic risks of the sort generally found in a risk register. This is not usually the primary reason for doing quantitative risk analysis, but I think it is the only sensible context in which to use drivers. My conclusion is therefore that drivers have a place in quantitative risk analysis, but not as a way of modeling correlations.
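The risk-register use is easy to sketch as well. Extending the hypothetical example above, the only change is that the factor applies only when the risk actually occurs on a given iteration; the probability and factor values here are invented for illustration.

    import random

    def sample_risk_driver(probability, low, mode, high):
        """Return a duration factor if the risk fires this iteration, else 1.0."""
        if random.random() < probability:
            return random.triangular(low, high, mode)  # args: (low, high, mode)
        return 1.0  # risk did not occur; durations are unaffected

    # Hypothetical risk: 10% chance of occurring; if it does, affected tasks
    # take between 1.5 and 3 times as long, most likely twice as long.
    factor = sample_risk_driver(0.10, 1.5, 2.0, 3.0)
    print(f"Risk factor this iteration: {factor:.2f}")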

Dealing With Uncertainty in Project Schedules

Project schedules are concerned with the future. A project schedule is, in effect, a projection or forecast of how and when the project will be performed. This projection is typically made using the critical path method (CPM), which means that the projected dates are based upon the estimated durations of the various tasks and the logical dependencies between them. The critical word here is "estimated." The durations that we put into our CPM model are just estimates or predictions. And as Niels Bohr, Yogi Berra, or Samuel Goldwyn (depending on who you ask) may once have pronounced, "Prediction is very difficult, especially about the future."

Everything we think we know about the future is subject to uncertainty, and duration estimates are no exception. The amount of uncertainty varies, depending upon how much the current project resembles previous ones, how much relevant historical data we have, and so on. But we can be certain that there is some uncertainty, and recognizing this fact is essential to creating realistic project plans.

The way we recognize uncertainty about task durations is to replace single-point estimates with a range of possible values. For example, instead of saying a task will take five days, we say it will take somewhere between three and eight days. Often this will be augmented by giving a "most likely" value, in this case perhaps five days. This two- or three-point estimate may be further refined by specifying the shape of a probability distribution to represent the data. Historically, the most common choice has been a special case of the beta distribution sometimes called beta-PERT, though there is no particular reason to favor it. In fact, there is some empirical evidence to suggest that the lognormal distribution is more appropriate.

What About PERT?

Developed by the US Navy in the 1950s to manage the Polaris missile program, PERT (Program Evaluation and Review Technique) was an early attempt to incorporate uncertainty into the project schedule. It was based upon the assumption that task durations had a beta-PERT distribution represented by three parameters: the optimistic value A, the most likely value B, and the pessimistic value C. These were assumed to yield a mean of (A+4B+C)/6 and a standard deviation of (C-A)/6, though both formulas are approximations.

Having made these approximations, PERT pretty much forgets about uncertainty. It performs a CPM calculation based upon the mean durations given by the formula above. This is not very different from doing a deterministic CPM based upon the most likely value B; in the symmetrical case, where B = (A+C)/2, it is no different at all.

This procedure produces an estimate of the expected finish date, augmented by an estimate of its standard error based upon the fact that variances are additive. The variance is the square of the standard deviation, so if we square the standard deviation estimates for each task on the critical path, add these together, and take the square root of the result, we get an estimate of the standard error of the project finish date. (A small numerical sketch follows below.) The problem is that PERT totally ignores any path other than the one that appears critical based upon the presumed expected durations. In reality, there may be paths that are nearly critical on this basis and that, given the uncertainty in the duration estimates, might in actual fact turn out to be critical.
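Here is that arithmetic as a minimal Python sketch; the three-point estimates are invented purely for illustration.

    from math import sqrt

    # Hypothetical critical-path tasks: (optimistic A, most likely B, pessimistic C), in days.
    critical_path = [(3, 5, 8), (2, 4, 10), (6, 7, 12)]

    means = [(a + 4 * b + c) / 6 for a, b, c in critical_path]  # PERT mean: (A+4B+C)/6
    sds = [(c - a) / 6 for a, b, c in critical_path]            # PERT std dev: (C-A)/6

    expected_duration = sum(means)                     # expected length of this path
    standard_error = sqrt(sum(sd * sd for sd in sds))  # variances add; take the square root

    print(f"Expected duration: {expected_duration:.2f} days")
    print(f"Standard error:    {standard_error:.2f} days")

This is where PERT stops: the calculation sees only the single path presumed to be critical.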
It turns out that the paths other than the nominally critical one interact in a rather complex way, creating a systematic error, or bias, which means that PERT underestimates how long the project will take. (In his excellent book, The Flaw of Averages, Sam Savage explains the danger of using averages to represent a range of possible outcomes. At the very least this approach fails to reflect the uncertainty inherent in the data; in many cases it actually introduces bias due to what he calls the "Strong Flaw of Averages." Project networks are such a case.)

Monte Carlo Simulation

Monte Carlo simulation is a technique that has application in many areas. In particular, it can be used wherever we have a predictive model with stochastic elements that interact in a way that makes analysis intractable. A project network is such a model, because the various tasks combine in complex ways, sometimes in series and sometimes in parallel. The only practical way to make projections based on such models is Monte Carlo simulation. The technique gets its name from the famous casino and proceeds by:

1. Generating random samples from distributions representing the random variables in the model (in the present case, the durations of the tasks in a project network);
2. Calculating the model as if these sampled values were the known actual values (in the present case, to determine values of interest such as the project completion date); and
3. Repeating this process a large number of times (known as "trials") in order to build up a histogram of the values of interest.

Monte Carlo simulation thus takes into account the uncertainty associated with all task durations, not just those on the supposed critical path. As a result, it is quite possible that the critical path will be different on different trials, that is, with different random samples from the duration distributions. This is what causes "merge bias" (to be discussed shortly). But even in the absence of such bias, it is worthwhile to get a realistic assessment of the likely project completion date.

The following graph shows the cumulative probability of completing a project by any particular date. The expected finish date, which is also the median finish date, is 7/21/12. But there is a 10% chance that the project won't finish before 8/6/12. If the completion date is critical, for example because of penalty clauses, one would want to make sure that the date enshrined in the contract is not the expected date or the most likely date, but a date that one can be reasonably confident of meeting.

Merge Bias

Merge bias is probably the single most important reason why PERT is incorrect and why Monte Carlo simulation is the only way to properly process uncertainty in project networks. Merge bias occurs whenever two or more paths converge in a network and the uncertainty about their durations is such that any of them might turn out to be critical.

To illustrate, consider a task with just two predecessors that run in parallel. Suppose that each could take anything from one to six days with equal probability, and suppose further that they won't take fractions of a day. This may sound rather contrived, and it is; I'm doing it so we can simulate it with a pair of dice and not get involved in a lot of calculus. The expected duration of each individual task is 3.5 days. But what is the expected time for them both to be complete? Let's use the dice to find out, starting with a quick simulation sketch and then the exact odds.
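Here is a minimal Python sketch of the three steps above, applied to the two-dice example; it also enumerates the exact expected maximum for larger numbers of parallel tasks, reproducing the figures quoted below. The trial count is arbitrary.

    import random

    # Steps 1-3 of Monte Carlo simulation for two parallel tasks,
    # each taking a whole number of days from 1 to 6 with equal probability.
    trials = 100_000
    total = 0
    for _ in range(trials):
        d1 = random.randint(1, 6)   # step 1: sample each task duration
        d2 = random.randint(1, 6)
        total += max(d1, d2)        # step 2: calculate the model (both must finish)
    print(f"Simulated expected finish: {total / trials:.2f} days")  # step 3: aggregate

    # Exact calculation for n parallel tasks: P(max <= k) = (k/6)**n, so
    # E[max] = sum of k * (P(max <= k) - P(max <= k-1)) over k = 1..6.
    def expected_max(n):
        return sum(k * ((k / 6) ** n - ((k - 1) / 6) ** n) for k in range(1, 7))

    for n in (2, 3, 5, 10):
        print(f"{n} parallel tasks: expected finish {expected_max(n):.2f} days")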
The following table shows all 36 possible outcomes when throwing two dice.

                 Second die
                 1   2   3   4   5   6
    First die 1  1   2   3   4   5   6
              2  2   2   3   4   5   6
              3  3   3   3   4   5   6
              4  4   4   4   4   5   6
              5  5   5   5   5   5   6
              6  6   6   6   6   6   6

The numbers inside the grid show the highest face in each combination. For example, if one die shows a three and the other a five, the highest face is five. Notice that two is the highest face in only three of the 36 possible outcomes; six, on the other hand, is the highest in no fewer than 11. So even with just two parallel paths, we can see that the time it takes for both tasks to be complete is likely to be longer than the deterministic estimate of 3.5 days. If we do the math, we find that the expected value is actually almost 4.5 days.

We can't illustrate this so easily in more dimensions, which correspond to more parallel tasks, but we can still calculate the odds. With three parallel tasks the expected value is almost five days; with five tasks, it's about 5.4 days; and with 10 tasks, it's about 5.8 days. (For details on how this is calculated, you can download an Excel spreadsheet from Barbecana; the sketch above performs the same calculation.)

When a Single Point Estimate Isn't Good Enough…

Merge bias is one good reason to do schedule risk analysis, but as I've shown, there are other reasons too. If meeting a particular completion date is important, for example because you're building facilities for the Olympics, then a single-point estimate simply isn't good enough. In fact, the only thing you can be certain of with a deterministic projection is that it will be wrong. A projection represented by a probability distribution based upon Monte Carlo simulation is always more realistic.