What Nate Silver Can Teach Engineers

Nate Silver’s book, The Signal and the Noise (The Penguin Press, New York, 2012), was released in the run-up to the 2012 election. Silver and the FiveThirtyEight.com project were well on their way to accurately predicting most of the Electoral College and Senate results, and there was a good deal of interest and controversy in the political media regarding his work. The book is fundamentally about making predictions in an environment of uncertainty. Silver uses examples from a variety of disciplines, including weather, baseball, economics and gambling, to illustrate why some predictions fare better than others. But these lessons apply to anyone whose field puts them in the prediction business. And engineers, whether they realize it or not, are in the prediction business. After all, engineering analysis and the design of engineered systems require making predictions about the performance of systems, typically with uncertain loads and initial conditions, and with the health, welfare and safety of the public at stake.

Forecasting vs. Prediction

Silver actually prefers to describe the results of his models as “forecasts,” which are probabilistic, rather than “predictions,” which are deterministic. Thinking about problems probabilistically is one of the fundamental themes of the book, since doing so accommodates uncertainty and mitigates bias. As it turns out, predictions arising from “expert” opinion alone are often quite bad and sometimes no better than random chance. For example, Silver found that the predictions of political pundits on the McLaughlin Group were wrong about half the time, about as good as a coin flip, regardless of political ideology. Punditry fails to make good predictions because of gross overconfidence, particularly underestimating the likelihood of unlikely outcomes and overestimating the likelihood of likely ones. This overconfidence may come from underestimating one’s biases, failing to recognize the subjectivity of one’s viewpoint, holding too simplistic a worldview, and constructing narratives that downplay contradictory evidence.

Silver devotes a lot of ink to problems with predictions, which have a variety of causes, including overconfidence, bias, lack of rigor, noisy data, complexity and inadequate underlying theory. With advances in computer and information technology, our computational ability is no longer the most limiting factor. Arguably, ease of computation is exacerbating prediction problems, since it is increasingly easy to generate numbers without understanding the underlying physics, interactions or assumptions.

You might be surprised that weather forecasters and gamblers are among the heroes of the book. Weather forecasts have improved significantly over time as more data and better prediction methods have become available. Gamblers understand uncertainty intuitively, make frequent predictions in the form of bets and are strongly incentivized to make accurate, unbiased predictions. So what makes a good forecast? Silver provides three metrics:

Quality:

Quality is a measure of how the forecast compares to what actually occurs. More precisely, if you ran the same experiment with the same initial conditions many times, the frequency with which the event of interest occurs should converge on the likelihood you forecast. For example, it should rain about 40 percent of the time that a weather forecaster predicts a 40 percent chance of rain. A high-quality forecast is well calibrated; the feedback produced by comparing actual events to the forecast is used to improve the forecast. Quality forecasts require a lot of practice and a data-rich environment.
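
As a rough illustration of what such a calibration check might look like (a sketch of my own, not something from the book; the forecast and outcome data are hypothetical), a few lines of Python can bin forecasts by stated probability and compare each bin to the observed frequency:

# Sketch of a forecast calibration check; the data below are hypothetical.
import numpy as np

def calibration_table(forecast_probs, outcomes, bins=10):
    """Group forecasts into probability bins and compare the average stated
    probability in each bin to the observed frequency of the event."""
    forecast_probs = np.asarray(forecast_probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)
    edges = np.linspace(0.0, 1.0, bins + 1)
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (forecast_probs >= lo) & (forecast_probs < hi)
        if in_bin.any():
            rows.append((forecast_probs[in_bin].mean(), outcomes[in_bin].mean(), int(in_bin.sum())))
    return rows  # (mean forecast probability, observed frequency, count) per bin

# Hypothetical example: the 40 percent forecasts verify about 40 percent of the time.
probs  = [0.4, 0.4, 0.4, 0.4, 0.4, 0.1, 0.1, 0.9, 0.9, 0.9]
rained = [1,   0,   1,   0,   0,   0,   0,   1,   1,   1]
for forecast, observed, n in calibration_table(probs, rained, bins=5):
    print(f"forecast {forecast:.2f}  observed {observed:.2f}  (n={n})")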

Consistency:

Consistency is whether the forecast was the best forecast possible at the time it was made. In some respects, this might also be thought of as the honesty of the forecast. For example, some weather forecasters intentionally overpredict the chance of rain, especially when the chance of rain is less than 20 percent. The justification for this “wet bias” is that people do not like being caught unprepared for rain. However, the impression that weather forecasts are overly conservative has resulted in people failing to heed evacuation warnings during hurricanes. Many deaths during Hurricane Katrina were attributable to people expecting that the hurricane would be less severe than was forecast. In fields like economics, in which complex systems and a lack of underlying theory often prevent making good predictions, the honesty of a forecast can be improved by simply disclosing its uncertainty. People use forecasts to make decisions and adjust their behavior to their expectation of the bias or accuracy of the forecast.

Economic Value:

The economic value of a forecast is a measure of how useful it is for decision making. For example, a weather forecast is useful if it is more accurate than “persistence”, the assumption that the weather tomorrow will be the same as today, or climatology, the long-term average of weather conditions. Economic value also includes the trade-off between the additional effort more complicated methods require and the improvement in quality and consistency they might provide. Silver postulates a “Pareto Principle” of forecasting: about 80 percent of the possible accuracy of a prediction is generated by about 20 percent of the possible effort. Beyond this, additional effort produces a rapidly declining marginal return in accuracy.

Engineering Predictions

So what does any of this have to do with engineering? Modeling of engineered systems is based on good basic theory, which allows the development of models used for design methods, building codes and other performance standards. However, engineering models are relatively data-poor. While there is a long history of academic research in engineering disciplines, long-term performance data for in-service systems are lacking. Therefore, common mathematical models of engineering systems may be limited, leading to the risk of “out-of-sample” predictions in which a model is used beyond the range of conditions for which it was calibrated.

Some common problems in engineering, such as soil-structure interaction and foundation design, are fraught with many of the same challenges as the prediction problems Silver discusses because they are complex, dynamic and nonlinear. The behavior of a dynamic system at one point in time influences its behavior in the future. Dynamic, nonlinear systems are subject to large errors resulting from small errors in initial conditions. In addition, the behavior of a complex system can be hard to predict because its parts interact, even if the behavior of those parts is relatively simple. Not all models of engineering systems have all of these attributes, usually because it is possible to simplify the model so that it is better behaved but still reasonably useful. For example, structural engineering models often assume linear elasticity and idealized joints to eliminate nonlinearity and reduce complexity.

Consider how well engineering predictions fare against Silver’s criteria of quality, consistency and economic value. It should be evident that the criteria are related and potentially conflicting. For engineers, the reliability of a model of an engineering system typically takes precedence because of the consequences that may accompany failure. Reliability may come at the expense of quality, and if consistency is equivalent to honesty, this is analogous to a wet-biased weather forecast. A more consistent forecast would be one that better balances reliability and accuracy, which typically requires effort to reduce uncertainty and therefore some economic justification.

In practice, the prediction necessary to analyze or design an engineering system commonly employs simple and conservative methods. The system is predicted to be “OK” or “no good”, a decision is made about the system that is manifested in design drawings or a report, and the documentation of the prediction goes in a file, often never to be seen again unless there is a failure. In a lot of applications, this is a reasonable approach.

Too often, however, an overly simple approach, coupled with compounding conservative assumptions, produces an unreasonable outcome that is then presented overconfidently. Like Gulf Coast residents who ignored hurricane warnings because they had become accustomed to overly conservative forecasts, project stakeholders adjust their behavior to overly conservative engineering. As a result, engineers have a seemingly worsening reputation for excess conservatism. If engineers are broad-brushed as overly conservative, the only basis of comparison among them is their fee. If all engineering design is overly conservative, there is plenty of incentive for contractors to cut corners and little incentive for rigorous quality control. Deviation from good practice can become the norm because, most of the time, everyone gets away with it.

The inevitable result is elevated construction cost and risk, and sometimes disappointing performance of the completed project. Other problems arise for engineers who are not overly conservative, or when the fundamental uncertainty of a problem genuinely warrants a conservative approach. It becomes difficult to argue for adequate fees to do more accurate analyses. You may have to expend time and energy to justify code-required design details to a client. During construction, you may have to defend against contractors looking to “value engineer” out important components, like seismic details or waterproofing (I have seen both). How many other problems in construction have prediction issues as an underlying cause?

Solutions

There are steps we can take to improve the quality, honesty and economic value of the predictions underlying the analysis and design of engineered systems. Silver provides three principles for prediction, all of which are worth considering in engineering problems:

Think probabilistically

To make good predictions, it is necessary to be comfortable with uncertainty in the inputs and in the model itself, and to understand how biases and assumptions affect the problem. Silver emphasizes the use of Bayes’ theorem, a simple mathematical expression of an intuitive concept: when we learn something new about an event of interest, it changes our assessment of the likelihood that the event will occur. If my average trip downtown on the Washington Metro is 30 minutes and I learn that the Metro is on fire, then I know that the likelihood that I arrive in less than 30 minutes is less than 50 percent (that is, the trip will likely take longer than average).
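
To make the arithmetic concrete, here is a minimal sketch of Bayes’ theorem, P(A|B) = P(B|A) × P(A) / P(B), applied to the Metro example. The conditional probabilities assigned to a fire report are hypothetical, chosen only to illustrate how the evidence shifts the estimate:

# Minimal sketch of Bayes' theorem applied to the Metro example.
# All probabilities below are hypothetical and for illustration only.
p_fast = 0.5                # prior: P(trip takes less than 30 minutes)
p_fire_given_fast = 0.01    # hypothetical: P(fire reported | trip under 30 minutes)
p_fire_given_slow = 0.05    # hypothetical: P(fire reported | trip 30 minutes or more)

# Total probability of the evidence, then the Bayes update.
p_fire = p_fire_given_fast * p_fast + p_fire_given_slow * (1.0 - p_fast)
p_fast_given_fire = p_fire_given_fast * p_fast / p_fire
print(f"P(under 30 min | fire) = {p_fast_given_fire:.2f}")  # about 0.17, well below 0.50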

Engineering design methods accommodate uncertainty, but the accounting for uncertainty is usually baked into the methodology or the underlying models in a manner that is not transparent to the user. For example, a structural factor of safety accounts for uncertainty associated with overload, material understrength, manufacturing tolerances and geometric imperfections, among other things. Since design methods can be learned and used without understanding the underlying uncertainties, many engineers do not consider or understand them. Consequently, the approach to uncertainty varies widely, from engineers who compound conservative assumptions to others who believe their personal judgment to be superior to building codes. Yet most engineers present their predictions confidently and deterministically, whether those predictions are qualitative or quantitative.
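
As a generic sketch of how that baked-in uncertainty appears in practice (the loads, units, and the load and resistance factors below are illustrative placeholders, not values drawn from any particular code), a factored strength check might look something like this:

# Generic sketch of a factored design check; factors and loads are placeholders.
def design_check(dead, live, nominal_resistance,
                 load_factor_dead=1.2, load_factor_live=1.6, resistance_factor=0.9):
    """Return the demand-to-capacity ratio for a single load combination.
    The load and resistance factors are where much of the uncertainty
    (overload, understrength, tolerances, imperfections) is carried."""
    factored_demand = load_factor_dead * dead + load_factor_live * live
    factored_capacity = resistance_factor * nominal_resistance
    return factored_demand / factored_capacity

ratio = design_check(dead=50.0, live=80.0, nominal_resistance=250.0)  # hypothetical values
print("OK" if ratio <= 1.0 else "no good", f"(D/C = {ratio:.2f})")

The point of the sketch is that the factors quietly carry the uncertainty; an engineer can pass the check without ever confronting what those factors represent.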

If engineers fail to communicate uncertainty, project stakeholders cannot think about the project probabilistically and will assume that the project as designed will perform perfectly. They may, therefore, assume incompetence or malfeasance when confronted with any deviation from their own expectations about how the construction should proceed or how the project should perform. Unfulfilled, unrealistic expectations are a common source of claims. Communicating the assumptions and uncertainties in our methods and advising stakeholders of risks and trade-offs would go a long way towards improving the consistency of engineering predictions.

Make a lot of forecasts

Forecasters improve with experience. Models improve with feedback from comparing predictions with actual events. Therefore, if a forecast is worth making, Silver advocates making the best forecast you can today but not becoming too invested in it. As more information becomes available or circumstances change, the prediction will change. A forecast should stabilize as uncertainty decreases: as a hurricane approaches landfall, the forecast location and strength at landfall become more certain. The uncertainty in a hurricane’s path and strength when it is several days away does not justify being afraid to try to predict it.

In a lot of engineering applications, a prediction is made once, often early in the design process. Since changing a model, parameter or assumption absent a material change in circumstances may be viewed as a liability, an early prediction may remain unchanged for the duration of the project. Any prediction made early, when uncertainty may be greatest, may have to rely on arbitrary conservatism to be reliable. Consider a project in which foundation settlement estimates must be made in the geotechnical report before the foundations are designed and will not be modified because the geotechnical engineer is not involved in the design. This is common on building projects and often results in excessively conservative design parameters and incomplete or uncoordinated recommendations, increasing cost, risk or both.

The alternative would be a Bayesian approach in which the system behavior is predicted during each stage of the project, using models appropriate for the information available at the time. As the project proceeds, more information becomes available and the models should become more certain, allowing design optimization or monitoring of actual behavior. If the observed behavior and the model do not converge, that divergence may provide an early indicator of deviation from the expected behavior of the system and can be used to trigger contingency plans. Thus the Bayesian approach provides opportunities to reduce costs and control risk.
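
One way to picture this updating (a sketch of my own under simple assumptions, not a procedure from the article) is a conjugate normal update of a settlement estimate as monitoring readings arrive; every number below is hypothetical. The estimate shifts toward the observations and its uncertainty shrinks:

# Sketch of Bayesian updating of a predicted quantity (e.g., foundation settlement)
# as monitoring readings arrive. Normal prior, normal measurement noise with known
# variance; all values are hypothetical and for illustration only.
prior_mean, prior_var = 25.0, 6.0 ** 2   # prior estimate: 25 mm, std dev 6 mm
obs_var = 3.0 ** 2                       # assumed measurement noise: std dev 3 mm

mean, var = prior_mean, prior_var
for reading in [30.0, 28.0, 29.0]:       # hypothetical monitoring readings, mm
    new_var = 1.0 / (1.0 / var + 1.0 / obs_var)        # precisions add
    mean = new_var * (mean / var + reading / obs_var)  # precision-weighted average
    var = new_var
    print(f"updated estimate: {mean:.1f} mm, std dev {var ** 0.5:.1f} mm")

If the readings sat far outside the prior’s uncertainty band, that failure to converge would be exactly the early warning described above.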

Look for consensus

A consensus among multiple, independent forecasts typically suggests greater certainty and accuracy. Even forecasts derived from “expert” opinion can be improved by polling multiple experts. Group forecasts can be 15 to 20 percent more accurate than individual forecasts because approaching a problem from multiple perspectives allows the aggregation of a larger variety of information. Silver warns against “magic bullet” forecasts that use a few parameters to predict the behavior of complex systems. Incorporating qualitative information into an otherwise quantitative forecast, as Silver describes the Cook Political Report and the National Weather Service doing, provides an effective means of aggregating different types of information.

It is unusual for multiple engineers to model the same system independently. However, it is still good practice to approach a problem from multiple perspectives, and this is commonly part of the quality control of design calculations and other documents. Sometimes checking computer program output involves comparison with simplified or approximate methods. It can be useful to begin with a simple model that defines the system to a certain level and then use its results to build a progressively more complex model, comparing the results at each step and understanding the differences in behavior before proceeding. Comparing the results of a model to an engineer’s experience is an important, albeit potentially biased, way of looking for consensus; it can be effective if the effort is made to understand the differences between the model and the expected result.
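
As a small sketch of that habit (the quantity, estimates and threshold below are hypothetical, not a prescribed procedure), comparing a hand calculation, a simplified model and a detailed model of the same quantity and flagging large spreads is a cheap consensus check:

# Sketch of a simple consensus check across independent estimates of the same
# quantity (say, a deflection in mm). All values are hypothetical.
import statistics

estimates = {
    "hand calculation": 11.0,
    "simplified model": 12.5,
    "detailed model": 12.1,
}

mean = statistics.mean(estimates.values())
spread = max(estimates.values()) - min(estimates.values())
for name, value in estimates.items():
    print(f"{name:>18}: {value:5.1f}  ({(value - mean) / mean:+.0%} vs. mean)")
print(f"consensus mean {mean:.1f}, spread {spread:.1f} ({spread / mean:.0%} of mean)")
# A spread that is large relative to the mean warrants investigation before proceeding.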

Conclusion

During the 2012 presidential campaign, Silver was harshly criticized by political journalists and pundits for predicting that an Obama reelection was quite likely. The polling data, as well as “fundamentals” such as economic indicators and incumbency advantage, supported FiveThirtyEight’s model, yet experienced political commentators insisted that the election was a toss-up because they ‘felt’ it was so.

Like Silver and his fellow data journalists, we engineers must defend ourselves against punditry: in our case, contractors, clients and less technically sophisticated colleagues who ‘feel’ that our solutions are wrong because their biases, overconfidence, limited experience or lack of fundamental understanding of theory keep them from recognizing the limitations of their own predictions.

Overcoming this challenge will require a change in our approach to problems or at least how we communicate our solutions. We need to think about the uncertainty, assumptions and biases applied to our methods. We should use models that are appropriate to the circumstances and use more accurate methods when the quality of available information and economic value warrant it. Most importantly, we should communicate uncertainties, risks and trade-offs to our clients so they can think probabilistically and make better decisions. We need more forecasting and less punditry.

Richard J. Driscoll, PE is a structural and foundation engineer licensed in the District of Columbia and six states and the owner of Richard J. Driscoll, Consulting Engineer. The information and statements in this document are for information purposes only and do not comprise the professional advice of the author or create a professional relationship between reader and author.