- Organized simplicity. Systems with few elements. Analyzed mathematically.
- Disorganized complexity. Systems with many elements acting randomly. Analyzed statistically.
- Organized complexity. Systems with many interconnected elements. Analyzed with operations research/model-building methods.
Joe Martino tells TOF that "One of the most horrible examples I ever encountered was the use of a Cobb-Douglas Production Function to predict the effectiveness of bombing the Ho Chi Minh trail. When I first saw the model it fairly screamed 'wrong!' But the people who put it together saw nothing wrong with it."
And yet, these things have so many pretty equations they seem like they damn well ought to work. And they do, in some cases. Kingsbury Bearings has a model for hydraulic bearings that works well in predicting the performance of new bearing designs.* So what's the problem? Operationally, what is the difference between a model that is "useful" and one that is "true"?
(*) hydraulic bearings. TOF digresses. At the entrance to the Kingsbury plant is a placard honoring the Kingsbury Bearing installed in Holtville #5 in 1912. TOF inquired of his hosts in the 1980s when he spent some time with them: "How long did it last?" "We don't know yet," was the response. "It's still running." As of 2008, it was still running with an estimated TTF of 1300 years. That's craftsmanship!
The Most Famous Model In the History of the World, Maybe
Great Ptolemaic Smackdown. The Ptolemaic model of the heavens was fabulously successful for 1,460 years -- about as long as a Kingsbury bearing at the Holtville Dam. It predicted sunrises, sunsets, eclipses, and sundry other stellar phenomena with tolerable accuracy. If the proof of the pudding is in the eating,* surely the proof of the model is in its forecasts. Or maybe that just is the difference between "useful" and "true."
That a model makes good predictions, as the Ptolemaic (and later Tychonic) model did, is no assurance that the real world matches the internal arrangements of the model. Let's call it the Turing Fallacy. There's always more than one way to skin a cat, and more than one way to model a phenomenon. And you cannot proceed automatically from post hoc results to propter hoc models.
(*) proof. Means "to test", as in "proving grounds" or the proof of whiskey.
Data, Data Everywhere, and Not a Jot to Think
What we observe is not nature itself, but nature exposed to our method of questioning.
– Werner Karl Heisenberg, Physics and Philosophy: The Revolution in Modern Science
IOW, how you ask the question determines the sort of answers you can get. If your method of questioning is a hammer, Nature will testify to her sterling nail-like qualities. So Facts have meaning only relative to a conceptual Model of the phenomenon.
What Do You Mean, "Probably"?
"Give me data!" he cried. "I can't make bricks without straw!"
-- The Copper Beeches
Usually, the evidence E is extended by tacking on “I believes.” For example, “I believe p is represented by a normal distribution with parameters, μ and σ.” These “I believes” comprise the model, M. This makes the model "credo-ble."*
So the probability is assigned to propositions p, given evidence E and model M: Pr(p | E, M).
[Image: A bad Model]
(*) structured relationship. In Latin, fictio. Facts acquire meaning as part of a fiction.
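To see how much the "I believes" matter, here is a minimal sketch in which the same evidence E yields two different probabilities for the same proposition p, depending only on the model M. The data, the threshold, and both candidate models are invented for illustration; nothing here comes from a real study.

```python
import math
from statistics import NormalDist, mean, stdev

# Hypothetical evidence E: ten made-up positive measurements.
evidence = [1.2, 0.8, 1.5, 2.9, 1.1, 0.9, 1.7, 1.3, 4.2, 1.0]
threshold = 5.0  # proposition p: "the next measurement exceeds 5.0"


def prob_exceeds_normal(data, t):
    """Pr(p | E, M1), where M1 is 'I believe the values are normal.'"""
    model = NormalDist(mean(data), stdev(data))
    return 1.0 - model.cdf(t)


def prob_exceeds_lognormal(data, t):
    """Pr(p | E, M2), where M2 is 'I believe the logs of the values are normal.'"""
    logs = [math.log(x) for x in data]
    model = NormalDist(mean(logs), stdev(logs))
    return 1.0 - model.cdf(math.log(t))


p_m1 = prob_exceeds_normal(evidence, threshold)
p_m2 = prob_exceeds_lognormal(evidence, threshold)
print(f"Pr(p | E, normal model)    = {p_m1:.4f}")
print(f"Pr(p | E, lognormal model) = {p_m2:.4f}")
```

Same straw, different bricks: the evidence did not change between the two calls, only the credo did.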
But nothing in this life is certain. All is fraught with uncertainty.* That includes models. Yet the conclusions, the press releases, the requests for your money always seem couched in the language of certainty. Let's take a look at "A Certain Amount of Uncertainty" that models must deal with. Not all the freight is statistical.
(*) TOF is suddenly struck by the term "fraught." Can a situation be 'fraught with certainty'?
What else might we be fraught with? These are weighty matters.
A Model of a Model
What does a model even look like?
First, there's the real-world situation or system that you want to understand, the context that determines what's in the system and what's outside the system.
Example: In modelling velocities, Newton confined himself to macroscopic bodies traveling well below the speed of light. He didn't know he did this, since quantum events and relativistic speeds were beyond anyone's experience at the time. But in consequence his model stumbles whenever it crosses the boundaries into the very fast or the very small. TOF tells you three times: a model that gives a good account of Situation A may not do so for B.
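A rough sketch of that boundary, assuming the standard special-relativity rule for combining two collinear velocities as the "better" account. The function names and the sample speeds are hypothetical, chosen only to show where the Newtonian sum stops being a good approximation.

```python
C = 299_792_458.0  # speed of light in m/s


def add_newtonian(u, v):
    """Newtonian (Galilean) composition of two collinear velocities."""
    return u + v


def add_relativistic(u, v):
    """Special-relativistic composition of two collinear velocities."""
    return (u + v) / (1.0 + u * v / C**2)


def relative_error(u, v):
    """How far the Newtonian answer is from the relativistic one."""
    exact = add_relativistic(u, v)
    return abs(add_newtonian(u, v) - exact) / abs(exact)


# Illustrative speeds only: an airliner, then two much faster cases.
for label, speed in [("airliner", 250.0), ("0.1 c", 0.1 * C), ("0.9 c", 0.9 * C)]:
    print(f"{label:>8}: relative error = {relative_error(speed, speed):.3e}")
```

Inside Situation A (everyday speeds) the two models are indistinguishable for any practical purpose; in Situation B the simple sum exceeds the speed of light and the error is no longer small.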
Second, there's the model structure itself, a sort of skeleton that in some manner "fills in" and "supports" the blob with nodes and links: various factors and their relationships. These relationships are often expressed in mathematical form.
Third, there are the inputs, or signals from outside the system boundary; and the outputs or performances.
How Models Go Bad
[Image: Models gone bad]
- Context Uncertainty
- Model Structure Uncertainty
- Input Uncertainty
- Parameter Uncertainty
- Model Outcome Uncertainty
1. Context Uncertainty. This is uncertainty in framing the situation to be modeled: e.g., have we properly identified the problem to be solved, the system boundaries, the economic, political, social, environmental, and technological situation, etc. Basically, it's hard to model a situation if you don't know what you're talking about. Although that does not seem to ever stop anyone.
[Image: What exactly is the situation?]
This sort of uncertainty cannot be quantified, which can be a pain in the butt when clients want to know if we are "95% confident" in a conclusion. There is no mathematical-statistical approach to this, and we often wind up simply brainstorming a consensus figure. Just because a number is announced is no guarantee that something has been measured.
Ed Schrock used to call this sort of thing Type III Error: "getting the right solution to the wrong problem."
2. Model Uncertainty. This is uncertainty in designing and executing the model itself; that is, error in:
- doing the right thing and
- doing the thing right.
Aquinas once wrote: "The suppositions that these astronomers have invented need not necessarily be true; for perhaps the phenomena of the stars are explicable on some other plan not yet discovered by men." (De coelo, II, lect. 17) That is, there is always more than one way to skin a cat -- and more than one way to model a situation. For example:
- Is a photon a particle or a wave?
- Should we apply a mean or a median -- or a harmonic mean?
- Copernican or Tychonic?
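The mean-versus-median question is easy to make concrete. In this minimal sketch a hypothetical trip has three legs of equal length driven at different speeds (all numbers invented); three perfectly respectable "averages" give three different answers, and only one of them matches total distance over total time.

```python
from statistics import harmonic_mean, mean, median

# Hypothetical trip: three legs of equal length, driven at these speeds (mph).
leg_speeds = [15.0, 30.0, 60.0]
leg_miles = 60.0  # every leg covers the same distance

total_miles = leg_miles * len(leg_speeds)
total_hours = sum(leg_miles / speed for speed in leg_speeds)
true_average = total_miles / total_hours  # distance over time

print(f"arithmetic mean : {mean(leg_speeds):.2f} mph")
print(f"median          : {median(leg_speeds):.2f} mph")
print(f"harmonic mean   : {harmonic_mean(leg_speeds):.2f} mph")
print(f"distance / time : {true_average:.2f} mph")
```

Here the harmonic mean is the right choice because the legs are equal in distance; had they been equal in time, the arithmetic mean would have been right instead. The choice of statistic is a modeling decision, not a computation.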
[Image: Obamacare web portal]
- Bugs (software errors): keystroke errors in model source code, unclosed loops, registers not set properly, etc.
- Malfs (hardware faults): malfunctions in the technical equipment used to run the model, equipment capabilities, available bandwidth, etc.
3. Input Uncertainty
This is uncertainty in the data describing the reference (base case) system and in the external driving forces that influence the system.
3.1. Uncertainty about the external driving forces and their magnitudes -- especially drivers not under the control of policymakers. Model-users tend to assume they have control of the system simply because they've identified the Xs. There is also uncertainty regarding the system response to these forces, leading to model structure uncertainty.
3.2. Uncertainty about the system data (e.g., land-use maps, data on infrastructure, business registers, etc.) on which the model will operate. These may be in error in a variety of ways. The information may be wrong, outdated, missing, or (more subtly) be from outside the problem situation. Examples:
a) The predictive sample of the 1936 US presidential election, in which the target population (context) was "all registered voters" but the sampling frame (system data) comprised mostly lists of phone numbers. The two did not coincide: during the Depression, many voters did not have telephones, and those who did not voted differently from those who did.
b) Data for general climate models come from weather stations established and sited for other purposes. These stations often have missing data, have suffered damage, or have gone out of use. Or they may be subject to local effects, such as concrete or asphalt surroundings that may be acceptable for air traffic control -- where you want to know the specific conditions, heat island and all -- but not for other purposes.
Uncertainty about system data is generated by a lack of knowledge of the properties of the underlying system and deficiencies in the description of the variability. Modelers often take their databases and such very much for granted. A historical example: Copernicus' model failed not only because his model structure insisted on pure Platonic circles for the orbits, but also because he used the old Alphonsine Tables of astronomical data, which were rife with centuries of copyist errors that were then carried over into the new Prussian Tables.
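The telephone-list problem in example (a) can be sketched with a toy electorate. Every share below is invented for illustration and is not an estimate of the actual 1936 figures; the point is only that a large sample drawn from the wrong frame can miss by more than a small sample drawn from the right one.

```python
import random

# Hypothetical electorate; all three shares are made up for the sketch.
PHONE_SHARE = 0.40         # fraction of voters who own a telephone
SUPPORT_WITH_PHONE = 0.55  # fraction of phone owners backing candidate A
SUPPORT_NO_PHONE = 0.30    # fraction of non-owners backing candidate A


def make_population(n, rng):
    """Return a list of (has_phone, votes_for_a) pairs."""
    people = []
    for _ in range(n):
        has_phone = rng.random() < PHONE_SHARE
        support = SUPPORT_WITH_PHONE if has_phone else SUPPORT_NO_PHONE
        people.append((has_phone, rng.random() < support))
    return people


def share_for_a(people):
    """Fraction of the given people who vote for candidate A."""
    return sum(vote for _, vote in people) / len(people)


rng = random.Random(1936)  # fixed seed so the sketch is repeatable
population = make_population(200_000, rng)
phone_frame = [person for person in population if person[0]]

true_share = share_for_a(population)
random_poll = share_for_a(rng.sample(population, 2_000))   # small, right frame
phone_poll = share_for_a(rng.sample(phone_frame, 20_000))  # large, wrong frame

print(f"true share for A        : {true_share:.3f}")
print(f"random poll (n=2,000)   : {random_poll:.3f}")
print(f"phone poll  (n=20,000)  : {phone_poll:.3f}")
```

No amount of additional sampling from the phone list repairs the gap, because the error lives in the frame and not in the sample size.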
4. Parameter Uncertainty
Parameters are constants in the model, supposedly invariant within the chosen context and scenario. There are the following types of parameters:
- Exact parameters: universal constants, e.g.: e.
- Fixed parameters: considered exact from previous investigations, e.g.: g.
- A priori chosen parameters: based on prior experience in similar situations. (The uncertainty must be estimated on the basis of a priori experience.)
- Calibrated parameters: essentially unknown from previous investigations, or known only from circumstances too dissimilar for the earlier values to be reused. These must be determined by calibration. They directly affect model structure uncertainty.
5. Output Uncertainty
This is uncertainty in the predictions of the model and is typically a combined effect of all the other uncertainties. But specifically it includes uncertainties in the reference data used to perform the calibrations.
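For the uncertainties that can be given numbers, the "combined effect" can be sketched with a small Monte Carlo run. The toy model (output = k times x), the means, and the spreads below are all hypothetical; the sketch covers only input and parameter uncertainty and says nothing about context or structure, which do not reduce to a standard deviation.

```python
import random
from statistics import stdev


def toy_model(x, k):
    """Hypothetical model: a calibrated parameter k times an input x."""
    return k * x


def output_sd(n, rng, x_sd, k_sd, x_mean=10.0, k_mean=2.0):
    """Monte Carlo estimate of the standard deviation of the model output."""
    outputs = [
        toy_model(rng.gauss(x_mean, x_sd), rng.gauss(k_mean, k_sd))
        for _ in range(n)
    ]
    return stdev(outputs)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
sd_input_only = output_sd(20_000, rng, x_sd=1.0, k_sd=0.0)
sd_param_only = output_sd(20_000, rng, x_sd=0.0, k_sd=0.2)
sd_combined = output_sd(20_000, rng, x_sd=1.0, k_sd=0.2)

print(f"input uncertainty only     : sd = {sd_input_only:.2f}")
print(f"parameter uncertainty only : sd = {sd_param_only:.2f}")
print(f"both together              : sd = {sd_combined:.2f}")
```

The combined spread is wider than either source alone, which is why quoting only one of them flatters the model.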
Another Fine Math You've Gotten Us Into
A common problem arises when a metric of interest (say, the density of coal in a bunker) is for practical reasons difficult or impossible to measure directly. If we can construct a model by which the density is expressed as a function of radiation backscatter, then we can measure the more accessible metric and convert the results to equivalent density, which is what was done at the link.
The chart shows two envelopes around the regression line. The inner envelope (red) is a confidence bound for the regression line itself -- the "slop in the slope." That is, it tells us how closely the analysis has pinned down the parameter b1. The outer envelope (blue) is the prediction interval, which tells us what likely densities could account for the observed backscatter. This is what is really of interest. If the bunker backscatter measures 6000, the density is most likely somewhere betwixt 64 and 70. Which means it might not be. It is possible to have a very precise interval around a very wrong value. But that is a topic for another day.
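The two envelopes can be sketched for a simple one-variable regression using the usual textbook formulas. The eleven calibration points below are invented for illustration and are not the measurements behind the chart; the critical value is the two-sided 95% Student-t value for 9 degrees of freedom, hard-coded to keep the sketch free of third-party libraries.

```python
import math

# Hypothetical calibration data: 11 (backscatter, density) pairs, made up
# for this sketch. They are NOT the data plotted in the chart.
backscatter = [5400, 5510, 5620, 5730, 5840, 5950, 6060, 6170, 6280, 6390, 6500]
density = [61.2, 62.9, 62.4, 64.8, 64.1, 66.5, 66.3, 68.9, 68.0, 70.6, 70.1]

T_CRIT = 2.262  # two-sided 95% Student-t critical value, 11 - 2 = 9 d.f.


def fit_line(xs, ys):
    """Least-squares fit of y = b0 + b1*x, plus the pieces the intervals need."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # residual standard deviation
    return b0, b1, s, x_bar, sxx, n


def intervals(x0, fit, t_crit=T_CRIT):
    """Return (prediction, CI half-width, PI half-width) at x0."""
    b0, b1, s, x_bar, sxx, n = fit
    y_hat = b0 + b1 * x0
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = t_crit * s * math.sqrt(leverage)        # band for the line itself
    pi_half = t_crit * s * math.sqrt(1.0 + leverage)  # band for a new observation
    return y_hat, ci_half, pi_half


fit = fit_line(backscatter, density)
y_hat, ci_half, pi_half = intervals(6000, fit)
print(f"predicted density at 6000 : {y_hat:.1f}")
print(f"confidence band (line)    : +/- {ci_half:.2f}")
print(f"prediction band (new obs) : +/- {pi_half:.2f}")
```

The prediction band is always the wider of the two because it adds the scatter of an individual observation on top of the uncertainty in the fitted line, and both bands flare outward as x0 moves away from the middle of the calibration data.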
The point here is that the model, simple as it is, is subject to other uncertainties that cannot be addressed quite so mathematically. One is how widely it applies.
- We have already decided that this particular regression will not apply to other coal shipments because the very relationship Y=b0+b1X may differ.
- It would not be appropriate to apply the model to bunkers with backscatters much beyond 5400 to 6500. It might still be valid, but we don't know. The b1 parameter was calculated from the calibration data, and the results cannot be extrapolated beyond the range that was used.
- There is also the question of whether a one-variable linear regression is the best model. There may be other Xs we could include beside the backscatter that would give us a tighter prediction. That the calibration points do not fall on a perfectly straight line indicates that there are other factors in play. The coefficient of determination is 91%, which means (loosely speaking) that 91% of the variation in density is accounted for by its relationship to backscatter, leaving 9% unaccounted for.
- The parameter b1 was estimated using eleven calibration tube samples. This produced part of the uncertainty in the parameter estimation; that is, in the slope of the regression line. Was this sample size sufficient? Sufficient for what purposes?
- The backscatter is not measured with perfect precision. The vertical line at 6000 implies the measured backscatter at the bunker was exactly 6000. In reality, all instruments suffer from uncertainties related to precision and reproducibility. That vertical line should be a band of probabilities. The uncertainty in X is propagated through the model to an uncertainty in Y over and above what is shown here.
- The Y values on the calibration data were also measured with uncertainty.
- In addition to the instrument, there are uncertainties related to the technician: technique, attentiveness, skill, and so forth. The coal samples were supposed to be taken according to ASTM D2234 and D2013. Were they? What about the calibration tubes? Do the scribe lines mark the exact volumes for the calibration?
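Two of the caveats above, the calibration range and the imprecision of the backscatter reading, can be sketched in a few lines. The intercept, slope, and residual scatter are hypothetical values chosen only so that a reading of 6000 lands near the middle of the 64-to-70 interval mentioned earlier; the range limits are the 5400 and 6500 given in the text. The sketch uses first-order error propagation and, as a simplification, ignores the uncertainty in the fitted coefficients themselves.

```python
import math

# Hypothetical calibration results; none of these three values are from the source.
B0 = 7.0           # intercept
B1 = 0.010         # slope, density units per backscatter count
RESIDUAL_SD = 1.3  # scatter of calibration points about the line

X_MIN, X_MAX = 5400.0, 6500.0  # backscatter range covered by the calibration


def predict_density(x, x_sd):
    """Return (estimate, sd, extrapolated) for a backscatter reading.

    x_sd is the standard deviation of the instrument reading. Independent
    error sources are combined in quadrature (first-order propagation).
    """
    estimate = B0 + B1 * x
    sd = math.sqrt(RESIDUAL_SD**2 + (B1 * x_sd) ** 2)
    extrapolated = not (X_MIN <= x <= X_MAX)
    return estimate, sd, extrapolated


for reading, reading_sd in [(6000.0, 0.0), (6000.0, 100.0), (7000.0, 100.0)]:
    est, sd, outside = predict_density(reading, reading_sd)
    note = " (outside calibration range: treat with suspicion)" if outside else ""
    print(f"x = {reading:.0f} +/- {reading_sd:.0f} -> density {est:.1f}, sd {sd:.2f}{note}")
```

Treating the reading as exact (a vertical line rather than a band) understates the spread in the answer, and the flag for out-of-range readings is a reminder that the arithmetic will cheerfully produce a number where the calibration has nothing to say.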
Coming next, a closer look at some uncertainties in models, with special attention to that model.
- Box, George E.P., William G. Hunter, J. Stuart Hunter. Statistics for Experimenters, Pt.IV “Building Models and Using Them.” (John Wiley & Sons, 1978)
- Curry, Judith and Peter Webster. “Climate Science and the Uncertainty Monster” Bull. Am. Met. Soc., V. 92, Issue 12 (December 2011)
- El-Haik, Basem and Kai Yang. "The components of complexity in engineering design," IIE Transactions (1999) 31, 925-934
- von Hayek, Friedrich August. "The Pretence of Knowledge," Lecture to the memory of Alfred Nobel, December 11, 1974
- Petersen, Arthur Caesar. "Simulating Nature" (dissertation, Vrije Universiteit, 2006)
- Swanson, Kyle L. "Emerging selection bias in large-scale climate change simulations," Geophysical Research Letters
- Turney, Jon. "A model world." aeon magazine (16 December 2013)
- Walker, W.E., et al. "Defining Uncertainty: A Conceptual Basis for Uncertainty Management in Model-Based Decision Support." Integrated Assessment (2003), Vol. 4, No. 1, pp. 5–17
- Weaver, Warren. "Science and Complexity," American Scientist, 36:536 (1948)