The TOF Spot: America's Next Top Model -- Part II

Wednesday, March 5, 2014

America's Next Top Model -- Part II

PREVIOUSLY, WE DISCUSSED the emergence during and after WW2 of model-building as a means for dealing with organized complexity, one of three kinds of system distinguished in 1948 by Dr. Warren Weaver:

Organized simplicity. Systems with few elements. Analyzed mathematically.
Disorganized complexity. Systems with many elements acting "randomly." Analyzed statistically.
Organized complexity. Systems with many interconnected elements. Analyzed with operations research/model-building methods.

George Box said that "all models are wrong" because of the uncertainties to which models are subject. Honest modelers are obligated to assess these uncertainties and to inform their clients of them. Alas, too few do, and models are presented, understood, and press-released as being more certain than they actually are.

Joe Martino tells TOF that "One of the most horrible examples I ever encountered was the use of a Cobb-Douglas Production Function to predict the effectiveness of bombing the Ho Chi Minh trail. When I first saw the model it fairly screamed 'wrong!' But the people who put it together saw nothing wrong with it."

And yet, these things have so many pretty equations they seem like they damn well ought to work. And they do, in some cases. Kingsbury Bearings has a model for hydraulic bearings that works well in predicting the performance of new bearing designs.* So what's the problem? Operationally, what is the difference between a model that is "useful" and one that is "true"?

(*) hydraulic bearings. TOF digresses. At the entrance to the Kingsbury plant is a placard honoring the Kingsbury Bearing installed in Holtwood #5 in 1912. TOF inquired of his hosts in the 1980s when he spent some time with them: "How long did it last?" "We don't know yet," was the response. "It's still running." As of 2008, Wikipedia tells us, it was still running with an estimated TTF of 1300 years. That's craftsmanship! It's also a system whose elements and interactions are pretty well understood.

The Most Famous Model In the History of the World, Maybe.

Faithful Reader may recall TOF's extended discussion of the Great Ptolemaic Smackdown. The Ptolemaic model of the heavens was fabulously successful for 1,460 years -- slightly longer than a Kingsbury bearing at the Holtwood Dam. It predicted sunrises, sunsets, eclipses, and sundry other stellar phenomena with tolerable accuracy. If the proof of the pudding is in the eating,* surely the proof of the model is in its forecasts. Or maybe that just is the difference between "useful" and "true."

That a model makes good predictions, as the Ptolemaic (and later Tychonic) model did, is no assurance that the real world matches the internal arrangements of the model. (See Cartwright: "How the Laws of Physics Lie.") It is an "associative law" not a "causal law." Let's call it the Turing Fallacy. There's always more than one way to skin a cat, and more than one way to model a phenomenon. And you cannot proceed automatically from post hoc results to propter hoc models. Even Hume recognized that.

(*) proof. Means "to test", as in "proving grounds" or the proof of whiskey.

Data, Data Everywhere, and Not a Jot to Think

Heinlein, self-demonstrating

In “The Year of the Jackpot,” Robert A. Heinlein famously said, "A fact has no 'why.' There it stands, self-demonstrating." But nothing could be further from the truth. Facts are meaningless by themselves. Einstein (whose name double-rhymes with Heinlein) put it this way: "Theory determines what can be observed." This led Heisenberg to his famous observation that:

What we observe is not nature itself, but nature exposed to our method of questioning.

– Werner Karl Heisenberg, Physics and Philosophy: The Revolution in Modern Science

IOW, how you ask the question determines the sort of answers you can get. If your method of questioning is a hammer, Nature will testify to her sterling nail-like qualities. So Facts have meaning only relative to a conceptual Model of the phenomenon.

What Do You Mean, "Probably"?

"Give me data!" he cried. "I can't
make bricks without straw!"

-- The Copper Beeches

A model is a mechanism to assign probabilities to propositions p, given evidence E:

Pr(p|E)

Usually, the evidence E is extended by tacking on “I believes.” For example, “I believe p is represented by a normal distribution with parameters μ and σ.” These “I believes” comprise the model, M. This makes the model "credo-ble."*

(*) credo-ble. No, TOF will not apologize. He's glad he said it. Glad, he tells you!

So the probability is assigned to propositions p, given evidence E and model M:

Pr(p|EM)

You could lump E and M together into the same formal symbol, but it is never a good idea -- that's "never" as in "not ever" -- to confuse the assumptions of your model with actual, like, you know, evidence.
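A minimal sketch may make the role of M concrete. Everything specific here is an assumption made for illustration: the sample values, the threshold of 12, and the two candidate "I believes" (a normal and an exponential distribution). The same evidence E is fed to both models, and each returns its own Pr(p|EM).

```python
import math
from statistics import NormalDist, mean, stdev

# Evidence E: a small set of observations (hypothetical values).
evidence = [8.1, 9.4, 10.2, 10.9, 11.5, 9.8, 10.4, 12.1]

# Proposition p: "the next observation will exceed 12" (threshold is assumed).
threshold = 12.0

def pr_p_normal(data, x):
    """Pr(p | E, M1), where M1 = 'I believe the data are normal'."""
    model = NormalDist(mean(data), stdev(data))
    return 1.0 - model.cdf(x)

def pr_p_exponential(data, x):
    """Pr(p | E, M2), where M2 = 'I believe the data are exponential'."""
    rate = 1.0 / mean(data)
    return math.exp(-rate * x)

p_m1 = pr_p_normal(evidence, threshold)
p_m2 = pr_p_exponential(evidence, threshold)
print(f"Pr(p|E,M1) = {p_m1:.3f}")
print(f"Pr(p|E,M2) = {p_m2:.3f}")
```

The evidence never changed between the two calls; only the "I believe" did. The two answers differ by a wide margin, which is the practical reason for keeping E and M in separate symbols.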

A bad Model

TOF tells you three times: There is no such thing as a probability without evidence+model; that is, without facts and a structured relationship among those facts.* Most probabilities you see in the press releases of activist groups, government agencies, businesses, and other special interests have no more punch than Pee-Wee Herman. The worst model is where you claim you are not using one, because that means that you are unaware of your own assumptions. Nothing good can come from that.
(*) structured relationship. In Latin, fictio. Facts acquire meaning as part of a fiction.

But nothing in this life is certain. All is fraught with uncertainty.* That includes models. Yet, the conclusions, the press releases, the requests for your money always seem couched in the language of certainty. Let's take a look at the "Certain Amount of Uncertainty" that models deal with. Not all the freight is statistical.
(*) TOF is suddenly struck by the term "fraught." Can a situation be 'fraught with certainty'?
What else might we be fraught with? These are weighty matters.

A Model of a Model

What does a model even look like?

First, there's the real-world situation or system that you want to understand, the context that determines what's in the system and what's outside the system.

Example: In modelling velocities, Newton confined himself to macroscopic bodies traveling well below the speed of light. He didn't know he did this, since quantum events and relativistic speeds were beyond anyone's experience at the time. But in consequence his model stumbles whenever it crosses the boundaries into the very fast or the very small. TOF tells you three times: a model that gives a good account of Situation A may not do so for Situation B.
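As a rough illustration of a model that serves Situation A but not Situation B, the sketch below compares the Newtonian rule for combining two velocities with the special-relativistic rule. The choice of velocity addition as the example, the function names, and the sample speeds are assumptions made here; it covers only the "very fast" boundary, not the "very small" one.

```python
C = 299_792_458.0  # speed of light in m/s

def add_velocities_newton(u, v):
    """Newtonian composition: velocities simply add."""
    return u + v

def add_velocities_einstein(u, v):
    """Special-relativistic composition of collinear velocities."""
    return (u + v) / (1.0 + u * v / C**2)

cases = [
    ("two cars", 30.0, 30.0),
    ("two fast particles", 0.9 * C, 0.9 * C),
]

for label, u, v in cases:
    newton = add_velocities_newton(u, v)
    einstein = add_velocities_einstein(u, v)
    print(f"{label}: Newton {newton / C:.6g} c, Einstein {einstein / C:.6g} c")
```

For the cars the two rules agree to more decimal places than any speedometer could show. For the fast particles the Newtonian sum exceeds the speed of light, while the relativistic result stays below it.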

Second, there's the model structure itself, a sort of skeleton that in some manner "fills in" and "supports" the blob with nodes and links: various factors and their relationships. These relationships are often expressed in mathematical form.

Third, there are the inputs, or signals from outside the system boundary; and the outputs or performances.

How Models Go Bad

Models gone bad

Perceptive Reader immediately notes several possible sources (or locations) of uncertainty.*

Context Uncertainty
Model Structure Uncertainty
Input Uncertainty
Parameter Uncertainty
Model Outcome Uncertainty

(*) Walker, W.E., et al. "Defining Uncertainty: A Conceptual Basis for Uncertainty Management in Model-Based Decision Support." Integrated Assessment (2003), Vol. 4, No. 1, pp. 5–17

1. Context Uncertainty. This is uncertainty in framing the situation to be modeled: e.g., have we properly identified the problem to be solved, the system boundaries, the economic, political, social, environmental, and technological situation, etc. Basically, it's hard to model a situation if you don't know what you're talking about. Although that does not ever seem to stop anyone.

What exactly is the problem?

The more thoroughly the system is understood, the more realistic the models of it will be; but Stakeholders often have "different perceptions of reality," to put it mildly. Funding sources may have desired outcomes already in mind. Even the expert groups called in to build the model may frame the problem to employ the tools they are comfortable with. They will come with hammers in hand and boldly forecast nails in our futures.

This sort of uncertainty cannot be quantified, which can be a pain in the butt when clients want to know if we are "95% confident" in a conclusion. There is no mathematical-statistical approach to this, and we often wind up simply brainstorming a consensus figure. Just because a number is announced is no guarantee that something has been measured.

Ed Schrock used to call this sort of thing Type III Error: "getting the right solution to the wrong problem."

2. Model Uncertainty. This is uncertainty in designing and executing the model itself; that is, error in:

doing the right thing and
doing the thing right.

In modeling, this is Structural Uncertainty and Technical Uncertainty.

Obamacare web portal

2.1 Structural Uncertainty (Not doing the right thing).
Aquinas once wrote: "The suppositions that these astronomers have invented need not necessarily be true; for perhaps the phenomena of the stars are explicable on some other plan not yet discovered by men." (De coelo, II, lect. 17) That is, there is always more than one way to skin a cat -- and more than one way to model a situation. For example:

Is a photon a particle or a wave?
Should we apply a mean or a median -- or a harmonic mean?
Copernican or Tychonic?
Etc.

The modeler must decide which variables to include in the model, which to exclude, what mathematical relations exist among them, etc. Perhaps some of the discarded (or unknown!) factors are more important than we thought? Basically, the model might stink like three-day old fish even if the underlying context has been correctly understood.
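The mean-versus-median-versus-harmonic-mean question can be sketched with a toy case. The trip, the speeds, and the leg length below are hypothetical; the point is only that the "right" summary depends on the structure of the situation, not on the data alone.

```python
from statistics import mean, median, harmonic_mean

# Hypothetical trip: three legs of EQUAL LENGTH driven at these speeds (km/h).
speeds = [30.0, 60.0, 120.0]
leg_km = 10.0  # assumed length of each leg

# Three candidate "models" of the average speed.
candidates = {
    "arithmetic mean": mean(speeds),
    "median": median(speeds),
    "harmonic mean": harmonic_mean(speeds),
}

# Ground truth for this setup: total distance divided by total time.
total_time_h = sum(leg_km / s for s in speeds)
true_average = leg_km * len(speeds) / total_time_h

for name, value in candidates.items():
    print(f"{name}: {value:.2f} km/h (error {value - true_average:+.2f})")
```

With equal-distance legs, only the harmonic mean recovers the true average speed. Had the legs been equal in duration instead, the arithmetic mean would have been the correct structure, with the very same numbers.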

Obamacare web portal

2.2 Technical Uncertainty (Not doing the thing right). A fresh fish can be spoiled in the cooking, and a good model can be screwed up in the execution. These flaws can be bugs or malfs.

Bugs (software errors): keystroke errors in model source code, unclosed loops, registers not set properly, etc.
Malfs (hardware faults): malfunctions in the technical equipment used to run the model, equipment capabilities, available bandwidth, etc.

Again, it is very difficult, if not impossible, to quantify model uncertainty. What is the probability Pr(p|EM) that we have overlooked a key variable in M? Almost by definition, we cannot know this. One approach is to compare our model outputs with the outputs of other models of the same situation. If multiple independent models agree, we have more [but unquantified] confidence in them. (But only if they are truly independent.)

3. Input Uncertainty

This is uncertainty in the data describing the reference (base case) system and in the external driving forces that influence the system.

3.1. Uncertainty about the external driving forces and their magnitudes -- especially drivers not under the control of policymakers. Model-users tend to assume they have control of the system simply because they've identified the Xs. This can be exacerbated when a congeries of actual measurements is given a single symbolic name for convenience. One is then at risk of treating this composite variable as if it were a real-world measurement. (See the model of coups in sub-Saharan Africa.) There is also uncertainty regarding the system response to these forces, leading to model structure uncertainty.

3.2. Uncertainty about the system data (e.g., land-use maps, data on infrastructure, business registers, etc.) on which the model will operate. These may be in error in a variety of ways. The information may be wrong, outdated, missing, or (more subtly) be from outside the problem situation. Examples:

a) The predictive sample of the 1936 US presidential election, in which the target population (context) was "all registered voters" but the sampling frame (system data) comprised mostly lists of phone numbers. The two did not coincide: during the Depression, many voters did not have telephones, and those who did not voted differently from those who did.

b) Data for general climate models come from weather stations established and sited for other purposes. These stations often have missing data, have suffered damage, or have gone out of use. Or they may be subject to local effects, such as concrete or asphalt surroundings that may be acceptable for air traffic control -- where you want to know the specific conditions, heat island and all -- but not for other purposes.
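The mechanism behind the 1936 example can be sketched with a few lines of arithmetic. All the shares below are invented for illustration and are not the historical figures; the group names and function names are likewise assumptions.

```python
# Hypothetical electorate, chosen only to show the mechanism.
population = {
    # group: (share of all voters, share of that group voting for candidate A)
    "has telephone": (0.40, 0.60),
    "no telephone": (0.60, 0.30),
}

def true_support(pop):
    """Support for A in the target population (all voters)."""
    return sum(share * support for share, support in pop.values())

def frame_support(pop, groups_in_frame):
    """Support for A among only those groups the sampling frame can reach."""
    covered = [pop[g] for g in groups_in_frame]
    total_share = sum(share for share, _ in covered)
    return sum(share * support for share, support in covered) / total_share

actual = true_support(population)
polled = frame_support(population, ["has telephone"])
print(f"target population: {actual:.0%} for A")
print(f"sampling frame:    {polled:.0%} for A")
```

In this toy electorate the frame forecasts a comfortable win for A while the target population hands A a loss. Note that no sampling error appears anywhere in the calculation: a bigger sample drawn from the same frame would only estimate the wrong number more precisely.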

Uncertainty about system data is generated by a lack of knowledge of the properties of the underlying system and deficiencies in the description of the variability. Modelers often take their databases and such very much for granted. A historical example: Copernicus' model failed not only because his model structure insisted on pure Platonic circles for the orbits, but also because he used the old Alphonsine Tables of astronomical data, which were rife with centuries of accumulated copyist errors that then carried over into the new Prussian Tables.

4. Parameter Uncertainty

Parameters are constants in the model, supposedly invariant within the chosen context and scenario. There are the following types of parameters:

Exact parameters: universal constants, e.g., e.
Fixed parameters: considered exact from previous investigations, e.g., g.
A priori chosen parameters: based on prior experience in similar situations. (The uncertainty must be estimated on the basis of a priori experience.)
Calibrated parameters: essentially unknown, either because there are no previous investigations or because their results cannot be used due to dissimilarity of circumstances. These must be determined by calibration. They directly affect model structure uncertainty.

Calibration of the parameters against historical data produces the uncertainty that is most readily expressed in statistical form. Modelers often concentrate too closely on how well they have estimated the parameters at the expense of how well they have actually estimated the model outcomes -- which is what the clients are actually interested in.

5. Output Uncertainty

This is uncertainty in the predictions of the model and is typically a combined effect of all the other uncertainties. But specifically it includes uncertainties in the reference data used to perform the calibrations.

Another Fine Math You've Gotten Us Into

Another fine math...

Some of these uncertainties can be expressed mathematically -- "There is a 95% confidence that the parameter falls between the following bounds." But often these are the least important uncertainties. Who cares what the slop in the slope is?

A common problem is when a metric of interest (density of coal in a bunker) is for practical reasons difficult or impossible to measure. If we can construct a model by which the density is expressed as a function of radiation backscatter, then we can measure the more accessible metric and convert the results to equivalent density, as was done at the link.

The calibration data is shown in red. These were tubes packed to a known density whose backscatter was then measured. (This sort of thing had to be done for each shipment of coal to the power plant, since the relationship between density and radiation differed for different veins of coal. Context!) A linear regression was deemed a reasonable model over the range of interest both for empirical reasons (the data was accounted for) and for scientific reasons. So Y=b0+b1X.

The chart shows two envelopes around the regression line. The inner envelope (red) is a confidence bound for the regression line itself -- the "slop in the slope." That is, it tells us how closely the analysis has pinned down the parameter b1. The outer envelope (blue) is the prediction interval, which tells us what likely densities could account for the observed backscatter. This is what is really of interest. If the bunker backscatter measures 6000, the density is most likely somewhere betwixt 64 and 70. Which means it might not be. It is possible to have a very precise interval around a very wrong value. But that is a topic for another day.
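The difference between the two envelopes can be sketched numerically. The eleven calibration points below are invented for illustration and are NOT the plant's data; the function names are likewise assumptions. The critical value 2.262 is the standard two-sided 95% Student-t value for 9 degrees of freedom (eleven points minus two fitted parameters).

```python
import math

# Hypothetical calibration data: (backscatter count, packed density).
calibration = [
    (5400, 61.2), (5500, 62.5), (5600, 62.1), (5700, 64.0), (5800, 64.9),
    (5900, 65.1), (6000, 67.3), (6100, 67.0), (6200, 69.2), (6300, 69.6),
    (6500, 71.8),
]
T_CRIT = 2.262  # two-sided 95% Student-t critical value, 9 degrees of freedom

def fit_line(points):
    """Ordinary least squares for Y = b0 + b1*X, plus residual scatter s."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in points)
    s = math.sqrt(sse / (n - 2))
    return b0, b1, s, x_bar, sxx, n

def intervals_at(x0, points, t_crit=T_CRIT):
    """Fitted value at x0 with half-widths of both envelopes."""
    b0, b1, s, x_bar, sxx, n = fit_line(points)
    y_hat = b0 + b1 * x0
    leverage = 1.0 / n + (x0 - x_bar) ** 2 / sxx
    half_conf = t_crit * s * math.sqrt(leverage)        # where the line is
    half_pred = t_crit * s * math.sqrt(1.0 + leverage)  # where one new Y falls
    return y_hat, half_conf, half_pred

y_hat, half_conf, half_pred = intervals_at(6000, calibration)
print(f"fitted density at 6000: {y_hat:.1f}")
print(f"confidence band: +/- {half_conf:.2f}")
print(f"prediction band: +/- {half_pred:.2f}")
```

The only difference between the two half-widths is the extra "1.0 +" under the square root, which carries the scatter of an individual observation around the line. More calibration tubes shrink the inner band toward zero but cannot push the outer band below that scatter.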

The point here is that the model Y=b0+b1X, simple as it is, is subject to other uncertainties that cannot be addressed quite so mathematically. One is how widely it applies.

We have already decided that this particular regression will not apply to other coal shipments because the very relationship Y=b0+b1X may differ.
It would not be appropriate to apply the model to bunkers with backscatters much outside the calibrated range of 5400 to 6500. It might still be valid, but we don't know. The b1 parameter was calculated from the calibration data, and the results cannot be extrapolated beyond the range of those data.
There is also the question of whether a one-variable linear regression is the best model. There may be other Xs we could include that would give us a tighter prediction. That the calibration points do not fall on a perfectly straight line indicates that there are other factors in play. The coefficient of determination is 91%, which means (loosely speaking!) that 91% of the variation in density is accounted for by its relationship to backscatter, which leaves 9% unaccounted for.
The parameter b1 was estimated using eleven calibration tube samples. This produced part of the uncertainty in the parameter estimation; that is, in the slope of the regression line. Was this sample size sufficient? Sufficient for what purposes?
The backscatter is not measured with perfect precision. The vertical line at 6000 implies the measured backscatter at the bunker was exactly 6000. In reality, all instruments suffer from uncertainties related to precision and reproducibility. That vertical line should be a band of probabilities. The uncertainty in X is propagated through the model to an additional uncertainty in Y over and above what is shown.
The Y values on the calibration data were also measured with uncertainty.
In addition to the instrument, there are uncertainties related to the technician: technique, attentiveness, skill, and so forth. The coal samples were supposed to be taken according to ASTM D2234 and D2013. Were they? What about the calibration tubes? Do the scribe lines mark the exact volumes for the calibration?
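One item on this list, the imprecision of the backscatter reading itself, lends itself to a back-of-the-envelope sketch. It uses first-order propagation through Y=b0+b1X and assumes the error in X is independent of the model's own scatter. The slope, the scatter, and the instrument repeatability below are hypothetical numbers, not values read from the chart.

```python
import math

def combined_sd(slope, sd_x, sd_model):
    """Standard deviation of predicted Y when X itself is uncertain.

    First-order propagation for Y = b0 + b1*X + error, assuming the
    measurement error in X is independent of the model scatter.
    """
    return math.sqrt(sd_model ** 2 + (slope * sd_x) ** 2)

# Hypothetical values for illustration only.
slope = 0.01           # density units per backscatter count
sd_model = 1.2         # prediction scatter in density at a fixed X
sd_backscatter = 50.0  # instrument repeatability, in counts

total = combined_sd(slope, sd_backscatter, sd_model)
print(f"scatter ignoring X error: {sd_model:.2f}")
print(f"scatter including X error: {total:.2f}")
```

The vertical line at the measured backscatter becomes a band, and the band in Y widens accordingly. The other items on the list (technician, sampling standard, scribe lines) do not reduce to a formula this easily, which is rather the point.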

And so forth and so on. Fortunately, this was done by engineering at a power plant, not psychologists in a university. Practical reason kept things sane. They were not trying to discover a new scientific finding. They were trying to estimate the properties of the coal they were about to burn, and the answers obtained in this fashion were 'good enough.'

Coming next, a closer look at some uncertainties in models, with special attention to that model.

Part III link.

References

Box, George E.P., William G. Hunter, J. Stuart Hunter. Statistics for Experimenters, Pt.IV “Building Models and Using Them.” (John Wiley & Sons, 1978)
Cartwright, Nancy. "How the Laws of Physics Lie."
Curry, Judith and Peter Webster. “Climate Science and the Uncertainty Monster” Bull. Am. Met. Soc., V. 92, Issue 12 (December 2011)
El-Haik, Basem and Kai Yang. "The components of complexity in engineering design," IIE Transactions (1999) 31, 925-934
von Hayek, Friedrich August. "The Pretence of Knowledge," Lecture to the memory of Alfred Nobel, December 11, 1974
Jackman, Robert W. "The Predictability of Coups d'Etat: A Model with African Data." Am.Pol.Sci.Rev. 72(4) (Dec. 1978)
Petersen, Arthur Caesar. "Simulating Nature" (dissertation, Vrije Universiteit, 2006)
Swanson, Kyle L. "Emerging selection bias in large-scale climate change simulations," Geophysical Research Letters
Turney, Jon. "A model world." aeon magazine (16 December 2013)
Walker, W.E., et al. "Defining Uncertainty: A Conceptual Basis for Uncertainty Management in Model-Based Decision Support." Integrated Assessment (2003), Vol. 4, No. 1, pp. 5–17
Weaver, Warren. "Science and Complexity," American Scientist, 36:536 (1948)

23 comments:

Sophia's Favorite, March 5, 2014 at 9:23 PM
Oddly enough, Holmes misquotes that biblical reference. He actually says "bricks without clay". My sister quoted it a while back and I was sure she was wrong, but then I discovered no, Holmes (or Doyle) was, and she was quoting their misquote correctly.

Anonymous, March 6, 2014 at 9:10 AM
Fraught with danger, obviously. Peril even!

I have often claimed to be fraught with certainty, but most people insist that's a vice on my part.

roystgnr, March 6, 2014 at 11:53 AM
I usually avoid saying "thank you for this post" so as to avoid cluttering up your comments page, but this was a truly exceptional outline; thank you for this post.

Is that Schrock book the best introduction to Context Uncertainty you could recommend?

Xena Catolica, March 6, 2014 at 3:17 PM
It certainly clarifies how writing "The Wreck of the River of Stars" might be a diversion after long days at the office. Seems like going to an Aquinas conference ought to be tax deductible as professional development....

William Newman, March 6, 2014 at 4:04 PM
a few disjointed remarks:

1. It might be helpful to relate the remarks about the scope of Newton's model to the (unremarked AFAICS) scope of the Copernican model. One of the nice things about Newton's model is that it covered more things (some existing things, like comets, and some easy-to-imagine things, like Jules Verne's voyage to the moon) in a consistent way. The old astronomical models performed impressively well (especially when no one as careful as Tycho Brahe was checking them) but one of their limitations was they treated as essential distinctions things that Newton blew past.

2. You wrote "That is, there is always more than one way to skin a cat -- and more than one way to model a situation." You give examples of one sense in which this is true. There is another sense in which this is true: models can look very different and still come to similar or even exactly-equal results. E.g., Hamiltonian mechanics, Lagrangian mechanics, and Newtonian mechanics look very different, and it's not just superficial: their mathematical plumbing is so different that a problem that is easy to solve in one is hard to solve in another. But they are exactly equal, no different than Roman numerals vs. Arabic numerals. A similar precisely-the-sameness apparently exists between the Feynman, Schwinger, and Tomonaga representations of QED, but it's very hard to see: one of the things that Freeman Dyson is known for is showing it.

This apparently caused a lot of confusion in early quantum mechanics (1920 or so) as people disagreed over things which later turned out to mean the same thing in practice. I studied quantum mechanics in the 1980s, and encountered the lingering fallout from this: especially a tendency to be very impatient about possibly-superficial disagreements about representation of things, and jump quickly to asking whether there is any essential disagreement about what will be observed when a particular experiment is performed.

3. You might want to point interested readers at http://yudkowsky.net/rational/technical/ which says some useful related things (about uncertainty and partial correctness, e.g.) at a more detailed level than you are likely to, because while it tries to be easy to read it makes much heavier use of equations and numerical examples than I remember seeing on your blog.

(I think Curry is right to ask many of those uncertainty questions, but while I understand why you choose to write at the avoid-equations-and-numbers level, I don't understand why she seems to want to investigate only at the avoid-equations-and-numbers level. Besides the Yudkowsky Bayesian-lite page above, or a text like Jaynes' _Probability Theory_, the machine learning people have done a lot of quantitative inference stuff that bears on the questions of inference and attribution, importantly including quantitative information-theoretic generalizations of Occam's Razor which seem to bear rather directly on Curry's investigations of what we can sensibly conclude from climate models, and about them. See e.g. http://stellar.mit.edu/S/course/6/sp08/6.080/courseMaterial/topics/topic1/lectureNotes/lec20/lec20.pdf and Gruenwald _the Minimum Description Length principle_ for pointers into the two main ways I am aware of --- which are closely related, but not as equivalently identical as e.g. the Newton/Lagrange/Hamilton example I gave above. Curry is a professor of atmospheric sciences and coauthor of a book on thermodynamics; it seems as though the basics of either the VC or MDL approaches should be fairly lightweight math compared to her background, especially the fluid mechanics.)

Anonymous, March 7, 2014 at 11:38 AM
"Context uncertainty" calls to mind an old joke:

A lawyer objected, and the judge asked why. When the lawyer finished explaining, the judge replied. "You reason eloquently, sir, and your argument is compelling. I sincerely hope you one day try a case in which it may be relevant. Overruled."

Gyan, March 10, 2014 at 2:54 AM
Didn't the Ptolemaic model have a known but ignored problem with the varying brightness of the planets as seen from the Earth?

Gyan, March 10, 2014 at 7:24 AM
"A model is a mechanism to assign probabilities to propositions p, given evidence E:"

A rather instrumental view. A physicist would rather talk in terms of understanding. A model helps us to understand the system.

And what does "probability" mean here?
If I assign a probability, say 0.3, to a proposition X, what precisely have I done?

Unknown, May 2, 2014 at 2:26 AM
This comment has been removed by a blog administrator.

arquitecsol, January 29, 2019 at 10:29 AM
Hi.
Are you the author of the first image? (Organized simplicity, Disorganized complexity, Organized complexity), I would like to cite it on a work.