# The Hierarchy Of Models: From Causal (Best) To Statistical (Worst)

Written by William M Briggs

There is a hierarchy of models in the sense they offer insight into the thing modeled. The order of importance is: causal, deterministic, probabilistic, statistical. Most models use mixtures of these elements.

All models have this form: a set of premises, which include any number of facts, truths, suppositions, data, and so forth, and a proposition of interest, which is the thing being modeled conditional on those premises.

A classic (or perhaps better, as you’ll agree, *classical*) causal model is “Socrates is mortal” given “All men are mortal and Socrates is a man.” The model predicts Socrates will die be*cause* of the *nature* of all men. It is man’s nature to die, and Socrates (and you, dear reader) is among the race of men. We know *all* men are mortal from the necessarily limited sample of observations of past men, and from the induction of these dead men to the entire race.

Causal models give insight into or make use of the nature, the universal essence, of the thing of interest. Causal models require universals; they also require induction because we know the validity of all universals, natures, essences, through a type of induction.

Deterministic models are common in mathematics and are usually stated in a form such that the proposition of interest is a function of premises, like this: y = f(x). The “x” is a placeholder for any number of premises. An example of a functional form of a deterministic model is y = a + bx^{3}, which shows there are three explicit premises, the “a”, “b”, and x^{3}, and one implicit, which is the form or arrangement of these premises. This equation might give the numerical level of some thing as a function of a, b, and x. It says, “Given a, b, and x, y will *certainly* be at a + bx^{3}.” The equation *determines* y, but doesn’t explain the essence of the cause.

Some causal models may be put in equation form, but not all deterministic models are also causal. The equation given applies to a black box with two readouts, a “y” and “x”, and a dial is discovered to change the “x”. The formula is induced based on rotating the dial and noting the values of y and x. Only in the weakest sense can we say we have discovered the essence of the machine: we don’t even know what the values imply. Interestingly (and obviously to mathematical readers), more than one equation can be found to fit the same data (premises), which is also proof we have not learned the nature of the machine.
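The point about rival equations can be made concrete with a small sketch. The values a = 1 and b = 2, the three dial settings, and the function names are all assumptions chosen for illustration; nothing here comes from a real machine. The second formula adds a term that vanishes at exactly the observed settings, so both formulas reproduce every reading yet disagree about the next one.

```python
def cubic(x: int) -> int:
    """The deterministic model y = a + b*x^3, with assumed values a = 1, b = 2."""
    return 1 + 2 * x**3


def rival(x: int) -> int:
    """A different formula that matches `cubic` at x = 0, 1, 2.

    The extra term 5*x*(x-1)*(x-2) is zero at exactly those three settings.
    """
    return cubic(x) + 5 * x * (x - 1) * (x - 2)


# Hypothetical dial settings at which the black box was observed.
observed_dial_settings = [0, 1, 2]

# Both formulas fit every observation...
agree_on_observations = all(cubic(x) == rival(x) for x in observed_dial_settings)

# ...but they predict different readings at a setting not yet tried.
disagree_elsewhere = cubic(3) != rival(3)

print(agree_on_observations, disagree_elsewhere)  # True True
```

Since the observations alone cannot pick between `cubic` and `rival`, fitting the readouts is not the same as knowing how the machine works.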

Probabilistic models abound. Given “This is a two-state object and only one state of s_{1} or s_{2} may show at any time”, the probability “The object is in state s_{1}” is 1/2. Note carefully that no such real object need exist; and neither must real objects exist for causal or deterministic models, as should be obvious.

There isn’t any understanding of essence or nature of this object in this probability model: we don’t know the workings. If we did, we’d have a deterministic or causal model. The probability is thus only a measure of our state of knowledge of the truth of the proposition and not of the essence of the object. Probability models are silent on cause.
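The two-state example can be written as a short sketch in which the probability is deduced from the premises alone. The helper `probability` and the labels `"s1"` and `"s2"` are hypothetical, and counting the states symmetrically is an assumption about how to encode the premise, not a claim about any real object.

```python
from fractions import Fraction
from typing import Callable, Sequence


def probability(states: Sequence[str], proposition: Callable[[str], bool]) -> Fraction:
    """Probability of a proposition given only the list of possible states.

    The premises say nothing that favors one state over another, so each
    state counts once (an assumption of this sketch).
    """
    matching = [s for s in states if proposition(s)]
    return Fraction(len(matching), len(states))


# Premise: a two-state object, and only one of s1 or s2 may show at any time.
states = ["s1", "s2"]

# Proposition of interest: "The object is in state s1".
p_s1 = probability(states, lambda s: s == "s1")

print(p_s1)  # 1/2
```

Nothing in the code describes how the object works. Change the premises (say, add a third state) and the number changes, which is what one expects of a measure of knowledge rather than of essence.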

The last and least are statistical models. These are always *ad hoc*; they conflate probability with decision, or mistake probability for essence. Statistical models are a prominent cause of the vast amount of over-certainty which plagues science.

Statistical models purport to say that x causes y, or that x is “linked to” y, through the mechanism of hypothesis testing, via frequentist p-values or Bayes factors. But though x may really be a cause of y, or x really may be linked to y in some essential way, the statistical judgment that these conditions are so is always a fallacy.

Hypothesis testing conflates decision with probability; nothing in any hypothesis test gives the desired probability “Given x, what is the probability of y”; instead, testing says, based on *ad hoc* criteria, x and y are mysteriously related (“linked”) or that x causes y. These inferences are *never* valid. The importance of this logical truth cannot be overstated. This is why so many statistical models report false results. (A reminder that a logical argument can be invalid but still have a true conclusion; the conclusion is just true for other reasons than the stated argument.)

Lastly, statistical models purport to report “effect size”, which is a measure of the importance of x on y. This “effect size” is always either false or an assertion given far too much confidence (I use this word in its plain-English sense). Effect sizes say something about a premise inside x (a parameter or parameters) and not x itself, hence they are always over-certain. This form of over-certainty is eliminated by moving to a probability model.
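A rough sketch of the gap between certainty about a parameter and certainty about an observable follows. Everything in it is an assumption made for illustration: two groups of equal size `n`, normal observations with a known common standard deviation `sigma`, a flat prior on each group mean, and invented numbers for the observed difference. It is one simple way to show the contrast, not a general method.

```python
from math import sqrt
from statistics import NormalDist


def parameter_vs_predictive(mean_diff: float, sigma: float, n: int) -> tuple[float, float]:
    """Return (Pr(difference in group means > 0), Pr(new obs in group 1 > new obs in group 2)).

    Assumes normal data, known sigma, equal group sizes n, and flat priors.
    """
    # Uncertainty in the parameter (the difference in means) shrinks as n grows.
    param_sd = sigma * sqrt(2.0 / n)
    # Uncertainty in the observable keeps the noise of each new observation.
    pred_sd = sigma * sqrt(2.0 + 2.0 / n)
    p_param = 1.0 - NormalDist(mean_diff, param_sd).cdf(0.0)
    p_pred = 1.0 - NormalDist(mean_diff, pred_sd).cdf(0.0)
    return p_param, p_pred


# Invented inputs: observed difference 0.5, sigma 1, 100 observations per group.
p_param, p_pred = parameter_vs_predictive(mean_diff=0.5, sigma=1.0, n=100)

print(f"parameter: {p_param:.4f}  observable: {p_pred:.4f}")
```

Under these assumptions the probability for the parameter comes out very near 1, while the probability that a new observation in one group exceeds a new observation in the other is well below that. Reporting only the first invites the over-certainty described above.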

More about this topic in the must-read get-it-now don’t-do-another-analysis-without-reading-it *Uncertainty: The Soul of Modeling, Probability & Statistics*.