Empirical support Glymour-bootstrap style

Clark Glymour introduced his bootstrapping methodology, ca. 1980, as a way to gauge how far data (deliverances of measurement) provide relevant evidence for a hypothesis, relative to a theory. Especially striking in this conception of evidence was its application to parts of a theory: evidence for one part, obtained via other parts of the same theory, relative to that theory.

As a conception of evidence this clashed with the (Lockean) precept that probability must be proportionate to the evidence. For the relevant evidence, as defined, could be greater for a given theory T relative to itself than for a part T0 of T relative to itself, though of course the probability of T cannot be greater than the probability of T0.

By now, however, it is practically a commonplace that, in general, measurement is theory-guided: what counts as a measurement of a given quantity, or what quantity a given procedure measures (if any), is theory-relative. While some quantities count as directly measurable, theories involve theoretical quantities, and a quantity is theoretical exactly if all ways of measuring its value on a given system presuppose that the theory applies to that system.

Nevertheless, if measurement results are consistently in accord with a theory, that is accepted as empirical support for that theory. This sort of support can be good reason for accepting the theory, even if it does not correspond to how likely it is that the theory is true. 

So I propose to restate Glymour’s bootstrapping conception as a conception of empirical support. This form of support consists in the measurement of the theoretical quantities relative to the theory itself:

the initially partial, eventually full, determination of their values, from results of direct measurement, via theoretical calculations provided by the theory itself.

Focus on linear modelling

Glymour highlights examples of linear modeling. He explains the reasons for this in the course of a response to Nancy Cartwright concerning causal modeling:

“An idea almost exactly as old as the twentieth century is that, in some circumstances, correlations among measured features may be explained by postulating that each measured variable is a linear function of unobserved causes that influence two or more measured variables, and of independently distributed unobserved causes particular to that measured variable. The joint probability density on all of the variables is determined by the joint density of the unmeasured variables and by the linear coefficients. The idea is basic to factor analysis, and is still a common model in psychometrics and elsewhere in social science.” (Glymour 1999: 56)

General format of a theory

My presentation will be fairly close to Glymour’s, with terminology and some details inevitably changed when we read bootstrapping as an insight into empirical support.

As general format for a theory we choose to identify a theory T with a triple <W, Q, CT> where

W is a non-empty set (the possibilities, possible states of affairs, possible states of a certain kind of system),

Q is a set of real-valued quantities defined on W, and is closed under at least composition and polynomials, and other operations if specified,

CT is a set of basic propositions, that is, a set of equations and inequalities between members of Q.

We will refer to the members of CT as the postulates or laws of T.

Sp(T), the possibility space of T, is the set of all members of W that are solutions of all the constraints in CT.

Note: as defined, an equation q = r is true iff for all w in W, q(w) = r(w); e.g. (q + q)/2 = q. 

A data base is a set E of basic propositions of form q = E*(q), with function E* defined on a subset DE of Q, such that there is at least one possibility w in W such that E*(q) = q(w) for each q in DE. Note well that w need not be in Sp(T): data might be found that are in direct contradiction with any given theory.

Definition.  Data base E provides relevant support for basic proposition H relative to theory T exactly if E has some alternative E’ and CT some subset C0 that does not include H, such that:

(1) CT ∪ E ∪ {H} has a solution.

(2) C0 ∪ E’ has a solution.

(3) All solutions of C0 ∪ E are solutions of H.

(4) No solutions of C0 ∪ E’ are solutions of H.

Clauses (2) and (4) ensure that the result due to finding data E is significant: the data found could have been otherwise, and could then have been incompatible with H, given C0.
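The four clauses can be checked mechanically when W is finite. What follows is only a minimal sketch in Python, under assumptions that are mine and not Glymour's: the possibilities are pairs (q, s) of small integers, a basic proposition is represented by the set of possibilities in which it holds, and the postulates A and B are hypothetical. Solution sets are then intersections, clause (3) is set inclusion, and clause (4) is disjointness.

```python
from functools import reduce
from itertools import product

# Hypothetical toy setting: possibilities are pairs (q, s) of small integers.
W = frozenset(product(range(5), range(5)))

def prop(pred):
    """A basic proposition, represented as the set of possibilities where it holds."""
    return frozenset(w for w in W if pred(*w))

def sol(props):
    """Solution set of a collection of propositions: the possibilities they share."""
    return reduce(lambda a, b: a & b, props, W)

def relevant_support(CT, C0, E, E_alt, H):
    """Clauses (1)-(4) for one choice of C0 and E'; the definition asks only
    that some such choice exists."""
    assert set(C0) <= set(CT) and H not in C0
    return (len(sol(CT + E + [H])) > 0           # (1) theory, data and H are jointly satisfiable
            and len(sol(C0 + E_alt)) > 0         # (2) the alternative data were possible given C0
            and sol(C0 + E) <= H                 # (3) C0 with E entails H
            and sol(C0 + E_alt).isdisjoint(H))   # (4) C0 with E' excludes H

A = prop(lambda q, s: s == q + 1)      # hypothetical postulate
B = prop(lambda q, s: s == 3)          # hypothetical postulate, here the hypothesis H
E = [prop(lambda q, s: q == 2)]        # data found
E_alt = [prop(lambda q, s: q == 0)]    # data that might have been found instead

supported = relevant_support([A, B], [A], E, E_alt, B)
not_significant = relevant_support([A, B], [A], E, E, B)
print(supported, not_significant)
```

The second call offers E itself as the alternative, and so fails clause (4): data that could not have told against H provide no relevant support for it.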

The term “bootstrap” is apt, for when H is itself a postulate of T then the relevant support is support for part of T relative to T itself.  

Definition. Theory T has optimal relevant support by a given data base E exactly if E provides relevant support for each of T’s postulates, relative to T.

Examples to make two important points about empirical relevant support

We can call relevant support empirical if the data are measurement results of directly measurable quantities.  

(1) Less support for weaker theories

The first point to make is that a logically stronger (and so no more probable, and possibly less probable) theory may have more relevant support than a weaker one. Here are two small theories T and T0, where T implies T0, which have q and s as quantities, and q is the only directly measurable quantity.

For any quantity r we define the ‘indicator’ quantity IXr:   IXr (w) = 1 if r(w) is in X, and = 0 otherwise. 

T has postulates 

T1. q(I[0,1]q) ≤  s ≤ q

T2. s = 1

T0 has postulates 

T1.  q(I[0,1]q) ≤  s ≤ q  

T3.   0 ≤ s ≤  1

Postulate T1, which they have in common, rules out negative values for s or q, and implies that if q is in [0, 1] then q = s.

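If E = {q = 1} and the alternative evidence E’ is {q = x} with 0 < x < 1, then E provides relevant support for T2 relative to T, with C0 = {T1}. For given T1, the datum q = 1 requires s = 1, while the datum q = x requires s = x, which differs from 1.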

But neither E nor anything else can provide relevant support for T3, relative to T0. The reason is that C0 must be {T1}, since data about q alone do not entail T3, and if any data base E’ = {q = y} is offered as alternative then E’ must be consistent with T1, so that y ≥ 0. If y is in [0,1] then T1 requires s = y, which is a solution of T3. And if y > 1, then T1 implies only that 0 ≤ s ≤ y, so any value of s in [0, 1] yields a solution of {T1} ∪ E’ that is also a solution of T3. Either way clause (4) fails.
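The contrast between T and T0 can be illustrated by brute force. The sketch below is an illustration, not a proof: it assumes a finite grid of values for q and s (multiples of 1/4 from -1 to 3, my choice), treats every data base as a single value for q, and uses hypothetical helper names throughout. It checks that E = {q = 1} supports T2 relative to T, and then searches the grid for any data base, alternative, and admissible C0 that would support T3 relative to T0.

```python
from fractions import Fraction
from itertools import product

# Assumed finite grid of values: -1, -3/4, ..., 3.
GRID = [Fraction(k, 4) for k in range(-4, 13)]
WORLDS = [{"q": q, "s": s} for q, s in product(GRID, GRID)]

def indicator01(v):
    """The indicator quantity for the interval [0, 1]."""
    return 1 if 0 <= v <= 1 else 0

def T1(w): return w["q"] * indicator01(w["q"]) <= w["s"] <= w["q"]
def T2(w): return w["s"] == 1
def T3(w): return 0 <= w["s"] <= 1

def data(value):
    """The data base {q = value}, as a single proposition."""
    return lambda w: w["q"] == value

def solutions(props):
    return [w for w in WORLDS if all(p(w) for p in props)]

def relevant_support(CT, C0, E, E_alt, H):
    """Clauses (1)-(4) for one choice of C0 and alternative data."""
    sols_alt = solutions(C0 + [E_alt])
    return (bool(solutions(CT + [E, H]))                   # (1)
            and bool(sols_alt)                             # (2)
            and all(H(w) for w in solutions(C0 + [E]))     # (3)
            and not any(H(w) for w in sols_alt))           # (4)

# T: E = {q = 1}, E' = {q = 1/2}, C0 = {T1}, hypothesis T2.
support_T2 = relevant_support([T1, T2], [T1], data(1), data(Fraction(1, 2)), T2)

# T0: try every data base, every alternative, and both subsets of CT without T3.
support_T3 = any(
    relevant_support([T1, T3], C0, data(e), data(y), T3)
    for C0 in ([], [T1]) for e in GRID for y in GRID
)
print(support_T2, support_T3)
```

On this grid the search comes up empty for T3, in line with the argument just given: with C0 empty clause (3) fails, and with C0 = {T1} clause (4) fails.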

Empirical grounding: additional to support

A better, somewhat longer example is a theory with optimal relevant support, in which additions are nevertheless needed to secure complete empirical grounding.

Definition.  Theory T is empirically grounded exactly if, for each theoretical quantity q in Q there is a data set E of values of directly measurable quantities, which has a solution in Sp(T) and is such that the value of q is the same in all solutions of CT ∪ E.

What this means is that there are, for each theoretical quantity q of T, empirical conditions compatible with T under which the value of q is uniquely determined by measurement results.   

Let q, r, s be the only directly measurable quantities, and let theory T+ have postulates

P1        q = x + u

P2        r = u – z

P3        s = x + z

The postulates imply for example that q – r = s, which is an empirical prediction. But the postulates do not by themselves determine the values of the theoretical quantities x, u, z. For example the following are two solutions of all three equations, each of which implies data set E = {q = 3, r = 4, s = -1}:

            x = 1, u = 2, z = -2

            x = 1.5, u = 1.5, z = -2.5

Taking that data set, supposedly obtained by measurement, we find that it provides relevant support for P3 relative to theory T+. The subset C0 of postulates used in the calculation is {P1, P2} and the alternative to E is E’ = {q = 3, r = 4, s = 1}. For P1 and P2, together with the data that q = 3, r = 4, entail that x + z = -1, in accord with the fact that q – r = s.

We can similarly show that E provides relevant support for P1 and for P2 relative to the theory. Adding P2 and P3 we arrive at x + u = r + s, so that P1 requires q = r + s, and indeed 4 + (-1) = 3. Similarly, subtracting P3 from P1 we arrive at u – z = q – s, so that P2 requires r = q – s, and indeed 3 – (-1) = 4.

So this little theory has optimal relevant support from E.
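The three calculations can be run through the definition of relevant support directly. The following is a sketch under stated assumptions: the theoretical quantities x, u, z range over an assumed finite grid of halves from -4 to 4, C0 is always taken to be the two postulates other than the one under test, and the alternatives to E used for P1 and P2 are my own choices (only the alternative for P3 is the one given above).

```python
from fractions import Fraction
from itertools import product

# Assumed finite grid for the theoretical quantities x, u, z: -4, -3.5, ..., 4.
GRID = [Fraction(k, 2) for k in range(-8, 9)]

# Postulates of T+ as tests on a possibility (x, u, z, q, r, s).
def P1(x, u, z, q, r, s): return q == x + u
def P2(x, u, z, q, r, s): return r == u - z
def P3(x, u, z, q, r, s): return s == x + z
POSTULATES = [P1, P2, P3]

def solutions(postulates, data):
    """Grid possibilities that agree with the data (q, r, s) and the postulates."""
    return [(x, u, z, *data) for x, u, z in product(GRID, repeat=3)
            if all(p(x, u, z, *data) for p in postulates)]

def supports(H, data, alt):
    """Clauses (1)-(4), with C0 = the postulates other than H."""
    C0 = [p for p in POSTULATES if p is not H]
    sols_alt = solutions(C0, alt)
    return (bool(solutions(POSTULATES, data))              # (1); H is already a postulate
            and bool(sols_alt)                             # (2)
            and all(H(*w) for w in solutions(C0, data))    # (3)
            and not any(H(*w) for w in sols_alt))          # (4)

E = (3, 4, -1)                       # q, r, s as found
alternatives = {
    P1: (5, 4, -1),                  # assumed alternative: q might have been 5
    P2: (3, 0, -1),                  # assumed alternative: r might have been 0
    P3: (3, 4, 1),                   # the alternative used in the text
}
optimal = all(supports(H, E, alt) for H, alt in alternatives.items())
print(optimal)
```

Each alternative differs from E in just the one measured quantity that the tested postulate governs, which is what makes clause (4) come out as required.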

BUT, in this example, do we see the criterion of empirical grounding satisfied for quantity z, in the conditions in which E is the set of data actually found?  That would require:

(*) There is some subset C00 of CT such that all solutions of C00 ∪ E assign the same value to z.

In this case data set E assigns values to each of the directly measurable quantities. Therefore failure of this condition implies that even under perfect conditions for measurement, the value of z may not be determined.

That this is in fact the case is illustrated above: given data set E, it is still possible that z has either value -2 or -2.5 (and perhaps others).

As it stands, the theory is not strong enough to design a measurement procedure to determine the value of z on the basis of the values of the directly measurable quantities.  

But if the theory is made just a little stronger, the criterion can be satisfied. From P1 and P2 we can deduce that q + r = x + 2u – z. So the following postulate, if added, forces u to be 0:

            P4        q + r = x – z

If we let that postulate be in C0 as well as the other three, then, with the data that q = 3 and r = 4 we see that besides x + z = -1 we have that x – z = 7, hence z = -4.
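Since the postulates are linear, whether the data fix the value of z can be settled by exact elimination. The sketch below is my own illustration, with hypothetical helper names: it substitutes the data q = 3, r = 4, s = -1 into the postulates, row-reduces the resulting equations in x, u, z, and reports a value for an unknown only if that value is the same in every solution. It assumes the system is consistent, as it is here.

```python
from fractions import Fraction

def rref(rows):
    """Reduced row echelon form of an augmented matrix, in exact arithmetic."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivot_row = 0
    for col in range(len(m[0]) - 1):
        found = next((i for i in range(pivot_row, len(m)) if m[i][col] != 0), None)
        if found is None:
            continue                      # no pivot in this column
        m[pivot_row], m[found] = m[found], m[pivot_row]
        pivot = m[pivot_row][col]
        m[pivot_row] = [v / pivot for v in m[pivot_row]]
        for i in range(len(m)):
            if i != pivot_row and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[pivot_row])]
        pivot_row += 1
    return m

def determined_value(rows, index):
    """Value of the unknown at `index` if all solutions agree on it, else None."""
    for row in rref(rows):
        coeffs = row[:-1]
        if coeffs[index] == 1 and all(c == 0 for j, c in enumerate(coeffs) if j != index):
            return row[-1]
    return None

# Columns: x, u, z | right-hand side, with the data q = 3, r = 4, s = -1 substituted.
P1 = [1, 1, 0, 3]     # x + u = q
P2 = [0, 1, -1, 4]    # u - z = r
P3 = [1, 0, 1, -1]    # x + z = s
P4 = [1, 0, -1, 7]    # x - z = q + r

z_without_P4 = determined_value([P1, P2, P3], 2)
z_with_P4 = determined_value([P1, P2, P3, P4], 2)
u_with_P4 = determined_value([P1, P2, P3, P4], 1)
print(z_without_P4, z_with_P4, u_with_P4)
```

Without P4 the reduction leaves one free parameter, so no value is returned for z; with P4 the same data yield z = -4 and u = 0.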

So we see in this example that to achieve empirical grounding, even optimal relevant support is not enough. A theory has to be logically strong enough to design the proper measurement procedures, that is, procedures that count as measurements of its quantities relative to itself. And so a stronger theory may gather empirical support that a weaker theory, or one of its sub-theories, cannot have. But it needs to be very strong to achieve (complete) empirical grounding.

Nature has the last word

Something more is required for a good theory:

Concordance.  If q is a theoretical quantity of T then all data sets that are actually obtained as measurement results will yield the same value for this theoretical quantity relative to T.

Obviously, concordance cannot be guaranteed. Whether there is concordance is strictly an empirical question.

REFERENCES

Glymour, Clark (1980) Theory and Evidence. Princeton University Press: Princeton.

Glymour, Clark (1999) “Rabbit hunting”. Synthese 121: 55-78.

Van Fraassen, Bas C. (2012) “Modeling and measurement: the criterion of empirical grounding”, Philosophy of Science 79: 773-784. 
