Stalnaker’s Thesis, that the probability of a conditional is the conditional probability of the consequent given the antecedent, quickly ran into serious trouble, raised in the first instance (famously) by David Lewis.
When I took issue with David Lewis’s triviality results, Robert Stalnaker wrote me a letter in 1974, later published as Stalnaker (1976). There he showed that my critique of Lewis did not save his Thesis when applied to his own logic of conditionals (logic C2).
Stalnaker proved, without relying on Lewis’ special assumptions:
If the logic of conditionals is C2, and for all statements A and B, P(A → B) = P(B|A) when defined, then there are at most two disjoint propositions with probability > 0.
At first blush this proof must raise a problem for a result I had presented, namely:
Theorem. Any antecedently given probability measure on a countable field of sets can be extended into a model structure with probability, in which Stalnaker’s Thesis holds, while the field of sets is extended into a probability algebra.
This theorem does not hold for a language of which the logic is Stalnaker’s C2. Rather, it can be presented equivalently as a result for a language that has the same syntax as C2, but has a weaker logic, that I called CE.
While Stalnaker acknowledged that his proof was specifically for C2, and did not claim that it applied to CE, neither he nor I showed then just how the difference between the two logics resolves the apparent tension.
Here I will show just how Stalnaker’s triviality argument does not hold for CE, with a simple counterexample.
2. Stalnaker’s Lemma
Stalnaker’s argument relies on C2 at the following point, stated without proof, which I will call his Lemma.
Definition. C = A v (~A & (A → ~B))
Lemma. ~C entails C → ~(A & ~B)
We may note in passing that these formulas can be simplified using principles that hold in both C2 and CE, for sentences A and B that are neither tautologies nor contradictions. Although I won’t rely on this below, let’s just note that C is then equivalent to [A v (A → ~B)] and ~C to [~A & (A → B)].
3. The CE counter-example to the Lemma
I will show that this Lemma has a counter-example in the finite partial model of CE that I constructed in the post “Probabilities of Conditionals: (1) Finite Set-Ups” (March 29, 2021).
The propositions are sets of possible outcomes of a tossed fair die, named just by the numbers of spots that are on the upper face. To begin we take propositions
p = {1, 3, 5} “the outcome is odd”
q = {1, 2, 3} “the outcome is low”
The probability of (p → q) will be P(q|p) = P({1, 3})/P({1, 3, 5}) = 2/3. That is the clue to the construction of the selection function s(x, p) for worlds x = 1, 2, 3, 4, 5, 6.
In this model the choices are these. First of all if x is in p then s(x, p) = x. For the other three worlds we choose:
s(2, p) = 1, s(4, p) = 3, s(6, p) = 5
Thus (p → q) is true in 1 and 3, which belong to (p ∩ q), and also in 2 and 4, but not in 5 or 6.
Hence (p → q) = {1, 3, 2, 4}, “if the outcome is odd then it is low”, which has probability 2/3 as required.
Similarly we see that (p → ~q) = {5, 6}.
To test Stalnaker’s Lemma we define:
c = p ∪ (~p ∩ (p → ~q))
= {1, 3, 5} ∪ ({2, 4, 6} ∩ {5, 6})
= {1, 3, 5} ∪ {6}
= {1, 3, 5, 6} “the outcome is odd or 6”, or “the outcome is neither 2 nor 4”
~c = {2, 4} “the outcome is 2 or 4” (the premise of the Lemma)
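To make this easy to check, here is a minimal sketch in Python (the encoding and variable names are mine, not part of the post): it recomputes (p → q), (p → ~q), c and ~c from the selection function given above.

```python
# Stage 1: worlds are the die outcomes 1..6, each with probability 1/6.
from fractions import Fraction

worlds = {1, 2, 3, 4, 5, 6}
p, q = {1, 3, 5}, {1, 2, 3}
sel_p = {1: 1, 3: 3, 5: 5, 2: 1, 4: 3, 6: 5}          # s(x, p) as chosen above

def prob(prop):
    return Fraction(len(prop), len(worlds))

p_to_q    = {x for x in worlds if sel_p[x] in q}      # (p -> q)  = {1, 2, 3, 4}
p_to_notq = {x for x in worlds if sel_p[x] not in q}  # (p -> ~q) = {5, 6}
c = p | ((worlds - p) & p_to_notq)                    # = {1, 3, 5, 6}

assert prob(p_to_q) == Fraction(2, 3)                 # P(p -> q) = P(q|p)
assert worlds - c == {2, 4}                           # ~c, the Lemma's premise
```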
Now proposition c has four members, and that means that in the construction of the model we need to go to Stage 2. There the original 6-world model is embedded in a 60-world model, with each possible outcome x replaced by ten worlds x(1), …, x(10). These are the same as x, except that the selection function can be extended so as to evaluate new conditionals. The previously determined choices for the selection function carry over. For example, s(4(i), p) = 3(i), so (p → q) is true in each world 4(i), for i = 1, …, 10.
We refer to the set {x(1), …, x(10)} as [x]. So in this stage,
c = [1] ∪ [3] ∪ [5] ∪ [6]
The conclusion of the Lemma is:
c → ~(p ∩ ~q) = c → ~[([1] ∪ [3] ∪ [5]) ∩ ([4] ∪ [5] ∪ [6])]
= c → ~[5] “If the outcome is either odd or 6 then it is not 5”
What must s(x, c) be? The way to determine that is to realize again that each of the four cells [1], [3], [5], [6] that make up c must have probability 1/4 conditional on c. Probability 1/4 equals 15/60, so for example (c → [1]) must have 15 members.
Since [1] is part of c, we must set s(1(1), c) = 1(1), and so forth, through s(1(10), c) = 1(10). Similarly for the other members of c.
To finish the construction we need to get up to 15, so we must choose five worlds y not in [1] such that s(y, c) is in [1]. Similarly for the rest. To do so is fairly straightforward, because we can divide up the members of [2] and [4] into four bunches of five worlds each:
s(2(i), c) = 1(i) for i = 1, …, 5
s(2(j), c) = 3(j) for j = 6, …, 10
s(4(i), c) = 5(i) for i = 1, …, 5
s(4(j), c) = 6(j) for j = 6, …, 10
Now each conditional c → [x] is defined at each of the 60 worlds, and has probability 1/4 for x = 1, 3, 5, 6.
The Lemma now amounts to this, in this model:
~c implies c → ~[5]
or, explicitly,
[2] ∪ [4] ⊆ ([1] ∪ [3] ∪ [5] ∪ [6]) → ~[5]
For a counter-example we look at a specific world in which ~c is true, namely world 4(1). Above we see that s(4(1), c) = 5(1). Therefore in that world the conditional c → {5(1)} is true, and hence also c → [5], which is contrary to the conclusion of the Lemma.
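The same can be done for Stage 2 (again my encoding, not part of the post): the sketch below rebuilds s(·, c) as just specified, confirms that each (c → [x]) has the required 15 worlds, and exhibits the counterexample world 4(1).

```python
# Stage 2: worlds are pairs (x, i) with x the die outcome and i in 1..10.
worlds = {(x, i) for x in range(1, 7) for i in range(1, 11)}
c_cells = (1, 3, 5, 6)

sel_c = {}
for i in range(1, 11):
    for x in c_cells:
        sel_c[(x, i)] = (x, i)                    # worlds in c select themselves
    sel_c[(2, i)] = (1, i) if i <= 5 else (3, i)  # s(2(i), c)
    sel_c[(4, i)] = (5, i) if i <= 5 else (6, i)  # s(4(i), c)

for x in c_cells:
    cond = {w for w in worlds if sel_c[w][0] == x}   # (c -> [x])
    assert len(cond) == 15                           # 15/60 = 1/4, as required

# World (4, 1) makes ~c true, yet s((4, 1), c) = (5, 1), so (c -> [5]) is
# true there, contrary to the Lemma's conclusion.
assert sel_c[(4, 1)] == (5, 1)
```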
4. Conclusion
To recap: in this finite partial model of CE the examined instance of Stalnaker’s Lemma amounts to:
Premise. The outcome is either 2 or 4
Conclusion. If the outcome is neither 2 nor 4 then it is not 5 either
And the counter-example is that in this tossed die model there is a certain world in which the outcome is 4, but in which the relevant true conditional is that if the outcome is not 2 or 4 then it is 5.
Of course, given that the Lemma holds in C2, this partial model of CE is not a counter-example to Stalnaker’s argument as it applies to his logic C2 or its extensions. It just removes the apparent threat to CE.
REFERENCES
Stalnaker, Robert (1976) “Stalnaker to van Fraassen”. Pp. 302-306 in W. L. Harper and C. A. Hooker (eds.) Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science. Dordrecht: Reidel.
At first blush these two topics may seem entirely unrelated. Stalnaker’s Thesis is that the probability of (If A then B) is the conditional probability of B given A. A Moore Statement (one that instantiates Moore’s Paradox) is a statement that could be true, but could not be believed.
But the two get closer when we replace the intuitive notion of belief with subjective probability. Then there are two kinds of Moore Statements to be distinguished: An Ordinary Moore Statement is one that could be true, but cannot have probability one. A Strong Moore Statement is one that could have positive probability, but could not have probability one.
When we introduce conditionals with Stalnaker’s Thesis there are Moore Statements in our language. I will give examples of both sorts, and indicate why they are important.
Example 1. Imagine the following situation:
1. The match is not struck.
2. The match is wet.
3. It is not the case that if the match is struck, it will burn.
On the basis of lines 1 and 2 we can give several warrants for line 3. Following Stalnaker, we could assert that if the match is struck, it will not burn, because it is wet. And then equivalently, it is not the case that if the match is struck then it will burn. Obviously Lewis might reject this reasoning. However, Lewis would then say that the match might or might not burn if struck. But that also implies that it is not the case that the match will burn if struck.
Now define:
X = (the match is not struck, and it is not the case that if the match is struck, it will burn)
The above imaginary scenario shows that X could be true. But X could not have probability one.
Let’s use P for probability as usual. If a conjunction has probability 1, so do its conjuncts. Thus if P(X) = 1 then P(the match is not struck) = 1, and P(the match is struck) = 0.
So then P(if the match is struck it will burn) = P(the match burns | the match is struck) is either 1 or undefined. (This depends just on the convention adopted for probability conditional on a proposition with probability 0.) Hence P(It is not the case that if the match is struck, it will burn) either equals 0 or is not defined. And accordingly, that is so for X as well: P(X) = 0 or P(X) is undefined.
Therefore X is an Ordinary Moore Statement.
To display an example of a Strong Moore Statement, we need to show that such a statement can have positive probability. For this we can use a numerical example.
Example 2. Tosses with a fair die.
The basic statements involved are just about the outcome of a toss, and each outcome has probability 1/6. Define:
A = the outcome is either two or six. True in possibilities {2, 6}
~ A = the outcome is neither two nor six = the outcome is either odd or 4. True in possibilities {1, 3, 5, 4}
B = the outcome is six. True in possibilities {6}
Y = (~ A and it is not the case that if A then B)
The probability of ~A is 4/6.
By Stalnaker’s Thesis, the probability of the conditional (if A then B) equals P(B | A) = 1/2 = 3/6. So the negation of that conditional also has probability three out of six: P(~(if A then B)) = 3/6.
The probability of the disjunction of ~A and ~(if A then B) is the sum of the probabilities of the disjuncts minus the probability of their conjunction, which is Y itself. This cannot be greater than 1. So (3/6) + (4/6) – P(Y) is less than or equal to 1.
It follows that the conjunction Y has probability greater than or equal to 1/6.
By the same argument as in Example 1, mutatis mutandis, Y cannot have probability 1.
Therefore, Y is a Strong Moore Statement.
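The bound can be checked in a concrete model of the kind constructed in the first section above. The selection function below is one possible choice of my own, not given in the text; any choice that makes P(A → B) = 1/2 will do.

```python
# Six equiprobable worlds; sel_A[x] encodes s(x, A), one admissible choice.
from fractions import Fraction

worlds = {1, 2, 3, 4, 5, 6}
A, B = {2, 6}, {6}
sel_A = {2: 2, 6: 6, 1: 6, 3: 6, 4: 2, 5: 2}

A_to_B = {x for x in worlds if sel_A[x] in B}       # (A -> B) = {1, 3, 6}
Y = (worlds - A) - A_to_B                           # ~A and ~(A -> B) = {4, 5}

assert Fraction(len(A_to_B), 6) == Fraction(1, 2)   # Stalnaker's Thesis: P(B|A)
assert Fraction(len(Y), 6) >= Fraction(1, 6)        # here P(Y) = 1/3
```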
Remark 1, re belief.
In the scenario for Example 1 the natural reaction is to say that we can believe X. That seems right, and if so, it shows that the intuitive notion of belief does not imply subjective probability 1. There are other reasons to suggest that belief requires only a “sufficiently high” subjective probability (cf. Eva, Shear, and Fitelson 2022). This does have the drawback that no single number is high enough for all examples (the lottery paradox), so that belief must be context-sensitive.
Remark 2, re closure under conditionalization.
In an earlier post (Conditionals, Probabilities, and ‘Or to If’ 12/07/2022) I presented the argument that,
for any given domain, the set of probability functions that satisfy Stalnaker’s Thesis is not closed under conditionalization.
The argument was rather abstract, and what it lacked were good, concrete examples. Examples 1. and 2. above fill that gap.
There was a similar situation with the Reflection Principle for subjective probability. A probabilistic Moore Statement is one that is not self-contradictory, one that can even have a positive probability, but you cannot conditionalize on it, because it cannot have probability 1. That there are such statements entails that the set of probability functions which satisfy the principle, on a given domain, is not closed under conditionalization.
For a discussion of how Moore’s Paradox is related to closure under conditionalization see also my post “A Brief Note on the Logic of Subjective Probability” (07/24/2019).
Remark 3, about triviality results
Lewis’ famous triviality result for Stalnaker’s Thesis assumed that the set of admissible probability functions on a model is closed under conditionalization, and indeed, that it should be. The above examples show that this assumption of Lewis’ precludes Stalnaker’s Thesis from the outset.
Similarly for the other triviality results that I have seen.
REFERENCES
Eva, Benjamin; Ted Shear; and Branden Fitelson (2022) “Four Approaches to Supposition”. philsci-archive.pitt.edu/18412/7/fats.pdf
In his new book The Meaning of If Justin Khoo discusses the inference from “Either not-A or B” to “If A then B”. Consider: “Either he is not in France at all, or he is in Paris”. Who would not infer “If he is in France, he is in Paris”? Yet, who would agree that “if … then” just means “either not … or”, the dreaded material conditional?
I do not want to argue either for or against the validity of the ‘or to if’ inference. The curious fact is that just thinking about it brings out something very unusual about conditionals. Perhaps it will have far reaching consequences for the concept of logical entailment.
To set out the traditional concept of entailment let A be a Boolean algebra of propositions and P(A) the set of all probability measures with domain A. I will use “&” for the meet operator. Then entailment, as a relation between propositions, can be characterized in three different ways, which are in fact, in this case, equivalent:
(1) the natural partial ordering of A, with (a ≤ b) defined as (a & b) = a.
(2) For all m in P(A), if m(a) = 1 then m(b) = 1.
(3) For all m in P(A), m(a) ≤ m(b).
The argument for their equivalence, which is spelled out in the Appendix, requires just two facts about P(A):
P(A) is closed under conditionalization, that is, if m(a) > 0 then m(. |a) is also in P(A), if defined.
If a is a non-zero element of A then there is a measure m in P(A) such that m(a) > 0.
Enter the Conditional: the ‘Or to If’ Counterexample
The Thesis, aka Stalnaker’s Thesis, is that the probability of conditional (a → b) is the conditional probability of b given a, when defined:
m(a → b) = m(b|a) = m(b & a)/m(a), if defined.
Point: if the special operator → is added to A with the condition that m(a → b) = m(b|a) when defined, then these three candidate definitions are no longer equivalent. For:
(4) For all m in P(A), if m(~a v b) = 1 then m(b|a) = 1
(5) For many m in P(A), m(~a v b) > m(b|a)
For (4) note that if m(~a v b) = 1 then m(a & ~b) = 0, so m(a) = m(a & b). Therefore m(b|a) = 1. So on the second characterization of entailment, the ‘or to if’ inference is valid. If you are sure of the premise you will be sure of the conclusion.
But not so for the third characterization of entailment. For (5) take this example (I will call it the counterexample): we are going to toss a fair die:
Probability that the outcome will be either not even or six (i.e. in {1, 3, 5, 6}) = 4/6 = 2/3.
Probability that the outcome is six, given that the outcome is even = 1/3.
So in this context the traditional three-fold concept of entailment comes apart.
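A quick numerical check of the counterexample (the encoding is mine):

```python
from fractions import Fraction

worlds = {1, 2, 3, 4, 5, 6}           # fair die
a = {2, 4, 6}                         # "the outcome is even"
b = {6}                               # "the outcome is six"

m_material = Fraction(len((worlds - a) | b), 6)   # m(~a v b) = 2/3
m_conditional = Fraction(len(a & b), len(a))      # m(b|a)    = 1/3
assert m_material > m_conditional                 # (5): the ordering fails
```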
Losing Closure Under Conditionalization
Recalling that to prove the equivalence of (1)–(3) for a Boolean algebra we needed just two assumptions, we can use that, together with the counterexample, to draw a conclusion that holds for any logic of conditionals with Stalnaker’s Thesis.
Let A→ be a Boolean algebra with additional operator →. Let P(A→) be the set of all probability measures on A→ such that m(a → b) = m(b|a) when defined. Then:
Theorem. If for every non-zero element a of A→ there is a member m of P(A→) such that m(a) > 0 then P(A→) is not closed under conditionalization.
I was surprised. Previous examples of such lack of closure were due to special principles like Miller’s Principle and the Reflection Principle.
I do not think this result looks really bad for the Thesis, though it needs to be explored. It does mean that from a semantic point of view, there are in the same set-up two distinct logics of conditionals.
However, it seems to look bad for the Extended Thesis (aka ‘fully resilient Adams Thesis’):
(*) m(A → B| E) = m(B | E & A) if defined
For if we look at the conditionalization of m on a proposition X, namely the function m*(. | ..) = m(. | .. & X), then if m* is well defined and m satisfies (*) we get
m*(A → B| E) = m(A → B| E & X) = m(B | E & A & X) = m*(B| E & A)
that is, m* also satisfies the Extended Thesis. So it appears that the Extended Thesis entails or requires closure under conditionalization for the set of admissible probability measures.
But it can’t have it, in view of the ‘or to if’ counterexample.
Appendix.
That (1)–(3) are equivalent for a Boolean algebra (with no modal operators).
Clearly, if (a & b) = a then m(a) ≤ m(b), and hence also that if m(a) = 1 then m(b) = 1. This includes the case of a = 0.
So I need to show that if the first relation does not hold, that is, if it is not the case that a ≤ b, then neither do the other two.
Note: I will make use of just two features of P(A):
P(A) is closed under conditionalization, that is, if m(a) > 0 then m(. |a) is also in P(A), if defined.
If a is a non-zero element of A then there is a measure m in P(A) such that m(a) > 0.
Lemma. If it is not the case that (a & b) = a then there is a measure p such that p(a & ~b) > 0 while p(b & ~a) = 0.
For if (a & b) is not a then (a & ~b) is a non-zero element. Hence there is a measure m such that m(a & ~b) > 0, and so also m(a) > 0. So m(.|a) is well defined. And then m(a & ~b|a) > 0 while m(b & ~a|a) = 0.
Ad condition (3): Suppose now that (a & b) is only part of a, and m(a & ~b) > 0. Then m(a) > 0, so m(. |a) is well defined and in P(A). Now m(b|a) = m(b & a)/[m(a & ~b) + m(b & a)], hence < 1, hence < m(a|a) = 1.
Ad condition (2): All we have left now to show is that if (a & b) is not a, and a is not 0, then condition (2) does not hold either. But that follows from what we just saw: there is then a member m of P(A) such that m(a) > m(b & a). So consider the measure m(.|a), which is also in P(A): m(b|a) < 1, while of course m(a|a) = 1.
Bell’s Inequalities, which are satisfied by the probabilities of results conditional on experimental set-ups of a traditional sort, are famously violated by the results of certain quantum mechanical experiments. It is therefore remarkable that those inequalities could be deduced from certain putative principles about causality and locality. It would seem that their violation could be taken as refuting those principles.
But even more remarkable were the signs that Bell’s Inequalities could be deduced logically, given certain principles seriously proposed for the logic of conditionals. (See Bibliography: Stapp 1971, Eberhard 1977, Herbert and Karush 1978.) My project here is to examine that deduction, and the options there are for a philosophical response. Is it the logic of conditionals that is at fault? Or is it an understanding of conditionals and their logic that was loaded with philosophical realist presuppositions?
Bell’s Inequalities relate conditional probabilities, of results given certain measurement set-ups. Therefore, if they are to be approached in this way, it can only be in a logic compatible with a bridge principle that relates conditionals and conditional probabilities. The bridge principle introduced by Robert Stalnaker, known generally as Stalnaker’s Thesis, or more recently as just The Thesis, was this:
Thesis. P(A → B) = P(B | A)
Its combination with Stalnaker’s logic of conditionals turned out to be inadequate (it would allow only trivially small models). David Lewis rejected the Thesis, but also pointed the finger at what he took to be a mistaken principle about conditionals:
Conditional Excluded Middle (CEX): [A → (B v C)] is equivalent to [(A → B) v (A → C)]
The Thesis and CEX do indeed go together, for one needs to accommodate the fact about conditional probability that if B and C are contraries then P(B v C | A) = P(B | A) + P(C | A).
(CEX is what Paul Teller called the Candy Bar Principle; see previous post about this.)
There is however a logic of conditionals, which I called CE, somewhat weaker than Stalnaker’s, which includes CEX yet combines successfully with the Thesis. It has only infinite models, but ‘small’ situations, like experimental set-ups, can be modeled with partial structures that are demonstrably extendable to models of CE combined with the Thesis, for all sentences, regardless of complexity. (See previous posts on probabilities of conditionals.)
So, we are in a position to examine the arguments that putatively lead from the logic of conditionals to Bell’s Inequalities.
The first principle for any logic of conditionals is that if A implies B then A → B is true. This yields at once the theorem that A → A is always true. The second is Modus Ponens: if A and A → B are both true then B is true. Beyond this, there be controversy.
The logic CE adds CEX, stated above, as well as two more principles about conjunction. Stated as a theory of propositions (rather than the sentences that express them):
(I) Conditional Excluded Middle: A → (B v C) = (A → B) v (A → C)
(II) Conjunction Distribution: A → (B & C) = (A → B) & (A → C)
(III) Modus Ponens Amplified: A & (A → B) = (A & B)
The last includes Modus Ponens, but adds something: if A and B are both true, then there is no question about whether B would be true if A were true, of course it would, because it is.
How are situations, like experimental set-ups, modeled? The standard way of doing this in philosophical logic is to say that each proposition is either true or false in each possible world, and the proposition can be identified with (or at least, represented by) the set of worlds in which it is true. Of course, ‘possible worlds’ is a metaphor, there is just the set-up and the possible results of doing the experiment.
As a simple example suppose a die is to be tossed. We have a hypothesis: the die is fair, and all outcomes have the same probability. So as possible worlds we just take the sequences <toss, outcome> of which there are six. In <toss, 1> the statement ‘the outcome is 1’ is true, and so forth.
Now the conditionals we are interested in are such as these:
A → B ‘if the outcome is odd, it is less than 4’.
That is true in <toss, 1> and in <toss, 3>. But where else is it true? To satisfy Stalnaker’s Thesis the conditional must have probability 2/3, that is, 4/6, since that is the probability of ‘less than 4’ conditional on ‘odd’. So the conditional must be true in two other worlds besides those.
So we have in the model a function s: given non-empty antecedent A and world w, the world s(A, w) is a world in which A is true. Intuitively, s(A, w) is the way things would have been, had A been true. The proposition A → B is then the set of worlds {w: s(A, w) is in B}.
Elsewhere (see Notes at the end) you can see the details for this very simple experimental set-up, modeled with conditionals and probabilities. Just to give you the idea, there the proposition A → B that we have here as example is true in worlds <toss, 1>, <toss, 3>, <toss, 2>, <toss, 4>. So, in this model, if the outcome is actually 4, then the outcome would have been less than 4 if it had been odd. There is no rationale for this: it is just how things are in that possible universe of worlds. We might be living in it, or in another one; that is not up to us.
The important point is this: for any such situation we can construct a representation that is extendible to a model of CE in which the Thesis holds for all propositions.
For the original Einstein-Podolsky-Rosen thought experiment David Bohm designed an experiment that would be technically feasible, in which a pair of photons are emitted from a source in an entangled state.
For us, what we need to display is only the ‘surface’ of the experimental set-up, with some notes to help the imagination; we do not need to look into the quantum mechanics.
Look at the left side, labeled (A) for ‘Alice’. It is a device in which there is a polarization filter, which may or may not be passed by an incoming particle. A red light goes on if it does pass, a green light if nothing passes. That filter has three different orientations, and one is chosen beforehand by the experimenter, or by a randomizing device. Similarly for the right hand side, labeled (B) for ‘Bob’.
The experimental facts are these: if we only look at one side, left or right, then regardless of the setting, the red light goes on in exactly 50% of the runs. But if we look at both, we see that if the two settings are the same, then the red lights never turn on at the same time (Perfect (anti)Correlation). Furthermore, there are specific probabilities for the red lights turning on at the same time, for any pair of settings. These are conditional probabilities: P(red light for Alice and red light for Bob | Alice chose setting i and Bob chose setting j). It is these for which the Bell Inequalities may or may not hold.
With reference to the above diagram, let’s refer to Alice as the one on the left (L) and Bob as the one on the right (R). The two outcomes, red light on and green light on, I will refer to as outcomes 1 and 0. And the settings are settings 1, 2, 3. Let little letters like “a” and “b” be variables over {1, 0}, and “i” and “j” over {1, 2, 3}. Then we can symbolize:
On the left the setting is i: Li
On the left the setting is i and the outcome on the left is a: Lia
On the right the setting is j and the outcome on the right is b: Rjb
and then we can have sentences like (Lj1 → Rj0) to indicate that with the same setting on both sides, if the outcome is 1 on the left then it will be 0 on the right. Also sentences like P(Lj0|Rk0) = 3/4 to indicate that if the left and right settings are j and k respectively, then the probability that the light is green on the left side, given that it is green on the right side, equals 3/4.
The conclusions drawn from many observations are the following two premises.
For i, j = 1, 2, 3 and a, b = 0, 1:
I. Perfect Correlation: If the setting is i on both sides then the probability of outcome a on both sides equals 0
II. Surface Locality: The probability of outcome Lia is the same conditional on Li as it is conditional on Li & Rj —
that is, the probability of an outcome on one side is unaffected by the setting on the other side.
Now the Bell Inequalities can be expressed in a simple way. Let us abbreviate the special case of the probability of outcome 1 happening on both sides, for specific settings, as follows:
p(i; j) = the probability of (Li1 & Rj1) given settings Li and Rj
The Bell Inequalities can then be expressed in a set of ‘triangle inequalities’:
p(1;2) + p(2;3) ≥ p(1;3)
and so forth. There is no reference in these inequalities to any factor which may be hidden from direct measurement — any violation can be found on the observable level.
So it would be very disconcerting if there were a proof that there could not be any violations!
We fix on some entailments implied in the experimental set-up, assuming the sort of perfection not found in an actual lab. So we take it that the settings being chosen, and the experiment initiated, entails that there will be an outcome, and it will be red light or green light:
Li entails (Li1 v Li0), hence Li → (Li1 v Li0) is true
Similarly for the other similar points; for example, (Li & Rj) → [(Li1 & Rj1) v … v (Li0 & Rj0)], with all logically possible combinations listed in the blank here.
Moreover, we take it as necessary, that is true in all worlds, that outcomes are unique. That is, the conjunction (Li1 & Li0) is never true; similarly for R.
Finally, the modeling must accommodate this: Lia and Rjb are each, taken individually, possible, that is, there is a possible world in which it is true.
Consequences of the entry moves
Starting with the entry move, and using Conditional Excluded Middle, we infer accordingly:
Either Li → Li1 or Li → Li0 is true,
more generally,
For each i, j = 1, 2, 3, one of the conditionals (Li & Rj) → (Lia & Rjb), with a, b = 1, 0, is true.
Note that this is a finite set of conditionals, since there are only finitely many combinations of settings and outcomes.
I have just written “is true”, as if we are only interested in the actual world. Of course we have to be interested in all the possible worlds in the modeling set-up, each characterized by specific settings and specific outcomes. But the above reasoning holds for any world in the model.
We note also that as long as an antecedent of a conditional is itself possible, there cannot be a conflict in the consequents:
if A → B and A → ~B are true, or if [A → (B & ~B)] is true, then A is false.
This follows from the Conjunction principles, and applies to our case because with our entry moves Li1 implies the falsity of Li0, and so forth.
The hidden variable
Which counterfactual conditionals are true in a given situation is not something empirically accessible. But by the above reasoning, in any given world α in the model, there is a set of true propositions of form Li → Lia, Rj → Rjb, (Li & Rj) → (Lia & Rjb) which characterizes that world.
Call that set A(α). It is a hidden factor which completely determines what the outcomes will or would be, whatever setting the experimenter chooses or would have chosen if he had chosen differently. A(α) represents the world’s hidden dynamical state.
NOTE: At this point we have to raise a question not answerable in that simple artificial language: what makes those conditionals true? In order for the discussion to have any bite, with respect to the experiment, whatever makes them true must not be, for example, the actual but unknown future — it has to be something determined before the experiment has its outcomes. That set A(α) of statements has to represent something characterizing the particle-pair at the outset: that hidden dynamical state has to be a physical feature. To this we will come back below.
(Historically minded logicians will be reminded here of the objections to Diodoros Chronos’ Master Argument.)
Importing Perfect Correlation
This is a Surface principle that must govern the modeling. It can be extrapolated, because there is nothing you could add to the antecedent that would raise a probability zero to a positive probability. So we conclude
For i = 1, 2, 3, P(Li1 & Ri1 | Li & Ri & X) = 0, and so [(Li & Ri & X) → (Li1 & Ri1)] is false in all worlds, regardless of what X is, unless (Li & Ri & X) is itself impossible.
Consider now world α and let X be its hidden state A(α). Suppose Li and Ri are both true. In that case, since A(α) is also true, it follows that Li1 and Ri1 are not both true. So Li → Li1 is in A(α) if and only if Ri → Ri1 is not in A(α), which entails that Ri → Ri0 is in A(α).
Thus for each i = 1, 2, 3 we need only know whether Li → Li1 is in A(α); we need not add a conditional with antecedent Ri. So really, all that matters in A(α) are three conditionals: L1 → L1a, L2 → L2b, L3 → L3c. And it is the triple <a, b, c> that summarizes what matters about A(α), a triple of numbers each of which is 0 or 1. In some worlds the hidden state is thus summarized by <1, 0, 1>, in others by <0, 1, 1>, and so forth. Let’s introduce a name:
Cabc = the set of worlds β such that the hidden state of β is summarized by <a, b, c>
That set of worlds has a probability, P(Cabc).
Suppose now that we have chosen settings L1 and R2 in world α. What is the probability that the red light will turn on, on both sides — i.e. the probability of L11 & R21? It is the probability that A(α) is either of type <1, 0, 1> or type <1, 0, 0>, hence P(C101)+P(C100).
So that sum equals the conditional probability p(1; 2) = P(L11 & R21 | L1 & R2). Similarly for the other terms in Bell’s Inequalities. Now we can see whether they follow from what we have arrived at so far:
p(1; 2) = P(C101) + P(C100)
p(2; 3) = P(C110) + P(C010)
p(1; 3) = P(C110) + P(C100)
so that p(1; 2) + p(2; 3) = P(C101) + P(C100) + P(C110) + P(C010) ≥ P(C100) + P(C110) = p(1; 3).
Similarly for the other triangle inequalities that make up Bell’s Inequalities.
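Readers who want to check this mechanically can do so with a short sketch (encoding and names are mine): for any probability distribution over the eight hidden-state types Cabc, the triangle inequality holds, since each p(i; j) is a sum of some of the P(Cabc).

```python
import itertools, random

def p(i, j, C):
    # Outcome 1 on the left needs component i of the triple equal to 1; by
    # perfect anticorrelation, outcome 1 on the right needs component j equal to 0.
    return sum(w for state, w in C.items() if state[i-1] == 1 and state[j-1] == 0)

for _ in range(1000):
    weights = [random.random() for _ in range(8)]
    total = sum(weights)
    C = {s: w / total for s, w in zip(itertools.product((0, 1), repeat=3), weights)}
    assert p(1, 2, C) + p(2, 3, C) >= p(1, 3, C) - 1e-12
```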
Thus we have deduced Bell’s Inequalities, and this implies that for the experimental set-up we predict that those inequalities will not be violated. But in certain set-ups of this form, they are violated.
Our task: to show just what is wrong with the above deduction, what hidden assumptions it must have that are disguised by our traditional ways of thinking about counterfactual conditionals or even more hidden assumptions underlying those.
Faced with the above result, and given the attested phenomena that violate Bell’s Inequalities, the first temptation is surely to conclude that the logic of conditionals is at fault, with the main suspects being CEX and/or Stalnaker’s Thesis.
The problem with this is that if we reject CEX or Stalnaker’s Thesis, we no longer have any way to relate conditionals to Bell’s Inequalities, which deal with conditional probabilities. So the conversation ends there.
I propose that the fault lies rather in the philosophical background, with realism about conditionals. That is a metaphysical position, even if it mimics common sense discourse oblivious to its own presuppositions. On such a realist view, conditionals, even when counterfactual, are factually true or false. On that view, what I called the hidden state is real, an aspect of the objective modalities in nature.
There are different options for empiricist/nominalist views about conditionals. Elsewhere I have explored a switch from semantics to pragmatics, by moving the focus from truth conditions to felicity conditions for assertion.
But here I will explain an alternative that seems pertinent for the present topic, the analysis of the sort of reasoning that surrounds the Einstein-Podolsky-Rosen paradox and the violation of Bell’s Inequalities. (I first suggested this in a Zoom lecture in March 2022 to a German student organization, and have since then worked out the technical details in the post called A Rudimentary Approach to the True, the False, and the Probable.)
Let us start from the core description of the experiment (or any experiment or situation of this sort), which involves the assertions about the actual settings and the conditional probabilities of outcomes given the settings. As far as the exposition of the phenomena is concerned, that suffices. All relevant information for the discussion of Bell’s Inequalities’ violation by certain phenomena can be expressed here.
Now suppose that into this language we introduce the arrow, the ‘conditional’ propositional operator, with Stalnaker’s Thesis as the basic principle governing its meaning:
P(B | A) = P(A → B)
Extending the language cannot by itself create new information, let alone new facts! So we should insist that the right-hand part of the equation contains no more information about the experiment than the left-hand part does.
A realist interpretation denies this, in part at least: it insists that we, the observers, have no more information than what is there on the left, but this is merely a limitation of our knowledge. In fact, those conditionals are divided into the true and the false, in accordance with facts not describable in the original language.
What alternative can we offer? We can submit that the conditional A → B is true only if P(B | A) = 1 and false only if P(B | A) = 0. In that case there is nothing more to be known about the truth values of conditionals beyond what we can gather from the probabilities.
It follows of course that many of these conditionals are neither true nor false. Indeed, in the original set-up, one of the remarkable facts is that we know that P(L11|L1) = 0.5, and so (L1 → L11) is not true, and not false. The true conditionals do not tell us what will definitely happen, though they tell us something about what will definitely not happen. An example is [(L1 & R1 & R10) → L11], because P(L11 | L1 & R10) = 1.
Specifically, the Candy Bar Principle is correct in one sense and not in another. In general, A → (B v ~B) is true; that is part of CEX, and accords with the fact that P(B v ~B | A) = 1. But still, in general neither P(B|A) nor P(~B|A) will equal 1, so neither A → B nor A → ~B will be true. Conditional Excluded Middle is valid, but the Principle of Bivalence fails. (And this is a familiar phenomenon in various places in philosophical logic.)
My suggestion is therefore that factual statements about preparation, measurement, and outcomes are true or false always, while the subjunctive conditionals about them are true or false only when the conditional probabilities are 0 or 1. This does not make much sense in the usual approaches to semantic analysis of modals, which are realist at least in form. How could there be such a difference between the evaluation of conditionals and the evaluation of statements lacking any such modal connectives?
It can be done in the way I’ve presented in the post “A Rudimentary Algebraic Approach to the True, the False, and the Probable”.
NOTES
A bit of history. In a seminar in 1981, after teaching about Bell’s Inequalities, I handed out a small addition called “The End of the Stalnaker Conditional?” It contained a sketch of the argument I presented here. This paper was widely distributed though never published. John Halpin (1986), who cites this paper, elaborated on it with reference to the defense and modifications Stalnaker offered in his (1981), to show that Bell’s Inequalities would still be derivable after that defense. Both Halpin and I were addressing Stalnaker’s logic, which is stronger than CE, and could not be successfully combined with Stalnaker’s Thesis. So the point was really moot, unless the argument concerning Bell’s Inequalities could be shown to use no resources going beyond CE.
The logic CE and its combination with Stalnaker’s Thesis, with proofs of its adequacy, were presented in my (1976). Essentially the same theory was developed by I. R. Goodman and H. T. Nguyen, independently; for a quick look see the Wikipedia articles “Conditional Event Algebra” and “Goodman-Nguyen-van Fraassen algebra”. The ideas of my (1976) were also developed further by Stefan Kaufmann in a series of more recent papers; see especially Kaufmann (2009), and still more recently by Goldstein and Santorio (2021).
Above I wrote about CE, combined with the Thesis, that it has only infinite models. But ‘small’ situations, like experimental set-ups, can be modeled with partial structures that are demonstrably extendable to models of CE combined with the Thesis, for all sentences. Details about this can be found in my previous posts in this blog, “Probabilities of Conditionals” (1) and (2). The examples with die tosses, and their details, can be found there.
BIBLIOGRAPHY
Eberhard, P. H. (1977) “Bell’s Theorem without hidden variables”. Il Nuovo Cimento 38B(1): 75-79.
Goldstein, S. and P. Santorio (2021) “Probability for Epistemic Modalities”. Philosophers’ Imprint 21 (33): 1-34.
Halpin, J. (1986) “Stalnaker’s Conditional and Bell’s Problem”. Synthese 69: 325-340.
Herbert, N. and J. Karush (1978) “Generalizations of Bell’s Inequalities”. Foundations of Physics 8: 313-317.
Kaufmann, S. (2009) “Conditionals Right and Left: Probabilities for the Whole Family”. Journal of Philosophical Logic 38: 1-53.
Stalnaker, R. (1981) “A Defense of Conditional Excluded Middle”. Pages 87-104 in Harper, Stalnaker, and Pearce (eds.) Ifs. Dordrecht: Reidel.
Stapp, Henry P. (1971) “S-matrix interpretation of quantum theory”. Physical Review D3: 1303-1320.
van Fraassen, B. C. (1976) “Probabilities of Conditionals”. Pages 261-308 in W. L. Harper and C. A. Hooker (eds.) Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, Volume I. Dordrecht: Reidel.
At a conference at Notre Dame in 1987 Paul Teller “made an issue”, as he wrote later, of “the fallacious form of argument I called the ‘Candy Bar Principle’:
from ‘If I were hungry I would eat some candy bar’ conclude ‘There is some candy bar which I would eat if I were hungry’.”
And Henry Stapp, whom Teller had criticized, mentioned this in his presentation: “Paul Teller has suggested that any proposed proof of the kind I am setting forth must contain such a logical error ….”
I do not want to enter into this controversy, if only because there were so many arguments swirling around Stapp’s proposed proofs. Instead I want to examine the question:
is the Candy Bar inference a fallacy?
Let’s formulate it for just a finite case: there are three candy bars, A, B, and N. The first two are in this room and the third is next door. I shall refer to the following form of argument as a Candy Bar inference:
If I choose a candy bar it will be either A or B
therefore,
If I choose a candy bar it will be A, or, if I choose a candy bar it will be B
and I will symbolize this as follows:
C → (A v B), therefore (C → A) v (C → B)
This has a bit of a history of course: it was submitted as valid in Robert Stalnaker’s original theory of conditionals and was rejected by David Lewis in his theory. Lewis showed that Stalnaker’s theory was inadequate, and blamed this principle. But we should quickly add that the problems Lewis raised also disappeared if this principle were kept while another one, shared by Stalnaker and Lewis, was rejected. This is just by the way, for now I will leave all of this aside.
How shall we go about testing the Candy Bar inference?
I imagine that the first intuitive reaction is something like this:
Imagine that I decide to choose a candy bar in this room. Then it will definitely be either A or B that I choose. But there is nothing definite about which one it will be.
I could close my eyes and choose at random.
Very fine! But unfortunately that is not an argument against the Candy Bar inference, but rather against the following different inference:
It is certain that if I choose, then I will choose either A or B,
therefore
Either it is certain that if I choose I will choose A, or, it is certain that if I choose I will choose B
That is not at all the same, for we cannot equate ‘It is certain that if X then Y’ with ‘if X then Y’. As an example, contrast the confident assertion “If the temperature drops it will rain tomorrow” with “It is certain that if the temperature drops it will rain tomorrow”. The former will be borne out, the prediction will be verified, if in fact the temperature drops and it rains the next day — but this is not enough to show that the latter assertion was true.
So the intuitive reaction does not settle the matter. How else can we test the Candy Bar inference?
Can we test it empirically? Suppose two people, Bob and Alice of course, are asked to predict what I will do, and write on pieces of paper, respectively, “if Bas chooses a candy bar in this room, he will choose A” and “if Bas chooses a candy bar in this room, he will choose B”. Surely we will say:
we know that if Bas chooses a candy bar in this room, he will choose A or B.
So if he does, either Bob or Alice will turn out to have been right.
And then, if Bas chooses A, we will say “Bob was right”.
That is also an intuitive reaction, which appears to favor the Candy Bar inference. But again, it does not really establish much. For it says nothing at all about which of these conditionals, if any, would be true if Bas does not choose a candy bar. That is the problem with any sort of empirical test: it deals only with facts and does not have access to what would have happened instead of what did happen.
Well there is another empirical approach, not directly to any facts about the choice and the candy bars, but to how reasonable, practical people would let this situation figure in their decision making.
So now we present Alice and Bob with this situation and we ask them to make bets. These are conditional bets; they will be Gentlemen’s Wagers, which means that they get their money back if Bas does not choose.
Alice first asks herself: how likely is he to choose a bar from this room, as opposed to from next door (where, you remember, there is bar N)? Suppose she takes that to have probability 3/4. She accepts a bet that Bas will choose A or B, if he chooses at all, with payoff 1 and price 0.75. Her expectation value is 0; it is just a fair bet.
Meanwhile Bob agrees with her probability judgment, but is placing two bets, one that if Bas chooses he will choose A, and one that if Bas chooses he will choose B. These he thinks equally probable, so for a payoff of 1 he agrees to a price of 3/8 for each. His expectation value is 1/4(0) + 3/8(1) + 3/8(1) minus what he paid, hence 0: this too is just a fair bet.
Thus Alice and Bob pay the same to be in a fair betting situation, where the payoff prices are the same, though one was, in effect, addressing the premise and the other the conclusion of the Candy Bar inference. So, as far as rational betting behavior is concerned then, again, there is no difference between the two statements.
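To make the arithmetic explicit, here is a quick check (the variable names are mine, and I read the scenario as saying Bas will certainly choose some bar):

```python
# Bas picks from this room with probability 3/4 (A or B equally likely),
# and from next door (bar N) with probability 1/4.
p_A, p_B, p_N = 0.375, 0.375, 0.25

alice_ev = (p_A + p_B) * 1 - 0.75              # one bet: "A or B", price 0.75
bob_ev = p_A * 1 + p_B * 1 - (0.375 + 0.375)   # two bets: "A" and "B", 3/8 each
print(alice_ev, bob_ev)                        # 0.0 0.0: fair either way
```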
Betting, however, as we well know by now, is also only a crude measuring instrument for what matters. The fact that these are Gentlemen’s Wagers, as they pretty well have to be, once again means that we are really only dealing with the scenario in which the antecedent is true. The counterfactual aspect is beyond our ken.
To be clear: counterfactual conditionals are metaphysical statements, if they are statements about what is the case, at all. They are not empirical statements, and this makes the question about the validity of the Candy Bar inference a metaphysical question.
There is quite a lot of every-day metaphysics entrenched at the surface of our ordinary discourse. Think for instance of what Nancy Cartwright calls this-worldly causality, with examples like the rock breaking the window and the cat lapping up the milk.
Traditional principles about conditionals, just as much as traditional principles about causality, may guide our model building. And then nature may or may not fit our models …
So the question is not closed, the relation to what is empirically accessible may be more subtle than I managed to get to here. To be continued ….
REFERENCES
The Notre Dame Conference in question had its proceedings published as Philosophical Consequences of Quantum Theory: Reflections on Bell’s Theorem (ed. J. T. Cushing and E. McMullin; University of Notre Dame Press 1989).
My quotes from Teller and Stapp are from pages 210 and 166 respectively.
If a theory has no finite models, can we still discuss finite examples, taking for granted that they can be represented in the theory’s models?
It is a well-known story: Robert Stalnaker introduced the thesis, now generally called
The Equation P(p → q) = P(q|p), provided P(p) > 0
that the probability of a conditional is the conditional probability of consequent-given-antecedent. Then David Lewis refuted Stalnaker’s theory.
In 1976 I proposed The Equation for a weaker logic of conditionals that I called CE. The main theorem was that any probability function P on a denumerable or finite field of sets (‘propositions’) can be extended to a model of CE incorporating P, with an operation → on the propositions, satisfying The Equation.
To be clear: the models for CE endowed with probability in this way are very large, the universe of possible worlds non-denumerable. But taking a cue from the proof of that theorem, I mean to show here that we can in practice direct our attention to finite set-ups. These are, as it were, the beginnings of models, and they can be used to provide legitimate examples with manageable calculations.
The reason the theory’s models get so large is that the conditional probabilities introduce more and more numbers (See Hajek 1989).
Example. Consider the possible outcomes of a fair die toss: 1, 2, 3, 4, 5, 6. With these outcomes as possible worlds, we have 2^6 propositions, but all the probabilities assigned to them are multiples of 1/6. So what is the conditional probability that the outcome is 5, given that it is not 6? Probability 1/5. What is the conditional probability that the outcome is 4, given that the outcome is less than 5? Probability 1/4. Neither is a multiple of 1/6.
Therefore, none of those 2^6 propositions can be either the proposition that the outcome is (5 if it is not 6), or the proposition that the outcome is (4 if it is less than 5).
In the end, the only way to allow for arbitrarily nested conditionals, in any and all proposition algebras closed under →, is to think of any set of outcomes that we want to model as the slices of an equitably sliced pie which is infinitely divisible.
The telling examples that we deal with in practice do not involve much nesting of conditionals. So let us look into the tossed fair die example, and see how much we have to construct to accommodate simple examples. I will call such a construction a set-up.
(In the Appendix I will give a precise definition of set-ups as partial models, but for now I will explain them informally.)
GENERAL: MODELS AND PROPOSITION ALGEBRAS
As do Stalnaker and Lewis, I define the → operation by using a selection function s: this function s takes any proposition p and any world x into a subset of p, s(p, x).
world y is in (p → q) if and only if s(p, y) ⊆ q
The main constraint is that s(p, x) has at most one member. It can be empty or be a unit set. Secondly, if y is in p, then s(p, y) = {y}, and if p is empty then s(p, y) is empty too. There are no other constraints.
Specifically, unlike for Stalnaker and Lewis, the selection is not constrained by a nearness relation. I do not take the nearness metaphor seriously, and see no convincing reason for such a constraint. But I use the terminology sometimes, just as a mnemonic device to describe the relevant selection: if s(p, x) = {y} I may call y the nearest p-world to x.
The result of the consequent freedom is that if p and q are distinct propositions then the functions s(p, .) and s(q, .) are totally independent — each can be constructed independently, without any regard to the others.
That allows us to build parts of models while leaving out much that normally belongs in a model.
EXAMPLE: THE OUTCOMES OF A TOSSED DIE
A die is tossed, that has six possible outcomes, there are six possible worlds: (1) is the world in which the outcome is 1, and similarly for (2), …, (6). I will call this set of worlds S (mnemonic for “six”). There is a probability function P on the powerset of S: it assigns 1/6 to each world. I will refer to the set-up that we are constructing here as Set-Up 1.
As examples I will take two propositions:
p = {(1), (3), (5)}, “the outcome is odd”. This proposition is true just in worlds (1), (3), (5).
q = {(1), (2), (3)}, “the outcome is low”. This proposition is true just in worlds (1), (2), (3).
Each of these two propositions has probability 1/2.
The idea is now to construct s(p, .) so that P(p → q) = P(q|p). I claim no intuitive basis for the result. Its purpose is to show how the Equation can be satisfied while observing the basic logic of conditionals CE.
It is clear that (p → q) consists of two parts, namely (p ∩ q) and a certain part of ~p. Can we always choose a part of ~p so that the probabilities of these two parts add up to P(q|p)? A little theorem says yes:
P(q|p) − P(p ∩ q) ≤ P(~p).
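To see why, using only the definition of conditional probability:

P(q|p) − P(p ∩ q) = P(p ∩ q)/P(p) − P(p ∩ q) = P(q|p)(1 − P(p)) = P(q|p)·P(~p) ≤ P(~p).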
Of course at this point we can only do so where the only probabilities in play are multiples of 1/6. Later we can look at others, and build a larger partial model. I will show how the small set-up here will emerge then as part of the larger set-up, so nothing is lost.
For a given non-empty proposition p, we need only construct (p → {x}) for each x in p, making sure that their probabilities add up to 1. The probabilities of the conditionals (p → r), for any other proposition r, are then determined in this set-up. That is so because S is finite and in any proposition algebra (model of CE),
(p → t) ∪ (p → u) = [p → (t ∪ u)]
So let us start with member (1) of proposition p, and define s(p, .) so that P(p → {(1)}) = 1/3, which is the conditional probability P({(1)} | p).
That means that (p → {(1)}) must have two worlds in it, (1) itself and a world in ~p. Therefore set
s(p, (2)) = {(1)}.
Then (p→{(1)}) = {(1), (2)} which does indeed have probability 1/3.
Similarly for the others (see the diagram below, which shows it all graphically):
so that, for example, (~p → {(2)}) = {(2), (1)}, which has probability 1/3, equal to the conditional probability P({(2)} | ~p).
What about (p → {(6)})? There is no world x such that s(p, x) = {(6)}. So (p → {(6)}) is the empty set and P(p → {(6)}) = 0, which is indeed P({(6)}|p).
Let’s see how this works for p with the other proposition q, “the outcome is low”; that is, the proposition q = {(1), (2), (3)}
(p → q), “if the outcome is odd then it is low”, is
true in (1) and (3) since they are in (p ∩ q); they are their own nearest p-worlds
true in (2) and (4), since their nearest p-worlds are (1) and (3) respectively
false in (5) and (6) since their nearest p-world is (5), “odd but not low”
(~p → q), “if the outcome is even, then it is low”, is
true in (2) since it is in ~p ∩ q
true in (1) since its nearest ~p-world is (2), “even and low”
false in (3), for its nearest ~p-world is (4), “even and high”
false in (4), for it is its own nearest ~p-world, “even and high”
false in (5), for its nearest ~p-world is (6), “even and high”
false in (6), for it is its own nearest ~p-world, “even and high”
So (p → q) is {(1), (3), (2), (4)}, which has probability 2/3; we verify that it is P(q|p).
(~p → q) is {(2), (1)}, which has probability 1/3; we verify that it is P(q|~p).
A DIAGRAM OF THE MODEL: selection for antecedents p and ~p
The blue arrows are for the ‘nearest p-world’ selection, and the red arrows for the ‘nearest ~p-world’ selection.
THE SECOND STAGE: EXPANDING THE SET-UP
Above I gave two examples of conditional probabilities that are not multiples of 1/6, but of 1/4 and of 1/5. In Set-Up 1 there is no conditional proposition that can be read as “if the outcome is not six then it is five”. The arrow is only partially defined. So how shall we improve on this?
Since the smallest number that is a multiple of all of 6, 4, and 5 is 60, we will need a set-up with 60 worlds in it, with 10 of them being worlds in which the die toss outcome is 1, and so forth.
So we replace (1) by the couples <1, 1>, <1, 2>, …, <1, 10>. Similarly for the others. I will write [(x)] for the set {<x, 1>, …, <x, 10>}. Giving the Roman numeral X as name to {1, …, 10}, our set of worlds will no longer be S, but the Cartesian product S×X. I will refer to the set-up we are constructing here as Set-Up 2. The probability function P is extended accordingly, and assigns the same probability 1/60 to each member of S×X.
Now we can construct the selection function s(u, .) for proposition u which was true in S in worlds (1), …, (5) – read it as “the outcome is not six” – and is true in our new set-up in the fifty worlds <1,1>, …, <5, 10>. As before, to fix all the relevant probabilities, we need:
(u → [(t)]) has probability 1/5 for each (t), from 1 to 5.
Since [(t)] is the intersection of itself with u, it is part of (u → [(t)]). That gives us ten elements of S×X, but since 1/5 = 12/60, we need two more. They have to be chosen from ~u, that is, from [(6)].
Do it systematically: divide ~u into five sets and let the selection function choose their ‘nearest’ worlds in u appropriately:
s(u, <6, 1>) = {<1, 1>}   s(u, <6, 2>) = {<1, 2>}
s(u, <6, 3>) = {<2, 1>}   s(u, <6, 4>) = {<2, 2>}
s(u, <6, 5>) = {<3, 1>}   s(u, <6, 6>) = {<3, 2>}
s(u, <6, 7>) = {<4, 1>}   s(u, <6, 8>) = {<4, 2>}
s(u, <6, 9>) = {<5, 1>}   s(u, <6, 10>) = {<5, 2>}
So now (u → [(5)]) = {<5, 1>, …, <5, 10>, <6,9>, <6,10>}, which has twelve members, each with probability 1/60, and so this conditional has probability 1/5, which is the right conditional probability.
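A sketch of this computation (the encoding is mine), confirming the count of twelve worlds:

```python
worlds = {(x, i) for x in range(1, 7) for i in range(1, 11)}   # Set-Up 2: S x X
u = {w for w in worlds if w[0] != 6}               # "the outcome is not six"

sel_u = {w: w for w in u}                          # worlds in u select themselves
for k in range(1, 11):                             # the table above: s(u, <6, k>)
    sel_u[(6, k)] = ((k + 1) // 2, 1 if k % 2 else 2)

u_to_5 = {w for w in worlds if sel_u[w][0] == 5}   # (u -> [(5)])
assert len(u_to_5) == 12                           # 12/60 = 1/5 = P([(5)] | u)
```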
It will be clear enough now, how we can similarly construct s(r, .) for proposition r read as “the outcome is less than 5”, which requires conditional probabilities equal to ¼.
HOW SET-UP 1 RE-APPEARS IN SET-UP 2
And it should also be clear how what we did with propositions p and q in the earlier set-up, with universe of worlds S, emerges in this larger set-up in the appropriate way. For example, the proposition read as “the outcome is low” is now the union of [(1)], [(2)], and [(3)], and so forth.
Of course, there are new propositions now. For some of these we can construct a selection function as well. For example, the proposition (u → [(5)]) which we just looked at has twelve members, and the probability 1/12 equals 5/60, a multiple of 1/60. So we can construct the selection function s((u → [(5)]), .). Thus for any proposition t, the proposition [(u → [(5)]) → t] will be well-defined and its probability will be the relevant conditional probability. But there are other propositions in Set-Up 2 for which this can be done only by embedding this set-up in a still larger one.
As I said above, eventually we have to look upon the six possible outcomes of the die toss as slices of an evenly divided pie, this pie being infinitely divisible. That is a comment about the theorem proved for models of the logic CE in which The Equation is satisfied. But as long as our examples, the ones that play a role in philosophical discussions of The Equation, are “small” enough, they will fit into small enough set-ups.
APPENDIX.
While leaving more details to the 1976 paper, I will here distinguish the set-ups, which are partial models, from the models.
I will now use “p”, “q” etc. with no connection to their use for specific propositions in the text above.
A frame is a triple <V, F, P>, with V a non-empty set, F a field (Boolean algebra) of subsets of V, P a probability function on a field G of subsets of V, with F part of G.
A model is a quintuple <V, F, P, s, →> such that:
<V, F, P> is a frame
s (the selection function) is a function from F×V into the power set of V such that
s(p, x) has at most one member
if x is in p then s(p, x) = {x}
s(Λ, x) = Λ
→ is the binary function defined on F×F by the equation
(p → q) = {x in V: s(p,x) ⊆ q}
Note: with this definition, <V, F, →> is a proposition algebra, that is, a Boolean algebra with (full or partial) binary operation →, with the following properties (where defined):
(I) (p → q) ∩ (p → r) = [p → (q ∩ r)]
(II) (p → q) ∪ (p → r) = [p → (q ∪ r)]
(III) p ∩ (p → q) = p ∩ q
(IV) (p → p) = V.
A set-up or partial model is a quintuple <V, F, P, s, →> defined exactly as for a model, except that s is a partial function, defined only on a subset of FxV. And accordingly, → is then a partial binary function on the propositions.
In the next post I will explore Set-Up 1 and Set-Up 2 further, with examples.
NOTES
I want to thank Branden Fitelson and Kurt Norlin for stimulating correspondence, which gave me the impulse to try to figure this out.
REFERENCES
The theorem referred to above is on page 289 of my “Probabilities of Conditionals”, pp. 261-300 in W. L. Harper and C. A. Hooker (eds.), Foundations of Probability Theory, Statistical Inference, and Statistical Theories of Science, Vol. I. Dordrecht: Reidel, 1976.
(Note that this is not the part about Stalnaker-Bernoulli models, it is instead about the models defined on that page. There is no limit on the nesting of arrows.)
Alan Hajek, “Probabilities of Conditionals: Revisited”. Journal of Philosophical Logic 18 (1989): 423-428. (Theorem that the Equation has no finite models.)
Alan Hajek and Ned Hall, “The hypothesis of the conditional construal of conditional probability”. Pp. 75-111 in Probability and Conditionals: Belief Revision and Rational Decision. Cambridge: Cambridge University Press, 1994.