Probabilities of Conditionals (1): finite set-ups

If a theory has no finite models, can we still discuss finite examples, taking for granted that they can be represented in the theory’s models?

It is a well-known story:  Robert Stalnaker introduced the thesis, now generally called 

The Equation             P(p → q) = P(q|p), provided P(p) > 0

that the probability of a conditional is the conditional probability of consequent-given-antecedent.  Then David Lewis refuted Stalnaker’s theory.

In 1976 I proposed The Equation for a weaker logic of conditionals that I called CE.  The main theorem was that any probability function P on a denumerable or finite field of sets (‘propositions’) can be extended to a model of CE incorporating P, with an operation → on the propositions satisfying The Equation.

To be clear:  the models for CE endowed with probability in this way are very large, the universe of possible worlds non-denumerable.  But taking a cue from the proof of that theorem,  I mean to show here that we can in practice direct our attention to finite set-ups.  These are, as it were, the beginnings of models, and they can be used to provide legitimate examples with manageable calculations.

The reason the theory’s models get so large is that the conditional probabilities introduce more and more numbers (See Hajek 1989).  

Example. Consider the possible outcomes of a fair die toss: 1, 2, 3, 4, 5, 6.  With these outcomes as possible worlds, we have 2^6 propositions, but all the probabilities assigned to them are multiples of 1/6.  So what is the conditional probability that the outcome is 5, given that it is not 6?  Probability 1/5.  What is the conditional probability that the outcome is 4, given that the outcome is less than 5?  Probability 1/4.  Neither is a multiple of 1/6.

Therefore, none of those 2^6 propositions can be either the proposition that the outcome is (5 if it is not 6), or the proposition that the outcome is (4 if it is less than 5).
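These little computations are easy to check mechanically. Here is a sketch in Python using exact fractions (the helper names `P` and `cond` are mine, not part of the theory):

```python
from fractions import Fraction

# Worlds are the outcomes of a fair die toss; a proposition is a set of
# outcomes, with probability (number of outcomes)/6.
P = lambda A: Fraction(len(A), 6)

def cond(q, p):
    """Conditional probability P(q|p), assuming P(p) > 0."""
    return P(q & p) / P(p)

not_six = {1, 2, 3, 4, 5}
below_five = {1, 2, 3, 4}

print(cond({5}, not_six))     # 1/5
print(cond({4}, below_five))  # 1/4

# Neither value is among the multiples 0/6, 1/6, ..., 6/6:
print(cond({5}, not_six) in {Fraction(k, 6) for k in range(7)})   # False
```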

In the end, the only way to allow for arbitrarily nested conditionals, in any and all proposition algebras closed under →, is to think of any set of outcomes that we want to model as the slices of an equitably sliced pie which is infinitely divisible.

The telling examples that we deal with in practice do not involve much nesting of conditionals.  So let us look into the tossed fair die example, and see how much we have to construct to accommodate simple examples.  I will call such a construction a set-up.

(In the Appendix I will give a precise definition of set-ups as partial models, but for now I will explain them informally.)

GENERAL: MODELS AND PROPOSITION ALGEBRAS

As do Stalnaker and Lewis, I define the → operation by using a selection function s: this function s takes any proposition p and any world x into a subset of p,  s(p, x).  

world y is in (p → q) if and only if s(p, y) ⊆ q

The main constraint is that s(p, x) has at most one member.  It can be empty or be a unit set.  Secondly, if y is in p, then s(p, y) = {y}, and if p is empty then s(p, y) is empty too.  There are no other constraints.  

Specifically, unlike for Stalnaker and Lewis, the selection is not constrained by a nearness relation.  I do not take the nearness metaphor seriously, and see no convincing reason for such a constraint.  But I use the terminology sometimes, just as a mnemonic device to describe the relevant selection:  if s(p, x) = {y} I may call y the nearest p-world to x.  

The result of this freedom is that if p and q are distinct propositions then the functions s(p, .) and s(q, .) are totally independent: each can be constructed without any regard to the other.

That allows us to build parts of models while leaving out much that normally belongs in a model.

EXAMPLE: THE OUTCOMES OF A TOSSED DIE

A die is tossed; there are six possible outcomes, hence six possible worlds:  (1) is the world in which the outcome is 1, and similarly for (2), …, (6).  I will call this set of worlds S (mnemonic for “six”).   There is a probability function P on the powerset of S: it assigns 1/6 to each world.  I will refer to the set-up that we are constructing here as Set-Up 1.

As examples I will take two propositions:

p = {(1), (3), (5)}, “the outcome is odd”. This proposition is true just in worlds (1), (3), (5).

q = {(1), (2), (3)}, “the outcome is low”.  This proposition is true just in worlds (1), (2), (3).

Each of these two propositions has probability 1/2.

The idea is now to construct s(p, .) so that P(p → q) = P(q|p).  I claim no intuitive basis for the result. Its purpose is to show how The Equation can be satisfied while observing the basic logic of conditionals CE.

It is clear that (p → q) consists of two parts, namely (p ∩ q) and a certain part of ~p.  Can we always choose a part of ~p so that the probabilities of these two parts add up to P(q|p)?  A little theorem says yes: 

               P(q|p) − P(p ∩ q)  ≤   P(~p)
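The inequality holds because P(q|p) − P(p ∩ q) = P(q|p)·(1 − P(p)) ≤ 1 − P(p) = P(~p).  For the suspicious, here is a brute-force spot-check over random finite probability spaces, a sketch of my own and not part of the argument:

```python
import random

random.seed(1)
worlds = range(4)

for _ in range(2000):
    # A random probability distribution over four worlds, all weights positive.
    weights = [random.random() for _ in worlds]
    total = sum(weights)

    def P(A):
        return sum(weights[i] for i in A) / total

    p = set(random.sample(worlds, random.randint(1, 4)))   # P(p) > 0
    q = set(random.sample(worlds, random.randint(0, 4)))
    lhs = P(p & q) / P(p) - P(p & q)                       # P(q|p) - P(p ∩ q)
    assert lhs <= P(set(worlds) - p) + 1e-12               # ... <= P(~p)

print("P(q|p) - P(p ∩ q) <= P(~p) held in every trial")
```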

Of course at this point we can only do so where the only probabilities in play are multiples of 1/6.  Later we can look at others, and build a larger partial model.  I will show how the small set-up here will emerge then as part of the larger set-up, so nothing is lost.

For a given non-empty proposition p, we need only construct (p → {x}) for each x in p, making sure that their probabilities add up to 1.  The probabilities of the conditionals (p → r), for any other proposition r, are then determined in this set-up.  That is so because S is finite and in any proposition algebra (model of CE),

(p → t) ∪ (p → u) = [p → (t ∪ u)]

So let us start with member (1) of proposition p, and define s(p, .) so that P(p → {(1)}) = 1/3, which is the conditional probability P({(1)} | p).

That means that (p → {(1)}) must have two worlds in it, (1) itself and a world in ~p.  Therefore set

 s(p, (2)) = {(1)}.

Then (p→{(1)}) = {(1), (2)} which does indeed have probability 1/3.

Similarly for the others (see the diagram below, which shows it all graphically):

 s(p, (4)) ={ (3)},     s(p, (6)) = {(5)}

You can see at once how we will deal with s(~p, .):

s(~p, (1)) = {(2)},    s(~p, (3)) = {(4)},    s(~p, (5)) = {(6)}

so that, for example, (~p → {(2)}) = {(2), (1)}, which has probability 1/3, equal to the conditional probability P({(2)} | ~p).

What about (p → {(6)})?  There is no world x such that s(p, x) = {(6)}.  So (p → {(6)}) is the empty set and P(p → {(6)}) = 0, which is indeed P({(6)}|p).

Let’s see how this works for p with the other proposition q, “the outcome is low”; that is, the proposition q = {(1), (2), (3)}

 (p → q), “if the outcome is odd then it is low”, is 

  • true in (1) and (3), since they are in (p ∩ q): each is its own nearest p-world
  • true in (2) and (4), since their nearest p-worlds are (1) and (3) respectively
  • false in (5) and (6), since their nearest p-world is (5), “odd but not low”

(~p → q), “if the outcome is even, then it is low”, is

  • true in (2) since it is in ~p ∩ q
  • true in (1) since its nearest ~p-world is (2), “even and low”
  • false in (3), for its nearest ~p world is (4), “even and high”
  • false in (4), for it is its own nearest ~p world, “even and high”
  • false in (5), for its nearest ~p world is (6), “even and high”
  • false in (6), for it is its own nearest ~p world, “even and high”

So (p → q) is {(1), (2), (3), (4)}, which has probability 2/3; we verify that this is P(q|p).

(~p → q) is {(2), (1)}, which has probability 1/3; we verify that this is P(q|~p).
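Set-Up 1 is small enough to verify by machine. Here is a sketch in Python; the encoding (the dictionary `table` and the functions `s`, `arrow`, `cond`) is mine, but the selections are exactly those given above:

```python
from fractions import Fraction

W = {1, 2, 3, 4, 5, 6}                 # the worlds (1), ..., (6)
P = lambda A: Fraction(len(A), 6)      # each world has probability 1/6

p = frozenset({1, 3, 5})               # "the outcome is odd"
q = frozenset({1, 2, 3})               # "the outcome is low"
np = frozenset(W) - p                  # ~p, "the outcome is even"

# The partial selection table: (antecedent, world) -> nearest antecedent-world.
table = {(p, 2): 1, (p, 4): 3, (p, 6): 5,
         (np, 1): 2, (np, 3): 4, (np, 5): 6}

def s(a, y):
    """s(a, y) = {y} if y is in a; otherwise the tabulated selection, if any."""
    if y in a:
        return {y}
    w = table.get((a, y))
    return {w} if w is not None else set()

def arrow(a, b):
    """(a -> b) = {y in W : s(a, y) is a subset of b}."""
    return {y for y in W if s(a, y) <= set(b)}

def cond(b, a):
    """Conditional probability P(b|a)."""
    return P(set(b) & set(a)) / P(set(a))

print(P(arrow(p, q)), cond(q, p))      # 2/3 2/3
print(P(arrow(np, q)), cond(q, np))    # 1/3 1/3
print(P(arrow(p, {6})), cond({6}, p))  # 0 0
```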

A DIAGRAM OF THE MODEL: selection for antecedents p  and ~p

The blue arrows are for the ‘nearest p-world’ selection, and the red arrows for the ‘nearest ~p-world’ selection.

THE SECOND STAGE:  EXPANDING THE SET-UP

Above I gave two examples of conditional probabilities that are not multiples of 1/6, but of 1/4 and of 1/5.  In Set-Up 1 there is no conditional proposition that can be read as “if the outcome is not six then it is five”.  The arrow is only partially defined.  So how shall we improve on this?

Since the smallest number that is a multiple of all of 6, 4, and 5 is 60, we will need a set-up with 60 worlds in it, with 10 of them being worlds in which the die toss outcome is 1, and so forth.

So we replace (1) by the couples <1, 1>, <1, 2>, …, <1, 10>, and similarly for the others.  I will write [(x)] for the set {<x, 1>, …, <x, 10>}.  Giving the Roman numeral X as name to {1, …, 10}, our set of worlds will no longer be S, but the Cartesian product S×X.  I will refer to the set-up we are constructing here as Set-Up 2.  The probability function P is extended accordingly, and assigns the same probability 1/60 to each member of S×X.

Now we can construct the selection function s(u, .) for proposition u which was true in S in worlds (1), …, (5) – read it as “the outcome is not six” – and is true in our new set-up in the fifty worlds <1,1>, …, <5, 10>.  As before, to fix all the relevant probabilities, we need:

(u → [(t)]) has probability 1/5 for each t from 1 to 5.

Since [(t)] is the intersection of itself with u, it is part of (u → [(t)]).  That gives us ten elements of S×X, but since 1/5 = 12/60, we need two more.  They have to be chosen from ~u, that is, from [(6)].

Do it systematically:  divide ~u into five pairs and let the selection function choose their ‘nearest’ worlds in u appropriately:

s(u, <6, 1>) = {<1, 1>},    s(u, <6, 2>) = {<1, 2>}

s(u, <6, 3>) = {<2, 1>},    s(u, <6, 4>) = {<2, 2>}

s(u, <6, 5>) = {<3, 1>},    s(u, <6, 6>) = {<3, 2>}

s(u, <6, 7>) = {<4, 1>},    s(u, <6, 8>) = {<4, 2>}

s(u, <6, 9>) = {<5, 1>},    s(u, <6, 10>) = {<5, 2>}

So now (u → [(5)]) = {<5, 1>, …, <5, 10>, <6,9>, <6,10>}, which has twelve members, each with probability 1/60, and so this conditional has probability 1/5, which is the right conditional probability.
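The same mechanical check works for Set-Up 2; here is a Python sketch along the same lines (again the encoding is mine):

```python
from fractions import Fraction

# Worlds are pairs <t, i> with t in 1..6 and i in 1..10, each of probability 1/60.
W = {(t, i) for t in range(1, 7) for i in range(1, 11)}
P = lambda A: Fraction(len(A), 60)

u = frozenset(w for w in W if w[0] != 6)           # "the outcome is not six"
block = lambda t: {(t, i) for i in range(1, 11)}   # the set [(t)]

# The systematic selection on ~u = [(6)]:
# <6, 2k-1> selects <k, 1> and <6, 2k> selects <k, 2>, for k = 1, ..., 5.
table = {}
for k in range(1, 6):
    table[(u, (6, 2 * k - 1))] = (k, 1)
    table[(u, (6, 2 * k))] = (k, 2)

def s(a, y):
    """s(a, y) = {y} if y is in a; otherwise the tabulated selection, if any."""
    if y in a:
        return {y}
    w = table.get((a, y))
    return {w} if w is not None else set()

def arrow(a, b):
    """(a -> b) = {y in W : s(a, y) is a subset of b}."""
    return {y for y in W if s(a, y) <= set(b)}

for t in range(1, 6):
    print(t, P(arrow(u, block(t))))   # each (u -> [(t)]) has probability 1/5
```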

It will be clear enough now how we can similarly construct s(r, .) for the proposition r read as “the outcome is less than 5”, which requires conditional probabilities equal to 1/4.

HOW SET-UP 1 RE-APPEARS IN SET-UP 2

It should also be clear how what we did with propositions p and q in the earlier set-up, with universe of worlds S, emerges in this larger set-up in the appropriate way.  For example, the proposition read as “the outcome is low” is now the union of [(1)], [(2)], and [(3)], and so forth.

Of course, there are new propositions now.  For some of these we can construct a selection function as well.  For example, the proposition (u → [(5)]) which we just looked at has twelve members, and the conditional probability 1/12 equals 5/60, a multiple of 1/60.  So we can construct the selection function s((u → [(5)]), .).  Thus for any proposition t, the proposition [(u → [(5)]) → t] will be well-defined and its probability will be the relevant conditional probability.  But there are other propositions in Set-Up 2 for which this can be done only by embedding this set-up in a still larger one.

As I said above, eventually we have to look upon the six possible outcomes of the die toss as slices of an evenly divided pie, this pie being infinitely divisible.  That is a comment about the theorem proved for models of the logic CE in which The Equation is satisfied. But as long as our examples, the ones that play a role in philosophical discussions of The Equation, are “small” enough, they will fit into small enough set-ups.

APPENDIX.

While leaving more details to the 1976 paper, I will here distinguish the set-ups, which are partial models, from the models.

I will now use “p”, “q” etc. with no connection to their use for specific propositions in the text above.

A frame is a triple <V, F, P>, with V a non-empty set, F a field (Boolean algebra) of subsets of V, and P a probability function on a field G of subsets of V, with F part of G.

A model is a quintuple <V, F, P, s, →> such that:

  • <V, F, P> is a frame
  • s (the selection function) is a function from F×V into the power set of V such that:
  • s(p, x) is a subset of p with at most one member
  • if x is in p then s(p, x) = {x}
  • s(Λ, x) = Λ

→ is the binary function defined on F×F by the equation

(p → q)  = {x in V: s(p,x) ⊆ q}

Note: with this definition, <V, F, →>  is a proposition algebra, that is, a Boolean algebra with (full or partial) binary operation →, with the following properties (where defined):

(I)        (p → q) ∩ (p → r) =  [p → (q ∩ r)]

(II)       (p → q) ∪ (p → r) = [p → (q ∪ r)]

(III)     p ∩ (p →q)  =   p ∩ q

(IV)     (p → p) = V.
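In a finite set-up, properties (I)-(IV) can be verified exhaustively wherever → is defined. Here is a Python sketch doing so for the selections of Set-Up 1 (the encoding is mine; note that (IV) depends on s(p, x) being a subset of p):

```python
from itertools import combinations

W = frozenset({1, 2, 3, 4, 5, 6})
p = frozenset({1, 3, 5})
np = W - p
table = {(p, 2): 1, (p, 4): 3, (p, 6): 5,
         (np, 1): 2, (np, 3): 4, (np, 5): 6}

def s(a, y):
    """s(a, y) = {y} if y is in a; otherwise the tabulated selection, if any."""
    if y in a:
        return {y}
    w = table.get((a, y))
    return {w} if w is not None else set()

def arrow(a, b):
    """(a -> b) = {y in W : s(a, y) is a subset of b}."""
    return frozenset(y for y in W if s(a, y) <= set(b))

# All 2^6 propositions over W.
subsets = [frozenset(c) for r in range(7) for c in combinations(sorted(W), r)]

for a in (p, np):                          # the antecedents where s is defined
    assert arrow(a, a) == W                # (IV)
    for q in subsets:
        assert a & arrow(a, q) == a & q    # (III)
        for r_ in subsets:
            assert arrow(a, q) & arrow(a, r_) == arrow(a, q & r_)   # (I)
            assert arrow(a, q) | arrow(a, r_) == arrow(a, q | r_)   # (II)

print("(I)-(IV) hold for antecedents p and ~p")
```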

A set-up or partial model is a quintuple <V, F, P, s, →> defined exactly as for a model, except that s is a partial function, defined only on a subset of F×V.  Accordingly, → is then a partial binary function on the propositions.

In the next post I will explore Set-Up 1 and Set-Up 2 further, with examples.

NOTES 

I want to thank Branden Fitelson and Kurt Norlin for stimulating correspondence, which gave me the impulse to try to figure this out.  

REFERENCES

The theorem referred to above is on page 289 of my “Probabilities of Conditionals”, pp. 261-300 in W. Harper and C.A. Hooker (eds.), Foundations of Probability Theory, ….  Vol. 1. Reidel, Dordrecht 1976.

(Note that this is not the part about Stalnaker-Bernoulli models, it is instead about the models defined on that page. There is no limit on the nesting of arrows.)

Alan Hajek, “Probabilities of Conditionals: Revisited”.  Journal of Philosophical Logic 18 (1989): 423-428.  (Theorem that the Equation has no finite models.)

Alan Hajek and Ned Hall, “The hypothesis of the conditional construal of conditional probability”.  pp. 75-111 in Probability and Conditionals: Belief Revision and Rational Decision.  Cambridge U Press 1994.
