Objective Chance → Moore’s Paradox

A Moore Statement (one that instantiates Moore’s Paradox) is a statement that could be true, but could not be believed.   For example, “It is raining but I don’t believe that it is raining”.

We find interesting new varieties of such statements when we replace the intuitive notion of belief with subjective probability.  Then there are two kinds of Moore Statements to be distinguished:


An Ordinary Moore Statement is one that could be true, but cannot have probability one.
A Strong Moore Statement is one that could have positive probability, but could not have probability one.

When we introduce statements about objective chance there are Moore Statements in our language.  Consider first the following (not a Moore statement) said when about to toss a die:

[1] The number six won’t come up, but the chance that six will come up is 1/6.

On this occasion both conjuncts can be true.  The die is fair, so the second conjunct is true, and when we have tossed the die we may verify that our prediction (the first conjunct) was true as well.

Moreover, [1] can be believed, perhaps by a gambler who bet that the outcome will be odd and is feeling lucky.  Or at least he could say, even with some warrant, that it seems likely (or at least a little likely) that [1] is the case.  The gambler could even say (and who could disagree, if the die is known to be fair?) that the probability that [1] is true is 5/6!

The way I will symbolize that is:            P(~Six & [ch(Six) = 1/6]) = 5/6.

In this sort of example we express two sorts of probability, one subjective and one objective.  Are there some criteria to be met?  Is there to be some harmony between the two?

Like so much else, there are some controversies about this.  I propose what I take to be an absolutely minimal constraint:

Minimal Harmony.  P(ch(A) > 0) = 1 implies P(A) > 0.  (If I am sure that there is some positive chance that A, then it seems to me at least a little likely that A.)

I really cannot imagine someone seriously, and rationally, saying anything like 

“I am certain that there is some chance that the six will come up, but I am also absolutely certain that it will not happen”.  

Except a truly deluded gambler, with a gambling strategy sure to lead to eventual ruin?    

To construct a Moore Statement we only need to modify [1] a little:

[2] The number six won’t come up, but the chance that six will come up is not zero.

~Six & ~[ch(Six) = 0]

That [2] could be true we can argue just like we did for [1].  But [2] is a Moore Statement for it could not have subjective probability 1, by the following argument.

Assume that P([2]) = 1.  Then:

  1. P(~Six) = 1
  2. P(Six) = 0
  3. P(~[ch(Six) = 0]) = 1
  4. ~[ch(Six) = 0] is equivalent to [ch(Six) > 0]
  5. P(ch(Six) > 0) = 1

Here 2. and 5. clash: by Minimal Harmony, 5. implies that P(Six) > 0, contradicting 2.

Here 1. and 3. follow from the assumption directly.  For 4. note that the situation being modeled here is the tossing of a die with chance defined for the six possible outcomes of that toss.

Not closed under conditionalization

This means also that [2] is a statement on which you cannot conditionalize your subjective probability, in the sense that if you do, your posterior opinion will violate Minimal Harmony.

So we have here another case where the space of admissible probability functions is not closed under conditionalization. 
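Here is a minimal computational sketch of that failure (a toy model of my own: a die known to be fair, so ch(Six) = 1/6 > 0 in every world). Conditionalizing a uniform prior on [2] yields a posterior that is certain the chance of six is positive, yet assigns six probability zero, violating Minimal Harmony:

```python
from fractions import Fraction

# Toy model (assumed, not from the Appendix): worlds are die outcomes 1..6;
# the die is known fair, so ch(Six) = 1/6 > 0 holds in every world.
worlds = set(range(1, 7))
P = {w: Fraction(1, 6) for w in worlds}        # uniform prior
six = {6}
chance_not_zero = set(worlds)                  # [ch(Six) = 0] holds nowhere

# Statement [2]: ~Six & ~[ch(Six) = 0]
stmt2 = (worlds - six) & chance_not_zero       # = {1, 2, 3, 4, 5}

def prob(p, A):
    return sum(p[w] for w in A)

def conditionalize(p, A):
    pA = prob(p, A)
    return {w: (p[w] / pA if w in A else Fraction(0)) for w in p}

post = conditionalize(P, stmt2)
assert prob(post, chance_not_zero) == 1   # certain that ch(Six) > 0 ...
assert prob(post, six) == 0               # ... yet P(Six) = 0: Minimal Harmony fails
```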

I will make all this precise in the Appendix.

REFERENCE

My previous post called ‘Stalnaker’s Thesis → Moore’s Paradox’

APPENDIX.  Semantic analysis: language of subjective probability and assessment of chance

As an intuitive guiding example we can think of a model of a tossed die.  There is a set of possible worlds, and in each there is a die (fair or loaded in some fashion) that is tossed and a number that is the outcome of the toss.  To represent the die we need only the corresponding chance function, e.g. the function that assigns 1/6 to the set of worlds in which the outcome is x (for x = 1, 2, 3, 4, 5, 6).  Then, a special feature of this sort of model, there is the set of probability functions on these worlds, representing the different subjective probabilities one might have for (a) what the outcome is, and (b) in what fashion the die is loaded.

Definition.  A probability space M is a triple <K, F, PP> where K is a non-empty set, F is a Borel field of subsets of K, and PP is a family of probability measures with domain F.  

The members of K we call “worlds” and the members of F, the ‘measurable sets’, we call propositions.

Definition.  A subset PP* of PP in probability space M = <K, F, PP> is closed under conditionalization iff for all P in PP* and all elements A of F, P(–|A) is in PP* whenever P(A) > 0.

Definition.  A probability space with chance M is a quadruple <K, ch, F, PP> where <K, F, PP> is a probability space and ch is a function that assigns to each world w in K a probability function ch(w) defined on F.

Definition.  For world w in K, GoodProb(w) = {P in PP: for all A in F, if P(ch(w)(A) > 0) = 1 then P(A) > 0}.

Theorem.  GoodProb(w) is not closed under conditionalization.

Proved informally the Moore Paradox way, in the body of this post.     

The relevant language has as vocabulary a set of atomic sentences, connectives & and ~, propositional operators (subnectors, in Curry’s terminology) P and ch, relational symbols = and >, and a set of numerals including 0.

There is no iteration or nesting of P or ch, which form terms from sentences.

Simultaneous inductive definition of the set of terms and sentences:

  1. An atomic sentence is a sentence
  2. If A is a sentence then ~A is a sentence
  3. If A, B are sentences then (A & B) is a sentence
  4. If A is a sentence and no terms occur in A then ch(A) is a term 
  5. If A is a sentence and P does not occur in A then P(A) is a term
  6. If t is a term and n is a numeral then (t = n) and (t > n) are sentences.
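To make the no-nesting side conditions in clauses 4 and 5 concrete, here is a sketch in Python (the tuple encoding is my own, not from the post): sentences are nested tuples, and well_formed rejects a term inside ch(…) and a P inside P(…).

```python
# Encoding (hypothetical, mine): ("atom", name), ("not", A), ("and", A, B),
# and comparisons ("ch=", A, n), ("ch>", A, n), ("P=", A, n), ("P>", A, n).

def has_term(s):
    # a term occurs in s iff some comparison (hence some ch(...) or P(...)) does
    op = s[0]
    if op == "atom": return False
    if op == "not":  return has_term(s[1])
    if op == "and":  return has_term(s[1]) or has_term(s[2])
    return True                      # any comparison contains a term

def has_P(s):
    op = s[0]
    if op == "atom": return False
    if op == "not":  return has_P(s[1])
    if op == "and":  return has_P(s[1]) or has_P(s[2])
    return op in ("P=", "P>") or has_P(s[1])

def well_formed(s):
    """Clauses 1-6: clause 4 bars terms inside ch(...), clause 5 bars P inside P(...)."""
    op = s[0]
    if op == "atom": return True
    if op == "not":  return well_formed(s[1])
    if op == "and":  return well_formed(s[1]) and well_formed(s[2])
    if op in ("ch=", "ch>"):
        return well_formed(s[1]) and not has_term(s[1])
    if op in ("P=", "P>"):
        return well_formed(s[1]) and not has_P(s[1])
    return False

A = ("atom", "A")
print(well_formed(("P=", ("and", A, ("ch>", A, 0)), "n")))   # True
print(well_formed(("P=", ("P>", A, 0), "n")))                # False: P inside P
```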

Truth conditions for sentences:

For M = <K, ch, F, PP> a probability space with chance, and P a member of PP, a P-admissible interpretation ||…|| of the language in M is a function that maps the sentences to propositions, and numerals to numbers (with the numeral 0 mapped to the number 0), subject to the conditions:

  1. ||A & B|| = ||A|| ∩ ||B||
  2. ||~A|| = K – ||A||
  3. ||ch(A) = n|| = {w in K: ch(w)(||A||) = ||n||}
  4. ||ch(A) > n|| = {w in K: ch(w)(||A||) > ||n||}
  5. ||P(A) = n|| = {w in K: P (||A||) = ||n||}
  6. ||P(A) > n|| = {w in K: P (||A||) > ||n||}

Note that ||P(A) = n|| is in each case either K or empty, and similarly for ||P(A) > n||.

We call a sentence A true in world w exactly if w is a member of ||A||.

For example, if A is an atomic sentence then there is no constraint on ||A|| except that it is a proposition.  And then sentence P(A & ch(A) > 0) = n is true under this interpretation (in all worlds) exactly if P assigns probability ||n|| to the intersection of set ||A|| and the set of worlds w such that ch(w)(||A||) is greater than zero.  And otherwise that sentence is not true in any world.
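Continuing the same tuple encoding (and with a toy fair-die model of my own), here is a sketch of such an interpretation ||…|| over a finite model; it reproduces the P([2]) = 5/6 computation from the body of the post.

```python
from fractions import Fraction

K = set(range(1, 7))                      # worlds = die outcomes

def ch(w):
    # in every world the die is fair, so ch(w)(A) = |A| / 6
    return lambda A: Fraction(len(A), 6)

def P(A):                                 # one member of PP (uniform, for brevity)
    return Fraction(len(A), 6)

atoms = {"Six": {6}}                      # ||Six|| = {6}

def val(s):
    """||s||: map a sentence (nested tuple) to a proposition, per conditions 1-6."""
    op = s[0]
    if op == "atom": return atoms[s[1]]
    if op == "not":  return K - val(s[1])
    if op == "and":  return val(s[1]) & val(s[2])
    if op == "ch=":  return {w for w in K if ch(w)(val(s[1])) == s[2]}
    if op == "ch>":  return {w for w in K if ch(w)(val(s[1])) > s[2]}
    if op == "P=":   return K if P(val(s[1])) == s[2] else set()
    if op == "P>":   return K if P(val(s[1])) > s[2] else set()
    raise ValueError(op)

six = ("atom", "Six")
stmt2 = ("and", ("not", six), ("not", ("ch=", six, 0)))   # [2]
print(val(stmt2))                                         # {1, 2, 3, 4, 5}
print(val(("P=", stmt2, Fraction(5, 6))) == K)            # True: P([2]) = 5/6
```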

Consistency of the Reflection Principle for Subjective Probability

A recent article by Cieslinski, Horsten, and Leitgeb, “Axioms for Typefree Subjective Probability” ends with a proof that the Reflection Principle cannot be consistently added to the axiomatic untyped probability theory which they present. 

On the other hand, Haim Gaifman’s “A Theory of Higher Order Probabilities” can be read, despite the glaring difference in interpretation, as establishing the consistency of the Reflection Principle.  

Gaifman’s theory is not untyped, and Gaifman’s approach is not axiomatic but model-theoretic. Thus it stays much closer to the original, informal presentation of the Reflection Principle.  But it is still noticeably abstract.  We can think of his models roughly like this:  certain sets of possible worlds are propositions, and there is a function pr which serves to select those propositions that can express factual statements of form “My (or, the agent’s) probability for A equals r”.

What I would like to do here is present a similar theory, staying in closer touch with the original presentation of the Reflection Principle, and entirely explicit about the way the opinion I currently express (about A, say) is constrained to harmonize with my opinions about how that opinion (about A) could change in time to come.

Introduction

The Reflection Principle purports to be an additional criterion of synchronic coherence: it relates current opinion to other current opinions.  The principle has both a general form (the General Reflection Principle) and a form specifically for agents who have opinions about their own (current and/or future) doxastic states. The latter was the original formulation, but should now properly be called the Special Reflection Principle.  I will formulate both forms precisely below.

Satisfying Reflection does not require any relation between one’s actual opinions over time.  Nevertheless it is pertinent also for diachronic coherence, because it is a constraint on the agent’s current expectation of her future opinions, and because a policy for managing one’s opinion must preserve synchronic coherence.  

So a minimal probability model, of an agent whose opinion satisfies Reflection, will consist of a probability function P with a domain that includes this sort of proposition:

(Q)   A & my opinion at (current  or future) time t is that the probability of A equals r.

I symbolize the second conjunct as pt(A) = r.  Hence, symbolically,

            (Q) A & pt(A) = r.

Statement pt(A) = r is a statement of fact, true or false, about the agent’s doxastic state at time t.  The agent can express opinions about this, as about any other facts.  

In contrast I  will use capital P to stand for the probability function that encodes the agent’s opinion.  This is the opinion that she expresses or would express with statements like “It seems twice as likely as not (to me) that it will snow tonight”.  So the sentence P(A) = r is one the agent uses to express such an opinion, and she does this in first-person language.  

The (special) Reflection Principle implies a constraint on the opinion expressed in form P(A & pt(A) = r), which relates the opinion expressed about A to the factual statement that the agent has that opinion. 

There is in the corresponding language no nesting: nothing of form P( … P …).  Whenever the agent expresses an opinion, it is an opinion about matters of fact.

We can proceed in two stages.  The first is just to see what the more modest General Reflection Principle is, and how it is to be satisfied. Then we can build on that to do the same for the Special Reflection Principle.  I will focus on modeling, and — except at one point — just take it that the relation to a corresponding language will be sufficiently clear.

Stage 1: General Reflection

My current probability for A must lie within the range spanned by the probabilities for A that I may have or come to have at any time t (present or future), as far as my present opinion is concerned.

To illustrate:  I am a weather forecaster and realize that, depending on whether a certain storm front moves in during the night, my forecast tomorrow morning will be either 0.2 or 0.8 chance of rain.  Then my present forecast for rain must be a chance x of rain tomorrow with x a number in the closed interval [0.2, 0.8].

The basic model to represent an agent who satisfies the General Reflection Principle will be the quadruple M = <S, F, TPROB, Pin>, with its elements specified as follows.

T, the set of times, is a linearly ordered finite or countable set with a first member.  For each t in T, TPROB(t) is a finite set of probability functions.  These are functions defined on a field F of sets in space S, with F having S itself as a member.  The members of F represent propositions about which, at any time t, I have an opinion, and the members of TPROB(t) are the opinions I could have at time t.

S = <S, F> I will call the basic space.  I will use A, B, … for members of F, which I will also call the elementary propositions.  The set of probability functions defined on the space S = <S, F> I will call Sp.

At the initial time the agent expresses an opinion, which for now I designate as Pin, consisting in probabilities both for the events represented in space S and about how likely she is to have at time t the various opinions represented in TPROB(t).

The General Reflection Principle requires that for all A in F, Pin(A) is within the span (convex closure, convex hull) of the set {p(A): p is in TPROB(t)}. I will designate that convex closure as [TPROB(t)].  The members of TPROB(t) are the vertices of [TPROB(t)].

Pin assigns probabilities to the members of TPROB(t), which are themselves defined on the domain of Pin.  General Reflection then implies that Pin is a mixture (convex combination) of those members, with the weights thus assigned:

Pin(A) = ∑ {Pin(p)p(A): p in TPROB(t)}

Equivalently, <S, F, Pin > is a probability space, and as it happens, for each t in T, there are appropriate weights such that Pin is a convex combination of the members of TPROB(t).  

Pin cannot be more than one thing, so those convex combinations must produce, for each time t, the same initial opinion.  We can ensure that this is possible by requiring that for all t and t’,  [TPROB(t’)] = [TPROB(t)].  Of course these sets TPROB(t) can be quite different for different times t; the vertices are different, my opinions are allowed to change.  And specifically, I will later on have some new certainties, for example after seeing the result of an experiment.  What this constraint on the span of foreseen possibilities about my opinion implies for certainties is this:  

if today I am not certain whether A,  then, if I foresee a possibility that I will become certain that A at a later time, then I foresee also a possibility that I will become certain of the opposite at that time.
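A toy numerical check of the mixture requirement (the numbers are my own, echoing the forecaster illustration above):

```python
from fractions import Fraction

# TPROB(t): two foreseen opinions about A = rain tomorrow
p_A = {"p1": Fraction(2, 10), "p2": Fraction(8, 10)}
Pin_p = {"p1": Fraction(1, 2), "p2": Fraction(1, 2)}    # Pin's weights on the vertices

# Pin(A) = sum of Pin(p) * p(A) over TPROB(t)
Pin_A = sum(Pin_p[p] * p_A[p] for p in p_A)
print(Pin_A)                                            # 1/2
assert min(p_A.values()) <= Pin_A <= max(p_A.values())  # inside the span [1/5, 4/5]
```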

SUMMARY:  In this construction so far we have Pin defined on a large family of distinct sets, namely the field F of elementary propositions, and each of the sets TPROB(t), for t in T.  

The construction guarantees that Pin, in basic model M = <S, F, TPROB, Pin> satisfies the General Reflection principle.  

But we have not arrived yet at anything like (Q), and we have not yet given any sense to ‘pt(A) = r’.  This we must do before we can arrive at a form in which the Special Reflection Principle is properly modeled.

Stage 2: Special Reflection

The function Pin cannot do all that we want from it, for we need to represent opinions that relate the agent’s probabilities for events in space S to the probabilities assigned to those events by opinions that the agent may have at various (other) times.

Intuitively, (pt(A) = r) is the case exactly if the ‘actual’ opinion at time t is represented by a function p in TPROB(t) such that p(A) = r.  In general there may be no, one, or many members of TPROB(t) which assign probability r to A.

So the proposition in question is:

(pt(A) = r)  =   {p in TPROB(t): p(A) = r}

Since Pin is defined for each p in TPROB(t), Pin assigns a probability to this proposition:

            Pin(pt(A) = r)  = ∑{Pin(p): p(A) = r and p is in TPROB(t)}.  

But what is not well-defined at this point is a probability for the conjunction (Q), mentioned above, since A is a member of field F and (pt(A) = r) is a member of a quite different field, of subsets of TPROB(t).

We must depart from the minimalist construction in the preceding section, and extend the function Pin  to construct a function P which is well-defined, for each time t, on a larger space.  This process is what Dick Jeffrey called Superconditioning. 

I have explained its relevant form in the preceding post, with an illustration and intuitive commentary.  So I will here proceed a bit more formally than in the preceding post and without much intuitive explanation.  

NOTE.  At this point we should be a bit more explicit about how the model relates to a corresponding language.  Suppose L is a language of sentential logic, and is interpreted in the obvious way in model M:  the semantic value [[Q]] of a sentence Q in L is an elementary proposition, that is, a subset of S, a member of field F.  

As we now build a larger model, call it M*, by Superconditioning, I need to have a notion of something in M* being ‘the same proposition’ as a given elementary proposition in M.  I will use the * notation to do that:  there will be relation * between M and M* such that a sentence Q which has value [[Q]] in M  has semantic value [[Q]]* in M*.  

Quick overview of the final model, restricted to a specific time t:  

Given: the basic model defined above, to which we refer in the description of final model M*.  

M*(t) = <S*, F*, TPROB*(t), P>, with

S* = S x TPROB(t)

If A is in F then A* = {<x, p>:  x is in  A, p is in TPROB(t)}, 

equivalently, A* = A x TPROB(t)

F* is a field of subsets of S* which includes {A*: A is in  F}

TPROB*(t) and P, defined on F*, are such that for all A in F, P(A*) = Pin(A)

Construction of the final model, for specific time t

We focus on a specific time t, but the procedure is the same for each t in T.  Let TPROB(t) = {p1, …, pn}.  Each of these probability functions is defined on the space S.

But now we will think instead of the combination of each of these probability functions with the space S as a separate entity.

For each j, from 1 to n, there is a set Sj = {<x, pj>: x in S}.  Equivalently, Sj = S x {pj}.

We define:  

            for A in F, Aj = {<x, pj>: x is in A},

            the field Fj = {Aj : A is in F}.  

Clearly Sj = <Sj, Fj> is an isomorphic copy of the basic space S, disjoint from Sk unless j = k.

            S* = <S*, F*> is the sample space with S* = ∪{Sj: j = 1, …, n}.

Equivalently, S* = S x TPROB(t)

            F* is the least field of subsets of S* that includes S* and includes ∪{Fj: j = 1, …, n}.  

The sets Sj therefore belong to F* and are the cells in a partition of S*.  (These cells represent the distinct situations associated with the different probability functions pj, j = 1, …, n.)

Equivalently, F* is the closure of ∪{Fj: j = 1, …, n} under finite union.  This is automatically closed under finite intersection, since each field Fj is closed under intersection, and these fields are disjoint.  F* has S* as a member, because S* is the union of all the cells.  And the infimum of F* is the empty set Λ, because Λ is a member of each field Fj; note also that Λ x TPROB(t) is just Λ.

Clearly, all members of F* are unions of subsets of those cells, specifically finite unions of sets Ak such that A is in F, for certain numbers k between 1 and n, inclusive.

For A in F, we define A* = ∪{Aj: j = 1, …, n}.  Clearly, A* = {<x, p>: x in A, p in TPROB(t)}.

The function f: A –> A* is a set isomorphism between F and the subfield {A*: A is in F} of F*.  For example,

A* ∩ B*    = [∪{Aj: j = 1, …, n}] ∩ [∪{Bj: j = 1, …, n}]

                  = ∪{Aj ∩ Bj: j = 1, …, n}

                  = (A ∩ B)*

Now we come to the probabilities.

Definition.  pj* is the probability function on Sj defined by pj*(Aj) = pj(A) for each proposition A in F.

            TPROB*(t) = {pj*: j = 1, …, n}

Looking back once again to our basic model we recall that there are positive numbers bj for j = 1, …, n, summing to 1 such that Pin = ∑{bjpj: j = 1, …, n}.  

We use these same numbers to define a probability function P on sample space S* as follows:

            For j = 1, …,n

  1. P(Sj) = bj
  2. for each A in F, P(Aj|Sj) = pj*(Aj).  Equivalently, for each A in F, P(A* ∩ Sj) = P(Aj) = P(Sj)pj*(Aj).
  3. P is additive: if A and B are disjoint members of F* then P(A ∪ B) = P(A) + P(B).

Since all members of F* are finite unions of members of the fields Fj, j = 1, …, n, it follows that these conditions define P on all members of F*.

It is clear that 3. does not conflict with 2. since pj* is additive.  Since the weights bj are positive and sum to 1, and each function pj* is a probability function which assigns 1 to Sj it follows that P is a probability function with domain F*, and is the appropriate convex combination of the functions pj*.

P(A*) = ∑{P(A* ∩ Sj): j = 1, …, n}

= ∑{P(Aj): j = 1, …, n}

= ∑{bjpj*(Aj): j = 1, …, n}

= ∑{bjpj(A): j = 1, …, n}

= Pin(A)

About the Special Reflection Principle

Define:

(pt(A) = r) = ∪{Sj : P(A*|Sj) = r}

Equivalently,

(pt(A) = r) = ∪{Sj : pj*(Aj) = r}

Since TPROB*(t) is finite, we can switch to a list, relabeling so that the cells in question are Sk, …, Sm:

(pt(A) = r)  =  ∪{Sj : j = k, …, m}

            P(pt(A) = r)  =  ∑{P(Sj): j = k, …, m} = ∑{bj: j = k, …, m}

With this in hand we now calculate the probability of the conjunction (Q):

A* ∩ (pt(A) = r)  =  A* ∩ ∪{Sj : j = k, …, m}

                                = ∪{A* ∩ Sj : j = k, …, m}

                                = ∪{Aj : j = k, …, m}

            P(A* ∩ (pt(A) = r))  =  ∑{P(Aj): j = k, …, m}

                                              = ∑{P(Sj)pj*(Aj): j = k, …, m}

                                              = ∑{bjpj*(Aj): j = k, …, m}

                                              = r∑{bj: j = k, …, m}

because for each j = k, …, m, pj*(Aj) = r.

Given both these results, and the definition of conditional probability, we arrive at:

            P(A* | pt(A) = r) = r, if defined, that is, if P(pt(A) = r) > 0,

which is the Special Reflection Principle.
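To see the construction work end to end, here is a small numerical sketch (the three-state space, the two members of TPROB(t), and the weights bj are all toy choices of my own): it checks both P(A*) = Pin(A) and the Special Reflection equation.

```python
from fractions import Fraction
from itertools import product

S = ["s1", "s2", "s3"]
p = {                                             # TPROB(t), two opinions
    "p1": {"s1": Fraction(1, 2), "s2": Fraction(1, 2), "s3": Fraction(0)},
    "p2": {"s1": Fraction(0),    "s2": Fraction(1, 2), "s3": Fraction(1, 2)},
}
b = {"p1": Fraction(1, 4), "p2": Fraction(3, 4)}  # weights: Pin = b1*p1 + b2*p2

# S* = S x TPROB(t); conditions 1.-3. amount to P(<x, pj>) = bj * pj(x)
S_star = list(product(S, p))
P = {(x, j): b[j] * p[j][x] for (x, j) in S_star}

def prob(A):                                      # P on subsets of S*
    return sum(P[w] for w in A)

def star(A):                                      # A* = A x TPROB(t)
    return {(x, j) for (x, j) in S_star if x in A}

def cell(j):                                      # Sj = S x {pj}
    return {(x, k) for (x, k) in S_star if k == j}

def pt_equals(A, r):                              # (pt(A) = r): union of cells
    return set().union(*(cell(j) for j in p
                         if sum(p[j][x] for x in A) == r))

A = {"s1", "s2"}
Pin_A = sum(b[j] * sum(p[j][x] for x in A) for j in p)
assert prob(star(A)) == Pin_A                     # P(A*) = Pin(A) = 5/8

r = Fraction(1, 2)                                # only p2 gives A probability 1/2
E = pt_equals(A, r)
assert prob(star(A) & E) / prob(E) == r           # Special Reflection: P(A*|E) = r
```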

NOTES

1]  The same formalism can have many uses and interpretations — just like, in physics, the same equation can represent many different processes.  Of course, here “the equation” refers just to the mathematical form, with no reference to meaning or interpretation.

In that sense the Reflection Principle appeared first (as far as I can remember) as Miller’s Principle, connecting subjective probability with objective chance, and it was used in that sense by David Lewis in his theory thereof.

Then Haim Gaifman, who uses the notation P and pr, gave Miller’s Principle the interpretation that the person expressing her opinion P takes pr to be the opinion of someone(s) or something(s) recognized as expert(s), to whom she defers.  I have drawn on Gaifman’s theory with that interpretation elsewhere, to give a sense to acceptance of a scientific theory.

2] But the possibility of this sort of reading, which I had mentioned in “Belief and the Will” only to dismiss it for the issue at hand, did promote a misreading of the Reflection Principle.  (As David Christensen did, for example.)  It would clearly be irrational for me to defer to my future opinion except while supposing that I will then be both of sound mind and more knowledgeable than I am now.  But it is not irrational even now to expect myself to be both of sound mind and more knowledgeable, as a result of the sort of good management of my opinion over time to which I am committed.  And this, all the while knowing that I may either be interrupted in this management by events beyond my control or interrupt myself, in the course of gaining new insights.

This is exactly of a piece with the fact that I can morally promise, for example, to protect someone, and expect myself to keep my promise, and morally expect others to rely on my promise, while knowing — as we all do — the general and irremediable fact that, due to circumstances presently unpredictable, I may fail to do so, either because of force majeure or because of overriding moral concerns.  In epistemology we must strive for the same subtlety as in ethics.

3] See previous post, “Conditionalizing on a combination of probabilities” for Jeffrey’s concept of Superconditioning and its relation to the informal Reflection Principle.

REFERENCES

Cieslinski, Cezary,  Leon Horsten, and Hannes Leitgeb (2022) “Axioms for Typefree Subjective Probability”.  arXiv:2203.04879v1

Gaifman, Haim (1988)  “A Theory of Higher Order Probabilities”.  Pages 191–219 in Brian Skyrms and William L. Harper (eds.) Causation, Chance and Credence.  Dordrecht: Kluwer.

Van Fraassen, Bas C. (1995)  “Belief and the Problem of Ulysses and the Sirens.”  Philosophical Studies 77: 7–37.

Deontic logic: Horty’s gambles (2)

In the second part of his 2019 paper Horty argues that there is a need to integrate epistemic logic with deontic logic, for “ought” statements often have a sense in which their truth-value depends in part on the agent’s state of knowledge.

I agree entirely with his conclusion. But is the focus on knowledge not too strict? Subjectively it is hard to distinguish knowledge from certainty — and apart from that, when we don’t have certainty, we are still subject to the same norms. So I would like to suggest that rational opinion, in the form of the agent’s actual subjective probability, is what matters.

Here I will examine Horty’s additional examples of gambling situations with that in mind. I realize that this is not sufficient to demonstrate my contention, but it will show clearly how the intuitive examples look different through the eyes of this less traditional epistemology.

Horty’s figure 4 depicts the following situation: I pay 5 units to be offered one of two gambles X1, X2 on a coin toss. My options will be to bet Heads, to bet Tails, or Not To Gamble. But I will not know which gamble it is! You, the bookmaker, will independently flip a coin to determine that, and not tell me the outcome. In the diagram shown here, X1 is the gamble on the left and X2 the gamble on the right.

On Horty’s initial analysis, if in actual fact I am offered X1 then I should bet Heads, since that has the best outcome. But as he says, rightly, I could not be faulted for not doing that, since I did not know whether I was being offered X1 or X2.

Even if the conclusion is the same, the situation looks different if the agent acts on the basis of the expectation values of the options available to him. The alternatives depicted in the diagram are equi-probable (we assume the coins are fair). So for the agent, who has paid 5 units, his net expectation value for betting Heads (in this situation where it is equally probable that he is betting in X1 or in X2) is the average of gaining 5 and losing 5. The expectation value is 0. Similarly for the option of betting Tails, and similarly for the option of Not Gambling: each has net expectation value 0. So in this situation it just is not true that the agent ought to take up any of these options — it is indifferent what he does.
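A sketch of that expectation-value reasoning (the net payoffs are my reconstruction from the text, since the diagram is not reproduced here: after the 5-unit fee, each bet nets +5 in one gamble and -5 in the other, and Not Gambling nets 0):

```python
from fractions import Fraction

half = Fraction(1, 2)      # the hidden toss makes X1 and X2 equiprobable

# net payoff by (gamble offered, option chosen) -- assumed numbers
net = {
    ("X1", "Heads"): 5,  ("X2", "Heads"): -5,
    ("X1", "Tails"): -5, ("X2", "Tails"): 5,
    ("X1", "NoBet"): 0,  ("X2", "NoBet"): 0,
}

def expectation(option):
    return half * net[("X1", option)] + half * net[("X2", option)]

for option in ("Heads", "Tails", "NoBet"):
    print(option, expectation(option))   # each option has expectation 0
```

The same computation with different payoff numbers handles Horty’s second example, discussed next.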

Horty offers a second example, where the correct judgment is that I ought not to gamble, to show that his initial analysis failed to entail that. Here is the diagram, to be interpreted in the same way as above — the difference is in the value of the separate possible outcomes.

Reasoning by expectation value, the agent concludes that indeed she ought not to gamble. For by not gambling the payoff is 5 with certainty, while the expectation value of Betting Heads, or of Betting Tails, is 2.5.

So on this analysis as well we reach the right conclusion: the agent ought not to gamble.

Entirely in agreement with Horty is the conclusion that these situations are adequately represented only if we bring epistemology into play. What the agent ought to do is not to be equated with what it would objectively, from a God’s eye view, be best for her to do. It is rather what she ought to do, given her cognitive/epistemic/doxastic situation in the world. But she cannot make rational gambling decisions in general if her knowledge (or certainty) is all she is allowed to take into account.

It would be instructive to think also about the case in which it is known that the coin has a bias, say that on each toss (including the hidden first toss) it will be three times as likely as not to land heads up. Knowledge will not be different, but betting behavior should be.

A temporal framework, plus

Motivation: I have been reading John Horty’s (2019) paper integrating deontic and epistemic logic with a framework of branching time. As a preliminary to exploring his examples and problem cases I want to outline one way to understand indeterminism and time, and a simple way in which such a framework can be given ‘attachments’ to accommodate modalities. Like Horty, I follow the main ideas introduced by Thomason (1970), and developed by Belnap et al. (2001).

The terms ‘branching time’ and ‘indeterminist time’ are not apt: it is the world, not time, that is indeterministic, and the branching tree diagram depicts possible histories of the world. I call a proposition historical if its truth or falsity in a world depends solely on the history of that world. At present I will focus solely on historical propositions, and so worlds will not be separately represented in the framework I will display here.

We distinguish what will actually happen from what is settled now about what will happen. To cite Aristotle’s example: on a certain day it is unsettled whether or not there will be a sea-battle tomorrow. However, what is settled does not rule out that there will be a sea-battle, and this too can be expressed in the language: some things may or can happen and others cannot.

Point of view: The world is indeterministic, in this view, with the past entirely settled (at any given moment) but the future largely unsettled. Whatever constraints there are on how things may come to be must derive from what has been the case so far, and similarly for whatever basis there is for our knowledge and opinion about the future. Therefore (?), our possible futures are the future histories of worlds whose history agrees with ours up to and through now.

Among the possible futures we have one that is actual, it is what will actually happen. This has been a subject of controversy; how could the following be true:

there will actually be a sea battle tomorrow, but it is possible that there will not be a sea battle tomorrow?

It can be true if ‘possible’ means ‘not yet settled that not’. (See Appendix for connection with Medieval puzzles about God’s fore-knowledge.)

Representation: A temporal framework is a triple T = <H, R, W>, where H is a non-empty set (the state-space), R is a set of real numbers (the calendar), and W is a set of functions that map R into H (the trajectories, or histories). Elements of H are called the states, elements of R the times.

(Note: this framework can be amended, for example by restrictions on what R must be like, or having the set of attributes restricted to a privileged set of subsets of H, forming a lattice or algebra of sets, and so forth.)

Here is a typical picture to help the imagination. Note, though, that it may give the wrong impression. In an indeterministic world, possible futures may intersect or overlap.

If h is in W and t in R then h(t) is the state of h at time t. Since many histories may intersect at time t, it is convenient to use an auxiliary notion: a moment is a pair <h, t> such that h(t) is the state of h at t.

An attribute is a subset of H, a proposition is a subset of W. For tense logic, what is more interesting is tensed propositions, which is to say, proposition-valued functions of time.

Basic propositions: if R is a region in the state-space H, the proposition R^(t) = {h in W: h(t) is in R} is true in history h at time t exactly if h(t) is in R. It is natural to read R^(t) as “it is R now”. If R is the attribute of being rainy then R^(t) would thus be read as “It is raining”.

I will let ‘A(t)’ stand for any proposition-valued function of time; the above example in which R is a region in H, is a special case. For any particular value of t, of course, A(t) is just a proposition, it is the function A(…) that is the tensed proposition. The family of basic propositions can be extended in many ways; first of all by allowing the Boolean set operations: A.B(t) = A(t) ∩ B(t), and so forth. We will look at more ways as we go.

Definitions:

  • worlds h and k agree through t (briefly h =/t k) exactly if h(t’) = k(t’) for all t’ ≤ t.
  • H(h, t) = {k in W: h =/t k} is the t-cone of h, or the future cone of h at t, or the future cone of moment <h, t>.
  • SA(t) = {h in W: H(h, t) ⊆ A(t)}, the proposition that it is settled at t that A(t)

The term “future cone” is not quite apt since H(h, t) includes the entire past of h, which is common to all members of H(h, t). But the cone-like part of the diagram is the set of possible futures for h at t.

Thus S, “it is settled that”, is an operator on tensed propositions. For example, if R is a region in the state-space then SR^(t) is true in h at t exactly if R has in it all histories in the t-cone of h. Logically, S is a sort of tensed S5-necessity operator. In Aristotle’s sea-battle example, nothing is settled on a certain evening, but early the next morning, as the fleets approach each other, it is settled that there will be a sea-battle.

There are two important notions related to settled-ness: a tensed proposition A(t) is backward-looking iff membership in A(t) depends solely on the world’s history up to and including t. That is equivalent to: A(t) is part of SA(t), and hence to A(t) = SA(t). If A is a region in H then A^(t) is backward-looking iff each future cone is either entirely inside A, or else entirely disjoint from A.

Similarly, A is sedate if h being in A(t) guarantees that h is in A(t’) for all t’ later than t (that world has, so to say, settled down into being such that A is true). Note well that a backward-looking proposition may be “about the future”, because in some respects the future may be determined by the past. Examples of sentences expressing such propositions:

“it has rained” is both backward-looking and sedate, “it will have rained” is sedate but not backward looking, and “it will rain” is neither.

Tense-modal operators can be introduced in the familiar way: “it will be A”, “it was A”, and so forth express obvious tensed propositions, e.g. FA(t) = {h in W: h is in A(t’) for some t’ > t}. More precise reckoning can also be introduced. For example if the numbers in the calendar represent days, then “it will be A tomorrow” expresses the tensed proposition TomA(t) = {h in W: h(t+1) is in A}.
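A sketch of these definitions over a handful of discrete-time histories (the three toy histories and state names are mine; the post’s calendar is the reals, but a finite fragment suffices to illustrate the sea-battle example):

```python
T = [0, 1, 2]                                     # times: evening, morning, battle-time
W = {                                             # histories: time -> state
    "h1": {0: "calm", 1: "fleets", 2: "battle"},
    "h2": {0: "calm", 1: "fleets", 2: "battle"},
    "h3": {0: "calm", 1: "dispersed", 2: "peace"},
}

def agree_through(h, k, t):                       # h =/t k
    return all(W[h][u] == W[k][u] for u in T if u <= t)

def cone(h, t):                                   # H(h, t), the t-cone of h
    return {k for k in W if agree_through(h, k, t)}

def settled(A, t):                                # SA(t) = {h: H(h, t) ⊆ A(t)}
    return {h for h in W if cone(h, t) <= A(t)}

def battle(t):                                    # R^(t) for R = battle states
    return {h for h in W if W[h].get(t) == "battle"}

def F(A, t):                                      # FA(t): it will be A
    return {h for h in W if any(h in A(u) for u in T if u > t)}

will_battle = lambda t: F(battle, t)
print(settled(will_battle, 0))    # set(): nothing settled on the evening before
print(settled(will_battle, 1))    # {'h1', 'h2'}: settled as the fleets approach
```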

Attachments

If T is a temporal framework then an attachment to T is any function that assigns new elements to any entities definable as belonging to T. The examples will make this clear.

Normal modal logic

Let T = <H, R, W> be a temporal framework and REL a function that assigns to W a binary relation on W. Define:

◊A^(t) = {h in W: for some k in W such that REL(h, k), k(t) is in A}

Read as the familiar ‘relative possibility’ relation in standard possible world semantics, a sentence expressing ◊A^(t) would be of the form “it is possible that it is raining”.

But such a modal logic has various instances. In addition to alethic modal logic, there is for example a basic epistemic logic where the models take this form. There, possibility is compatibility with the agent’s knowledge, ‘possible for all I know’. In that case a reading of ◊A^(t) would be “It is possible for all I know that it is raining”, or “I do not know that it is not raining”.

Deontic logic

While deontic logic began as a normal modal logic, it has now a number of forms. An important development occurred when Horty introduced the idea of reasons and imperatives as default rules in non-monotonic logic. There is still, however, a basic form that is common, which we can here attach to a temporal framework.

To each moment we attach a situation in which an agent is facing choices. What ought to be the case, or to be done, depends on what it is best for this agent to do. Horty has examples to show that this is not determined simply by an ordering of the possible outcomes, it has to be based on what is best among the choices. (The better-than ordering of the choices can be defined from a better-than ordering of the possible outcomes, as Horty does. But that is not the only option; it could be based for example on expectation values.)

Let T = <H, R, W> be a temporal framework and SIT a function that assigns to each moment m = <h, t> a situation, represented by a family Δ of disjoint subsets of the future cone of m, plus an ordering of the members of Δ. The cells of Δ are called choices: if X is in Δ then X represents the choice to see to it that the actual future will be in X. The included ordering ≤ of sets of histories may be constrained or generated in various ways, or made to depend on specific factors such as h or t. Call X in Δ optimal iff for all Y in Δ, if X ≤ Y then Y ≤ X. Then one way to explicate ‘Ought’ is this:

OA(t) = {h in W: for some optimal member X of Δ in SIT(<h, t>), X ⊆ A(t)}

This particular formulation allows for ‘moral dilemmas’, that is cases in which more than one cell of Δ is optimal and each induces an undefeated obligation. That is, there may be mutually disjoint tensed propositions A(t) and B(t) such that a given history h is both in OA(t) and in OB(t), presenting a moral dilemma.
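Here is a sketch of this Ought operator at a single moment (the choice cells, the histories, and the numeric value ordering are my own toy data; with a total numeric ordering, optimality reduces to maximality), echoing the second gambling example:

```python
# Δ at a moment <h, t>: disjoint sets of histories in the future cone
choices = {
    "gamble":  {"h1", "h2"},
    "refrain": {"h3", "h4"},
}
value = {"gamble": 2.5, "refrain": 5.0}           # induces the ordering of Δ

def optimal(delta):
    best = max(value[X] for X in delta)
    return {X for X in delta if value[X] == best}

def ought(A, delta):
    """OA holds at the moment iff some optimal choice X has X ⊆ A."""
    return any(delta[X] <= A for X in optimal(delta))

A_no_gamble = {"h3", "h4"}                        # proposition: no gambling
print(ought(A_no_gamble, choices))                # True: she ought not gamble
```

If two cells tie for the best value, both are optimal, and two disjoint propositions can each be obligatory: the moral-dilemma case just described.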

An alternative formulation could base what ought to be only on the choice that is uniquely the best, and ensure that there is always such a choice that is ‘best, all considered’.

Subjective probability

We imagine again an agent in a situation at each moment <h, t>, this time with opinion, represented by a probability function P<h,t> defined on the propositions. (If the state-space is ‘big’ the attributes must be restricted to a Boolean algebra (field) of subsets of the state-space, and thus similarly restrict the family of propositions.)

This induces an assignment of probabilities to tensed propositions: thus if R is a region in H, P(R^(t)) = r is true in h at t exactly if P<h, t>({k in W: k(t) is in R}) = r. Similarly, the probability of FR^(t) in h at t is P<h,t>({k in W: k(t’) is in R for some t’ > t}). So if R stands for the rainy region of possible states, this is the agent’s opinion, in moment <h,t>, that it will rain.

In view of the above remarks about the dependency of future on the past, the subjective probabilities will tend to be severely constrained. One natural constraint is that if h =/t h’ then P<h,t> = P<h’,t>.

In Horty’s (2019) examples (which I would like to discuss in a sequel) it is clear that the agent knows (or is certain about) which futures are possible. In that case, at each moment, the future cone of that moment has probability 1. For any proposition A(t), its probability at <h, t> equals the probability of A(t) ∩ H(h, t).

APPENDIX

I am not unsympathetic to the view that only what is settled is true. But the contrary is also reasonable, and simpler to represent. However, we face the puzzle that I noted above, about whether it makes sense to say that we have different possible futures, though one is actual, and future tense statements are true or false depending on what the actual future is.

In the Middle Ages this came up as the question of compatibility between God’s foreknowledge and free will. If God, being omniscient, knew already at Creation that Eve would eat the apple, and that Judas would betray Jesus, then it was already true then that they would do that. Doesn’t that imply that it wasn’t up to them, that they had no choice, that nothing they could think or will would alter the fact that they were going to do that?

No, it does not imply that. God knew that they would freely decide on what they would do, and also knew what they would do. If that does not seem clearly consistent to you — as I suppose it shouldn’t! — I refer you to the literature, e.g. Zagzebski 2017.

REFERENCES

(I adapted the diagrams from this website)

Belnap, Nuel; Michael Perloff, and Ming Xu (2001) Facing the Future; Agents and Choices in our Indeterministic World. New York: Oxford University Press.

Horty, John (2019) “Epistemic Oughts in Stit Semantics”. Ergo 6: 71-120.

Müller, T. (2014) “Introduction: The Many Branches of Belnap’s Logic”. In: Müller, T. (ed.) Nuel Belnap on Indeterminism and Free Action. Outstanding Contributions to Logic, vol 2. Springer, Cham. https://doi.org/10.1007/978-3-319-01754-9_1

Thomason, R. H. (1970) “Indeterminist Time and Truth Value Gaps,” Theoria 36: 264-281. 

Zagzebski, Linda (2017) “Foreknowledge and Free Will“, The Stanford Encyclopedia of Philosophy (Summer 2017 Edition), Edward N. Zalta (ed.).