epistemic modals – Bas van Fraassen's philosophy blog

[Revised on September 29, 2022] [Remarks on atomless lattices added October 6, 2022]

It is incumbent on any treatment of epistemic modals to show what is wrong with such a statement as “It is raining but it might not be”. To prove the relevant theorem H&M introduce a special condition, Knowability.

This term, as well as intuitions about what the condition implies, immediately recalls Fitch’s Paradox of Knowability. Fitch argued that if every truth is knowable then every truth is known. That conclusion is startling because we are sure there are many propositions which are true but not known to be true, and we are inclined to think that whatever is the case could be known to be the case. But the argument is straightforward: for any proposition A consider A* = (A and it is not known that A). There is no possible world in which it is true that it is known that A*. So our ostensible certainty is refuted.

If we look for an analogue in epistemic modals, replacing “it is known that” by “it must be that”, we can see immediately that Fitch is evaded in a way that he could not be evaded in classical theories of modality. For such a statement as “It is raining but it is not the case that it must be raining”, equivalent to “It is raining but it might not be raining”, is never true, not true in any possibility. So Fitch’s argument does not get off the ground.

But we also see in Holliday and Mandelkern’s theory that the possibility that every truth is known can be realized, and that this can actually play a role in illuminating epistemic modals.

Before making this precise, my thought quickly stated is that in the classical reading there are indeed many examples of true propositions to the effect that A and that it is not known that A, but not in the reading where “not” is the orthocomplement.

Propositions and the i-function

Recall here that we are dealing with the complete orthocomplemented lattice of propositions, which is formed by a closure operation on a set of possibilities. The zero and unit element are, respectively, the empty set and the set of all possibilities. For each possibility x there is a possibility i(x) such that “It must be that A” is true exactly if i(x) is in A. The first condition on the i-function is

Facticity. x is a refinement of i(x). Symbolically: x ⊑ i(x)

The second condition is

Knowability. for every possibility x there is y such that i(y) ⊑ x

Given Facticity that means that y ⊑ i(y) ⊑ x: all propositions true at x are true at i(y) and all propositions true there are true at y. What does it mean for the lattice of propositions for this condition to hold?

The simpler case is that of a complete atomistic lattice: A is an atom iff only the 0 element and A itself imply A. Recall here the earlier notation for the ‘span’ or ‘support’ of a possibility [x] = {y: y ⊑ x}. Let’s call a possibility x atomic exactly if z ⊑ x implies that [x] = [z]. So A is an atom iff A = [z] for some atomic possibility z.

If x is atomic and [x] has more than one member, those members are for all purposes in this theory the same, indiscernible, a harmless redundancy. In the example of a Euclidean space, where x is a vector, [x] = {y: y = kx for some number k}, and vectors which are multiples of each other belong to all the same subspaces and in physics do not represent different states. But the redundancy is easily removed too, so without loss of generality, I’ll add here:

Atom-uniqueness. If x is atomic then [x] has only one member.

Atomistic and atomless lattices

A lattice is atomistic (or atomic) exactly if each element is the join of a set of atoms. Specifically, in that case, each possibility has a refinement which is atomic.

Lemma. If the lattice of propositions is atomistic then Knowability holds if and only if w = i(w) for each atomic possibility w.

Clearly if each possibility x has an atomic refinement w such that w ⊑ x and w = i(w) then Knowability holds. Conversely, if Knowability holds, and w is atomic then if y ⊑ i(y) ⊑ w then [y] = [w] = [i(y)], and so by Atom-uniqueness, w = i(w).

What if the lattice of propositions is not atomistic? Then any element x may have an infinite chain of refinements, and the condition has to be that for at least one element y in that chain, i(y) is also in that chain. But the same would apply to this y, and so we see an infinitely descending subchain of the form … y(j) ⊑ i(y(j)) ⊑ y(k) ⊑ … ⊑ x.

If the lattice is atomless it is certainly possible for some element y to be such that y = i(y). In that case Knowability holds for all the elements x such that y refines x. But then, nevertheless, there is an element z that refines y, and a further element w such that w ⊑ i(w) ⊑ y, and hence also w ⊑ i(w) ⊑ x. So, if the lattice is atomless we can conclude that for each element y such that y ⊑ i(y) ⊑ x there is a distinct element w that refines y and w ⊑ i(w) ⊑ x. There is no bottom to it ….

So now what happens to Fitch’s paradox?

Suppose the lattice is atomistic, Knowability holds, x is in A, but also in ~□A. Then there is an atomic refinement w such that w ⊑ x, and all of the following are true at w: A, ~□A, □A, □~ □A. That is impossible. So there is no possibility x in which (A ∩ ~□A) is true. (Similarly, even if less transparently, if the lattice is not atomistic and Knowability holds.)
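The steps of this argument can be laid out one per line. This is only a sketch of the reasoning in the paragraph above, writing □A for {x: i(x) is in A} and assuming (as i-regularity is said to guarantee) that □A is a proposition, so that its meet with its orthocomplement is the zero element, the empty set.

```latex
\begin{align*}
&\text{Suppose } x \in A \cap {\sim}\Box A. \\
&\text{By atomisticity, Knowability and the Lemma, there is an atomic } w \sqsubseteq x \text{ with } i(w) = w. \\
&w \sqsubseteq x \;\Longrightarrow\; w \in A \ \text{ and } \ w \in {\sim}\Box A. \\
&i(w) = w \in A \;\Longrightarrow\; w \in \Box A. \\
&i(w) = w \in {\sim}\Box A \;\Longrightarrow\; w \in \Box{\sim}\Box A. \\
&\text{Hence } w \in \Box A \cap {\sim}\Box A = \varnothing, \ \text{which is impossible.}
\end{align*}
```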

And yet of course it is the case that everything that is true in x is known at some other possibility, namely at the atomic possibility w which refines x, since i(w) ⊑ x.

Notice that the argument I just gave does not go through if we just suppose that x is in A and x is not in □A. For our “not” in the metatheory is not just an orthocomplement, it is classical. There is the case in which x is in A and i(x) is neither in A nor in ~A, which is not ruled out a priori. As pointed out in the previous post, the condition of i-regularity is required even to establish that {x: i(x) is in A} is a proposition. (And we must note that in H&M’s proof of 4.21 both Knowability and i-regularity are invoked.)

Note. In view of the lemma it would seem that the i-function is not easily identifiable. In an atomistic lattice each element is the join of the atoms which refine it. Supposing that z is such that [z] = [x] ⊕ [y], where x and y are atomic so that x = i(x) and y = i(y), there cannot be in general a simple relation between i(z) and the pair i(x) and i(y). For the value of the i-function must in general be a ‘less informative’ possibility of which its argument is a non-trivial refinement.

NOTE. Reference is to the Holliday and Mandelkern article, at https://arxiv.org/abs/2203.02872v3

A brief reflection on a recent paper, “The orthologic of epistemic modals” by Wesley Holliday and Matthew Mandelkern

The motivating puzzle
Inspiration from quantum logic
Propositions and truth
Could quantum logic be the logic of natural discourse?
Why this, so far, is not enough for epistemic modals
Pure states and mixed states
An open question

1. The motivating puzzle

Here is a puzzle for you:

(Puzzle *) We, Able and Baker, A and B for short, are two propositions. Baker does not imply the negation of Able. Yet our conjunction is a self-contradiction. Who are we?

In any first or even second year logic course the right answer will be “you do not exist at all!” For if Baker does not imply the negation of Able then their conjunction could be true.

But the literature on epistemic modals furnishes examples, to wit:

“It is raining, but it might not be” cannot be true. Yet, “it might not be raining” does not imply “It is not raining”.

Such examples do rest on assumptions that may be challenged – for example, the assumption that the quoted sentences must all be true or false. But let that go. The interesting question is how such a logical situation as depicted in (Puzzle *) could be represented.

2. Inspiration from quantum logic

That sort of situation was studied in quantum logic, with its geometric models, where the propositions are represented by the subspaces.

A quantum mechanics model is built on a finite-dimensional or separable Hilbert space. In quantum logic the special properties of the infinite-dimensional, separable space do not play a role till quite late in the game. What matters is mainly that there is a well-defined orthogonality relation on this space. So it suffices, most of the time, to think just about a finite-dimensional Hilbert space (that is, a finite-dimensional inner product vector space, aka a Euclidean space).

For illustration think just of the ordinary 3-space of high school geometry but presented as a vector space. Draw the X, Y, Z axes as straight lines perpendicular to each other. The origin is their intersection. A vector is a straight line segment starting at the origin and ending at a point t, its tip; we identify this vector by its tip. The null vector 0 is the one with zero length. Vectors are orthogonal iff they are perpendicular, that is, the angle between them is a right angle.

In the diagram, the vectors drawn along the axes have tips (3, 0, 0), (0,5,0), and (0,0,2). The vector with tip (3, 5, 2) is not orthogonal to any of those.

If A is any set of vectors, its orthocomplement ~A is the set of vectors that are orthogonal to every vector in A. The subspaces are precisely the sets A such that A = ~~A. In this diagram the subspaces are the straight lines through the origin, and the planes through the origin, and of course the whole space. So the orthocomplement of the X-axis is the YZ plane. The orthocomplement of the solid arrow, with tip (3, 5, 2), is thus a plane, the one to which it is perpendicular.
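The orthogonality claims in this illustration can be checked with dot products. The following is a minimal sketch, assuming the real inner product on 3-space; the helper names dot and orthogonal, and the sample vectors in the two planes, are mine and not part of the original discussion.

```python
def dot(u, v):
    """Inner product of two real vectors given as tuples."""
    return sum(a * b for a, b in zip(u, v))

def orthogonal(u, v):
    """Two vectors are orthogonal iff their inner product is zero."""
    return dot(u, v) == 0

x_axis, y_axis, z_axis = (3, 0, 0), (0, 5, 0), (0, 0, 2)
arrow = (3, 5, 2)

# The three vectors along the axes are mutually orthogonal.
axes_ok = all(orthogonal(u, v) for u, v in
              [(x_axis, y_axis), (x_axis, z_axis), (y_axis, z_axis)])

# The arrow is orthogonal to none of them: every inner product is nonzero.
arrow_dots = [dot(arrow, a) for a in (x_axis, y_axis, z_axis)]

# A vector lies in the orthocomplement of the X-axis iff its first
# coordinate is 0, that is, iff it lies in the YZ plane.
in_yz_plane = orthogonal((0, 7, -1), x_axis)

# The orthocomplement of the arrow is the plane 3a + 5b + 2c = 0.
# Two sample vectors in that plane (chosen for illustration):
plane_samples = [(5, -3, 0), (0, 2, -5)]
samples_ok = all(orthogonal(p, arrow) for p in plane_samples)
```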

About (Puzzle *). Our imaginative, intuitive picture of a 3-space provides an immediate illustration to solve (Puzzle *). In quantum logic, the propositions are the subspaces of a Hilbert space. Just let A and B be two lines through the origin that are not orthogonal to each other. Their conjunction (intersection) is {0}, the ‘impossible state’, the contradiction. But neither is in the other’s orthocomplement. In that sense they are compatible.
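A concrete pair of lines makes the solution checkable. This is a small sketch under my own choice of vectors: A is the line through a = (1, 0, 0) and B the line through b = (1, 1, 0). Two lines through the origin meet only in 0 exactly when their direction vectors are not parallel, which the cross product detects.

```python
def dot(u, v):
    """Inner product of two real 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Cross product; it is the zero vector iff u and v are parallel."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

a = (1, 0, 0)   # direction of line A (illustrative choice)
b = (1, 1, 0)   # direction of line B (illustrative choice)

# The conjunction A ∩ B is {0}: the lines are distinct, so they share only the origin.
conjunction_is_contradiction = cross(a, b) != (0, 0, 0)

# B does not imply ~A: b is not orthogonal to a, so line B is not inside ~A.
b_implies_not_a = dot(a, b) == 0
```

Here conjunction_is_contradiction comes out True and b_implies_not_a comes out False, which is the situation described in (Puzzle *).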

3. Propositions and truth

That the propositions are taken to be the subspaces has a rationale, introduced by von Neumann, back in the 1930s. The vectors represent physical states. Each subspace can be described as the set of states in which a certain quantity has a particular value with certainty. (That means: if that quantity is measured in that state, the outcome is that value, with probability 1.)

Von Neumann introduced the additional interpretation that this quantity has that value if and only if the outcome of a measurement will show that value with certainty. This became orthodoxy: here truth coincides with relevant probability = 1.

Given this gloss, we have:

subspace A is true in (the state represented by) vector v if and only if v is in A.

We note here that if vector u = kv (in our illustration, that they lie on the same straight line through the origin) then they belong to all the same subspaces. As far as truth is concerned, they are distinct but indiscernible. (For the textbook emphasis on unit vectors see note 1.)

Since the subspaces are the closed sets for the closure operation ~~ (S = the orthocomplement of the orthocomplement of S), they form a complete lattice (note 2).

The self-contradictory proposition contains only the null-vector 0 (standardly called the origin), the one with zero length, which we count as orthogonal to all other vectors. Conjunction (meet) is represented by intersection.

Disjunction (join) is special. If X is a set of vectors, let [X] be the least subspace that contains X. The join of subspaces S and S’, denoted (S ⊕ S’), is [S ∪ S’]. It is a theorem that [S ∪ ~S] is the whole space. That means specifically that there is an orthonormal basis for the whole space which divides neatly into a basis for S and a basis for ~S. Thus every vector is the sum of a vector in S and a vector in ~S (one of these can be 0 of course).

One consequence is of course that, in traditional terms, the Law of Excluded Middle holds, but the Law of Bivalence fails. For v may be in A ⊕ B while not being either in A or in B.
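To see both halves of this at once, take A to be the X-axis and B its orthocomplement, the YZ plane. The sketch below splits the vector (3, 5, 2) of the earlier illustration into its part in A and its part in ~A; the function name split is mine, and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction

def dot(u, v):
    """Inner product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))

def split(v, s):
    """Write v as (part in the line through s) + (part orthogonal to s)."""
    c = Fraction(dot(v, s), dot(s, s))
    along = tuple(c * x for x in s)
    rest = tuple(x - y for x, y in zip(v, along))
    return along, rest

s = (1, 0, 0)        # spans A, the X-axis
v = (3, 5, 2)

in_A, in_not_A = split(v, s)

zero = (0, 0, 0)
v_in_A = in_not_A == zero        # v is in A iff nothing is left over
v_in_not_A = in_A == zero        # v is in ~A iff its part along s vanishes
# v is the sum of the two parts, so it is in the join A ⊕ ~A.
v_in_join = tuple(x + y for x, y in zip(in_A, in_not_A)) == v
```

So v is in A ⊕ ~A, as Excluded Middle requires, while being in neither A nor ~A.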

The term “orthologic” refers to any logic which applies to a language in which the propositions form an orthocomplemented lattice. So orthologic is a generalization of quantum logic.

4. Could quantum logic be the logic of natural discourse?

The idea, once advanced by Hilary Putnam, that the logic of natural language is quantum logic, was never very welcome, if only because learning quantum logic seemed just too hefty a price to pay.

But the price need not be so high if most of our discourse remains on the level of ‘ordinary’ empirical propositions. We can model that realm of discourse by specifying a sufficiently large Boolean sublattice of the lattice of subspaces.

For a non-trivial orthocomplemented lattice, such as the lattice of subspaces of a Hilbert space, has clearly identifiable Boolean sublattices. Suppose for example that the empirical situations that we can discern have only familiar classical logical relations. That means that, in effect, all the statements we make are, precise or vague, attributions to mutually compatible quantities (equivalently, there is a single maximal observable Q such that all humanly discernible quantities are functions of Q).

Then the logic of our ‘normal’ discourse, leaving aside such subtleties as epistemic modals, is classical, even if it is only a (presumably large) fragment of natural language. For the corresponding sublattice is Boolean.

5. Why this, so far, is not enough for epistemic modals

Quantum states are variously taken to be physical states or information states. The paper by Holliday and Mandelkern (henceforth H&M) deals with information, and instead of “states” they say “possibilities” (note 3). Crucial to their theory is the relation of refinement:

x is a refinement of y exactly if, for all propositions A, if y is in A then x is in A.

I will use x, y, z for possibilities, which in our case will be quantum states (those, we’ll see below, are not limited to vectors).

If we do take states to be vectors and propositions to be subspaces in a vector space, then the refinement relation is trivial. For if u is in every subspace that contains t then it is in [t], the least subspace to which t belongs (intuitively the line through the origin on which t lies) and that would then be the least subspace to which u belongs as well. So then refinement is the equivalence relation: u and t belong to the same subspaces. As far as what they represent, whether it is a physical state or an information state, there is no difference between them. They are distinct but indiscernible. Hence the refinement relation restricted to vectors is trivial.

But we can go a step further with Holliday and Mandelkern by turning to a slightly more advanced quantum mechanics formalism.

6. Pure states and mixed states

When quantum states are interpreted as information states, the uncertainty relations come into play, and maximal possible information is no longer classically complete information. Vectors represent pure states, and thought of in terms of information they are maximal, they are as complete as can be. But it is possible, and required (not only for just practical reasons), to work with less than maximal information. Mixtures, or mixed states, can be used to represent the situation that a system is in one of a set of pure states, with different probabilities. (Caution: though this is correct it is, as I’ll indicate below, not tenable as a general interpretation of mixed states.)

To explain what mixtures are we need to shift focus to projection operators. For each subspace S other than {0} there is the projection operator P[S]: vector u is in S if and only if P[S]u = u, and P[S]u = 0 if and only if u is in ~S. This operator ‘projects’ all vectors into S.
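The two defining properties of P[S] can be tried out on a small case. This is a minimal sketch, assuming that S is given by mutually orthogonal spanning vectors, so that the projection is just the sum of the projections onto each of them; the plane S and the test vectors are my own illustrative choices.

```python
from fractions import Fraction

def dot(u, v):
    """Inner product of two real vectors."""
    return sum(x * y for x, y in zip(u, v))

def project(u, basis):
    """P[S]u, where S is spanned by the mutually orthogonal nonzero vectors in basis."""
    out = [Fraction(0)] * len(u)
    for s in basis:
        c = Fraction(dot(u, s), dot(s, s))
        out = [o + c * x for o, x in zip(out, s)]
    return tuple(out)

# S: the plane spanned by two orthogonal vectors (assumed for illustration).
S = [(1, 1, 0), (0, 0, 1)]

inside = (2, 2, 5)      # lies in S
outside = (1, -1, 0)    # orthogonal to both spanning vectors, so in ~S
neither = (1, 0, 0)     # in neither S nor ~S

fixed = project(inside, S) == inside         # P[S]u = u iff u is in S
killed = project(outside, S) == (0, 0, 0)    # P[S]u = 0 iff u is in ~S
image = project(neither, S)                  # lands inside S
idempotent = project(image, S) == image      # projecting twice changes nothing
```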

For the representation of pure states, the job of vector u is done equally well by the projection operator P[u], which we now also refer to as a pure state.

Mixed states are represented by statistical operators (aka density matrices) which are, so to speak, weighted averages of mutually orthogonal pure states. For example, if u and t are orthogonal vectors then W = (1/2)P[u] + (1/2)P[t] is a mixed state.

Intuitively we can think of W as being the case exactly if the real state is either u or t and we don’t know which. (But see below.)

W is a statistical operator (or density matrix) if and only if there are mutually orthogonal vectors u(i) (other than 0) such that W = Σb(i)P[u(i)] where the numbers b(i) are positive and sum to 1. In other words, W is a convex combination of a set of projections along mutually orthogonal vectors. We call the equation W = Σb(i)P[u(i)] an orthogonal decomposition of W.

What about truth? We need to extend that notion by the same criterion that was used for pure states, namely that the probability of a certain measurement outcome equals 1.

What is certain in state W = (1/2)P[u] + (1/2)P[t] must be what is certain regardless of whether the actual pure state is u or t. So that should identify the subspaces which are true in W.

But now the geometric complexities return. If u and t both lie in subspace S then so do all linear combinations of u and t. So we should look rather to all the vectors v such that, if the relevant measurement probability is 1 in W then it is 1 in pure state v. Happily those vectors form a subspace, the support of W. If W = Σb(i)P[u(i)], then that is the subspace [{u(i)}]. This, as it happens, is also the image space of W, the least subspace that contains the range of W. (Note 4.)

It is clear then how the notion of truth generalizes:

Subspace S is true in W exactly if the support of W is part of S

And we do have some redundancy again, because of the disappearance of any probabilities short of certainty, since truth is construed following von Neumann. For every subspace is the support of some pure or mixed state, and for any mixed state that is not pure there are infinitely many mixed states with the same support.

While a pure state P[u] has no refinements but itself, if v is any vector in the support of W then P[v] is a refinement of W. And in general, if W’ is a statistical operator whose support is part of W’s support, then W’ is a refinement of W.
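Truth in a mixed state and the refinement relation both reduce to inclusion between subspaces, so they can be checked mechanically. The sketch below is only illustrative: it assumes each subspace is given by a list of mutually orthogonal spanning vectors, and the names in_span, included, true_in and refines are hypothetical helpers of mine.

```python
from fractions import Fraction

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def project(u, basis):
    """Projection of u onto the span of mutually orthogonal nonzero vectors."""
    out = [Fraction(0)] * len(u)
    for s in basis:
        c = Fraction(dot(u, s), dot(s, s))
        out = [o + c * x for o, x in zip(out, s)]
    return tuple(out)

def in_span(u, basis):
    return project(u, basis) == tuple(u)

def included(small, big):
    """The span of small is part of the span of big."""
    return all(in_span(s, big) for s in small)

def true_in(S, support):
    """Subspace S is true in a state exactly if the state's support is part of S."""
    return included(support, S)

def refines(support_x, support_y):
    """x refines y exactly if the support of x is part of the support of y."""
    return included(support_x, support_y)

u, t = (1, 0, 0), (0, 1, 0)
support_W = [u, t]                    # support of W = (1/2)P[u] + (1/2)P[t]
xy_plane = [(1, 1, 0), (1, -1, 0)]    # another orthogonal spanning set for the same plane
x_axis = [(1, 0, 0)]
v = (1, 1, 0)                         # a vector in the support of W

plane_true = true_in(xy_plane, support_W)
axis_true = true_in(x_axis, support_W)
pure_refines_W = refines([v], support_W)
W_refines_pure = refines(support_W, [v])
```

The XY plane is true in W and the X-axis is not; P[v] refines W, but W does not refine P[v].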

So we have here a non-trivial refinement relation.

Note: the geometric complexities. I introduced mixed states in a way seen in textbooks, that for example W = (1/2)P[u] + (1/2)P[t] represents a situation in which the state is either u or t, with equal probabilities. That is certainly one use (note 5).

But an ‘ignorance interpretation’ of mixtures in general is not tenable. The first reason is that orthogonal decomposition of a statistical operator is not unique. If W = (1/2)P[u] + (1/2)P[t] and W = (1/2)P[v] + (1/2)P[w] then it would in general be self-contradictory to say that the state is really either u or t, and that it is also really v or w. For nothing can be in two pure states at once. Secondly, W has non-orthogonal decompositions as well. And there is a third reason, having to do with interaction.
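The non-uniqueness of orthogonal decomposition is easy to exhibit in the smallest case. Here is a sketch in a real 2-dimensional space, with vectors chosen by me for illustration: u and t are the coordinate axes, v and w the two diagonals. Both equal-weight mixtures come out as the same matrix.

```python
from fractions import Fraction

def projector(u):
    """Matrix of P[u], the projection onto the line through u: u u^T / (u . u)."""
    n = sum(a * a for a in u)
    return [[Fraction(a * b, n) for b in u] for a in u]

def mix(weights, vectors):
    """W = sum of b(i) P[u(i)], for weights b(i) and vectors u(i)."""
    dim = len(vectors[0])
    W = [[Fraction(0)] * dim for _ in range(dim)]
    for b, u in zip(weights, vectors):
        P = projector(u)
        for r in range(dim):
            for c in range(dim):
                W[r][c] += b * P[r][c]
    return W

half = Fraction(1, 2)
u, t = (1, 0), (0, 1)     # one orthogonal pair
v, w = (1, 1), (1, -1)    # a different orthogonal pair

W1 = mix([half, half], [u, t])
W2 = mix([half, half], [v, w])

same_state = W1 == W2                       # two decompositions, one operator
trace = sum(W1[k][k] for k in range(2))     # a statistical operator has trace 1
```

Since W1 and W2 are the same operator, nothing in W itself could single out "really u or t" over "really v or w".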

All of this has to do with the non-classical aspects of quantum mechanics. Well, good! For if everything became classical at this point, we’d lose the solution to (Puzzle *).

7. An open question

So, if we identify what Holliday and Mandelkern call possibilities as quantum states, we have ways to represent such situations as depicted in (Puzzle *), and we have a non-trivial refinement relation.

But there is much more to their theory. It’s a real question, whether continuing with quantum-mechanical states we could find a model of their theory. Hmmm ….

NOTES

1. In textbooks and in practice this redundancy is eliminated by the statement that pure states are represented by unit vectors (vectors of length 1). In foundations it is more convenient to say that all vectors represent pure states, but multiples of a vector represent the same state.
2. See e.g. page 49 in Birkhoff, Garrett (1948) Lattice Theory. Second edition. New York: American Mathematical Society. For a more extensive discussion see the third edition of 1967, Chapter V section 7.
3. Holliday, W. and M. Mandelkern (2022) “The orthologic of epistemic modals”. https://arxiv.org/abs/2203.02872v3
4. For the details about statistical operators used in this discussion see my Quantum Mechanics pages 160-162.
5. See P. J. E. Peebles’ brief discussion of the Stern-Gerlach experiment, on page 240 of his textbook Quantum Mechanics, Princeton 1992. Peebles is very careful, when he introduces mixed states starting on page 237 (well beyond what a first year course would get to, I imagine!) not to imply that an ignorance interpretation would be generally tenable. But the section begins by pointing to cases of ignorance in order to motivate the introduction of mixtures: “it is generally the case …[that] the state vector is not known: one can only say that the state vector is one of some statistical ensemble of possibilities.”


Orthologic and epistemic modals (3): Knowability