Orthologic and epistemic modals

A brief reflection on a recent paper, “The orthologic of epistemic modals” by Wesley Holliday and Matthew Mandelkern 

  1. The motivating puzzle
  2. Inspiration from quantum logic
  3. Propositions and truth
  4. Could quantum logic be the logic of natural discourse?
  5. Why this, so far, is not enough for epistemic modals
  6. Pure states and mixed states
  7. An open question

1.   The motivating puzzle

Here is a puzzle for you:

(Puzzle *) We, Able and Baker, A and B for short, are two propositions.  Baker does not imply the negation of Able.  Yet our conjunction is a self-contradiction.  Who are we?

In any first or even second year logic course the right answer will be “you do not exist at all!”  For if Baker does not imply the negation of Able then their conjunction could be true.

But the literature on epistemic modals furnishes examples, to wit:

“It is raining, but it might not be” cannot be true.  Yet, “it might not be raining” does not imply “It is not raining”.

Such examples do rest on assumptions that may be challenged – for example, the assumption that the quoted sentences must all be true or false.  But let that go.  The interesting question is how such a logical situation as depicted in (Puzzle *) could be represented.  

2.   Inspiration from quantum logic

That sort of situation was studied in quantum logic, with its geometric models, where the propositions are represented by the subspaces.  

A quantum-mechanical model is built on a Hilbert space, either finite-dimensional or infinite-dimensional and separable.  In quantum logic the special properties of the infinite-dimensional, separable space do not play a role till quite late in the game. What matters is mainly that there is a well-defined orthogonality relation on this space.  So it suffices, most of the time, to think just about a finite-dimensional Hilbert space (that is, a finite-dimensional inner product vector space, aka a Euclidean space).

For illustration think just of the ordinary 3-space of high school geometry but presented as a vector space.  Draw the X, Y, Z axes as straight lines perpendicular to each other.  The origin is their intersection.  A vector is a straight line segment starting at the origin and ending at a point t, its tip; we identify this vector by its tip.  The null vector 0 is the one with zero length.  Vectors are orthogonal iff they are perpendicular, that is, the angle between them is a right angle.

In the diagram, the vectors drawn along the axes have tips (3, 0, 0), (0,5,0), and (0,0,2).  The vector with tip (3, 5, 2) is not orthogonal to any of those.

If A is any set of vectors, its orthocomplement ~A is the set of vectors that are orthogonal to every vector in A.  The subspaces are precisely the sets A such that A = ~~A.  In this diagram the subspaces are {0}, the straight lines through the origin, the planes through the origin, and of course the whole space.  So the orthocomplement of the X-axis is the YZ plane.  The orthocomplement of the solid arrow, with tip (3, 5, 2), is thus a plane: the one through the origin to which it is perpendicular.
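The orthogonality claims about the diagram can be checked by hand with inner products. Here is a minimal sketch in Python; the helper names dot and cross are mine, chosen only for illustration, and the cross product is just a convenient device for 3-space.

```python
def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product: a vector orthogonal to both u and v (3-space only)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

axes = [(3, 0, 0), (0, 5, 0), (0, 0, 2)]
arrow = (3, 5, 2)

# The axis vectors are pairwise orthogonal: all inner products vanish.
axis_products = [dot(a, b) for i, a in enumerate(axes) for b in axes[i + 1:]]

# The arrow is orthogonal to none of them: no inner product vanishes.
arrow_products = [dot(arrow, a) for a in axes]

# Two independent vectors orthogonal to the arrow span its orthocomplement,
# the plane through the origin perpendicular to the arrow.
p1 = cross(arrow, (1, 0, 0))
p2 = cross(arrow, p1)

print(axis_products, arrow_products, p1, p2)
```

Any vector in the plane spanned by p1 and p2 has inner product zero with the arrow, which is all that membership in the orthocomplement requires.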

About (Puzzle *).  Our intuitive picture of 3-space immediately supplies a solution to (Puzzle *).  In quantum logic, the propositions are the subspaces of a Hilbert space.  Just let A and B be two distinct lines through the origin that are not orthogonal to each other.  Their conjunction (intersection) is {0}, the ‘impossible state’, the contradiction.  But neither is included in the other’s orthocomplement, so neither implies the negation of the other.  In that sense they are compatible.
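To make the solution concrete, here is a small sketch in Python with one particular choice of Able and Baker. The vectors a and b and the helper names are my own illustrative assumptions; any two distinct, non-orthogonal lines would do.

```python
def dot(u, v):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    """Cross product; it is the null vector exactly when u and v are collinear."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

a = (1, 0, 0)   # Able: the line through the origin spanned by a
b = (1, 1, 0)   # Baker: the line through the origin spanned by b

# The lines are distinct, so their intersection (the conjunction) is {0}.
conjunction_is_contradiction = cross(a, b) != (0, 0, 0)

# Baker implies the negation of Able only if line B lies inside ~A,
# that is, only if b is orthogonal to a.
baker_implies_not_able = dot(a, b) == 0

print(conjunction_is_contradiction, baker_implies_not_able)
```

The first value comes out true and the second false, which is exactly the situation the puzzle describes.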

3.   Propositions and truth

That the propositions are taken to be the subspaces has a rationale, introduced by von Neumann, back in the 1930s.  The vectors represent physical states.  Each subspace can be described as the set of states in which a certain quantity has a particular value with certainty.  (That means: if that quantity is measured in that state, the outcome is that value, with probability 1.)  

Von Neumann introduced the additional interpretation that this quantity has that value if and only if the outcome of a measurement will show that value with certainty.  This became orthodoxy: here truth coincides with relevant probability = 1. 

Given this gloss, we have: 

subspace A is true in (the state represented by) vector v if and only if v is in A.  

We note here that if vector u = kv for some nonzero number k (in our illustration, if they lie on the same straight line through the origin) then they belong to all the same subspaces. As far as truth is concerned, they are distinct but indiscernible.  (For the textbook emphasis on unit vectors see note 1.)

Since the subspaces are the closed sets for the closure operation ~~ (which sends S to the orthocomplement of the orthocomplement of S), they form a complete lattice (note 2).

The self-contradictory proposition contains only the null-vector 0 (standardly called the origin), the one with zero length, which we count as orthogonal to all other vectors.  Conjunction (meet) is represented by intersection.  

Disjunction (join) is special.  If X is a set of vectors, let [X] be the least subspace that contains X.    The join of subspaces S and S’, denoted (S ⊕ S’), is [S ∪ S’].  It is a theorem that [S ∪ ~S] is the whole space.  That means specifically that there is an orthonormal basis for the whole space which divides neatly into a basis for S and a basis for ~S.  Thus every vector is the sum of a vector in S and a vector in ~S (one of these can be 0 of course).

One consequence is of course that, in traditional terms, the Law of Excluded Middle holds, but the Law of Bivalence fails.  For v may be in A ⊕ B while not being either in A or in B.
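A small numerical sketch of this point, in Python. The names project and is_true_at are hypothetical helpers of mine, and I assume each subspace is given by a list of mutually orthogonal nonzero vectors that span it.

```python
def dot(u, v):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(u, v))

def project(v, basis):
    """Project v onto the span of mutually orthogonal nonzero vectors."""
    out = [0.0] * len(v)
    for e in basis:
        c = dot(v, e) / dot(e, e)
        out = [o + c * x for o, x in zip(out, e)]
    return out

def is_true_at(basis, v, tol=1e-12):
    """A subspace is true at v iff v lies in it, i.e. projecting leaves v unchanged."""
    return all(abs(p - x) <= tol for p, x in zip(project(v, basis), v))

A = [(1, 0, 0)]                   # the X-axis
not_A = [(0, 1, 0), (0, 0, 1)]    # its orthocomplement, the YZ plane
A_join_not_A = A + not_A          # spans the whole space

v = (1, 1, 0)

# Excluded middle: the join of A and ~A is true at v.
# Bivalence fails: neither A nor ~A is true at v.
print(is_true_at(A_join_not_A, v), is_true_at(A, v), is_true_at(not_A, v))

# v is nevertheless the sum of a vector in A and a vector in ~A.
parts = (project(v, A), project(v, not_A))
```

Here the two parts are (1, 0, 0) and (0, 1, 0), which is the decomposition guaranteed by the theorem that [S ∪ ~S] is the whole space.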

The term “orthologic” refers to any logic which applies to a language in which the propositions form an orthocomplemented lattice. So orthologic is a generalization of quantum logic.

4.    Could quantum logic be the logic of natural discourse?

The idea, once advanced by Hilary Putnam, that the logic of natural language is quantum logic, was never very welcome, if only because learning quantum logic seemed just too hefty a price to pay.  

But the price need not be so high if most of our discourse remains on the level of ‘ordinary’ empirical propositions.  We can model that realm of discourse by specifying a sufficiently large Boolean sublattice of the lattice of subspaces.

For a non-trivial orthocomplemented lattice, such as the lattice of subspaces of a Hilbert space, has clearly identifiable Boolean sublattices.  Suppose for example that the empirical situations that we can discern have only familiar classical logical relations.  That means that, in effect, all the statements we make, whether precise or vague, are attributions of values to mutually compatible quantities (equivalently, there is a single maximal observable Q such that all humanly discernible quantities are functions of Q).

Then the logic of our ‘normal’ discourse, leaving aside such subtleties as epistemic modals, is classical, even if that discourse is only a (presumably large) fragment of natural language.  For the corresponding sublattice is Boolean.

5.   Why this, so far, is not enough for epistemic modals

Quantum states are variously taken to be physical states or information states. The paper by Holliday and Mandelkern (henceforth H&M) deals with information, and instead of “states” they say “possibilities” (note 3).  Crucial to their theory is the relation of refinement:

x is a refinement of y exactly if, for all propositions A, if y is in A then x is in A.

I will use x, y, z for possibilities, which in our case will be quantum states (those, as we’ll see below, are not limited to vectors).

If we do take states to be vectors and propositions to be subspaces in a vector space, then the refinement relation is trivial.  For if u is in every subspace that contains t then it is in [t], the least subspace to which t belongs (intuitively the line through the origin on which t lies) and that would then be the least subspace to which u belongs as well.  So then refinement is the equivalence relation:  u and t belong to the same subspaces.  As far as what they represent, whether it is a physical state or an information state, there is no difference between them.  They are distinct but indiscernible.  Hence the refinement relation restricted to vectors is trivial.

But we can go a step further with Holliday and Mandelkern by turning to a slightly more advanced quantum mechanics formalism.       

6.   Pure states and mixed states

When quantum states are interpreted as information states, the uncertainty relations come into play, and maximal possible information is no longer classically complete information.  Vectors represent pure states, and thought of in terms of information they are maximal, they are as complete as can be.  But it is possible, and required (and not just for practical reasons), to work with less than maximal information.  Mixtures, or mixed states, can be used to represent the situation that a system is in one of a set of pure states, with different probabilities.  (Caution: though this is correct it is, as I’ll indicate below, not tenable as a general interpretation of mixed states.)

To explain what mixtures are we need to shift focus to projection operators.  For each subspace S other than {0} there is the projection operator P[S]: vector u is in S if and only if P[S]u = u, and P[S]u = 0 if and only if u is in ~S. This operator ‘projects’ all vectors onto S.

For the representation of pure states, the job of vector u is done equally well by the projection operator P[u], that is, the projection onto the line through u, which we now also refer to as a pure state.

Mixed states are represented by statistical operators (aka density matrices) which are, so to speak, weighted averages of mutually orthogonal pure states.  For example, if u and t are orthogonal vectors then W = (1/2)P[u] + (1/2)P[t] is a mixed state. 

Intuitively we can think of W as being the case exactly if the real state is either u or t and we don’t know which.  (But see below.)

W is a statistical operator (or density matrix) if and only if there are mutually orthogonal vectors u(i) (other than 0) such that W = Σb(i)P[u(i)] where the numbers b(i) are positive and sum to 1.  In other words, W is a convex combination of a set of projections along mutually orthogonal vectors.  We call the equation W = Σb(i)P[u(i)] an orthogonal decomposition of W.  
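For readers who like to see the matrices, here is a minimal sketch in Python of the example W = (1/2)P[u] + (1/2)P[t] in 3-space. The function names projector and mix are hypothetical, and I assume real vectors, so that P[u] is just the outer product of u with itself divided by its squared length.

```python
def projector(u):
    """Matrix of P[u], the projection onto the line through the nonzero vector u."""
    n = sum(x * x for x in u)
    return [[a * b / n for b in u] for a in u]

def mix(weights, vectors):
    """Convex combination of the projections P[u(i)] with weights b(i)."""
    dim = len(vectors[0])
    W = [[0.0] * dim for _ in range(dim)]
    for b, u in zip(weights, vectors):
        P = projector(u)
        for i in range(dim):
            for j in range(dim):
                W[i][j] += b * P[i][j]
    return W

u, t = (1, 0, 0), (0, 1, 0)          # mutually orthogonal vectors
W = mix([0.5, 0.5], [u, t])          # W = (1/2)P[u] + (1/2)P[t]

# Because the weights sum to 1, the diagonal of W sums to 1 as well.
trace = sum(W[i][i] for i in range(3))
print(W, trace)
```

In this example W is the diagonal matrix with entries 1/2, 1/2, 0, so the weights b(i) can be read off directly from the diagonal.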

What about truth?  We need to extend that notion by the same criterion that was used for pure states, namely that the probability of a certain measurement outcome equals 1.  

What is certain in state W = (1/2)P[u] + (1/2)P[t] must be what is certain regardless of whether the actual pure state is u or t. So that should identify the subspaces which are true in W.

But now the geometric complexities return.  If u and t both lie in subspace S then so do all linear combinations of u and t.  So we should look rather to all the vectors v such that, if the relevant measurement probability is 1 in W then it is 1 in pure state v.  Happily those vectors form a subspace, the support of W.  If W = Σb(i)P[u(i)], then that is the subspace [{u(i)}]. This, as it happens, is also the image space of W, the least subspace that contains the range of W. (Note 4.)

It is clear then how the notion of truth generalizes:

            Subspace S is true in W exactly if the support of W is part of S

And we do have some redundancy again, because of the disappearance of any probabilities short of certainty, since truth is construed following von Neumann.  For every subspace is the support of some pure or mixed state, and for any mixed state that is not pure there are infinitely many mixed states with the same support.

While a pure state P[u] has no refinements but itself, if v is any vector in the support of W then P[v] is a refinement of W.  And in general, if W’ is a statistical operator whose support is part of W’s support, then W’ is a refinement of W.  
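The truth and refinement conditions for mixed states can be sketched in a few lines of Python. This is only an illustration under my own simplifying assumptions: a state is represented by the mutually orthogonal vectors of one of its orthogonal decompositions (the weights play no role for truth), a subspace by mutually orthogonal spanning vectors, and refinement is tested as inclusion of supports. The helper names are hypothetical.

```python
def dot(u, v):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(u, v))

def in_span(v, basis, tol=1e-12):
    """True iff v lies in the span of mutually orthogonal nonzero vectors."""
    proj = [0.0] * len(v)
    for e in basis:
        c = dot(v, e) / dot(e, e)
        proj = [p + c * x for p, x in zip(proj, e)]
    return all(abs(p - x) <= tol for p, x in zip(proj, v))

def true_in(subspace, state):
    """S is true in W iff the support of W is part of S."""
    return all(in_span(u, subspace) for u in state)

def refines(x, y):
    """x refines y iff the support of x is part of the support of y."""
    return all(in_span(v, y) for v in x)

W = [(1, 0, 0), (0, 1, 0)]       # decomposition vectors u and t of W
pure_v = [(1, 1, 0)]             # pure state P[v], with v in the support of W
XY_plane = [(1, 0, 0), (0, 1, 0)]
X_axis = [(1, 0, 0)]

print(true_in(XY_plane, W), true_in(X_axis, W))   # the plane is true in W, the axis is not
print(refines(pure_v, W), refines(W, pure_v))     # P[v] refines W, but not conversely
```

The asymmetry in the last line is what makes the relation non-trivial here, in contrast to the case where only vectors count as states.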

So we have here a non-trivial refinement relation.

Note: the geometric complexities.  I introduced mixed states in a way seen in textbooks, that for example W = (1/2)P[u] + (1/2)P[t] represents a situation in which the state is either u or t, with equal probabilities.  That is certainly one use (note 5).

But an ‘ignorance interpretation’ of mixtures in general is not tenable. The first reason is that orthogonal decomposition of a statistical operator is not unique.  If W = (1/2)P[u] + (1/2)P[t] and W = (1/2)P[v] + (1/2)P[w] then it would in general be self-contradictory to say that the state is really either u or t, and that it is also really v or w.  For nothing can be in two pure states at once.  Secondly, W has non-orthogonal decompositions as well.  And there is a third reason, having to do with interaction.  
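The first reason, the non-uniqueness of orthogonal decomposition, is easy to verify numerically. Here is a minimal Python sketch with vectors of my own choosing: u and t along the X and Y axes, and v and w along the two diagonals of the XY plane. The helper names are hypothetical.

```python
def projector(u):
    """Matrix of P[u], the projection onto the line through the nonzero vector u."""
    n = sum(x * x for x in u)
    return [[a * b / n for b in u] for a in u]

def equal_mixture(x, y):
    """The statistical operator (1/2)P[x] + (1/2)P[y]."""
    Px, Py = projector(x), projector(y)
    dim = len(x)
    return [[0.5 * Px[i][j] + 0.5 * Py[i][j] for j in range(dim)]
            for i in range(dim)]

u, t = (1, 0, 0), (0, 1, 0)      # one orthogonal pair
v, w = (1, 1, 0), (1, -1, 0)     # a different orthogonal pair in the same plane

W_from_u_t = equal_mixture(u, t)
W_from_v_w = equal_mixture(v, w)

# Both decompositions yield one and the same statistical operator.
print(W_from_u_t == W_from_v_w)
```

Since v is not a multiple of u or of t, saying that the state is "really" u or t and also "really" v or w would attribute two different pure states to the same system.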

All of this has to do with the non-classical aspects of quantum mechanics.  Well, good!  For if everything became classical at this point, we’d lose the solution to (Puzzle *).

7.   An open question

So, if we identify what Holliday and Mandelkern call possibilities as quantum states, we have ways to represent such situations as depicted in (Puzzle *), and we have a non-trivial refinement relation.

But there is much more to their theory.  It’s a real question, whether continuing with quantum-mechanical states we could find a model of their theory.  Hmmm …. 

NOTES

  1. In textbooks and in practice this redundancy is eliminated by the statement that pure states are represented by unit vectors (vectors of length 1).  In foundations it is more convenient to say that all vectors represent pure states, but multiples of a vector represent the same state.
  2. See e.g. page 49 in Birkhoff, Garrett (1948) Lattice Theory.  Second edition.  New York: American Mathematical Society.  For a more extensive discussion see the third edition of 1967, Chapter V section 7. 
  3. Holliday, W. and M. Mandelkern (2022)  “The orthologic of epistemic modals”.  https://arxiv.org/abs/2203.02872v3
  4. For the details about statistical operators used in this discussion see my Quantum Mechanics pages 160-162.
  5. See P. J. E. Peebles’ brief discussion of the Stern-Gerlach experiment, on page 240 of his textbook Quantum Mechanics, Princeton 1992.  Peebles is very careful, when he introduces mixed states starting on page 237 (well beyond what a first year course would get to, I imagine!) not to imply that an ignorance interpretation would be generally tenable.  But the section begins by pointing to cases of ignorance in order to motivate the introduction of mixtures:  “it is generally the case …[that] the state vector is not known: one can only say that the state vector is one of some statistical ensemble of possibilities.”
