Orthologic and epistemic modals

A brief reflection on a recent paper, “The orthologic of epistemic modals” by Wesley Holliday and Matthew Mandelkern 

  1. The motivating puzzle
  2. Inspiration from quantum logic
  3. Propositions and truth
  4. Could quantum logic be the logic of natural discourse?
  5. Why this, so far, is not enough for epistemic modals
  6. Pure states and mixed states
  7. An open question

1.   The motivating puzzle

Here is a puzzle for you:

(Puzzle *) We, Able and Baker, A and B for short, are two propositions.  Baker does not imply the negation of Able.  Yet our conjunction is a self-contradiction.  Who are we?

In any first- or even second-year logic course the right answer will be “you do not exist at all!”  For if Baker does not imply the negation of Able, then their conjunction could be true.

But the literature on epistemic modals furnishes examples, to wit:

“It is raining, but it might not be” cannot be true.  Yet, “it might not be raining” does not imply “It is not raining”.

Such examples do rest on assumptions that may be challenged – for example, the assumption that the quoted sentences must all be true or false.  But let that go.  The interesting question is how such a logical situation as depicted in (Puzzle *) could be represented.  

2.   Inspiration from quantum logic

That sort of situation was studied in quantum logic, with its geometric models, where the propositions are represented by the subspaces.  

A quantum mechanics model is built on a finite-dimensional or separable Hilbert space.  In quantum logic the special properties of the infinite-dimensional, separable space do not play a role till quite late in the game. What matters is mainly that there is a well-defined orthogonality relation on this space.  So it suffices, most of the time, to think just about a finite-dimensional Hilbert space (that is, a finite-dimensional inner product vector space, aka a Euclidean space).

 For illustration think just of the ordinary 3-space of high school geometry but presented as a vector space.  Draw the X, Y, Z axes as straight lines perpendicular to each other.  The origin is their intersection.  A vector is a straight line segment starting at the origin and ending at a point t, its tip; we identify this vector by its tip.  The null vector 0 is the one with zero length.  Vectors are orthogonal iff they are perpendicular, that is, the angle between them is a right angle.

In the diagram, the vectors drawn along the axes have tips (3, 0, 0), (0, 5, 0), and (0, 0, 2).  The vector with tip (3, 5, 2) is not orthogonal to any of those.

If A is any set of vectors, its orthocomplement ~A is the set of vectors that are orthogonal to every vector in A.  The subspaces are precisely the sets A such that A = ~~A.  In this diagram the subspaces are the straight lines through the origin, and the planes through the origin, and of course the whole space.  So the orthocomplement of the X-axis is the YZ plane.  The orthocomplement of the solid arrow, with tip (3, 5, 2), is thus a plane, the one to which it is perpendicular.
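To make the geometry concrete, here is a minimal numerical check of these orthogonality facts (Python with numpy; the particular vectors are just those of the illustration, and the sketch is mine, purely for illustration):

```python
import numpy as np

# The axis vectors of the illustration, and the 'solid arrow' with tip (3, 5, 2).
x = np.array([3.0, 0.0, 0.0])
y = np.array([0.0, 5.0, 0.0])
z = np.array([0.0, 0.0, 2.0])
t = np.array([3.0, 5.0, 2.0])

# Vectors are orthogonal exactly if their inner product is 0.
print(np.dot(x, y), np.dot(x, z), np.dot(y, z))   # 0.0 0.0 0.0
print(np.dot(t, x), np.dot(t, y), np.dot(t, z))   # 9.0 25.0 4.0 -- none are 0

# Any vector in the YZ plane (the orthocomplement of the X-axis)
# is orthogonal to every multiple of x.
w = 2.0 * y + 7.0 * z        # an arbitrary vector in the YZ plane
print(np.dot(w, 5.0 * x))    # 0.0
```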

About (Puzzle *).  Our imaginative, intuitive picture of a 3-space provides an immediate illustration to solve (Puzzle *).  In quantum logic, the propositions are the subspaces of a Hilbert space.  Just let A and B be two lines through the origin that are not orthogonal to each other.  Their conjunction (intersection) is {0}, the ‘impossible state’, the contradiction. But neither is in the other’s orthocomplement.  In that sense they are compatible.
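For readers who want to see the solution with coordinates, here is a minimal sketch, with my own choice of the two lines (nothing hangs on the particular vectors):

```python
import numpy as np

# Two lines through the origin: A spanned by a, B spanned by b.
a = np.array([1.0, 0.0, 0.0])
b = np.array([1.0, 1.0, 0.0])

# Not orthogonal: the inner product is nonzero, so neither line
# lies in the other's orthocomplement.
print(np.dot(a, b))    # 1.0

# Yet their intersection contains only the null vector: a and b are
# linearly independent (the stacked matrix has rank 2), so the only
# vector lying on both lines is 0.  The conjunction is the contradiction.
print(np.linalg.matrix_rank(np.vstack([a, b])))    # 2
```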

3.   Propositions and truth

That the propositions are taken to be the subspaces has a rationale, introduced by von Neumann, back in the 1930s.  The vectors represent physical states.  Each subspace can be described as the set of states in which a certain quantity has a particular value with certainty.  (That means: if that quantity is measured in that state, the outcome is that value, with probability 1.)  

Von Neumann introduced the additional interpretation that this quantity has that value if and only if the outcome of a measurement will show that value with certainty.  This became orthodoxy: here truth coincides with relevant probability = 1. 

Given this gloss, we have: 

subspace A is true in (the state represented by) vector v if and only if v is in A.  

We note here that if vector u = kv for some nonzero number k (in our illustration, if they lie on the same straight line through the origin), then they belong to all the same subspaces.  As far as truth is concerned, they are distinct but indiscernible.  (For the textbook emphasis on unit vectors see note 1.)

Since the subspaces are the closed sets for the closure operation ~~ (S = the orthocomplement of the orthocomplement of S), they form a complete lattice (note 2).

The self-contradictory proposition contains only the null-vector 0 (standardly called the origin), the one with zero length, which we count as orthogonal to all other vectors.  Conjunction (meet) is represented by intersection.  

Disjunction (join) is special.  If X is a set of vectors, let [X] be the least subspace that contains X.    The join of subspaces S and S’, denoted (S ⊕ S’), is [S ∪ S’].  It is a theorem that [S ∪ ~S] is the whole space.  That means specifically that there is an orthonormal basis for the whole space which divides neatly into a basis for S and a basis for ~S.  Thus every vector is the sum of a vector in S and a vector in ~S (one of these can be 0 of course).

One consequence is of course that, in traditional terms, the Law of Excluded Middle holds, but the Law of Bivalence fails.  For A ⊕ ~A is the whole space and so is true in every state v, yet v may be neither in A nor in ~A, so that A is neither true nor false in v.  More generally, v may be in A ⊕ B while not being in either A or B.
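A small worked check of this, taking A to be the X-axis of our illustration, so that ~A is the YZ plane (the vector v is my own choice):

```python
import numpy as np

# A = the X-axis; ~A = the YZ plane; A ⊕ ~A is the whole space.
v = np.array([1.0, 1.0, 0.0])

# v splits into a part in A plus a part in ~A, so v is in A ⊕ ~A
# (Excluded Middle: "A or not-A" is true in every state) ...
v_in_A  = np.array([1.0, 0.0, 0.0])
v_in_nA = np.array([0.0, 1.0, 0.0])
print(np.allclose(v, v_in_A + v_in_nA))    # True

# ... yet v is neither in A (its Y component is not 0)
# nor in ~A (it is not orthogonal to the X-axis),
# so A is neither true nor false in v: Bivalence fails.
print(bool(np.isclose(v[1], 0.0) and np.isclose(v[2], 0.0)))        # False: v not in A
print(bool(np.isclose(np.dot(v, np.array([1.0, 0.0, 0.0])), 0.0)))  # False: v not in ~A
```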

The term “orthologic” refers to any logic which applies to a language in which the propositions form an orthocomplemented lattice. So orthologic is a generalization of quantum logic.

4.    Could quantum logic be the logic of natural discourse?

The idea, once advanced by Hilary Putnam, that the logic of natural language is quantum logic, was never very welcome, if only because learning quantum logic seemed just too hefty a price to pay.  

But the price need not be so high if most of our discourse remains on the level of ‘ordinary’ empirical propositions.  We can model that realm of discourse by specifying a sufficiently large Boolean sublattice of the lattice of subspaces.

For a non-trivial orthocomplemented lattice, such as the lattice of subspaces of a Hilbert space, has clearly identifiable Boolean sublattices.  Suppose for example that the empirical situations that we can discern have only familiar classical logical relations.  That means that, in effect, all the statements we make are attributions, precise or vague, to mutually compatible quantities (equivalently, there is a single maximal observable Q such that all humanly discernible quantities are functions of Q).

Then the logic of our ‘normal’ discourse, leaving aside such subtleties as epistemic modals, is classical, even if it is only a (presumably large) fragment of natural language.  For the corresponding sublattice is Boolean.

5.   Why this, so far, is not enough for epistemic modals

Quantum states are variously taken to be physical states or information states. The paper by Holliday and Mandelkern (henceforth H&M) deals with information, and instead of “states” they say “possibilities” (note 3).  Crucial to their theory is the relation of refinement:

x is a refinement of y exactly if, for all propositions A, if y is in A then x is in A.

I will use x, y, z for possibilities, which in our case will be quantum states (those, we’ll see below, are not limited to vectors).

If we do take states to be vectors and propositions to be subspaces in a vector space, then the refinement relation is trivial.  For if u is in every subspace that contains t then it is in [t], the least subspace to which t belongs (intuitively, the line through the origin on which t lies), and that would then be the least subspace to which u belongs as well.  So then refinement is the equivalence relation:  u and t belong to the same subspaces.  As far as what they represent, whether it is a physical state or an information state, there is no difference between them.  They are distinct but indiscernible.  Hence the refinement relation restricted to vectors is trivial.

But we can go a step further with Holliday and Mandelkern by turning to a slightly more advanced quantum mechanics formalism.       

6.   Pure states and mixed states

When quantum states are interpreted as information states, the uncertainty relations come into play, and maximal possible information is no longer classically complete information.  Vectors represent pure states, and thought of in terms of information they are maximal, as complete as can be.  But it is possible, and required (not just for practical reasons), to work with less than maximal information.  Mixtures, or mixed states, can be used to represent the situation that a system is in one of a set of pure states, with different probabilities.  (Caution: though this is correct it is, as I’ll indicate below, not tenable as a general interpretation of mixed states.)

To explain what mixtures are we need to shift focus to projection operators.  For each subspace S other than {0} there is the projection operator P[S]: vector u is in S if and only if P[S]u = u, and P[S]u = 0 if and only if u is in ~S.  This operator ‘projects’ all vectors into S.
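A minimal sketch of a projection operator at work, taking S to be the X-axis of the earlier illustration (the matrix and vectors are my own illustration):

```python
import numpy as np

# Projection onto the X-axis (a one-dimensional subspace S).
u = np.array([1.0, 0.0, 0.0])
P = np.outer(u, u) / np.dot(u, u)        # the matrix of P[S]

print(P @ np.array([7.0, 0.0, 0.0]))     # unchanged: this vector is in S
print(P @ np.array([0.0, 2.0, 5.0]))     # sent to 0: this vector is in ~S
print(P @ np.array([3.0, 5.0, 2.0]))     # 'projected' into S: (3, 0, 0)
```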

For the representation of pure states, the job of vector u is done equally well by the projection operator P[u], which we now also refer to as a pure state.  

Mixed states are represented by statistical operators (aka density matrices) which are, so to speak, weighted averages of mutually orthogonal pure states.  For example, if u and t are orthogonal vectors then W = (1/2)P[u] + (1/2)P[t] is a mixed state. 

 Intuitively we can think of W as being the case exactly if the real state is either u or t and we don’t  know which.  (But see below.)

W is a statistical operator (or density matrix) if and only if there are mutually orthogonal vectors u(i) (other than 0) such that W = Σb(i)P[u(i)] where the numbers b(i) are positive and sum to 1.  In other words, W is a convex combination of a set of projections along mutually orthogonal vectors.  We call the equation W = Σb(i)P[u(i)] an orthogonal decomposition of W.  
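A quick numerical check that such a convex combination behaves as a statistical operator should (trace 1, no negative weights); the vectors u and t are my own choices:

```python
import numpy as np

def proj(v):
    """Projection operator onto the line spanned by v."""
    v = np.asarray(v, dtype=float)
    return np.outer(v, v) / np.dot(v, v)

u = np.array([1.0, 0.0, 0.0])
t = np.array([0.0, 1.0, 0.0])            # orthogonal to u

W = 0.5 * proj(u) + 0.5 * proj(t)        # an orthogonal decomposition of W

print(np.trace(W))                        # 1.0: the weights b(i) sum to 1
print(np.linalg.eigvalsh(W))              # eigenvalues 0, 0.5, 0.5 -- none negative
```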

What about truth?  We need to extend that notion by the same criterion that was used for pure states, namely that the probability of a certain measurement outcome equals 1.  

What is certain in state W = (1/2)P[u] + (1/2)P[t] must be what is certain regardless of whether the actual pure state is u or t. So that should identify the subspaces which are true in W.

But now the geometric complexities return.  If u and t both lie in subspace S then so do all linear combinations of u and t.  So we should look rather to all the vectors v such that, if the relevant measurement probability is 1 in W then it is 1 in pure state v.  Happily those vectors form a subspace, the support of W.  If W = Σb(i)P[u(i)], then that is the subspace [{u(i)}]. This, as it happens, is also the image space of W, the least subspace that contains the range of W. (Note 4.)

It is clear then how the notion of truth generalizes:

            Subspace S is true in W exactly if the support of W is part of S

And we do have some redundancy again, because of the disappearance of any probabilities short of certainty, since truth is construed following von Neumann.  For every subspace is the support of some pure or mixed state, and for any mixed state that is not pure there are infinitely many mixed states with the same support.

While a pure state P[u] has no refinements but itself, if v is any vector in the support of W then P[v] is a refinement of W.  And in general, if W’ is a statistical operator whose support is part of W’s support, then W’ is a refinement of W.  

So we have here a non-trivial refinement relation.
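Both the truth criterion and the refinement relation can be tested numerically through the support. The helper functions below (my own names and coding, not H&M’s and not any library’s API) check them on the mixed state W used above:

```python
import numpy as np

def proj(v):
    """Projection operator P[v] onto the line spanned by v."""
    v = np.asarray(v, dtype=float)
    return np.outer(v, v) / np.dot(v, v)

def support_projector(W, tol=1e-10):
    """Projector onto the support (image space) of the statistical operator W."""
    vals, vecs = np.linalg.eigh(W)
    cols = vecs[:, vals > tol]            # eigenvectors with nonzero weight
    return cols @ cols.T

def true_in(P_S, W):
    """Subspace S (given by its projector P_S) is true in W iff supp(W) is part of S."""
    return np.allclose(P_S @ W, W)

def refines(W1, W2):
    """W1 is a refinement of W2 iff the support of W1 is part of the support of W2."""
    return np.allclose(support_projector(W2) @ W1, W1)

u = np.array([1.0, 0.0, 0.0])
t = np.array([0.0, 1.0, 0.0])
W = 0.5 * proj(u) + 0.5 * proj(t)        # the support of W is the XY plane

P_xy = proj(u) + proj(t)                 # projector onto the XY plane
print(true_in(P_xy, W))                  # True: the XY plane is true in W
print(true_in(proj(u), W))               # False: the X-axis is not true in W

v = u + t                                # a vector in the support of W
print(refines(proj(v), W))               # True: the pure state P[v] refines W
print(refines(proj(np.array([0.0, 0.0, 1.0])), W))   # False: support not contained
```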

Note: the geometric complexities.  I introduced mixed states in the way seen in textbooks: for example, W = (1/2)P[u] + (1/2)P[t] represents a situation in which the state is either u or t, with equal probabilities.  That is certainly one use (note 5).

But an ‘ignorance interpretation’ of mixtures in general is not tenable. The first reason is that the orthogonal decomposition of a statistical operator is not unique.  If W = (1/2)P[u] + (1/2)P[t] and also W = (1/2)P[v] + (1/2)P[w], for a different pair of mutually orthogonal vectors v and w, then it would in general be self-contradictory to say that the state is really either u or t, and that it is also really v or w.  For nothing can be in two pure states at once.  Secondly, W has non-orthogonal decompositions as well.  And there is a third reason, having to do with interaction.
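The first reason is easy to exhibit in two dimensions, where the evenly weighted operator decomposes with respect to every orthonormal basis. A small check (the bases are my own choices):

```python
import numpy as np

def proj(v):
    """Projection operator onto the line spanned by v."""
    v = np.asarray(v, dtype=float)
    return np.outer(v, v) / np.dot(v, v)

# Two different orthonormal bases of the plane.
u, t = np.array([1.0, 0.0]), np.array([0.0, 1.0])
v, w = np.array([1.0, 1.0]) / np.sqrt(2), np.array([1.0, -1.0]) / np.sqrt(2)

W1 = 0.5 * proj(u) + 0.5 * proj(t)
W2 = 0.5 * proj(v) + 0.5 * proj(w)

# One and the same statistical operator, with two distinct orthogonal decompositions.
print(np.allclose(W1, W2))    # True
```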

All of this has to do with the non-classical aspects of quantum mechanics.  Well, good!  For if everything became classical at this point, we’d lose the solution to (Puzzle *).

7.   An open question

So, if we identify what Holliday and Mandelkern call possibilities as quantum states, we have ways to represent such situations as depicted in (Puzzle *), and we have a non-trivial refinement relation.

But there is much more to their theory.  It’s a real question whether, continuing with quantum-mechanical states, we could find a model of their theory.  Hmmm ….

NOTES

  1. In textbooks and in practice this redundancy is eliminated by the statement that pure states are represented by unit vectors (vectors of length 1).  In foundations it is more convenient to say that all vectors represent pure states, but multiples of a vector represent the same state.
  2. See e.g. page 49 in Birkhoff, Garrett (1948) Lattice Theory.  Second edition.  New York: American Mathematical Society.  For a more extensive discussion see the third edition of 1967, Chapter V section 7. 
  3. Holliday, W. and M. Mandelkern (2022)  “The orthologic of epistemic modals”.  https://arxiv.org/abs/2203.02872v3
  4. For the details about statistical operators used in this discussion see my Quantum Mechanics pages 160-162.
  5. See P. J. E. Peebles’ brief discussion of the Stern-Gerlach experiment, on page 240 of his textbook Quantum Mechanics, Princeton 1992.  Peebles is very careful, when he introduces mixed states starting on page 237 (well beyond what a first year course would get to, I imagine!) not to imply that an ignorance interpretation would be generally tenable.  But the section begins by pointing to cases of ignorance in order to motivate the introduction of mixtures:  “it is generally the case …[that] the state vector is not known: one can only say that the state vector is one of some statistical ensemble of possibilities.”

An oblique look at propositions (1) a scheme

I want to explore a simple scheme, one that has many instances, and seems to fit much about what one might naturally call propositions, given all the literature in the history of logic, and philosophical logic, where they have appeared.

At the end I will give examples of different languages/logics which can be viewed as based on instances of this scheme. In the next post I will display a minimal logic (‘structural’, in that the familiar Structural Rules for valid arguments hold), and later I plan to post some more about different logics in the same vein.

The scheme

This scheme has four ingredients: a set of states, a set of propositions, and two relations, which I will denote with the symbols » and ≤.

The first relation is one between states and propositions. To keep it simple I will read “x » p” as “in state x, p is true” or just “p is true in x”. That will not always be the most natural reading; other glosses such as “x makes p true” may be more apt, so let’s keep a mental reservation about just what it means.

The second relation is subordination, or being subordinate to. It has both a simple form and a general form. The simple form is easiest to think about, and can be defined explicitly from its more general form, but in some cases it is precisely the general form that matters.

In the simple form it is a relation between states:

x is subordinate to y exactly if all propositions that are true in y are also true in x.

It is easy to deduce from this statement that being subordinate to is a reflexive and transitive ordering of states (a partial ordering, if we do not distinguish states in which exactly the same propositions are true), which is why I settled on the usual symbol ≤. That it is reflexive and transitive follows not from any characteristics that states or propositions could have, but just from the logic of “all” and “if … then” in its definition.

Just as a teaser, we can immediately imagine a state that is subordinate to all states, and thus makes all propositions true — a sign of inconsistency — so let us call it (if it exists!) an absurdity. If it is unique I will denote it by the capital letter Φ.

The more general form of subordination relates states to sets of states:

x is subordinate to W exactly if all the propositions that are true in all of W’s members are also true in x.

The simple form of the relation is definable: x ≤ y exactly if x ≤ {y}. In its general form we can deduce that subordination has the following characteristics:

  • if Φ exists then Φ ≤ W, because Φ is subordinate to all states
  • If x is in W then x ≤ W, for if a proposition is true in all members of W then it is true in any given member of W
  • If U is a subset of W and x ≤ U then x ≤ W, because if p is true in all of W, and U is part of W, then p is true in all of U, and hence (given x ≤ U) true in x
  • If all members of U are subordinate to W, and x ≤ U, then x ≤ W (a little more complicated, see Appendix)

This is a bit redundant, but all the better to remind us of analogies to, for example, the relation of logical consequence, or semantic entailment.  But let’s not be too quick to push familiar notions into this scheme; it may have very different sorts of instances.
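Before turning to the instances, here is a toy finite model of the scheme (an entirely hypothetical choice of states and propositions, made up for illustration) on which the definition of subordination, and two of the characteristics listed above, can be checked mechanically:

```python
# A toy instance of the scheme: three states, propositions as named sets of
# states, and "x » p" read simply as membership of x in the set for p.
states = {"a", "b", "c"}
propositions = {
    "p": {"a", "b"},
    "q": {"b", "c"},
    "r": {"a", "b", "c"},
}

def true_in(x):
    """The set of (names of) propositions true in state x."""
    return {name for name, ext in propositions.items() if x in ext}

def subordinate(x, W):
    """x ≤ W: every proposition true in all members of W is also true in x."""
    if not W:
        common = set(propositions)        # vacuously, every proposition
    else:
        common = set.intersection(*(true_in(y) for y in W))
    return common <= true_in(x)

# Characteristic 2: if x is in W then x ≤ W.
assert all(subordinate(x, W) for W in [{"a"}, {"a", "b"}, states] for x in W)

# Characteristic 3, on one instance: x ≤ U and U a subset of W give x ≤ W.
assert subordinate("a", {"a"}) and subordinate("a", {"a", "c"})

# And a failure of subordination: q is true in b but not in a.
print(subordinate("a", {"b"}))    # False
```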

Five instances of the scheme

The suggested ways to gloss the terms can be followed up with initial suggestions about interpretation. I will list five, chosen to be importantly different from each other.

(A) Modal logic, possible world models. Here we can take states to be possible worlds and propositions to be sets of worlds. This is a trivial instance of the scheme: x » p iff x ≤ p iff x is a member of p, and x ≤ y iff x = y. As we know though, this instance becomes interesting if structures are built on it, say by adding binary or ternary relative possibility relations.

(B) Epistemic logic, beyond modal logic. Here we can take the states to be minds, characterized by associated bodies of information. That is, minds are related to propositions by holding them true, or taking them as settled, or believing. Then the simple ordering is not trivial because x ≤ y would be the case just if x believes all that y believes, but perhaps x believes a lot more. And the absurdity Φ would be the utterly credulous mind which believes everything. In that state, as Arthur Prior said about this sort of thing, all logische Spitzfindigkeit (logical hair-splitting) would have come to an end.

It could be more complicated, but it stays simple if we think of the propositions as just sets of states (i.e. of minds). Proposition p is identifiable as the set of minds who believe that p. Then something interesting could play opposite to the absurdity: those minds who are as agnostic (unbelieving) as is possible: call them Zen minds:

a state x is a Zen mind precisely if x believes all and only those propositions that all minds believe.

So all states are subordinate to a Zen mind; a Zen mind is at the pinnacle of meditative detachment.

Questions to tackle then: about how minds can grow to have more beliefs (updating), or even reason from suppositions to have conditional beliefs.

(C) Truth making. The phrase “makes true” suggests a still different way, appealing to a more or less traditional notion of fact, the sort of thing that is or is not the case:

Tractatus 1: “The world is everything that is the case. The world is the totality of facts, not of things.”

Following Wittgenstein, Bertrand Russell took this up in his fabulous little book The Philosophy of Logical Atomism. Facts can be small, like the fact that the cat is on the mat. But they can be big, consisting of bundles of small facts. Say, a, b, c are small facts, and a.b, a.c, b.c, a.b.c are bigger facts made up of them. Being the case is a characteristic of facts; the big fact a.b is the case iff both a and b are the case.

A proposition could then be a set of facts, and we can say that p is made true by x (x » p) exactly if y is part of x for some y in p. So proposition U = {b, b.c} is made true by b, as well as by b.c, but also by a.b, and by all the other bigger facts that contain either of its members as parts, like a.b.c, a.b.c.d, etc.
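A toy rendering of this clause, with small facts as atoms and bigger facts as sets of atoms (my own illustrative encoding, not Russell’s formalism):

```python
def fact(*atoms):
    """A fact, identified with the set of its small parts (atoms)."""
    return frozenset(atoms)

def makes_true(x, p):
    """x » p: some fact y in the proposition p is part of the fact x."""
    return any(y <= x for y in p)

U = {fact("b"), fact("b", "c")}                 # the proposition U = {b, b.c}

print(makes_true(fact("b"), U))                 # True: b itself
print(makes_true(fact("a", "b"), U))            # True: b is part of a.b
print(makes_true(fact("a", "b", "c", "d"), U))  # True: an even bigger fact
print(makes_true(fact("a", "c"), U))            # False: neither b nor b.c is part of a.c
```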

This scheme becomes interesting when we face the challenge that Raphael Demos, one of the students, posed when Russell was lecturing at Harvard: what happens to negation, are there negative facts, could there be? And that in turn points to the treatment of negation in what Anderson and Belnap, studying relevant logic, called tautological entailment.

(D) Logic of opinion. We get to something quite different if we take the states to be probability measures on a space that consists of K, a set of worlds, and F, a family of subsets of K on which these states are defined. Let’s say that if Q is in F then there is an associated proposition, namely the set of probability measures x such that x(Q) = 1. We can define x » A, for such a proposition A, to be true if and only if x is a member of A.

Now x ≤ y is not trivial. Suppose x(A ∩ B) = 1 and y(A) = 1, but y(C) < 1 for any proper subset C of A (and A ∩ B is a proper subset of A). Then all the propositions that are true in y will also be true in x, but not conversely.

And there is an intriguing angle to this. One measure x can be a mixture (linear combination) of several other measures, for example x = ay + (1 – a)z with 0 < a < 1. In that case A will be true in x if and only if it is true in both y and z. So then we see a case of subordination of states to sets of states: x ≤ {y, z} if x is a mixture of y and z. And more generally, all the mixtures of states that make a proposition true are subordinate to that proposition. So the propositions are convex sets of probability measures.
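A small numerical illustration of the mixture point (three worlds, a set Q, and two measures of my own choosing):

```python
import numpy as np

# Three worlds; a state is a probability vector over them; the proposition
# associated with a set Q of worlds is "probability 1 on Q".
def true_in(x, Q):
    """x » A_Q: the measure x gives probability 1 to the set of worlds Q."""
    return bool(np.isclose(sum(x[i] for i in Q), 1.0))

Q = {0, 1}                                  # a set of worlds in F
y = np.array([1.0, 0.0, 0.0])               # concentrated on world 0
z = np.array([0.3, 0.7, 0.0])               # spread over worlds 0 and 1
x = 0.5 * y + 0.5 * z                       # a mixture of y and z

# A_Q is true in y and in z, hence also in their mixture x:
print(true_in(y, Q), true_in(z, Q), true_in(x, Q))    # True True True
```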

(E) Quantum logic. Rather different from all of these, but related to the previous example, there is a geometric interpretation, introduced by von Neumann in his interpretation of quantum mechanics. The states can be vectors and the propositions subspaces or linear manifolds — so, lines and planes that contain the origin, three-spaces, four-spaces, and so on. One vector x can be a superposition (linear combination) of some others; for example, x = ay + bz + cw, and we can make a similar point, like the one about mixtures of probability functions. But superpositions are quite different from mixtures. And the linear manifolds form a lattice that is non-Boolean.

With these five examples we have suggestions that our simple scheme will relate in possibly interesting ways to alethic modal logic, epistemic logic, truth-maker semantics (which points to relevance logic), probabilistic semantics (which has been related to intuitionistic logic), and quantum logic. Seems worth exploring …

Note on the literature

There is lots of literature on all five of the examples, but I’ll just list some of my own (no need to read them, as far as these posts are concerned; but they are all on ResearchGate).

(B) “Identity in Intensional Logic: Subjective Semantics” (1986)
(C) “Facts and Tautological Entailments” (1969)
(D) “Probabilistic Semantics Objectified, I” (1981)
(E) “Semantic Analysis of Quantum Logic” (1973)

                     

APPENDIX

Characteristic 4. of subordination was this:

If all members of U are subordinate to W, and x ≤ U, then x ≤ W

Remember how states relate to propositions, reading “x » p” as “p is true in x”.

Suppose that b ≤ U. That is:

1. For all q (If all z in U are such that z » q, then b » q)

Now suppose that all members of U are subordinate to W:

2. For all z in U { For all q (If all y in W are such that y » q, then z » q)}

The first two “all’s” can change position, and we can write this as

3. For all q: All z in U are such that (If all y in W are such that y » q, then z » q)

And in the conditional in the middle part, we note that z does not appear in the antecedent, so that is the same as:

4. For all q: If all y in W are such that y » q, then all z in U are such that (z » q)

Putting this together with 1., we arrive at

5. For all q (If all y in W are such that y » q, then b » q)

that is to say, b is subordinate to W.