Modality and Negation before 1932

Lewis and Langford’s Symbolic Logic (1932) was the culmination of work over the previous half-century.  This is a well-studied development, and I do not have anything new to say.  But I would like to add some comments on the role of negation.

1.      Exclusion, choice, and strong negation

When Gerrit Mannoury discussed Intuitionism he placed much weight on the distinction between exclusion negation and choice negation.  The latter is a form of negating a proposition that presupposes a definite contrast class, with a meaning that is something like “not this, but one of the others”. For example, “the apple is not ripe” would typically be understood as asserting that the apple is a real apple in one of the stages before ripeness.  The context would supply that definite contrast class.  If we read that statement with “not” as exclusion negation, the contrast would be totally indefinite: the apple is raw, or it is rotten, or it is a fake porcelain apple, or it is a painted apple in a Rembrandt still life, or ….

That distinction shed some light on the Intuitionists’ approach to infinity.  But Intuitionist negation, as it appears in Heyting’s logic, does not fit either.  Within mathematics, as the Intuitionists present it, a statement can be negated only on the basis that it leads to a self-contradiction. In the logic, the negation of A is A → f, where f is the falsum, the absolute absurdity.

I will call this third form strong negation.  Outside Intuitionism, or at least outside its original presentation in terms of proofs and refutations, we can explain it this way:  

        to assert ¬A, in the sense of a strong negation, is to assert that A is not possibly true. 

Strong negation appears in modal logic generally, and specifically in Lewis’ early work.

However, the principles that govern strong negation are saliently different from the familiar ones characteristic of choice negation.

2.      Choice negation is ‘classical’

In lattice theory the complement x’ of element x is the largest element y such that x ∧ y = 0 (‘x and y are disjoint’), if it exists.[1]  That it is the largest means that if z is any element disjoint from x then z  ≤  x’.   Clearly, if we have several negations in the language they cannot all be like that.  

In view of the many options that have been broached on this subject, it may be best to mention some suggested characteristics in various treatments of negation, and to ask which are satisfied in specific cases.

I will call a negation classical if its corresponding operation ‘ on a lattice of propositions meets three conditions:

x ∧ x’ = 0           (disjointness)

(x’)’ = x                 (involution)

x ≤ y  iff  y’ ≤ x’    (antitone)

So when negation corresponds to an operation of that sort on a lattice of propositions, then negation obeys the principles

            (A & ¬A) implies the falsum                         (Non-Contradiction)

¬¬A entails, and is entailed by A                    (Double Negation)

A entails B if and only if ¬B entails ¬A.        (Contraposition)

Choice negation is classical.  For example, if the contrast class is a range of colors then “the apple is not-not red” is true iff the apple has a color in this range that is not one of the colors other than red, hence is red.  And, taking e.g. scarlet to imply red, it is also the case that being other than red (in the intended range) implies being other than scarlet.
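The three conditions can be checked mechanically for a small contrast class.  What follows is a minimal sketch, assuming that propositions are modeled as subsets of a finite contrast class and that choice negation is complement within that class; the color names and the helper `propositions` are illustrative choices, not part of the discussion above.

```python
from itertools import combinations

# Hypothetical contrast class: the colors the apple might have.
CONTRAST = frozenset({"scarlet", "crimson", "green", "yellow"})

def propositions(universe):
    """All subsets of the contrast class: the lattice of propositions."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def neg(x):
    """Choice negation: 'not x, but one of the others' in the contrast class."""
    return CONTRAST - x

PROPS = propositions(CONTRAST)

disjointness = all(x & neg(x) == frozenset() for x in PROPS)
involution = all(neg(neg(x)) == x for x in PROPS)
antitone = all((x <= y) == (neg(y) <= neg(x)) for x in PROPS for y in PROPS)

red = frozenset({"scarlet", "crimson"})   # scarlet implies red
scarlet = frozenset({"scarlet"})

print(disjointness, involution, antitone)  # True True True
print(neg(red) <= neg(scarlet))            # True: not-red implies not-scarlet
```

The last line is the scarlet and red example: being other than red, within the contrast class, implies being other than scarlet.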

We may also note that if negation is defined by the two-valued truth-table, it is a choice negation, hence classical.

Is Exclusion negation classical?  I’ll discuss that in the Appendix.

Strong negation is not classical.  That the law of Double Negation does not hold in Intuitionistic logic is well known.  Contraposition is closely related to Reductio ad Absurdum of which one form (negation elimination) fails in Intuitionistic logic.  Below we’ll see how Contraposition fares in normal modal logic.  

3.      A quick note on Lewis’ relation to MacColl

Lewis’ theory of strong negation and the conditional is the same as MacColl’s, whose work he acknowledges.  (Lewis 1918, p. 292: “The fundamental ideas of the system are similar to those of MacColl’s Symbolic Logic and its Applications”.)

I will go chronologically backward for a moment to discuss Lewis first and then the earlier work by MacColl.

(I will adapt their notation, or use my own, as convenient.)

4.      C. I. Lewis chose strong negation as modal primitive

Lewis published his Survey of Symbolic Logic in 1918, and there presented his logic of strict implication, taking as its modal primitive a strong negation connective ~.

Notation:  I will now keep ¬ for just the truth-functional (material) negation. 

 The sentence ‘~p’ is to be read as ‘it is impossible that p’.  Then the strict conditional → is defined by:  

p → q =def ~(p & ¬q)

Lewis’ strong negation is intuitively at least the same as the Intuitionist’s.    For by the above definition,

            p → (q & ¬q) = ~(p & ¬ (q & ¬q)) = ~p

given that logically equivalent formulas are mutually substitutable everywhere (and in normal modal logic the falsum is equivalent to any self-contradiction).

But Lewis was not guided by any precise semantics for modality, and his vague intuitions led him astray.  Specifically, he assumed that strong negation obeys Contraposition.  He adopted a principle that looks innocent at first sight:

            2.21 (~q → ~p) → (p → q)

Two years later Lewis published an emendation (Lewis 1920).  Emil Post had pointed out to him that this principle led to the theorem

            ~p = ¬ p

thus collapsing strong negation into material negation.  Lewis commented “Mr. Post’s example which demonstrates the falsity of 2.21 is not here reproduced, since it involves the use of a diagram and would require considerable explanation.” 

With our current preference for □ as primitive, we understand his 2.21 as

            2.21*.  □(¬◊q ⊃ ¬◊p) → □(p ⊃ q)

In the Appendix I will sketch a possible world model that gives us a counterexample to 2.21*.  [Hint: focus on the case of ¬◊q being false.]

Dropping the ill-fated 2.21 did not do great damage to the logic of strict implication, which then eventually appeared in Lewis and Langford.

5.      Oskar Becker to C. I. Lewis:  nesting and iteration

But the theory of strict implication did change in other ways.  In 1930 there had appeared Oskar Becker’s monograph (Becker 1930) which discussed iterations and nestings of modal operators, with questions about how they are to be understood.  Lewis and Langford acknowledged this in their preface and wrote their famous Appendix II in response.

Becker had proposed the following principles as possible additions to Lewis’ logic:

C10. □□p = □p

C11.  ◊p → □◊p

C12.  p → □◊p  (which Becker called the Brouwersche Axiom)

Lewis and Langford then defined five modal logics, S1-S5, in which (our familiar) S4 is S3 plus C10, and S5 is (our familiar) S4 plus C11 and C12.

But Becker’s questions about how C10.-C12. are to be understood were not very well answered.  We’ll see below how MacColl made a determined effort to establish the truth conditions for one case of an iterated modality, one that involves negation.

6.      Tracing it all back to Hugh MacColl

From a distance MacColl’s symbolism looks rather like arithmetic.  He introduces five modalities: true, false, certain, impossible, and variable.  Each has a Greek letter as its symbol, with ε (epsilon) for certainty and η (eta) for impossibility, for example.

But there seems at first to be a curious ambiguity.  The proposition that A is certain or necessary is symbolized A^ε (with ε written as a superscript).  But then we also see ε entering as a propositional constant, for the absolutely certain, the ‘top’ or ‘unit’ of the algebra.

I think we can understand that this is not just a notational quirk.  In arithmetic, consider the sentences “3 + 2 = 5” and “3² = 9”.  The numeral “2” appears first as the name of a number, and then, in superscript, as the name of the squaring function.  We can give a good account of this: the symbol “3²” denotes the value of a certain function (exponentiation) applied to the ordered couple <3, 2>.  Superscripting is a convention to symbolize exponentiation.

6.1 modal statements, strong negation

In the same way we should understand MacColl’s formulation of modal statements.  In his commentary and examples they are clearly taken as de dicto: the assertion that it is certain that A is the assertion that A has a certain modal property, namely certainty.  The symbol “A^ε” denotes the proposition which is the value of a certain function (we may call it propositional exponentiation) applied to the ordered couple <A, ε>.  The function is symbolized by the superscripting convention.

In the same way the assertion that A is impossible is symbolized A^η.  This is strong negation; it is what Lewis later symbolized as ~A.

6.2 propositional constants

But impossibility too is itself a proposition, the ‘bottom’ of the algebra, and we have the laws (MacColl uses ‘ for ordinary negation and . for conjunction):

A . ε = A                (ε is the unit, the top, the tautology)

A . η = η                (η is the bottom, the absurdity)

(A . A’)^η               (Law of Non-Contradiction)

A^ε . (A’)^ε = η

6.3 the strict conditional

MacColl defines a conditional:

            A : B =def  (A . B’)^η   (It is impossible that A and not B)

That is exactly the definition that Lewis then chose for his strict conditional.  MacColl, who has the connective + for disjunction, points out that, equally, A : B = (A’ + B)^ε, it is certain that either not-A or B.

6.4 iterated and nested modal operators

Especially interesting is iterated propositional exponentiation, to symbolize nested modalities, a subject which, as we saw above, did not return in the literature till 1930.  MacColl writes (section 9, page 7):

The symbol A^BC means (A^B)^C; it asserts that the statement A^B belongs to the class C, in which C may denote true, or false, or possible, &c. Similarly A^BCD means (A^BC)^D, and so on.  (MacColl 1906: 7)

So for example, (A^ε)^η is the statement that it is impossible that A is certain: what we would write as ¬◊□A.

MacColl returns to this in section 22, where he writes “But, it may be asked, what is meant by statements of the second, third, &c., degrees, when the primary subject is itself a statement?”   What follows in that section, and in many other passages, points to an understanding of logic as pertaining to information processing.  There is the suggestion that, for example, A^η may be a revision of A, made when new data arrive that are incompatible with A.  But I will leave that aside for now.

There is an interesting discussion of a nested modality in a later paper (MacColl 1910).  Here he finds what he takes at first blush to be an antinomy.  In his system the symbol θ stands for the modality variable, it is his term for what is possible but not certain, or equivalently, what is neither impossible nor certain.  So it could be defined:

A^θ = (A^ε)’.(A^η)’

MacColl then writes:

The symbol A^θθ, in my system, is short for (A^θ)^θ and asserts that the statement A^θ is a variable.  The antinomy consists in the conflict of two arguments, of which the one professes to prove that the second-degree proposition A^θθ is an impossibility or self-contradiction; while the other professes to prove that it is not. (MacColl 1910: 196, with the initial “A^θ” corrected to “A^θθ”)

At first blush its natural reading looks entirely intelligible, or at least perfectly grammatical:

“It is neither certain nor impossible that it is neither certain nor impossible that A”.

But it becomes puzzling when we try to see under what conditions, or for what sort of statement A, this would be true or definitely false.

To prove the first option in the antinomy MacColl argues, first of all:

 A will itself be either certain, or impossible, or neither certain nor impossible.  If A is either certain or impossible then A^θ is clearly false.

That seems correct, but then MacColl argues:

if A is neither certain nor impossible then it is certain (and hence not neither certain nor impossible) that A^θ.  And so again, A^θθ is false.

Therefore A^θθ is false under all conditions, hence impossible.

The reasoning about the case in which A is itself neither certain nor impossible, is not obviously correct.  The inference appears to rely on an unstated modal principle, perhaps one as strong as the S5 principle, that statements asserting or denying certainty or possibility are certain if true.  With such an assumption added, the first horn of the dilemma is established.

Accordingly, MacColl’s contrary argument, that A^θθ is possible, is not needed if we can push back to a prior question:

whether, or under what conditions, A^θ is certain if true.

The most reasonable attitude to take would seem to be that there are many modes of modality, pertaining to different subject matters – in some cases S5 is the correct logic and in other cases not.

7.      APPENDIX. 

[1] Is Exclusion negation classical?

Suppose we assert that the apple is not red, while intending no definite contrast whatsoever.  One option may be to assert that nevertheless, all the ways the apple might or could be, ways that do not involve its being red, form a class.  Then, just as for choice negation it will follow that not being in any of those ways implies being red.  

This ostensibly simple view may have difficulties, with vagueness and the vagueness of vagueness, or with Russell-type paradoxes, or with more general questions about whether the class could or could not be a set.

If we are uneasy with the idea of reifying ways things might or could be as a definite class, then we will certainly begin to doubt inference by Reductio ad Absurdum (as Intuitionists do), and almost certainly Contraposition and Double Negation as well, whenever the negation in play is exclusion negation.  

In a many-valued logic that defines the connectives in terms of a ‘many-truthvalue-table’, it is possible to have a negation that is not classical.  For example, if ¬1 = 0, ¬2 = 1, ¬0 = 1 then ¬¬2 ≠ 2.  We could reckon this as an exclusion negation (‘value other than the designated value’), though a simple case.  In a language with truth-value gaps, where conjunction and disjunction are typically not functional (not compositional), it may also be possible to have a non-functional negation.
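The three-valued example can be run directly.  This is only a small sketch of the table just given (values 0, 1, 2, with ¬1 = 0, ¬2 = 1, ¬0 = 1); the dictionary encoding is an illustrative choice.

```python
# Negation table from the example: not-1 = 0, not-2 = 1, not-0 = 1.
NEG = {1: 0, 2: 1, 0: 1}

def neg(v):
    """Look up the negation of truth value v."""
    return NEG[v]

# For each value, does double negation return the value we started with?
double_negation_holds = {v: neg(neg(v)) == v for v in NEG}

print(neg(neg(2)))            # 0
print(double_negation_holds)  # {1: True, 2: False, 0: True}
```

Double Negation fails exactly at the value 2, so this negation is not an involution and hence not classical in the sense of section 2.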

It is perhaps a failing in formal semantics to take negation (in our language in use) for granted as understood and unequivocal.  

[2] Counterexample to Lewis 2.21:  (~q → ~p) → (p → q) in normal modal logic

In our preferred notation, and assuming the duality □¬ = ¬◊, we can write 2.21 as

*. □{□(¬◊q ⊃ ¬◊p)  ⊃ □(p ⊃ q) }

Recall that in a normal modal logic possible world model M = <W, R>, the sentence □A is true at world w in W iff A is true in all worlds in R(w).  

I will sketch a model, describing only just enough to show that it provides a counterexample to *.

The worlds w1, w2, w3, w4 in W are related as follows:

  1. w2 is in R(w1)
  2. w3 and w4 are in R(w2)
  3. For any world w in W, if w is in R(w2) then R(w) = R(w2)
  4. p and ¬q are true in w3
  5. p and q are true in w4

Argument

  1. q is true in w4
  2. ◊q is true in all members of R(w2)                      by clause 3, since w4 is in R(w2)
  3. (¬◊q ⊃ ¬◊p) is true in all members of R(w2)
  4. □(¬◊q ⊃ ¬◊p)  is true in w2
  5. (p & ¬q) is true in w3
  6. □(p ⊃ q) is false in w2                                   since w3 is in R(w2)
  7. □(¬◊q ⊃ ¬◊p)  ⊃ □(p ⊃ q) is false in w2,               by 4. and 6.
  8. □{□(¬◊q ⊃ ¬◊p)  ⊃ □(p ⊃ q) } is false in w1          since w2 is in R(w1)

Note that, for simplicity, I made clause 3 stronger than it needs to be to justify line 2.  Apart from that I have left <W, R> with as little constraint as possible.
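The counterexample can also be checked by brute evaluation.  The following is a minimal sketch, assuming that Lewis’ ~ is read as ¬◊ and his strict conditional as a boxed material conditional; the tuple encoding of formulas, the function `holds`, and the choice to make no atoms true at w1 and w2 (which the model leaves unspecified) are all illustrative assumptions.

```python
# Accessibility map R(w), following clauses 1-3 of the model.
R = {
    "w1": {"w2"},
    "w2": {"w3", "w4"},
    "w3": {"w3", "w4"},   # clause 3: R(w) = R(w2) for every w in R(w2)
    "w4": {"w3", "w4"},
}
# Atoms true at each world (clauses 4-5).  The model says nothing about
# w1 and w2, so leaving them with no true atoms is an assumption.
V = {"w1": set(), "w2": set(), "w3": {"p"}, "w4": {"p", "q"}}

def holds(f, w):
    """Truth of formula f (a nested tuple) at world w."""
    op = f[0]
    if op == "atom":
        return f[1] in V[w]
    if op == "not":
        return not holds(f[1], w)
    if op == "imp":                      # material conditional
        return (not holds(f[1], w)) or holds(f[2], w)
    if op == "box":
        return all(holds(f[1], u) for u in R[w])
    if op == "dia":
        return any(holds(f[1], u) for u in R[w])
    raise ValueError(f"unknown operator: {op}")

p, q = ("atom", "p"), ("atom", "q")

def impossible(a):
    """Strong negation ~a, read as 'not possibly a'."""
    return ("not", ("dia", a))

def strict(a, b):
    """Strict conditional: necessarily (a materially implies b)."""
    return ("box", ("imp", a, b))

antecedent = strict(impossible(q), impossible(p))   # ~q -> ~p
consequent = strict(p, q)                           # p -> q
principle_221 = strict(antecedent, consequent)

print(holds(antecedent, "w2"))     # True
print(holds(consequent, "w2"))     # False
print(holds(principle_221, "w1"))  # False
```

The three printed values correspond to lines 4, 6 and 8 of the argument.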

 

8.      REFERENCES 

Becker, Oskar (1930)  “Zur Logik der Modalitäten”.  Jahrbuch für Philosophie und Phänomenologische Forschung XI: 497-548.

Gabbay, Dov M. and John Woods (2006)  Handbook of the History of Logic. Vol. 7: Logic and the Modalities in the Twentieth Century.  Amsterdam: Elsevier.

Lewis, Clarence I. and C. H. Langford (1932) Symbolic Logic. New York: The Century Company. 

Lewis, Clarence Irving (1920) “Strict implication – an emendation”. The Journal of Philosophy, Psychology, and Scientific Methods 17: 300-302. 

MacColl, Hugh (1906) Symbolic Logic and its Applications. London: Longmans, Green and Co.

MacColl, Hugh (1910) “Linguistic misunderstandings (I)”. Mind, n.s., 19: 186–199. 

Read, Stephen (1998) “Hugh MacColl and the algebra of strict implication”. Nordic Journal of Philosophical Logic 3: 59-84.

Wolenski, Jan (1998) “MacColl on Modalities”. Nordic Journal of Philosophical Logic 3:133-140.


[1] In a Boolean algebra it exists, is unique, and x ∧ x’ = 0, x v x’ = 1.  

Atomless: The Calculus of Systems of (Almost) Any Logic 

A logic L is a closure operator on a set of sentences S (of a syntax SYNT). An L-theory is a subset of S closed under L, and an L-theorem is a sentence that belongs to all L-theories.  (A logic may not have any theorems.)  A sentence A is an L-consequence of a set of sentences X exactly if A is in L(X).

The L-theories, being the L-closed sets, form a complete lattice CL, which Tarski called the calculus of systems of that logic.  This lattice is bounded above and below, respectively by S and L(Λ), the L-closure of the empty set.  The latter contains only the theorems of L, if any, and is therefore also said to be trivial.

In a lattice, an atom is an element that is not the bottom but has no element between it and the bottom.  Precisely:

Definition.  An L-theory X is an atom of the calculus of systems of L if and only if X is not trivial but has no non-trivial proper sub-theory.

So far, so general.  I wish to remark that in almost all cases, and certainly for all familiar logics, this lattice is not atomic, in fact has no atoms at all.

Definition.  Logic L is disjunction minimal exactly if:

  1. S is a set of finite strings generated from an infinite set of propositional variables
  2. The syntax SYNT includes a binary connective v such that
    1. if A, B are sentences then A v B is an L-consequence of A
    2. if A is not an L-theorem, and q is a propositional variable that does not occur in A, then A v q is not an L-theorem, and A is not an L-consequence of A v q.

Result.   If logic L is at least  disjunction minimal then its calculus of systems CL has no atoms.

Argument.

Suppose X is a non-trivial L-theory, and A is a member of X that is not an L-theorem.  Since A is finite there is a propositional variable q which does not occur in A.  Then A v q is a member of X, and is not an L-theorem, and A is not an L-consequence of A v q.

Therefore L({A v q}) is a non-trivial proper sub-theory of X, for it is part of X, includes A v q which is not an L-theorem, and it does not include member A of X.  Hence X has a proper non-trivial sub-theory.

So, if L is at least disjunction minimal, and X is any non-trivial L-theory, then X is not an atom of calculus CL. Hence CL has no atoms.
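For classical propositional logic the two clauses of disjunction minimality can be checked by truth tables.  The sketch below assumes classical two-valued semantics and a sample non-theorem A = (p & ~r); the helper names `valuations`, `is_theorem` and `entails` are illustrative, and one instance is of course not a proof of the general condition.

```python
from itertools import product

def valuations(atoms):
    """Every assignment of truth values to the given atoms."""
    for values in product([False, True], repeat=len(atoms)):
        yield dict(zip(atoms, values))

def is_theorem(f, atoms):
    """f is true under every valuation."""
    return all(f(v) for v in valuations(atoms))

def entails(premise, conclusion, atoms):
    """conclusion is true under every valuation that makes premise true."""
    return all(conclusion(v) for v in valuations(atoms) if premise(v))

atoms = ["p", "r", "q"]          # q does not occur in A

def A(v):
    return v["p"] and not v["r"]  # sample non-theorem (an assumption)

def A_or_q(v):
    return A(v) or v["q"]

print(is_theorem(A, atoms))        # False: A is not a theorem
print(entails(A, A_or_q, atoms))   # True:  A v q is a consequence of A
print(is_theorem(A_or_q, atoms))   # False: A v q is not a theorem
print(entails(A_or_q, A, atoms))   # False: A is not a consequence of A v q
```

So Cn(A v q) sits strictly between the trivial theory and Cn(A), which is the step used in the Argument.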

Remark. Almost all, but not all logics in the literature are at least disjunction minimal. An exception is the Weak Kleene logic WK3 where A does not entail A v B. See Jc Beall (2016) “Off-Topic: a new interpretation of Weak Kleene logic”. Australasian Journal of Logic (13: 6), Article 1.

Modal logic: generalizing on general frames

Frames in the semantics of modal logic

In normal modal logic a frame is a couple <W, R>, with R a map of W into the powerset P(W) of W and  with the operator □ on P(W) defined by 

            □X = {w: R(w) ⊆ X}

In neighborhood semantics a frame is a couple <W, N>, with N a map of W into the powerset P(P(W)) of P(W), and with the operator □ on P(W) defined by 

            □X = {w: X is in N(w)}.

That is a generalization: in a normal modal logic frame we can define N(w) = {X: w is in □X}, which is of course the family of supersets of R(w).
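A small finite case shows the two definitions of □ agreeing.  This is a minimal sketch, assuming a three-world frame with an arbitrarily chosen relation R; the names `box_rel`, `box_nbhd` and `powerset` are illustrative.

```python
from itertools import combinations

W = frozenset({1, 2, 3})
# Hypothetical accessibility map: R(w) is the set of worlds w can see.
R = {1: frozenset({2}), 2: frozenset({2, 3}), 3: frozenset()}

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def box_rel(X):
    """Relational definition: w is in box X iff R(w) is included in X."""
    return frozenset(w for w in W if R[w] <= X)

# Neighborhood map induced by R: N(w) = {X : w is in box_rel(X)}.
N = {w: {X for X in powerset(W) if w in box_rel(X)} for w in W}

def box_nbhd(X):
    """Neighborhood definition: w is in box X iff X is in N(w)."""
    return frozenset(w for w in W if X in N[w])

same_operator = all(box_rel(X) == box_nbhd(X) for X in powerset(W))
supersets_of_R = all(
    N[w] == {X for X in powerset(W) if R[w] <= X} for w in W
)

print(same_operator, supersets_of_R)  # True True
```

The second check confirms that each N(w) is exactly the family of supersets of R(w).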

A general frame is a triple <W, N, P>, where <W, N> is as above, P is a Boolean algebra of subsets of W (‘the propositions’), and the operator □ is defined as above but only on P, not on P(W).  (When the corresponding syntax is interpreted in a general frame, the semantic values assigned to sentences are all in P.)

That too is a generalization: When P = P(W) we have the special case of an ordinary neighborhood frame.

These three types of frames are clearly special instances of a more general type, which I will call the truly general frames:

A truly general frame is a triple <W, 𝜙, P> where 𝜙 is some mathematical entity or other, P is a Boolean algebra of subsets of W, and the operator □ is a monotone operator on P which is a function of W and 𝜙.

(I take the monotonicity of □ to be the most basic, and inalienable, characteristic of the subject of modal logic, however broadly construed.)  

This truly general type can be put to work: I will give a simple example of its use for showing the independence of certain basic principles of modal logic.

Preliminary: a little possible world story

In one kind of world all the inhabitants are enormously cheerful, elated, they think everything is just great.  I call these the Elated worlds.  In another kind of world the inhabitants are odious for their negativity and pessimism, they think everything sucks, nothing is great.  These are the Odious worlds.   To describe what is the case in these worlds we introduce a unary sentential connective □, read as “It is considered great that”.  Note that for any sentence A, □A is true in all and only the Elated worlds in which A is true.  So for instance □(A v ~A) is not true in all worlds.  But in other respects it’s all pretty normal.  For instance, □ is monotone: if A implies B then □A implies □B.

Representation of the example

For convenience I will represent these worlds by natural numbers, with even numbers representing the Elated worlds and odd numbers the Odious worlds.

Frame ARITH = <W, 𝜙, P> with W the set of natural numbers (starting with 1), 𝜙 its natural ordering, P just P(W), and the operator □ on P defined by □X = the set of even numbers that are members of X.

Before interpreting the modal logic syntax in ARITH let’s just note a few features of  this operator:

  • T1. □ is a monotone operator on P:  if X ⊆ Y then □X ⊆ □Y
  • T2. □X ⊆ X
  • T3. (□X ∩ □Y) = □(X ∩ Y)
  • T4. □X= □□X
  • T5. □⊥ = ⊥ 
  • T6. □W ≠ W

(The proofs are straightforward, but see Appendix for some details.)
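Facts T1 through T6 can be spot-checked by brute force.  The sketch below assumes a finite truncation of ARITH to W = {1, …, 6}, which is not the frame itself, so it illustrates the facts rather than proving them; `N_MAX`, `box` and `powerset` are illustrative names.

```python
from itertools import combinations

N_MAX = 6                                  # assumption: finite truncation of W
W = frozenset(range(1, N_MAX + 1))
E = frozenset(n for n in W if n % 2 == 0)  # the even (Elated) worlds

def box(X):
    """The operator of ARITH: the even numbers that are members of X."""
    return E & X

def powerset(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

P = powerset(W)
EMPTY = frozenset()

T1 = all(box(X) <= box(Y) for X in P for Y in P if X <= Y)
T2 = all(box(X) <= X for X in P)
T3 = all(box(X) & box(Y) == box(X & Y) for X in P for Y in P)
T4 = all(box(X) == box(box(X)) for X in P)
T5 = box(EMPTY) == EMPTY
T6 = box(W) != W

print(T1, T2, T3, T4, T5, T6)  # True True True True True True
```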

Interpretation, logic

I take the modal sentential syntax to be familiar and will use ‘□’ for the modal connective, the symbols ‘&’ and ‘~’ for conjunction and negation,  ‘⊥’ for the falsum and ‘T’ for ‘~⊥’ (with the context preventing confusion).

A model is a frame together with an interpretation of this syntax in this frame, which is an assignment of propositions (subsets of W) to the sentences, by the recipe:

|A & B| = |A| ∩ |B|

|~A| = W minus |A|

|□A| = □|A|

|⊥| = Λ, the empty set

Validity in a model.  If M is a model, with frame <W, 𝜙, P> and interpretation | |, we write ‘╞M A’ for |A| = W, and ‘A1, A2, …╞M B’ for (|A1| ∩  |A2|  ∩ … )  ⊆ |B|.

Let us consider the following principles familiar in modal logic, dividing them into two groups (roughly following Chellas’ (1980) nomenclature).

Group One. 

RM.      If A ⊢ B then □A  ⊢ □B  (Monotonicity)

M.        □(A & B) ⊢  (□A & □B)

C.         (□A & □B)  ⊢   □(A & B)

K.         □(A ⊃ B) ⊢   (□A ⊃ □B) 

N◊.      ⊢ ~□ ⊥

D.         □A     ⊢   ~□~A

T.         □A     ⊢   A

S4.       □A   ⊢   □ □A

Group Two.

RN.     If ⊢ A then  ⊢ □A  

N.        ⊢ □T

B.         A  ⊢  □ ~□~A

S5.       ~□~A    ⊢  □~□~A

 Satisfaction and Violation in a model

I will call any such principle satisfied in model M if replacing ‘⊢’ in the principle by ‘╞M’ yields a true statement about M, and violated in M if it is not satisfied in M.

Independence.

A principle Q of modal logic is independent of principles Q1, Q2, Q3, … if there is a model M with a truly general frame such that Q1, Q2, Q3, … are satisfied in M and Q is violated in M.

Theorem.  The modal logic principles in Group Two are independent of the principles in Group One.

Let M be a model with frame ARITH and interpretation | |.  The proof is that all principles in Group One are satisfied in M, and all the principles in Group Two are violated in M. 

The satisfaction of Group One follows in the main from facts T1. through T5. about ARITH.  The violation of N, which implies the violation of RN, is due to fact T6., that □W is the set of even numbers, while W is the set of all natural numbers.  B is similarly violated because |□~□~T| is a set of even numbers, and so cannot include |T| = W.  For the violation of S5 we have to look to a proposition other than W, e.g. the set <4] = {1, 2, 3, 4}.  If |A| = <4] then |~□~A|  contains all the odd numbers (as well as 2 and 4), while  |□~□~A| = {2, 4}.  (For more details see Appendix.)

APPENDIX

Let E be the set of even numbers.  Then to prove T1. – T5., replace in each case ‘□’ by ‘E ∩’.  The crucial fact that □W ≠ W is simply that E is a proper subset of W.  Note that if we look at ARITH as an augmented neighborhood frame, N(x) is not closed under supersets but under supersets within E, with E as fixed point, and that suffices to make □ monotone.

To prove the theorem, the remarks there suffice to show that the principles in Group Two are violated.  To spell out the argument for S5, I used the intuitive notation <n] for the initial segment {1, …, n} of W, and we can equally set [ n> for the final segment {n+1, …} of W.

Then ~□~<4] =    ~□[4> = ~{6, 8, …}  =  {1, 2, 3, 4, 5, 7, 9, …} = the union of <4] with the odd numbers.  But then □~□~<4] = {2, 4}, which includes no odd numbers.
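The same computation can be carried out on a finite truncation.  This is a minimal sketch, assuming W is cut off at 10 (so the “…” in the sets above stops at 10); `dia` is an illustrative name for the defined operator ~□~.

```python
N_MAX = 10                                  # assumption: finite truncation of W
W = frozenset(range(1, N_MAX + 1))
E = frozenset(n for n in W if n % 2 == 0)   # the even numbers in W

def box(X):
    """The operator of ARITH: the even numbers that are members of X."""
    return E & X

def neg(X):
    """Complement relative to W."""
    return W - X

def dia(X):
    """The defined operator ~box~."""
    return neg(box(neg(X)))

seg4 = frozenset({1, 2, 3, 4})              # the initial segment <4]
dia_seg4 = dia(seg4)
box_dia_seg4 = box(dia_seg4)

violates_N = box(W) != W                    # N:  |box T| should be W
violates_B = not (W <= box(dia(W)))         # B:  |T| should be in |box ~box~ T|
violates_S5 = not (dia_seg4 <= box_dia_seg4)

print(sorted(dia_seg4))       # [1, 2, 3, 4, 5, 7, 9]
print(sorted(box_dia_seg4))   # [2, 4]
print(violates_N, violates_B, violates_S5)  # True True True
```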

That the principles in Group One are all satisfied in model M follows mainly from facts T1.-T5. about M.  For N◊, see from T5 that |~□ ⊥| = W.  For K, suppose that n is in |□(A ⊃ B)|.  Then n is even and belongs either to |~A| or to |B|.  If it belongs to |~A| then it is not in |A|, and hence not in |□A|.  If it belongs to |B| then, being even, it belongs to |□B|.  Therefore it either does not belong to |□A| or belongs to |□B|.

On Tarski’s Calculus of Systems (2)

The relation between negation and infinity loomed large in the Intuitionist critique of mathematics.  But when we come to Intuitionistic logic, it turns out to be all about the conditional.  Negation comes in just by means of a definition:  ~A =def A → f.  

It is in the logic’s realizations that the intimate relation between negation and infinity comes to the fore.  Recall from the previous post that the lattice of theories of any distributive logic is a complete Heyting lattice with zero element Cn(𝜙), and

T → T’ is the weakest (largest) theory X such that T ∩ X ⊆ T’; 

T → T’ = ⊕{T’’: T ∩ T’’ ⊆  T’}

The pseudo-complement ¬T of T, if it exists, must be the weakest element X which is contrary to T, that is, it is such that T ∩ X ⊆ Cn(𝜙).  So we instantiate the above:  ¬ T is the weakest element X such that  T ∩ X ⊆  Cn(𝜙), hence:

¬ T  = T → Cn(𝜙).  Equivalently, ¬ T = ⊕{T’: T ∩ T’ ⊆  Cn(𝜙)}. 

So much, so general.  How does it happen that the Laws of Excluded Middle and of Double Negation fail in Intuitionistic logic, and their corresponding principles fail in the calculus of systems?

To continue, let L be classical propositional logic, with & and ~ as primitive connectives and let S be the set of all sentences.  Thus S = Cn(S) is the unit (top) of the lattice. For brevity, if A is a sentence I will write  “Cn(A)” for “Cn({A})”.

Note.  Contrariety, the condition T ∩ T’ ⊆  Cn(𝜙), is different from mutual inconsistency.  With p, q atomic sentences, Cn(𝜙) is contrary to but consistent with Cn(p), while Cn(p) and Cn(~p) are both mutually inconsistent and contrary.  But Cn(p & q) and Cn(~p & q) are mutually inconsistent (their join is S) but not contrary, for their intersection contains the non-theorem q.  Contrariety is this:
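The three cases in this note can be checked semantically.  The sketch below assumes a language with just the two atoms p and q, and represents a theory Cn(A) by the set of valuations satisfying A; on that representation Cn(A) ∩ Cn(B) = Cn(A v B), so two theories are contrary exactly if their model sets together cover all valuations, and mutually inconsistent exactly if their model sets share no valuation.  The names `mod`, `contrary` and `mutually_inconsistent` are illustrative.

```python
from itertools import product

ATOMS = ("p", "q")
ALL = frozenset(product([False, True], repeat=len(ATOMS)))  # all valuations

def mod(f):
    """Model set of a sentence, given as a function on valuation dicts."""
    return frozenset(v for v in ALL if f(dict(zip(ATOMS, v))))

def contrary(m1, m2):
    """The intersection of the two theories contains only theorems."""
    return (m1 | m2) == ALL

def mutually_inconsistent(m1, m2):
    """The join of the two theories is the whole language S."""
    return not (m1 & m2)

trivial = mod(lambda v: True)                      # Cn of the empty set
p = mod(lambda v: v["p"])
not_p = mod(lambda v: not v["p"])
p_and_q = mod(lambda v: v["p"] and v["q"])
notp_and_q = mod(lambda v: (not v["p"]) and v["q"])

print(contrary(trivial, p), mutually_inconsistent(trivial, p))        # True False
print(contrary(p, not_p), mutually_inconsistent(p, not_p))            # True True
print(contrary(p_and_q, notp_and_q),
      mutually_inconsistent(p_and_q, notp_and_q))                     # False True
```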

Theorem 1.  T ∩ T’ ⊆  Cn(𝜙) iff for every non-theorem A in T there is a consistent extension of T’ that includes ~A.

To prove this we can appeal to the general features of theories in classical logic:

  • any consistent theory is part of a maximal consistent theory 
  • every theory is the intersection of all its maximal consistent extensions
  • for all sentences A, if T is a maximal consistent theory then T contains either A or ~A 
  • if T does not imply A then T has a maximal consistent extension that includes ~A 

Proof.  

  •  Let A be any L-non-theorem in T.   Suppose that T’ has no consistent extension that includes ~A.  Then A is in all the maximal consistent extensions of T’, hence in T’, hence in T ∩ T’,  which is therefore not included in Cn(𝜙).
  •  Suppose that for every L-non-theorem A in T, T’ has a consistent extension which includes ~A.  Then if A is an L-non-theorem in T, A is not in T’, and hence not in T ∩ T’.  If A is an L-non-theorem that is not in T then it is also not in T ∩ T’.  Therefore T ∩ T’ contains only L-theorems, and is Cn(𝜙).

Lemma 1. Cn(~A) ⊆ ¬Cn(A)

If E is in Cn(A) there is a sentence B such that E is equivalent to A v B.  For each such sentence Cn(~A) has a consistent extension that includes ~(A v B).

  • If ~A entails B then it too entails A v B, so then A v ~A entails A v B, hence A v B is an L-theorem.
  • If ~A does not entail B then Cn(~A) has a maximal consistent extension that includes ~B, which includes ~A & ~B.

So by Theorem 1, Cn(A) ∩ Cn(~A) ⊆ Cn(𝜙). Hence Cn(~A) ⊆ Cn(A) → Cn(𝜙) = ¬Cn(A).

Theorem 2.  If T is finitely axiomatizable then T ⊕ ¬T = S

Because of the classical conjunction rules, if T is finitely axiomatizable then there is a sentence A such that T = Cn(A).  By the lemma, Cn(A) ⊕ Cn(~A) ⊆ Cn(A) ⊕ ¬Cn(A).  But Cn(A) ⊕ Cn(~A) = Cn(Cn(A) ∪ Cn(~A)) = S.

It follows then that violations of Excluded Middle can only be by theories that are not finitely axiomatizable.

To pursue this we need to improve on the above lemma.

Lemma 2.   T ∩ T’ ⊆  Cn(𝜙) iff for every non-theorem A in T there is a maximal consistent extension of T’ that includes ~A.

This follows at once from Theorem 1.

Lemma 3.  If T ∩ T’ ⊆  Cn(𝜙) then T’ ⊆ ∩{Cn(~A): A is in T}

Suppose T ∩ T’ ⊆  Cn(𝜙) and that A is an L-non-theorem of T.  Let K_T’(A) be the set of all consistent extensions of T’ that include ~A.  So ∩K_T’(A) ⊆ Cn(~A). Let M_T’ be the set of all maximal consistent extensions of T’.  Then for each L-non-theorem A of T:

            T’ = ∩M_T’ ⊆ ∩K_T’(A) ⊆ Cn(~A). 

Since T’ has a consistent extension that includes ~A for each L-non-theorem in T it follows that T’ ⊆ ∩{Cn(~A): A is an L-non-theorem in T}.  From this the lemma follows because if A is an L-theorem then Cn({~A}) = S, which has no effect on the intersection.

Theorem 3.    ¬T = ∩{Cn(~A): A in T}.

First, it is clear that for every L-non-theorem A in T, ∩{Cn(~A): A in T} has a consistent extension that includes ~A.  Therefore ∩{Cn(~A): A in T}  ⊆  ¬T by Theorem 1 and definition.

Secondly,  T ∩ ¬T ⊆  Cn(𝜙), so by Lemma 3, ¬T ⊆ ∩{Cn(~A): A is in T}.

Corollaries:  Cn({~A}) = ¬Cn(A), ¬Cn(𝜙) = S, ¬S = Cn(𝜙)

This theorem gives us a way to identify the pseudo-complement of theories that are not finitely axiomatizable.

Example. Let AT be the infinite list of atomic sentences p1, …, pm , … and T = Cn(AT).  Let ATfin be the set of finite conjunctions of atomic sentences. Every member of T is equivalent to a finite conjunction C of atomic sentences disjoined with some other sentence A (e.g. p v q).  

So ¬T = ∩{Cn(~(C v A)): C in ATfin}

As an example, consider the L-non-theorem (p1 & ~p2). Could it belong to ¬T?  It does not belong either to Cn(~p1) or to Cn(~p2). Nor does it belong to Cn(~q) for any q in ATfin other than  p1 or p2.

We can generalize this reasoning to cover any L-non-theorem.  Since L is classical propositional logic, we can think in terms of truth-table rows or their generalization: 

A finite state-description is a consistent conjunction p*1 & …& p*m of atomic sentences each of which either has or does not have an appended negation sign. 

In classical propositional logic,  A is an L-non-theorem if and only if there is a finite state-description B such that B ⊢L ~A.  For any such B, since B is finite and AT is infinite, there is an atomic sentence q which does not appear in B.  But then B is not in Cn(~q), hence not in ¬T.  Therefore there is no derivation of ~A in ¬T.  Hence ¬T = Cn(𝜙).

This gives us a counterexample to both Excluded Middle and Double Negation, with T = Cn(AT):

            T ⊕ ¬T = T and T ≠ S

            ¬ ¬T = ¬ Cn(𝜙) = S and S ≠ T

The following analogy strikes me as apt.  Remember that ¬T  is also the join of its contraries.  There are uncountably many infinite state descriptions, all inconsistent with each other.  So we can look to continua for analogies. In a geometric space, the join of a family of subspaces is the least subspace that contains them all.  Suppose P is a plane; its subspaces are the straight lines through the origin.  Take away one such straight line:  the join of the remaining ones is still the entire plane.

What Could Be the Most Basic Logic?

It was only in the 19th century that alternatives to Euclidean geometry appeared.  What was to be respected as the most basic geometry for the physical sciences: Euclidean, non-Euclidean with constant curvature, projective?  Frege, Poincaré, Russell, and Whitehead were, to various degrees, on the conservative side on this question.[1]  

In the 20th, alternatives to classical logic appeared, even as it was being created in its present form.  First Intuitionistic logic, then quantum logic, and then relevant and paraconsistent logics, each with a special claim to be more basic, more general in its applicability, than classical logic.

Conservative voices were certainly heard.  John Burgess told his seminars “Heretics in logic should be hissed away!”.  David Lewis described relevant and paraconsistent logic as logic for equivocators.  The other side was not quiet.  Just as Hans Reichenbach gave a story of coherent experience in a non-Euclidean space, so Graham Priest wrote a story of characters remaining seemingly coherent throughout a self-contradictory experience.

Unlike in the case of Euclidean geometry, the alternatives offered for propositional logic have all been weaker than classical logic.  So how weak can we go?  What is weaker, but still sufficiently strong, to qualify as “the” logic, logic simpliciter?

I am very attracted to the idea that a certain subclassical logic (FDE) has a better claim than classical logic to be “the” logic, the most basic logic.  It is well studied, and would be quite easy to teach as a first logic class. Beall (2018) provides relevant arguments here – the arguments are substantial, and deserve discussion.  But I propose to reflect on what the question involves, how it is to be understood, from my own point of view, to say why I find FDE attractive, and what open questions I still have.

1.      A case for FDE

The question what is the most basic logic sounds factual, but I cannot see how it could be.  However, a normative claim of the form

Logic L is the weakest logic to be respected in the formulation of empirical or abstract theories

seems to make good sense.  We had the historical precedent of Hilary Putnam’s claiming this for quantum logic.  I will come back to that claim below, but I see good reasons to say that FDE is a much better candidate.

2.      Starting a case for FDE

FDE has no theorems.  FDE is just the FDE consequence relation, the relation originally called tautological entailment, and FDE recognizes no tautologies.  Let us call a logic truly simple if it has no theorems.

To be clear: I take L to be a logic only if it is a closure operator on the set of sentences of a particular syntax.  The members of L(X) are the consequences of X in L, or the L-consequences of X; they are also called the sentences that X entails in L.  A sentence A is a theorem  of L iff A is a member of L(X) for all X.  The reason why FDE has no theorems is that it meets the variable-sharing requirement: that is to say, B is an L-consequence of A only if there is an atomic sentence that is a component of both B and A.
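
These claims can be checked mechanically in small cases. What follows is only a minimal sketch, and it assumes the familiar four-valued (Belnap-Dunn) semantics for FDE, which I have not spelled out here: a sentence may be told true, told false, both, or neither, and entailment preserves being at least told true. The tuple encoding of formulas and the function names are my own illustrative choices.

```python
from itertools import product

# A value is a pair (told_true, told_false); all four combinations are allowed.
VALUES = [(True, False), (False, True), (True, True), (False, False)]

def atoms(f):
    """Set of atomic sentences occurring in formula f."""
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def value(f, v):
    """Evaluate formula f under valuation v (a dict from atoms to VALUES)."""
    if isinstance(f, str):
        return v[f]
    op = f[0]
    if op == "not":
        t, fl = value(f[1], v)
        return (fl, t)                      # negation swaps the two marks
    (t1, f1), (t2, f2) = value(f[1], v), value(f[2], v)
    if op == "and":
        return (t1 and t2, f1 or f2)
    if op == "or":
        return (t1 or t2, f1 and f2)
    raise ValueError(f"unknown connective: {op}")

def entails(premises, conclusion):
    """True iff every valuation making all premises told-true does so for the conclusion."""
    names = sorted(set().union(atoms(conclusion), *(atoms(p) for p in premises)))
    for combo in product(VALUES, repeat=len(names)):
        v = dict(zip(names, combo))
        if all(value(p, v)[0] for p in premises) and not value(conclusion, v)[0]:
            return False
    return True

# Illustrative checks (hypothetical atoms p, q):
print(entails(["p"], ("or", "p", "q")))             # shared atom, entailment holds
print(entails([], ("or", "p", ("not", "p"))))       # Excluded Middle is not a theorem
print(entails([("and", "p", ("not", "p"))], "q"))   # no shared atom, no entailment
```

The failures arise exactly as variable-sharing predicts: give every atom of the premise the value "both" and every atom of the conclusion the value "neither".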

So the initial case for FDE can be this: it is truly simple, as it must be, because

logic does not bring us truths, it is the neutral arbiter for reasoning and argumentation, and supplies no answers of its own. 

To assess this case we need a clear notion of what counts as a logic (beyond its being a closure operator), and what counts as supplying answers.  If I answered someone’s question with “Maybe so and maybe not”, she might well say that I have not told her anything.  But is that literally true?  A. N. Prior once made a little joke, “What’s all the fuss about Excluded Middle?  Either it is true or it is not!”.  We would have laughed less if there had been no Intuitionistic logic.

3.      Allowance for pluralism

My colleague Mark Johnston likes to say that the big lesson of 20th century philosophy was that nothing reduces to anything else.  In philosophy of science pluralism, the denial that for every scientific theory there is a reduction to physics, has been having a good deal of play.

As I mentioned, FDE’s notable feature is the variable-sharing condition for entailment.  If A and B have no atomic sentences in common, then A does not entail B in FDE.  So to formulate two theories that are logically entirely independent, choose two disjoint subsets of the atomic sentences of the language.  Within FDE, theories which are formulated in the resulting disjoint sublanguages will lack any connection whatsoever.    

4.      Could FDE be a little too weak?

The most conservative extension, it seems to me, would be to add the falsum, ⊥.  It’s a common impression that adding this as a logical sign, with the stipulation that all sentences are consequences of ⊥, is cost-less.  

But if we added it to FDE semantics with the stipulation that ⊥ is false and never true, on all interpretations, then we get a tautology after all: ~⊥.  The corresponding logic, call it FDE+, then has ~⊥ as a theorem.   So FDE+ is not truly simple; it fails the above criterion for being “the” logic.  Despite that common impression, it is stronger than FDE, although the addition looks at once minimal and important.  Is FDE missing out on too much?

How should we think of FDE+?  

Option one is to say that ⊥, a propositional constant, is a substantive statement, that adding it is like adding “Snow is white”, so its addition is simply the creation of a theory of FDE.

Option two is to say that FDE+ is a mixed logic, not a pure logic.  The criterion I would propose for this option is this:

A logic L defined on a syntax X is pure if and only if every syntactic category except that of the syncategoremata (the logical and punctuation signs) is subject to the rule of substitution.

So for example, in FDE the only relevant category is the sentences, and if any set of premises X entails A in FDE, then any systematic substitution of sentences for atomic sentences in X entails the corresponding substitution in A.  

But in FDE+ substitution for atomic sentence ⊥ does not preserve entailment in general.  Hence FDE is a pure logic, and FDE+ is not.
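
The failure of substitution can be seen in a small computation. This is a sketch under the same assumption as before, namely a four-valued semantics in which each sentence carries a pair (told true, told false), with ⊥ stipulated to be false and never true. The encoding and the names `is_theorem` and `substitute` are hypothetical, chosen only for illustration.

```python
from itertools import product

VALUES = [(True, False), (False, True), (True, True), (False, False)]
BOT = "⊥"   # treated as a logical constant, not as an ordinary atom

def atoms(f):
    """Atomic sentences of f, not counting the constant ⊥."""
    if isinstance(f, str):
        return set() if f == BOT else {f}
    return set().union(*(atoms(g) for g in f[1:]))

def value(f, v):
    if f == BOT:
        return (False, True)        # stipulation: false and never true
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        t, fl = value(f[1], v)
        return (fl, t)
    (t1, f1), (t2, f2) = value(f[1], v), value(f[2], v)
    if f[0] == "and":
        return (t1 and t2, f1 or f2)
    return (t1 or t2, f1 and f2)    # anything else is read as "or"

def is_theorem(f):
    """True iff f is told-true under every valuation of its atoms."""
    names = sorted(atoms(f))
    return all(value(f, dict(zip(names, combo)))[0]
               for combo in product(VALUES, repeat=len(names)))

def substitute(f, old, new):
    """Replace every occurrence of the sentence `old` in f by `new`."""
    if isinstance(f, str):
        return new if f == old else f
    return (f[0],) + tuple(substitute(g, old, new) for g in f[1:])

not_bot = ("not", BOT)
print(is_theorem(not_bot))                        # ~⊥ is a theorem of FDE+
print(is_theorem(substitute(not_bot, BOT, "p")))  # ~p, its substitution instance, is not
```

So the one theorem FDE+ has is lost as soon as an ordinary atomic sentence is put in place of ⊥.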

The two options are not exclusive.  By the usual definition, a theory of logic L is a set of sentences closed under entailment in L.  So the set of theorems of FDE+ is a theory of FDE.  However, it is a theory of a very special sort, not like the sort of theory that takes the third atomic sentence (which happens to be “Snow is white”) as its axiom.  

Open question: how could we spell out this difference between these two sorts of theories?  

5.      Might FDE be too strong?

FDE is weak compared to classical logic, but not very weak.  What about challenges to FDE as too strong?  

It seems to me that any response to such a challenge would have to be to argue that a notion of consequence weaker than FDE would be at best a closure operator of logical interest.  But the distinction cannot be empty or a matter of fiat.

Distributivity

The first challenge to classical logic that is also a challenge to FDE came from Birkhoff and von Neumann, and was to distributivity.  They introduced quantum logic, and at one point Hilary Putnam championed that as a candidate for “the” logic.  Putnam’s arguments did not fare well.[2]  

But there are simpler examples that mimic quantum logic in the relevant respect.

Logic of approximate value-attributions  

Let the propositions (which sentences can take as semantic content) be the couples [m, E], with E  an interval of real numbers – to be read as “the quantity in question (m) has a value in E”.

The empty set 𝜙 is counted as an interval.  The operations on these propositions are defined:

[m, E]  ∧ [m, F] = [m, E ∩ F]

[m, E]  v [m, F]  =  [m, E Θ F], 

where E Θ F is the least interval that contains E ∪ F.

Then if E, F, G are the disjoint intervals  (0.3, 0.7), [0, 0.3], and [0.7, 1],  

[m, E] ∧ ([m, F] v [m, G]) = [m, E] ∧ [m, [0, 1]] = [m, E]

([m, E] ∧ [m, F]) v ([m, E] ∧ [m, G]) = [m, 𝜙] v [m, 𝜙] = [m, 𝜙]

which violates distributivity.
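
The computation can be replayed in a few lines. This is a sketch only: the quantity m is held fixed and so dropped from the representation, `None` stands in for the empty interval 𝜙, and the `Interval` class with its open/closed flags is my own illustrative encoding.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Interval:
    lo: float
    hi: float
    lo_closed: bool = True
    hi_closed: bool = True

Prop = Optional[Interval]   # None plays the role of the empty interval

def meet(a: Prop, b: Prop) -> Prop:
    """[m, E] ∧ [m, F]: the intersection E ∩ F."""
    if a is None or b is None:
        return None
    if a.lo != b.lo:                       # the larger lower end wins
        src = a if a.lo > b.lo else b
        lo, lo_c = src.lo, src.lo_closed
    else:                                  # tie: closed only if both are closed
        lo, lo_c = a.lo, a.lo_closed and b.lo_closed
    if a.hi != b.hi:                       # the smaller upper end wins
        src = a if a.hi < b.hi else b
        hi, hi_c = src.hi, src.hi_closed
    else:
        hi, hi_c = a.hi, a.hi_closed and b.hi_closed
    if lo > hi or (lo == hi and not (lo_c and hi_c)):
        return None
    return Interval(lo, hi, lo_c, hi_c)

def join(a: Prop, b: Prop) -> Prop:
    """[m, E] v [m, F]: the least interval containing E ∪ F."""
    if a is None:
        return b
    if b is None:
        return a
    if a.lo != b.lo:                       # the smaller lower end wins
        src = a if a.lo < b.lo else b
        lo, lo_c = src.lo, src.lo_closed
    else:                                  # tie: closed if either is closed
        lo, lo_c = a.lo, a.lo_closed or b.lo_closed
    if a.hi != b.hi:                       # the larger upper end wins
        src = a if a.hi > b.hi else b
        hi, hi_c = src.hi, src.hi_closed
    else:
        hi, hi_c = a.hi, a.hi_closed or b.hi_closed
    return Interval(lo, hi, lo_c, hi_c)

E = Interval(0.3, 0.7, lo_closed=False, hi_closed=False)   # (0.3, 0.7)
F = Interval(0.0, 0.3)                                      # [0, 0.3]
G = Interval(0.7, 1.0)                                      # [0.7, 1]

lhs = meet(E, join(F, G))            # E ∧ (F v G)
rhs = join(meet(E, F), meet(E, G))   # (E ∧ F) v (E ∧ G)
print(lhs, rhs)
```

The left-hand side comes out as E itself and the right-hand side as the empty interval, matching the two lines displayed above.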

This looks like a good challenge to distributivity if the little language I described is a good part of our natural language, and if it can be said to have a logic of its own.

The open question:  

if we can isolate any identifiable fragment of natural language  and show that taken in and by itself, it has a logical structure that violates a certain principle, must “the” logic, the basic logic, then lack that principle?

Closure and conflict

We get a different, more radical, challenge from deontic logic.  In certain deontic logics there is allowance for conflicting obligations.  Suppose an agent is obliged to do X and also obliged to refrain from doing X, for reasons that cannot be reconciled.  By what logical principles do these obligations imply further obligations?  At first blush, if doing X requires doing something else, then he is obliged to do that as well, and similarly for what ~X requires.  But he cannot be obliged to both do and refrain from doing X: ought implies can.

Accordingly, Ali Farjami introduced the Up operator.  It is defined parasitic on classical logic: a set X is closed under Up exactly if X contains the classical logical consequences of each of its members.  For such an agent, caught up in moral conflict, the set of obligations he has is Up-closed, but not classical-logic closed.

If we took Up to be a logic, then it would be a logic in which premises A, B do not entail (A & B). Thus FDE has a principle which is violated in this context.

To head off this challenge one riposte might be that in deontic logic this sort of logical closure applies within the scope of a prefix.  The analogy to draw on may be with prefixes like “In Greek mythology …”, “In Heinlein’s All You Zombies …”.  

Another riposte can be that FDE offers its own response to the person in irresolvable moral conflict.  He could accept that the set of statements A such that he is obliged to see to it that A, is an FDE theory, not a classical theory.  Then he could say: “I am obliged to see to it that A, and also that ~A, and also that (A & ~A).  But that does not mean that anything goes, I have landed in a moral conflict, but not in a moral black hole.”

Deontic logic and motivation from ethical dilemmas only provide the origin for the challenge, and may be disputed.  Those aside, we still have a challenge to meet.

We have here another departure from both classical logic and FDE in an identifiable fragment of natural language.  So we have to consider the challenge abstractly as well.  And it can be applied directly to FDE.

Up is a closure operator on sets of sentences, just as is any logic.  Indeed, if C is any closure operator on sets of sentences then the operator

Cu:   Cu(X) = ∪{C({A}): A in X}

is also a closure operator thereon.  (See Appendix.)
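
To see concretely what such an operator gives up, here is a small sketch with C taken to be classical consequence. To keep every set finite I restrict consequence to a fixed finite universe of sentences; that restriction, the tuple encoding, and the names are assumptions made only for this illustration.

```python
from itertools import product

def atoms(f):
    if isinstance(f, str):
        return {f}
    return set().union(*(atoms(g) for g in f[1:]))

def holds(f, v):
    """Classical two-valued truth of formula f under valuation v."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return not holds(f[1], v)
    if f[0] == "and":
        return holds(f[1], v) and holds(f[2], v)
    return holds(f[1], v) or holds(f[2], v)   # anything else is read as "or"

def C(X, universe):
    """Classical consequences of X among the sentences in `universe`."""
    names = sorted(set().union(*(atoms(a) for a in universe)))
    rows = [dict(zip(names, bits))
            for bits in product([True, False], repeat=len(names))]
    models = [v for v in rows if all(holds(a, v) for a in X)]
    return {a for a in universe if all(holds(a, v) for v in models)}

def Cu(X, universe):
    """Union of the consequences of each member of X taken singly."""
    out = set()
    for a in X:
        out |= C({a}, universe)
    return out

P_AND_Q = ("and", "p", "q")
UNIVERSE = {"p", "q", P_AND_Q, ("or", "p", "q")}
premises = {"p", "q"}

print(P_AND_Q in C(premises, UNIVERSE))    # conjunction introduction holds for C
print(P_AND_Q in Cu(premises, UNIVERSE))   # but not for Cu
```

Each premise still yields its own consequences, such as (p v q), but nothing is obtained from the premises jointly.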

So we can also ask about FDEu.  Is it a better candidate to be “the” logic?  

FDEu is weaker than FDE, and it is both pure and truly simple.  But it sounds outrageous, that logic should lack the rule of conjunction introduction!

6.      Coda

We could give up and just say: for any language game that could be played there is a logic – that is all.

But a normative claim of the form

Logic L is the weakest logic to be respected in the formulation of empirical or abstract theories

refers to things of real life importance.  We are not talking about just any language game.  

Last open question:  if we focus on the general concept of empirical and abstract theories, can we find constraints on how strong that weakest logic has to be?

FDE is both pure and truly simple. Among the well-worked out, well studied, and widely applicable logics that we already have, it is the only one that is both pure and truly simple.  That is the best case I can make for it so far.

7.      APPENDIX

An operator C on a set S is a closure operator iff it maps subsets of S to subsets of S such that, for all subsets X and Y of S:

  1. X ⊆ C(X)
  2. CC(X) = C(X)
  3. If X ⊆ Y then C(X) ⊆ C(Y)

Definition.  Cu(X) = ∪{C({A}): A in X}.  

Proof that Cu is a closure operator:

  •  X ⊆ Cu(X).  For if A is in X, then A is in C({A}), hence in Cu(X).
  •  CuCu (X) = Cu(X).  Right to left follows from the preceding.  Suppose A is in CuCu (X).  Then there is a member B of Cu(X) such that A is in C({B}), and a member  E of X such that B is in C({E}). Therefore A is in CC({E}).  But CC({E}) = C({E}), so A is in  Cu(X).  
  • If X ⊆ Y then Cu(X) ⊆ Cu(Y).  For suppose X ⊆ Y. Then {C({A}): A in X} ⊆ {C({A}): A in Y}, so Cu(X) ⊆  Cu(Y).
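
The proof can be spot-checked by brute force on a toy case. The sketch below is not drawn from the discussion above: as a stand-in for C it uses a closure operator from elementary algebra, the subgroup generated by a set of elements of the integers mod 6, simply because all 64 subsets can be enumerated. The function names are hypothetical.

```python
from itertools import combinations

S = range(6)   # carrier of the toy example: the integers mod 6

def C(X):
    """Toy closure operator: the subgroup of Z_6 generated by X."""
    out = {0} | set(X)
    while True:
        new = out | {(a + b) % 6 for a in out for b in out}
        if new == out:
            return frozenset(out)
        out = new

def Cu(X):
    """Cu(X) = union of C({A}) for A in X."""
    result = set()
    for a in X:
        result |= C({a})
    return frozenset(result)

def subsets(s):
    items = list(s)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_closure_operator(op):
    """Check the three conditions on every subset of S."""
    subs = subsets(S)
    for X in subs:
        if not X <= op(X):            # 1. X ⊆ C(X)
            return False
        if op(op(X)) != op(X):        # 2. CC(X) = C(X)
            return False
    return all(op(X) <= op(Y)         # 3. monotonicity
               for X in subs for Y in subs if X <= Y)

print(is_closure_operator(C), is_closure_operator(Cu))
print(sorted(C({2, 3})), sorted(Cu({2, 3})))
```

Both operators pass the three conditions, yet they differ: C({2, 3}) is the whole set, while Cu({2, 3}) omits 1 and 5, which can only be reached by combining the two generators.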

8.      REFERENCES

Beall, Jc. (2018) “The Simple Argument for Subclassical Logic”. Philosophical Issues.

Cook, Roy T.  (2018) “Logic, Counterexamples, and Translation”.  Pp. 17- 43 in Geoffrey Hellman and Roy T. Cook (Eds.) (2018) Hilary Putnam on Logic and Mathematics.  Springer.

Hellman, Geoffrey (1980). “Quantum logic and meaning”. Proceedings of the Philosophy of Science Association 2: 493–511.

Putnam, Hilary (1968) “Is Logic Empirical?” Pp. 216-241 in Cohen, R. and Wartofsky, M. (Eds.). (1968). Boston studies in the philosophy of science (Vol. 5). Dordrecht.   Reprinted as “The logic of quantum mechanics”. Pp. 174–197 in Putnam, H. (1975). Mathematics, matter, and method: Philosophical papers (Vol. I). Cambridge.

Russell, Bertrand (1897) An Essay on the Foundations of Geometry. Cambridge.

NOTES


[1] For example, Russell concluded that the choice between Euclidean and non-Euclidean geometries is empirical, but spaces that lack constant curvature “we found logically unsound and impossible to know, and therefore to be condemned a priori” (Russell 1897: 118).

[2] See Hellman (1980) and Cook (2018) especially for critical examination of Putnam’s argument.

The curious roles atomic sentences can play (1)

[A reflection on papers by Hiz and Thomason, listed at the end.  Throughout I will use my own symbols for connectives, to keep the text uniform.]

Atomic sentences, we say, are not a special species.  They could be anything; they are just the ones we leave unanalyzed.  What we study is the structures built from them, such as truth-functional compounds.

But that innocuous looking “They could be anything” opens up some leeway.  It allows that the atomic sentences could have values or express propositions that the complex sentences cannot.  I will discuss two examples of how this leeway can be exploited for proofs of incompleteness.

The story I want to tell starts with a small error by Paul Halmos in 1956.

Halmos and Hiz

 In his 1956 paper Paul Halmos wanted to display the classical propositional calculus with just & and ~ as primitive connectives.  (Looks familiar, what could be the problem?)  As guide he took the presentation in Hilbert and Ackermann, with v and ~ as primitives. For brevity and ease of reading they had introduced “x ⊃ y” as abbreviation for “~x v y”.

  a. (x v x) ⊃ x
  b. x ⊃ (x v y)
  c. (x v y) ⊃ (y v x)
  d. (x ⊃ y) ⊃ (z v x . ⊃ . z v y )

Knowing how truth functions work, Halmos (1956: 368) treated “x v y”  as abbreviation of “~(~x & ~y)” and “x ⊃ y” as abbreviation of “~(x & ~y)”, to read Hilbert and Ackermann’s axioms. That means that his formulation, with ~ and & primitive, was this:

  a. ~[~(~x & ~x) & ~x]
  b. ~[x & ~~(~x & ~y)]
  c. ~[~(~x & ~y) & ~~(~y & ~x)]
  d. ~[~(x & ~y) & ~[~[~(~z & ~x) & ~~(~z & ~y)]]]

But, unlike what it translates (Hilbert and Ackermann’s), this set  of axioms is not complete!
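
The translation step itself is mechanical, and a short sketch can carry it out and confirm that the results are classical tautologies. I take Hilbert and Ackermann's fourth axiom here in its standard form, (x ⊃ y) ⊃ ((z v x) ⊃ (z v y)); the tuple encoding and the function names are illustrative assumptions of mine.

```python
from itertools import product

def translate(f):
    """Rewrite 'or' and 'imp' in terms of 'not' and 'and' only."""
    if isinstance(f, str):
        return f
    op, args = f[0], [translate(g) for g in f[1:]]
    if op == "or":    # x v y  becomes  ~(~x & ~y)
        return ("not", ("and", ("not", args[0]), ("not", args[1])))
    if op == "imp":   # x ⊃ y  becomes  ~(x & ~y)
        return ("not", ("and", args[0], ("not", args[1])))
    return (op, *args)

def show(f):
    """Print a formula built from 'not' and 'and' in the ~, & notation."""
    if isinstance(f, str):
        return f
    if f[0] == "not":
        return "~" + show(f[1])
    return "(" + show(f[1]) + " & " + show(f[2]) + ")"

def holds(f, v):
    """Classical truth value of a 'not'/'and' formula under valuation v."""
    if isinstance(f, str):
        return v[f]
    if f[0] == "not":
        return not holds(f[1], v)
    return holds(f[1], v) and holds(f[2], v)

def is_tautology(f, names=("x", "y", "z")):
    return all(holds(f, dict(zip(names, bits)))
               for bits in product([True, False], repeat=len(names)))

AXIOMS = [
    ("imp", ("or", "x", "x"), "x"),
    ("imp", "x", ("or", "x", "y")),
    ("imp", ("or", "x", "y"), ("or", "y", "x")),
    ("imp", ("imp", "x", "y"), ("imp", ("or", "z", "x"), ("or", "z", "y"))),
]

for axiom in AXIOMS:
    t = translate(axiom)
    print(show(t), is_tautology(t))
```

That every translated axiom is a tautology bears only on soundness. It is silent on completeness, which is exactly where the trouble lies.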

Henryk Hiz (1958) showed why not.  (He mentioned that Halmos had raised the possibility himself in a conversation, and Rosser had done so as well, in a letter to Halmos.)

Let’s look for a difference in the roles of atomic sentences and of complex sentences in Halmos’ axiom set.  What springs to the eye in Axiom b. is that there is an occurrence of x that is preceded by ~, and one that is not so preceded but ‘stands by itself’.  So we can make trouble by allowing an atomic sentence x to take values that a negated sentence ~x cannot have.  

That is what Hiz does, with this three-valued truth-table where an atomic sentence x could have value 1, 2, or 3, but ~x can only have values 1 or 3. 

(He writes A and N for my  & and ~.) 

So if x has value 2 then ~(~x & x) has value ~ (~2 & 2) = ~(1 & 2) = ~1 = 3, which is not designated.  So there is a classical tautology, the traditional Non-Contradiction Principle, that does not receive a designated value.  

In this three-valued logic neither conjunction nor negation behaves classically, but all of Halmos’ axioms have the designated value 1.  So his formulation of classical sentential logic is sound but not complete.

Thomason

Thomason’s (2018) argument and technique, which I discussed in a previous post, were very close to Hiz’, but applied to modal logic.

In modal logic the basic K axiom can be formulated in at least these three ways:

  i. □(x ⊃ y) ⊃ (□x ⊃ □y)
  ii. ◊(x v y) ≡ (◊x v ◊y)
  iii. ~◊~(x ⊃ y) ⊃ (~◊~x ⊃ ~◊~y)

The third is a translation of the first with “□” translated as “~◊~”.  In the previous post (“Is Possibility-Necessity Duality Just a Definition”, 07/17/2025) I explained Thomason’s model in which that third formulation of K is satisfied, but the Duality principle is shown to be independent.  Here I will show that satisfaction of Axiom (iii) is compatible with a violation of Axiom (ii). 

Thomason presented a model with 8 values for the propositions.  I’ll use here the smaller 5-valued model which I described in the post. My presentation here, in a slightly adapted form, is sufficient for our purpose.  

This structure (matrix) is made up of the familiar 2-atom Boolean lattice B = {T, 1, ~1, ⊥} with the addition of an ‘alien’ element k.  The meet and join on B are operators ∧ and  +. The operator ~ is the usual complement on B.  The only designated element is T.

To extend the operators to the alien element, we set ~k = ~1.  So x can take any of the five values but ~x can only have a value in B.

What about the joins and meets of elements when one of them is alien?  They are all in B too, with these definitions:

Define.  x* = ~~x, called the Twin of x.  (Clearly x = x* except that k* = 1.) 

Define.  For any elements x and y:   x & y = x* ∧ y*, and x v y = x* + y*.

Finally the possibility operator is defined by: ◊x = T iff x = 1 or T;  ◊x =  ⊥ otherwise.  

Instances of Axiom (iii) always get the designated value (by inspection; note that every non-modal sentential part starts with ~). 

But in Axiom (ii) we see the leeway, due to the fact that x can be any element.  The negation, join, or meet of anything with anything can only take values in B.  So Axiom (ii) does not always get a designated value, for if we set x = y = k, we get the result:

◊(k v k) = ◊(k* + k*) = ◊1 = T

(◊k v ◊k) = ⊥* + ⊥* = ⊥
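
The whole matrix is small enough to check exhaustively. In the sketch below the Boolean part B is encoded as two-bit masks so that meet and join become bitwise operations. Reading x ⊃ y as ~x v y and x ≡ y as (x ⊃ y) & (y ⊃ x) is an assumption I make for the computation, and the function names are hypothetical.

```python
T, ONE, NOT_ONE, BOT, K = "T", "1", "~1", "⊥", "k"
ELEMENTS = [T, ONE, NOT_ONE, BOT, K]

# The Boolean lattice B as 2-bit masks: meet is bitwise AND, join is bitwise OR.
BITS = {T: 0b11, ONE: 0b01, NOT_ONE: 0b10, BOT: 0b00}
NAME = {bits: name for name, bits in BITS.items()}

def neg(x):
    """Complement on B, extended by ~k = ~1."""
    return NOT_ONE if x == K else NAME[BITS[x] ^ 0b11]

def twin(x):
    """x* = ~~x, so k* = 1 and x* = x for x in B."""
    return neg(neg(x))

def conj(x, y):
    return NAME[BITS[twin(x)] & BITS[twin(y)]]   # x & y = x* ∧ y*

def disj(x, y):
    return NAME[BITS[twin(x)] | BITS[twin(y)]]   # x v y = x* + y*

def dia(x):
    return T if x in (ONE, T) else BOT           # ◊x = T iff x = 1 or T

def imp(x, y):
    return disj(neg(x), y)                       # assumed reading of ⊃

def iff(x, y):
    return conj(imp(x, y), imp(y, x))            # assumed reading of ≡

def box3(x):
    return neg(dia(neg(x)))                      # ~◊~x

def axiom_ii(x, y):
    return iff(dia(disj(x, y)), disj(dia(x), dia(y)))

def axiom_iii(x, y):
    return imp(box3(imp(x, y)), imp(box3(x), box3(y)))

print(all(axiom_iii(x, y) == T for x in ELEMENTS for y in ELEMENTS))
print(axiom_ii(K, K))
```

Axiom (iii) receives the designated value on all 25 assignments, while Axiom (ii) receives ⊥ at x = y = k.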

In Thomason’s article this technique is used to show that with formulation (iii) of K, the duality ¬◊¬x = □x is independent, and needs to be added as an axiom rather than a definition.  

Axiom (ii), with the attendant rules changed mutatis mutandis, and the Duality introduced as a definition, is a complete formulation of system K (cf. Chellas 1980: 117, 122).  A formulation that has Axiom (iii) instead of Axiom (ii) is not.  

Hiz’ warning was well taken.

References

Chellas, Brian F. (1980) Modal Logic: An Introduction. Cambridge.

Hiz, Henryk  (1958) “A Warning about Translating Axioms”. Am. Math. Monthly 65: 613-614.

Thomason, Richmond H. (2018) “Independence of the Dual Axiom in Modal K with Primitive  ◊”.  Notre Dame Journal of Formal Logic 59: 381-385.

Feyerabend and Sellars on Language and Experience

  • 1.         Feyerabend on experience and its reports
  • 2.         Severing meaning from use
  • 3.         Interpretation
  • 4.         Theory-laden-ness of natural language
  • 5.         What could interpretation be then?
  • 6.         The contemporaneous debates about meaning
  • 7.         Wilfrid Sellars on meaning
  • 8.         Application to Feyerabend’s account
  • 9.         CODA: What is my language?

When someone in that crowded theatre shouted “Phlogiston escaping!” we knew that it was false, but of course we ran out at once.

This is a good example to illustrate Paul Feyerabend’s pragmatic theory of observation, as I will explain below.  Feyerabend did leave some questions unanswered.  Thinking about those questions led me to something that I had found perplexing, in Wilfrid Sellars’ correspondence with Roderick Chisholm about intentionality.  

1.         Feyerabend on experience and its reports

When Paul Feyerabend presented his “Attempt at a realistic interpretation of experience” to the Aristotelian Society in February 1958, much of the new scientific realism was already in place.[1] While Feyerabend presents his ideas in an explicit, detailed critique of positivist views of science and experience, we can (and his peers then could) proceed at once to his positive contribution.

This begins with a presentation of the pragmatic theory of observation, which is in the first place about what counts as an observation language.[2]  There are four pragmatic conditions for observation reporting, that we can summarize (using Feyerabend’s own terms) as:

Definition. L is an observation language for a community C of observers, set S of situations, and set A of sentences of L exactly if there is a function F (‘association’) which maps S into the powerset of A, such that given a situation s in S, the members of C are able to come to a quick unanimous decision about whether to accept or reject the sentences in the set F(s), and their acceptance of any of these sentences in F(s) is a reliable indicator of their being in situation s.[3]   

These conditions include nothing at all about the meaning of those sentences.  The role of observation report is entirely separated from any reference to meaning or reference.  

2.        Severing meaning from use

Suppose that in community C the utterance of p is a reliable indicator of the presence of fire to the utterer.  The syntax of p is irrelevant:  p can be learned to have this use by conditioning.  It could be “boojum!” or “fire!” or “phlogiston escaping!” or “rapid oxidation!”.  

“Observability is a pragmatic concept” (Feyerabend 1958:146).  That is, it is a concept that belongs to the analysis of the conditions and contexts of the use of language.  The distinction is Charles Morris’: in semantics we abstract from use, to concentrate on the word-world relation, while in pragmatics the relation studied is three-fold: word, world, user.  

In observation reportage, humans function as measuring instruments.  Perhaps in your car, a red light goes on if and only if the engine is overheating.  There is no logical connection between the color of the light and the temperature of the engine.  But if the light goes on and I ask you “what does that mean?” or “what does that signify?” you will answer “that the engine is overheating”. 

When this little dialogue is transposed, from measurement output to observation reportage, it becomes an example of what Feyerabend calls interpretation.

3.        Interpretation

For L to be not just an observation language but a fully-fledged observation language, Feyerabend submits, it must have an interpretation which determines what its sentences “are supposed to assert” (Feyerabend 1958: 145-46).  

As mentioned above, in making an observation, an organism is acting as a measuring instrument:

“What the observational situation determines (causally) is the acceptance or the rejection of a sentence, i.e. a physical event. In so far as this causal chain involves our own organism we are on a par with physical instruments. But we also interpret the indications of these instruments … and this interpretation is an additional act.” (Feyerabend 1958: 146)

How is that done?  Suppose again that in community C the utterance of p is a reliable indicator of the presence of situation s to the utterer.  When you, who may or may not be a member of C, interpret that utterance of p, you will describe situation s in your own language, and in accordance with your assumptions, presuppositions, theories, and linguistic practices of the community to which you belong.

Accordingly, examples of interpretation must take the following sort of form:

[1] Observers in community C reliably agree to “Phlogiston is escaping” in the presence of fire and reject it in the absence of fire.

[2] Observers in community C reliably agree to “There is fire” in the presence of phlogiston escaping and reject it in the absence of phlogiston escaping.

[3] Observers in community C reliably agree to “Phlogiston is escaping” in the presence of rapid oxidation, and reject it in the absence of rapid oxidation.

Could any of this be said by a member of community C?  

Yes, in the case of [1] or [2], and definitely not in the case of [3].  So in each case we have to take into account who could offer such an interpretation, and what language that person would be speaking.

I think this important, and to be emphasized:  to understand [1]-[3] properly we must in each case imagine ourselves inside the community – possibly, but not generally community C – where we make the statement, or are addressed by someone making that statement.  For in each case the imagined speaker takes for granted or presupposes that the addressees understand the words used to describe the relevant situation.  More: the speaker takes for granted that the addressees would describe those situations in the same way.   

Speakers in a given community report on their experience, by means of words which may apply either correctly or incorrectly (or not be descriptive at all) to what they are actually experiencing.  This is very far from the idea that the semantic content of the observation report describes anything like an immediate, unmediated content of the ‘given’.

Standing outside a certain community we say that they reported reliably on an experience, which they most certainly had, by asserting that there was a phlogiston escape.  They took it to be that.  In that case, how can we think of observation reports as providing the data to which our theories are accountable?

In philosophy of language elsewhere the corresponding question was: how could there be successful reference by means of a false description?  The response was a turn from semantics to pragmatics. On Russell’s theory of descriptions, the phrase “the so-and-so” denotes entity x if and only if x is so-and-so and nothing else is so-and-so.  Keith Donnellan (1966) argued that we may keep the term “denote” for this, but then must recognize another use for which he offered the term “refer”.  That is, someone may use “the so-and-so” to denote what that phrase denotes (whenever it denotes anything), but will be  using it to refer to something which it does not denote, when the conditions are felicitous for that use. 

Plausible ordinary examples abound. For example, there is in the room exactly one man who appears to be drinking a martini, and in discussing him we refer to him as “the man with the martini”.  Our communication about him may be entirely successful, all (or most of) we say about him may be true, although in fact what is in his glass is just water.  David Lewis offered an especially nice example:  “Help me, Stephanie, the cat is fighting with the other cat again!”  In this example the phrase “the cat” does not actually denote anything, but Lewis used it successfully to refer to a specific cat –  a cat he falsely described as being the only cat there! – nevertheless.

In Donnellan’s terminology, denotation is a (semantic) relation between words and things while reference is a (pragmatic) relation between users and use, words, and things.

We can make a similar distinction about truth and a related notion of correctness under the circumstances:  

an observer may be making a correct observation report of fire by asserting “there is phlogiston escaping”, although that statement is literally false.

4.        Theory-laden-ness of natural language

How we interpret the output of a measuring device depends on the theories we currently accept. Galileo designed an instrument to measure the force of the vacuum; today we interpret its results as measuring atmospheric pressure.  Feyerabend insists that the same goes for our observation reports, and codifies this as:

“thesis I: the interpretation of an observation-language is determined by the theories which we use to explain what we observe, and it changes as soon as those theories change.” (Feyerabend 1958: 163, his italics)[4]

When it comes to such an observation report as “Fire!”, that was once interpreted, in a certain community, as reporting a rapid phlogiston escape.  Members of that community could equally well shout “Phlogiston escaping!”, and if they did, everyone, in theoretical agreement or not, would be well advised to prevent themselves from getting burned.

If thesis I. is correct then the way we understand observation reports will change as our theories change.  Is that consequence of the thesis in accord with our history?  

Observational language in use may appear not to change as theories change, because the syntax does not change.  From this we should not infer that meaning is invariant.[5]  One example Feyerabend offers is the changing interpretation of color-reports (Feyerabend 1958: 160-162).  Once the Doppler effect for light is discovered, the report “x is red” is interpreted as a relational statement, with the relative velocity of observer and observed entered as the additional parameter.  On the “human” level, the actual practice of color-reporting does not change, for there the velocities are too small for the effect to be noticed.  But the interpretation does: the scientifically literate will interpret color observation reports by describing the situation in question with reference to that relative velocity.

The realist account of experience which Feyerabend submits is therefore along the following lines:

The interpreter, speaking his own current natural language and from within his own cluster of accepted scientific theories, has no difficulty referring to the relevant situations, generally by means of descriptions that he takes to be correct, and classifying the putative observation statement as a report about that situation if the above pragmatic conditions are satisfied.

It seems to me that we should understand Feyerabend’s lecture with reference to the background in which he developed these ideas.  Salient in this respect is the, by then already accepted, new scientific realism in the Minnesota Center for the Philosophy of Science, which Feyerabend had joined in 1957.

Wilfrid Sellars, Thomas Kuhn, Paul Feyerabend, and Norwood Russell Hanson all insisted on the theory-laden-ness of our language in use.  There was a great difference between their approach to the language of science and the way logical positivists had thought about language.  The new realists’ lack of interest in the logical syntax of language is understandable.  For scientific theories are presented in our current natural language, though augmented with mathematics.  The language of science is not, in itself, an uninterpreted calculus that needs to have meaning bestowed on it!  

And so, when Feyerabend says that to be a full-fledged language the observation language needs to have an interpretation, it can be taken for granted that this interpretation can be given in natural language, and be based on the currently accepted scientific theories which were formulated in our natural language in use.

But there is still an important difference between our four protagonists.  Wilfrid Sellars, unlike the other three, was intent on engaging with a wide-ranging diversity of traditional issues in philosophy.  His rejection of the earlier realism of Roy Wood Sellars’ generation, as well as of Carnap’s logical positivism, came along with systematically developed responses to those issues.  What an interpretation is, what meaning is, taken as a general question, Sellars could not leave  unaddressed. 

5.        What could interpretation be then?

What is clear enough, and in 1957 had already been clear for some time, is that there is no simple relation between observation terms and theoretical terms.  Not even for the speaker who formulates the interpretation.  The idea of an operational definition of such terms as “oxidation” does not get anywhere.  What about the converse?  Do the sorts of interpretation exhibited above as [1]-[3] offer anything like a definition of the terms used for observation reports?

If someone in our community says “Fire is rapid oxidation”, could we parse that as “Our observation term “fire” means rapid oxidation”?  

That is plausible at first blush.  To make it plausible to us today may not be easy, used as we are to the sparsity of semantic accounts that take only truth and reference into account.  We could certainly say that any actual proper reporting use of “fire” will refer to an instance of rapid oxidation.  But actual reference is not enough to determine meaning.  

Nevertheless it would seem that to interpret “Fire!” in Feyerabend’s sense, is to say that it means that there is rapid oxidation.  For that is just a parallel to the example of the driver who, seeing a red light in his car, says “this means that the engine is overheating”.  

But what is meant by “means” in that sort of assertion?  

Feyerabend criticizes two accounts of the meaning of observation terms, ones he attributes to positivist philosophers of science, but does not give one of his own.[6]  

6.        The contemporaneous debates about meaning

Willard Van Orman Quine had thrown a wrench into this topic of meaning, with his “Two dogmas of empiricism” (1951) and its trenchant, though rather behavioristic, critique of meaning, analyticity, and synonymy.  

In response, Rudolf Carnap pointed insistently to the distinction between pragmatics and semantics, in his “Meaning and Synonymy in Natural Languages” (Carnap 1955).  Denotation, extension, and truth, studied in abstraction from use, are the subject of semantics, and Quine is right to find a solid basis for philosophical analysis there.  But meaning, intension, and intensional relations like synonymy are the subject of study in pragmatics, which brings in patterns in usage.  Carnap insists that these patterns, based in dispositions to use words in certain ways, fix considerably more than extension.  As an example he suggests that different linguists might translate “Pferd”, as used by a German speaker, Karl, as “horse” or instead as “horse or unicorn”.  While the extensions of those English phrases are the same, that there is a difference can be put to the test by asking for Karl’s response to pictures and stories.  

While I (and I think Sellars) would speak of linguistic commitments rather than linguistic dispositions, I take Carnap’s to be an adequate response to Quine’s main arguments (and have nothing good to say about the others).  Nevertheless, Carnap’s response does not throw much light on the basic concepts of pragmatics, and does not go far toward providing pragmatics with a sound theoretical basis.[7]  In the Minnesota Circle, where Feyerabend resided at the time, a much more far-reaching attempt to do so was in progress, at the hands of Wilfrid Sellars.

7.         Wilfrid Sellars on meaning

Sellars’ correspondence with Roderick Chisholm about intentionality had been published just then, as an Appendix to the Minnesota Studies in Philosophy of Science (Sellars and Chisholm 1957).  There an enigmatic, and perhaps somewhat unfortunate, assertion by Sellars introduces what I see as the central theme in Sellars’ analysis.

Meaning, interpretation, translation

Chisholm held that we have thoughts, and the meaning our statements have is the thoughts they express.  As Sellars understands this, it implies that such a sentence as 

[a] “Hund” (in German) means dog

has the form 

“Hund” expresses t, and t is about dogs.

which states that there are certain relations between three things.  Each of these relations is a  word-thing relation.  So this way of understanding meaning statements remains solidly within semantics rather than pragmatics.[8]

The dialogue between these two eminently subtle thinkers is eminently subtle, but I think that the crucial clue to Sellars’ view arrives in this passage in Sellars’ letter of August 31, 1956:

“Thus, while I agree with you that the rubric

“. . .” means – – –

is not constructible in Rylean terms (‘Behaviorese,’ I have called it), I also insist that it is not to be analysed in terms of

“. . .” expresses t, and t is about – – –.

My solution is that “‘. . .’ means – – –” is the core of a unique mode of discourse which is as distinct from the description and explanation of empirical fact, as is the language of prescription and justification.”  (Sellars and Chisholm 1957: 527)

Chisholm is puzzled. Prescriptions, he writes, are neither true nor false.  But isn’t such a semantic statement as [a] “‘Hund’ (in German) means dog” true?  

Sellars agrees that it is true.  That admission introduces a negative analogy to prescriptions.  But Sellars insists there is more to [a] than that it is true, so some positive analogy remains.

To teach someone a bit of German by saying “‘Hund’ (in German) means dog” requires that this person is, like the teacher, a user of the English word “dog”.  Sellars writes that in such a case, 

“there is an important sense in which this statement does not describe the role of “Hund” in the German language, though it implies such a description.  (Remote parallel: When I express the intention of doing A, I am not predicting that I will do A, yet there is a sense in which the expression of the intention implies the corresponding prediction.)” (Sellars and Chisholm 1957: 532)[9]

There is indeed an important distinction between the expression of an intention, and the statement that one has that intention.  Imagine asking someone “Will you marry me?” and receiving as answer the statement “I do in fact have the intention to marry you, and such intentions are typically followed by marriage”.  Imagine, in contrast, that the answer had been the expression of intention in the words “Yes, I will marry you”. The latter is surely what the suitor hoped for, not the former.  Expressing the intention is different from stating that she has that intention.  Nevertheless, if she expressed the intention to do so, the suitor would have warrant to infer the statement that she does in fact have that intention.

We may still, like Chisholm, be at a loss as to how this clarifies the discourse about meaning.  Sellars then goes on to explain his point in a different way, by recourse to Church’s translation test.  I think we can see that employed here as follows, in an attempt to teach someone a bit of German.  Suppose that 

[A] “Hund” is a word for dogs

were just a statement of fact.  Then its German translation would also be a statement of fact, with the same information content.  That translation is

[B] „Hund“ ist ein Wort für Hunde. 

But, although that is certainly true, [B] does not have the same status as [A].  Indeed, the student might already know enough German to realize that sentences of form [B] are always true, even while not knowing the reference of “Hund”.  For [B] has the status for a German speaker which 

            [C] “Dog” is a word for dogs

has for a speaker of English, while [A] does not have that status. [10]

Relation to Feyerabend’s realistic interpretation

It may seem that I have gone astray, into something not related to Feyerabend’s provocative Thesis I.  But not so.  

In his answer to Chisholm, Sellars refers unavoidably to distinct communities with different languages.  The informative [A] must be presumed to be addressed to someone sufficiently far along in the English-speaking community to understand “means dog in German” or “is a word for dogs”.  The speaker of [A] takes that for granted, presupposes that, and we are here at the crucial point also made above about assertions [1]-[3].  

It is a crucial point for meaning or interpretation in general.  In his lectures collected as The Metaphysics of Epistemology, Sellars clarifies this with the distinction between 

            [D] “und” (in German) means and

and 

[E] “und” (in German) means the same as “va” in Sanskrit.   (Cf. Sellars 1989: 240)[11]  

The important difference between [D] and [E] lies in their presuppositions, when taken as items in a dialogue or communication.  The assertion of [D] presupposes understanding of the English “and”; it is addressed to someone taken to have in his own vocabulary all that follows the word “means”.  In contrast, [E] does not presuppose understanding of any Sanskrit.  The assertion of [E] conveys factual information only, a relation between elements of two languages outside the addressee’s community.  

If we tried to deal with these examples of ‘meaning’ discourse solely within semantics (that is, with attention only to the relation words bear to the world, independently of, or in abstraction from, contexts of use), we would be at a loss.  Ignoring what is presupposed when a speaker addresses someone with [D], we would have to construe [D] as

            [F]  “und” (in German) means the same as “and” in English.

But [F], although it is an English sentence and must be assumed to be addressed to an English speaker, does not presuppose that the addressee has “and” in their vocabulary.  If that is implausible (for how can someone have a significant amount of English, and not have “and”?), an example in which the target word has an unusual synonym will serve:

            [F*] “Hund” (in German) means the same as “canine” in English.

That information would not suffice for addressees who knew English but did not have the word “canine” in their vocabulary.

Within pragmatics, then, [D] does not have the status of a simple assertion of the form “a is related R-ly to b”.  Instead [D] is part of an intra-communal discourse, meaningful in certain contexts and meaningless in others.

8.        Application to Feyerabend’s account

Let us go back now to Feyerabend’s realist interpretation of experience and its reportage, as I codified it in:

[1] Observers in community C reliably agree to “Phlogiston is escaping” in the presence of fire and reject it in the absence of fire.

[2] Observers in community C reliably agree to “There is fire” in the presence of phlogiston escaping and reject it in the absence of phlogiston escaping.

[3] Observers in community C reliably agree to “Phlogiston is escaping” in the presence of rapid oxidation, and reject it in the absence of rapid oxidation.

Imagine ourselves in a distinct community C*, where we speak an English that is by now so thoroughly, relevantly theory-laden, that we would be entirely at a loss if we heard any apparent difference in usage between “fire” and “rapid oxidation”.

To begin, we would have no qualms about rejecting [2] altogether, while parsing [1] as

            [1*] “phlogiston is escaping” (in C language) means that fire is present.

This would be on a par, for us, with

            [1**] That the red light is on means that the engine is overheating,

though we would definitely reject as false:

            [1***] “The red light is on” means the same as “the engine is overheating”.           

We would also say that 

            [4] “fire is present” is a phrase for episodes of rapid oxidation,

and, if pressed, we would have to agree to the rather awkwardly worded

            [3*] “phlogiston is escaping” (in C language) means there is rapid oxidation occurring.

Note well that [4] is in our community a pragmatic tautology, and that [1*] and [3*] make sense only as intra-communal discourse by us, as accurate statements about another community’s observation language.

At the same time we would surely reject:

[3**] “phlogiston is escaping” (in C language) means the same as “there is rapid oxidation occurring” in our language.

For the meaning of “phlogiston is escaping” can only be explained in terms of phlogiston theory, which we do not accept, and which we take to be false.  It is [3*], and not the falsehood [3**], that motivates us to leave the theatre if someone shouts “Phlogiston escaping!”, even while we judge the shouter to be shouting a falsehood.

9.        CODA: What is my language?

Formal semantics did not develop along the route charted by Wilfrid Sellars.[12]  In the above account of meaning there is a crucial distinction between 

  • speakers’ understanding of words and statements in their own language, 

and 

  • their understanding of words in a language not their own.  

It is not assumed that the language of another community is unintelligible to us, or incommensurable with our own.  

Quite the contrary: someone whose own language is English may be a teacher, teaching German to a French student, who is still learning English while enrolled in that teacher’s German class. 

Equally, someone in our own community, whose language is current chemistry-theory-laden, may teach a history of science class, and depict how persons in a certain historic community reported the presence of fire, using language that was phlogiston-theory-laden.  

Fine so far, but ….

As I reflect on the above account of meaning and interpretation, it seems to me that a great deal is left to rest on the distinction between what is my language, and what is a language that I understand.  

And that raises a further question, that remains to trouble us: what is my language?

10. References

Carnap, Rudolf (1947) Meaning and Necessity: A Study in Semantics and Modal Logic.  Chicago: University of Chicago Press.

Carnap, Rudolf (1955) “Meaning and synonymy in natural languages”.  Philosophical Studies 6: 33-47.

Donnellan, Keith S. (1966) “Reference and definite descriptions”.  The Philosophical Review 75: 281-304.

Feyerabend, Paul  (1958) “Attempt at a realistic interpretation of experience”.  Proceedings of the Aristotelian Society 58: 143-170.

Feyerabend, Paul  (1962) “Explanation, Reduction, and Empiricism”.  Pp. 103-106 in H. Feigl and G. Maxwell (eds.), Minnesota Studies in the Philosophy of Science, 3: 28-97.

Feyerabend, Paul (1981)  Realism, Rationalism & Scientific Method.  Philosophical Papers Volume I. Cambridge: Cambridge University Press. 

Kuhn, Thomas (1962) “The Structure of Scientific Revolutions”.  Pp. 1-173 in the International Encyclopedia of Unified Science II-2. Chicago: University of Chicago Press. 

Sellars, Wilfrid and Roderick Chisholm (1957) “Intentionality and the Mental: a Correspondence”. Minnesota Studies in the Philosophy of Science 2: 507-539. 

Sellars, Wilfrid (1989) The Metaphysics of Epistemology. Ed.: Pedro Amaral.  Atascadero: Ridgeview Pub. Co.

11.  Notes


[1] In a footnote Feyerabend acknowledges his debt to discussions at the Minnesota Center for Philosophy of Science, where he was a member in 1957.  (Note:  in the published paper that is footnote 22, in the 1981 book reprint it is 31.)  Thomas Kuhn’s The Structure of Scientific Revolutions would appear in the International Encyclopedia of Unified Science in 1962, with an acknowledgement to Feyerabend in its preface.  The Journal of Philosophy (54: 709-712 notes and news, 1957) reported: “A conference, sponsored by the National Science Foundation, was conducted at the Minnesota Center for Philosophy of Science from August 12 to September 14, 1957. The participants were: H. Gavin Alexander, Eva Cassirer, H. Feigl (Director of the Center), P. Feyerabend, C. G. Hempel, G. Maxwell, H. Mehlberg, E. Nagel, H. Putnam, W. Rozeboom, M. Scriven, and W. Sellars. Daily group discussions and essays, circulated as memoranda, treated, extensively and in detail, the logical and philosophical issues of quantum mechanics in particular and of scientific theories in general.”

[2] The term “pragmatic theory of meaning” does not occur in this lecture, but Feyerabend used it afterward; see e.g. Feyerabend 1981: 51, 125.

[3] The inclusion of “unanimous” is meant to indicate that these reactions by the community are reliable or consistent in certain respects, which Feyerabend indicates but does not clarify very far.

[4] This is offered in opposition to the Stability Thesis, that the meaning of observation terms is the same before and after scientific theory change.

[5] Feyerabend (1962: 30) introduces the term “principle of meaning invariance” for what he disputes, whereas in the 1958 lecture he used “Stability Thesis”.

[6] The two accounts he criticizes are the principle of pragmatic meaning (the interpretation of an observational term is determined by its use) and the principle of phenomenological meaning (the interpretation of an observational term is determined by what is ‘given’ by way of feelings and sensations in the appropriate circumstances).  

[7] Carnap’s main achievement, in the development of what he calls the method of extension and intension (Carnap 1947), was the development of a formal semantics for modal logic, still in abstraction from context- or use-dependence of modal locutions. 

[8] While Chisholm has a quasi-psychological account, with thoughts as central characters, the form of his view is that of the Platonist construal of language and meaning: the word’s meaning is an entity, to which the word bears a certain relation.  

[9] The remote analogy is not very remote, for Sellars asserts without qualification that “semantical statements about linguistic episodes do not describe, but imply a description, of these episodes” (Sellars and Chisholm 1957: 536).

[10] Sellars makes the point in a slightly different way: someone might be told that “Hund” plays in German the same role as “dog” plays in English, and still not know the reference of “Hund”, namely if he has not learned the referring use of “dog”.

[11] The lectures collected in this book were delivered in 1975, and edited in collaboration with Sellars. 

[12] Much work was done to develop formal pragmatics, adapting models of modal logic by adding parameters for contexts, speakers, and agents.  My hope is that this can be complemented by reference to the early discussions of the language of science in practice.