An elementary statement assigning probability was defined to be one whose semantic value (set of probability measures that satisfy it) is convex.
Here I want to show that there is a wide range of elementary statements beyond the examples we had, including statements assigning odds and conditional probabilities.
Familiar sorts of statements include, besides the ones examined in the previous posts:
- ODDS
“It is twice as likely as not to snow today”, with logical form P(A) = 2P(~A),
“The odds of A to B are 3 to 1”, with logical form P(A):P(B) = 3:1,
- CONDITIONAL PROBABILITY
“The probability of A, given B, is 2/3”, with logical form P(A/B) = 2/3, or equivalently, P(A∩B):P(B) = 2:3
- CORRELATION
“Rain is more likely in the winter than at other times”, with logical form P(A/B) > P(A/~B).
A good tactic will be to look for a general form in which these can be expressed, and then to see when statements of that general form are elementary.
The concept of expectation
To introduce the more general form we can look to the form of a judgement about expectation (terms vary: expected value, expectation value) as it is understood in probability theory. So I’ll begin by introducing this informally, then give the precise definition, and after that, examine the above examples in those terms.
A good example of an expectation value is the announcement sometimes seen at casinos, “98% payback!”. Does that mean that, if you gamble there, the probability is 98% that you will end up with your money back? Certainly not: what it reports is a sort of average over huge wins, huge losses, and myriads of small losses. Even if we don’t go to casinos, we are always seeing expectation values announced. For example, the weather forecast gives 0.1 inch of precipitation for Seattle in the next 24 hours. That is not a prediction that it will be precisely, or even approximately, 0.1 inch. Rather it is a weighted average over the chances of various amounts of precipitation within that period. In both these examples what is announced is the expected value of the quantity in question.
Suppose a quantity q has values a(1), a(2), … in certain possible scenarios, and the probabilities of those scenarios are p(1), p(2), …. Then
the expectation value of q, for this probability function, is the sum
(a(1)p(1) + a(2)p(2) + …)
In statistics, the term used is not “quantity” but “random variable”. A poor choice of terminology, mystifying for the uninitiated, but well, there you go.
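For readers who like to see it computed, here is a minimal Python sketch of that definition; the scenario values and chances below are invented purely for illustration.

```python
def expectation(values, probs):
    """Expectation of a quantity: the probability-weighted sum of its values."""
    assert abs(sum(probs) - 1.0) < 1e-9, "the probabilities must sum to 1"
    return sum(a * p for a, p in zip(values, probs))

# Hypothetical precipitation scenarios (in inches) and their chances:
amounts = [0.0, 0.05, 0.2, 0.5]
chances = [0.4, 0.3, 0.2, 0.1]
print(expectation(amounts, chances))  # about 0.105 -- roughly the announced "0.1 inch"
```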
Statements of expected value
Let’s add to our statement forms Exp(q) = x. A probability function or measure p satisfies Exp(q) = x if and only if the expectation value of q, for p, equals x.
This is the form of an elementary statement, for the semantic value of such a statement is a convex set of probability measures. For imagine that (a(1)p(1) + a(2)p(2) + …) = x and also (a(1)p'(1) + a(2)p'(2) + …) = x. If we then evaluate the expected value for the mixture bp + (1-b)p', we see that each value a(i) gets coupled with the number bp(i) + (1-b)p'(i), for i = 1, 2, …. The resulting sum is therefore b times the first sum plus (1-b) times the second, so it lies between x and x, in other words it is equal to x.
This argument generalizes very easily to the expectation value being in some interval of numbers. So we can write Exp(q) ε I, for any interval I, and this will also be an elementary statement.
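As a quick numeric illustration (the numbers are my own toy values): below, two probability functions give q expectation values 1 and 3 respectively, and every mixture of them gives a value that stays within the interval [1, 3].

```python
def expectation(values, probs):
    return sum(a * p for a, p in zip(values, probs))

def mix(p1, p2, b):
    """The mixture b*p1 + (1-b)*p2 of two probability functions."""
    return [b * u + (1 - b) * v for u, v in zip(p1, p2)]

q  = [8, 0, -4]           # values of q in three scenarios
p1 = [0.25, 0.5, 0.25]    # expectation of q under p1: 2 - 1 = 1
p2 = [0.5, 0.25, 0.25]    # expectation of q under p2: 4 - 1 = 3
for b in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(expectation(q, mix(p1, p2, b)))   # 3.0, 2.5, 2.0, 1.5, 1.0 -- all within [1, 3]
```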
I will put the precise account in the Appendix below, but this is enough to show how the above common probability statements can all be put in terms of expectation.
Translating probability statements into statements of expected value
- Example 1: our paradigm example P(A) = x.
There are two relevant possible ways things can be, A and ~A. Now, we can define a function 1A, which takes value 1 in the first way things can be, and value 0 in the other way they can be. (1A is called the indicator quantity for A.) So, for any probability function p, the expectation value of 1A equals [1.p(A) + 0.p(~A)], and that is just p(A).
Thus we can translate (P(A) = x) into (Exp(1A) = x): these statements are satisfied by the same probability functions, they have the same semantic content.
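In the same spirit, a two-line check (with an invented value for p(A)) that the expectation of the indicator quantity is just the probability of A:

```python
def expectation(values, probs):
    return sum(a * p for a, p in zip(values, probs))

p_A = 0.375                   # a hypothetical value for p(A)
indicator_A = [1, 0]          # values of 1A on A and on ~A
print(expectation(indicator_A, [p_A, 1 - p_A]))   # 0.375, i.e. exactly p(A)
```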
In many cases it is easy to rewrite the elementary statement so that it obviously has the form of equating an expectation value to 0.
- Example 2: “It is twice as likely as not that A”, which would have the form P(A) = 2P(~A)
Rewrite this as 1.P(A) – 2.P(~A) = 0. This is the probability-weighted sum of 1 and –2, corresponding to the two possibilities A and ~A. So define r to be the quantity which takes value 1 on A and value –2 on ~A. Then for any probability function p, the expected value of r is the sum 1.p(A) + (–2).p(~A). Therefore our example statement is equivalent to Exp(r) = 0.
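A numeric check of this translation (again with toy numbers): the expectation of r vanishes exactly for the probability functions that make A twice as likely as ~A, that is, those with P(A) = 2/3.

```python
def expectation(values, probs):
    return sum(a * p for a, p in zip(values, probs))

r = [1, -2]                            # values of r on A and on ~A
print(expectation(r, [2/3, 1/3]))      # 0.0: this function satisfies the statement
print(expectation(r, [0.5, 0.5]))      # -0.5: this one does not
```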
- Example 3: “The odds of A to B are m to 1”, which would have the form P(A):P(B) = m:1, or equivalently, P(A) = m.P(B)
Now we require a bit of ingenuity, because A and B — unlike A and ~A — may overlap.
To spell this out we should think of the four cells in the partition {A – B, A ∩ B, B – A, W – B – A}. Suppose that measure p assigns to these cells the probabilities x, y, z, u respectively.
Thus p assigns x + y to A and assigns y + z to B. That is, p(A) = x + y, p(B) = y + z. The equation that p must satisfy can now be rewritten, till it looks like the sort of sum we see in an expectation value:
p(A) = m.p(B),
x + y = m(y + z),
x + (1-m)y – mz = 0
This last line shows us how to define the relevant quantity, call it s. The four cells of the partition are the four ways things could possibly be. Quantity s is defined to have value 1 on (A-B), value (1-m) on (A ∩ B), value -m on (B-A), and value 0 elsewhere. Therefore
Exp(s, p) = 1.p(A-B) + (1-m).p(A∩B) + (-m).p(B-A) + 0.p(W – B – A)
= x + (1-m)y + (-m)z + 0,
so the statement P(A):P(B) = m:1 is equivalent to Exp(s) = 0, for this random variable s.
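Here is the same bookkeeping carried out in Python for one concrete choice of m and of the four cell probabilities (all of the numbers are invented for the illustration):

```python
def expectation(values, probs):
    return sum(a * p for a, p in zip(values, probs))

m = 3                                     # "the odds of A to B are 3 to 1"
# cells: A-B, A∩B, B-A, W-B-A, with probabilities x, y, z, u
x, y, z, u = 0.625, 0.125, 0.125, 0.125   # p(A) = 0.75, p(B) = 0.25, and 0.75 = 3 * 0.25
s = [1, 1 - m, -m, 0]                     # the quantity s defined above
print(expectation(s, [x, y, z, u]))       # 0.0: this measure satisfies P(A):P(B) = 3:1
```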
- Example 4: “The probability of A, given B, is m”, with logical form P(A/B) = m.
This is similar to the preceding, and is solved in the same way. P(A/B) is defined as P(A∩B) : P(B), and this ratio is here asserted to be m. So for a probability measure to satisfy this it must meet the condition
p(A∩B) : p(B) = m
p(A∩B) = mp(B)
p(A∩B) – mp(B) = 0
p(A∩B) – m[p(A∩B) + p(B-A)] = 0
p(A∩B) – mp(A∩B) – mp(B-A) = 0
(1-m)p(A∩B) – mp(B-A) = 0
The deduction is quite similar to the preceding, noting that B is the union of (A∩B) and (B-A), and we can define the random variable t:
t takes value (1-m) on (A∩B), value (-m) on (B-A), and value 0 elsewhere.
Then we see that P(A/B) = m is equivalent to Exp(t, P) = 0.
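And a parallel check for the conditional probability case, with one concrete (made-up) choice of m and of the cell probabilities:

```python
def expectation(values, probs):
    return sum(a * p for a, p in zip(values, probs))

m = 0.75                                 # say: "the probability of A, given B, is 3/4"
# cells: A∩B, B-A, and the rest of W
p = [0.375, 0.125, 0.5]                  # p(B) = 0.5 and p(A∩B)/p(B) = 0.75
t = [1 - m, -m, 0]                       # the quantity t defined above
print(expectation(t, p))                 # 0.0: this measure satisfies P(A/B) = 3/4
```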

Conclusion
In these results we needed recourse only to statements of the form Exp(r) = x; in the Appendix I will show that not only these, but also statements of the more general form Exp(r) ε I, where I is a convex set of real numbers, are elementary statements.
I am not taking up the last of the series of examples, the example of positive correlation, because however symbolized, it is definitely not an elementary statement. Correlation is a non-linear relation, and convexity is not preserved.
But we have seen that the examples of the previous post, as well as odds statements and conditional probability statements, are elementary statements. For they are equivalent to statements that say that certain quantities have expected value 0, and such statements (we saw along the way) are elementary.
Open question at this point: is it also possible to formulate statements which are elementary, in the defined sense, but not equivalent to statements of expected value?
APPENDIX
Here I will spell out the above in the precise way introduced in the first post, in terms of frames and model structures.
Let K be the frame <W, F>, and M the model structure <W, F, I>.
A random variable r on K is a measurable function that has a numerical value at each world in W, for example, the height of the highest mountain or the price of wheat. To say that r is measurable means that, for each numerical value, the set of worlds at which r has that value is a legitimate proposition, that is, it is a member of F.
That r has value m on proposition A in a given model structure is defined to mean that r has the same value m at each world in proposition A.
I will here restrict the discussion to what I will call simple random variables: ones with finite range. If r has a finite range then its set of values V(r) = {a, b, c, …, k} corresponds to a partition of W, the cells being the propositions C(r, j) = {w: r(w) = j}, with j = a, b, c, …, k. Call this partition the characteristic partition of r.
If p is a probability measure in P(M), write p(j) for the probability of C(r, j), for each value j of r. Then the expectation (or, expected value) of r relative to p is defined to be:
Exp(r, p) = ap(a) + bp(b) + … + kp(k)
the general formula being
Exp(r, p) = Σ{x.p(C(r, x)): x in V(r)}
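A minimal computational sketch of these definitions (the encoding of worlds as Python dictionary keys is my own, purely illustrative): the cells C(r, x) are recovered by grouping worlds according to the value r gives them, and the expectation sums each value times its cell's probability.

```python
def cell_probability(r, p, value):
    """p(C(r, value)): total probability of the worlds at which r takes that value."""
    return sum(p[w] for w in r if r[w] == value)

def exp_value(r, p):
    """Exp(r, p) = the sum of x * p(C(r, x)) over the values x in V(r)."""
    return sum(x * cell_probability(r, p, x) for x in set(r.values()))

# Toy model: four worlds, a simple random variable r, and a probability measure p.
r = {"w1": 1, "w2": 1, "w3": -2, "w4": 0}
p = {"w1": 0.5, "w2": 0.125, "w3": 0.25, "w4": 0.125}
print(exp_value(r, p))   # 1*0.625 + (-2)*0.25 + 0*0.125 = 0.125
```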
Now we can introduce statements that assign expectation values by specifying their semantic value, that is the set of probability measures that satisfy them.
Exp(r, P) = y is satisfied by p in P(M) if and only if Exp(r, p) = y
Exp(r, P) ε [a, b] is satisfied by p in P(M) if and only if Exp(r, p) ε [a, b]
and similarly for open and half-open intervals.
Theorem. The statement Exp(r, P) = y, and more generally the statement Exp(r, P) ε I, where I is a convex set of real numbers, are elementary statements.
It suffices to prove the second part, since {y} is itself a convex set. Suppose that x = Exp(r, p) and y = Exp(r, p') both lie in the convex set I. The convex combination p'' = mp + (1-m)p' gives to each cell C(r, j) the probability mp(C(r, j)) + (1-m)p'(C(r, j)), so Exp(r, p'') = m.Exp(r, p) + (1-m).Exp(r, p') = mx + (1-m)y, which lies in I because I is convex.
This is a bit abstract, so a simple example:
ap(1) + bp(2) = x and ap'(1) + bp'(2) = y, so
ap''(1) + bp''(2) = map(1) + mbp(2) + (1-m)ap'(1) + (1-m)bp'(2)
= a[mp(1) + (1-m)p'(1)] + b[mp(2) + (1-m)p'(2)]
so a and b are now each multiplied by the m : (1-m) mixture of the numbers that multiplied them in the initial line. Collecting terms, the whole sum equals m[ap(1) + bp(2)] + (1-m)[ap'(1) + bp'(2)] = mx + (1-m)y, a convex combination of the two original sums, which must therefore lie in the same convex set of real numbers.
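The same two-cell computation can be run numerically (with values and measures I have simply made up), and it confirms that the mixture's expectation is exactly mx + (1-m)y; here p1, p2 and p_mix play the roles of p, p' and p''.

```python
a, b = 4.0, -1.0                  # the two values of the random variable
p1 = [0.5, 0.5]                   # x = 4*0.5 - 1*0.5 = 1.5
p2 = [0.75, 0.25]                 # y = 4*0.75 - 1*0.25 = 2.75
m = 0.25
p_mix = [m * u + (1 - m) * v for u, v in zip(p1, p2)]
print(a * p_mix[0] + b * p_mix[1])      # 2.4375
print(m * 1.5 + (1 - m) * 2.75)         # 2.4375 as well, i.e. m*x + (1-m)*y
```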