Odds are more intuitive than probability (2) Bayes’ Theorem

We are all tested these days: by the medical profession for viruses and bacteria, by the police for alcohol consumption, by the sports team doctor for marijuana intake, and so forth. I’m sure their tests are awfully well designed. But I’ll give an imaginary example with simpler numbers and ratios to show how much more intuitive the tests, and Bayes’ Theorem, become if we think in terms of odds.

I will first give an intuitive statistical argument to show how a probability assessment is updated. Then I will recast this as Bayesians do, first in terms of probabilities and then (see how much simpler!!) in terms of odds.

EXAMPLE. The Y-virus is an STD, with an incidence of 1 in 500 in the college population.  There is a test for the Y-virus, and it is 99% accurate, in the sense that

            The probability that a student without the virus tests positive (a false positive) is 1%.

            The probability that a student with the virus tests negative (a false negative) is 1%.

Student Jones is not very worried, since he considers himself a normal, average student and the incidence is so low.  But he takes the test, and the test result is positive! How likely is it, after seeing this result, that Jones has the Y-virus?

Test yourself: just off the top of your head, do you think the answer is:

 99%, Between 75% and 99%, Between 50% and 75%, Less than 50% ?

STEP 1.  AN INTUITIVE STATISTICAL ARGUMENT

Imagine a total college population of 50,000 students, with the actual incidence of the Y-virus precisely 1/500. Thus, in this population there are 100 students with the Y-virus. Imagine furthermore that all the students are tested, and the test performs exactly as specified, with 1% false positives and 1% false negatives.

The results will then be: Of the 100 students with the Y-virus, exactly 99 test positive and one tests negative. Of the 49,900 students who do not have the Y-virus, 499 test positive (false positives). Jones belongs to the sub-population which tested positive, which therefore has 99 + 499 = 598 members. In this sub-population, there are just 99 who have the Y-virus. So the probability that Jones has the Y-virus is 99 out of 598, which is approximately 16.555…%, or approximately 1 out of 6.

So, roughly speaking the probability that Jones has the Y-virus, in the light of the positive result, is about 1/6.  That is not as bad as he feared!
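The counting argument above can be checked with a few lines of Python. This is only a sketch: the variable names are my own, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction

# Numbers from the example; the variable names are illustrative only.
population = 50_000
incidence = Fraction(1, 500)
false_positive_rate = Fraction(1, 100)  # share of uninfected students who test positive
false_negative_rate = Fraction(1, 100)  # share of infected students who test negative

infected = population * incidence       # students with the Y-virus
uninfected = population - infected      # students without it

true_positives = infected * (1 - false_negative_rate)
false_positives = uninfected * false_positive_rate
all_positives = true_positives + false_positives

# Jones is one of the students who tested positive.
posterior = true_positives / all_positives

print(infected, true_positives, false_positives, all_positives)
print(posterior, float(posterior))
```

The last line prints the exact fraction alongside its decimal value, so the "about 1/6" can be compared directly with 99/598.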

But it is certainly true that after he sees the evidence, his probability is much higher than it was. How much higher? The probability changed from 0.2% to roughly 16.6%: it was multiplied by approximately 83.

What were Jones’ odds, and how did they change? To begin with, the odds of (virus) : (no virus) were 1 : 499. Afterward they were 99 : 499. How much higher is this? We get a nice whole number: the odds were multiplied by 99, precisely.

It is this multiplier, which changes the old odds to the new odds, that is called the Bayes factor. Look for it below!

STEP 2. THE BAYESIAN RECIPE, IN TWO FORMS

In Bayesian terminology, the initial probability or odds are called the prior ones, and those after the evidence is accommodated the posterior ones. To determine the posterior probability, Bayesians use Bayes’ Theorem (named after the 18th-century Reverend Thomas Bayes).

First version: for probabilities

I will use + for a positive test result, y for having the Y-virus, and P for probability. From the details above we have the following data: P(y) = 0.002, P(~y) = 0.998; P(+|y) = 0.99, P(+|~y) = 0.01.

Bayes’ theorem says:

(*) P(y|+) = P(+|y) times P(y)/P(+).

We can calculate P(+), using another theorem (the theorem of total probability): P(+) = P(y)P(+|y) + P(~y)P(+|~y)

Plugging in the numbers we get P(+) = 0.00198 + 0.00998 = 0.01196, and so P(y|+) = 0.00198/0.01196 = 0.16555…, which is in accord with our earlier statistical calculation.
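For readers who would rather let a machine do the plugging in, here is a minimal sketch of formula (*) in Python. The function name and parameter names are my own invention, not standard notation.

```python
def posterior_probability(p_y, p_pos_given_y, p_pos_given_not_y):
    """Return P(y|+) from the prior P(y) and the two likelihoods."""
    p_not_y = 1 - p_y
    # Total probability: P(+) = P(y)P(+|y) + P(~y)P(+|~y)
    p_pos = p_y * p_pos_given_y + p_not_y * p_pos_given_not_y
    # Bayes' theorem: P(y|+) = P(+|y) P(y) / P(+)
    return p_pos_given_y * p_y / p_pos


# Data from the example: P(y) = 0.002, P(+|y) = 0.99, P(+|~y) = 0.01
print(posterior_probability(0.002, 0.99, 0.01))
```

The printed value should agree with 99/598 up to floating-point rounding.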

Was this intuitive? Oh, it would be if you do it often enough! 🙂

Second version: for odds, if you think in terms of probability

The odds of A to B is the ratio (Probability of A) : (Probability of B). You can also write this with the familiar symbol / for ratio or division; that is the same thing.

When B is ~A, we just call that the odds on A. If it is just as likely to snow as not to snow, then the probability of snow is 1/2, while the odds on its snowing are 1 : 1 (or 50/50, as people like to say). Jones’ prior odds of having the Y-virus are 1 : 499.

How do my prior odds change to my posterior odds, when I get evidence like a positive test result? The odds formulation of Bayes’ Theorem is

Posterior odds = prior odds times the Bayes factor

The Bayes factor is actually a ratio itself. It compares how likely a positive test result is in the two cases:

(test result +, given that you have Y) : (test result +, given that you don’t have Y).

With P* for the new probability and P for the old probability, that means this:

P*(y|+) : P*(~y|+) = [P(y) : P(~y)] x [P(+|y) : P(+|~y)]

Here it is quite easy to see the numbers to fill in. The odds on having Y are 1 to 499; that is [P(y) : P(~y)]. And the Bayes factor is the odds of a true positive to a false positive, which is the ratio of 0.99 to 0.01. So we arrive at:

The posterior odds P*(y|+) : P*(~y|+) = (1/499) x (99/1) = 99/499.

News for Jones: his odds of having the virus have been multiplied by 99. The Bayes factor!
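The odds recipe is short enough to sketch in Python with exact fractions. The helper name `update_odds` is hypothetical, chosen only for this illustration.

```python
from fractions import Fraction


def update_odds(prior_odds, p_pos_given_y, p_pos_given_not_y):
    """Return (posterior odds, Bayes factor) after a positive result."""
    bayes_factor = p_pos_given_y / p_pos_given_not_y
    return prior_odds * bayes_factor, bayes_factor


prior_odds = Fraction(1, 499)  # 1 : 499
posterior_odds, bayes_factor = update_odds(
    prior_odds, Fraction(99, 100), Fraction(1, 100)
)

# Odds a : b correspond to the probability a / (a + b).
posterior_prob = posterior_odds / (1 + posterior_odds)

print(bayes_factor)    # 99
print(posterior_odds)  # 99/499
print(posterior_prob)  # 99/598
```

The last step converts the odds 99 : 499 back into a probability, which recovers the 99 out of 598 from Step 1.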

Third version: if you think in terms of odds in the first place

Then you don’t need the formulas, you will have a simple visual calculation. Just remember what I wrote in the earlier post about how to conditionalize an odds vector: replace the ruled out parts’ numbers by zeroes.

The prior odds of having virus Y are 1 : 499. The odds of a correct test result are 99 : 1. So before we have the test result, the odds vector for the relevant partition, with the cells in the order (Y, +) : (Y, -) : (no Y, +) : (no Y, -), looks like this:

99 : 1 : 499 : 49,401

Now the positive result comes in, and we conditionalize on it by replacing the numbers for what did not happen by zeroes:

99 : 0 : 499 : 0

So the odds changed from 1 : 499 to 99 : 499. The prior odds were multiplied by 99 (the Bayes factor), as seen in this simple, intuitive change of the odds vector.
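Conditionalizing an odds vector can also be sketched in a few lines of Python. The ordering of the four cells and all the names below are my own assumptions for the illustration; only the numbers 1 : 499 and 99 : 1 come from the example.

```python
# The four cells of the partition, in an assumed fixed order:
# (virus status, test result)
cells = [("y", "+"), ("y", "-"), ("~y", "+"), ("~y", "-")]

prior_odds = {"y": 1, "~y": 499}  # 1 : 499

# Odds of a correct to an incorrect result are 99 : 1. A correct result
# is + for someone with the virus and - for someone without it.
result_odds = {("y", "+"): 99, ("y", "-"): 1,
               ("~y", "+"): 1, ("~y", "-"): 99}

# Build the odds vector by multiplying the two sets of odds cell by cell.
odds_vector = [prior_odds[status] * result_odds[(status, result)]
               for status, result in cells]

# Conditionalize on a positive result: zero out the cells that did not happen.
conditioned = [weight if result == "+" else 0
               for (status, result), weight in zip(cells, odds_vector)]

odds_y = sum(w for (status, _), w in zip(cells, conditioned) if status == "y")
odds_not_y = sum(w for (status, _), w in zip(cells, conditioned) if status == "~y")

print(odds_vector)   # [99, 1, 499, 49401]
print(conditioned)   # [99, 0, 499, 0]
print(f"{odds_y} : {odds_not_y}")  # 99 : 499
```

Note that the four numbers in the unconditioned vector add up to 50,000, so they are exactly the head counts from the imaginary college in Step 1.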

Note. It is a peculiarity of my example that the odds of getting a correct result are the same as the odds of getting a positive result. Exercise: change the example so that the test still has only 1% false negatives but, say, 10% false positives.
