What are they good for?
Summarizing evidence. Any kind of evidence. To illustrate, here are some examples.
I propose the following method to test if a person is French. Time a one-hour interval, and count
the number of French words the person speaks.
Obviously this is relevant: on the whole, French people speak more French than non-French people do.
There are exceptions of course, such as sleeping French people (who speak little French) and
Quebec people (who speak quite a lot).
Since there is no obvious mathematical model for how many French words either a French or a
non-French person speaks, it will be a good idea to calibrate my test in a pragmatic way. Let's
build a database by measuring a large number of people, of both types (French and non-French),
for one hour each.
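A minimal sketch of what such a pragmatic calibration might look like in Python. The database and every count in it are made up for illustration; a real calibration would need far more hours of observation:

```python
from collections import Counter

# Hypothetical database: each record is one observed hour,
# (is_french, number_of_french_words_spoken). All counts are invented.
observations = [
    (True, 10), (True, 250), (True, 10), (True, 180),
    (False, 0), (False, 10), (False, 2), (False, 0),
]

def rate_of_result(observations, is_french, word_count):
    """Fraction of observed hours, within one group, showing exactly this count."""
    group = [words for french, words in observations if french == is_french]
    return Counter(group)[word_count] / len(group)

# How characteristic is "10 French words in an hour" of each group?
p_if_french = rate_of_result(observations, True, 10)       # 2 of 4 hours
p_if_not_french = rate_of_result(observations, False, 10)  # 1 of 4 hours
```

With enough data, these two empirical rates are exactly the ingredients the test needs.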
Now let's take a test subject, start the stopwatch, and suppose we count ten French words in the
sampling hour. Of course, it might be informative to know what those words were, but let's not
ask that, and just ask what conclusion follows from the limited view that there were ten French words.
Ten French words in an hour
Suppose that a check of the database shows that this result is a
performance turned in by French people about once per
hundred hours, and by non-French people only about
once per five hundred hours. That is, speaking 10 French words in an
hour is 5 times more characteristic of French than of non-French.
Does this mean the person is 5:1 to be French? Possibly, but not necessarily. It depends on the
context. If the person had been intentionally selected as 50:50 to be either French or not, then
yes. If the person was randomly selected from the Paris phone book, then the person was 100:1 to
be French before the experiment, and that number doesn't decrease to 5:1 just because they speak
ten French words in an hour.
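The update is just odds multiplication (posterior odds = prior odds × LR); a minimal sketch with the numbers above:

```python
def update_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio

prior_odds = 100.0  # Paris phone book: 100:1 to be French before the experiment
lr = 5.0            # ten French words in an hour, per the database

posterior_odds = update_odds(prior_odds, lr)            # the odds go up, not down
posterior_prob = posterior_odds / (1 + posterior_odds)  # odds -> probability
```

The same LR of 5 moves a 50:50 prior to 5:1 and a 100:1 prior to 500:1 — the evidence is the same, but the conclusion depends on the context.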
Sometimes people insist on asking, "What if the context is completely unknown, is random,
there is no context. Then what?" The question makes no sense. There is a
temptation to claim that "picking a person at random from the whole world" is a random context,
but why should that be so? Doesn't a "random context" really mean picking a context at random
from all contexts? But here are some contexts:
- Pick a person at random from all people in the world except Aaron Aardvark.
- Pick Talleyrand with a 70% probability, and otherwise pick Quisling.
- Pick the next person who walks into Grand Central Station.
The list is not only infinitely long, it is uncountably infinite. There is no way to "average" over it.
Context is like preconceptions. You may sometimes think you don't have them, but surely you
know that everyone else does. The fact is, you can't avoid them.
Exactly ten French words in an hour
Ok, now let's take a closer look at the data. Exactly ten French words in an hour is an unlikely
result from anyone, but I'm sure you won't be surprised when I tell you that, according to the
database, it is 5 times more characteristic of French than of non-French.
What, you want to know the actual numbers? I'm disappointed; you seem to be thinking like a frequentist.
What are they?
LR = likelihood[1] ratio = "the ratio of two probabilities of the same
event under different hypotheses."
If you can think of "evidence" as meaning information that
might nudge your decision about some matter in one direction or
the other, then the LR is the appropriate numerical summary of that evidence.
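In code the LR is just a quotient of two conditional probabilities of the same event; the 0.02 and 0.005 below are made-up rates for illustration, not numbers from the text:

```python
def likelihood_ratio(p_event_given_h1, p_event_given_h2):
    """Ratio of two probabilities of the same event under different hypotheses."""
    return p_event_given_h1 / p_event_given_h2

# Hypothetical rates at which some evidence E turns up under two hypotheses:
lr = likelihood_ratio(0.02, 0.005)  # E favors H1 over H2 by a factor of 4
```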
Testing for a disease
A classical example is testing for a disease. The subject tests
positive for the disease, under a test that has a true positive
rate of 60% (only 60% of the afflicted trigger a positive response
for the disease) and a false positive rate of 1% (1% of healthy
people trigger a positive response). Then the LR=60, meaning that
the positive response is 60 times more characteristic of sick
people than of healthy. Although that is strong evidence that
the subject has the disease, obviously it does not imply any
probability conclusion. If the disease is very rare, then most
positives are in fact false positives. But it does increase the
odds of being afflicted 60-fold from whatever they were before the test.
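A sketch of that arithmetic, assuming a hypothetical prevalence of 1 in 1,000 (the prevalence is my invention; the test rates are from the text):

```python
true_positive_rate = 0.60   # 60% of the afflicted test positive
false_positive_rate = 0.01  # 1% of the healthy test positive
lr = true_positive_rate / false_positive_rate  # the 60-fold factor

# Hypothetical rare disease: 1 person in 1,000 afflicted.
prior_odds = 1 / 999

posterior_odds = prior_odds * lr                 # roughly 0.06 : 1
p_afflicted = posterior_odds / (1 + posterior_odds)
# p_afflicted comes to about 5.7% -- strong evidence, yet most
# positives are still false positives, exactly as the text says.
```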
1. Likelihood is a synonym for probability, except that
we say "likelihood" when the emphasis[2] is on varying the hypotheses
(or "conditional") under which the "event" is considered
(as opposed to varying the event, or varying neither).
2. Ok, maybe more than just "emphasis". In statistics there's
a technical definition of the word "likelihood" according to which it is not
synonymous with probability, but rather is applied to a condition and
means the probability, under that condition, of an unstated
but assumed event. For example, consider these two probabilities:
- X = Pr(person speaks 10 French words in the hour | person is French)
- Y = Pr(person speaks 10 French words in the hour | person is not French)
In ordinary usage:
- X is the probability of speaking 10 French words in an hour (given some condition, namely that the person is French).
In the technical usage:
- X is the likelihood of the person being French (the unstated event being
the only event in the present context, namely speaking 10 French words in the hour).
That usage flies in the face of grammar and seems guaranteed to confuse normal people
and encourage "transposition of the conditional." But it's standard in statistics
and probably useful if you're used to it. I think it was invented by Fisher in order
to have a word for the second situation.
Return to home page of Charles H. Brenner