Paternity probabilities when there are many hypotheses
Paradoxical (wrong) answer
Suppose two men are each tested for paternity of the same child.
It occasionally happens (especially if the men are brothers)
that neither man is "excluded".
Maybe the paternity indices are
200 and 500 for the two men. Applying the usual
formula for paternity probability, W=PI/(PI+1), gives the result that
the first man "has a 99.5% probability of paternity" and the second man
"has a 99.8% probability of paternity."
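As a quick check of the two percentages just quoted, here is a minimal Python sketch of the usual formula. The function name paternity_probability is my own label for illustration, not a term from any standard library.

```python
def paternity_probability(pi: float) -> float:
    """The 'usual formula' W = PI / (PI + 1)."""
    return pi / (pi + 1)


# Paternity indices assumed in the example above.
for pi in (200, 500):
    print(f"PI = {pi}: W = {paternity_probability(pi):.1%}")
# PI = 200: W = 99.5%
# PI = 500: W = 99.8%
```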
I put "has a ... probability of paternity" in quotes because (a) it is
a term commonly used in this situation, so I am quoting, and (b) there is
obviously some nonsense hidden here, so I am using quotes of derision.
The lie would be perhaps even more bald if we used more natural English:
The first man is 99.5% to be the father and the second man is
99.8% to be the father of the child.
If you believe that, then there is at least a 99.3% chance that
both men are the father. Such a conclusion flies in the face
of conventional wisdom about biology. Yet many paternity experts,
especially German ones, make such statements all the
time. The kindest excuse I can make for them is that, when they use
the term "probability", it is a term of art,
which is to say that it doesn't mean "probability." That's not a
particularly kind excuse, for it omits an explanation of what they
do mean. Kinder I won't be, for I don't think they know.
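The "at least 99.3%" figure follows from the elementary bound P(A and B) >= P(A) + P(B) - 1, which holds for any two events. A small sketch, with joint_lower_bound as a hypothetical helper name:

```python
def joint_lower_bound(p_a: float, p_b: float) -> float:
    """Smallest possible P(A and B) given P(A) and P(B)."""
    return max(0.0, p_a + p_b - 1.0)


w1 = 200 / 201  # "probability of paternity" claimed for the first man
w2 = 500 / 501  # "probability of paternity" claimed for the second man
print(f"{joint_lower_bound(w1, w2):.1%}")
# 99.3%
```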
Where does this nonsense come from?
The aforementioned "usual
formula for paternity probability, W=PI/(PI+1)" has a built-in
"prior probability" assumption of
50%, i.e. the assumption that the non-genetic evidence is not
only equally balanced between the two contrary hypotheses of paternity
and unrelatedness, but that specifically each hypothesis is 50% to
be true.
That's a problem. For simplicity and definiteness, let's assume
the situation that the two tested men are brothers. Then there are
three possibilities under consideration:
- F the first man is the father
- U the first man is the uncle (because the second man is the father)
- Z the first man is (and therefore both men are) unrelated to the child
The "usual formula" assumes a priori that the first man is 50% to be
the father, and 50% to be unrelated. Moreover, the second man is also
assumed to be 50% to be the father. 150% is too much.
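To make the hidden 50% prior visible, here is a sketch of the two-hypothesis form of Bayes' rule with the prior left as a parameter. The function name and the list of implied priors are my own illustration, not part of any standard package.

```python
def posterior_two_hypotheses(pi: float, prior: float) -> float:
    """Posterior probability of paternity against one alternative.

    posterior = PI * prior / (PI * prior + (1 - prior))
    """
    return pi * prior / (pi * prior + (1.0 - prior))


# With prior = 0.5 the expression reduces to the usual W = PI / (PI + 1).
assert abs(posterior_two_hypotheses(500, 0.5) - 500 / 501) < 1e-12

# Priors implied by applying the usual formula to each man separately:
# first man father, first man unrelated, second man father.
implied_priors = [0.5, 0.5, 0.5]
print(sum(implied_priors))
# 1.5
```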
Reasonable prior probabilities
Whatever you may think about prior probabilities, however they are
assigned or derived, the total of any set of probabilities of
exclusive events can add at most to 100%. Keeping that in mind we
imagine that through some process or other, prior probabilities Rf,
Ru, and Rz are assigned to the three hypotheses, such that
Rf+Ru+Rz=1. Suppose we consider Rf=1/4, Ru=1/4, Rz=1/2 based on the
facts in a particular case.
Reasonable posterior probabilities
Then through genetic testing we compute some likelihood ratios.
Normally we would compare F vs Z, getting for example LR(F,Z)=500,
and compare U vs Z, getting LR(U,Z)=100. (Note that it automatically
follows that LR(F,U) = 500/100=5, and that LR(Z,F) = 1/LR(F,Z), etc.)
Now, we would like to compute some posterior probabilities Pf, Pu,
Pz, where Pf+Pu+Pz=1 also.
Maybe the easiest way to think of the answer is to think of the
several LR computations in terms of a "triple ratio":
The hypotheses F : U : Z
are explained by the data in the proportion 500 : 100 : 1
It's worth considering carefully what this means. It means that,
given F, the chance to see genetic types like those observed is
proportional to 500. Given U, the chance is proportional to 100. Etc.
What does "proportional" mean in this context? It means that we can
replace the 500, 100, and 1 by any other triple of numbers so long as
they are in the same ratio.
Hence, the posterior probability Pf is proportional to Rf x 500 = 125.
Pu is proportional to Ru x 100 = 25.
Pz is proportional to Rz x 1 = 1/2.
In other words, Pf=125c, Pu=25c, Pz=0.5c. Since we also know Pf+Pu+Pz=1, we can solve for c and eliminate it.
Pf+Pu+Pz = 150.5 c, or c=1/150.5. Hence the final answer is Pf=125/150.5, Pu=25/150.5, and Pz=0.5/150.5.
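The arithmetic above can be reproduced exactly with Python's fractions module. This is only a sketch of the worked example; the dictionary names are my own.

```python
from fractions import Fraction

# Priors and likelihood ratios (each versus Z) from the example above.
priors = {"F": Fraction(1, 4), "U": Fraction(1, 4), "Z": Fraction(1, 2)}
lr_vs_z = {"F": 500, "U": 100, "Z": 1}

# Unnormalized weights: 125, 25 and 1/2.
weights = {h: priors[h] * lr_vs_z[h] for h in priors}
total = sum(weights.values())  # 150.5, held exactly as 301/2

# Dividing by the total plays the role of solving for c.
posteriors = {h: w / total for h, w in weights.items()}

for h, p in posteriors.items():
    print(f"P{h.lower()} = {p} ~ {float(p):.4f}")
# Pf = 250/301 ~ 0.8306
# Pu = 50/301 ~ 0.1661
# Pz = 1/301 ~ 0.0033
```

Note that 250/301 is the same number as 125/150.5, just written with whole numbers.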
Reviewing the above process, the formula turns out to be:
Pf = LR(F,Z)Rf / ( LR(F,Z)Rf + LR(U,Z)Ru + LR(Z,Z)Rz )
and similarly for Pu or Pz. LR(Z,Z)=1 of course.
In the above derivation, I chose the hypothesis Z as "pivotal" in
the sense that I compared each of the others to it. Had I chosen F or
U as pivotal, the result would have been the same. The only
restriction on choice of pivot is that the pivotal hypothesis must be
possible, from the genetic evidence. (Otherwise the LR's will mostly
be infinite.) There is no reason that I can see that the hypothesis
chosen as pivotal must have a prior>0, however.
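The claim that the choice of pivot does not matter can be checked numerically. The sketch below uses hypothetical helper names (posteriors, repivot) and the example numbers from this page; it also tries a pivot whose prior is zero.

```python
from fractions import Fraction


def posteriors(priors, lr_vs_pivot):
    """Normalize prior x LR over all hypotheses."""
    weights = {h: priors[h] * lr_vs_pivot[h] for h in priors}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


def repivot(lr, pivot):
    """Re-express likelihood ratios relative to a different pivot."""
    return {h: v / lr[pivot] for h, v in lr.items()}


priors = {"F": Fraction(1, 4), "U": Fraction(1, 4), "Z": Fraction(1, 2)}
lr_vs_z = {"F": Fraction(500), "U": Fraction(100), "Z": Fraction(1)}

# Same posteriors whether F, U or Z is taken as the pivot.
results = [posteriors(priors, repivot(lr_vs_z, p)) for p in "FUZ"]
assert results[0] == results[1] == results[2]

# A pivot with prior 0 still works: Z is ruled out a priori here.
no_z = {"F": Fraction(1, 2), "U": Fraction(1, 2), "Z": Fraction(0)}
print(posteriors(no_z, lr_vs_z))
# {'F': Fraction(5, 6), 'U': Fraction(1, 6), 'Z': Fraction(0, 1)}
```

Rescaling every likelihood ratio by the same factor cancels in the normalization, which is why the three results agree.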