- Background discussion
- Introduction of terms
- Relationship between W and L
- paternity, and
- non-paternity
- What's wrong with this picture?
- Hidden assumptions
- The assumption that "nonpaternity" means "unrelated"
- for the plaintiff, H_{0}: The accused is the father.
- for the defendant, H_{1}: An unrelated man is the father.
- 50% as prior probability is approximately the right number as a historical average when paternity has been alleged and disputed.
- We therefore include it for guidance, and only as an indication and example.
- The expert report does not assume 50%; it only means "**If the court** assumes a prior probability of 50% ...". If the court were to make a different assumption, the conclusion would be somewhat different.
- Therefore if the court judges that the facts of the present case are not typical, the 50% prior probability assumption may not be appropriate. However, it will usually require a very extreme prior probability assumption to change the final conclusion, especially for a civil case ("preponderance of evidence" criterion for decision).
- The assumption of a 50% "prior probability"
- The real formula for W
- A probability summarizes a point of view
- Strategy
- Defense strategy
- Plaintiff strategy
- Mathematical analysis of the OJ defense
- H_{0}: OJ left the blood.
- H_{1}: An unrelated person left the blood.

I put quotes around "probability of paternity" because I mean
to discuss the *particular* statistic that appears on a
paternity report.

The paternity report often says something like this:

Paternity index = 1204

Probability of paternity = 99.92%

The first of these numbers has a highly respectable mathematical
background, so we shall call it **L**, reflecting
the fact that the
paternity index is an
example of a mathematical
notion known as a likelihood ratio.

The "probability of paternity" has a somewhat less cultured
background, and we call it **W** (from the German
word *Wahrscheinlichkeit*, "probability").

Looking at my example, you can see that W and L are related as

W = L / (1+L), or, if you want to go backwards,

L = W / (1-W).
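The two conversions are easy to check numerically. What follows is only a minimal Python sketch; the function names `w_from_l` and `l_from_w` are made up for illustration and do not correspond to anything on an actual report.

```python
def w_from_l(l):
    """Convert a paternity index L into W, using W = L / (1 + L)."""
    return l / (1.0 + l)


def l_from_w(w):
    """Go backwards: recover L from W, using L = W / (1 - W)."""
    return w / (1.0 - w)


w = w_from_l(1204)                 # the paternity index from the sample report
print(f"W = {w:.2%}")              # the report's "probability of paternity"
print(f"L = {l_from_w(w):.0f}")    # round trip back to the paternity index
```

Running it on the sample report's paternity index of 1204 reproduces the 99.92% figure, which is how one can tell that the report derived W from L alone.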

The logic of the relationship is -- or is supposed to be -- that L summarizes the scientific (i.e. genetic, usually DNA) evidence, and W incorporates other ("anecdotal") evidence about the case.

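L compares two hypotheses or scenarios -- informally they are

- for the plaintiff, H_{0}: The accused is the father.
- for the defendant, H_{1}: An unrelated man is the father.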

(Note: L does *not* say how much more likely one
*hypothesis* is than the other. It says the converse: If you
*assume* one or the other hypothesis, under which assumption
would the *genetic evidence* be more likely?)

(Suppose for example that the man and the child have some
genetic traits in common. Assuming that the man is the father
this is a likely and expected result; children get genetic
material from their parents. Assuming that the man is *not*
the father, the result is not so likely. In the non-paternity
case a coincidence must be assumed to explain the common traits.)
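To make the direction of the comparison concrete, here is a small Python sketch. The two probabilities are invented numbers chosen purely for illustration; they do not come from any real case or population study, and the function name is likewise hypothetical.

```python
def likelihood_ratio(p_evidence_given_h0, p_evidence_given_h1):
    """L = P(genetic evidence | H0) / P(genetic evidence | H1)."""
    return p_evidence_given_h0 / p_evidence_given_h1


# Hypothetical numbers, for illustration only:
# if the man is the father, suppose the shared traits are seen half the time;
# if he is not, suppose they arise by coincidence about 1 time in 100.
p_if_father = 0.5
p_if_not_father = 0.01

L = likelihood_ratio(p_if_father, p_if_not_father)
print(f"The evidence is {L:.0f} times more likely if the man is the father")
```

Note that the result is a statement about the evidence under each assumption. By itself it says nothing yet about how probable paternity is.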

Since W is labelled as a "probability," it should in principle refer to some repeatable experiment and to the chances of making right and wrong decisions. If a judge hears 1000 cases in each of which W=99.9%, he should be entitled to expect that by ruling for paternity every time he will be wrong only one time, and right 999 times.
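The judge's expectation is simple arithmetic, sketched below in Python. The function name is invented for illustration, and the calculation is only valid on the premise that W really is the chance of paternity in each case.

```python
def expected_wrong_rulings(n_cases, w):
    """Expected number of wrong paternity rulings, if W is a true probability."""
    return n_cases * (1.0 - w)


n_cases = 1000
wrong = expected_wrong_rulings(n_cases, 0.999)
print(f"expected wrong: {wrong:.0f}, expected right: {n_cases - wrong:.0f}")
```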

L is computed purely from the genetic types (and statistical studies). W is computed, as shown above, purely from L. Suppose the man is sterile, or long dead? Suppose the woman had only this one partner? According to the methods described above, that would change nothing. There is no room for the facts!

Before you get the idea that I am a revolutionary trying to throw the whole paternity testing concept into disrepute, let me put this sensational-seeming claim into context. Methods or a point of view that leave no room for the facts are of course illogical, and therefore should be improved. But in most paternity cases the science is extremely strong and, as I shall discuss below, there is legitimately very little room for the non-scientific facts in coming to a correct and balanced appraisal.

(1) The father is either the accused, or someone who is entirely unrelated.

(2) Before considering the DNA evidence, the probability is 50% either way.

In the United States, it is gradually becoming understood that both these assumptions should be clearly stated on the report. Otherwise the report is not really fair because the judge will assume that it means more than it does.

Each of the above assumptions is worth discussing.

But the laboratory is not one of the disputants.

Specifying the alternative hypothesis should logically be the prerogative of the side alleging non-paternity, the defendant. If that option is concealed, it is as if the plaintiff can say to the judge,

"Judge, here is the scientific evidence. Here are the stories for the two sides: we claim H_{0}, and we claim that they claim H_{1}. As you can see, H_{0} is a much better explanation for the evidence, so we should win."

If the man feels that paternity by his own brother is a possible alternative hypothesis he should not allow the laboratory, the judge, or the opposition to put other words into his mouth. Otherwise he ends up defending a strawman.

Once a pair of competing hypotheses is correctly formulated, there is an objective procedure for calculating the correct value of L, the quantitative summary of the value of the genetic evidence for distinguishing between the two hypotheses.

## How to explain to judges why we include a prior probability of 50% in our paternity report

Here are my thoughts, simple answer first and then some explanation:

To assume a prior probability of 50% is therefore tantamount to saying that the plaintiff and defense stories are equally plausible. To say that without hearing either story seems offhand to contradict the principle of a trial.

Therefore it is important to understand that if the paternity report includes the inconspicuous phrase "at 50% prior," what it really means is

If the reader of this report wishes to assume a prior probability of 50%, then the (posterior) probability will be [whatever figure is given]. On the other hand, if the reader feels that some other prior probability is appropriate, then the posterior probability will be somewhat different.

And if the report gives a W value with no mention at all of a prior probability (which is the practice in most countries), then it is flatly misleading.

How, you may wonder, can such blatant unfairness go on? Am I the only one aware of these problems? Of course I am not. The balanced view is that there is unfairness in principle, but generally not in fact. The next section is in a special color so that, if so inclined, you can skip the math.

This section shows how to avoid assuming a prior probability of 50%, and shows why the conclusion probably won't change anyway.

Let p = prior probability, L = likelihood ratio. The real formula for W is

W = pL / (pL + 1-p).

Including the prior probability p as a variable is the avenue through which anecdotal evidence can be incorporated into W. (Note that if we take p=50%, the formula reduces to W = L/(L+1), the formula cited above that is so often used unwittingly. That is where the above formula comes from, and why we say that it has a prior probability assumption built into it.)

But the effect of allowing p to vary is often not material because with DNA tests L is usually very large. For example, suppose that L=100000. Then the paternity report will probably calculate W=100000/100001 and report W=99.999%.

Suppose that the defendant presents powerful testimony: the woman is unreliable and has made many false accusations in the past; he is a prominent person, a likely target; she admits that full penetration did not occur and that he failed to ejaculate. Suppose that from all of this the judge is persuaded to estimate that if testimony like this were presented in 500 cases, only about once would the man be the father and 499 times he would have been falsely accused. So p=1/500. Applying the above formula, we get W = 100000/100499, or about 99.5%.

The verdict is probably the same whether W=99.5% or 99.999%. Certainly in a civil case it should be. So while the man is entitled by our system of jurisprudence to have his say in court, it is likely that nothing he can say will be significant compared to the force of the DNA evidence.
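The formula with an explicit prior can be tried out directly. This is a minimal Python sketch of W = pL / (pL + 1-p) using the numbers from this section (L=100000, and priors of 1/2 and 1/500); the function name `posterior_w` is an invented label.

```python
def posterior_w(prior, lr):
    """W = p*L / (p*L + 1 - p), with p the prior and L the likelihood ratio."""
    return prior * lr / (prior * lr + 1.0 - prior)


L = 100_000

# A 50% prior reproduces the shortcut W = L / (L + 1).
print(f"p = 1/2:   W = {posterior_w(0.5, L):.3%}")

# The defendant's testimony persuades the judge that p = 1/500.
print(f"p = 1/500: W = {posterior_w(1 / 500, L):.1%}")
```

Even a prior as unfavorable to paternity as 1 in 500 leaves W at roughly 99.5%, which is the sense in which the DNA evidence dominates.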

I think it is an obvious point, but often not appreciated, that
any probability statement will reflect a particular point of view.
From the point of view of the testing laboratory, 50% is a reasonable
prior because they see many cases of paternity accusations and the
accusation is correct or incorrect close enough to equally
often.^{1} However this view ignores the
factors that may distinguish one particular case from the multitude.

From the judge's point of view, the man's story may be quite compelling -- "I am infertile", "I never met the woman", etc. If the judge is tempted to believe such testimony, the prior probability from his point of view can be much smaller.

The opposite can also happen of course, where the judge feels inclined to start with a prior probability much higher than 50% favoring the story of the plaintiff.
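To see how much the point of view matters in practice, one can tabulate W for several priors. The sketch below uses the paternity index of 1204 from the sample report; the particular priors listed are arbitrary choices for illustration, and taking W = 50% as the tipping point is only a rough stand-in for a "preponderance of evidence" standard.

```python
def posterior_w(prior, lr):
    """W = p*L / (p*L + 1 - p), with p the prior and L the likelihood ratio."""
    return prior * lr / (prior * lr + 1.0 - prior)


def break_even_prior(lr):
    """Prior at which W is exactly 50%: solve p*L = 1 - p for p."""
    return 1.0 / (1.0 + lr)


L = 1204  # the paternity index from the sample report

# Arbitrary illustrative priors, from favoring the plaintiff to favoring the defense.
for prior in (0.9, 0.5, 0.1, 0.01):
    print(f"prior = {prior:>4.0%}  ->  W = {posterior_w(prior, L):.2%}")

print(f"W drops to 50% only when the prior is about {break_even_prior(L):.3%}")
```

The break-even prior is 1/(1+L), so the larger the likelihood ratio, the more extreme the judge's starting point must be before the conclusion changes.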

If the defense is smart, they won't allow the plaintiff to dictate
their case. Maybe they can find a claim H_{2}, usually that
the true father is related to the accused, that will explain the
evidence nearly as well as H_{0} does. They will say, "Wait
a minute! I never claimed H_{1}! Maybe I claim
H_{2} -- THEN how does your calculation work?" So the defense
wants to change assumption (1), which they are certainly entitled to do.

(The brother defense is especially likely to be useful when the alleged father is not tested, as can happen in an inheritance case where he may be dead and relatives are tested in his stead.)

That doesn't mean the defense wins, it just means the plaintiff
has to be smart. If the plaintiff's case is in fact correct (the
accused really is the father), then they will be able to modify
assumption (2), and that will be their
winning response. For example, suppose the defense suggests
(H_{2}) that the true father is the brother of the accused.
If the plaintiff can show that the accused has no brother, then
assumption (2) changes a lot.

Comparing the alternatives

- H_{0}: OJ left the blood.
- H_{1}: An unrelated person left the blood.

the likelihood ratio is 200:1 favoring OJ as the origin of the blood.

Incidentally, the press^{2} promptly
misreported this as meaning that it is 200:1 that OJ left the
blood -- thus committing the blunder (prosecutor's fallacy)
of presuming a prior probability of 50%.
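The gap between the two statements shows up clearly in odds form, where the posterior odds equal the prior odds multiplied by the likelihood ratio. The Python sketch below is illustrative only: the priors are arbitrary values, not estimates from the case, and the function name is invented.

```python
def posterior_odds(prior, lr):
    """Posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    return prior_odds * lr


LR = 200  # the likelihood ratio for the blood evidence

# Arbitrary illustrative priors; only the 50% prior yields odds of 200 to 1.
for prior in (0.5, 0.1, 0.01):
    odds = posterior_odds(prior, LR)
    print(f"prior = {prior:.0%}: odds that OJ left the blood = {odds:.1f} to 1")
```

Only when the prior is 50%, so that the prior odds are 1 to 1, do the odds on the hypothesis coincide with the likelihood ratio.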

The defense was very clever. They took the approach that I
described above as changing assumption (1).
They said, "No, we don't allege your H_{1}. We allege

H_{2}: The detective planted OJ's blood
from a vial"

Notice what this does to the likelihood ratio. Assuming
H_{0}, what is the chance to observe the evidence (namely,
that the sidewalk stain matches OJ)? Essentially 100%, so that
is the numerator of the likelihood ratio.

Similarly, assuming H_{2}, what is the chance to
observe that the sidewalk stain matches OJ? Again, 100%. So
100% is also the denominator of the likelihood ratio.

Result -- the likelihood ratio is 1. The evidence has no probative value whatever.
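A likelihood ratio of 1 means the evidence leaves the prior exactly where it was. Here is a minimal Python sketch of that point; the priors are arbitrary illustrative values for H_{0} against H_{2}, and the names are made up.

```python
def posterior(prior, lr):
    """Posterior = p*L / (p*L + 1 - p), with p the prior and L the likelihood ratio."""
    return prior * lr / (prior * lr + 1.0 - prior)


p_match_given_h0 = 1.0  # OJ left the blood: a match is essentially certain
p_match_given_h2 = 1.0  # blood planted from a vial: a match is again certain
lr = p_match_given_h0 / p_match_given_h2

# Arbitrary illustrative priors: whatever one believed before, one believes after.
for prior in (0.1, 0.5, 0.9):
    print(f"prior = {prior:.0%}  ->  posterior = {posterior(prior, lr):.0%}")
```

With the evidence neutralized in this way, the whole argument shifts to what the prior ought to be.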

Following the lesson above, the prosecution now should have argued about assumption (2). That is, they should argue that the (prior) probability that the detective would, and could successfully, do such a thing is very small. That seems an easy enough argument to make, but I don't think they made a serious attempt at it. Rather, they sputtered quite a bit, apparently enraged at the defense's suggestion and objecting in principle to the defense's tactic of "putting the police on trial." I don't think the criticism is a fair one. The defense adopted a logical (if desperate) argument, and if that defense seems to amount to "putting the police on trial" so be it.
