comments on "Fundamental problem of forensic mathematics – The Evidential Value of a Rare Haplotype"


Brenner CH (2009)
in press 2009
Fundamental problem of Forensic Mathematics — Evidential value of a rare haplotype Nov 10, 2009 (maximally readable html version of the paper)

abbreviated abstract

When a rare haplotype is shared between suspect and crime scene, how strong is the evidence linking the two? The fundamental question is the matching probability:
What is the probability that an innocent suspect will match the crime scene haplotype?

The common and interesting situation is a previously unobserved haplotype. The traditional tools of product rule and sample frequency are not useful when there are no components to multiply and the sample frequency is zero. A useful statistic is the fraction K (kappa) of the population sample that consists of "singletons" – of once-observed types. A simple argument shows that the probability for a random innocent suspect to match a previously unobserved crime scene type is (1-K)/n – distinctly less than 1/n, likely ten times less. The robust validity of this model is confirmed by testing it against a range of population models.
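As a rough illustration of the arithmetic in the abstract, the sketch below computes the singleton fraction and the resulting (1-K)/n matching probability for a toy sample. The function name, the toy haplotype labels, and the choice to append the crime scene type to the sample before counting (so that n is the size of the augmented sample) are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def kappa_match_probability(reference_sample, crime_scene_type):
    """Return (kappa, match_probability) for a previously unobserved type.

    Hypothetical helper for illustration only.
    """
    # Assumption: the crime scene type is counted as part of the sample,
    # so n is the sample size after it has been appended.
    sample = list(reference_sample) + [crime_scene_type]
    n = len(sample)
    counts = Counter(sample)

    # The (1 - kappa)/n formula is stated for a type not seen before,
    # i.e. one that is itself a singleton in the augmented sample.
    if counts[crime_scene_type] != 1:
        raise ValueError("crime scene type was already observed in the sample")

    # kappa: fraction of the sample made up of once-observed types.
    singletons = sum(1 for c in counts.values() if c == 1)
    kappa = singletons / n
    return kappa, (1 - kappa) / n


# Toy data: nine reference haplotypes (one type seen twice) plus a new type "X".
toy_reference = ["A", "A", "B", "C", "D", "E", "F", "G", "H"]
kappa, p_match = kappa_match_probability(toy_reference, "X")
print(f"kappa = {kappa:.2f}, matching probability = {p_match:.4f}")
```

In this toy sample n = 10 and eight of the ten observations are singletons, so kappa is 0.8 and the matching probability is about 0.02, five times smaller than 1/n = 0.1. With kappa near 0.9 the factor would be about ten, which is the situation the abstract describes as likely.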

personal comment

I've been trying to write this paper since 1997 and had a lot of problems along the way. The goal was never to present a conservative number or to make an official recommendation having the force of authority, but to understand the problem.

As early as 1999 I presented an answer at a statistics conference in North Carolina (although a key point is that the problem is not a statistical problem): the approach that the present paper calls the t model. However, I hit a snag trying to justify it mathematically, which I thought necessary because the mathematical derivation is unfortunately not quite simple.

By that time, though, I had arrived at three key insights that contradict common practice in the forensic community and, it is perhaps fair to say, commonly held beliefs:

1. The crime stain must be counted as part of the population sample.
   Implication: A "zero observations" situation never arises.
2. The pertinent question is a question not of frequency but of probability (and there is a difference).
   Implication: "Confidence intervals" are irrelevant to the problem.
3. As an estimate of the chance to see a trait in a population, the chance to see it in a population sample may be neither neutral nor reasonable.
   Implication: The matching LR may very much exceed the size, n, of the reference sample.

A couple of years ago I found an easier way to arrive at approximately the same answer, the kappa model, and eagerly imagined that with this new-found simplicity the paper was only weeks away. To cut a long story short, that proved not to be the case. Finally, however, after a push of several months during which I determinedly put off almost all other priorities, I got the paper out the door. A link to the submission draft is above.


Return to home page of Charles H. Brenner