Mixture likelihood ratio based on exclusion

20 June 2000

The likelihood ratio method (for paternity, for example, or for a mixed stain in a criminal case) consists of formulating two hypotheses and comparing the probability of seeing the evidence under each hypothesis in turn.

If we take a limited view of the evidence, the same scheme produces an "exclusion likelihood ratio." The limited view consists in ignoring the details of the man's type, and only taking cognizance of whether or not he is "excluded." I put "excluded" in quotes because it is a questionable, if not downright phoney, concept.

In the case of a mixed stain, we probably define exclusion like this:

A man is excluded if he has any allele at any locus not detected in the stain.

Normally we do not take any account of the number of contributors. Thus, if the stain is PQRS, a homozygous Q man is not excluded even if the witness swears there were only two assailants.
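The exclusion rule above can be sketched in a few lines of Python. This is only an illustration: the function name is_excluded and the dictionary layout (locus name mapped to alleles) are my own assumptions, not part of any particular program.

```python
def is_excluded(man, stain):
    """Return True if the man has any allele, at any locus, not detected in the stain.

    man   -- dict: locus name -> pair of alleles (hypothetical layout)
    stain -- dict: locus name -> set of alleles detected in the stain
    The number of contributors is deliberately ignored, as in the text.
    """
    for locus, detected in stain.items():
        # Loci at which the man was not typed are skipped (an assumption).
        for allele in man.get(locus, ()):
            if allele not in detected:
                return True
    return False


stain = {"locus1": {"P", "Q", "R", "S"}}
print(is_excluded({"locus1": ("Q", "Q")}, stain))  # False: homozygous Q is not excluded
print(is_excluded({"locus1": ("Q", "T")}, stain))  # True: T is not in the stain
```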

We define A = probability of excluding a random non-contributor. Given the stain, there is a formula to evaluate A (see "Evaluation of A" below). For the likelihood ratio formulation, we let
E = evidence = DNA types of the stain, plus the fact that the suspect is not excluded.
H0 = "suspect contributed"
H1 = "suspect is unrelated to any contributor"

Then X = P(E | H0) = 1 (a contributor's alleles are all in the stain, so he is never excluded), and

Y = P(E | H1) = 1-A.

Therefore LRexclusion = X/Y = 1/(1-A).
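As a quick numerical illustration of the formula, here is a minimal Python sketch. The function name exclusion_lr and the value A = 0.99 are made up for the example.

```python
def exclusion_lr(a):
    """Exclusion likelihood ratio 1/(1-A).

    a -- A, the probability of excluding a random non-contributor.
    """
    if not 0.0 <= a < 1.0:
        raise ValueError("A must satisfy 0 <= A < 1")
    x = 1.0      # X = P(E | H0)
    y = 1.0 - a  # Y = P(E | H1)
    return x / y


# Illustrative value only: if 99% of random non-contributors would be
# excluded, the exclusion likelihood ratio is about 100.
print(exclusion_lr(0.99))
```

Note that A = 0 (nobody is ever excluded) gives a likelihood ratio of 1, that is, no evidence either way.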

Evaluation of A

Define Ai = probability to exclude a random non-contributor at the i-th locus.

The formulas come out naturally in terms of 1-A and 1-Ai, so we denote these quantities by B and Bi respectively. Thus B=inclusion probability, which is to say that B is the proportion of the population whose DNA alleles are contained among the stain alleles.

If we believe in the product rule across loci, B = PROD Bi, the product of the Bi over all loci i.

To see how to evaluate the locus-specific inclusion probability Bi, consider an example. Suppose that the stain shows alleles P, Q, R, S at locus i, that p, q, r, s are the frequencies of those alleles respectively, and that o (which may be 0) is the rate of null or undetected alleles.

Then for the locus i mentioned above, any man both of whose alleles are among P,Q,R,S or null will be included. If you accept the product rule within a locus,

Bi = (p+q+r+s+o)^2.

It is easiest to understand this expression by taking the point of view that P, Q, etc are sub-types of some type Z={P, Q, R, S, null}. The set of eligible men is then the set of men who are "homozygous for Z." Since the frequency of the type Z is p+q+r+s+o, the expression for Bi is now obvious.
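The "homozygous for Z" view translates directly into code. The following Python sketch computes Bi for each locus as the square of the frequency of Z, multiplies across loci to get B, and reports 1/B. The function names and all allele frequencies are invented for illustration, and the product rule is assumed both within and across loci.

```python
from math import prod


def locus_inclusion(stain_allele_freqs, null_rate=0.0):
    """Bi: square of the frequency of Z (stain alleles plus null)."""
    z = sum(stain_allele_freqs) + null_rate
    return z ** 2


def inclusion_probability(loci):
    """B: product of Bi over loci; loci is a list of (frequencies, null_rate)."""
    return prod(locus_inclusion(freqs, o) for freqs, o in loci)


# Made-up frequencies for two loci, no null alleles.
loci = [
    ([0.10, 0.20, 0.15, 0.05], 0.0),  # frequency of Z = 0.5, so Bi is about 0.25
    ([0.30, 0.20], 0.0),              # frequency of Z = 0.5, so Bi is about 0.25
]
b = inclusion_probability(loci)
print(b)      # about 0.0625
print(1 / b)  # exclusion likelihood ratio 1/(1-A) = 1/B, about 16
```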

Alternatively, by expanding the square we get

Bi = p^2 + q^2 + ... + 2pq + 2pr + ...,

so that Bi represents the sum of the frequencies of all genotypes consistent with the stain. Therefore, if we want to abandon the assumption of the product rule for genotype frequencies and instead follow the NRC II recommendation of compensating for homozygous types based on population substructure, instead of p^2 we should write p^2 + p(1-p)theta. This is what the DNA·VIEW DNA exclusion command does.
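The genotype-by-genotype sum can be sketched in Python as follows. This is my own illustration and not the DNA·VIEW code: the function name is hypothetical, heterozygote terms are left at 2pq (only the homozygote correction is stated above), and how a null allele should be treated under theta is not addressed here, so the sketch simply takes a list of allele frequencies.

```python
from itertools import combinations


def locus_inclusion_theta(freqs, theta=0.0):
    """Sum of genotype frequencies consistent with the stain at one locus.

    freqs -- frequencies of the alleles detected in the stain
    theta -- population substructure parameter (0 gives the product rule)
    """
    # Homozygotes: p^2 + p(1-p)theta in place of p^2.
    homozygotes = sum(p * p + p * (1 - p) * theta for p in freqs)
    # Heterozygotes: 2pq, uncorrected (an assumption of this sketch).
    heterozygotes = sum(2 * p * q for p, q in combinations(freqs, 2))
    return homozygotes + heterozygotes


freqs = [0.10, 0.20]  # made-up frequencies
print(locus_inclusion_theta(freqs))              # about 0.09, i.e. (0.1+0.2)^2
print(locus_inclusion_theta(freqs, theta=0.01))  # about 0.0925
```

With theta = 0 the sum collapses back to the squared form, which is a handy check on the expansion. A positive theta raises Bi and therefore lowers the likelihood ratio.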