J For Sci 42(2):221-222
APPENDIX
Charles H. Brenner
The genetic markers ("alleles") of an evidence stain may be identical to the alleles of a reference sample (such as a suspect). The likelihood ratio for the evidentiary strength favoring association is then simply the inverse of the profile frequency. However, the evidence stain is often complicated by the presence of additional alleles, variously from additional known or unknown suspects or victims. The likelihood ratio is clearly more complicated in such cases, but Weir et al. (1997) present without proof a general and elegant formula for the probabilities that occur as numerator or denominator. In this paper we give a proof.
Key words: forensic science, mixed stains, DNA profiles, likelihood ratios
Let $E$ be the set of alleles observed in the evidence for some discrete-allele system. Of these, some may be attributable to known parties; the remainder $U \subseteq E$ are to be explained by $x$ people with two alleles (not necessarily distinct) each. Explained means that $U \subseteq X \subseteq E$, where $X$ is the set of all alleles in the $x$ people. More generally, for a subset $S \subseteq U$, we will say that people with alleles $X$ exactly explain $S$ if $X \subseteq E$ and $X \cap U = S$, or equivalently, putting $W = U \setminus S$, if $U \setminus W \subseteq X \subseteq E \setminus W$.
The set-notation symbols are to be understood according to the standard conventions: $U \subseteq E$ means that the alleles $U$ are among those of $E$, including the possibility that $U = E$. $X \cap U$ is the intersection: the set of alleles that are both in $X$ and in $U$. $U \setminus S$ is the set difference: the set of alleles that are in $U$ excluding those that are also in $S$. The cardinality (size, in number of alleles) of a set $J$ is written $|J|$. The symbol $\in$ denotes membership; $j \in U$ means that $j$ is an allele of the set $U$.
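The two equivalent formulations of "exactly explain" can be checked mechanically on a small example. The following is a minimal sketch using Python's built-in set operations; the function names, the allele labels, and the extra allele "z" outside the evidence are illustrative assumptions, not part of the original text.

```python
from itertools import combinations

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def explains(X, U, E):
    """X explains U within evidence E: U is a subset of X, and X of E."""
    return U <= X <= E

def exactly_explains(X, S, U, E):
    """First form: X lies within E and meets U in exactly S."""
    return X <= E and (X & U) == S

def exactly_explains_alt(X, S, U, E):
    """Second form, with W = U \\ S: U\\W within X within E\\W."""
    W = U - S
    return (U - W) <= X <= (E - W)

# Hypothetical alleles: E is the evidence, U the part to be explained,
# and "z" is an allele that does not appear in the evidence at all.
E = frozenset("abcd")
U = frozenset("abc")
universe = frozenset("abcdz")

# The two forms agree for every allele set X and every S within U.
agree = all(
    exactly_explains(X, S, U, E) == exactly_explains_alt(X, S, U, E)
    for X in subsets(universe)
    for S in subsets(U)
)
print(agree)
```

Note that "explains U" is the special case of "exactly explains S" with S = U, that is, W empty.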
Following Weir et al. we write $P_x(U|E)$ for the probability that $x$ random people explain $U$. Let $J$, $J \subseteq U$, be a set of alleles. We will be interested in sets of people whose alleles omit the set $J$. Let
(1) $T_J$ = the total of the frequencies of the alleles in $E \setminus J$.
Weir discovered that
(2) $P_x(U|E) = \sum_{J \subseteq U} (-1)^{|J|} \, T_J^{2x}$
but did not supply a proof.
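As a numerical sanity check, the formula can be compared with direct enumeration. The sketch below assumes that (2) is the alternating sum, over all subsets J of U, of (-1)^|J| times the 2x-th power of the total in (1), and that the 2x alleles are independent draws from the population frequencies. The allele labels and frequencies are made up for illustration, and exact fractions are used to avoid rounding issues.

```python
from fractions import Fraction
from itertools import combinations, product

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def p_formula(U, E, freq, x):
    """Inclusion-exclusion sum over J within U of (-1)^|J| * T_J^(2x)."""
    total = Fraction(0)
    for J in subsets(U):
        t_j = sum((freq[a] for a in E - J), Fraction(0))
        total += (-1) ** len(J) * t_j ** (2 * x)
    return total

def p_brute(U, E, freq, x):
    """Enumerate every ordered draw of 2x alleles (assumed independent)
    and add the probability of those whose allele set X has U <= X <= E."""
    total = Fraction(0)
    for draw in product(sorted(freq), repeat=2 * x):
        if U <= frozenset(draw) <= E:
            p = Fraction(1)
            for a in draw:
                p *= freq[a]
            total += p
    return total

# Hypothetical population frequencies; "d" is not seen in the evidence.
freq = {"a": Fraction(1, 10), "b": Fraction(2, 10),
        "c": Fraction(3, 10), "d": Fraction(4, 10)}
E = frozenset("abc")
U = frozenset("ab")

print(p_formula(U, E, freq, 1))  # 1/25
print(p_formula(U, E, freq, 2) == p_brute(U, E, freq, 2))
```

For one person (x = 1) the result is easy to confirm by hand: the person must carry exactly the alleles a and b, which has probability 2 × 1/10 × 2/10 = 1/25.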
A proof seems worthwhile. The general idea is clear enough: (2) is an instance of the principle of inclusion and exclusion (Hall, 1967). From the definition (1) and the assumption of a discrete allele system, $T_J^{2x}$, the $2x$-th power of the total defined in (1), is the probability that x people's alleles are all in $E \setminus J$, the $2x$ alleles being treated as independent draws. As the basis for the inclusion-exclusion analysis, we note that
(3) $T_J^{2x} = \sum_{J \subseteq W \subseteq U} P_x(U \setminus W \,|\, E \setminus W)$
because any set of people whose alleles are among $E \setminus J$ exactly explains some one and only one subset, $U \setminus W$, of $U \setminus J$. The summation is taken over all sets of alleles $W$ that satisfy $J \subseteq W \subseteq U$. Introduction of the sets $W$, as a means of effectively classifying the various positive and negative contributions to the sum in (2), is the key idea in the proof.
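The partition argument can also be checked numerically without relying on the formula being proved. In this sketch the probability that x people explain a set is computed by brute-force enumeration, assuming independent allele draws, and is then summed over all W with J ⊆ W ⊆ U. The identity being tested is assumed to read: the 2x-th power of the total in (1) equals that sum. Names and frequencies are illustrative only.

```python
from fractions import Fraction
from itertools import combinations, product

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def p_explain(U, E, freq, x):
    """Brute-force probability that 2x independently drawn alleles
    form a set X with U <= X <= E."""
    total = Fraction(0)
    for draw in product(sorted(freq), repeat=2 * x):
        if U <= frozenset(draw) <= E:
            p = Fraction(1)
            for a in draw:
                p *= freq[a]
            total += p
    return total

def t_power(J, E, freq, x):
    """Left side: (total frequency of E \\ J) raised to the power 2x."""
    return sum((freq[a] for a in E - J), Fraction(0)) ** (2 * x)

def partition_sum(J, U, E, freq, x):
    """Right side: sum of P_x(U\\W | E\\W) over all W with J <= W <= U."""
    return sum((p_explain(U - W, E - W, freq, x)
                for W in subsets(U) if J <= W), Fraction(0))

# Hypothetical frequencies; "d" is an allele absent from the evidence.
freq = {"a": Fraction(1, 10), "b": Fraction(2, 10),
        "c": Fraction(3, 10), "d": Fraction(4, 10)}
E = frozenset("abc")
U = frozenset("ab")

identity_holds = all(
    t_power(J, E, freq, x) == partition_sum(J, U, E, freq, x)
    for J in subsets(U)
    for x in (1, 2)
)
print(identity_holds)
```

With J empty and x = 1, the four terms on the right are 1/25, 4/25, 7/100 and 9/100, which add up to (6/10)^2 = 9/25.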
Define
(4) $Q_m = \sum_{J \subseteq U,\ |J| = m} T_J^{2x}$, where $T_J$ is the total defined in (1).
In this notation, (2) is expressed as
(5) $P_x(U|E) = Q_0 - Q_1 + Q_2 - + \cdots$.
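Summing (3) for fixed $m$ over all sets $J \subseteq U$ of cardinality $m$, and writing $n = |U|$, we obtain from (4)
(6) $Q_m = \sum_{J \subseteq U,\ |J| = m} \ \sum_{J \subseteq W \subseteq U} P_x(U \setminus W \,|\, E \setminus W)$
(7) $= \sum_{k=m}^{n} \ \sum_{W \subseteq U,\ |W| = k} \ \sum_{J \subseteq W,\ |J| = m} P_x(U \setminus W \,|\, E \setminus W)$
(8) $= \sum_{k=m}^{n} \ \sum_{W \subseteq U,\ |W| = k} \big|\{ J \subseteq W : |J| = m \}\big| \; P_x(U \setminus W \,|\, E \setminus W)$
(9) $= \sum_{k=m}^{n} \binom{k}{m} \sum_{W \subseteq U,\ |W| = k} P_x(U \setminus W \,|\, E \setminus W)$.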
On the right hand side of line (6) each $W$ occurs many times, once for each $J$ of which it is a superset. The object is to count how many times. Classifying the $W$'s according to their size $k$ on line (7), we see on (8) that it is the same as the number of $m$-allele subsets of a $k$-set, which is exactly the definition of the binomial symbol $\binom{k}{m}$. Hence line (9).
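The counting step can be confirmed by direct enumeration. The sketch below, with hypothetical allele labels, walks over every J of size m within U and every superset W of J within U, tallies how often each W is visited, and compares the tally with the binomial coefficient from `math.comb` (Python 3.8 or later).

```python
from collections import Counter
from itertools import combinations
from math import comb

def subsets(s):
    """All subsets of s, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def multiplicities(U, m):
    """How many times each W within U is visited when looping over
    all J within U of size m and then over all W containing J."""
    counts = Counter()
    for J in (frozenset(c) for c in combinations(sorted(U), m)):
        for W in subsets(U):
            if J <= W:
                counts[W] += 1
    return counts

U = frozenset("abcd")  # hypothetical alleles, n = |U| = 4

# Each W of size k is visited exactly C(k, m) times (zero when k < m).
counting_ok = all(
    multiplicities(U, m)[W] == comb(len(W), m)
    for m in range(len(U) + 1)
    for W in subsets(U)
)
print(counting_ok)
```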
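To verify (5), now form the alternating sum over $m$, where the sum runs to $n = |U|$:
$Q_0 - Q_1 + Q_2 - + \cdots = \sum_{m=0}^{n} (-1)^m Q_m$
(10) $= \sum_{m=0}^{n} \sum_{k=m}^{n} (-1)^m \binom{k}{m} \sum_{W \subseteq U,\ |W| = k} P_x(U \setminus W \,|\, E \setminus W)$
(11) $= \sum_{k=0}^{n} \Big[ \sum_{W \subseteq U,\ |W| = k} P_x(U \setminus W \,|\, E \setminus W) \Big] \sum_{m=0}^{k} (-1)^m \binom{k}{m}$.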
Line (10) shows that the same set $W$ may occur in several $Q_m$ terms. To compute the net contribution due to each $W$, it is natural to reverse the order of summation so that the classification is on $W$ first and then on $m$, which is formula (11). To verify the transition from (10) to (11), note that the index sets of the double summations $\sum_{m=0}^{n} \sum_{k=m}^{n}$ and $\sum_{k=0}^{n} \sum_{m=0}^{k}$ range over the same pairs $(k, m)$, namely the triangular array where $0 \le m \le k \le n$.
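Hence the net number of times that a contribution from each set $W$ is included and excluded is given by the last factor in (11). That factor is simply unity when $k = 0$, and when $k > 0$ it is even simpler, for by the binomial theorem $\sum_{m=0}^{k} (-1)^m \binom{k}{m} = (1-1)^k = 0$. So only the empty set $W$ survives, and
$Q_0 - Q_1 + Q_2 - + \cdots = \sum_{W \subseteq U,\ |W| = 0} P_x(U \setminus W \,|\, E \setminus W) = P_x(U|E)$, Q.E.D.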
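The final cancellation rests on the alternating sum of binomial coefficients, which is 1 for k = 0 and 0 for every k > 0. A short check, as a sketch with a hypothetical function name and using `math.comb` (Python 3.8 or later):

```python
from math import comb

def alternating_binomial_sum(k):
    """Sum over m = 0..k of (-1)^m * C(k, m), i.e. (1 - 1)^k."""
    return sum((-1) ** m * comb(k, m) for m in range(k + 1))

# Net number of times a set W of size k is counted in Q0 - Q1 + Q2 - ...
net_counts = [alternating_binomial_sum(k) for k in range(6)]
print(net_counts)  # [1, 0, 0, 0, 0, 0]
```

Only the empty W (k = 0) keeps a net count of one, which is why the alternating sum collapses to the single term for W empty.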
Hall M. (1967) Combinatorial Theory. New York: John Wiley & Sons.
Weir BS, Triggs CM, Starling L, Stowell LI, Walsh KAJ, Buckleton J. (1997) Interpreting DNA mixtures. J For Sci. 42(2):..-220