Probable Race of a Stain Donor

Proceedings from the Seventh International Symposium on Human Identification 1996, Promega Corp 1997, pp 48-52

Charles H. Brenner, Ph.D.


Sometimes it would be useful to know the race of a stain donor. Any DNA typing provides some evidence, provided that population data of fragment sizes for the races in question are available. Quantitative estimates are given of how good that evidence is likely to be, and ideas are discussed for the best calculations to extract the evidence. As a rough conclusion, distinguishing a Caucasian from an African-American source can usually be done confidently. Distinguishing Caucasian from Hispanic is more problematic.

The ratio of profile frequencies for the same profile in different races is a likelihood ratio. By standard Bayesian reasoning it quantifies the evidence that the profile comes from one race rather than the other. Moreover, nothing in the reasoning depends on what "profile" means. Therefore the best strategy for computing a "profile" frequency may differ from standard casework methods. If another formula gives better predictive value, it is preferable, so long as it gives the frequency of something, anything.


The object of the exercise in this paper is to guess the population of origin of a given genetic profile. If populations are isolated from one another for long enough, then through genetic drift and mutations they become statistically differentiated by their differing allelic frequencies.

Suppose that a certain DNA profile P is calculated to occur at the rate of fA=1/5000 for population A and fB=1/50000 for population B, and that there are two million equally likely suspected donors divided equally among the two populations. Then there are about 200 A-people and 20 B-people who would match the profile, so it is 10:1 that the donor is an A. In short, the ratio fA/fB behaves as a likelihood ratio expressing the evidence that the donor is A rather than B. Evett et al (1992) and Buse et al (1993) have also made this observation.
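The arithmetic of this example can be checked directly. The following is a minimal sketch using exact fractions; the variable names are illustrative, and the numbers are the ones given in the paragraph above.

```python
from fractions import Fraction

# Profile frequencies from the example in the text.
freq_a = Fraction(1, 5000)
freq_b = Fraction(1, 50000)

# Two million equally likely donors, divided equally between the populations.
pool_per_population = 1_000_000

matches_a = freq_a * pool_per_population   # expected matching A-people
matches_b = freq_b * pool_per_population   # expected matching B-people

odds_a_over_b = matches_a / matches_b      # odds that the donor is an A
likelihood_ratio = freq_a / freq_b         # the ratio fA/fB

print(matches_a, matches_b, odds_a_over_b, likelihood_ratio)  # 200 20 10 10
```

With equal pools the odds and the likelihood ratio coincide; with unequal pools the odds would be the likelihood ratio multiplied by the ratio of pool sizes.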

Definition of frequency

It is worth exploring the reasoning in explicit detail because there is more than one way to define frequency.

The expression

$$L = \frac{\Pr(\text{person matches } P \mid \text{race} = A)}{\Pr(\text{person matches } P \mid \text{race} = B)} \tag{1}$$

is a likelihood ratio by definition, expressing the superiority of the hypothesis race=A over the hypothesis race=B. This statement is true regardless of the interpretation put on the statement "person matches P." For example, if "matching" is defined as having bands in the same FBI fixed bins, then numerator and denominator correspond to the FBI definition of frequency. That will not be a particularly useful interpretation for race-discrimination purposes, however. It overlooks any component of genetic drift that merely shuffles allelic preponderances within the same fixed bin, and is sensitive only to gross drift that manifests as changes in the total frequency of a bin. Hence by this definition L will often not be far from one.

More plausibly useful, "match" can be defined according to a ±3% window. Under this rule a frequency is estimated by counting a database fragment as "matching" a fragment in P if the two are within 3% in size. L is then the ratio of frequencies defined according to what has been dubbed a "floating bin" of size ±3%. A smaller window would give still greater discrimination between races, but the limiting concern is the precision of measurement. After all, the matching comparisons are made not between actual fragment sizes but only between measurements. So as the window becomes small, the likelihood ratio test becomes less a test of similarity between fragments and more a test of similarity between random noise.
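The window rule can be sketched in a few lines. This is an illustration only, not the software used for the paper: the function name window_frequency, the two small databases and the choice to take the 3% relative to the profile fragment's size are all assumptions made for the example.

```python
def window_frequency(q, database, window=0.03):
    """Fraction of database fragments whose size lies within +/- window of q.

    The window is taken relative to q (an assumed reading of "within 3%").
    """
    hits = sum(1 for size in database if abs(size - q) <= window * q)
    return hits / len(database)

# Hypothetical fragment-size databases for two populations.
db_a = [2550, 2600, 2640, 2700, 3100, 3500]
db_b = [2100, 2300, 2650, 3000, 3200, 3600]

q = 2630  # size of one profile fragment
freq_a = window_frequency(q, db_a)
freq_b = window_frequency(q, db_b)
band_ratio = freq_a / freq_b  # per-band likelihood ratio, A versus B

print(freq_a, freq_b, band_ratio)
```

A fragment with no database neighbours inside the window would give a zero frequency, and a zero in the denominator makes the ratio undefined, so a real calculation needs a rule for that case.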

It is therefore desirable to define "match" in such a way as to compensate for measurement error. The appropriate mathematical expression is implicit in Gjertson et al (1988) and several later papers. Let $q_1, \ldots, q_N$ be the sizes in the database, and let $\varphi$ be a normal density with mean zero and standard deviation chosen to imitate the variation from repeat measurements of the profile fragment $q$. Then

$$f(q) = \frac{1}{N}\sum_{j=1}^{N} \varphi(q - q_j) \tag{2}$$

is a "Bayesian" frequency for $q$, calculated by counting each database fragment $q_j$ on a pro rata basis according to its relative chance to be measured as $q$. Evett et al (1992) also mentioned a formula of this sort.
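The pro rata counting just described can be sketched as a sum of normal weights over the database. This is a hedged illustration: the names normal_density and bayesian_frequency, the databases, and the measurement standard deviation of 1% of fragment size (rel_sigma) are assumptions, not values from the paper.

```python
import math

def normal_density(x, sigma):
    """Density at x of a normal distribution with mean zero and s.d. sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def bayesian_frequency(q, database, rel_sigma=0.01):
    """Average over the database of each fragment's weight for being measured as q.

    rel_sigma, the measurement s.d. as a fraction of q, is an assumed value.
    """
    sigma = rel_sigma * q
    return sum(normal_density(q - qj, sigma) for qj in database) / len(database)

# Hypothetical fragment-size databases for two populations.
db_a = [2550, 2600, 2640, 2700, 3100, 3500]
db_b = [2100, 2300, 2650, 3000, 3200, 3600]

q = 2630
band_ratio = bayesian_frequency(q, db_a) / bayesian_frequency(q, db_b)
print(band_ratio)
```

As written, the function returns a density per unit of size rather than a proportion. Any common scale factor cancels in the ratio between two populations, which is the quantity of interest here.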

Figure 1 illustrates the difference between the last two approaches. Frequencies as calculated by the Bayesian method are the lower, more sharply undulating curve. For present purposes it is the fact that it undulates more sharply that makes it more useful.

An example

Table 1 is based on the DNA types from a habitual rapist of the St. Louis area, popularly known by the sobriquet "gentleman rapist." He managed to disguise himself sufficiently that his victims were unable to identify his race with confidence, and the demographics of that part of the country are about half-and-half, black and white. The types based on five RFLP probes are shown in the table. The likelihood ratios, like frequencies themselves, can be computed on a band-by-band basis, and the table exhibits the band-by-band frequency ratios. The profile, interpreted as a set of ±3% match windows, is thus seen to be 45 times more common among whites than among blacks. Alternatively, the Bayesian interpretation is that the profile, considered as the exact reported measured results (taking a reasonable amount of measurement uncertainty into account), is 360 times more characteristic of whites than of blacks. This strongly confirms what the police already suspected.

Expected performance and validation

It is of interest to give some kind of statistical prediction as to how good a job one can expect to do in general. Is the example in the previous section a lucky hit, or is it typical?

One way to make such a prediction is to simulate a large number of cases and then summarize them. That is, select two races or ethnic groups for which allele frequency databases are available across several loci. Simulate a case by drawing a profile to represent the first race, using the frequencies for that race. Compute the likelihood ratio from the frequencies of that profile in both races. Then construct an appropriate statistical summary.
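The simulation loop can be sketched as follows. This is a toy version under loud assumptions: the frequency tables FREQ_A and FREQ_B are invented, every band is drawn from the same three-allele table, and bands are treated as independent. A real study would use the per-locus databases instead.

```python
import math
import random

# Hypothetical allele frequencies in two populations (not real data).
FREQ_A = {"short": 0.5, "medium": 0.3, "long": 0.2}
FREQ_B = {"short": 0.2, "medium": 0.3, "long": 0.5}

def simulate_log_ratio(rng, n_bands=6):
    """Draw n_bands bands from population A; return ln of the A:B likelihood ratio."""
    alleles = list(FREQ_A)
    weights = [FREQ_A[a] for a in alleles]
    bands = rng.choices(alleles, weights=weights, k=n_bands)
    # The profile ratio is the product of per-band ratios, so the logs add.
    return sum(math.log(FREQ_A[b] / FREQ_B[b]) for b in bands)

rng = random.Random(7)  # fixed seed so the sketch is repeatable
log_ratios = sorted(simulate_log_ratio(rng) for _ in range(2000))

median_ratio = math.exp(log_ratios[len(log_ratios) // 2])
geometric_mean_ratio = math.exp(sum(log_ratios) / len(log_ratios))
print(median_ratio, geometric_mean_ratio)
```

Working with logarithms keeps the summary stable: a handful of very large ratios would dominate an arithmetic average of the ratios themselves.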

If the statistical procedure is appropriate, then it must satisfy the following validation test. Construct a pair of databases artificially by partitioning a single population at random into two halves. The typical likelihood ratio distinguishing one half from the other must of course be unity.

At least two caveats are necessary in order to satisfy the validation experiment:

i) Include the sample profile bands in the database when computing the frequency. Of course, they are already included in the database for the race from which they are sampled. Failure to include them in the other one would create an ascertainment bias.

ii) The appropriate average of the sample likelihood ratios is the geometric, not the arithmetic mean.

Erikson et al (1991) also used these rules to compare databases, although they did not consider the racial discrimination interpretation in the present sense.

Note that point ii is equivalent to dealing with lod scores and taking an arithmetic average. Since the sum of per-locus lod scores is the lod score across several loci, it follows from the Central Limit Theorem that as the number of loci increases the distribution of likelihood ratios approaches log-normal. Applied to the validation test case, the result is, as it should be, that the likelihood ratios are log-normally distributed around unity.
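The validation test and both caveats can be tried on a toy pool. The sketch below is a simplification, not the paper's procedure: it uses discrete allele labels rather than measured fragment sizes, and the pool, its allele weights and the function name split_half_log_ratios are hypothetical.

```python
import math
import random
from collections import Counter

def split_half_log_ratios(pool, rng):
    """Halve one pool at random, then score every member of half A against both halves."""
    shuffled = pool[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    db_a, db_b = shuffled[:half], shuffled[half:]
    count_a, count_b = Counter(db_a), Counter(db_b)
    logs = []
    for allele in db_a:
        freq_a = count_a[allele] / len(db_a)              # sample is already in A
        freq_b = (count_b[allele] + 1) / (len(db_b) + 1)  # caveat i: add it to B too
        logs.append(math.log(freq_a / freq_b))
    return logs

rng = random.Random(1996)  # fixed seed so the sketch is repeatable
# Hypothetical single population: 2000 alleles of 10 types, unequal frequencies.
pool = rng.choices(range(10), weights=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10], k=2000)

logs = split_half_log_ratios(pool, rng)
geometric_mean = math.exp(sum(logs) / len(logs))              # caveat ii
arithmetic_mean = sum(math.exp(v) for v in logs) / len(logs)
print(geometric_mean, arithmetic_mean)
```

The geometric mean should land close to one, while the arithmetic mean of the same ratios can never be smaller than the geometric mean, which is why it is the wrong average for this test.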

Using a 3% window and the loci of Table 1, the likelihood ratios for distinguishing Caucasian from Black follow approximately the distribution of Figure 2. The St. Louis case turns out to be exactly typical: a likelihood ratio of at least 45 is achieved half the time. The one-standard-deviation point, with 84% of the distribution to its right, is marked at a likelihood ratio of 5.3. Assuming a 50% prior probability for each race, this means that there is an 84% chance to guess correctly with at least 84% (=5.3/6.3) confidence.
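The step from a likelihood ratio to a confidence figure is ordinary Bayesian updating of odds. A minimal sketch follows; the function name posterior_probability is illustrative, and the inputs are the two likelihood ratios quoted above.

```python
import math

def posterior_probability(likelihood_ratio, prior=0.5):
    """Posterior probability of the favoured race, given a prior probability for it."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = likelihood_ratio * prior_odds
    return posterior_odds / (1.0 + posterior_odds)

typical_case = posterior_probability(45)    # median likelihood ratio
one_sd_case = posterior_probability(5.3)    # one-standard-deviation point

# Spread of the log-normal distribution implied by those two quoted points:
# the standard deviation of ln(likelihood ratio).
log_sd = math.log(45 / 5.3)

print(round(typical_case, 3), round(one_sd_case, 3), round(log_sd, 2))
```

With even prior odds the posterior probability is simply LR/(1+LR), which is where 5.3/6.3 comes from. A different prior only rescales the odds before the same conversion.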


The foregoing distribution of likelihood ratios was obtained without even using the Bayesian method (formula 2), which may well be a dramatic improvement. Regardless, it seems that the race of a stain donor can be inferred with fair confidence most of the time. It is yet to be seen whether the Bayesian method is sufficiently powerful to distinguish Caucasian from Hispanic with a useful degree of confidence. Certainly the window method is not.

Work is in progress (based on the data discussed in Meyer et al, 1995) to evaluate the effectiveness of STRs for distinguishing racial groups. Direct comparisons are hard to make, in part because the Meyer data are for sundry populations around the world and not US groups, but broadly speaking it seems that STRs may work about as well "per unit heterozygosity." That is, one must use more STRs for the same effect in distinguishing individuals, and must also use more STRs, in roughly the same proportion, to distinguish races.


I am grateful to Dr. Keith Monson and to Prof. Bernhard Brinkmann for making data compiled by their laboratories available to me for this study. I also wish to thank Kim Gorman for the St. Louis rapist data.


Buse EL, Houlihan B, Hartmann J (1993) Investigation of the Feasibility of Inferring Racial and Ethnic Origin from Fixed-bin DNA Profiles (poster abstract), Proceedings from the Fourth International Symposium on Human Identification 1993, Promega Corporation, pp 214-215

Erikson B, Svensmark O (1994) DNA polymorphism in Greenland, Int J Legal Med 106:254-257

Evett IW, Pinchin R, Buffery C (1992) An investigation of the feasibility of inferring ethnic origin from DNA profiles, JFSS 32(4):301-306

Gjertson D, Mickey R, Hopfield J, Takenouchi T, Terasaki P (1988) Calculation of Probability of Paternity Using DNA Sequences, Am J Hum Genet 43:860-869

Meyer E, Wiegand P, Brinkmann B (1995) Phenotype differences of STRs in 7 human populations, Int J Legal Med 107:314-322

Table 1. St. Louis rapist likelihood ratios. "Likelihood ratio" is the product of the per-band likelihood ratios. "Likelihood ratio per band" is the geometric mean of the per-band ratios (i.e. the single value that, if substituted for every per-band value, gives the same product). The "Bayesian" column is probably legitimate, but only the "3% window" algorithm has been validated in this paper.

                                          Caucasian vs. Black       Caucasian vs. Hispanic
Locus                     Band size (bp)  3% window    Bayesian     3% window
D2S44                     2630            1.6          1.5          0.91
D1S7                      4300
D17S79                    1520
D4S139                    11600
D5S110                    3610
D10S28                    1780
Likelihood ratio                          45           360          1.33
Likelihood ratio per band                 1.41         1.71         1.03
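The relation between the last two rows of Table 1 can be checked numerically. In the sketch below, the band count of 11 is an assumption: it is not stated in the table, and was chosen because it reproduces all three tabulated per-band values. The function name per_band_ratio is illustrative.

```python
def per_band_ratio(total_ratio, n_bands):
    """Geometric mean of n_bands per-band ratios whose product is total_ratio."""
    return total_ratio ** (1.0 / n_bands)

N_BANDS = 11  # assumed band count, inferred from the tabulated values

for label, total in [("3% window, Black", 45),
                     ("Bayesian, Black", 360),
                     ("3% window, Hispanic", 1.33)]:
    print(label, round(per_band_ratio(total, N_BANDS), 2))
```

Raising a per-band value back to the power of the band count recovers the overall ratio, which is the sense in which it "gives the same product."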