The littlest database
The problem: 
Suppose you want to estimate allele frequencies
for some DNA locus. How big should the database be? Sometimes N=100
individuals (200 alleles) is suggested as a practical size. But
surely N=99 will do almost as well. And if that is so, why not N=98?
And so on. Naturally the utility gradually diminishes as N becomes
smaller. But for what value of N does the utility disappear
completely? What is the absolutely smallest database that is any use
at all? And what use is it? 

N=0 can be useful. Suppose that analysis of a crime stain reveals two
alleles, PQ. If a PQ suspect turns up, there is a definite amount of
evidence against him, even with no information about frequencies at
all. Reason: The alleles P and Q have some (unknown) frequencies in
the population; call them p and q. Now,
(i)  p = ½ + (p − ½)  and
 p + q ≤ 1  so
(ii)  q ≤ 1 − p = ½ − (p − ½),
 hence, multiplying together (i) and (ii),
 2pq ≤ 2(¼ − (p − ½)^{2}) ≤ ½,
i.e. at most ½ the population is PQ. If we can
get the same result in 10 loci, then, treating the loci as independent,
the suspect is narrowed down to at most 1 person in 1024 (since
(½)^{10} = 1/1024) who matches the stain. Not bad for no databases!
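
The bound can also be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the grid size are illustrative choices, the genotype frequency is taken to be 2pq as in the text, and the loci are assumed independent when the per-locus bounds are multiplied.

```python
from fractions import Fraction


def max_heterozygote_frequency(steps=100):
    """Largest value of 2pq over a grid of allele frequencies.

    The grid covers every p, q in {0, 1/steps, ..., 1} with p + q <= 1.
    (Hypothetical helper; 'steps' is an arbitrary grid resolution.)
    """
    best = Fraction(0)
    for i in range(steps + 1):
        p = Fraction(i, steps)
        # q may range from 0 up to 1 - p, so that p + q <= 1
        for j in range(steps + 1 - i):
            q = Fraction(j, steps)
            best = max(best, 2 * p * q)
    return best


def combined_bound(loci):
    """Upper bound on the matching fraction across 'loci' loci.

    Assumes each locus is bounded by 1/2 and that loci are independent.
    """
    return Fraction(1, 2) ** loci


print(max_heterozygote_frequency())  # 1/2, reached at p = q = 1/2
print(combined_bound(10))            # 1/1024
```

The grid search never exceeds ½, and reaches it only when p = q = ½, which agrees with the inequality above.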