Paternity calculation with 3-banded pattern

The (trio) problem

Sometimes the child and alleged father in a paternity case each appear to have three alleles at a particular locus. For example, suppose the genetic evidence E is

Mother	P	Q
Child		Q	R	S
Man			R	S	T

Apparently the man and the child share a genetic trait – quite a peculiar one in fact. Therefore the types at this locus are evidence favoring paternity. But how much evidence? How do we quantify it?

Considerations

It is impossible to make any sort of computation without knowing (or at least assuming) something about the underlying biology.

"... can now pretty well be assumed ..."

February 2014 — I was wrong about that.

I was half right — it's not necessarily (or even often?) trisomy. But it didn't occur to me that there's a third possibility until I read the brilliant article
Lane AB. “The nature of tri-allelic TPOX genotypes in African populations.” Forensic Sci Int Genet. 2008 Mar;2(2):134-7.

It reads like a statistical detective story, gradually revealing the explanation for the observed data.

In summary, I can now think of at least three explanations for tri-allelism:

Tandem duplication. The person has the usual two chromosomes, one of which carries an extra allele.

That's the model this web page was based on.

Trisomy (as in Down syndrome)
Extra allele on a completely different chromosome (Lane's explanation)
De novo mutation

Each would lead to a different calculation.

What is it?

Fortunately it can now pretty well be assumed that at least the vast majority of instances of a 3-banded pattern are due to two bands coming from one chromosome (i.e. not trisomy). In the RFLP case, we can assume that a mutation has introduced an extra cleavage site within the tandem repeat region.

In any case, we will assume that two of the bands are transmitted as a unit. Let's call these two linked bands a doublet.

Frequencies of traits

I have heard reports that, in the population of chromosomes that show the extra band, certain band sizes are very common and typical. This would be expected if the cleavage mutation occurred fairly recently and only once or a few times, and if there has therefore not been much opportunity for subsequent length-altering mutations to create a great variety of band sizes among bands that are part of a doublet.

Certainly there is no reason to expect any relationship between the distribution of doublet band sizes and of non-doublet band sizes.

Bad strategy

Therefore I don't agree with the approach that sometimes seems tempting, to calculate a paternity index based on treating one or the other of the doublet bands as a normal band. The frequency that will be obtained for that one band will surely come from a database of normal chromosomes only, and therefore conceivably it will be unfairly small. It certainly will be irrelevant.

Preferred strategy

A better strategy, I think, is to concentrate on the obvious genetic peculiarity – the extra cleavage site (or whatever causes the doublet). That is a sufficiently rare trait that by regarding it, and even ignoring the specific band sizes of the doublet (except if they happen to exclude the man), we probably can obtain sufficient and useful information from the locus while making justifiable and conservative assumptions.

In other words, think of all instances of the locus that have a doublet as belonging to a class of alleles that we shall call M (for cleavage-mutation). Sometimes we will distinguish different doublets with subscripts: M_PS, for example, means a doublet with individual bands P and S.

Solution to Trio Problem

For the typical case mentioned above, with evidence E, we note that the mother passed a Q to the child, and the biological father therefore passed a chromosome containing RS. That is, we take the view that he passed the trait M, and more specifically that he passed M_RS.

The tested man certainly has M, but what kind of M? His genotype can be any of RM_ST, SM_RT, or TM_RS. If we had no data about the relative frequencies of the different varieties of M, then we would assume that they are all equally common. In that case it would be right to say that the relative frequencies of the three possible genotypes are in proportion r:s:t (where r, s, and t represent the frequencies of the corresponding alleles) – hence that
(1) the chance that the man is TM_RS is t/(r+s+t).
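The reasoning behind (1) can be checked numerically. The following is only a minimal sketch: the function name and the example frequencies are hypothetical, and it relies on the same assumption as the text, namely that all varieties of M are equally common, so that each genotype is weighted by the frequency of its one normal allele.

```python
from fractions import Fraction

def doublet_genotype_probs(r, s, t):
    """Probabilities of the man's three possible genotypes.

    Assumes (as in the text) that the doublet varieties M_ST, M_RT and
    M_RS are equally common, so the genotypes are in proportion r:s:t.
    """
    total = r + s + t
    return {
        "RM_ST": r / total,
        "SM_RT": s / total,
        "TM_RS": t / total,
    }

# Hypothetical allele frequencies, chosen only for illustration.
probs = doublet_genotype_probs(Fraction(1, 10), Fraction(1, 5), Fraction(1, 10))
print(probs["TM_RS"])  # 1/4, the value of t/(r+s+t) for these inputs
```

Exact fractions are used so that the three probabilities visibly sum to 1.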

The fact that M_RS has been observed at least once – in the child – means that (1) is "conservative". It is an underestimate for the chance that the man has the requisite paternal variety of M. The probability for the man to transmit this requisite allele is therefore at least
(2) P(man transmits RS) > t/(2·(r+s+t)),

and we can regard (2) as the numerator, X, of the PI (paternity index) in this case.

The denominator, Y, is the frequency in the population of chromosomes of the trait M_RS. Probably the data will not be available to estimate that frequency. However, obviously P(M_RS) < P(M), so an adequate (conservative) estimate for Y is P(M). In summary,

PI = X/Y > t/(2·(r+s+t)·P(M)).
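Dividing the numerator X from (2) by the conservative denominator Y = P(M) gives a lower bound on the PI, which might be computed as follows. This is a sketch under the assumptions already stated in the text; the function name, argument names, and example inputs are hypothetical.

```python
def conservative_pi(r, s, t, p_m):
    """Lower bound on the paternity index for the trio case.

    X = t / (2 * (r + s + t)) is the lower bound from (2): the chance the
        man is TM_RS, times 1/2 for transmitting the doublet chromosome.
    Y = p_m is the frequency of the whole class M, used as a conservative
        stand-in for the unknown frequency of M_RS.
    """
    if min(r, s, t) <= 0 or p_m <= 0:
        raise ValueError("frequencies must be positive")
    x = t / (2 * (r + s + t))
    return x / p_m

# Hypothetical inputs: allele frequencies r, s, t and a doublet frequency P(M).
print(conservative_pi(0.10, 0.20, 0.10, 0.005))
```

Because Y is overestimated and X is underestimated, the returned value errs in the direction of understating the evidence for paternity.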

Rough estimate

There is no reason to expect any of R, S, or T to be more or less common than the others, so on average t/(r+s+t) = 1/3. If, as seems likely, we can be confident that fewer than one person in 100 exhibits M, then, since each person carries two chromosomes, P(M) is less than about 1/200. Therefore the recommended approach will probably give PI > 33 (that is, (1/3)/2 divided by 1/200), no doubt a vast underestimate, but useful.
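The arithmetic of the rough estimate can be reproduced in a few lines. The sketch below is illustrative only: the function and parameter names are hypothetical, and it takes the per-chromosome frequency P(M) to be half the per-person rate, which is an approximation that holds for a rare trait.

```python
from fractions import Fraction

def rough_pi(share_t, carriers_per_person):
    """Rough lower bound on PI.

    share_t is the average value assumed for t/(r+s+t).
    carriers_per_person is the assumed upper limit on the fraction of
    people who exhibit M.
    """
    # Each person has two chromosomes, so for a rare trait the frequency
    # per chromosome is taken as half the frequency per person.
    p_m = carriers_per_person / 2
    # The factor 1/2 is the chance that the man transmits the doublet.
    return share_t / 2 / p_m

pi = rough_pi(Fraction(1, 3), Fraction(1, 100))
print(pi)  # 100/3, i.e. a little over 33
```

Halving the assumed carrier rate doubles the bound, so the figure of 33 depends directly on how rare M is taken to be.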