Table of contents

Haplotype DNA evidence
  1. Y-chromosome analysis
    1. identity
    2. paternity – ordinary case
    3. paternity – mutation
      1. child-centric approach
      2. father-centric approach
    4. Which is right?
    5. Pragmatic estimate

Analysis of Y-haplotype information in a kinship case
Forensic mathematics home page
Comments are welcome (see home page for email)

Haplotype DNA evidence

A Y-chromosome or a mitochondrion carries genetic information that is useful for identification or for kinship problems such as paternity attribution, but it has to be treated somewhat differently from the more typical nuclear (= autosomal) DNA profile, for two reasons:
  1. Several markers are linked, i.e. physically chained and inherited together without recombination, so they must be considered as a unit. The product rule doesn't apply at all.
  2. The genetic rules are simpler – the trait is either known to be passed or known not to be passed (depending on the sexes involved) to each offspring; there are no choices or 50% probabilities of transmission as with nuclear DNA.

  1. Y-haplotype in identification and paternity
    I discuss here the basic principles of using Y-haplotype information for identity or paternity.

    1. Identity
      Suppose suspect and crime stain have the same Y-chromosome haplotype. That result is normal and expected (i.e. has probability 100%) if the suspect is the donor; if the suspect is instead a random man, unrelated to the donor, it has the probability of seeing the haplotype among random men.

      The strength of the evidence is therefore simply expressed as matching odds (or equivalently as a likelihood ratio) of

      matching odds = 1 / P(haplotype).

    2. Paternity – ordinary case
      Typically father and son share a Y-haplotype, just as if the son were a crime stain. Therefore in the typical case the equation above also gives the paternity index:

      PI = 1 / P(haplotype).
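
As a minimal sketch of the two formulas above, the following Python computes the matching odds (and hence the ordinary-case paternity index) from a haplotype probability. The function name and the example probability of 0.01 are illustrative assumptions, not values taken from any database.

```python
def matching_odds(p_haplotype: float) -> float:
    """Matching odds (likelihood ratio) 1 / P(haplotype).

    Numerator: the match is certain (probability 1) if the suspect is the
    donor or, ignoring mutation, if the man is the father.
    Denominator: the probability of seeing the haplotype in a random man.
    """
    if not 0.0 < p_haplotype <= 1.0:
        raise ValueError("haplotype probability must be in (0, 1]")
    return 1.0 / p_haplotype


# Hypothetical haplotype probability, chosen only for illustration.
example_odds = matching_odds(0.01)
print(f"matching odds = PI = {example_odds:.0f}")
```

The rarer the haplotype, the larger the odds; the open question, taken up further down, is how to assign P(haplotype) in the first place.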

    3. Paternity – mutation
      Of course that's not 100% true; there are mutations. Available data suggest that the mutation rates and behavior of STR loci on the Y-chromosome are typical for the genome: around μ = 1/400 per locus per generation for single-step mutations, but with a lot of variation depending on the locus.

      Suppose a man M has Y-haplotype which we call YM and a boy C has the type YC which differs from YM by a single step at just one locus.

      Obviously, mutation cannot be ignored in this case. Here μ is the probability of any mutation, but nearly all (90-95%) STR mutations are one-step, and expansion and contraction are about equally common. To a reasonable approximation, then, the probability of the specific one-step mutation that turns YM into YC (or, equally, YC into YM) is μ/2.

      There are several possible approaches. We use the notation PI for the paternity index, and
      PI = X/Y, where
      X = Prob(observed haplotypes | M is the father of C) and
      Y = Prob(observed haplotypes | M is unrelated to C).

      To evaluate Y, we can write
      Y = mc where
      m=Prob(YM) and
      c=Prob(YC).

      X is a little more problematic.

      1. child-centric approach
        The child has YC, inherited from his father. A mutation between YC and YM may have occurred, with probability μ/2. Therefore, given that a child is type YC, the probability is approximately μ/2 that his father is type YM.

        Hence
        X = cμ/2 and
        PI = X/Y = (cμ/2)/(mc) = μ/(2m).

        It remains to estimate m.

      2. father-centric approach
        In a symmetrical way we could begin with the alleged father, and obtain instead the formula
        PI = μ/(2c).
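
The two approaches can be compared numerically with a short sketch. It builds X and Y from the definitions above (Y = mc, and X = cμ/2 for the child-centric view) and, as an assumption following the "symmetrical" argument, takes X = mμ/2 for the father-centric view. The function names and the example values of m and c are hypothetical; μ = 1/400 is the rough per-locus rate quoted earlier.

```python
def pi_child_centric(mu: float, m: float, c: float) -> float:
    """PI = X/Y with X = c * mu/2 (start from the child's haplotype)."""
    x = c * mu / 2   # child is YC, and his father is one step away at YM
    y = m * c        # unrelated man and child: independent haplotypes
    return x / y     # simplifies to mu / (2 * m)


def pi_father_centric(mu: float, m: float, c: float) -> float:
    """PI = X/Y with X = m * mu/2 (assumed symmetric form, start from the man)."""
    x = m * mu / 2   # man is YM, and his son is one step away at YC
    y = m * c
    return x / y     # simplifies to mu / (2 * c)


mu = 1 / 400         # approximate per-locus mutation rate from the text
m, c = 0.02, 0.01    # hypothetical haplotype probabilities

print("child-centric PI :", pi_child_centric(mu, m, c))
print("father-centric PI:", pi_father_centric(mu, m, c))
```

The two answers differ by exactly the ratio of the two haplotype probabilities, and they coincide when those probabilities are equal.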

    4. Which approach is right? How to estimate c and/or m?
      Deep questions. What is right depends on such things as what you think the population database represents – the grandfather's generation? the child's? If the population were in drift and mutation equilibrium, then I suppose all methods would give the same answer.

      1. Pragmatic estimate of the Y-haplotype evidence
        Note that all formulas are equivalent if c = m. Therefore, to be conservative, let's take the uncle-centric view and take c = 2/171.
        Hence LR = 3·0.009 / (2·(2/171)) ≈ 1.15.

        The meaning of this neutral result is that the chance to see so rare a haplotype by mutation is about the same as the chance to see it at random in an unrelated individual.
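
The arithmetic of the pragmatic estimate can be checked directly. The sketch below simply re-evaluates the numbers quoted above (3, 0.009 and 2/171); the variable names are illustrative, and exact fractions are used to avoid rounding error.

```python
from fractions import Fraction

# Numerator factor as quoted in the text: 3 * 0.009
mutation_term = 3 * Fraction(9, 1000)

# Haplotype probability as quoted in the text: c = 2/171
haplotype_prob = Fraction(2, 171)

# LR = (3 * 0.009) / (2 * (2/171))
lr = mutation_term / (2 * haplotype_prob)

print(f"LR = {lr} = {float(lr):.2f}")
```

A value this close to 1 is what makes the result "neutral": the evidence barely shifts the odds either way.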

  2. Approach to "frequencies"
    The frequency of an unobserved trait is impossible to know. Fortunately frequency isn't the question. Probability is.

    My recent paper on rare haplotypes offers several approaches. Bottom line: simple counting (but add 1) is very conservative. A pretty accurate method that is not complicated is also given. June 2009
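
As a rough sketch of "simple counting (but add 1)", one reading is to estimate the haplotype probability as (x + 1)/(n + 1), where x is the number of times the haplotype appears in a database of n haplotypes. That reading, the function name, and the example counts are assumptions made for illustration; the paper should be consulted for the estimators it actually recommends.

```python
from fractions import Fraction


def counting_estimate(times_seen: int, database_size: int) -> Fraction:
    """Assumed 'count but add 1' estimate: (x + 1) / (n + 1).

    Adding 1 to both counts treats the evidence haplotype as one extra
    observation, so even a never-before-seen haplotype gets a nonzero
    probability.
    """
    if times_seen < 0 or database_size < times_seen:
        raise ValueError("need 0 <= times_seen <= database_size")
    return Fraction(times_seen + 1, database_size + 1)


# Hypothetical database: haplotype never seen among 999 men.
p = counting_estimate(0, 999)
print(f"P(haplotype) estimate = {p}, matching odds = {1 / p}")
```

Under this reading a previously unseen haplotype can never yield odds larger than the database size plus one, which is one way to see why simple counting is conservative.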

