WTC DNA identification prospectus | Analysis of screening; Powerpoint presentation | Tsunami victim identification considerations | Forensic mathematics home page
Back at the hotel I had switched on the TV in time to see some of the live action and numerous replays of now-familiar events, accompanied by surprisingly little voice-over aside from a short period during which a smug BBC commentator analyzed the context and future significance of the goings-on until mercifully he was given the hook.
Howard's immediate thought was the application of the Kinship program, but I thought I had even more to offer. There was going to be a lot of genetic data to manipulate, whose exact nature couldn't be predicted in advance. As one who has worked with computers since 1959, earned a doctorate in mathematics, and done dozens of practical or research projects involving DNA-relationship ideas and computations, I figure I am uniquely prepared to perform whatever manipulations and analysis might be necessary to wring information from the data.
Swissair identification paradigm
From earlier experience in disaster identification with the Swissair 111 crash, I assumed that there would be a necessary "screening" step in making the WTC identifications based on relatives; further, I could extrapolate that, due to the larger scale of the WTC problem, new complexities would need to be faced (links WTC prospectus and WTC Powerpoint above).
Some of the victims of the Swissair crash needed to be identified indirectly, by comparison with living (or dead) relatives. A two-step paradigm emerged:
The main difficulty that I foresaw emerging as the sizes of the two lists (the victim list and the family reference list) grow is the increasing incidence of "false positives." If both lists are small and some person C in the reference list has a brother who died, and some profile V in the victim list looks like a brother of C, it probably is. However, if the victim list has thousands of profiles, then for any given reference person C there will be dozens of victim profiles that coincidentally resemble C just as much as a typical true brother does. The expected number of false positives for each reference person is proportional to the size of the victim list.
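The scaling argument above can be put in numbers with a small sketch. It is only an illustration: the function names are hypothetical, and the chance-match rate of 1 in 100 is an assumed figure chosen so that a list of a few thousand profiles yields "dozens" of coincidental sibling look-alikes. It is not a measured value.

```python
def expected_false_matches(n_victims: int, chance_match_rate: float) -> float:
    """Expected number of unrelated victim profiles that coincidentally
    pass the screen for ONE reference person (simple proportionality)."""
    return n_victims * chance_match_rate


def false_lead_fraction(n_victims: int, chance_match_rate: float,
                        true_matches: int = 1) -> float:
    """Fraction of screening hits that are coincidental, assuming the
    reference person has `true_matches` real relatives in the victim list
    and that every real relative passes the screen (optimistic)."""
    false_hits = expected_false_matches(n_victims, chance_match_rate)
    return false_hits / (false_hits + true_matches)


if __name__ == "__main__":
    RATE = 0.01  # ASSUMPTION: 1 unrelated profile in 100 resembles a sibling
    for n in (50, 500, 3000):
        print(n,
              expected_false_matches(n, RATE),
              round(false_lead_fraction(n, RATE), 3))
```

With a short victim list the single true relative dominates the hits. With thousands of profiles nearly every hit is a coincidence, which is why a plain sort-and-look approach stops being useful at that scale.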
Therefore I was sure that a simple sorting program of the sort that
had been adequate for the Swissair identifications, would not be very
useful for the WTC disaster. Specifically, in my London talk on
Of course the first several of my estimates have proven to be quite far off.
However, #4-8 are in the ballpark. The implication of estimate #7 is that, for
every 1000 victims, there will be about one who coincidentally resembles any
given reference person to the same extent as does a true child. Thus, using
individual parents as references to fish victims out of the rubble would result
in more false leads than true ones. On the other hand, #5 implies that if a more
sophisticated trolling operation is used, wherein two reference parents are
simultaneously compared with each victim to accomplish a sort of
The upshot was, on
At some juncture, concerned that the plans might be steering toward an unnecessary and ponderous software project, I made a comment to the same effect as I have indicated above: that once I was able to get my hands on the data, I would quite quickly be able to produce the tentative identifications by myself. At this Howard Cash piped in, "Surely, Charles, even your work can stand a second opinion." I told him he had a fair point.
The three-day meeting ranged over a variety of topics. The one topic originally mentioned to me was the same that Bob Shaler had already asked of me: choose which screening program to use. To that end I put together a Powerpoint presentation to explain the difficulties and pitfalls as I foresaw and, by now, had computed.
In assessing the candidate screening programs, I had in mind several design requirements:
Next on the list was another identical twin case. This one (or at any rate at least one of the two) was a new identification, the first I had found. The third candidate identification on the list was a case where the mother, the daughter, and a brother had presented themselves as references. The screening report only told me that two of these people bore a resemblance to the same victim. To confirm the identity of that victim, I needed to make a family-specific computation with the Kinship program to check that the entire assemblage was genetically consistent and numerically convincing.
It was. According to the kinship computation, it was either the right victim, or it was a one-in-twenty-billion coincidence. That's good enough to call it a confirmed identification.
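To give a feel for how a figure like "one in twenty billion" can arise, here is a minimal sketch of a likelihood-ratio computation. It is not the Kinship program and it is much simpler than the family-specific computation described above: it handles only a single parent-child pair, ignores mutation, and assumes Hardy-Weinberg proportions and independent loci. The locus names, alleles, and allele frequencies are all made up for illustration.

```python
from math import prod


def parent_child_lr(child, parent, freq):
    """Likelihood ratio at one locus:
    P(child genotype | `parent` is a true parent) divided by
    P(child genotype | the two people are unrelated).
    `child` and `parent` are 2-tuples of allele names; `freq` maps
    allele name -> population frequency (hypothetical values here)."""
    a, b = child
    # Chance of the child's genotype in a random unrelated person.
    p_unrelated = freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]
    # Chance of the child's genotype given the parent: each parental
    # allele is transmitted with probability 1/2, and the other allele
    # is drawn from the population.
    p_related = 0.0
    for transmitted in parent:
        if a == b:
            if transmitted == a:
                p_related += 0.5 * freq[a]
        else:
            if transmitted == a:
                p_related += 0.5 * freq[b]
            if transmitted == b:
                p_related += 0.5 * freq[a]
    return p_related / p_unrelated


def combined_lr(pairs, freqs):
    """Multiply per-locus ratios, assuming the loci are independent.
    `pairs` maps locus -> (child genotype, parent genotype)."""
    return prod(parent_child_lr(c, p, freqs[locus])
                for locus, (c, p) in pairs.items())


if __name__ == "__main__":
    # ASSUMPTION: two invented loci with invented allele frequencies.
    freqs = {"locusA": {"12": 0.1, "13": 0.2, "14": 0.7},
             "locusB": {"8": 0.25, "9": 0.75}}
    pairs = {"locusA": (("12", "13"), ("12", "14")),
             "locusB": (("8", "8"), ("8", "9"))}
    print(combined_lr(pairs, freqs))
```

Each locus contributes only a modest factor, but the factors multiply. Over a full profile and several relatives the product can become very large, and a single locus with no shared allele drives it to zero under these no-mutation assumptions.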
And so it went, easily, for the first thirty or so cases, that
We discussed the coding schemes that would be used for the airplane victims, and I considered what minor modifications would be needed. Plane crashes, unlike office environments, tend to include related people. It's important to know about them and to consider them in the analysis. Shaler and his colleagues were already thinking about how to improve on the WTC experience in collecting samples from relatives. It may seem a ghoulish observation, but an experienced disaster identification team was swinging into action.
We left New York on
I believe that every disaster is unique. Contrary to the hope of a few of the KADAP group, I don't believe it is practical or realistic to expect a "disaster identification" program to be a result of the current identification effort, or efforts. Useful tools, yes. Worthwhile experience, also.
Most of the AA587 victims were identified within a few weeks, which is sensational. Announcement of the final identification took about three months, as inevitably a few cases were delayed by special problems.
No new victims will be found. It will take a few weeks for the DNA profiles of the most recently excavated victim pieces to be reported. Once those have been checked against the reference materials, direct or indirect, already in hand, I expect a pause in the identifications. Further progress will depend mainly on success in the new DNA technologies that are being attempted, namely SNP's and mtDNA.
Last July Bruce Weir suggested that I submit an article on World Trade identifications to an edition of Theoretical Population Genetics (as the name implies, a rather high-brow scientific journal) that he was editing. He thought "it would have high interest," and so it proved. The article, completed with Bruce's considerable help and co-authorship, attracted what is by my standards a lot of interest. I gave interviews to a German radio station (in English) and a newspaper, to the children's science magazine Odyssey, and to Nature.com.
One point I made in the article was an estimate of the maximum number of bodies for which any DNA has been found. I did this probabilistically, trying to account for those victim fragments that produced any DNA at all. If we then make the optimistic assumption that all such fragments can eventually, through ultimately sophisticated DNA typing methods, be identified, as many as 2100 victims might eventually be identified. Perhaps a better way to put it is that no more than 2100 will be found. Dr. Hirsch, the NY Chief Medical Examiner, has expressed his commitment to continue with the work possibly for a long time, with a goal in mind of 2000 identifications. I agree that even that number seems very difficult.
Orchid Biosciences has developed a high-speed, high-throughput, largely automated method for SNP typing, which is being used for WTC samples.
For SNP's, the region of interest is a single nucleotide, which considerably reduces the fragment size. Moreover, the method by which multiple assays are achieved is entirely different, so there is no need for a spacer in the flanking region. Consequently SNP's may succeed where STR's fail when the DNA has degraded to the point where the typical fragment size is around 100 bp.
Over 5000 samples, some from victims, some from living relatives, have been typed. The attempted panel of typing is 70 loci. Given that many of the samples are degraded, as we already know from the mixed success of STR typing, there is mixed success in SNP typing. Often only a partial profile is obtained. However, there are certainly times when SNP typing is quite successful even though STR typing was not. For these cases the SNP technology is quite likely to provide new identifications. Additionally, there will be cases where both technologies are only moderately successful but the combination is just good enough.