The STRmix team has been working with Dr John Butler to develop material in response to deficiencies outlined in the PCAST report and subsequently arising in a large number of court cases. These deficiencies arise largely from the restriction by PCAST and Butler to material published in the peer-reviewed literature. There is a large mass of empirical material present in the internal validations of many labs using STRmix. This material is not published in the peer-reviewed literature and would almost certainly be rejected as not novel if submitted. However, much of this material is already in the public domain or would be made available by the lab if they received a legitimate request.
At a meeting with members of PCAST and Butler on 18th November 2016 it was clear that PCAST was unaware of the difficulties we have in publishing such material. At a meeting at the National Commission on Forensic Science on 9th and 10th January 2017 it became apparent that Jim Gates (also a member of PCAST) was still unaware of our difficulties in publication.
As part of our response to the PCAST report we are assembling these internal validations and will attempt publication. In addition, some specific experimentation has been undertaken into high ratio mixtures. This was undertaken at the Forensic Science South Australia laboratory in Adelaide, South Australia. A range of mixtures was examined but we report here only the four person mixtures. They were profiled using GlobalFiler as per the manufacturer’s instructions and run on two different 3500xl CE machines using 50 rfu analytical thresholds.
Two, three, four and five person mixtures were constructed in varying proportions and amplified with varying amounts of template DNA as described in Table 1. Each experimental setup was amplified in duplicate.
|Mixture proportions|Template amounts|
|1:1:1|50, 100, 200, 400|
|16:10:5:2:1|100, 200, 400, 1000|
Table 1: Mixture setup
Profiles were analysed using STRmix™ V2.4.05 and V2.5.02. In all analyses the Y-indel locus and DYS391 were ignored. For all calculations, the product rule was used (i.e. no co-ancestry coefficient) and the point estimate has been given. LR calculations considered each person in the 194-individual database as a potential contributor, or person of interest (POI), to the mixed DNA profiles. In doing so there are comparisons to all individuals who are known to have contributed (when H1 is true) to the DNA profile and to the remainder, who are known not to have contributed (when H2 is true).
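The bookkeeping behind this design can be sketched in a few lines of Python. This is only an illustration of how the comparisons split into H1 true and H2 true sets, not the code used in the study. The identifiers (`P001` and so on), the mixture name and the function `classify_comparisons` are hypothetical, and the sketch assumes the known donors are themselves members of the database.

```python
def classify_comparisons(database_ids, mixtures):
    """Split every (mixture, person) comparison by ground truth.

    mixtures maps a mixture id to the set of ids of its known donors.
    Returns two lists: comparisons where H1 is true (the person is a
    known donor) and comparisons where H2 is true (the person is not).
    """
    h1_true, h2_true = [], []
    for mix_id, donors in mixtures.items():
        for person in database_ids:
            if person in donors:
                h1_true.append((mix_id, person))
            else:
                h2_true.append((mix_id, person))
    return h1_true, h2_true


# Hypothetical 194-person database and one hypothetical four person mixture.
database_ids = [f"P{i:03d}" for i in range(1, 195)]
mixtures = {"mix_A": {"P001", "P002", "P003", "P004"}}

h1_true, h2_true = classify_comparisons(database_ids, mixtures)
print(len(h1_true), len(h2_true))  # 4 H1 true and 190 H2 true comparisons
```

Under these assumptions each four person mixture yields four H1 true comparisons and 190 H2 true comparisons, which is why the H2 true results dominate the plots numerically.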
We present here some figures representing the results.
Figure 1. A plot of template for the contributor under consideration vs log(LR) for that contributor. The true donor values are split into those at ratios 10:1 and higher and those at lower ratios.
Figure 2. A plot of mixture proportion for the contributor under consideration vs log(LR) for the smallest (lowest mixture proportion) of the four donors.
These graphs confirm an already known fact. STRmix operates as expected at low template. It faithfully reports an uninformative result when the profile is uninformative.
PCAST, John Butler, and the People v Oral Hillary have focussed attention on low template or very high ratio mixtures. As part of the STRmix development in this regard Duncan Taylor, Jo-Anne Bright and I have been looking at mixtures where one contributor is “not there.” This may seem a strange term but we are using it to describe the situation where we have, say, a known ground truth profile known to be single source but we treat it as a two person mixture. One contributor, therefore, is not there. This is actually one of the standard tests prescribed by the very excellent SWGDAM (2015) guidelines for the validation of probabilistic genotyping systems. As time has passed these guidelines have proven their worth and we commend the committee.
We were interested in investigating some of the details of the behaviour of the LR for a contributor who is not there. To do this we compared a real single source profile with one manufactured to have perfect heights (all stutters at the perfect ratio and all heterozygotes in perfect balance). Perfect balance does not necessarily mean the same height, since there is a degradation curve and an effect of stutter. Using informed priors we forced this additional contributor to have a low mixture proportion.
Below we give the layout of the D8S1179 locus as an example.
The “perfect” stutter ratio for this allele is 5.02%. We see that the real sample has a stutter at position 11 that is slightly too large but certainly not so large that it would draw any attention. However, if we treat this as a two person profile, it is easier to explain the 11 height if there is an allele of the imaginary trace at 11.
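The arithmetic behind "slightly too large" can be sketched as follows. Only the 5.02% ratio comes from the text above. The peak heights are hypothetical values chosen for illustration, and the function names are ours. The sketch assumes the simple model in which the expected stutter height is the stutter ratio multiplied by the height of the parent allele.

```python
PERFECT_STUTTER_RATIO = 0.0502  # the "perfect" ratio quoted above


def expected_stutter_height(parent_height_rfu, ratio=PERFECT_STUTTER_RATIO):
    """Expected stutter peak height (rfu) for a given parent peak height."""
    return ratio * parent_height_rfu


def stutter_excess(observed_rfu, parent_height_rfu, ratio=PERFECT_STUTTER_RATIO):
    """Observed stutter height minus the expected height (rfu)."""
    return observed_rfu - expected_stutter_height(parent_height_rfu, ratio)


# Hypothetical heights: a 2000 rfu parent allele and a 120 rfu peak at position 11.
parent_rfu = 2000.0
observed_rfu = 120.0

print(expected_stutter_height(parent_rfu))       # about 100 rfu expected
print(stutter_excess(observed_rfu, parent_rfu))  # about 20 rfu unexplained
```

With these made-up numbers the peak is about 20 rfu higher than pure stutter predicts. That is unremarkable on its own, but it is exactly the kind of small surplus that a very low level second contributor with an 11 allele would account for.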
Below are the genotype distributions from STRmix.
|Genotype of the||Real||Manufactured|
We see that for the manufactured sample there is no sensible place to put the trace and hence STRmix distributes the weights equally. However for the real sample there is some slight advantage in having a trace with the 11 allele.
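To see what these weights mean for the LR, here is a minimal single locus sketch. It is a simplification and not the STRmix implementation: it assumes the major contributor's genotype is fully resolved, uses the product rule, and takes the LR for a POI proposed as the trace to be the weight of the POI's genotype divided by the weighted average of the genotype probabilities. The allele frequencies, the 1.5 weighting factor and all function names are hypothetical.

```python
from itertools import combinations_with_replacement
from math import log10

# Hypothetical allele frequencies at one locus (they must sum to 1).
FREQS = {"11": 0.1, "12": 0.2, "13": 0.3, "other": 0.4}


def genotype_probs(freqs):
    """Product rule genotype probabilities: p^2 or 2pq."""
    probs = {}
    for a, b in combinations_with_replacement(sorted(freqs), 2):
        probs[(a, b)] = freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
    return probs


def normalise(raw):
    """Scale raw weights so that they sum to 1."""
    total = sum(raw.values())
    return {g: w / total for g, w in raw.items()}


def locus_lr(weights, probs, poi_genotype):
    """Simplified LR for a POI proposed as the trace contributor."""
    denominator = sum(weights[g] * probs[g] for g in weights)
    return weights[poi_genotype] / denominator


probs = genotype_probs(FREQS)

# Manufactured profile: nowhere sensible to put the trace, so equal weights.
flat = normalise({g: 1.0 for g in probs})

# Real profile: genotypes containing allele 11 are slightly favoured
# (the factor of 1.5 is an arbitrary illustrative value).
tilted = normalise({g: 1.5 if "11" in g else 1.0 for g in probs})

for label, weights in (("manufactured", flat), ("real", tilted)):
    with_11 = log10(locus_lr(weights, probs, ("11", "12")))
    without_11 = log10(locus_lr(weights, probs, ("12", "13")))
    print(label, round(with_11, 3), round(without_11, 3))
```

Under these assumptions equal weights give an LR of 1, so log(LR) is 0, for every possible trace genotype, because the genotype probabilities themselves sum to 1. Once the weights lean towards allele 11, a non-donor who happens to carry 11 receives an LR slightly above 1 and one who does not receives an LR slightly below 1.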
There is an interesting follow-on from this. It explains the vertical spread of the distribution of Hd true tests for very low template contributors that has been observed in all specificity tests done. For any real profile, mixed or otherwise, there are some stochastic effects. There is an advantage in having the very low level trace with alleles in those positions. Hence some Hd true test donors help explain the profile whilst others actually make it harder to explain. Those that help get a positive log(LR) and those that hinder get a negative one. This can be seen in the figure below, which is the result of the experiment described above of a real and a manufactured single source profile treated as a two person mixture.
I conclude that STRmix, and most probably other PG software, is treating very low level or non-existent contributors correctly. I do not think that this is a sensitive area for PG.