GENEALOGY-DNA-L Archives



From: (John Chandler)
Subject: STR mutation rates again
Date: Tue, 12 Apr 2005 21:37:24 -0400 (EDT)


With the new release of data and software at the SMGF web site, it's
time to take another look at the mutation rates. The old version of
smgf.org had a peculiar kink in the probability curve for a perfect
match, such that the stated probability was always a little low for
the submitter and always a little high for everyone else. That kink
is gone now. More importantly, there are eight loci there now that
weren't there before. I have read off the implied mutation rates from
the web site, using a simple and robust method. Note: the MRCA
likelihoods are reported there to 8 decimal digits, but I am including
only 5 decimal digits in the derived mutation rates here. The
absolute calibration is still not published.

Here's the "executive summary".

- The rates deduced from SMGF by Doug McDonald using a less reliable
method and reported last year, now displayed with some updates on the
World Families Network (WFN) web site, are noticeably higher than the
rates assumed at SMGF, by about 10% on average.

- All of the 28 markers displayed at SMGF last year still have exactly
the same rate estimates, to within one or two at the 8th decimal digit.

- The 8 "new" markers, alas, do not have locus-specific rates in this
release of the web site. All 8 have an identical assumed rate of
0.002.

In the table below, I show the implicit SMGF mutation rates and, for
comparison, the rates listed at WFN. The multi-copy marker rates here
are per-copy.

I also show the averages for various panels of markers (using Doug's
estimate of 0.0035 for DYS464 as needed).

Locus ____ -SMGF-- ___ -WFN--
385ab ____ 0.00280 ___ 0.0033
388 ______ 0.00038 ___ 0.0005 *
389i _____ 0.00218 ___ 0.0021
389ii-i __ 0.00257 ___ 0.0028
390 ______ 0.00440 ___ 0.0045
391 ______ 0.00316 ___ 0.0036
392 ______ 0.00147 ___ 0.0016
393 ______ 0.00111 ___ 0.0012
394 ______ 0.00153 ___ 0.0016
426 ______ 0.00027 ___ 0.0005
437 ______ 0.00174 ___ 0.0020
438 ______ 0.00100 ___ 0.0012
439 ______ 0.00418 ___ 0.0045
441 ______ 0.00200** _ ------
442 ______ 0.00200** _ ------
444 ______ 0.00200** _ ------
445 ______ 0.00200** _ ------
446 ______ 0.00200** _ ------
447 ______ 0.00388 ___ 0.0045
448 ______ 0.00236 ___ 0.0028
449 ______ 0.00646 ___ 0.0075
452 ______ 0.00200** _ ------
454 ______ 0.00023 ___ 0.0005
455 ______ 0.00031 ___ 0.0005
456 ______ 0.00200** _ ------
458 ______ 0.00580 ___ 0.0066
459ab ____ 0.00118 ___ 0.0014
460 ______ 0.00294 ___ 0.0028
461 ______ 0.00231 ___ 0.0028
462 ______ 0.00053 ___ 0.0005
463 ______ 0.00200** _ ------
464abcd __ ------- ___ 0.0035
A10 ______ 0.00380 ___ 0.0045
C4 _______ 0.00236 ___ 0.0028
H4 _______ 0.00304 ___ 0.0036
YCAIIab __ 0.00124 ___ 0.0014
1B07 _____ 0.00091 ___ 0.0013

panels:
FT1-12 ___ 0.00224 ___ 0.00246
FT13-25 __ 0.00286 ___ 0.00317
FT1-25 ___ 0.00256 ___ 0.00283
RG26 _____ 0.00202 ___ 0.00228
RG43 _____ 0.00231 ___ 0.00253

* Also listed as 0.0006 in the Relative Genetics Panel.
** An obviously arbitrary figure.
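
As a rough check on the panel averages, here is a small sketch that
recomputes the FT1-12 line from the per-copy rates in the table. The
panel composition is my assumption, not something stated above: I take
it to be 393, 390, 394, 391, 385 (counted once per copy), 426, 388,
439, 389i, 392 and 389ii-i. The names RATES, FT1_12 and panel_average
are purely illustrative.

```python
# Per-copy rates copied from the table above, as (SMGF, WFN) pairs.
RATES = {
    "385ab":   (0.00280, 0.0033),
    "388":     (0.00038, 0.0005),
    "389i":    (0.00218, 0.0021),
    "389ii-i": (0.00257, 0.0028),
    "390":     (0.00440, 0.0045),
    "391":     (0.00316, 0.0036),
    "392":     (0.00147, 0.0016),
    "393":     (0.00111, 0.0012),
    "394":     (0.00153, 0.0016),
    "426":     (0.00027, 0.0005),
    "439":     (0.00418, 0.0045),
}

# ASSUMPTION: composition of the 12-marker panel. The two-copy marker
# 385ab is listed twice because the tabulated rate is per copy.
FT1_12 = ["393", "390", "394", "391", "385ab", "385ab",
          "426", "388", "439", "389i", "392", "389ii-i"]


def panel_average(loci, column):
    """Mean rate over a panel; column 0 is SMGF, column 1 is WFN."""
    return sum(RATES[locus][column] for locus in loci) / len(loci)


smgf_avg = panel_average(FT1_12, 0)
wfn_avg = panel_average(FT1_12, 1)
print(f"SMGF {smgf_avg:.7f}  WFN {wfn_avg:.7f}  "
      f"WFN/SMGF {wfn_avg / smgf_avg:.3f}")
```

Under that assumed composition the two averages come out close to the
0.00224 and 0.00246 shown on the FT1-12 line, and their ratio is in
line with the "about 10%" figure in the summary.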

A few words on the method of reading these rates from the web site:

For an exact match, neglecting the possibility of mutations that
cancel each other out, the likelihood at N generations is proportional
to the 2N'th power of the probability that all markers fail to mutate
in a transmission event (two lines of descent, N transmissions each).
Thus, it is only necessary to pull out the ratio of likelihoods
between successive generations for two cases: one where all markers
are verified to match and another where one marker is ignored. The
ratio between these two ratios is the square of the probability that
the selected marker did not mutate. I have verified that the
ratios from generation to generation are all the same in the SMGF
calculation (now that the "kink" is gone), thus demonstrating that
their calculation does indeed neglect the possibility of cancellation.
Even with the kink in place last year, all ratios except the first
were the same.
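
The ratio-of-ratios step can be sketched in a few lines. This is only
an illustration on synthetic numbers, not the actual SMGF output: the
function names, the three-marker rate list and the arbitrary scale
factors are all assumptions made for the demonstration. The synthetic
likelihoods follow the no-cancellation model described above.

```python
import math


def exact_match_likelihood(rates, generations, scale=1.0):
    """Unnormalized likelihood of an exact match, neglecting mutations
    that cancel: (product of (1 - mu)) ** (2 * generations)."""
    p_no_mutation = math.prod(1.0 - mu for mu in rates)
    return scale * p_no_mutation ** (2 * generations)


def implied_rate(full_n, full_n1, reduced_n, reduced_n1):
    """Recover one marker's rate from four likelihood readings.

    full_*    : all markers compared, at generations n and n + 1
    reduced_* : the same two generations with one marker ignored
    """
    ratio_full = full_n1 / full_n            # (P_all) ** 2
    ratio_reduced = reduced_n1 / reduced_n   # (P_all / (1 - mu)) ** 2
    return 1.0 - math.sqrt(ratio_full / ratio_reduced)


# Synthetic example: three hypothetical markers; recover the first.
rates = [0.00440, 0.00316, 0.00147]
full = [exact_match_likelihood(rates, n, scale=0.37) for n in (5, 6)]
reduced = [exact_match_likelihood(rates[1:], n, scale=0.52) for n in (5, 6)]

recovered = implied_rate(full[0], full[1], reduced[0], reduced[1])
print(f"recovered rate: {recovered:.5f}")
```

The two scale factors are deliberately different: each one cancels
within its own generation-to-generation ratio, so the method does not
depend on how the reported likelihoods are normalized.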

John Chandler

