Subject: Re: [DNA] DYS464 simulation [Bruce Walsh response]
Date: Mon, 19 May 2003 07:58:37 EDT
I sent a copy of my message about the DYS464 simulation to Bruce Walsh, the
population geneticist who consults with FTDNA. He responded to the mailing
list, but since he is not a subscriber, his message didn't come through. I am
pasting it below.
His response addresses the general principle of stepwise mutations (the
"random-walk" effect I mentioned), but I sent him a follow-up message to
double-check whether this has any special significance for the DYS464 system.
Do take the time to look at the sample curves on his web-site, where you'll
see that it makes little difference which model you use if the number of
mutations is small and the time to the MRCA is short.
--- begin response from Bruce Walsh
I'm on my way out the door to catch a plane, but let me just offer a quick
comment on the recent posting of Ann Turner based on simulation results
for models with multiple mutational hits.
The calculations I present for ftDNA use two types of mutational models,
the first being the infinite alleles model wherein each mutation produces
a different allele. This is the method apparently referred to in the
"Another method, currently used by FTDNA in their Genetic Distance
report, is to calculate how many values the two people have in common."
"The FTDNA method is actually farther off the mark -- it's too generous in
This is expected, as this method DOES NOT count multiple mutations. The
problem is that the OBSERVED number of mutations is always a LOWER BOUND
on the ACTUAL number of mutations. Population geneticists have a number
of models to estimate the actual number from the observed number (e.g.,
Jukes-Cantor, 2- and 4-parameter Kimura, etc.).
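[Editorial illustration, not part of the original message.] The lower-bound
point can be checked with a small simulation. This is only a sketch under
assumed conditions: one marker, single-step mutations that go up or down with
equal probability, and a hypothetical mutation rate of 0.004 per generation.
The function names, the rate, and the generation count are all illustrative
choices, not values taken from FTDNA or from the simulation being discussed.

```python
import random


def simulate_marker(total_generations, mu, rng):
    """Return (actual_mutations, observed_difference) for one marker.

    total_generations counts the generations separating the two people
    (both lineages combined). In each generation the marker mutates with
    probability mu, moving the repeat count up or down by one step.
    """
    actual = 0  # every mutation that really happened
    net = 0     # net change in repeat count, which is all a test can see
    for _ in range(total_generations):
        if rng.random() < mu:
            actual += 1
            net += rng.choice((-1, 1))
    return actual, abs(net)


def run(trials=2000, total_generations=400, mu=0.004, seed=1):
    """Simulate many independent markers with a fixed seed (assumed values)."""
    rng = random.Random(seed)
    return [simulate_marker(total_generations, mu, rng) for _ in range(trials)]


results = run()
mean_actual = sum(a for a, _ in results) / len(results)
mean_observed = sum(o for _, o in results) / len(results)
print("mean actual mutations:   ", mean_actual)
print("mean observed difference:", mean_observed)
```

In every trial the observed difference can be no larger than the actual count,
because an up step followed by a down step (or the reverse) leaves no trace.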
The second method that ftDNA uses, which accounts for back mutations, is
the stepwise mutational model
(see http://nitro.biosci.arizona.edu/ftDNA/models.html#Step). This
correctly estimates the number of actual mutations, but is much harder to
compute (it uses integrals of type II Bessel functions) than the infinite
alleles model. You can see the differences between the times under the
infinite alleles and stepwise mutational models by looking at the various curves on
For example, compare the curves for a 25 marker test where only 23 of the
markers are exact matches. Likewise, note for these various curves that
when all the markers agree, the two different mutation models give very
similar results, but when there are a few mismatches, the times to common
ancestor are always longer under the stepwise model than under the infinite
alleles model. This arises because the stepwise model computes the
probability that the number of ACTUAL mutations by which the two individuals
differ is greater than the observed number, and this inflates the time.
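
[Editorial illustration, not part of the original message.] A rough sketch of
why the two models diverge, for a single marker. It assumes symmetric
single-step mutations, a Poisson number of mutations with mean 2*mu*t along
the two lineages back to a common ancestor t generations ago, and a
hypothetical rate mu = 0.004. Under those assumptions the chance of differing
by k repeats is exp(-2*mu*t) times a Bessel-type series, which is summed
directly here. This is not necessarily the exact calculation behind the FTDNA
curves, and all function names are made up for the example.

```python
import math


def bessel_series(k, x, terms=40):
    """Power series for the modified Bessel function usually written I_k(x)."""
    return sum(
        (x / 2) ** (2 * j + k) / (math.factorial(j) * math.factorial(j + k))
        for j in range(terms)
    )


def p_diff_stepwise(k, t, mu):
    """Stepwise model: P(two people differ by k repeats at one marker)."""
    lam = 2 * mu * t  # expected number of mutations on both lineages combined
    return math.exp(-lam) * bessel_series(abs(k), lam)


def p_match_infinite(t, mu):
    """Infinite alleles: a match means no mutation at all on either lineage."""
    return math.exp(-2 * mu * t)


mu = 0.004  # hypothetical per-generation mutation rate
for t in (10, 50, 200):
    print(t, p_match_infinite(t, mu), p_diff_stepwise(0, t, mu))
```

At any given t the stepwise match probability is at least as large as the
infinite alleles one, because a match can also arise from mutations that
cancel each other. In the same way, an observed one-step difference is
compatible with three or more actual mutations, which is consistent with the
longer times to the common ancestor described above.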