GENEALOGY-DNA-L Archives
Subject: Re: [DNA] Re: MRCA and FTDNA
Date: Wed, 3 Jul 2002 22:46:40 EDT
In a message dated 07/02/02 11:45:24 AM Pacific Daylight Time,
> Lloyd wrote:
> > For a 12/12 match (.002) at 95 % FTDNA shows 62 generations while our
> > MRCA site shows 76.9 generations. A substantial difference in years !
> "Our" MRCA site? Perhaps you mean Ann's MRCA calculator? As I
> understand it, her algorithm assumes a Poisson distribution for the
> number of mutations. Basically, this means that the time line is
> treated as a continuum, and a mutation is deemed equally likely to
> occur in any time segment of the same length as another segment.
> This is a good model for many physical processes. Unfortunately,
> the unit of time in this implementation is the "generation" and is
> not constant, so the model suffers from a statistical distortion.
> In addition, it is not at all clear whether the Poisson model is
> applicable to mutations between generations, since the production
> of sperm cells is from many parallel lines of source cells, not just
> one.
>
> An alternative approach is to treat time as discrete, with indivisible
> generation units. The relevant model is the binomial distribution.
> It's not clear that this is the right approach either, but it's the one
> I use, and I can confirm that 62.4 is the 95% confidence result for a
> 12/12 match using this model. I wouldn't worry too much about the
> difference between the two approaches, since the mutation rate is
> poorly determined and enters directly in each.
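As a rough check on the 62-generation figure quoted above, here is a minimal Python sketch of the discrete-generation calculation. It assumes that a 12/12 match means no mutation at any of 12 markers on either of the two lines of descent, that each marker mutates independently with probability 0.002 per generation, and that every number of generations is equally likely beforehand. Those assumptions and the function names are mine, for illustration only; they are not necessarily what FTDNA or anyone on the list actually uses.

```python
import math

def generations_discrete(markers, rate, coverage):
    """Generations g at which (1 - rate)**(2*markers*g) falls to 1 - coverage."""
    # Two lines of descent, each passing on `markers` loci per generation.
    transmissions_per_generation = 2 * markers
    return math.log(1 - coverage) / (transmissions_per_generation * math.log(1 - rate))

def generations_continuous(markers, rate, coverage):
    """Same question with exp(-2*markers*rate*g) in place of the discrete term."""
    return -math.log(1 - coverage) / (2 * markers * rate)

discrete = generations_discrete(12, 0.002, 0.95)
continuous = generations_continuous(12, 0.002, 0.95)
print(f"discrete: {discrete:.1f}  continuous: {continuous:.1f}")
```

Under these assumptions both versions land a little above 62 generations and differ from each other by less than a tenth of a generation, which fits the remark that the choice between them matters less than the uncertainty in the mutation rate.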
Actually, my MRCA calculator is based on equations from Bruce Walsh's paper
"Estimating the Time to the Most Recent Common Ancestor for the Y chromosome
or Mitochondrial DNA for a Pair of Individuals." The full text is available
on-line at http://www.genetics.org/cgi/reprint/158/2/897.pdf. I don't pretend
to understand his method beyond the most superficial level, but it is based
on Bayesian analysis, where you assume some prior knowledge. Walsh seems to
think this is a better model than the binomial distribution, but I'm just
taking his word for it. Have you looked at Walsh's paper, John?
Anyway, the number you saw in the table at http://www.ftdna.com/faq2.html was
the 95th "percentile." That's what you get if you take a random sample of, say,
1000 people who match 12/12, and see how many people find their common
ancestor within a certain number of generations, starting with one generation
and continuing until you've covered 95% of the cases.
Bruce Walsh is a population geneticist who consults for FTDNA, and he has
written some background material which isn't QUITE as technical as the paper.
If you click on the link for 12-0-0 (12 marker test with no mutations),
you'll see an additional column for 95% "confidence interval." That's what
you get if you start in the middle and start including cases which are
shorter or longer until you've covered 95% of all the possibilities. That's
what the MRCA calculator displays. It's quite a bit larger, partly because
the curve is not symmetrical and has a long tail (see figures 1-3 in Walsh's
paper).
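To see how a 95th percentile and a 95% interval can give such different numbers, here is a small Python sketch. It assumes that, for a 12/12 match with a mutation rate of 0.002, the chance that the common ancestor lies more than g generations back falls off as exp(-2 * 12 * 0.002 * g). That exponential form is my simplification for illustration; it is not a quotation of Walsh's equations or of the calculator's code, and the function name is made up.

```python
import math

def generations_at(cumulative, markers=12, rate=0.002):
    """Generations by which `cumulative` of the probability has been covered,
    assuming P(ancestor within g generations) = 1 - exp(-2*markers*rate*g)."""
    return -math.log(1 - cumulative) / (2 * markers * rate)

# One-sided: count back from the present until 95% of cases are covered.
percentile_95 = generations_at(0.95)

# Two-sided: leave 2.5% out at each end, so the upper limit sits at 97.5%.
interval_low = generations_at(0.025)
interval_high = generations_at(0.975)

print(f"95th percentile: {percentile_95:.1f} generations")
print(f"95% interval: {interval_low:.1f} to {interval_high:.1f} generations")
```

With these assumptions the percentile comes out a little over 62 generations and the upper end of the interval a little under 77, close to the two figures Lloyd compared. Because of the long tail, moving the cutoff from 95% to 97.5% adds many generations.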
The MRCA calculator only covers 0, 1, or 2 mutations, because the equations
got so complex. But Walsh does have a table where he shows what happens with
more and more mutations out of 5, 10, 20, 50 or 100 markers. The 10 and 20
marker sections should be a useful approximation for those of you who are
wondering how closely you are related when you have more than two mismatches.
The answer is -- not very close at all when you're looking at two randomly
selected people. However, when you start with a known common ancestor with
many descendants, you'll be able to find examples where one branch shows a
one or two mutation difference from the CA, and another branch shows a couple
of different mutations compared to the CA. If you select out those two
specific people, they could differ from EACH OTHER by more than two mutations
and still have a common ancestor within a reasonable number of generations.
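A toy example in Python may make the point above concrete. The repeat counts are invented for illustration and are not real test results; `mismatches` is a hypothetical helper that simply counts the markers where two haplotypes differ.

```python
def mismatches(a, b):
    """Count markers at which two haplotypes carry different repeat values."""
    return sum(1 for x, y in zip(a, b) if x != y)

# Hypothetical 12-marker haplotype for the common ancestor (CA); not real data.
ancestor = [14, 23, 15, 10, 12, 14, 11, 13, 12, 13, 11, 30]

# Branch A: two markers have changed since the CA.
branch_a = list(ancestor)
branch_a[1] += 1
branch_a[6] -= 1

# Branch B: two different markers have changed since the CA.
branch_b = list(ancestor)
branch_b[3] += 1
branch_b[11] -= 1

print(mismatches(ancestor, branch_a))  # 2
print(mismatches(ancestor, branch_b))  # 2
print(mismatches(branch_a, branch_b))  # 4
```

Each descendant is only two mutations away from the CA, but because the changes fall on different markers, the two descendants differ from each other at four.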
Bottom line for everyone: the table in Walsh's paper is worth looking at,
even if you skip all the equations.