GENEALOGY-DNA-L Archives



From: "John F. Chandler"
Subject: Re: [DNA] MRCA calculations, sperm, and phone calls
Date: Mon, 8 Jul 2002 17:11 EDT
In-Reply-To: DNACousins@aol.com message <71.21f418c7.2a59fb7b@aol.com> of Sun, 7 Jul 2002 14:15:55 -0600


Ann wrote:
> I don't follow your objection to the use of the generation as a unit of
> measurement

The objection is not to the generation as a time unit, but rather to the
application of Poisson statistics to a phenomenon with a quantized
time variable. In the studies people are actually doing, 10 generations
is a typical scale (representing approximately the time span from
American colonists to the present), so the time steps are not small
compared with the span being measured.

> Could you write up your method for calculating the MRCA in such a way that
> anyone with a calculator or a spread sheet would be able to use it? This is a
> recurrent question on the mailing list. The variables would be matching on N
> out of M markers with a mutation rate R per marker, and the answer would be a
> P chance (50%, 95%, whatever) that you match within G generations (or
> whatever unit you prefer).

Case 1. N=M

P = 1 - (1-R)^(MG)

or G = log(1-P) / [M log(1-R)]

For example, if you assume 12 markers and a rate of 0.002, the median
is 28.9 generations between the two endpoints, or 14.4 back to the MRCA
and 14.4 forward to the other testee. The 95th percentile is 124.7
generations, or 62.3 back and 62.3 forward. The 97.5th percentile is
153.5, and so on.
Note: if you use the natural logarithm ("ln") instead of log base 10,
you see that ln(1-R) is very nearly -R, so the above equation can be
written approximately:

G ~ -ln(1-P) / (MR)
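
As a rough illustration, the Case 1 formulas can be put into a few lines
of Python. This is only a sketch: the function names are made up for
this example, and the sample inputs (12 markers, rate 0.002) are the
ones used in the text above.

```python
import math

def generations_exact(p, markers, rate):
    """Solve P = 1 - (1-R)^(M*G) for G (Case 1, all markers match)."""
    return math.log(1.0 - p) / (markers * math.log(1.0 - rate))

def generations_approx(p, markers, rate):
    """Small-R shortcut: ln(1-R) is replaced by -R."""
    return -math.log(1.0 - p) / (markers * rate)

# Example values from the text: 12 markers, rate 0.002.
for p in (0.5, 0.95, 0.975):
    g = generations_exact(p, 12, 0.002)
    g_approx = generations_approx(p, 12, 0.002)
    print(f"P={p}: G={g:.1f} (approx {g_approx:.1f}), "
          f"{g / 2:.1f} back to the MRCA")
```

Either logarithm base works in the exact form, since the base cancels
in the ratio; the approximate form needs the natural logarithm.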

Case 2. N=M-1

P = 1 - [1 + G - G(1-R)^M] (1-R)^(MG)

~ 1 - [1 + MGR] (1-R)^(MG) (for small R)

The solution can be given as a set of values of MGR corresponding to
choices of P: 0.5-> MGR=1.678, 0.95-> MGR=4.744, 0.05-> MGR=0.355,
0.975-> MGR=5.572, 0.025-> MGR=0.242. Simply divide the appropriate
value by MR to get G.
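
For anyone who wants a P other than the ones tabulated, the Case 2
values can be found numerically. The sketch below uses plain bisection
on the small-R form, and it assumes that (1-R)^(MG) is written as
exp(-MGR). The names p_one_mismatch and solve_mgr are invented for this
example.

```python
import math

def p_one_mismatch(x):
    """Small-R form of Case 2, with x = MGR and (1-R)^(MG) ~ exp(-x)."""
    return 1.0 - (1.0 + x) * math.exp(-x)

def solve_mgr(p, lo=0.0, hi=50.0, tol=1e-9):
    """Bisection: find x with p_one_mismatch(x) = p (p strictly between 0 and 1)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # p_one_mismatch increases with x, so move the bracket accordingly.
        if p_one_mismatch(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

for p in (0.025, 0.05, 0.5, 0.95, 0.975):
    print(f"P={p}: MGR={solve_mgr(p):.3f}")
```

Dividing the result by MR gives G; with 12 markers and a rate of 0.002
(the Case 1 example values), MR is 0.024.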

Case 3. N=M-2

P ~ 1 - [1 + MGR + 0.5 (MGR)^2] (1-R)^(MG)

Again, there is a set of values of MGR, one for each P:
0.5-> MGR=2.674, 0.95-> MGR=6.296, 0.05-> MGR=0.818, 0.975-> MGR=7.225,
0.025-> MGR=0.619.
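
The same numerical approach works for Case 3. The sketch below again
assumes that (1-R)^(MG) is written as exp(-MGR), and it also converts
the result to generations. The function names are invented, and the
sample inputs (12 markers, rate 0.002) are borrowed from the Case 1
example rather than given for this case.

```python
import math

def p_two_mismatches(x):
    """Small-R form of Case 3, with x = MGR and (1-R)^(MG) ~ exp(-x)."""
    return 1.0 - (1.0 + x + 0.5 * x * x) * math.exp(-x)

def solve_mgr_case3(p, lo=0.0, hi=50.0, tol=1e-9):
    """Bisection: find x with p_two_mismatches(x) = p (p strictly between 0 and 1)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # p_two_mismatches increases with x.
        if p_two_mismatches(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def generations_case3(p, markers, rate):
    """Divide the MGR value by MR to get G, as described in the text."""
    return solve_mgr_case3(p) / (markers * rate)

for p in (0.025, 0.05, 0.5, 0.95, 0.975):
    print(f"P={p}: MGR={solve_mgr_case3(p):.3f}, "
          f"G={generations_case3(p, 12, 0.002):.1f}")
```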

Properly speaking, the "R" in all these numeric formulas should be
replaced by "-ln(1-R)", but the difference is negligible for the
values of R that we are likely to run into.
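
To see how small the difference is, one can compare R with -ln(1-R)
directly. In this sketch only 0.002 comes from the example above; the
other rates are arbitrary illustrative values, and the function name is
made up.

```python
import math

def relative_shift(rate):
    """Fractional change when R is replaced by -ln(1-R)."""
    return (-math.log(1.0 - rate) - rate) / rate

for rate in (0.001, 0.002, 0.005, 0.01):
    print(f"R={rate}: -ln(1-R)={-math.log(1.0 - rate):.6f}, "
          f"shift={100 * relative_shift(rate):.2f}%")
```

The shift is roughly R/2 as a fraction, so about a tenth of a percent
at R = 0.002.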

John Chandler

