GENEALOGY-DNA-L Archives
From:
Subject: Re: [DNA] Genetic Distance calculation - Comments re MacGregor and a further question
Date: Mon, 1 Dec 2003 15:53:43 -0500 (EST)
References: <1ec.13e271f9.2cf0ed2f@aol.com> <002b01c3b1c7$f967e610$799c89d9@helen>
In-Reply-To: <002b01c3b1c7$f967e610$799c89d9@helen>(richardmcgregor@ntlworld.com)
Richard wrote:
> Particularly interesting in the light of the debate is the case of
> individual 1 who is one mutation away from the 'root line' and individual 2
> who is 2 away from the root line. There seems little doubt that both these
> individuals have mutations away from the 'root' and, as it happens, the
> mutation occurs at the same locus in both but in opposite directions. By
> the stepwise calculation these two individuals who are related through the
> root line would have a genetic distance of 3, and by the squaring method it
> would be 5, i.e. (2*2) + (1*1).
Several things are worth pointing out here. First, when you have a
reconstructed ancestral haplotype, the whole picture changes -- it is
no longer useful or interesting to compare one person against another,
but only against the ancestral haplotype. Second, when you have many
members of a common-descent group, it is time to look at the
statistics of the group as a whole. If you have a VERY large group
(such as the east Asian cluster that has been postulated as kin of
Genghis Khan), it is a very simple exercise to calculate the time
elapsed since the common ancestor. You need only two things: the
average mutation rate and a properly random sampling of the entire
group of descendants. Obviously, both of these requirements are a
little slippery, as is the scale factor needed for converting time in
generations to time in years, but the basic calculation is quite
simple: for every person in the group and every locus, take the
difference in repeat count from the ancestral haplotype and square it;
add up all those squares; then divide the total by the number of loci,
the number of persons, and the mutation rate. The result, in
generations, is generally called the "convergent time" and
is the calculation that was done for the Asian cluster that gave a
time consistent with common descent from Genghis Khan. With a sample
group of only 14, it is likely that they will NOT be a really random
sample, but will instead overemphasize some subgroup(s) and will
therefore have an "effective" convergent time that is SHORTER than the
actual time elapsed since the common ancestor. Still, this is the
starting point for further analysis. For example, the effective
convergent time might indicate an important event, such as a major
emigration or a massacre. Finally, there is nothing about the
stepwise model that is tied to Bruce's sum-of-differences method, as
opposed to the sum-of-squares method. Both methods are built on the
same stepwise base using different assumptions.
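
As a rough illustration of the sum-of-squares calculation described
above, here is a minimal Python sketch. The function name, argument
names, and toy data are hypothetical, chosen only for the example; the
mutation rate is whatever per-locus, per-generation average you are
willing to assume.

```python
def convergent_time(haplotypes, ancestral, mutation_rate):
    """Estimate generations since the common ancestor.

    haplotypes    -- list of per-person repeat counts, one list per person
    ancestral     -- repeat counts of the reconstructed ancestral haplotype
    mutation_rate -- assumed average mutations per locus per generation
    """
    n_persons = len(haplotypes)
    n_loci = len(ancestral)
    # Sum of squared differences from the ancestral haplotype,
    # marker by marker, over every person in the group.
    total_squares = sum(
        (allele - root) ** 2
        for person in haplotypes
        for allele, root in zip(person, ancestral)
    )
    average = total_squares / (n_persons * n_loci)
    return average / mutation_rate


# Toy data (hypothetical): three people typed at three loci.
root = [13, 24, 14]
group = [
    [13, 24, 14],  # exact match to the ancestral haplotype
    [14, 24, 14],  # one step away at the first locus
    [13, 22, 14],  # two steps away at the second locus
]
generations = convergent_time(group, root, mutation_rate=0.002)
```

The result is only as good as the two slippery inputs: the assumed
mutation rate and how random the sample really is.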
> Of the 14 who share the 'root line' we have 9 individuals who share the same
> DNA signature (8 MacGregor and 1 Stirling - a known and documented MacGregor
> alias), 1 Stirling with one mutation (385b), one MacGregor with one mutation
> at 464d, one with one mutation at 449, one with one mutation at 458, one
> with two mutations - one at 458 and one at 439.
By that tally, the total of the squares is 6, the average per person
per locus is 0.017, and the convergent time is 8.6 generations (just a
tad over two centuries if we assume 25 years per generation).
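
The arithmetic behind those figures can be checked with a short Python
sketch. Two inputs are NOT stated in this message and are assumptions
here: 25 markers per person and an average rate of 0.002 mutations per
locus per generation. They are simply values that reproduce the quoted
0.017 and 8.6. The sketch also assumes every mutation in the tally is a
single step.

```python
N_LOCI = 25               # assumption: markers per person
MUTATION_RATE = 0.002     # assumption: per locus per generation
YEARS_PER_GENERATION = 25

# Sum of squared step differences per person, for the 14 in the group:
# 9 exact matches, 4 people with one single-step mutation, and
# 1 person with single-step mutations at two loci (1 + 1 = 2).
squares_per_person = [0] * 9 + [1] * 4 + [2]

total = sum(squares_per_person)
average = total / (len(squares_per_person) * N_LOCI)
generations = average / MUTATION_RATE
years = generations * YEARS_PER_GENERATION

print(total, round(average, 3), round(generations, 1), round(years))
```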
> In addition to these there is 1 individual whose distance from the 'root'
> is 3 mutations on separate loci and one whose distance is 4 at 4 separate
> loci. (I have not counted these two into the total of 14. We also have 1
> Greig with 4 single mutations on 4 loci).
Counting those three additional people would raise the average to 0.04
and the convergent time to 20 generations (five centuries). If your
window is seven centuries or more, you should be giving these other
three people more consideration as possible kin. After seven centuries,
you expect about 1/6 of the group to have three or more mutations
relative to the ancestor.
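
Both numbers can be reproduced with a short Python sketch, under
assumptions that are not spelled out in this message: 25 markers per
person, 0.002 mutations per locus per generation, 25 years per
generation, single-step mutations throughout, and a Poisson
distribution for the number of mutations each line of descent
accumulates. The Poisson model is one plausible way to arrive at the
"about 1/6" figure, not necessarily the calculation actually used.

```python
import math

N_LOCI = 25               # assumption: markers per person
MUTATION_RATE = 0.002     # assumption: per locus per generation
YEARS_PER_GENERATION = 25

# Extended tally: the original 14 (squares summing to 6) plus three
# more people with 3, 4 and 4 single-step mutations on separate loci.
squares_per_person = [0] * 9 + [1] * 4 + [2] + [3, 4, 4]
average = sum(squares_per_person) / (len(squares_per_person) * N_LOCI)
generations = average / MUTATION_RATE

# Expected mutations per person after seven centuries.
window_generations = 700 / YEARS_PER_GENERATION
expected = window_generations * N_LOCI * MUTATION_RATE

# Poisson probability of three or more mutations in one line.
p_fewer_than_3 = sum(
    math.exp(-expected) * expected ** k / math.factorial(k)
    for k in range(3)
)
p_three_or_more = 1 - p_fewer_than_3

print(round(average, 2), round(generations), round(p_three_or_more, 3))
```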
> My question is: if we know a rough date for the split of different
> haplogroups, is it possible to identify a 'root' line for each haplogroup
> and from this calculate mutations away from the 'root' rather than trying
> to measure individuals against each other???
If you are talking about an entire haplogroup, the chances of getting
a truly representative sample are just about nil, even if you were to
test everyone on Earth -- because of the turbulence of human history.
In any case, the main haplogroups are so old that everyone can expect
to have many, many mutations.
John Chandler