GENEALOGY-DNA-L Archives


From: "Anatole Klyosov" <>
Subject: Re: [DNA] TMRCA Estimate of a group of Marsh families
Date: Tue, 9 Feb 2010 23:58:22 -0500
References: <mailman.3839.1265771522.2099.genealogy-dna@rootsweb.com>


>From: "Alister John Marsh" <>
>...Perhaps I understand your processes better through you working through this example.
>Your assessment of there being two clusters is correct.
>The age you estimate for the larger cluster (A to I) at 400 years ago is
>possibly in the right ball park, although one or two of the haplotypes may
>be from branches earlier. If anything, the cluster A to I may be a little
>older than 400 years.
>This smaller group of J and K is only two haplotypes, I suspect your
>estimate of 100 years to branch common ancestor is considerably out.
>However, because the haplotypes were the same, and only two of them, I think
>there is not a large enough group to make predictions with statistical
>confidence, which you in effect stated. You can't be criticized for
>predictions on such limited information as 2 haplotypes.


My response:

Dear John,

I am glad to be of some help. Perhaps our discussion and a look at your example will also be of some help to people here who choose to focus on problems rather than on their solutions. It also shows that there is nothing mysterious about counting mutations, that it is better to work with specific examples than to talk in generalities, and that it is not a big problem to dissect a haplotype dataset into separate branches.

I should also add, John, that I like your attitude, which is directed toward problem solving. That is how things get done, not just discussed.

Now, back to your example. As you well know, haplotypes and their mutations rule. One cannot get a reliable TMRCA with just a few haplotypes and with a few mutations in them. One cannot beat a high margin of error in those cases. That was an objective issue in your example. Eleven haplotypes with two branches in them is not a gift.

What does it mean? It means that the statistics are not there. It means that one extra mutation, which can easily happen in any generation, shifts the apparent timespan to a common ancestor. Nevertheless, an estimate can be made.

In your set of 11 haplotypes it was quite obvious that the last two haplotypes formed a separate branch, aka a lineage. They showed three deviations from the rest of the pack in sync, in the same markers in both haplotypes, and one of those deviations was by as many as two mutations. Things like that do not happen randomly. There was a system there. To verify it I composed a haplotype tree, and - sure enough - those two haplotypes were sticking out like a sore thumb.

However, those two 37-marker haplotypes contained only one (1) mutation in all 37 pairs of alleles. It means that a common ancestor of those two haplotypes lived 1/37/2/0.00243 = 5.6 generations back (to be ridiculously precise), that is, 140 years ago. Since a single mutation gives a margin of error of 100%, we have 140+/-140 years to a common ancestor for those two haplotypes (in my previous message there was a typo, I was in a hurry).

The remaining 9 haplotypes contained 13 mutations. I have counted that 4-step mutation as just one, for obvious reasons. However, it does not change the outcome much, as I will show shortly. 13/9/37/0.00243 = 16 generations, that is, 400 years to a common ancestor. The margin of error in this case is 56.4% at 95% confidence, that is, 400+/-225 years to a common ancestor. If we count the 4-step mutation as four, it will be 16/9/37/0.00243 = 20 generations, that is, 500 years to a common ancestor. As you see, it is within the margin of error.
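
For readers who want to check the arithmetic above, here is a minimal Python sketch. The mutation rate of 0.00243 per marker per generation and the margins of error are taken from the message as given, not derived. The figure of 25 years per generation is not stated explicitly; it is inferred from "16 generations, that is 400 years". The function names are illustrative only.

```python
# Figures taken from the message; YEARS_PER_GENERATION is an inference
# (16 generations = 400 years), not something the message states outright.
RATE_PER_MARKER = 0.00243
YEARS_PER_GENERATION = 25


def generations_to_ancestor(mutations, haplotypes, markers, rate=RATE_PER_MARKER):
    """Average number of mutations per marker per haplotype, divided by the rate."""
    return mutations / haplotypes / markers / rate


def years_with_margin(generations, margin_fraction, years_per_gen=YEARS_PER_GENERATION):
    """Convert generations to years and apply a margin of error quoted in the text."""
    years = generations * years_per_gen
    return years, years * margin_fraction


# Small branch (J and K): 1 mutation, 2 haplotypes, 37 markers, 100% margin.
small_gen = generations_to_ancestor(1, 2, 37)
small_years, small_margin = years_with_margin(small_gen, 1.00)

# Large branch (A to I): 13 mutations, 9 haplotypes, 37 markers, 56.4% margin.
large_gen = generations_to_ancestor(13, 9, 37)
large_years, large_margin = years_with_margin(large_gen, 0.564)

# Same branch with the 4-step mutation counted as four (16 mutations in total).
large4_gen = generations_to_ancestor(16, 9, 37)
large4_years = large4_gen * YEARS_PER_GENERATION

print(f"J, K:            {small_gen:.1f} generations, {small_years:.0f} +/- {small_margin:.0f} years")
print(f"A to I (13 mut): {large_gen:.1f} generations, {large_years:.0f} +/- {large_margin:.0f} years")
print(f"A to I (16 mut): {large4_gen:.1f} generations, {large4_years:.0f} years")
print("16-mutation estimate within margin:", large4_years <= large_years + large_margin)
```

The unrounded results differ slightly from the rounded figures in the text (about 139 rather than 140 years, and about 494 rather than 500 years), which is well inside the stated margins.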

Since there are 4 mutations between the two branches, it places their common ancestor at about 845 years back, that is, still in the 12th century. You can play around with margins of error, which will certainly be more than a century; however, my experience shows that those margins of error are typically overestimated. By setting margins of error too high one just fools oneself. I would suggest that you consider - for the time being - that a common ancestor of the population lived in the middle of the 12th century. Only if documents contradict this estimate can it (and need it) be reconsidered. Besides, it can be examined with a more extended series of haplotypes.

>Also I suspect a number of back mutations


No, disregard them. Simple math shows that the contribution of back mutations is practically zero during the first 26 generations to a common ancestor, and truly negligible in the first 1200 years or so. "Your" most recent common ancestor lived within that range. Beyond about 2500 years to a common ancestor, back mutations kick in progressively.

Regards,

Anatole Klyosov

