From: "Anatole Klyosov" <>
Subject: Re: [DNA] TMRCA assessments
Date: Wed, 17 Feb 2010 05:54:56 -0500
>From: "Alister John Marsh" <>
>If that is the case, and CDY has twice the mutation rate I had allowed in
example, the chances of back and parallel mutations would be much higher
than I had estimated, meaning the problem I was highlighting would be bigger
than I had supposed.
>In the example I was examining, I suspected a back mutation on CDYb,
suggested a back mutation was improbable in the genealogical time frame.
As I see it, the misunderstanding continues. I was not saying that a "back
mutation was improbable in the genealogical time frame". I was talking about
the contribution of those "back mutation" events to the total pool of
mutations in the genealogical time frame. That is, I was talking about a
negligible effect of back mutations on TMRCA calculations in the first 650
years, and a very small contribution of back mutations in the first 2000
years. By "a very small contribution" I meant (and defined) that this
contribution is within the margin of error of the TMRCA calculations.
You keep saying something like "it might happen". Of course it might happen.
If it happens once against a background of 100 other mutations, its 1%
contribution would not affect the TMRCA.
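To put rough numbers on the 1% figure above, here is a minimal Python sketch. It rests on two assumptions that are not stated in this thread: that the TMRCA estimate scales linearly with the number of counted mutations, and that the counting noise is Poisson, so the relative standard error of a count n is 1/sqrt(n). The function names are hypothetical.

```python
import math


def relative_shift_from_missed(n_counted: int, n_missed: int = 1) -> float:
    """Relative change in the mutation count (and, under the linear
    assumption, in the TMRCA) if n_missed mutations were hidden."""
    return n_missed / n_counted


def relative_poisson_error(n_counted: int) -> float:
    """Relative standard error of a count, ASSUMING Poisson statistics."""
    return 1.0 / math.sqrt(n_counted)


counted = 100  # the "background of 100 other mutations" from the text
shift = relative_shift_from_missed(counted)
error = relative_poisson_error(counted)

print(f"shift from one hidden back mutation: {shift:.1%}")
print(f"counting error on {counted} mutations: {error:.1%}")
```

Under these assumptions a single hidden back mutation moves the count far less than the statistical noise already present in a count of 100.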
The same confusion arose with "parallel mutations". You (and others)
apparently meant that those mutations are useful for identifying close
relatives. Who argues with that? However, I asked how many of those
"parallel mutations" might have happened in the 509 67-marker L21
haplotypes we discussed earlier, and how they might "distort" the TMRCA
calculations, and I have not seen an answer. The likely answer is "they
would not distort, since they would have been counted like any other
mutations, unless they form a distinct branch. In that case the branch would
be analyzed separately".
This is probably the core of many of our mutual misunderstandings. I say "they
do not affect the TMRCA". You (and others) say "but they are useful for
family studies". That is fine and correct. However, there is no conflict
whatsoever between the two statements.
>If there were chances of 3.5 mutations on CDYb between 9 haplotypes, (about
100 mutation opportunities on each marker in the group) that is about 30%
chance of a back mutation occurring in a single individual on that marker.
Further, there is also a 30% chance of a back mutation in one of the 9
individuals on CDYa. That is only 2 markers out of 37, and we have in total
about 60% chance of a back mutation in that set of 9 haplotypes in
approximately a 330 year period. If all 37 markers were considered, the
chances of a back mutation would increase. If this is the case, back
mutations might be more of an issue in the genealogical time frame than
Anatole suggested. And this is not even considering the considerably
increased chances of parallel mutations.
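As a side note on the arithmetic in the quoted paragraph, here is a small Python sketch of how the two 30% figures combine. It takes the 30% values as given and assumes the events on CDYa and CDYb are independent; the helper name is hypothetical. Under that assumption the combined chance is 1 - 0.7 x 0.7, about 51%, rather than the roughly 60% obtained by adding the two figures.

```python
def prob_at_least_one(probabilities):
    """Chance that at least one of several INDEPENDENT events occurs."""
    prob_none = 1.0
    for p in probabilities:
        prob_none *= 1.0 - p
    return 1.0 - prob_none


# 30% per marker, as quoted in the text for CDYa and CDYb.
two_markers = prob_at_least_one([0.30, 0.30])
print(f"at least one back mutation on CDYa or CDYb: {two_markers:.0%}")
```

The same helper accepts any number of per-marker probabilities, so further markers can be added to the list once their individual chances are known.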
I think the issue which Anatole has not taken into consideration, is that
most of the mutation activity takes place on a small group of the very fast
mutating markers in the 37 marker set, and because of this the chances of
back mutations are greater than if all markers had the same mutation rates.
Anatole appears to have "calibrated" his system by counting total mutations
in a large family grouping, but doing this he has not been aware of the fact
that some markers have mutation rates close to 100 times faster than others
in the marker set. It has been this imbalance of mutation rates which I was
trying to bring to Anatole's attention, and why I asked him in my first
posting on the issue...
"BACK MUTATIONS: When determining the "mathematical fact" that back
mutations are practically undetectable in the first 26 generations, did you
base the maths on an assumption that all markers have the average mutation
rate, or on the fact that in a mixed set of fast/ slow markers, most of the
mutations are happening on a very small subset of very fast mutating markers?"
I have been concerned about this imbalance for a long time. I wondered if
it would impact at all on the "variance of variance" calculations to
determine TMRCA, but I was unable to understand the maths enough to check
myself. I had presumed that Ken and others had allowed for individual
mutation rates of individual markers in the variance calculation, rather
than just doing the calculations based on all markers having average
mutation rates. Is that assumption correct? Or does having a mixed set of
fast and slow markers not affect the variance calculation?
In Tim's series of TMRCA calculations using variance, he consistently seemed
to get different results for faster markers than slower markers. Could this
be evidence that in mixed sets of fast/ slow markers, there is an effect
from the mixture which is not being mathematically allowed for?
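One way to see what is at stake in the question above is to compare two simple ways of turning per-marker variances into a time estimate. The Python sketch below is only an illustration, not the method used by anyone in this thread. It assumes a symmetric single-step mutation model in which the expected variance of a marker around its ancestral value grows as rate x generations. The 0.035 rate is the CDY figure quoted further down; the 0.002 rates and the "observed" variances are made-up values, and the function names are hypothetical.

```python
def tmrca_pooled(variances, rates):
    """Sum of variances divided by sum of rates (generations).
    Equivalent to using one average rate for the whole marker set."""
    return sum(variances) / sum(rates)


def tmrca_per_marker(variances, rates):
    """Average of the individual per-marker estimates variance / rate."""
    estimates = [v / r for v, r in zip(variances, rates)]
    return sum(estimates) / len(estimates)


rates = [0.002, 0.002, 0.035]  # two slow markers (assumed) and one CDY-like

# Idealised data: every variance is exactly rate * 20 generations.
ideal = [r * 20 for r in rates]
ideal_pooled = tmrca_pooled(ideal, rates)
ideal_per_marker = tmrca_per_marker(ideal, rates)

# Noisy, made-up data: the two estimators now weight markers differently.
noisy = [0.05, 0.03, 0.60]
noisy_pooled = tmrca_pooled(noisy, rates)
noisy_per_marker = tmrca_per_marker(noisy, rates)

print(ideal_pooled, ideal_per_marker)
print(noisy_pooled, noisy_per_marker)
```

With idealised data both estimators return the same 20 generations. With noisy data the pooled estimate is dominated by the fast marker, while the per-marker average gives each marker equal weight, so the two can disagree.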
[mailto:] On Behalf Of John Chandler
Sent: Wednesday, February 17, 2010 1:48 PM
Subject: Re: [DNA] Question for John Chandler
> Can you confirm that your estimates of the mutation rates for CDYa and
> b are about .017?
Sorry, I misspoke in my previous note. My estimated rates for CDYa and b
are 0.035 each. When looking at John's message, and remembering the 0.035
number, I mistakenly divided by two an extra time to arrive at the
conclusion his rates for CDY were consistent with mine. They are not,