GENEALOGY-DNA-L Archives

From: James Heald <>
Subject: [DNA] TMRCAs for groups of haplotypes ?
Date: Mon, 23 Feb 2009 16:37:08 +0000


I wonder if people can help me out?

I seem to have got into a disagreement elsewhere with Anatole Klyosov
(not entirely unknown to this list),
http://en.wikipedia.org/wiki/Talk:Y-chromosomal_Aaron#THE_Y_CHROMOSOME_OF_THE_JEWISH_HIGH_PRIESTS_ZADOKITES
and I would like to sanity-check a point that has come up there with
people here, particularly Ken and John Chandler.

Anatole has made some comments there that seem quite odd to me, such as

* A change from 13-18 to 14-17 in DYS 385 a/b should be counted as only
a one step mutation (giving rise to an 11/12 match), not a two step
mutation (giving a 10/12 match); and

* "J1 CMH are not Cohanim ... J2 are the reals Cohanim today according
to fundamental and standard basis."


But the real proposition I want to bring to the list is this. It seems
to me that, if you have only tested a given number of markers, that puts
an unavoidable ceiling on the accuracy with which you can estimate a
TMRCA for a group, no matter how great the number of haplotypes you
sample from that group.

And the reason is this: even if you knew (from an infallible oracle) how
far back you had to go to get back to only two lines left standing,
there would still be a remaining uncertainty in the coalescence time for
the last two lines -- viz the confidence interval in the TMRCA for a
12/12 match or a 25/25 match or a 37/37 match (as appropriate) that one
can look up on Bruce Walsh's page at
http://nitro.biosci.arizona.edu/ftdna/TMRCA.html
So, just on the basis of this, there must be a 95% interval of at least
650 years in any 37 marker calculation, 975 years in any 25 marker
calculation, and 2225 years in a calculation based only on 12 markers.

(In fact, I suspect reasonable uncertainties must be substantially
larger, since this is the unavoidable contribution to the uncertainty
from only the very last step of the coalescence. But even having a
minimum ballpark figure is at least a start).
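
To make the shape of the argument concrete, here is a rough sketch of
the last-two-lines uncertainty. It is not a reproduction of Bruce
Walsh's calculation and will not give his figures. It assumes an
infinite-alleles model, a flat prior on the TMRCA, a single assumed
mutation rate of 0.002 per marker per generation, and 25 years per
generation; the function name and all of those values are mine, chosen
only for illustration. Under those assumptions the posterior for an
exact n/n match is exponential with rate 2*mu*n, so its 95% upper bound
shrinks only as more markers are tested. The number of haplotypes
sampled does not appear anywhere in it.

```python
import math

def tmrca_upper_bound(n_markers, mu=0.002, level=0.95, years_per_gen=25):
    """Upper bound (in years) of the one-sided credible interval for the
    TMRCA of two lineages that match exactly on n_markers markers.

    Assumptions (illustrative only): infinite-alleles model, flat prior
    on the TMRCA, and the same mutation rate mu for every marker.
    """
    # P(exact match | t generations) = exp(-2 * mu * n * t), so with a
    # flat prior the posterior for t is exponential with this rate.
    rate = 2.0 * mu * n_markers
    generations = -math.log(1.0 - level) / rate
    return generations * years_per_gen

for n in (12, 25, 37):
    print(n, "markers:", round(tmrca_upper_bound(n)), "years")
```

The bound is inversely proportional to the number of markers, which is
the point: testing more people does nothing to narrow this last step,
only testing more markers does.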


Similarly, if one has a big group of haplotypes centred on a signature
12-23-15-10-14-17-11-15-12-13-11-29
(and maybe some other markers known as well), and a separate group of
three 12-marker haplotypes all with signature
12-23-15-10-13-18-11-15-12-13-11-29
it seems to me that the uncertainty in the TMRCA of those groups must
correspond to a 95% confidence interval at least 4925 years wide, that
being the number that comes out of Bruce Walsh's figures for the
uncertainty in the TMRCA of a 10/12 match. As a minimum, that is the
uncertainty in the TMRCA between the common ancestor of the three
13-18 haplotypes and the central 14-17 haplotype.
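
For anyone who wants to check the 10/12 count, a minimal sketch that
compares the two signatures above marker by marker. The function name
is just for illustration, and it counts steps in the plain stepwise
way (each marker's absolute difference added up), which is the
convention I am assuming for DYS 385 a/b here.

```python
def compare_haplotypes(h1, h2):
    """Return (matching markers, total stepwise differences) for two
    haplotypes written as dash-separated repeat counts."""
    a = [int(x) for x in h1.split("-")]
    b = [int(x) for x in h2.split("-")]
    if len(a) != len(b):
        raise ValueError("haplotypes must have the same number of markers")
    matches = sum(x == y for x, y in zip(a, b))
    steps = sum(abs(x - y) for x, y in zip(a, b))
    return matches, steps

central = "12-23-15-10-14-17-11-15-12-13-11-29"
variant = "12-23-15-10-13-18-11-15-12-13-11-29"

matches, steps = compare_haplotypes(central, variant)
print(f"{matches}/{len(central.split('-'))} match, {steps} steps")
# prints: 10/12 match, 2 steps
```

Counted this way, the 14-17 to 13-18 change is two differences, one at
each of the DYS 385 markers.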


Anatole on the other hand insists that this is "ignorance"; shows I have
"no understanding of kinetics in general, and kinetics of mutations in
haplotypes in particular"; I simply "do not understand the concepts of
DNA", and that of course the number of haplotypes matters.

He gives the uncertainty in the overall TMRCA as only "+/- 200 years".


What do people think?

My view is that even if he has 100 haplotypes in his 14-17 group (which
he claims), he simply doesn't have the data to claim a TMRCA to the
13-18 group with any 95% CI better than the 4925 years of Bruce Walsh's
calculation; and if his calculations aren't reflecting that, then his
calculations, at least as regards the uncertainty, are simply broken.

(Perhaps in much the same way that Thomas et al in the original CMH
paper used an ASD method to calculate a 95% CI of 1150 years despite
only using 5 markers,
http://www.ucl.ac.uk/tcga/tcgapdf/Thomas-98-Nat-Cohen.pdf
ignoring in their calculations that the lineages would certainly
coalesce, so could in no way be considered independent).
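
For reference, a minimal sketch of the kind of ASD (average squared
distance) point estimate involved. Under a symmetric single-step
mutation model the expected squared difference from the ancestral
allele grows as mu * t, so ASD / mu estimates the number of
generations. The haplotypes, the founder and the mutation rate below
are made up for illustration and are not the data or the rate from the
paper. The sketch gives a point estimate only; it deliberately does
not attempt a confidence interval, since treating the lineages as
independent is exactly the step in question.

```python
def asd_tmrca_generations(haplotypes, founder, mu=0.002):
    """Point estimate of TMRCA in generations from the average squared
    distance (ASD) between sampled haplotypes and an assumed founder.

    Assumes a symmetric single-step mutation model with the same rate
    mu (per marker per generation) at every marker.
    """
    total = 0
    count = 0
    for hap in haplotypes:
        if len(hap) != len(founder):
            raise ValueError("haplotype and founder lengths differ")
        for allele, ancestral in zip(hap, founder):
            total += (allele - ancestral) ** 2
            count += 1
    asd = total / count
    return asd / mu

# Hypothetical 5-marker data, invented purely for illustration.
founder = (10, 10, 10, 10, 10)
sample = [
    (10, 10, 10, 10, 10),
    (10, 11, 10, 10, 10),
    (10, 10, 10, 9, 10),
    (10, 10, 10, 10, 10),
]
print(asd_tmrca_generations(sample, founder))
```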


The *right* way to do a calculation like this is presumably to do a
full-on Bayesian sampling simulation with something like BATWING.


But as people often do make claims about group TMRCAs, or that
particular individuals "must" have a recent coalescence time with
particular clades, even on similarly short haplotypes, I thought it
would be useful to raise this here.

Cheers,

James.

