From: "Tim Janzen" <>
Subject: Re: [DNA] Y-chromosomes of Jewish priests (2009 paper)
Date: Wed, 12 Aug 2009 17:25:10 -0700
Dear Ken,
The logic behind omitting duplicate copies of the same haplotype is
that a substantial number of duplicates will skew intraclade (and to a much
lesser extent interclade) TMRCA estimates lower than they otherwise would
be. This is the reason that I only include one haplotype per surname when I
am running interclade TMRCA estimates. This is also why intraclade (and to
some extent also interclade) TMRCA estimates tend to start falling once you
hit about 100 (or sometimes fewer) haplotypes.
Omitting the duplicate copies of the same haplotype when doing intraclade
TMRCA estimates helps you better establish a probable upper limit as to what
the true TMRCA is for that particular group of haplotypes.
There is also value in doing intraclade TMRCA estimates where you
include all duplicate haplotypes in a dataset (except those with the same
surname). This helps better establish a probable lower limit as to what the
true TMRCA is for that particular group of haplotypes.
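
To make the upper/lower bracketing concrete, here is a minimal Python
sketch. It is not the calculation I actually ran: it assumes a simple
average-squared-distance (ASD) estimate measured from the modal haplotype,
a made-up set of 4-marker haplotypes, and an illustrative mutation rate of
0.002 per marker per generation. The function names are hypothetical.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Most common repeat count at each marker (stand-in for the ancestral type)."""
    return tuple(Counter(col).most_common(1)[0][0] for col in zip(*haplotypes))

def asd_tmrca(haplotypes, mu):
    """ASD from the modal haplotype divided by the per-marker mutation rate.

    Returns an age in generations. Assumes a stepwise mutation model and
    the same rate mu for every marker (both are simplifications).
    """
    modal = modal_haplotype(haplotypes)
    total = sum((a - m) ** 2 for h in haplotypes for a, m in zip(h, modal))
    asd = total / (len(haplotypes) * len(modal))
    return asd / mu

def drop_duplicates(haplotypes):
    """Keep one copy of each distinct haplotype, preserving first-seen order."""
    return list(dict.fromkeys(haplotypes))

# Toy data (assumption): six identical copies of the modal type plus a few variants.
haps = (
    [(12, 23, 14, 10)] * 6
    + [(12, 24, 14, 10)] * 2
    + [(13, 23, 14, 10), (12, 23, 15, 11)]
)
MU = 0.002  # illustrative rate per marker per generation, not a measured value

lower = asd_tmrca(haps, MU)                   # duplicates kept
upper = asd_tmrca(drop_duplicates(haps), MU)  # one copy per distinct haplotype

print(f"with duplicates:    {lower:.1f} generations")
print(f"without duplicates: {upper:.1f} generations")
```

With this toy data the duplicate copies of the modal haplotype pull the
estimate down, so the run with duplicates serves as the probable lower
limit and the run without them as the probable upper limit.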
Ideally, the 99 J-P58* samples whose haplotypes were included in
Hammer's paper would have been tested to 67 markers. This would have
allowed intraclade calculations to be done that were potentially more
accurate. In any case, the intraclade TMRCA estimate I got using all 99
J-P58* 12-marker haplotypes from Hammer's paper was similar to the one I got
using 25 Cohanim 67-marker haplotypes from the FTDNA haplogroup J project.
The problem with doing only interclade age estimates on some of
these clusters, such as I just did for the Cohanim group from the FTDNA
project, is that you can't be certain that all of the haplotypes have
actually been placed into the correct cluster when you run the calculations,
since more than one mutation may have occurred in the STR (or STRs) used to
divide the group of haplotypes into separate clusters. Hopefully SNPs will
eventually be discovered that will
allow accurate segregation of these groups of haplotypes into specific

What's the logic with keeping or omitting copies of the same 12 marker
haplotypes? By selecting your haplotypes you can get about any coalescence
age you want?

Better to do interclade age estimates.

