
From: "Lancaster-Boon" <>
Subject: [DNA] Variance Assessment of R:U106 DYS425Null Cluster
Date: Tue, 9 Feb 2010 23:44:24 +0100

Dear Anatole

Once again thanks. Coming to your response...

Concerning the example you asked me to give, I note that you have not
really responded to it, because you say I bastardized the data and that
you do not believe me about the markers I did not paste into this forum.

>Haplotypes should be shown in their entirety, not by fragments. What is
"merely the same"? You mean that only shown alleles are different from
each other, and all other alleles in 37- or 67-markers are practically
the same? Let me not to believe in it.

Well, maybe I gave the wrong impression by saying we did not need to type
all the markers, but I did give you a web address where you could get the
full haplotypes. It was just a question of making readable mails. In the
past I have also explained to you how to collect hundreds of E-V13
haplotypes. I only reduced these for presentation. You did not need to guess
what the other marker values were.

>My guess, though I cannot be certain with these bastardized haplotypes,
that they belong to several "shallow" branches (that is several hundred
years "old" each), however, they are derived from a rather ancient "most
recent" common ancestor. There is nothing like "shooting in the dark"

Well maybe, but in order to count mutations I think you need a full
"family tree", such as the ones you show in your papers. You need
knowledge of not only one common ancestor for whole uncontroversial
clades, but also the common ancestors for each sub-branch and
sub-sub-branch etc. But maybe I understood you wrong...

You give your own SMALL example (so small examples are sometimes
acceptable!) in order to show a case with an obvious most likely
ancestral modal. I am not sure why you think this needed to be shown. Of
course there are many non-controversial clades, groups with obvious
common ancestors. I am not only speaking of ones defined by SNP, but
also many defined by STR analysis. I hope I did not imply that I doubt

So let's take E-V13, or R-M222, or the whole Somerled modal group, or
any other, as a whole clade. We have reasonable confidence that they
have ONE common ancestor, but we know that sub-sets of each of these
clades have more recent common ancestors. And we know that true mutation
counting requires knowledge of all the nodes in this descent down to
today, in other words a "family tree" or phylogeny?

I do not think you are saying that my example is a "comb shaped" family
tree. Certainly your publications show family trees analysed, for
example within E-V13, where others find it very hard to be confident
about this.

Also, in your answer to John concerning his example, I see the point which I
do not understand in one identifiable place: "Hence, the dataset shows two
"local lineages", each one with its common ancestor, which deviates from
each other by four mutations in the 37 marker haplotypes." So you do look
for a family tree, but this step is the one you never explain: just "hence".
So how do you see the two clades within the one dataset? How do you exclude
other possibilities? Do you use one of the networking computer programmes or
just do it by eye?

So I guess I need to look at the example you offer, and also comments like
this one:-

I wrote: I understand that what you objected to in Ken's example is that
there were numerous possibilities...

You replied, a little weakly I think: "No. It means that you did not
understand my objection. If so, I cannot help with it."

I have been saying, as above, that you need two things to count mutations:-
1. The ancestral haplotype
2. A knowledge of all the nodes in the family tree
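
To make the difference between 1 and 2 concrete, here is a minimal
sketch in Python. The three-marker haplotypes and the tree are entirely
made up for illustration, and the function names are my own; this is not
anyone's published method. It counts stepwise mutations once from the
single ancestral haplotype only, and once along a known family tree.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Most common allele at each marker (ties go to the smaller value)."""
    modal = []
    for alleles in zip(*haplotypes):
        counts = Counter(alleles)
        best = max(counts.values())
        modal.append(min(a for a, c in counts.items() if c == best))
    return tuple(modal)

def step_distance(a, b):
    """Stepwise difference between two haplotypes, summed over markers."""
    return sum(abs(x - y) for x, y in zip(a, b))

def count_from_root(root, haplotypes):
    """Point 1 only: compare every result directly with the one ancestor."""
    return sum(step_distance(root, h) for h in haplotypes)

def count_on_tree(parent, children):
    """Points 1 and 2: count mutations along each branch of a known tree.
    children is a list of (haplotype, grandchildren) pairs."""
    total = 0
    for hap, grandchildren in children:
        total += step_distance(parent, hap)
        total += count_on_tree(hap, grandchildren)
    return total

# Hypothetical data: one sub-branch ancestor carries a mutation (24 -> 25)
# that both of its descendants inherit.
root = (13, 24, 14)
tree = [
    ((13, 25, 14), [((13, 25, 14), []), ((13, 25, 15), [])]),
    ((13, 24, 14), []),
    ((13, 24, 14), []),
    ((12, 24, 14), []),
]
leaves = [(13, 25, 14), (13, 25, 15), (13, 24, 14), (13, 24, 14), (12, 24, 14)]

print(modal_haplotype(leaves))        # modal of the tested men
print(count_from_root(root, leaves))  # inherited mutation counted twice
print(count_on_tree(root, tree))      # inherited mutation counted once
```

In this toy case the two counts differ because the shared mutation in the
sub-branch is counted once per descendant when only the root is known.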

You sometimes seem to agree with 2 (e.g. "To mix "common ancestors", aka
different branches, is the most frequent mistake done by authors of
"academic" papers. It is still worse that after it they employ the
"Zhivotovsky coefficient" and divide that mess by about three."), but
what I understand is that you in fact also claim that you only need the
ancestral haplotype, 1? For example you say:

"Back mutations produce a negligible effect on the first 26 generations,
and minimal one on the first 40-50 generations. I do not think that your
family genealogy goes that deep in time."

So I guess for recent clades you just count mutations from the ONE
common ancestral haplotype and ignore the "shallow" time depth???

Unfortunately, if you look at the example I gave, it appears to have
back mutations or something equivalent in a shallow time depth. Reality
has not obeyed you? Will you just be satisfied to make the claim that
this is an exceptional case? I do not agree.

At one point in your response you say that "THE ancient "the most
recent" common ancestor ... can be deduced, though, by comparing base
haplotypes of several subsets, which can be identified on the haplotype
tree" and if these are "poorly identifiable" this is "due to
insufficient statistics."

So I guess what I am saying is that most datasets I see are
insufficient. To come back to my common sense way of seeing this, if
adding a few new haplotypes or markers (or taking a few away) gives
completely different results in your analysis, as I think it will, then
the "tolerance" must be pretty big, and the statistics are probably
"insufficient" to be confident about the "hence" step above.
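
The common sense check I have in mind could be sketched like this in
Python. The data are invented two-marker haplotypes and the function
names are hypothetical; it only illustrates the idea of taking one
haplotype away at a time and seeing whether the deduced modal changes.

```python
from collections import Counter

def modal_haplotype(haplotypes):
    """Most common allele at each marker (ties go to the smaller value)."""
    modal = []
    for alleles in zip(*haplotypes):
        counts = Counter(alleles)
        best = max(counts.values())
        modal.append(min(a for a, c in counts.items() if c == best))
    return tuple(modal)

def leave_one_out_modals(haplotypes):
    """Set of modal haplotypes obtained by dropping each result in turn."""
    return {
        modal_haplotype(haplotypes[:i] + haplotypes[i + 1:])
        for i in range(len(haplotypes))
    }

# Hypothetical sample: the second marker is split 2 against 3.
sample = [(13, 24), (13, 24), (13, 25), (13, 25), (13, 25)]

print(modal_haplotype(sample))
print(leave_one_out_modals(sample))
```

If dropping a single result gives more than one distinct modal, as it
does here, I would call the statistics "insufficient" for that dataset.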

The parts of the human Y haplotype tree which have been identified so
far are:
1. The broad branching which is now clearly defined by SNPs.
2. A small number of clear STR haplotypes, often found discussed on
forums like this one.
3. The many very shallow recent family groups like the one in my example and
every surname project.

This leaves a lot of gaps, and whenever the data we have is good enough to
fill a gap, this seems to me like a big event. You seem to be saying that
you just look at some data from a clade and you can instantly see the family
tree structure? How? And if you cannot, how can you count mutations other
than just by counting the mutations for each individual result from the one
common ancestor?

Indeed, we are forced to work with a very small amount of representative
data in this field.

Best Regards
