From: "Lancaster-Boon" <>
Subject: [DNA] Variance Assessment of R:U106 DYS425Null Cluster
Date: Sat, 6 Feb 2010 10:00:07 +0100

Dear Anatole

Apparently there is something "wrong" with the example Ken gives, which
makes it give a bigger margin of error than the real example you were
discussing. You call it absurd and fuzzy. However it does not seem absurd or
fuzzy to me as a genetic genealogist. Before you write that this must be
because I simply do not understand anything, please hear me out.

I presume this is because it implies no clear ancestral modal and no clear
family tree structure? Is this correct?

Or if not can you explain more precisely what is "wrong" with the example?

To me Ken's example looks like something we see all the time, even within
solidly defined family groups or SNP-defined clades. Again, before you say
this shows I am ignorant, let me clarify what I mean: implied family tree
structures and ancestral haplotypes are normally fuzzy.

My experience in practice is that the implied family tree and ancestral
haplotype coming from a particular set of, let's say, 284 x 25 marker
haplotypes can be completely changed by just using a few different markers
or a few different haplotypes.

I am not only talking about small groups. As you know the E-M35 project has
well over a thousand people, but we still see that predicting a family tree
and ancestral modal for any group of these haplotypes is very sensitive to
relatively small changes in the markers being looked at, or the individuals
being considered.

This indicates to me, in a non-theoretical but nevertheless pretty
convincing way, that Ken's example is not extreme, and that the family trees
and ancestral haplotypes which we can develop as a starting point in your
method are never strong enough to be assumed correct. If you look at real
examples and try to locate the STRs which make the tree assumption fail to
work the same way every time, you find patterns within haplotype groups very
much like the ones Ken mentions.

Perhaps I am wrong, but it seems that the hidden part of your
modelling is not any complicated maths, but just an assumption that the
simplest family tree and ancestral haplotype implied by a group of haplotypes
is true, and can be assumed as a fixed foundation for further work.

Obviously, if you assume this then your whole perspective will change and
everything will be as simple as you describe it. Effectively, a big part of
your margin of error, the part connected to the possibility that your
estimates of the tree structure and ancestral haplotype are wrong, would in
this case be ignored and assumed away.
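
To make this concrete, here is a minimal sketch (not Anatole's or Ken's
actual method) using the ten 4-marker haplotypes from Ken's example quoted
further down. It assumes a simple stepwise count, namely the sum of absolute
repeat differences from an assumed ancestral haplotype, and tries each
observed haplotype in turn as that ancestor. The names HAPLOTYPES and
steps_from are purely illustrative.

```python
# Ken's ten 4-marker haplotypes, expanded from his (copies) notation.
HAPLOTYPES = (
    [(10, 10, 10, 10)] * 4
    + [(10, 10, 9, 10)]
    + [(10, 11, 10, 10)] * 2
    + [(10, 9, 11, 9)]
    + [(10, 9, 10, 9)] * 2
)


def steps_from(base, haplotypes):
    """Sum of absolute repeat differences from an assumed ancestor.

    Assumption: every one-repeat difference is counted as one mutation,
    independently in each haplotype (no shared ancestry is modelled).
    """
    return sum(abs(a - b) for hap in haplotypes for a, b in zip(hap, base))


# Try every observed haplotype as the assumed ancestral haplotype.
counts = {base: steps_from(base, HAPLOTYPES) for base in sorted(set(HAPLOTYPES))}

for base, total in counts.items():
    print(base, total)
```

Under these assumptions the total ranges from 10 (ancestor 10 10 10 10) to
26 (ancestor 10 9 11 9), so the count depends heavily on which ancestor is
taken as given.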

Best regards

From: "Anatole Klyosov" <>
Subject: Re: [DNA] Variance Assessment of R:U106 DYS425Null Cluster
Date: Sat, 6 Feb 2010 00:10:43 -0500
References: <>

From: "Ken Nordtvedt" < >

> From: "Anatole Klyosov" < >
>> If mutation counting is very mysterious to you, how can we go
>> further?

> What you call mutation counting is mysterious to me. Suppose you have a
> collection of ten haplotypes of 4 STRs... But I personally would not
> consider that a count of the number of mutations. Some of those
> differences are probably the consequences of a singular past mutation.
> So is your "mutation count" in this example seven (7) or ten (10)?
> (4) 10 10 10 10
> (1) 10 10 9 10
> (2) 10 11 10 10
> (1) 10 9 11 9
> (2) 10 9 10 9
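
For readers following along, a minimal sketch of one way the "ten" can be
reached: take the most frequent haplotype as the base and sum the absolute
repeat differences of every haplotype from it. This is an assumed counting
rule for illustration only, not a confirmed statement of either poster's
method, and the names GROUPS, expand, base_haplotype and stepwise_count are
hypothetical.

```python
from collections import Counter

# Ken's example as (copies, haplotype) pairs, 4 STR markers each.
GROUPS = [
    (4, (10, 10, 10, 10)),
    (1, (10, 10, 9, 10)),
    (2, (10, 11, 10, 10)),
    (1, (10, 9, 11, 9)),
    (2, (10, 9, 10, 9)),
]


def expand(groups):
    """Turn (copies, haplotype) pairs into a flat list of haplotypes."""
    return [hap for copies, hap in groups for _ in range(copies)]


def base_haplotype(haplotypes):
    """Most frequent haplotype (assumed here to stand in for the ancestor)."""
    return Counter(haplotypes).most_common(1)[0][0]


def stepwise_count(haplotypes, base):
    """Sum of absolute repeat differences from the base, per haplotype.

    Assumption: differences are counted independently in each haplotype,
    so a mutation inherited by several haplotypes is counted several times.
    """
    return sum(abs(a - b) for hap in haplotypes for a, b in zip(hap, base))


haps = expand(GROUPS)
base = base_haplotype(haps)
print(base, stepwise_count(haps, base))  # (10, 10, 10, 10) 10
```

Any lower figure requires deciding which differences descend from a single
shared mutation, which is exactly the tree assumption under discussion.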

My response:

Why didn't you give me just two 2 marker haplotypes and request to count

I am familiar with that "approach" of suggesting an absurd example in order
to dismiss a rational approach. Mind you, I gave you earlier an example of
284 25-marker haplotypes, which clearly show a base (the most frequent)
haplotype. Why wouldn't you stick to THAT case? Why such a burning desire to
dismiss a rational approach, even on the grounds of an absurd/fuzzy example?

Of course I can count mutations in your example using the "permutational"
approach, also described in my paper which I cited in the preceding message.
However, the margin of error in that case will be high. Just read the paper
and you will get an answer. However, I was talking about a much simpler case
and, correspondingly, a much smaller margin of error.

Anatole Klyosov
