
From: "Anatole Klyosov" <>
Subject: Re: [DNA] Y Tree SNPs can not be counted
Date: Sat, 13 Feb 2010 21:12:03 -0500

>From: Sasson Margaliot <>
>Why is it that you do not simply use standard terminology?

Dear Sasson,

Who said that it is "standard"? The next guy?

I use the terminology that I see as most appropriate for the cases I
describe. Terminology is not carved in stone. Terminology is always

> (Sasson) 1) There are 500 samples. 14 samples are identical, they all
> belong to the same haplotype. You say they are "14 haplotypes.".
> Standardly, there is just one "base" haplotype, and 14 samples are said to
> belong to it.

(Anatole) When 14 haplotypes among 500 are identical to each other, they are
still 14 haplotypes. Yes, they do represent the base haplotypes. So what is
the problem? When I go further and write it as ln(500/14), you can see that
there are 14 haplotypes in the denominator.
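
As a quick numerical check of the ratio written above, here is a minimal
Python sketch. The function name log_ratio is hypothetical, and the snippet
only evaluates ln(500/14); converting that number into a timespan would need
a mutation rate, which this message does not give.

```python
import math


def log_ratio(total_haplotypes: int, base_haplotypes: int) -> float:
    """Natural log of (all haplotypes) / (haplotypes identical to the base)."""
    if base_haplotypes <= 0 or base_haplotypes > total_haplotypes:
        raise ValueError("base_haplotypes must be between 1 and total_haplotypes")
    return math.log(total_haplotypes / base_haplotypes)


# 500 haplotypes in the dataset, 14 of them identical (the base haplotypes).
ratio = log_ratio(500, 14)
print(f"ln(500/14) = {ratio:.3f}")  # prints "ln(500/14) = 3.576"
```

The 14 identical haplotypes enter the denominator as a count, which is why
they are counted here as 14 haplotypes rather than as one.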

I sincerely suggest that you look into the core of the issue, not at some
kind of side dishes.

>(Sasson) 2) The other 486 samples have 1788 differences from the base
>haplotype. You call these differences "mutations". People naturally think
>you count the number of mutation events, because this is what you appear to
>be saying. You probably can estimate the number of mutation events, but of
>course this is not what you count.

(Anatole) I respectfully disagree. People "naturally think" of very
different things. I cannot please everyone who "naturally thinks" of whatever
they want. I have a system I work with, and this system has its logic. I am
not going to change that logic because some people think "naturally" about
something else. In that logic the "diversity" of haplotypes is reflected by
the average number of mutations per marker that have occurred in the time
since their common ancestor. For instance, it is 0.236 mutations per marker.
It means that during that timespan (3350 years in this particular case) every
marker in the chosen haplotype mutated 0.236 times on average.
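
To make the per-marker average concrete, here is a minimal Python sketch
using the hypothetical figures quoted in this message (500 haplotypes of 67
markers each, 1788 mutations in total). It assumes the average is taken over
all 500 haplotypes, base haplotypes included; the function name
mutations_per_marker is hypothetical. The 0.236 figure appears to refer to a
different dataset whose counts are not given here, so the result is not
expected to match it.

```python
def mutations_per_marker(total_mutations: int, haplotypes: int, markers: int) -> float:
    """Average number of mutations per marker across a set of haplotypes."""
    if haplotypes <= 0 or markers <= 0:
        raise ValueError("haplotypes and markers must be positive")
    # Assumption: every haplotype (base ones included) contributes `markers`
    # marker positions to the denominator.
    return total_mutations / (haplotypes * markers)


# Hypothetical example from this message: 1788 mutations from the base
# haplotype, 500 haplotypes, 67 markers each.
avg = mutations_per_marker(1788, 500, 67)
print(f"{avg:.4f} mutations per marker")  # prints "0.0534 mutations per marker"
```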

That is why I called it a "mutation". Every single change of an allele
that makes it shorter or longer is a mutation. And, I repeat, after
3350 years every allele underwent 0.236 mutations on average (in a certain
haplotype). Whatever you call it - an "event", a "change", a "jump", an
"occurrence", "a god's will" - it is still a mutation.

>(Sasson) 3) When a cluster has a prominent sub-cluster, you usually say
>that it "has two common ancestors". This is not a standard way to describe
>the situation.

(Anatole) Says who?

>(Sasson) 4) For confidence intervals, other writers give theoretical upper
>limits for a GENERAL CASE, and they know their intervals are correct. You
>are eliminating all the suspicious sub-clusters, so you are not dealing
>with the most general case. Naturally, your intervals are shorter, since
>you only accept well behaved clusters. But people simply think you give
>wrong intervals, because they are different.

That is not my problem. If those people cannot read and understand, I cannot
help them other than to educate them.

>"You are eliminating all the suspicious sub-clusters"...
Where on Earth did you get that??

>"you only accept well behaved clusters"...
Where on Earth did you get that?

I analyze EVERY branch (I do not use the word "cluster"), and every one has
its own margin of error.

>"But people simply think you give wrong intervals..."
Those people are just plain ignorant. If they were knowledgeable, they
would give their own correctly (as they think) calculated figures. They
would have corrected me using my examples. None of my "critics" here has
done it. They produce empty words and no numbers. They talk in generalities.

>(Sasson) Much of the discussion is concentrated on misunderstandings about

It happens. However, more typically this is not the reason. As you might
have noticed, the "discussion" here is not pointed. It is some fuzzy
generalities, elusive comments, negative remarks without producing a clear
alternative solution. I gave here lately at least five specific examples of
concrete haplotype datasets. Nobody has considered them and given a
different solution, different figures, different (but concrete) margins of
error. That would be a proper and constructive ground for a discussion.
And why do you think none of my "critics" did it? "Terminology"?? Are you

I can tell you why. Because it is my professional territory. Not theirs.

Now, typing of haplotypes and haplogroups, phylogeny, etc. is NOT my
professional territory, and I am not ashamed to admit it. When people talk
about it, I humbly listen and humbly ask questions. I learn. Why don't those
folks who do not understand the dynamics of mutations, because it is NOT
their cup of tea, do the same? Meaning, humbly listen and humbly ask
questions? Why do they not want to learn? A BIG puzzle.


Anatole Klyosov

> If we have, say, 500 of 67-marker haplotypes,
> and 14 haplotypes among them
> are identical to each other (base haplotypes),
> and the other 486 mutated haplotypes have
> (collectively) 1788 mutations from the base,
> then those 1788 mutations are those
> that we count"
