Archiver > GENEALOGY-DNA > 2007-01 > 1167705361

From: "Ken Nordtvedt" <>
Subject: Re: [DNA] DYS388 mutation rate
Date: Mon, 1 Jan 2007 19:36:01 -0700
References: <>

----- Original Message -----
From: "Bonnie Schrack" <>
To: <>
Sent: Monday, January 01, 2007 7:15 PM
Subject: Re: [DNA] DYS388 mutation rate

> John Chandler wrote:
>> Linkage disequilibrium is a major effect on Y chromosome haplotypes.
>> I made the following table based on a comparison of the modes and rates
>> determined from two disjoint subsets of my Y37 dataset. Subset 1 consists
>> of all haplotypes with DYS388<=12, while Subset 2 has all haplotypes with
>> DYS388>12. The rate estimates for most markers differed sharply between
>> the two subsets, not just for DYS388.
> Thanks very much for this extremely useful work, John! I have been
> discontented for quite a while now with the general, all-purpose
> mutation rates that were being proposed for each marker (regardless of
> haplogroup, etc.), because I knew the mutation rate they were giving for
> DYS388 was far too low in comparison to that of other markers, when
> we're dealing with haplogroup J data, where as you know, the highest
> DYS388 modal values are found. The idea of calculating the mutation
> rate for those with values of 12 or less, vs. those with 13 or more, is
> excellent.

I have serious doubts about the magnitude of the results for the DYS388
marker specifically, when the population is divided in this manner (12 or
fewer repeats at 388 versus 13 or more repeats at 388).

Note that 98.5 percent of the haplotypes in the "low" repeats sub-population
sit on the boundary at 12 repeats. So one loses all the mutations "up" to 13
for the great bulk of haplotypes in that sub-population, and this amounts to
roughly half the mutations for DYS388. On the other hand, only 35 percent of
the haplotypes in the "high" repeats sub-population sit at the boundary value
of 13. So a much smaller percentage of DYS388 mutations is lost there
because of the missing 13-to-12 mutations.
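
The arithmetic behind those two figures can be sketched as follows. This is
only a rough first-order estimate, and it rests on assumptions not stated in
the post: mutations are single-step, "up" and "down" steps are equally likely,
and every haplotype mutates at the same rate regardless of repeat count. The
function name and the p_cross parameter are hypothetical.

```python
def fraction_lost(at_boundary: float, p_cross: float = 0.5) -> float:
    """Rough share of a subset's mutations hidden by the cut.

    at_boundary: fraction of the subset's haplotypes sitting on the boundary value.
    p_cross: assumed chance that a mutation from the boundary crosses the cut
             (0.5 under the symmetric single-step assumption).
    """
    return at_boundary * p_cross


# Boundary fractions quoted in the post: 98.5% at 12 repeats, 35% at 13 repeats.
low = fraction_lost(0.985)
high = fraction_lost(0.35)

print(f"low subset:  about {low:.2%} of DYS388 mutations lost")
print(f"high subset: about {high:.2%} of DYS388 mutations lost")
```

Under these assumptions the "low" subset loses close to half of its DYS388
mutations while the "high" subset loses well under a quarter, so the two rate
estimates are not biased by the same amount.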

I think it much better to divide the haplotype population into "low" and
"high" based on membership in haplogroups or clades: those with a low modal
repeat value at DYS388 in one case, versus those with a high modal repeat
value at DYS388 in the other. This is NOT because we are looking for a
haplogroup effect on mutation rate, but because it does the essential
separation we want based on repeat length without non-randomly throwing away
mutations that cross some artificial boundary. If the separation is done
using robustly identified clades, there will be a negligible count of lost
mutations.
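
A toy comparison of the two ways of splitting may make the difference
concrete. Everything here is hypothetical: the sample is invented, the clade
labels are placeholders, and the summed distance from the modal value is only
a crude stand-in for a mutation count, not the rate estimator used in the
thread.

```python
from collections import Counter

# Hypothetical toy data: (clade label, DYS388 repeat count). Not real haplotypes.
SAMPLE = (
    [("low_clade", 12)] * 8
    + [("low_clade", 13), ("low_clade", 11)]
    + [("high_clade", 15)] * 5
    + [("high_clade", 16), ("high_clade", 14), ("high_clade", 13)]
)


def steps_from_mode(values):
    """Sum of absolute repeat differences from the modal value (crude mutation proxy)."""
    mode = Counter(values).most_common(1)[0][0]
    return sum(abs(v - mode) for v in values)


def split_by_threshold(sample, cut=12):
    """Split on the repeat count itself: <= cut versus > cut."""
    low = [v for _, v in sample if v <= cut]
    high = [v for _, v in sample if v > cut]
    return low, high


def split_by_clade(sample):
    """Split on clade membership, ignoring the repeat count."""
    low = [v for c, v in sample if c == "low_clade"]
    high = [v for c, v in sample if c == "high_clade"]
    return low, high


thr_low, thr_high = split_by_threshold(SAMPLE)
cl_low, cl_high = split_by_clade(SAMPLE)

print("threshold split, low subset steps:", steps_from_mode(thr_low))
print("clade split,     low subset steps:", steps_from_mode(cl_low))
print("threshold split, high subset steps:", steps_from_mode(thr_high))
print("clade split,     high subset steps:", steps_from_mode(cl_high))
```

In this invented sample the threshold split drops the low clade's one upward
step from the "low" subset and hands that haplotype to the "high" subset,
whereas the clade split keeps each haplotype with its own group.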

