From: Malcolm McClure <>
Subject: Re: [R-M222] Geographical distribution of M222+ and a peak at DF23+
Date: Thu, 29 Mar 2012 13:22:51 +0100
Ken Nordtvedt himself says: "There is no way at all to estimate age back to (mutation?) happening of an snp."
His explanation of how he uses variance is at
So far as I can understand his rationale, Nordtvedt has established his variance analysis methodology using data derived from black-box Monte Carlo simulation of mutation rates and clade formation, not from actual SNPs in real families. Nested ANOVA analysis of real data requires randomisation of sampling. For observational data, the derivation of confidence intervals must use subjective models. In practice, estimates of effects from observational studies are often inconsistent. "Statistical models" and observational data can be useful for suggesting hypotheses but should otherwise be treated very cautiously.
On 29 Mar 2012, at 09:37, Sandy Paterson <> wrote:
> Ken Nordtvedt, erstwhile Emeritus Professor of Physics at Montana State has
> proved that
> E(v) = mG
> where E(v)= expected marker variance
> m=mutation rate
> G=number of generations
> So the number of generations taken to reach a given level of dispersion of
> marker scores can be estimated as (observed variance)/(mutation rate).
> Obviously, the more markers the better. This means it's quite natural to
> divide the observed sum of variance of one haplogroup by that of another in
> order to get a feel for the age of one haplogroup relative to another.
> That's what Mike did, and he did so in order to avoid arguments about poorly
> researched mutation rates. I think that's perfectly valid.
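A minimal sketch of the arithmetic described above, in Python. The function names and every numeric input are hypothetical and only illustrate the relation E(v) = mG; pooling over markers assumes the relation holds for each marker separately, and the ratio shortcut assumes both groups are measured on the same set of markers.

```python
def generations_from_variance(variances, rates):
    """Estimate G from E(v) = m * G, pooled over markers.

    Summing the relation over markers gives sum(v_i) = G * sum(m_i),
    so G is estimated as (sum of observed variances) / (sum of rates).
    """
    if len(variances) != len(rates):
        raise ValueError("need one mutation rate per marker")
    total_rate = sum(rates)
    if total_rate <= 0:
        raise ValueError("mutation rates must sum to a positive value")
    return sum(variances) / total_rate


def relative_age(sum_var_a, sum_var_b):
    """Ratio of summed variances for two groups over the SAME markers.

    With identical markers the summed rate cancels, so the ratio
    estimates age(A) / age(B) without needing the rates themselves.
    """
    if sum_var_b <= 0:
        raise ValueError("reference variance must be positive")
    return sum_var_a / sum_var_b


# Hypothetical inputs, chosen only to show the arithmetic.
example_variances = [0.2, 0.3]
example_rates = [0.002, 0.003]  # per marker per generation (assumed values)
generations = generations_from_variance(example_variances, example_rates)
ratio = relative_age(6.3, 10.0)
```

With these made-up inputs the estimate comes to about 100 generations and the ratio to about 0.63. The ratio needs no mutation rates at all, which is the point of quoting relative rather than absolute ages, but it says nothing about how uncertain either figure is.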
> Below are links in .xlsx and .xls formats to a file containing the known
> DF23+ as of this morning.
> The variances sum to 19.99, approximately double the comparable figure for
> M222+. Given that there are only 13 haplotypes so far, it's probably a good
> idea to multiply the 19.99 by 13 and divide by 12 in order to convert to an
> unbiased estimate (Excel doesn't bother with this refinement).
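The adjustment described above is the usual n/(n-1) correction that turns a divide-by-n variance into the unbiased sample variance. A short Python sketch follows; the helper name and the list of marker scores are hypothetical, and only the figures 19.99 and 13 come from the message. Applying the factor to the summed variance assumes every marker was computed from the same 13 haplotypes.

```python
import statistics


def unbiased_from_population(pop_variance, n):
    """Rescale a divide-by-n variance to the divide-by-(n-1) estimate."""
    if n < 2:
        raise ValueError("need at least two observations")
    return pop_variance * n / (n - 1)


# Figures quoted in the message: summed variance 19.99 over 13 haplotypes.
adjusted_sum = unbiased_from_population(19.99, 13)

# Cross-check on made-up repeat counts for a single marker (hypothetical data):
# pvariance divides by n, variance divides by n - 1.
scores = [12, 13, 13, 14, 12, 13]
pop_var = statistics.pvariance(scores)
sample_var = statistics.variance(scores)
rescaled = unbiased_from_population(pop_var, len(scores))
```

For the quoted figures this gives 19.99 × 13 / 12 ≈ 21.66. The cross-check shows that rescaling the divide-by-n variance reproduces the sample variance for the same data.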
> Looking at the names so far, all three of Lamont, Johnson and Kelly now have
> both DF23+ and M222+, although none of these 3 surnames have much M222+. I'm
> sure there will be many surnames with both M222+ and DF23+, but so far none
> have been reported with origins outside of Ireland/Scotland.
> -----Original Message-----
> [mailto:] On Behalf Of Malcolm McClure
> Sent: 28 March 2012 11:13
> Subject: Re: [R-M222] Geographical distribution of M222+
> We can only have confidence in statistical methods based on samples if they
> can be calibrated and verified. So far as I know, no group has yet
> established a verified line of descent from a known MRCA extending back 2000
> years. I am unaware also of peer reviewed studies in any field where
> variance ratios of (samples including estimates) from different populations
> have divulged verifiable results in the time domain.
> Mike's statistical method assumes that all point mutations are equally
> random, unidirectional and unaffected by variation with time of
> environmental mutagens such as solar radiation and nicotine.
> Each of those assumptions seems at best to be questionable.
> On 28 Mar 2012, at 08:15, Sandy Paterson <> wrote:
>> I don't believe what Mike did in dividing the sum of observed variances
>> of M222+ by those of P312 is an abuse at all. Of the 111 markers in the
>> largest FTDNA test, only about 35 of the mutation rates are (reasonably)
>> well-researched; the remaining 76 have to be estimated in order to do
>> estimated TMRCA calculations. Of the 67-marker panel, about 26 are
>> fairly well-researched, with very little known about the remaining 41.
>> I have chosen for the most part to quote sum of variances rather than
>> ETMRCA, but Mike chose to quote ratios to P312. So when he quotes a
>> figure of 0.63 for M222+, that means by his estimate, M222+ is about
>> 63% as old as P312. I can't see anything wrong with that at all,
>> although I don't believe it is accurate to use fewer than around 50
>> markers in the summation. In any event, ETMRCAs are not normally
>> distributed; they are skewed, with a long tail at the high values and a
>> shorter tail at the lower values.
>> -----Original Message-----
>> [mailto:] On Behalf Of Malcolm McClure
>> Sent: 27 March 2012 14:56
>> To: ;
>> Subject: Re: [R-M222] Geographical distribution of M222+
>> I welcome your clear statement about the aims and objectives of M222.
>> It was beginning to get diverted by speculations about the dim and
>> distant past rather than being confined to evidence concerning the
>> relevant past two millennia. Until we can establish a clearer identity
>> for our tribal antecedents, their strifes and allegiances over the
>> generations, their migrations and bottlenecks, we are unlikely to
>> establish individual family and surname antecedents with confidence.
>> The recent correspondence about Variance ratios seems to me to be a
>> misuse of statistical tools that were devised to reflect the shape of
>> Normal Distributions based on a single measurable variable. Lumping
>> variables makes for nonsense statistics. Just my 2¢ worth.
>> R1b1c7 Research and Links:
>> To unsubscribe from the list, please send an email to
>> with the word 'unsubscribe' without
>> the quotes in the subject and the body of the message