From: "Ken Nordtvedt" <>
Subject: Re: [DNA] : low variance MRCA dates for P310 clades in Italy and SE Europe
Date: Tue, 23 Feb 2010 16:30:54 -0700
References: <><009901cab349$8797d6a0$5e82af48@Ken1><><><>

----- Original Message -----
From: "Alan R" <>

The problem I see is that there is no obvious solution to the lack of a good
sample of people with roots in eastern and even central Europe who are
tested at 67 STRs and SNP tested to resolution. So, if the present sample
issue means no analysis can be trusted, is there much hope of that changing
in the next few years?

[[ If you want dates good to a couple of centuries, things look dismal, and
a full SNP collection from the Y won't be dramatically better (more on that
in a moment).

For a single line of 160 generations and a total M of 1/10 (the mutation
rate per generation summed over all markers), which is about the best you
can do after getting rid of a few problem markers, you expect 16 mutations,
and the square root of 16 is 4, or 25 percent. So the 1-sigma uncertainty
in the age estimate is 40 generations, or 1200 years. This does not improve
very much with multiple samples, though it should improve. The problem is
you don't know how much it improves because you don't know the tree
structure, especially the earliest generations of each tree, which will
differ from tree to tree and depend on the luck or conditions on the ground
when the founders' descendants were first expanding.
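
As a rough check on that arithmetic, here is a minimal sketch. It assumes
mutations along the line behave like a simple Poisson count (no allowance
for back mutations) and takes 30 years per generation, which is what
40 generations = 1200 years implies. The function and variable names are
made up for illustration.

```python
import math

def single_line_sigma(generations, total_rate):
    """1-sigma age uncertainty, in generations, for one line of descent.

    Assumes a plain Poisson count of mutations: the expected count is
    generations * total_rate, and its relative uncertainty is 1/sqrt(count).
    """
    expected_mutations = generations * total_rate      # 160 * 0.1 = 16
    relative_sigma = 1.0 / math.sqrt(expected_mutations)  # 1/4 = 25 percent
    return relative_sigma * generations

YEARS_PER_GENERATION = 30  # assumption implied by 40 generations = 1200 years

sigma_gen = single_line_sigma(160, 0.1)
sigma_years = sigma_gen * YEARS_PER_GENERATION
print(f"1-sigma: {sigma_gen:.0f} generations, {sigma_years:.0f} years")
```

Under this model the relative uncertainty depends only on the expected
mutation count, so it falls as 1/sqrt(M) when the total rate M grows.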

I chuckle when I see sigmas quoted which invoke the famous 1 / SqRoot (N)
factor of statistics, which applies when N independent measurements have
been made of something. The kind of clade tree leading to N sample
haplotypes that you would need in order to invoke that factor has the MRCA
spawning N sons, with each son producing a surviving line all the way to
one of the N sample haplotypes of today. A rather extreme scenario! In
reality the N lines of descent are highly redundant (not independent)
because they all coalesce back to the
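
To put rough numbers on how much shared ancestry can erode the
1 / SqRoot (N) factor, here is a toy sketch and not a real coalescent
model. It assumes Poisson mutation counts, an age estimate taken from the
mean mutation count over the samples, and a made-up tree shape: the MRCA
has a few sons, each son's line runs unbranched for some generations, then
splits into equal numbers of independent lines. All names and the example
tree are hypothetical.

```python
import math

def sigma_generations(n_samples, generations, total_rate,
                      trunk=0.0, founders=None):
    """1-sigma age uncertainty (generations) from the mean mutation count.

    Toy tree: the MRCA has `founders` sons; each son's line runs unbranched
    for `trunk` generations, then splits into n_samples / founders
    independent lines. trunk=0 is the star tree with N independent lines.
    """
    if founders is None:
        founders = n_samples
    # Trunk mutations are shared by every sample below that trunk, so they
    # only average down over the number of founders, not over N.
    shared = total_rate * trunk / founders
    # Mutations after the split are private to each sample line.
    private = total_rate * (generations - trunk) / n_samples
    return math.sqrt(shared + private) / total_rate

# 100 samples, 160 generations, total rate 1/10 per generation.
star = sigma_generations(100, 160, 0.1)
late_split = sigma_generations(100, 160, 0.1, trunk=120, founders=2)
print(f"star tree:        {star:.1f} generations")
print(f"two long trunks:  {late_split:.1f} generations")
```

In this toy setup the star tree gets the full 1 / SqRoot (N) improvement,
while the tree with two long shared trunks stays much closer to the
single-line uncertainty, even though the sample count is the same.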

What happens with a full Y SNP count, which might average 1 SNP every two
generations? In 160 generations you have an expected count of 80 SNPs. The
square root of that is plus or minus 9 SNPs, which is 18 generations or
about 540 years, for 1-sigma statistical

The best strategy right now is to use the most extended haplotypes possible
--- get the biggest M possible --- for each and every haplotype used. Ken ]]
