GENEALOGY-DNA-L Archives



From: "Ken Nordtvedt" <>
Subject: Re: [DNA] S21/S28 Split+m223 stuff
Date: Mon, 19 May 2008 22:02:57 -0600
References: <018701c8b9dc$c8cbeb10$6400a8c0@Ken1><ea3bd9560805191258n7d57e52an52752fc51f79edc4@mail.gmail.com><01f301c8b9ed$119eb2e0$6400a8c0@Ken1><ea3bd9560805191336u212d50bxda454d1c27958f9f@mail.gmail.com><024001c8b9f5$b85eb640$6400a8c0@Ken1><ea3bd9560805191457j17fda021nd57964e0743a802d@mail.gmail.com><026301c8b9fd$d86792b0$6400a8c0@Ken1><00d601c8ba05$6159b150$0100a8c0@john><031b01c8ba08$bbd348f0$6400a8c0@Ken1><002101c8ba27$5c3f6ad0$0100a8c0@john>


There is a confidence interval for the formula. It, incidentally, is so
straightforward and intuitive (the one for age to MRCA for two different
clade populations) that it must be used widely in journal papers. I think it
has never been applied before to the S21 and S28 populations. If anyone
has, it should be straightforward to compare calculations and results.

I have not yet worked out the confidence interval for the interclade
formula, but I'd guess it is something like

    delta G / G = [1 / sqrt(2MG)] * [1 / sqrt(effective N)]

with effective N being the number of effectively independent paths from a
haplotype in clade A to a haplotype in clade B. N would have nothing to do
with sample population size and could be close to 1 for two clades connected
mainly by long pre-MRCA single branch lines. But the 2MG values are typically
of order 16 or so, so delta G / G of 25 percent or less would not be
unreasonable. Our S21/S28 application has essentially no appreciable
branch lines before their two MRCAs, so we might get some effective N > 1.
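
As a rough numerical check of the guessed relation above, here is a minimal Python sketch. The function name and arguments are hypothetical, and the formula is only the guess stated in this message, not a worked-out confidence interval.

```python
import math

def relative_uncertainty(two_mg, effective_n=1.0):
    """Guessed relative uncertainty delta G / G.

    two_mg      -- the expected interclade variance 2MG
    effective_n -- number of effectively independent paths between the
                   two clades (assumed to be at least 1)
    """
    if two_mg <= 0 or effective_n <= 0:
        raise ValueError("two_mg and effective_n must be positive")
    return (1.0 / math.sqrt(two_mg)) * (1.0 / math.sqrt(effective_n))

# With 2MG of order 16 and a single effective path, the guess gives 25 percent.
single_path = relative_uncertainty(16.0)
# More independent paths shrink the uncertainty by 1 / sqrt(effective N).
four_paths = relative_uncertainty(16.0, effective_n=4.0)
```

With 2MG = 16, one effective path corresponds to the 25 percent figure quoted above, and any effective N greater than 1 would lower it.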

Yes, it assumes selection neutrality of the Y-DNA mutations. But we can't
throw in selection for one application and leave it out for other
applications.

Remember what the basic counting is all about for the S21/S28 MRCA. We are
just averaging the squared-distance version of genetic distance between
pairs of haplotypes, one of the pair being a present-day S21+ and the other
of the pair being an S28+. If G is the distance back to the MRCA for the S21
and S28 clades, each pair is precisely separated by 2G generations. The
expected variance produced in 2G generations is 2MG for every pair. This is
not rocket science, although my roundabout derivation sort of was. This
problem of interclade MRCA age estimation does not have the complication of
path length differences between different pairs of haplotypes that you raise
and which modifies the self variance for a single clade's MRCA age.
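
The counting described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Excel software mentioned below: it assumes the squared distance is summed over markers, and that M is the sum of per-marker mutation rates per generation. The function name, the toy haplotypes, and the rates are all made up for the example.

```python
from itertools import product

def interclade_generations(clade_a, clade_b, rates):
    """Estimate G, the generations back to the MRCA of two clades.

    clade_a, clade_b -- lists of haplotypes, each a tuple of STR repeat counts
    rates            -- assumed per-marker mutation rates per generation
    """
    total_rate = sum(rates)  # M, under the assumption stated above
    # Squared distance, summed over markers, for every cross-clade pair.
    pair_sq = [
        sum((x - y) ** 2 for x, y in zip(hap_a, hap_b))
        for hap_a, hap_b in product(clade_a, clade_b)
    ]
    mean_sq = sum(pair_sq) / len(pair_sq)
    # Expected squared distance for each pair is 2MG, so G = mean / (2M).
    return mean_sq / (2.0 * total_rate)

# Hypothetical two-marker haplotypes, for illustration only.
s21_like = [(14, 12), (14, 13)]
s28_like = [(15, 12), (16, 12)]
toy_rates = [0.002, 0.002]
g_estimate = interclade_generations(s21_like, s28_like, toy_rates)
```

Every pair takes one haplotype from each clade, so each pair spans the same 2G generations and no per-pair path-length correction enters the average.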

My Excel software is available to anyone who wants to check it out for
validity and mistakes, or for use.

Ken


----- Original Message -----
From: "Alister John Marsh" <>
To: <>
Sent: Monday, May 19, 2008 9:12 PM
Subject: Re: [DNA] S21/S28 Split+m223 stuff


> Ken,
>
> I admit my posting was based on assumptions. I was putting forward a way
> of looking at the issue. I don't think the assumptions can be considered
> disproven just yet.
>
> I think that an Ockham's razor approach would find the assumption that
> S116+ was in Ireland and other places 8,000 years ago would be one of the
> simpler ways of explaining the present day distribution of S116+. However,
> I have an open mind, and it will be interesting if 8,000 year old Y-DNA is
> ever recovered to prove this matter one way or the other.
>
> If S28+ is not long after the common ancestor of S21+ and S28+, an
> explanation could be that S28+ is older than 8,000 years. Could it be
> that "if" S28+ is 8000+ years old, that your formula would need a "fudge
> factor" added?
>
> Your formula is probably reliant on assumptions, such as assumptions that
> no Y-STR mutations are detrimental to the long term survival of a Y-line.
> Your formula may give misleading results by having large numbers of
> survivors of recent branches, and none or very few survivors from very old
> thin branches from antiquity, or none from extinct branch lines. Perhaps
> what your formula is showing is the time of large expansion of S28+,
> rather than the age of S28+?
>
> Do you have "statistical confidence intervals" for your formula? Is the
> 95% confidence interval 2000 to 8000 years?
>
> John.


