**GENEALOGY-DNA-L Archives**

From: "Ken Nordtvedt" <>

Subject: [DNA] Interclade TMRCA influenced by Number of STRs

Date: Sun, 14 Feb 2010 16:55:28 -0700

It is known that the total mutation rate of all haplotype STRs is what is used to make (interclade) age estimates, which include the TMRCA for any pair of haplotypes.

<G> = [ Sum i of Var(i) ] / [ 2 Sum i of m(i) ] = [ Sum i of Var(i) ] / (2M)

Here i is summed over the STRs, m(i) is the mutation rate of STR i, and <G> is the expected value estimator of the age in generations.

M = Sum i of m(i)

But it turns out that many small m(i) adding up to a total M are better than fewer large m(i) adding up to the same total M, as far as the statistical confidence interval of the age estimate is concerned. For simplicity, let's suppose our total M is composed of N identical-rate STRs, each with m = M/N.

Then the 1 sigma value for G estimation is given by:

dG(1sigma) = Squareroot { G (1 + 4MG/N) / (2M) }

As can be seen, the larger N is, the smaller the 1 sigma dG is for fixed M. This is due to the non-linear way the variance of each individual STR's Var(i) depends on m(i) G.
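To make the dependence on N concrete, here is a minimal sketch that just evaluates the 1 sigma expression above for a fixed total M and several N. The function name `sigma_g` and the values G = 100 generations and M = 0.1 per generation are illustrative assumptions, not numbers from this post.

```python
import math


def sigma_g(G, M, N):
    """1 sigma of the age estimate for N identical-rate STRs with total rate M.

    Direct transcription of dG(1sigma) = sqrt( G (1 + 4MG/N) / (2M) ).
    """
    return math.sqrt(G * (1.0 + 4.0 * M * G / N) / (2.0 * M))


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the post).
    G, M = 100.0, 0.1
    for N in (10, 25, 50, 100):
        print(f"N = {N:3d}   dG = {sigma_g(G, M, N):.1f} generations")
    # Limit of very many very slow STRs: the non-linear term drops out.
    print(f"N -> infinity   dG = {math.sqrt(G / (2.0 * M)):.1f} generations")
```

With these assumed inputs the 1 sigma falls steadily as the same total M is spread over more STRs, approaching Squareroot { G/(2M) } from above.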

Variance of Var(i) = 2 m(i) G {1 + 4 m(i) G }

Without that non-linearity, the 1 sigma would be just Squareroot { G/(2M) }, in which case how M was composed would not matter.
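For readers who want the intermediate step, the 1 sigma expression follows from the per-STR variance in a couple of lines. This sketch assumes the N STRs mutate independently of one another (an assumption not spelled out above), so that their variances simply add, and it uses m(i) = M/N for every STR.

```latex
\begin{align*}
\operatorname{Variance}\Big[\sum_i \mathrm{Var}(i)\Big]
  &= N \cdot 2\,\frac{M}{N}\,G\left(1 + 4\,\frac{M}{N}\,G\right)
   = 2MG\left(1 + \frac{4MG}{N}\right) \\[4pt]
dG^2
  &= \frac{1}{(2M)^2}\,\operatorname{Variance}\Big[\sum_i \mathrm{Var}(i)\Big]
   = \frac{G\left(1 + 4MG/N\right)}{2M} \\[4pt]
dG(1\sigma)
  &= \sqrt{\frac{G\left(1 + 4MG/N\right)}{2M}}
  \;\longrightarrow\; \sqrt{\frac{G}{2M}} \quad \text{as } N \to \infty
\end{align*}
```

The linear term 2 m(i) G sums to 2MG however M is divided up; only the quadratic term carries the 1/N.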

The result has a generalization valid for haplotype STRs of mixed mutation rates.

It should be interesting to see how the actual shape of the Sum i of Var(i) distribution changes for fixed total M, as we go from few fast mutators to many slow mutators.

The simulation program has to be fired up to do this.
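As a rough idea of what such a simulation might look like (this is not Ken's program), here is a small Monte Carlo sketch. It assumes a pair of haplotypes, each G generations from the common ancestor, a symmetric single-step mutation model, and a Poisson number of mutations per STR with mean 2 m G over the two lineages; Var(i) is taken as the squared repeat difference at STR i. All function names and the values G = 100, M = 0.1 are hypothetical choices for illustration.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) count by Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, p = 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def squared_difference(lam, rng):
    """Var(i) for one STR: squared net repeat difference after Poisson(lam) +/-1 steps."""
    steps = poisson(lam, rng)
    d = sum(rng.choice((-1, 1)) for _ in range(steps))
    return d * d


def simulate_age_estimates(G, M, N, trials, seed=0):
    """Return `trials` values of Sum i of Var(i) / (2M) for N identical-rate STRs."""
    rng = random.Random(seed)
    lam = 2.0 * (M / N) * G  # expected mutations per STR over both lineages (assumed model)
    estimates = []
    for _ in range(trials):
        total = sum(squared_difference(lam, rng) for _ in range(N))
        estimates.append(total / (2.0 * M))
    return estimates


def mean_and_sigma(xs):
    """Sample mean and sample standard deviation."""
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / (len(xs) - 1)
    return mu, math.sqrt(var)


def predicted_sigma(G, M, N):
    """The 1 sigma formula from the post, for comparison with the simulation."""
    return math.sqrt(G * (1.0 + 4.0 * M * G / N) / (2.0 * M))


if __name__ == "__main__":
    G, M = 100.0, 0.1  # illustrative values, not from the post
    for N in (10, 100):
        mu, sd = mean_and_sigma(simulate_age_estimates(G, M, N, trials=2000))
        print(f"N = {N:3d}  mean = {mu:6.1f}  simulated 1 sigma = {sd:5.1f}  "
              f"formula = {predicted_sigma(G, M, N):5.1f}")
```

The list returned by `simulate_age_estimates` can also be histogrammed to look at the shape of the distribution, not just its width, for few fast versus many slow mutators.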

Ken

