GENEALOGY-DNA-L Archives

From: Stephen Forrest
Subject: Re: [DNA] Any feedback on the RCC method of TMRCA estimates?
Date: Mon, 7 Nov 2011 01:31:39 -0500
References: <mailman.1080.1320605197.10215.genealogy-dna@rootsweb.com><20111106152808.L0KW2.1830091.imail@fed1rmwml4201>

Jumping into this discussion: I looked at these papers when the author
posted them on a mailing list earlier. I've been meaning to discuss the
method for a while and detail my objections to it, but haven't done so
aside from a few exchanges with the author. Please note I'm just a hobbyist
with a mathematics background, so I imagine someone who studies this field
in depth could present a much more detailed case.

Sandy's explanation of the RCC definition is correct. My geometric
interpretation (which ties into my objections) is this:

The correlation coefficient is the cosine of the angle theta between the
two shifted (mean-centered) vectors, so the RCC, before the scaling by
10,000, is 1/cos(theta) - 1. For small values of theta this is roughly
theta^2/2, which is half the square of the Euclidean distance between the
shifted vectors once each is scaled to unit length (the squared distance
being the sum of squares of elementwise differences). Using the Euclidean
distance (what mathematicians call the 2-norm) instead of the sum of
absolute elementwise differences (the 1-norm) doesn't seem to be
traditional for comparing STRs, but it at least makes some physical sense.
However, for large values of theta the gap between the RCC and the
Euclidean distance grows, until of course the RCC approaches infinity as
theta -> Pi/2.
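
To make the small-angle comparison concrete, here is a minimal Python
sketch. The two 12-marker haplotypes are invented for illustration, and the
function names (centered, pearson, rcc, unit_sq_dist) are my own, not taken
from the paper. The factor of 10,000 is the scaling described in the quoted
message below; rescaling the shifted vectors to unit length is my
assumption, made so that the distance depends only on the angle.

```python
import math

def centered(v):
    """Shift a vector so that its mean is zero."""
    m = sum(v) / len(v)
    return [x - m for x in v]

def norm(v):
    """Euclidean length (2-norm) of a vector."""
    return math.sqrt(sum(x * x for x in v))

def pearson(a, b):
    """Pearson correlation = cosine of the angle between the shifted vectors."""
    da, db = centered(a), centered(b)
    dot = sum(x * y for x, y in zip(da, db))
    return dot / (norm(da) * norm(db))

def rcc(a, b, scale=10_000):
    """Reciprocal of the correlation, minus 1, times the scale factor."""
    return (1.0 / pearson(a, b) - 1.0) * scale

def unit_sq_dist(a, b):
    """Squared Euclidean distance between the shifted vectors,
    each rescaled to unit length (an assumption of this sketch)."""
    da, db = centered(a), centered(b)
    na, nb = norm(da), norm(db)
    return sum((x / na - y / nb) ** 2 for x, y in zip(da, db))

# Hypothetical STR repeat counts, differing by one step at two markers.
hap_a = [13, 24, 14, 11, 11, 14, 12, 12, 12, 13, 13, 29]
hap_b = [13, 24, 14, 10, 11, 14, 12, 12, 12, 13, 14, 29]

theta = math.acos(pearson(hap_a, hap_b))
print("angle theta (radians):       ", theta)
print("RCC (scaled by 10,000):      ", rcc(hap_a, hap_b))
print("theta^2/2 x 10,000:          ", theta ** 2 / 2 * 10_000)
print("half squared dist. x 10,000: ", unit_sq_dist(hap_a, hap_b) / 2 * 10_000)
```

For closely matching profiles like these the three printed quantities
should nearly coincide, which is the sense in which the RCC behaves like a
squared Euclidean distance at small angles.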

I believe that whatever value the RCC has comes entirely from its use as an
approximation to Euclidean distance for small angles, and that its utility
decreases quickly the farther apart STR profiles become. The paper asserts
there is a universal linear relationship between an RCC unit and some
number of generations. I doubt this very much, because the RCC diverges
from both the 1-norm and 2-norm as the angle between vectors grows, and
because the method makes no use of mutation rates.
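
A short Python sketch of how quickly that divergence sets in, assuming both
shifted vectors are scaled to unit length so that everything depends on
theta alone (my simplification, not something the paper states). The
function names are hypothetical. It compares the unscaled RCC,
1/cos(theta) - 1, with half the squared Euclidean distance,
1 - cos(theta); algebraically their ratio is 1/cos(theta).

```python
import math

def rcc_unscaled(theta):
    """RCC before the factor of 10,000: 1/cos(theta) - 1."""
    return 1.0 / math.cos(theta) - 1.0

def half_sq_dist(theta):
    """Half the squared distance between two unit vectors at angle theta,
    i.e. (2 - 2*cos(theta)) / 2."""
    return 1.0 - math.cos(theta)

print("deg   RCC(unscaled)  half sq. dist   ratio")
for deg in (5, 15, 30, 45, 60, 75, 85):
    t = math.radians(deg)
    r, d = rcc_unscaled(t), half_sq_dist(t)
    print(f"{deg:3d}   {r:12.4f}   {d:12.4f}   {r / d:7.3f}")
```

The ratio stays near 1 for small angles and grows without bound as theta
approaches 90 degrees. This only illustrates the geometry; it says nothing
by itself about how either quantity relates to generations.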

Steve

On 6 November 2011 15:28, RThrift wrote:

> Sandy, please do, that's the point here.
>
> FYI, part 2 of Howard's pair of papers is at
> http://www.jogg.info/52/files/Howard2.pdf
>
> Ken, just read it (part 1).
>
> Once a matrix of pairwise correlation coefficients is set up (some details
> seem to be obscured in the paper), as Sandy mentioned each value is scaled:
> Take the reciprocal of the 'correlation coefficient' so calculated.
> Subtract 1. Multiply by 10,000. =RCC
>
> But that's just the starting point. He in effect comes up with an
> overall mutation rate ("time scale") from known pedigree/haplotype data
> using the Hodges-Lehmann estimator and pairwise TMRCAs. 1 RCC = 43.3
> years, SD ~ 8%.
>
> He argues that the average RCC of a cluster is a good proxy for the MRCA
> of the cluster. Time to the common ancestor = 52.7 x (the average RCC of
> the cluster) (just read the paper!)
>
> Richard Thrift
>
>
> From: "Sandy Paterson" <alexanderpatterson@
> ...I could also easily show how the paper uses (abuses?) the concept in
> trying to apply it to two haplotypes.