From: (John Chandler)
Subject: Re: [DNA] Middle Eastern ancestral markers on new Euro 1.0 test
Date: Fri, 27 May 2005 12:31:48 -0400 (EDT)

Ann wrote:
> The Euro test subdivides just the "European" component of the prior tests --

The word "subdivides" is misleading here, but read on...

> in other words, the four categories (Northern, Middle Eastern, Southeastern
> Europe, and South Asian) will always add up to 100%, regardless of whether you
> tested 50%, 75%, or 100% European on the early tests with global subdivisions.

This statement is exactly true, but it is important to realize that
the test can be (and assuredly has been) applied to people with 0% and
25% European ethnicity. The results still add up to 100%, by design.
All humans have all of the markers used in this test, and so everybody
will get a result. I would guess that among the 310 markers used in
this new test there are at least a few that are also in the global
tests, but clearly there are many additional markers.

This test does not isolate just the European portion of someone's
genome, but rather assays the whole picture.

> There's lots more detail about the test methodology at the Ancestry by DNA
> site:

This web site presents some more simulation results that are supposed
to give you an idea of the test uncertainties. Unfortunately, the
testing company still does not understand the math behind their own
test, and the simulations do not mean anything like what the
discussion says. They calculate a "bias" for each type of simulation
and claim that this bias is due to the continuous allele frequencies
of the markers used in the test. Actually, the bias is due to their
convention of truncating all off-scale values at 0% or 100%. For
example, they start with a simulated sample that is 100% South Asian
and find that the average result is only 94% South Asian. Obviously,
the random variation within the South Asian population will give a
variety of results, but these variations are both plus and minus, and
all the plus variations lead to results >100%, which are clipped off
and reported as just 100%. This clipping is the source of the bias.
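
The clipping effect can be reproduced with a small simulation. This is only a sketch under assumptions that are not in the company's description: the raw estimates are taken to scatter normally around the true value, and the spread of 15 percentage points is a hypothetical number picked for illustration.

```python
import math
import random


def clipped_mean(true_value, sigma, n=200_000, seed=0):
    """Average of simulated estimates after truncating each to [0, 100].

    Assumes (hypothetically) that raw estimates are normally distributed
    around true_value with standard deviation sigma, in percentage points.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        estimate = rng.gauss(true_value, sigma)
        # Off-scale values are reported as exactly 0% or 100%.
        total += min(100.0, max(0.0, estimate))
    return total / n


sigma = 15.0  # hypothetical scatter, not a figure from the test
mean = clipped_mean(100.0, sigma)
bias = 100.0 - mean

print(f"average reported value: {mean:.1f}%")
print(f"bias from clipping:     {bias:.1f} points")
print(f"sigma / sqrt(2 pi):     {sigma / math.sqrt(2 * math.pi):.1f} points")
```

With a true value of 100%, every upward fluctuation is cut off while every downward one is kept, so the average comes out below 100% even though nothing in the simulated markers is biased.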

Fortunately, we can recover the statistical information implicit in
these results. If we multiply the bias by 2 x sqrt(2 pi), we get out
the statistical uncertainty. Bottom line: from the simulations, we
can conclude that the statistical uncertainty is 43 percentage points
(!) on the North European estimate, 31 on the Southeastern European, 5
on the Middle Eastern, and 29 on the South Asian. Note that the
Middle Eastern results are far less uncertain than the others.
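
The factor can be checked with a short derivation. It is a sketch that assumes the unclipped estimate X for a 100% sample is normally distributed about 100 with standard deviation sigma, which the simulations do not state. Reading "statistical uncertainty" as a two-sigma range is likewise an interpretation, not something spelled out above.

```latex
% Assumption: X = 100 + \sigma Z with Z \sim N(0,1); reported value is \min(X, 100).
\begin{align*}
  \mathrm{E}[\min(X, 100)]
    &= 100 + \sigma\,\mathrm{E}[\min(Z, 0)]
     = 100 - \frac{\sigma}{\sqrt{2\pi}}, \\
  b &= 100 - \mathrm{E}[\min(X, 100)] = \frac{\sigma}{\sqrt{2\pi}}, \\
  2\sqrt{2\pi}\, b &= 2\sigma .
\end{align*}
```

For the South Asian case, a bias of about 6 points (100% reported as 94%) gives 2 x sqrt(2 pi) x 6, or roughly 30, in line with the 29 quoted above if the 94% figure is rounded.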

Applying this to Charles' results, we can say that his Southeastern
European reading is statistically not significant; his Northern
European reading is just barely significant (but it's too big to
ignore, of course); and his Middle Eastern reading really does mean
something.

What it actually means is another story. Don't forget that everything
depends on exactly who was included in the four reference samples that
define these categories.

John Chandler
