From: "Peter Op den Velde Boots" <>
Subject: [DNA] FT-DNA's R-Z18 and Subgroups Project
Date: Wed, 26 Oct 2011 15:46:50 +0200

Hello All,

It has been a couple of months since the FT-DNA R-Z18 and Subgroups Project
was started, and we recently passed the 50-member mark, so it's about
time we report on our plans and where we currently stand. The web site
of the project can be found here: For the presentation of the
analysis of the results the (we will shortly extend the
name) web site will be used, as we have complete control there and can use
all available web technologies in that environment.

R-Z18 is a new SNP downstream of R-U106 and was discovered as part of
the 1000Genomes analysis activity led by Greg R. M. on
Greg has drawn a phylogeny in which R-Z18 was downstream of R-U106 and
upstream of R-L257, a SNP that was discovered early 2010 in FT-DNA's WTY
Project. The SNP details of Z18, together with a lot of other SNPs, were
submitted to Thomas Krahn, who first named the SNP YSC053 and
incorporated the primer in the set of the WTY project. As expected, the
SNP soon turned up in a participant of the R-U106xL48 WTY project and was
put on FT-DNA's SNP menu, restoring the name to Z18.

About the time the Z18 SNP became available for testing, the results of the
first round of 111-marker upgrades came in. Many L257+ members
ordered this upgrade. One thing that stood out in the results was that all
L257+ turned out to carry DYS463=25 (a >67/"SMGF" marker). The first
WTY-discovered Z18+ also had this allele, and the assumption was made
that DYS463=25 might be characteristic of Z18+. A group of people having
DYS463=25 were asked to test Z18. To date, 19 people with DYS463=25 have
been tested for Z18, of whom 15 were positive; the other four were
scattered over sub-clades of R-U106 (no clear pattern). The tentative
conclusion is that about 75-80% of U106+ with DYS463=25 are Z18+.
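
For anyone who wants to see how much weight 19 tests can carry, here is a
minimal sketch of the arithmetic. It uses a Wilson score interval, which is
our choice for this illustration and not something the project itself has
published; the function name wilson_interval is hypothetical.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    centre = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials
                         + z * z / (4 * trials * trials)) / denom
    return centre - half, centre + half

# Counts taken from the paragraph above: 19 tested, 15 Z18+.
tested, positive = 19, 15
share = positive / tested
low, high = wilson_interval(positive, tested)
print(f"{positive}/{tested} = {share:.0%}, "
      f"95% interval roughly {low:.0%} to {high:.0%}")
```

The interval is wide with so few tests, which is why the conclusion is
called tentative.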

We are currently trying to find as many Z18+ as possible using DYS463=25
as a proxy. For every Z18+ we found via DYS463, we did a cluster
analysis on the basis of shared off-modals against all U106 profiles in our
database. In some cases, we could not find any similar profiles (and
these samples are now in our Z18* Group). In a number of cases we found
a remarkably close group of similar profiles, which we call a cluster. We've
given each cluster a name on the basis of the origin of the first
member. We do not expect that these names will turn out to be related to
the true origin of the cluster, but we guess most people will prefer the
names over STR-related names like 10-11-14-12-11-11, although we are
fully aware it's always hard to predict what will be in vogue in a few
months' time. $:-)
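
To make "shared off-modals" concrete, here is a rough sketch of the idea:
for each profile, list the markers that differ from the modal haplotype,
then count how many of those deviations a database profile has in common
with the known Z18+ sample. The modal values, kit names and the
min_shared threshold below are made up for illustration; they are not
the real U106 modal or the project's actual procedure.

```python
def off_modals(profile, modal):
    """Markers where a profile differs from the modal haplotype."""
    return {m: v for m, v in profile.items() if m in modal and v != modal[m]}

def shared_off_modals(a, b, modal):
    """Markers where both profiles carry the same non-modal value."""
    oa, ob = off_modals(a, modal), off_modals(b, modal)
    return {m for m in oa if ob.get(m) == oa[m]}

def cluster_candidates(seed, database, modal, min_shared=3):
    """Kits sharing at least min_shared off-modals with the seed, best first."""
    hits = []
    for kit, profile in database.items():
        n = len(shared_off_modals(seed, profile, modal))
        if n >= min_shared:
            hits.append((kit, n))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)

# Illustrative values only (assumption): not the real U106 modal.
modal = {"DYS391": 11, "DYS511": 10, "DYS487": 13, "DYS534": 15, "DYS439": 12}
seed = {"DYS391": 10, "DYS511": 11, "DYS487": 12, "DYS534": 14, "DYS439": 12}
database = {
    "kit_A": {"DYS391": 10, "DYS511": 11, "DYS487": 12, "DYS534": 15, "DYS439": 12},
    "kit_B": {"DYS391": 11, "DYS511": 10, "DYS487": 13, "DYS534": 14, "DYS439": 12},
    "kit_C": {"DYS391": 10, "DYS511": 11, "DYS487": 12, "DYS534": 14, "DYS439": 11},
}
print(cluster_candidates(seed, database, modal))
```

A seed with no candidates above the threshold would correspond to what we
put in the Z18* Group.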

1. East Anglia Cluster (DYS391=10, DYS511=11, DYS487=12, DYS534=14 and
possibly 643=11 and 461=11 or 10-11-14-12)
This is a fairly large group of samples; we currently have about 130
listed on the FT-DNA web site and we expect 90% or more of these people
to turn out Z18+. This cluster is called East Anglia because the first
prospects that were identified tended to have surnames closely
linked to this area of England.

2. Swede Cluster (DYS391=10, DYS5458a=8, DYS447=24, DYS464b=14,
DYS607=14, DYS436=13 and DYS444=13-14)
As few people in this cluster have been tested, we do not know exactly
where its boundaries are. We currently estimate there will be at
least 40-50 people in this cluster. Within this cluster is a 147.x SNP
(that's downstream of Z18), but it is currently unclear if this SNP will
be of any use, as it's at least the fifth occurrence in HgR and FT-DNA
has stopped naming further occurrences (hence the ".x").

3. Cumberland Cluster (DYS390=25, DYS5385a=11, YCAIIb=22 and DYS565=11
or 25-11-22-11)
Again a significant cluster, as we currently estimate some 110 prospects.
In this cluster, there are small subgroups with DYS391=null and
DYS425=null, the latter in line with the trend of Null425s being
discovered in all major sub-clades of R-U106.

4. Scandinavia Cluster (DYS439=11, DYS460=10, DYS576=19-21, DYS442=13,
DYS534=16, DYS444=11 and DYS568=12)
This cluster was discovered by Jim Turner and described in a post on
this forum, see:
166437; we currently have some 50 prospects most with a Scandinavian

5. Continental Cluster (DYS385b=15-16, DYS449=30-32)
The STR pattern of this cluster used to be the pattern for the L257 SNP
and we see this cluster as an "extension" of the R-L257 clade. About 70%
of the carriers of the pattern in R-U106 tested L257+, and in the
meantime it has turned out that a significant percentage of those with the
pattern who were L257- (the other 30%) are in fact Z18+ (although a few
are Z18-). It looks like GATA-A10 (a >67/"SMGF" marker) might turn out
to be an indicator for L257, as all L257+ have GATA-A10=13 (with one
clear back mutation) and most known Z18+ L257- have GATA-A10=12. One
curious thing about this cluster is that both L257 and L325 are in
FT-DNA's Deep Clade test and Z18 itself is not, but we have been
informed that Z18 will be added as part of the update of the company's
Y-Tree that will shortly be announced.

There are a few smaller candidate clusters we are currently
investigating, so more of these groups might be on their way. If you
know any people who have one of the patterns mentioned and are U106+,
please advise them to test Z18.
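
For those who keep their results in a spreadsheet or script, a quick check
against the patterns could look something like the sketch below. It only
encodes three of the clusters as listed above (East Anglia core markers,
Scandinavia, Continental); the function names and the max_mismatches
tolerance are our own illustration, and untested markers are counted as
mismatches, which is a deliberately conservative assumption.

```python
# Allowed values per marker, copied from the cluster descriptions above.
CLUSTER_PATTERNS = {
    "East Anglia": {"DYS391": {10}, "DYS511": {11},
                    "DYS487": {12}, "DYS534": {14}},
    "Scandinavia": {"DYS439": {11}, "DYS460": {10}, "DYS576": {19, 20, 21},
                    "DYS442": {13}, "DYS534": {16}, "DYS444": {11},
                    "DYS568": {12}},
    "Continental": {"DYS385b": {15, 16}, "DYS449": {30, 31, 32}},
}

def mismatches(profile, pattern):
    """Number of pattern markers the profile lacks or carries a different value for."""
    return sum(1 for marker, allowed in pattern.items()
               if profile.get(marker) not in allowed)

def matching_clusters(profile, patterns=CLUSTER_PATTERNS, max_mismatches=0):
    """Names of clusters whose pattern the profile fits within the tolerance."""
    return [name for name, pattern in patterns.items()
            if mismatches(profile, pattern) <= max_mismatches]

# Hypothetical profile: East Anglia pattern with one back-mutation on DYS534.
profile = {"DYS391": 10, "DYS511": 11, "DYS487": 12, "DYS534": 15,
           "DYS385b": 14, "DYS449": 29}
print(matching_clusters(profile))                    # strict match
print(matching_clusters(profile, max_mismatches=1))  # allow one back-mutation
```

A tolerance of one mismatch is far too loose for a two-marker pattern such
as the Continental one, so any real use would need a tolerance per cluster.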

When setting up our prospect lists we try to strike a balance between finding as
many Z18+ as possible (thus allowing a little room for back-mutations,
especially if common surnames and family patterns point to a positive result)
and guaranteeing the people on the prospect list a very significant
chance of testing positive. Our intention is to give every prospect a
chance of 80-90% of testing Z18+, while still leaving him a reason to test
in the first place (if the chance were 100%, few people would
consider a Z18 test). We are fully aware that talking about percentages
is only fully justified when lots of tests have been performed, but we
need to move forward as well and we are convinced we give people a very
good idea of their chances: up to now only a single prospect in a single
cluster has tested Z18- (and we put him on the list AFTER knowing this
fact, in the hope of making the boundary of the cluster visible). In all
other clusters the score to date is a full 100%, btw.

In the meantime, we have done a reasonable amount of testing of the other
Z18-related SNPs that were discovered as a result of the 1000Genomes
project. This testing is not yet fully complete, but we currently assume
Z14 and Z19 to be equivalent to Z18 and Z15 to be equivalent to L257.

Why are we doing all this? Our intention is to bring people with a
related origin together (the people in a cluster will share part of
their background), but more importantly, we want to investigate and
describe the history of R-Z18 and R-L257. We are finalising a
significant upgrade of the geographic map system used on the site. The
result will be an interactive map based on the latest Google Maps V3.0
API in which each group or cluster of Z18 will have its own distinctive
colour of pins. We hope and expect to see a
pattern emerge. We intend to add all U106+ Z18- as transparent white
pins, so as to show the distinction between Z18/L257 and U106 as a
whole. The possibility of making full use of the facilities of Google
Maps is one of the reasons for using as our main web site; we
have full freedom to use all current web technology there (in a second
stage, it would be relatively easy to show geographic maps for other
clades, please inform us if you are interested (and sympathise with
R-Z18 of course)).

Any comments? (We prefer constructive discussion over comments from
people who just don't accept the existence of the R-Z18 and Subgroups
Project and want it "dismantled".) $:-)

Peter M. Op den Velde Boots, Amsterdam, Netherlands
Dave Stedman, Newport, Wales.
