GENEALOGY-DNA-L Archives


From:
Subject: Re: [DNA] A Note on L222
Date: Sun, 22 Nov 2009 21:41:24 +0000 (UTC)
In-Reply-To: <1935442127.122401258925166391.JavaMail.root@sz0002a.westchester.pa.mail.comcast.net>


Thomas, as you have noted, the L159.2 and L69.4 mutations are similar to the L222 mutation, because each one shows up in an area between two repeating elements. With L222, both are STRs, while for L159.2 and L69.4 there is an STR with a very high mutation rate on one side and a poly-G sequence on the other. One question I have about these SNPs is whether you think the probability of reversing one of these mutations is high. These mutations break up the repetition, and normally I would expect that to reduce the probability of further mutations, but since the mutation mechanism is affected by nearby STRs, I am not sure how that would work.

My main reason for asking this question is the discovery of one or two R-L21 men with haplotypes quite different from typical Leinster cluster haplotypes, and I have to wonder if at least one of those is due to an independent L159 mutation. Within the Leinster cluster, we have two quite distinct groups, one of which is L159+, and one of which is not. Their 67 marker haplotypes are very close, but there are differences on some key markers between the two groups, with other key marker values shared by the two groups. It seems certain that at least some of the matching is due to common descent, so the surprise L159+ haplotypes are a puzzle. They might be due to a lot of divergence from ancestral haplotypes in their lines, convergence between the groups in the Leinster cluster, common descent with a back mutation on L159 making part of the Leinster cluster L159-, or an independent L159 mutation. Looking at the DNA results alone, I would guess that there was an independent L159 mutation, but it's just a guess.

Since you have mentioned these STRs, I wonder if you think that looking at the number of repeats on these fast STRs would be of any help to us. I think that there are other ways that we could explore the relatedness of these men, but I don't want to ignore the STRs if you think they might be helpful.

Kirsten Saxe

----- Original Message -----
From: "Thomas Krahn" <>
To:
Sent: Sunday, November 15, 2009 5:32:02 PM GMT -05:00 US/Canada Eastern
Subject: [DNA] A Note on L222

A Note on L222:

The L222 (rs9786587, ChrY:15318872 T/G) marker has recently caught the
attention of haplogroup J1 researchers because it showed up "positive"
in three individuals on an Illumina chip-based assay.

Looking at ymap
http://ymap.ftdna.com/cgi-bin/gbrowse/hs_chrY/?start=15318872;stop=15318872;ref=ChrY;width=1024;version=100;grid=on;id=86d6cea3ff791fbfb8cd1ebca6668b3a;label=CYT%3Aoverview-Ideogram-Palindromes-DNA%2FGC%20Content-Genes-STRs-dbSNP-JW-KOREF-NA07022-NA18507-HelicosP0-CV-YH-M-PS-PerlegenPrimerPairs-other;h_feat=M2%40yellow
shows that this same mutation has previously been observed in
the NA18507 cell line, which is known to be of African origin and,
according to the listed mutations, can be assigned to haplogroup E-U174*.

This already warns us that the marker's stability is questionable, and
we need to split the marker into two categories, one for each
haplogroup.
By definition I have therefore assigned
L222.1 to L222 derived individuals downstream of haplogroup E (exact
phylogenetic location of the mutation not determined) and
L222.2 to L222 derived individuals downstream of haplogroup J-L147.

The full amplicon sequencing of L222+ control individuals and J-L147+
negative controls revealed the actual reason for the L222 marker
instability. Here are the actual sequencing results of an L222+ sample
aligned to the (HUGO) reference sequence:
AAGGTAGAGATAGAAACAGGTAGACAGGTATATAGATGATGGGTATATAGATGAGAGAGAGAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGGCAGGCAGGCAGACAGAGAGACAGATACATAGGCAGGTAGGTAGGTAGAGGACAGAG
AAGGTAGAGATAGAAACAGGTAGACAGGTATATAGATGATGGGTATATAGAT----GAGAGAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGGCAGGCAGGCAGACAGAGAGACAGATACATAGGCAGGTAGGTAGGTAGAGGACAGAG
................................................................^ L222

The top sequence is the L222+ control and the bottom sequence is the
HUGO reference. (If the lines break up, you may try to reconstruct the
sequences in an external editor with a fixed-width font.)

We also see that the whole area around the L222 position is crowded
with repetitive elements.
Notably, the L222 marker sits right at the border between a (GA)n and a
(TAGA)n STR repeat block. The L222 positive controls seem to have an
insertion of two (GA) repeats, just like an STR frame shift mutation.
This 4bp mutation in a repeat block is very reminiscent of the
mechanism of regular STR mutations, which we already know are not UEPs
(unique event polymorphisms) and which can happen back and forth in
both directions. With this background knowledge we must be very
cautious when interpreting L222+ or L222- results in a phylogenetic
context. Because of the relatively low repeat count I expect a rather
low mutation frequency, which may make the marker useful for
classification downstream of J-L147, but on the other hand we need to
be aware that a back mutation or a parallel mutation can never be
absolutely excluded.
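
The insertion can also be recovered mechanically from the two
sequences. The following is only a minimal Python sketch, not the
procedure used for the sequencing analysis. It assumes the sample and
the reference differ by one insertion and trims the longest shared
prefix and suffix. LEFT and RIGHT are the flanks shared by both rows of
the alignment above; the names LEFT, RIGHT and find_insertion are
made up for this illustration.

```python
from typing import Tuple

# Flanks shared by both rows of the alignment (copied from the sequences above).
LEFT = "AAGGTAGAGATAGAAACAGGTAGACAGGTATATAGATGATGGGTATATAGAT"
RIGHT = ("TAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGGCAGGCAGGC"
         "AGACAGAGAGACAGATACATAGGCAGGTAGGTAGGTAGAGGACAGAG")

SAMPLE = LEFT + "GAGAGAGAGAGA" + RIGHT  # L222+ control: six GA repeats
REFERENCE = LEFT + "GAGAGAGA" + RIGHT   # reference: four GA repeats


def find_insertion(sample: str, reference: str) -> Tuple[int, str]:
    """Return (1-based start, inserted bases), assuming a single insertion.

    Inside a repeat block the exact placement is arbitrary. Trimming the
    prefix first pushes the reported insertion as far right as possible.
    """
    prefix = 0
    while prefix < len(reference) and sample[prefix] == reference[prefix]:
        prefix += 1
    suffix = 0
    while (suffix < len(reference) - prefix
           and sample[-1 - suffix] == reference[-1 - suffix]):
        suffix += 1
    return prefix + 1, sample[prefix:len(sample) - suffix]


start, inserted = find_insertion(SAMPLE, REFERENCE)
print(start, inserted)
```

Trimming the suffix first would report the same four bases at the left
end of the (GA)n block instead, which is one way to see that the
position of such an insertion within a repeat is a matter of convention.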

This type of marker, formed by the fusion of two distinct STR repeats,
has recently been observed at a few other Y chromosome locations. The
L69/L159 and L49/L138 marker complexes are in the same category and are
likely driven by the same mutation mechanism.

Looking at the alignment above it appears that both sequences have the
ancestral T allele at L222. So why does the Illumina chip recognize a G
allele?
The alignment I have shown was created by finding the fewest base
mismatches, and it treats insertions and deletions with a
very low priority. We can also create a slightly less optimal alignment
by shifting the insertion into the (TAGA) repeat region:
AAGGTAGAGATAGAAACAGGTAGACAGGTATATAGATGATGGGTATATAGATGAGAGAGAGAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGGCAGGCAGGCAGACAGAGAGACAGATACATAGGCAGGTAGGTAGGTAGAGGACAGAG
AAGGTAGAGATAGAAACAGGTAGACAGGTATATAGATGATGGGTATATAGATGAGAGAGATAGATAGATAGATAGATAGATAGATAGATAGATAGA----TAGGCAGGCAGGCAGACAGAGAGACAGATACATAGGCAGGTAGGTAGGTAGAGGACAGAG
............................................................^ L222
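
The difference between the two gap placements can be counted column by
column. The following is a minimal Python sketch under the assumption
that the gapped reference rows can be rebuilt by inserting four dashes
into the ungapped reference; with_gap and compare_rows are hypothetical
helper names, and LEFT and RIGHT are the flanks shared by both rows.

```python
# Flanks shared by both rows of the alignment (copied from the sequences above).
LEFT = "AAGGTAGAGATAGAAACAGGTAGACAGGTATATAGATGATGGGTATATAGAT"
RIGHT = ("TAGATAGATAGATAGATAGATAGATAGATAGATAGATAGATAGGCAGGCAGGC"
         "AGACAGAGAGACAGATACATAGGCAGGTAGGTAGGTAGAGGACAGAG")

SAMPLE = LEFT + "GAGAGAGAGAGA" + RIGHT  # L222+ control: six GA repeats
REFERENCE = LEFT + "GAGAGAGA" + RIGHT   # reference: four GA repeats


def with_gap(seq, column, width=4):
    """Insert `width` gap characters before the 0-based `column`."""
    return seq[:column] + "-" * width + seq[column:]


def compare_rows(top, bottom):
    """Return (mismatches, gap_columns) for two equal-length alignment rows.

    Each mismatch is a tuple (1-based column, top base, bottom base).
    """
    if len(top) != len(bottom):
        raise ValueError("alignment rows must have equal length")
    mismatches, gap_columns = [], 0
    for column, (a, b) in enumerate(zip(top, bottom), start=1):
        if "-" in (a, b):
            gap_columns += 1
        elif a != b:
            mismatches.append((column, a, b))
    return mismatches, gap_columns


# First alignment: gap placed directly in front of the (GA)n block.
first = with_gap(REFERENCE, len(LEFT))
# Shifted alignment: gap placed at the end of the (TAGA)n block.
shifted = with_gap(REFERENCE, REFERENCE.index("TAGGCAGG"))

print(compare_rows(SAMPLE, first))
print(compare_rows(SAMPLE, shifted))
```

Both placements cost four gap columns. The first has no mismatched
bases, while the shifted one has a single G-over-T mismatch at the
column marked L222, which is why it is only slightly less optimal.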

The Illumina chip uses a probe ending with the sequence
TGGGTATATAGATGAGAGAGA
and just appends the following base (T or G), so it treats this as a
(T/G) polymorphism. However, it is more appropriate to describe it as
an (ins/del) polymorphism. That's why FTDNA is going to report L222 as
ins+ or del-.
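
The effect of the probe can be mimicked with plain string matching.
This is a simplified Python sketch, not a model of the actual chip
chemistry: it assumes the probe matches the template exactly and that
the "call" is simply the next base. The two windows are short excerpts
around the probe site from the sequences above, and the names
base_after_probe and describe are invented for the example.

```python
PROBE = "TGGGTATATAGATGAGAGAGA"  # probe sequence quoted above

# Short windows around the probe site, excerpted from the aligned sequences.
REFERENCE_WINDOW = "GATGATGGGTATATAGAT" + "GA" * 4 + "TAGATAGA"
SAMPLE_WINDOW = "GATGATGGGTATATAGAT" + "GA" * 6 + "TAGATAGA"


def base_after_probe(template, probe=PROBE):
    """Return the base directly after the probe match, or None.

    Assumption: exact string matching stands in for probe hybridization.
    """
    start = template.find(probe)
    if start == -1:
        return None
    end = start + len(probe)
    return template[end] if end < len(template) else None


def describe(template):
    """Translate the appended base into the ins+/del- reporting style."""
    calls = {"T": "del- (read as T)", "G": "ins+ (read as G)"}
    return calls.get(base_after_probe(template), "no call")


print("reference:", describe(REFERENCE_WINDOW))
print("L222+ sample:", describe(SAMPLE_WINDOW))
```

With four GA repeats the probe runs straight into the T of the first
TAGA unit. With six, the base after the probe is the G of the fifth GA
repeat, so the insertion is read as a G allele.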

Summary:
- L222 is observed in two different haplogroups. L222.2 is assigned to
the J-L147 branch.
- The L222 mutation is really an STR-like insertion of 4 bases (GAGA).
- The STR-like nature of this marker requires caution if the results are
interpreted in a phylogenetic context.
- An L222.2+ insertion is interpreted as a G+ allele in Illumina
chip-based assays.

I hope this helps,

Thomas



