GENEALOGY-DNA-L Archives



From: Thomas Krahn <>
Subject: [DNA] A little note about haplogroup G Deep Clade tests
Date: Fri, 27 Jul 2007 17:55:38 -0500


Dear List,

While developing a new SNaPShot multiplex for Y haplogroup G, I
came across some difficulties of a technical and phylogenetic nature.
This note explains why the G deep clade results that will be released in
the next few days may look a little unexpected to some.

First I need to mention that P15 (on segment DYS221) is not exactly at
the position published in [1], but actually 2 bases to the left (in the
direction of Yp):

(use fixed width)

TCTTACGCCTGAAGGCAGATGAAAGTTGCAAAAGTGATTCCCA --> HUGO DNA
~~~~~~~~~~~~~~~~~~~~~~~AGTTGCAAAAGTGATTCCCA --> Published P15 reverse Primer
tcttacgTctgaaggcagatgaaAGTTGCAAAAGTGATTCCCA --> Published P15 SNP
tcttaTgcctgaaggcagatgaaAGTTGCAAAAGTGATTCCCA --> De facto P15 SNP
~~~~~~GCCTGAAGGCAGATGAAAGTTGCAAAAGTGATTCCCA --> My SNaPShot Primer
(reverse cpl.)
.....^.^................................... --> Two different positions

Published position is 138
Actual position is 136 on the PCR segment defined by the published
primer pair.
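
For readers who want to check the two-base shift themselves, here is a
minimal sketch that compares the rows of the alignment above column by
column. The sequences are copied from the alignment; the function and
variable names are made up for illustration, and the columns are counted
within the displayed 43-base window, not on the PCR segment (where the
positions are 138 and 136).

```python
# Rows copied from the fixed-width alignment above.
REFERENCE = "TCTTACGCCTGAAGGCAGATGAAAGTTGCAAAAGTGATTCCCA"  # HUGO DNA
PUBLISHED = "tcttacgTctgaaggcagatgaaAGTTGCAAAAGTGATTCCCA"  # published P15 SNP
DE_FACTO = "tcttaTgcctgaaggcagatgaaAGTTGCAAAAGTGATTCCCA"   # de facto P15 SNP


def mismatch_columns(reference, variant):
    """Return the 1-based columns where variant differs from reference.

    The comparison ignores case, because the alignment uses lower case
    only to mark the part outside the published primer.
    """
    pairs = zip(reference.upper(), variant.upper())
    return [col for col, (ref, var) in enumerate(pairs, start=1) if ref != var]


published_cols = mismatch_columns(REFERENCE, PUBLISHED)
de_facto_cols = mismatch_columns(REFERENCE, DE_FACTO)
shift = published_cols[0] - de_facto_cols[0]

print(published_cols, de_facto_cols, shift)  # [8] [6] 2
```

The two columns match the "^" marks in the last row of the alignment,
and their difference is the same 2 bases as between 138 and 136.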

Of course this communication error has caused some delay because I
needed to order a second primer.
But now P15 works fine and makes perfect sense with control DNA and the
phylogenetic context.
The first 50 customers have been confirmed by direct sequencing of the
PCR product.

It is very important to know that P16, P18 and P20 are palindromic SNPs
at Yq11. The exact positions can be visualized by reverse ePCR on the
NCBI site (http://www.ncbi.nlm.nih.gov/sutils/e-pcr/reverse.cgi):

Primer input:
P16 aggctccatctgtagcacac taaccttatagaccaaccccg 150-300
P18 tggatctgattcacaggtag ccaacaatatgtcacaatctc 500-750
P20 tggatctgattcacaggtag ccaacaatatgtcacaatctc 500-750
DYS464 TTACGAGCTTTGGGCTATG CCTGGGTAACAGAGAGACTCTT 200-400
DYF399 GGGTTTTCACCAGTTTGCAT CCATGTTTTGGGACATTCCT 250-350
DYS385 AGCATGGGTGACAGAGCTA TGGGATGCTAGGTAAAGCTG 350-450

Dataset: Homo sapiens genome

DYS464, DYS385 and DYF399 were added only for better orientation on the
palindromic map.
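
As a small illustration of why P18 and P20 end up on the same segments,
the primer input can be parsed and grouped by primer pair. This is only
a sketch: the field layout (name, forward primer, reverse primer, size
range) is read off the lines above, and the function and key names are
hypothetical, not part of any NCBI tool.

```python
from collections import defaultdict

# Primer input copied from the list above.
PRIMER_INPUT = """\
P16 aggctccatctgtagcacac taaccttatagaccaaccccg 150-300
P18 tggatctgattcacaggtag ccaacaatatgtcacaatctc 500-750
P20 tggatctgattcacaggtag ccaacaatatgtcacaatctc 500-750
DYS464 TTACGAGCTTTGGGCTATG CCTGGGTAACAGAGAGACTCTT 200-400
DYF399 GGGTTTTCACCAGTTTGCAT CCATGTTTTGGGACATTCCT 250-350
DYS385 AGCATGGGTGACAGAGCTA TGGGATGCTAGGTAAAGCTG 350-450
"""


def parse_primer_line(line):
    """Split one input line into marker name, primers and size range."""
    name, forward, reverse, size_range = line.split()
    min_size, max_size = (int(part) for part in size_range.split("-"))
    return {
        "name": name,
        "forward": forward.upper(),
        "reverse": reverse.upper(),
        "min_size": min_size,
        "max_size": max_size,
    }


records = [parse_primer_line(line) for line in PRIMER_INPUT.splitlines() if line.strip()]

# Group marker names by their (forward, reverse) primer pair.
markers_by_pair = defaultdict(list)
for record in records:
    markers_by_pair[(record["forward"], record["reverse"])].append(record["name"])

shared = [names for names in markers_by_pair.values() if len(names) > 1]
print(shared)  # [['P18', 'P20']]
```

P18 and P20 are amplified by the same primer pair, so any ePCR hit for
one of them is necessarily a hit for the other.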

http://www.ncbi.nlm.nih.gov/sutils/e-pcr/reverse.cgi?rid=46aa4ca6b6a7744c

The result can be displayed in Mapviewer:
http://www.ncbi.nlm.nih.gov/mapview/maps.cgi?taxid=9606&chr=Y&RID=46aa45dce0197809&QUERY_NUMBER=*&remote=e-PCR&maps=epcr_set&cmd=focus

P18 and P20 are on the same 3 segments. They are close to the 3 DYF399
alleles on palindrome P1 and on the segment between P3 and P4
palindromes. So there are really three alleles that describe the SNP
state for each of these markers.
The theoretically possible results may be:

P18:
(C-)-(C-)-(C-) -> ancestral alleles at all 3 markers
(C-)-(C-)-(T+) -> ancestral alleles at 2 markers and derived allele on
one marker
(C-)-(T+)-(T+) -> ancestral alleles at one marker and derived on two markers
(T+)-(T+)-(T+) -> derived alleles at all markers

P20 in a more abbreviated form:
3(C-)
2(C-)-(delC+)
(C-)-2(delC+)
3(delC+)
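
The two lists above follow one simple rule: with n copies of a marker,
anywhere from 0 to n of them can carry the derived allele. A minimal
sketch of that enumeration, using the abbreviated notation of the P20
list (the function name is made up for illustration):

```python
def allele_states(copies, ancestral, derived):
    """List the possible states of a marker present in `copies` copies.

    Each state is written in the abbreviated form used above, e.g.
    "2(C-)-(delC+)" for two ancestral copies and one derived copy.
    """
    states = []
    for n_derived in range(copies + 1):
        n_ancestral = copies - n_derived
        parts = []
        for count, allele, sign in ((n_ancestral, ancestral, "-"),
                                    (n_derived, derived, "+")):
            if count == 0:
                continue  # no copies with this allele, so leave it out
            prefix = str(count) if count > 1 else ""
            parts.append(f"{prefix}({allele}{sign})")
        states.append("-".join(parts))
    return states


print(allele_states(3, "C", "T"))
# ['3(C-)', '2(C-)-(T+)', '(C-)-2(T+)', '3(T+)']
print(allele_states(3, "C", "delC"))
# ['3(C-)', '2(C-)-(delC+)', '(C-)-2(delC+)', '3(delC+)']
```

The same function with copies=2 gives the three states of a marker that
sits on the two arms of a single palindrome.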

We have observed the alleles 3(C-) and 2(C-)-(T+) at P18 and the alleles
3(C-) and 2(C-)-(delC+) at P20.
Of course a recLOH may change this very quickly, and we haven't yet
observed enough derived states to get a clear picture at these markers.

The NCBI site seems to have a glitch in the region around palindrome
P4 because it shows 4 DYS385 alleles. In fact there are only two. P16 is
also on palindrome P4, which means that it too exists in two copies and
will be reported with two alleles:

2(A-) -> ancestral
(A-)-(T+) -> heterozygote
2(T+) -> pure derived

We have already observed all three states at P16 which proves that
recLOH happens.
This means that after the point mutation was acquired, some descendants
have copied over the SNP to the other palindromic arm which consolidated
the haplotype to a stable 2(T+) conformation. On the other hand, it is
also possible that a heterozygous P16 reverted to 2(A-), so that the
derived mutation was lost again even though the carriers are
phylogenetically below haplogroup G2a.
I would assume a 50% chance of reversal, i.e. as many reversals as 2(T+)
positives, unless I know a reason for an asymmetrical distribution. For
this reason I no longer consider P16 a phylogenetically relevant
marker and I propose to deprecate P16 (and P18 and P20) from the Y
haplogroup tree. On the other hand it may be worthwhile to have a closer
look at DYF411 because it is not far away from P16.
If you have a P15+ result, consider testing the DYF411 marker, because
the modal in G2 is 11-14 and in case of a recLOH at palindrome P4 this
should collapse to 11-11 or 14-14. Indeed I have already seen a G2 with
DYF411 = 14-14. This may suggest a "hidden" G2a haplotype.
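
The recLOH reasoning for P16 and DYF411 can be written down as a tiny
model: one palindrome arm is copied over the other, and both copy
directions are treated as equally likely. The 0.5 probabilities are the
symmetric assumption stated above, not measured values, and the
function name is made up for illustration.

```python
def recloh_outcomes(arm_a, arm_b):
    """Return the possible two-arm states after one recLOH event.

    Assumption: either arm is equally likely to be copied over the
    other (the symmetric 50% case discussed above).
    """
    if arm_a == arm_b:
        # Both arms are already identical, so copying changes nothing.
        return {(arm_a, arm_b): 1.0}
    return {(arm_a, arm_a): 0.5, (arm_b, arm_b): 0.5}


print(recloh_outcomes("A-", "T+"))  # heterozygous P16
# {('A-', 'A-'): 0.5, ('T+', 'T+'): 0.5}
print(recloh_outcomes(11, 14))      # DYF411 modal in G2
# {(11, 11): 0.5, (14, 14): 0.5}
```

Under this assumption a heterozygous P16 ends up as 2(A-) as often as
2(T+), and the same event leaves DYF411 at 11-11 or 14-14.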

For technical reasons the P16, P18 and P20 results will be uploaded in
the classical way, but customers who have any of the derived states at
these markers will be informed of the complete details by e-mail.

I would recommend removing the branches below G2 on the ISOGG tree and
treating them like private SNPs.

M377 turns out to be a valuable indicator and comprises the second
largest haplogroup below G. In our Houston lab we have been able to
classify every individual below G1, G2 or G5 so far. There was no G3 or G*.

I hope this clarifies some confusion that may occur when customers
receive their Deep G panel results.

Thomas

[1] YCC 2002. A nomenclature system for the tree of human Y-chromosomal
binary haplogroups. Genome Res. 12, 339–348.
http://ycc.biosci.arizona.edu/nomenclature_system/data.html

