GENEALOGY-DNA-L Archives
From: "Tim Janzen" <>
Subject: Re: [DNA] SNP chasing using STSs
Date: Sun, 6 Jan 2008 01:19:19 -0800
In-Reply-To: <BFECJOAEEPCFBFFLLBGPAEGKEGAA.scorpion@netconnect.com.au>
Dear Dennis, Dick, Scott, and others,
Thanks for your additional insights into this situation. Perhaps it
would be good for us to collectively summarize what we have learned over the
past several weeks as we have been pursuing this thread:
1. The HUGO Reference Sequence is predominantly from one R1b1c male, with
the exception of most of the AZFa region (positions 12933864-13681909), which
comes at least partly from a male in haplogroup G. There are two gaps in
the HUGO Reference Sequence. Per the article by Skaletsky referred to below,
these two gaps are each about 50,000 bp in length. Per the map at the end
of that article one of these is located in the Inverted Repeat #4 region of
the Y chromosome near the TSPY family of genes. This appears to be at about
position 9 million on the Y chromosome. The second gap is in the Inverted
Repeat #4 region of the Y chromosome near several RBMY1 genes and near
Palindrome #3. This appears to be at about position 22.2 million on the Y
chromosome.
2. Some regions of the Y chromosome have a significantly higher density of
SNPs than others. The region from positions 57392694 to 57440935 in
particular seems to be rich in SNPs. Watson has 513 SNPs in this region and
Venter has 684 SNPs in this region. An amazingly high 33 of the 118 total
Y chromosome SNPs that they share are in this region. There also seems to
be a high SNP density in the region between positions 3 million and 7
million. To a lesser extent the region near the centromere between
positions 10 and 14 million is also rich in SNPs. I am not exactly sure why
these regions have a high SNP density.
There are likely to be at least several possible factors that play a role in
this:
a. Regions near the pseudoautosomal regions and the centromere may be more
prone to SNPs.
b. The regions rich in SNPs may contain relatively few functioning genes
and therefore there are few biologic consequences if SNPs occur in these
regions. This would be consistent with what we see with mtDNA.
c. These regions may be hard to sequence, so many of the SNPs reported
from these regions could be sequencing errors rather than valid SNPs.
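As a rough check on the figures in item 2, here is a minimal Python sketch
that turns the quoted counts into SNP densities. It assumes the region
boundaries are inclusive. The variable and function names are only
illustrative, and the counts are simply the ones quoted above.

```python
# Figures quoted in item 2; the region is assumed to be inclusive at both ends.
REGION_START = 57_392_694
REGION_END = 57_440_935

snps_in_region = {"Watson": 513, "Venter": 684}
shared_in_region = 33   # shared SNPs that fall inside the region
shared_total = 118      # all Y chromosome SNPs shared by the two genomes

region_length = REGION_END - REGION_START + 1  # length in base pairs


def snps_per_kb(count, length_bp):
    """Return SNP density as SNPs per 1,000 bp."""
    return count / (length_bp / 1000)


for name, count in snps_in_region.items():
    print(f"{name}: {snps_per_kb(count, region_length):.1f} SNPs per kb")

share_fraction = shared_in_region / shared_total
print(f"Shared SNPs in this region: {share_fraction:.1%} of all shared Y SNPs")
```

The region is only about 48 kb long, so a density of this size is worth
keeping in mind when weighing explanation (c) against (a) and (b).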
One topic we haven't really discussed in detail yet is the
heterochromatin in the Y chromosome. The map of the Y chromosome from the
journal that Gareth pointed out at
http://nar.oxfordjournals.org/content/vol0/issue2007/images/large/gkm849f1.jpeg
(http://tinyurl.com/2faj4h) shows that there are 4 separate regions of
the Y chromosome that contain heterochromatin:
1. From position 10691951 to approximately 11800000. This is the region
where the centromere is located.
2. From approximately position 12100000 to 12380229.
3. From approximately position 20550000 to 20950000.
4. From position 27193520 to 57377045.
It appears that nearly all of the heterochromatin was sequenced as
part of the HUGO project. See the article by Skaletsky, et al, in Nature at
http://www.ncbi.nlm.nih.gov/pubmed/12815422. This article on p. 826 says
that they succeeded in sequencing the heterochromatic portions of the Y
chromosome with the exception that the distal boundary of the major
heterochromatic region on the distal q arm was not identified with
certainty. It appears that this very large section (about 30 million base
pairs) has a primary repeat unit that is "GGAAT" and secondary repeat units
that are 2864 bp or 3584 bp in length. A question I have is whether there
are deep subclade SNPs and/or family SNPs that are located in this very
large section of heterochromatin. It doesn't appear that this large section
was sequenced for Watson and Venter, probably because the DNA from Watson
and Venter was compared to the HUGO Reference Sequence when Watson's and
Venter's sequences were determined. Since there is no sequence data for the
HUGO Reference Sequence for this very large section of heterochromatin, this
section was apparently ignored for Watson and Venter as well. As far as I
can tell, Watson and Venter don't have very many shared SNPs that come from
the first three heterochromatic regions of the Y chromosome. The only ones I
could find were at positions 10701769, 20616699, and 20778997.
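For anyone who wants to repeat this kind of lookup, here is a minimal
Python sketch that checks which of the four heterochromatic regions a
given position falls in. The boundaries are the ones listed above, several
of which are approximate. The names HETEROCHROMATIN and region_for are
just illustrative.

```python
# Heterochromatic regions as listed above: (region number, start, end).
# Several of these boundaries are approximate, read off the published map.
HETEROCHROMATIN = [
    (1, 10_691_951, 11_800_000),
    (2, 12_100_000, 12_380_229),
    (3, 20_550_000, 20_950_000),
    (4, 27_193_520, 57_377_045),
]


def region_for(position):
    """Return the number of the region containing position, or None."""
    for number, start, end in HETEROCHROMATIN:
        if start <= position <= end:
            return number
    return None


# The three shared SNP positions mentioned above.
shared_snp_positions = [10_701_769, 20_616_699, 20_778_997]
for pos in shared_snp_positions:
    print(pos, "-> region", region_for(pos))

# Approximate size of each region in millions of base pairs.
for number, start, end in HETEROCHROMATIN:
    print(f"region {number}: about {(end - start) / 1e6:.2f} Mb")
```

With these boundaries the first position falls in region 1 and the other
two fall in region 3, and region 4 comes out at roughly 30 Mb, in line
with the "about 30 million base pairs" figure above.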
Sincerely,
Tim
-----Original Message-----
From: Dennis Wright
Sent: Friday, January 04, 2008 6:07 PM
Subject: Re: [DNA] SNP chasing using STSs
This image may help with what is and what is not heterochromatin.
http://nar.oxfordjournals.org/content/vol0/issue2007/images/large/gkm849f1.jpeg
or http://tinyurl.com/2faj4h
It appears there is a small region 57377045 to 57443438 that is Male
Specific towards the end of the heterochromatin zone.
Cheers
Dennis W