GENEALOGY-DNA-L Archives
From: "Tim Janzen" <>
Subject: Re: [DNA] Chances for Finding Clade-separating SNP
Date: Mon, 31 Dec 2007 14:58:18 -0800
In-Reply-To: <004b01c84b23$f130b9a0$6400a8c0@Ken1>
Dear Ken, Thomas, Gareth, John, Ron, and others,
James Watson and Craig Venter are both S21+ (thus in haplogroup
R1b1c9) per the information found in Ron Scott's web page at
http://freepages.genealogy.rootsweb.com/~ncscotts/ in the Watson/Venter
section. If you download Watson's Y SNPs from that web site and search for
rs16981293 (the reference number for S21), you will find that Watson is
positive for this SNP at position 8856078 on the Y chromosome. You can
also see that Watson is S21+ if you go to
http://jimwatsonsequence.cshl.edu/cgi-perl/gbrowse/jwsequence and enter
"chrY:8856073..8856078" into the "Landmark or Region" field and click on
"search". If you download the SNPs for Venter from Ron's web site you will
note that they are in a different format than the SNPs for Watson. However,
if you scroll down to position 8856077/8856078 you will see a "+" in the
next column and the mutation C/T next to that. Watson and Venter are
negative for both M467 and S26 (L1) and thus are not in haplogroup R1b1c9a
or R1b1c9b.
I decided to compare Gareth Henson's spreadsheets of Watson's and
Venter's SNPs from Ron's web site. I entered Watson's Y SNP position
numbers in one column of an Excel file and Venter's second SNP
position column (the one that corresponds to Watson's position numbers) in a
second column. I then used Excel's VLOOKUP function to
generate a list of the SNPs that Watson and Venter share relative
to the HUGO Reference Sequence. At the end of this message is the list of
the SNPs that Watson and Venter share.
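For anyone who would rather script this step than use Excel, the VLOOKUP matching amounts to intersecting two lists of positions. The following is only a minimal Python sketch, not the spreadsheet I actually used. It assumes each man's SNP positions have already been read into a list; the function name shared_positions is hypothetical, and two of the sample values are placeholders rather than real calls.

```python
def shared_positions(positions_a, positions_b):
    """Return the sorted positions that appear in both lists."""
    return sorted(set(positions_a) & set(positions_b))


# Illustrative input only: 8856078 (S21) and 13184196 appear in the shared
# list in this message; 1111111 and 2222222 are placeholders, not real calls.
watson = [8856078, 13184196, 1111111]
venter = [2222222, 13184196, 8856078]

print(shared_positions(watson, venter))
```

Using sets means the order of the two input columns does not matter, which is the main practical difference from a VLOOKUP against a fixed column.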
I also looked to see which of the SNPs that Watson and Venter share
are novel SNPs that don't currently have rs numbers. The 14 novel SNPs
they share are as follows:
5481562
5861957
5862032
10596978
11912433
11912664
11935703
11942911
11942984
11961963
11961983
11962048
12417422
13184196
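Picking out the novel SNPs can be scripted as a simple filter on the label column. This is a sketch under the assumption that each shared SNP is held as a (position, label) pair, where the label is either an rs number or the word "novel"; the function name novel_positions is hypothetical, and the four sample records are copied from the list at the end of this message.

```python
def novel_positions(records):
    """Return the positions whose label is not an rs number."""
    return [pos for pos, label in records if not label.startswith("rs")]


# Four records taken from the shared list at the end of this message.
records = [
    (5367493, "rs2574060"),
    (5481562, "novel"),
    (5861957, "novel"),
    (5909423, "rs2500852"),
]

print(novel_positions(records))
```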
I then categorized the SNPs by position on the Y as I have
previously done with Watson's and Venter's SNPs relative to the HUGO
Reference Sequence (see
http://archiver.rootsweb.com/th/read/GENEALOGY-DNA/2007-09/1189709479-01 and
http://archiver.rootsweb.com/th/read/GENEALOGY-DNA/2007-12/1197886255). The
breakdown for the locations of the SNPs that Watson and Venter share
is as follows:
2.72-3 million: 3
3-4 million: 20
4-5 million: 2
5-6 million: 17
6-7 million: 7
7-8 million: 2
8-9 million: 2
9-10 million: 0
10-11 million: 2
11-12 million: 11
12-13 million: 7
13-14 million: 10
14-15 million: 0
15-16 million: 1
16-17 million: 2
17-18 million: 0
18-19 million: 0
19-20 million: 8
20-21 million: 3
21-22 million: 1
22-23 million: 2
23-24 million: 0
24-25 million: 0
25-26 million: 0
26-27 million: 0
27-28 million: 2
28-57.39 million: 0 (not sequenced since it is heterochromatin)
57.39-57.45 million: 33
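A breakdown like the one above can be produced by integer-dividing each position by one million and counting. The sketch below is one way to do it in Python, assuming the positions are already in a list; bin_by_megabase is a hypothetical name, and the seven sample positions are taken from the list at the end of this message, so the counts it prints cover only that sample and not the full table.

```python
from collections import Counter


def bin_by_megabase(positions):
    """Count positions per 1,000,000 bp bin, keyed by the bin's lower bound in millions."""
    return Counter(pos // 1_000_000 for pos in positions)


# A small sample of positions from the shared list, not the full data set.
sample = [2721694, 2980492, 2980633, 3017650, 8856078, 8865525, 57392694]
counts = bin_by_megabase(sample)

for mb in sorted(counts):
    print(f"{mb}-{mb + 1} million: {counts[mb]}")
```

Bins with no SNPs simply do not appear in the output, so they would need to be filled in with zeros to match the layout of the table above.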
Overall, I think that the fact that Watson and Venter share 133 SNPs
relative to the HUGO Reference Sequence is quite interesting and bodes quite
well for us eventually discovering hundreds if not thousands more deep
subclade SNPs. I compared the 133 SNPs that Watson and Venter share
with all of the SNPs on the ISOGG list at
http://www.isogg.org/tree/ISOGG_YDNA_SNP_Index07.html and found only one of
these 133 SNPs listed there. This was M173 (rs2032624). This SNP appears
in the list of 133 SNPs because the HUGO Reference Sequence is M173
negative. The cause for this is unclear, but it is likely due either to a
sequencing error in the HUGO Reference Sequence or to the fact
that the section of the HUGO Reference Sequence where M173 is located (at
position 13535818) came from a male from haplogroup G (see below) or from
someone else who was not in haplogroup R1. In any case, we know from the
fact that the HUGO sequence is M269 positive that the primary person tested
is in the R1b1c haplogroup. Excluding M173, this would suggest that there
are 132 SNPs that have occurred between the ancestor that Watson, Venter and
the HUGO Reference Sequence male (who is in haplogroup R1b1c) share in common
and the ancestor that Watson and Venter share in common who was in
haplogroup R1b1c9 (S21+).
There are some important implications from these results for the
Walk on the Y project or any similar effort to look for deep subclade or
family SNPs. First of all, note that the highest density of SNPs that Watson
and Venter share is in the 57.39-57.45 million base pair range. Therefore,
it may be reasonable to start the Walk on the Y in this section,
particularly if we are only going to be checking 50,000 base pairs per test
with the initial testing and since we would like to find as many true SNPs
as possible with each section tested. The 3-4 million and the 5-6 million
bp ranges also look very promising. The 11-14 million bp section also looks
reasonably good. There is also a blip of 8 SNPs in a small section between
19612663 and 19613854 for some unknown reason.
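If the initial tests really do cover about 50,000 base pairs each, the best starting point can be found by sliding a 50,000 bp window along the sorted positions and keeping the window that holds the most shared SNPs. The following Python sketch shows the idea; the function name densest_window and the default width are my own assumptions, and the sample input is just the eleven positions between 19 and 21 million from the list at the end of this message.

```python
def densest_window(positions, width=50_000):
    """Return (start, count) for the window [start, start + width) holding the most positions.

    Only windows that begin at a SNP position are tried, which is sufficient
    because any window can be shifted right to its first SNP without losing one.
    """
    pts = sorted(positions)
    best_start, best_count = None, 0
    right = 0
    for left, start in enumerate(pts):
        # Advance the right edge past every position inside the window.
        while right < len(pts) and pts[right] < start + width:
            right += 1
        if right - left > best_count:
            best_start, best_count = start, right - left
    return best_start, best_count


# The shared SNP positions between 19 and 21 million from the list below.
sample = [
    19612663, 19612862, 19612997, 19613055, 19613128, 19613392,
    19613711, 19613854, 20411776, 20616699, 20778997,
]

print(densest_window(sample))
```

On this sample the densest window is the small cluster that starts at 19612663. Running the same function over the complete list would rank the candidate sections against each other.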
Thomas, can you tell us exactly what section of the HUGO Reference
Sequence is from a male in haplogroup G? If you will recall, you mentioned
this in a message to Gareth Henson, John Marsh, and me in September
as follows: "Note that the region around M200 ff is a relict from the time
before the HUGO project started. E. g. M201 is positive in the reference
sequence which would place HUGO in haplogroup G. Most of the rest of the
HUGO reference and the STR counts clearly indicate R1b1c." Per your Y
browser, M201 is at position 13536923 and M200 is at position 13540810. It
would be helpful to know more precisely where the HUGO section that is
from the haplogroup G male begins and ends. It would appear that
at least from position 13535818 to position 13540810 the HUGO Reference
Sequence is from a male in haplogroup G.
Thomas, have you already made up your mind as to what section of the
Y chromosome you want to start testing first and if so are you willing to
tell us what it is?
One issue that will keep coming up in the future is the fact that
the HUGO sequence isn't from only one man. It would seem that we should
establish a Y reference sequence that is from only one person, as was done
with the CRS for mtDNA. This person could be James Watson, Craig Venter, or
someone else. Another option would be to revise the HUGO reference sequence
by removing the portion that came from a male from
haplogroup G and replacing it with the corresponding section from the original
R1b1c male whose DNA made up the predominant portion of the HUGO sequence.
Gareth, can you explain to us the steps you went through to create
the Excel files of Watson's and Venter's SNPs? I suspect that we will be
doing a lot of conversions of FASTA files to lists of SNPs in Excel in the
future and I would appreciate learning exactly how you created the Excel
files on Ron's web site. I would also appreciate knowing exactly which
files at ftp://ftp.ncbi.nih.gov/pub/TraceDB/Personal_Genomics/Venter/
contain Venter's Y chromosome segments.
Ron, you may want to add the SNPs below to an Excel file on your web
site in the Watson/Venter section or simply create a link to this message in
the RootsWeb archive from your web site.
It would be nice if we could start testing a series of R1b1c9
(S21+) males and R1b1c* (M269+) males so that we can sort out which of these
SNPs are upstream from S21 (but still downstream from R1b1c and thus
M269+) and which are downstream of S21.
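The sorting logic I have in mind can be written down as a small rule. This is only a sketch in Python with hypothetical names and made-up test outcomes, and it assumes that each SNP arose only once and that the test results are error free. A candidate SNP found in any S21- male must be older than S21; a candidate missing from any S21+ male must be younger than S21; otherwise the two cannot yet be told apart.

```python
def place_relative_to_s21(results):
    """Place one candidate SNP relative to S21.

    results is a list of (s21_positive, snp_derived) booleans, one pair per
    tested man. Assumes the SNP arose once and that the calls are reliable.
    """
    if any(derived and not s21 for s21, derived in results):
        return "upstream of S21"
    if any(s21 and not derived for s21, derived in results):
        return "downstream of S21"
    return "not yet separable from S21"


# Hypothetical outcomes for three candidate SNPs, not real test results.
print(place_relative_to_s21([(True, True), (False, True)]))
print(place_relative_to_s21([(True, True), (True, False), (False, False)]))
print(place_relative_to_s21([(True, True), (False, False)]))
```

The third case is why a series of men is needed: every additional S21+ or S21- male tested gives each candidate another chance to be separated from S21.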
Sincerely,
Tim Janzen
Position  rs number
2721694   rs2253109
2980492   rs2535180
2980633   rs2535181.1
3017650   rs2534240
3124831   rs9650933
3126506   rs2652913
3129109   rs2652921
3131386   rs2652929
3161174   rs2534214.1
3230969   rs2907867
3231196   rs3014802.1
3302116   rs1270115
3362125   rs2552813
3513793   rs2444421
3516391   rs2444439
3516714   rs34256727.1
3533956   rs1026295
3583547   rs2450641
3583642   rs2450642
3619501   rs35206948.1
3698438   rs2752276
3830122   rs2752197.1
3873959   rs33998049
4686104   rs34007059
4830044   rs3102702
5040970   rs2571700
5046175   rs2563438
5049199   rs2563450
5054235   rs35177802
5061776   rs2563491.1
5061906   rs2563492
5147053   rs35664920
5195960   rs28720892
5363879   rs2574026
5367435   rs2755371
5367493   rs2574060
5481562   novel
5861957   novel
5862032   novel
5909423   rs2500852
5909525   rs2500853
5988866   rs34922110.1
6001749   rs2441313
6003666   rs2921692
6474753   rs2558898
6474766   rs2558899
6474804   rs2558900
6597379   rs2434027
6992790   rs7892861
7306726   rs34001725
7424282   rs7067275
8856078   rs16981293
8865525   rs1264078
10596978  novel
10701769  rs2566496
11912433  novel
11912664  novel
11935703  novel
11942911  novel
11942984  novel
11961963  novel
11961983  novel
11962048  novel
12417422  novel
12654641  rs9785657
12933864  rs9786460
12934053  rs7892854
12942936  rs2740980
12943108  rs2740981
12979419  rs2713254
13032836  rs11799149
13151202  rs13304168
13184196  novel
13211040  rs7892925
13373585  rs7893052
13465511  rs35108305
13499115  rs9786537
13595577  rs7067251
13681909  rs11799198
15514463  rs13304223
16354175  rs11799151
16370123  rs9786171
19612663  rs6530599
19612862  rs1136210
19612997  rs6530602
19613055  rs6530603
19613128  rs4030415
19613392  rs3950479
19613711  rs10465459
19613854  rs10465460
20411776  rs7067348
20616699  rs34276300
20778997  rs7892901
21120848  rs9306842
22382982  rs2178500
22857377  rs6530626
26248434  rs804774.1
27198031  rs35733966
27198083  rs34303901
57392694  rs7067267
57393321  rs11799258
57393445  rs2527475
57393534  rs2527474
57393562  rs2527473
57393594  rs10449125
57393597  rs10449157
57393621  rs2527472
57393698  rs2527471
57395638  rs9633271
57395806  rs10449160
57398199  rs2527455
57398403  rs2527454
57398460  rs2527453
57402503  rs2527447
57402619  rs2641170
57402655  rs2527446.1
57403213  rs2334093
57403647  rs9633253
57403691  rs9633252
57425535  rs2641136
57425576  rs1817880
57425782  rs4104975
57425918  rs3855803
57426009  rs3878828
57435386  rs12171801
57437589  rs2334089
57437671  rs4047348
57437745  rs11152878
57437834  rs28546143
57438124  rs1849971
57440935  rs6568293
57441017  rs7893053
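For anyone who wants to load the list above into a script rather than Excel, each line is a position followed by either an rs number or the word "novel". The Python sketch below splits a line into those two parts whether or not there is whitespace between them. The names LINE_RE and parse_snp_line are hypothetical, and the pattern assumes that the only labels are "novel" and rs numbers with an optional ".1"-style suffix, as seen in this list.

```python
import re

# Position digits, optional whitespace, then an rs number (optionally with a
# numeric suffix such as ".1") or the word "novel".
LINE_RE = re.compile(r"^(\d+)\s*(rs\d+(?:\.\d+)?|novel)$")


def parse_snp_line(line):
    """Split one list line into (position, label); raise ValueError if it does not fit."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognized SNP line: {line!r}")
    return int(match.group(1)), match.group(2)


print(parse_snp_line("8856078rs16981293"))
print(parse_snp_line("2980633 rs2535181.1"))
print(parse_snp_line("5481562novel"))
```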
-----Original Message-----
From:
[mailto:] On Behalf Of Ken Nordtvedt
Sent: Sunday, December 30, 2007 12:39 PM
To:
Subject: Re: [DNA] Chances for Finding Clade-separating SNP
I have to start from scratch concerning JW and CV. I have not seen their
haplotypes and only remember they are R1b1c. Are they both S21+?
Ken