GENEALOGY-DNA-L Archives



From: Steven Bird <>
Subject: [DNA] Slow vs. fast markers - was "are testing companies being guided to the wrong things?"
Date: Sun, 14 Mar 2010 14:33:38 -0400
References: <mailman.421.1268553642.27272.genealogy-dna@rootsweb.com>,<229BEF456B5645388ED4D7E774DAD31A@TERRYSHUTTLE>
In-Reply-To: <229BEF456B5645388ED4D7E774DAD31A@TERRYSHUTTLE>


Terry,



The bigger issue, in my opinion, is not slow vs. fast but rather single-site vs. multi-part STRs. Vermeulen et al 2008 state that ssSTRs (such as DYS 393) conform more closely to the stepwise mutation model than the multi-part STR markers do (DYS 464 or 385). Age estimates from ASD-type calculations based solely on single-site markers are therefore more likely to reflect the correct number of generations to the coalescent/MRCA.



http://www.ncbi.nlm.nih.gov/pubmed/19647704



There are 49 ssSTRs available within the FtDNA standard 67-marker data set. As Ken has pointed out in previous threads, using only the slow markers increases the error bars. I have done some experimentation within Mike Weale's program Ytime with a test set of 150 haplotypes that confirms Ken's observation. So my current thinking is that the most accurate estimate of the relative age of a given cluster of haplotypes comes from using ASD with the 49 single-site STRs in the 67-locus set.
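
For readers who have not worked through an ASD (average squared distance) estimate, here is a minimal Python sketch of the general idea under the stepwise mutation model. It is not Ytime's implementation. The function names, the use of the modal haplotype as a stand-in for the ancestral haplotype, and the single average mutation rate per locus per generation are all assumptions made for illustration.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common repeat count at each locus.

    Assumption: the modal value stands in for the unknown ancestral allele.
    """
    return [Counter(column).most_common(1)[0][0] for column in zip(*haplotypes)]


def asd_generations(haplotypes, mutation_rate):
    """Estimate generations to coalescence as ASD / mutation rate.

    haplotypes: list of equal-length lists of repeat counts (ssSTR loci only).
    mutation_rate: assumed average rate per locus per generation (hypothetical).
    """
    modal = modal_haplotype(haplotypes)
    squared = sum(
        (allele - ancestral) ** 2
        for hap in haplotypes
        for allele, ancestral in zip(hap, modal)
    )
    asd = squared / (len(haplotypes) * len(modal))
    return asd / mutation_rate


# Toy data: 4 haplotypes x 3 loci, invented for the example.
toy = [[13, 14, 11], [13, 14, 11], [13, 15, 11], [12, 14, 11]]
generations = asd_generations(toy, mutation_rate=0.002)
```

In practice one would pass only the single-site loci (24 or 49 columns) and a mutation rate appropriate to those markers.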



After reading Vermeulen, I did some empirical analysis on the subject. I compared coalescence estimates for a test set of 150 67-marker haplotypes with the loci reduced to 7, 10, 17, 24 and 49 single-site STR markers within Ytime, using an MCMC bootstrap routine to generate the confidence intervals. Moving from 7 to 10 to 17 loci, and on up to 24, markedly narrowed the estimated date range (the error bars). Interestingly, the accuracy of the T statistic (number of generations to coalescence - equivalent to G) improved by only a few percent (1-2%) when moving from 24 ssSTRs (out of the 37-marker standard FtDNA set) to the full 49 markers. This suggests that for most purposes, and especially for estimates that exceed ten or so generations, 24 ssSTRs out of 37 are sufficient.
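
The effect of haplotype count and locus count on the error bars can be explored with a simple resampling experiment. The sketch below uses a plain nonparametric bootstrap over haplotypes, which is not the MCMC bootstrap routine in Ytime. The function names, the percentile interval, and the mutation rate are assumptions for illustration only.

```python
import random
from collections import Counter


def asd_generations(haplotypes, mutation_rate):
    """ASD-based generations estimate; the modal haplotype is assumed ancestral."""
    modal = [Counter(col).most_common(1)[0][0] for col in zip(*haplotypes)]
    squared = sum((a - m) ** 2 for hap in haplotypes for a, m in zip(hap, modal))
    return squared / (len(haplotypes) * len(modal)) / mutation_rate


def select_loci(haplotypes, loci):
    """Keep only the given column indices, e.g. the single-site loci."""
    return [[hap[i] for i in loci] for hap in haplotypes]


def bootstrap_interval(haplotypes, mutation_rate, n_resamples=1000,
                       level=0.95, seed=0):
    """Percentile bootstrap interval, resampling haplotypes with replacement."""
    rng = random.Random(seed)
    n = len(haplotypes)
    estimates = sorted(
        asd_generations([rng.choice(haplotypes) for _ in range(n)], mutation_rate)
        for _ in range(n_resamples)
    )
    lo_idx = max(0, int((1 - level) / 2 * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 + level) / 2 * n_resamples) - 1)
    return estimates[lo_idx], estimates[hi_idx]


# Toy data: 4 haplotypes x 3 loci, invented for the example.
toy = [[13, 14, 11], [13, 14, 11], [13, 15, 11], [12, 14, 11]]
low, high = bootstrap_interval(select_loci(toy, [0, 1]), 0.002, n_resamples=200)
```

Running bootstrap_interval on the same haplotypes with different locus subsets, or on different numbers of haplotypes with the same loci, gives a rough feel for which change narrows the interval more.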



The advantage is that there are many more 37-marker haplotypes out there, and the number of haplotypes seems to matter more in narrowing the error bars than moving from 24 to 49 loci, at least when using ASD. I would therefore recommend using ssSTRs exclusively, with as many haplotypes as possible (within a given subclade, of course), and opting for more haplotypes (37-marker) rather than more markers (49 ssSTRs).



Steven Bird, DMA



>
> Does anyone have a systematic approach to which markers would be slow and
> which fast? My inclination would be to include only these markers as "fast"
> from the first 67: 464, 724 (CDY), 570 and 557. (If we used all of the
> FTDNA "red markers", we'd also include 385, 439, 458, 449, 456, 576 and some
> more from 38-67)
>
>
>


