Archiver > GENEALOGY-DNA > 2005-06 > 1117920755

Subject: DNAPrint SNPs [was Re: [DNA] Gene for Lactose Intolerance Identified]
Date: Sat, 4 Jun 2005 17:32:35 EDT

In a message dated 6/4/05 9:53:41 AM Pacific Daylight Time,

> This may be stating the obvious, but from your statement, you are
> saying that if DNAPRINT actually posted the marker values they
> test, that their test could actually be of use to us for genealogy?

Don't forget to read the fine print in my message :) The common ancestor
could have lived xx thousand years ago, and in fact probably did for the DNAPrint
markers. There's a catch-22 for selecting markers: you want a SNP to occur in
a significant percentage of the local population, so it must have occurred
long enough ago to accumulate lots of living descendants. On the other hand, you
want a SNP to occur recently enough that it's not widespread throughout the
world due to migration patterns. I would expect that the balancing point is on
the order of many thousands of years ago, but in theory, any two people who
have the derived allele would have a common ancestor more recent than the
beginning of Homo sapiens, just as any two people who have a common SNP on the Y
chromosome have a common ancestor more recent than Y-Adam. [SNPs have an advantage
over STRs in this context: if you share an STR allele with someone, you don't
know if you're identical by descent or identical by state.] The additional
complication for autosomal SNPs is that you could have inherited each one
through many different pathways, not just the straight Y-line, and even siblings
will not have the same combination of alleles.
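
To put a rough number on the sibling point, here is a small sketch. It rests on a simplifying assumption that is mine, not DNAPrint's: both parents are heterozygous at every marker, and the markers are inherited independently. The function name and the genotype labels are only illustrative.

```python
from fractions import Fraction


def sibling_match_probability(n_markers: int) -> Fraction:
    """Chance that two full siblings have identical genotypes at every one
    of n independent markers, ASSUMING both parents are heterozygous (Aa)
    at each marker. This is an illustrative assumption, not real data."""
    # Mendelian expectations for the offspring of an Aa x Aa cross.
    offspring = {"AA": Fraction(1, 4), "Aa": Fraction(1, 2), "aa": Fraction(1, 4)}
    # Two siblings match at one marker if both draw the same genotype.
    per_marker = sum(p * p for p in offspring.values())  # 3/8
    return per_marker ** n_markers


if __name__ == "__main__":
    for n in (1, 5, 10):
        print(n, float(sibling_match_probability(n)))
```

Under that assumption the chance of a match is 3/8 for a single marker and falls off quickly as markers are added, which is why even siblings will seldom show the same overall pattern.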

At the time we were first learning about the DNAPrint, I created a
spreadsheet that gave the SNP results for people with varying percentages of continental
ancestry. There are several sets of data for people who are related to some
degree or other. That would be worth studying in this context to get a feel for
what the data looks like. David F mentioned the "pseudo-paternity" test. The
results for that are on a separate page of the spreadsheet, and there were
some false positives. I'd have to check my old messages to be sure, but off the
top of my head, Tony Frudakis said it would take about double the number of
SNPs to avoid the possibility of a false positive. The spreadsheet also has a
page with the continental frequencies from Shriver's paper, for the markers they

As I read your messages about the utility of the SNPs for genealogy ("a
specific pattern at certain markers that can be uniquely separated from other
results"), I get the feeling that you are picturing "haplotypes" in the broadest
sense of the word -- the set of results from any tests performed on a single
chromosome. We don't know which of DNAPrint's markers are on which chromosomes, so
we can't construct haplotype blocks.

However, you might be interested in reading more about the approach taken by
the Sorenson Molecular Genealogy Foundation project. They are using STRs in
haplotype blocks which are small enough to travel together for some number of
generations without recombination. Thus each haplotype block is enormously
informative: there are many more combinations of STRs than just the range of
alleles you get with one STR. The haplotype block is also a quasi-SNP, at least for
some period of time: if you share the same haplotype, you are probably
identical by descent, not just identical by state.
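
A quick way to see why a block is so much more informative than a single STR is to count the possible combinations. The allele counts below are hypothetical, chosen only for illustration; they are not SMGF data, and in practice not every combination will be equally common or even observed.

```python
from math import prod


def haplotype_combinations(alleles_per_str):
    """Upper bound on the number of distinct haplotypes in a block, given
    how many different alleles are seen at each STR in that block."""
    return prod(alleles_per_str)


# Hypothetical block of three STRs with 8, 6 and 7 observed alleles.
example_block = [8, 6, 7]
print(max(example_block))                     # best single STR: 8 alleles
print(haplotype_combinations(example_block))  # whole block: 336 combinations
```

With many more possible patterns, two people are far less likely to share the same one by chance, which is the sense in which the block behaves like a quasi-SNP.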

The reason that SMGF likes to collect samples from extended family groups
(siblings, cousins, in-laws) is to help identify these haplotype blocks.

Ann Turner
