GENEALOGY-DNA-L Archives


From: "McDonald, J Douglas" <>
Subject: Re: [DNA] Geno 2.0 - First Look!
Date: Thu, 22 Nov 2012 15:53:12 +0000
References: <BLU151-W640013D3D9FF886C6041FAB5540@phx.gbl><008101cdc84a$8ed5be00$ac813a00$@net><CAA-Ub_CvwYdKgS2ofi=7gvFHOUABL901rhPjriOs5ueYute82g@mail.gmail.com><00ae01cdc850$59edd4b0$0dc97e10$@net><BLU151-W250001AD2A8B71030D7E62B55B0@phx.gbl><4433B4544490754DB40AB9285065357621A17187@CITESMBX4.ad.uillinois.edu><CAA-Ub_AdOfcDAGpbwDCmodMwS-Nk+CHbP5cQpa8Zpad+z=AU=A@mail.gmail.com>
In-Reply-To: <CAA-Ub_AdOfcDAGpbwDCmodMwS-Nk+CHbP5cQpa8Zpad+z=AU=A@mail.gmail.com>


-----Original Message-----
From: On Behalf Of Ann Turner


It's very likely that L176 was omitted because it is an STR. I don't know
the specifics of the GenoChip design, but probes are typically quite short
sequences of DNA, designed to be complementary to a specific known variant
(a SNP or short indel). They wouldn't be able to span an STR, and repeat
motifs could grab a probe even if it wasn't the specific number of repeats
you were aiming for.

***************
Maybe one of their currently unknown-place SNPs will be a synonym or
better yet a near-synonym.
***************

You will probably be besieged by people who want to compare their GenoChip
results with your analysis. As I noted in another message, the rs results
are given in alphabetical order, so the chromosomes are intermixed. Will
this play havoc with your approach?

****************
No, that's just a one-time computer programming bother.
The problem is worse.
****************
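
For readers who want to do that reordering themselves, here is a minimal
sketch in Python. It is not the code used for the analysis discussed here,
and the record layout (rs ID, chromosome, position, genotype) and the rs IDs
in the sample are assumptions made for illustration; the actual GenoChip
download may be laid out differently.

```python
def chrom_key(chrom):
    """Sort key: numbered chromosomes numerically, then X, Y, MT, then anything else."""
    special = {"X": 23, "Y": 24, "MT": 25, "M": 25}
    c = str(chrom).upper()
    if c.isdigit():
        return (int(c), "")
    return (special.get(c, 99), c)


def sort_by_position(records):
    """Reorder (rsid, chromosome, position, genotype) tuples by chromosome, then position."""
    return sorted(records, key=lambda r: (chrom_key(r[1]), int(r[2])))


# Hypothetical records, listed in rs-ID order so the chromosomes are intermixed.
records = [
    ("rs100", "2", 500, "AG"),
    ("rs200", "1", 900, "CC"),
    ("rs300", "10", 100, "TT"),
    ("rs400", "1", 50, "AA"),
]

for rec in sort_by_position(records):
    print(*rec, sep="\t")
```

Sorting on a numeric chromosome key rather than the raw string keeps
chromosome 10 after chromosome 2 instead of between 1 and 2.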

In case you didn't catch it, CeCe
posted an "empty" file at
https://www.dropbox.com/sh/lg9fdpumzf29qpf/YOYU45qNnQ.

***************
The problem is that of the 124718 autosomal rs-SNPs on the Geno chip,
only 57785 are in common with the 296,000-odd SNPs I use. Only 68723
are in common with a recent FTDNA Illumina chip, which is an upper
limit on what I could find in my reference data sets. That's the bad news.
While it's true that I COULD offer a test with 57785 SNPs, the partly phased
chromosome painting algorithm would be severely crippled, and my PCA plots
of only one of my "continental group" areas on the chromosomes would be
badly compromised and also much noisier. More data is better! Really! This
is because of the average length of haplotypes.

The good news is that since 57785 of 68723 is a very good fraction (about 84%),
I likely chose the SNPs I use well, assuming they did likewise.
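
The fractions behind those counts can be checked with a few lines of Python.
This is only a sketch: the counts are the ones quoted in this message, while
the function name and the tiny rs-ID lists are hypothetical stand-ins for
real chip manifests.

```python
def overlap_count(chip_ids, reference_ids):
    """Number of rs IDs present in both lists (set intersection)."""
    return len(set(chip_ids) & set(reference_ids))


# Counts quoted in this message.
geno_autosomal = 124718       # autosomal rs-SNPs on the Geno chip
shared_with_my_panel = 57785  # also in the ~296,000-SNP panel
shared_with_ftdna = 68723     # also on the recent FTDNA Illumina chip

# Share of the usable upper limit that the panel already covers.
print(round(shared_with_my_panel / shared_with_ftdna, 3))  # 0.841

# Share of the Geno chip's autosomal rs-SNPs that the panel covers.
print(round(shared_with_my_panel / geno_autosomal, 3))     # 0.463

# Toy example of the intersection itself, with made-up rs IDs.
print(overlap_count(["rs1", "rs2", "rs3"], ["rs2", "rs3", "rs4"]))  # 2
```

So the panel captures about 84% of what the reference data could supply, but
under half of the autosomal rs-SNPs on the Geno chip.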

One bad-news item for the Geno results, noted by Mayka, is that they are,
like Pop Finder and the corresponding list I send, going to "bracket" people,
which is confusing. That is, they are going to call Poles probably
German + Russian or worse, the way I call them Russian + English or
Lithuanian + Irish or even Finnish + Italian. They don't seem to offer a good
"boilerplate" explanation of this possibility; at least I try to do that.

Also ... to the folks on the list ... I'm going to be out of touch on an exotic
vacation from Dec. 13 to Jan. 5. Please ask people not to send me data after
Dec. 9. What gets sent anyway will get accumulated and eventually done, but
very slowly.


Doug McDonald



