
From: Harold Vannoy
Subject: Re: [DNA] Need Help In Understanding and Using the Term "SNP"
Date: Tue, 30 Mar 2010 07:14:32 -0400

Thanks for the response, Ann.

Actually, Ann, I have found that the "build numbers" do relate directly
to the two reference sequences. The current human genome reference
assembly is build 37.1, released in August 2009, and its "MT" data now
coincides with "rCRS" (as covered in the link you sent below), described
in RefSeq NC_012920. The previous "genome build" was 36.3, and its "MT"
section references the "Yoruban" sequence described in RefSeq NC_001807
(apparently this is the build used by 23andMe). I only mention this
because by consulting these references I was able, in at least one
case, to relate the 23andMe position given in my mtDNA download
back to rCRS (when this information was not given in 23andMe's
"Browse Raw Data" pages).

In this current "thread" I am trying to clear up my understanding of
whether each line in my 23andMe mtDNA download (2,153 lines)
"corresponds to a SNP," as 23andMe states in the first few lines of
the download, or whether only the "mutations" that I have in my mtDNA
results are "SNPs" (as I understand the SMGF glossary's definition of
a "SNP"). In my previous post (the one you responded to) I gave the
example that in the range from position 1 through position 304 I have
4 "mutations." My question is: Are only these 4 mutations SNPs, or
does every line of data in this range "correspond to a SNP"?
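
To make the two readings concrete, here is a minimal sketch that splits
the "MT" lines of a download into calls that match a reference base and
calls that differ from it. It assumes a tab-separated layout of
identifier, chromosome, position and call, with "#" comment lines and
"--" for a no-call. The sample rows, the identifiers and the REFERENCE
table are made-up placeholders, not real results or real reference bases,
and classify_mt_lines is a hypothetical helper name.

```python
from collections import Counter

# Made-up sample in the assumed layout (identifier, chromosome,
# position, call). These rows are placeholders, not real results.
SAMPLE = """\
# rsid\tchromosome\tposition\tgenotype
rs0000001\tMT\t10\tT
i0000002\tMT\t20\tA
i0000003\tMT\t30\tG
rs0000004\t1\t1000\tAG
i0000005\tMT\t40\t--
"""

# Hypothetical reference bases at the probed positions (placeholders only).
REFERENCE = {10: "T", 20: "G", 30: "G", 40: "C"}


def classify_mt_lines(text, reference):
    """Count MT lines by whether the call matches the reference base."""
    counts = Counter()
    for line in text.splitlines():
        # Skip blank lines and the "#" header/comment lines.
        if not line or line.startswith("#"):
            continue
        ident, chrom, pos, call = line.split("\t")
        if chrom != "MT":
            continue
        counts["probed"] += 1
        ref = reference.get(int(pos))
        if call == "--" or ref is None:
            counts["no_call"] += 1
        elif call == ref:
            counts["matches_reference"] += 1
        else:
            counts["differs_from_reference"] += 1
    return counts


counts = classify_mt_lines(SAMPLE, REFERENCE)
print(dict(counts))
```

Under the first reading every line counted as "probed" would be called a
SNP; under the second, only the lines counted as
"differs_from_reference" would be.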

Ann, can you help me with this question?

Thanks again.

Harold Vannoy

At 04:48 PM 3/29/2010, Ann Turner wrote:

>Harold, mtDNA is sort of in a world of its own. The two reference
>sequences, Yoruba and the Cambridge Reference Sequence, are not related
>to build numbers, but to two specific sequences that were selected for
>comparison years ago. The CRS was the very first mtDNA molecule to be
>sequenced, and it is used by most people. For some unknown reason, the
>Yoruba sequence was selected for the RefSeq database, the source for
>protein structure used in medical studies. It differs from the CRS by
>having a few insertions / deletions, so the numbering gets out of
>alignment.
>
>The chip companies use the medically oriented reference, although the
>CRS finally became the official RefSeq sequence just last year.
>
>FTDNA's full sequence results list YOUR differences according to the
>revised CRS (see above). Since they are looking at every single base,
>they don't need an rs number. In fact, very few variations in the CRS
>have been given an rs number -- they're simply listed by position in
>the mtDNA molecule.
>
>23andMe probes for the existence of ~2000 common variations, many of
>which are used for haplogroup assignments. If these don't have an rs
>number, they get an "i" (internal) number from 23andMe. The probes are
>keyed to bases surrounding the variant, not the absolute position. If
>you browse your raw data, you'll see both the Yoruba and the CRS
>position listed.
>
>Maybe I'm just muddying the waters more!
>
>Ann Turner
