
From: William Hurst
Subject: Re: [DNA] FTDNA admits to errors in many mtDNA sequences
Date: Thu, 4 Feb 2010 09:48:04 -0500

Hi Ann and all,

> The next question would be how to incorporate the specialist's eye view in
> computer matching algorithms. My preference would be to ignore this region
> completely, along with insertions around 16189C and 309 and heteroplasmy. I
> would like to see heteroplasmy codes within the FASTA files at GenBank,
> though. If I recall correctly, the FASTA file at FTDNA did it that way.
> Ann Turner

At a certain point, anyone drawing mtDNA trees has to go beyond what the computer program produces. To use your "insertions around 16189C" as an example, one author published several sequences in a K subclade with 16194.1A. Similar sequences at FTDNA don't have that insertion, because FTDNA has apparently chosen, wisely, to ignore such insertions. That's recommended by experts such as Bandelt, I believe. So I leave the insertion out of the sequences from the scientific paper. I also always resolve the reticulations that programs leave. Back to 16189C: many scientists leave out 16182C and 16183C, which are related to 16189C in a complex way. But the subclade mentioned above always, so far, has all three of those mutations. (The next one won't, of course.)

As for the 309 insertions, I never use those. But I don't ignore them, because they occur more commonly in some subclades than others. They are not random.

I'm still thinking about the use of heteroplasmies on trees. FTDNA only recently started reporting them, other than by e-mail to individual customers. The ones in the coding region do look random so far. But what about 16093Y and 114Y, for example? For each of those, there are many examples of both the mutation and the CRS variant within the same subclade. If you ignore the heteroplasmy, what do you put in its place? (I'm thinking out loud here.)
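
For what it's worth, here is a rough Python sketch of the kind of filtering Ann describes: drop insertions at 309 and around 16189C, and drop heteroplasmy calls, before comparing two lists of differences from the CRS. The function names, the 16182-16194 window, and the two sample lists are my own assumptions for illustration, not anything FTDNA or GenBank actually uses. Note that dropping a heteroplasmy call amounts to treating that position as the CRS value, which is exactly the open question above.

```python
import re

# IUPAC two-base ambiguity codes, treated here as heteroplasmy markers (e.g. 16093Y).
HETEROPLASMY_CODES = set("RYKMSW")

# Assumed window for "insertions around 16189C"; this range is a guess, adjust as needed.
POLY_C_WINDOW = range(16182, 16195)


def parse(mutation):
    """Split a call such as '16194.1A' into (position, insertion index, base)."""
    m = re.fullmatch(r"(\d+)(?:\.(\d+))?([A-Za-z])", mutation)
    if m is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    pos, ins, base = m.groups()
    return int(pos), (int(ins) if ins else None), base.upper()


def is_ignored(mutation):
    """True for insertions at 309 or near 16189, and for heteroplasmy calls."""
    pos, ins, base = parse(mutation)
    if ins is not None and (pos == 309 or pos in POLY_C_WINDOW):
        return True
    return base in HETEROPLASMY_CODES


def comparable(mutations):
    """Return only the calls that should take part in matching."""
    return {m for m in mutations if not is_ignored(m)}


# Made-up lists, only to show the effect of the filter.
paper = {"16189C", "16194.1A", "16224C", "16311C", "309.1C", "16093Y"}
ftdna = {"16189C", "16224C", "16311C"}
print(comparable(paper) == comparable(ftdna))
```

A filter like this only decides what a matching program compares. The dropped calls should still be kept on record, since the 309 insertions, for one, are not random across subclades.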

Bill Hurst
