GENEALOGY-DNA-L Archives
From: "William Hurst" <>
Subject: Tom Glad's mtDNA-Analysis Utility - Implementation for K Project
Date: Sun, 09 Jul 2006 17:20:25 -0400
In-Reply-To: <e8rnco+97ts@eGroups.com>
Hi all,
Yesterday Tom Glad announced a new utility for mtDNA which produces genetic
distance reports with summary tables plus Fluxus diagrams from FTDNA results
or from MitoSearch. Since then Tom has upgraded the program to a 0.2 version
which implements some of my suggestions and is much faster for large amounts
of data. The mtDNA-Analysis Utility is available at:
http://freepages.genealogy.rootsweb.com/~glad/dna/mtdnatool.html
The Fluxus software may be downloaded from:
http://www.fluxus-engineering.com/sharenet.htm
Tom should be congratulated for producing this valuable tool.
I have so far produced three sets of tables, two with associated Fluxus
diagrams. The main one - why start small? - is for the full 184 current
members of the mtDNA Haplogroup K Project:
http://freepages.genealogy.rootsweb.com/~wrhurst/mtdna-k/kprojecttable.htm
Starting with the summary table at the top, note that there are now
frequency and count lines and that I used the option of sorting the mutation
lists by frequency, with HVR1 followed by HVR2. Mutations occurring in more
than 50% of the haplotypes are in red. Even I was surprised that 45
mutations occurred only once in the 184 members. The list of members' kit
numbers is in the same order as on the project website. I have added "Kroot"
or K* to represent a haplotype with the basic six mutations found in K. It
is not a modal haplotype. A modal haplotype should be the most common; but
as I have stated before, no K I've seen so far has just those six mutations.
In this case, the haplogroup column entry is a K until you reach the end
where there are seven members with subclade designations based on
full-sequence tests. You should read the legend at the end of the table to
fully understand it. The next table is the actual Genetic Distance Report.
Note the color codes at the end of the table. For this particular table, the
number in the first box after the kit number is the genetic distance to the
Kroot or the number of extra mutations acquired since the origin of K. Since
certain mutations occur in pairs - the 524 insertions and the 522 and 523
deletions - I used the input box to specify that the second of each pair
would not be counted for the genetic distance. The mutations ignored are
listed at the top of the genetic distance table and are marked with an
asterisk on the summary table. You could also use the "ignore mutation" box
to not count common recurrent mutations, but I chose not to do that. When
using the genetic distances, make sure you are aware (from the summary
table) whether the two members both have HVR2 mutations or not. In my
opinion, only when both have HVR2 results is the genetic distance
meaningful.
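For readers who want to see the arithmetic, the counting described above can be sketched in a few lines of Python. This is only an illustration, not Tom Glad's code: the kit labels, the mutation sets and the function names are all invented, and it assumes that a haplotype is simply a set of mutations and that the genetic distance is the number of mutations found in one haplotype but not the other, after the ignored mutations are dropped.

```python
from collections import Counter

# Invented example data (not real project results): kit label -> set of mutations.
HAPLOTYPES = {
    "root": {"16224C", "16311C", "73G"},
    "kitA": {"16224C", "16311C", "73G", "497T", "522-", "523-"},
    "kitB": {"16224C", "16311C", "73G", "497T", "309.1C"},
    "kitC": {"16224C", "16311C", "16234T"},
}

# Second member of a paired deletion, left out of the distance count.
IGNORED = {"523-"}


def mutation_counts(haplotypes):
    """Return (mutation, count) pairs, most frequent mutation first."""
    counts = Counter(m for muts in haplotypes.values() for m in muts)
    return counts.most_common()


def majority_mutations(haplotypes):
    """Mutations carried by more than 50% of the haplotypes, sorted by name."""
    half = len(haplotypes) / 2
    return sorted(m for m, c in mutation_counts(haplotypes) if c > half)


def genetic_distance(a, b, ignored=frozenset()):
    """Count mutations present in exactly one of the two haplotypes."""
    return len((a ^ b) - set(ignored))
```

With this made-up data, genetic_distance(HAPLOTYPES["root"], HAPLOTYPES["kitA"], IGNORED) is 2, while leaving out the ignore list gives 3, because the paired 522- and 523- deletions are then counted separately.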
I did produce a Fluxus diagram for the 184 members, but it is a huge
spider-webby meaningless mess! Fluxus diagrams from mtDNA HVR data should
always be viewed with suspicion since the real subclades may be based on
coding-region mutations which are not generally available. On a smaller test
chart I noticed the diagram had a K1b branching off K1a, which is not how it
works in real life. The only redeeming feature of the diagram for 184
members was that it put my kit number at the very top. I did nothing to
cause that. Really.
The next set of tables is for those in the K project, with HVR2 results,
who are probably in subclade K1a1b1a, the largest "Ashkenazi" group as
identified by Dr. Doron Behar. Keep in mind that the subclades are
determined mainly by coding-region mutations; only one on the list has been
so determined. Also remember that some in the subclade may not have an
Ashkenazi ancestor. See:
http://freepages.genealogy.rootsweb.com/~wrhurst/mtdna-k/k1a1b1atable.htm
Since in this table all have HVR2 results, the genetic distances are
meaningful for every pair.
This time I have created and made available a Fluxus diagram from the data.
See:
http://freepages.genealogy.rootsweb.com/~wrhurst/mtdna-k/k1a1b1atable.jpg
Looking at this diagram, start with the Kroot in the lower left, then follow
the 497T and 16234T mutations to the branching point mv1. The size of the
nodes is proportional to the number of matching haplotypes represented. Each
node is marked with only one kit number, the first on the project page. One
node shooting off the left features a very rare back mutation for 16519C.
Fluxus lists the mutation, but doesn't mark it in any special way as a back
mutation. Maybe it would if I knew more about the program. But most of the
haplotypes are in a cube with some offshoots. The cube is caused mainly by
the common recurrent mutation 309.1C. The diagram would be greatly
simplified if I took out 309.1C, but I left it in for a purpose. I think the
diagram illustrates that there may be no way to determine the exact order of
the mutations in this case.
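The bookkeeping behind the node sizes and labels can be sketched as follows. This is only an illustration in Python with invented kit labels and data, and it says nothing about how Fluxus actually builds its network: it just groups kits with identical mutation lists, keeps the first kit in list order as the label, and counts the members of each group.

```python
# Invented example data (not real project results), in project-page order.
KITS = [
    ("kit1", {"497T", "16234T"}),
    ("kit2", {"497T", "16234T", "309.1C"}),
    ("kit3", {"16234T", "497T"}),
    ("kit4", {"497T", "16234T", "309.1C"}),
    ("kit5", {"497T", "16234T", "16519C"}),
]


def collapse_haplotypes(kits):
    """Group kits with identical mutation sets.

    Returns one (label, size) pair per distinct haplotype, where label is the
    first kit seen with that haplotype and size is the number of matching kits.
    """
    groups = {}
    for kit, mutations in kits:
        # frozenset makes the mutation set usable as a dictionary key.
        groups.setdefault(frozenset(mutations), []).append(kit)
    # Dictionaries keep insertion order, so groups come out in list order.
    return [(members[0], len(members)) for members in groups.values()]
```

Here collapse_haplotypes(KITS) gives three nodes: kit1 and kit2 each standing for two matching haplotypes, and kit5 standing alone.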
The next set is for members probably in subclades K1c and K1c2, both of
which are usually easily determined by looking at HVR mutations. I'll admit
to being interested in this one since I appear on it. The tables are at:
http://freepages.genealogy.rootsweb.com/~wrhurst/mtdna-k/k1ctable.htm
I set the program to ignore 523- when determining genetic distances, since
that one is always paired with 522-.
The Fluxus chart for K1c/K1c2 is at:
http://freepages.genealogy.rootsweb.com/~wrhurst/mtdna-k/k1ctable.jpg
This time the Kroot is on the right. Follow the line left to the first node, which
is for a "perfect" K1c. There are several offshoots which represent K1c with
one or more additional mutations. The line marked 16320T leads past three
mutations to the largest node, which includes several "perfect" K1c2's. It
has its own offshoots, one of which features a back mutation on 16224C. The
one farthest from the Kroot is, of course, me. I try to be different.
My plan is to update the largest table above as new members join the
project. I will no doubt add some more of the small tables and Fluxus charts
later. I'll list those on the News tab on the K project website at:
http://www.familytreedna.com/public/mtDNA%5FK/
Suggestions and questions are welcome.
Again, thanks to Tom Glad for this new utility for us to use.
Bill Hurst
Administrator, mtDNA Haplogroup K Project