From: David Ewing <>
Subject: Re: [DNA] Variance Assessment wrt back and parallel mutations
Date: Sun, 14 Feb 2010 14:31:59 -0700

Anatole has kindly sent me a copy off list of the family tree he constructed
from the Ewing data I previously shared with the list. This data is still
available at
I have corrected some errors in the shading of mutations in Ewing Group 3,
but numbers have changed. I hesitate to post the images Anatole sent me
without his permission, but I could easily do that if any of you would like
to see them and he would allow it.

It looks like Anatole used a phylogeny-generating software program to
compute and draw the tree; presumably this operates with some kind of
maximum parsimony algorithm and contains logic for deciding between
alternative solutions. It shows nodes, branches and the locations of the
terminal haplotypes, but does not label the branches with mutations.
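
For anyone unfamiliar with the idea, here is a minimal Python sketch of
parsimony scoring (Fitch's algorithm) for a single marker on a fixed tree.
It is not Anatole's program, whose method is only presumed above. The tree
shapes, the leaf names A to D, and the marker values are all hypothetical.
Fitch scoring also counts every change as one step whatever its size, which
is a simplification for STR markers.

```python
def fitch_score(tree, leaf_values):
    """Minimum number of state changes for one marker on a rooted binary tree.

    tree: a leaf name (str) or a 2-tuple (left_subtree, right_subtree).
    leaf_values: dict mapping each leaf name to its observed marker value.
    """
    def visit(node):
        # A leaf contributes its observed value and no changes.
        if isinstance(node, str):
            return {leaf_values[node]}, 0
        left, right = node
        left_set, left_cost = visit(left)
        right_set, right_cost = visit(right)
        common = left_set & right_set
        if common:
            # The children can share a value, so no extra change is needed.
            return common, left_cost + right_cost
        # The children disagree, so one more change is required here.
        return left_set | right_set, left_cost + right_cost + 1

    return visit(tree)[1]


# Hypothetical values of one marker in four haplotypes (not Ewing data).
values = {"A": 10, "B": 11, "C": 10, "D": 11}

# Grouping that separates the two 10s: the same change must arise twice.
print(fitch_score((("A", "B"), ("C", "D")), values))  # 2

# Grouping that puts like with like: a single change suffices.
print(fitch_score((("A", "C"), ("B", "D")), values))  # 1
```

A parsimony program prefers the second grouping for this one marker, but
with many markers pulling in different directions the best overall tree can
still require the same mutation to arise on more than one branch.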

I have figured out what mutations were used by Anatole's program to define
the branch points between and within the first two of the four main branches
Anatole identified for us. I have made a chart somewhat similar to the big
phylogeny chart I shared previously with the branches exactly as Anatole's
program drew them, but I have added labels showing the mutations. This
diagram is available at

This phylogeny tree contains only 16 haplotypes. Just this fraction of the
overall Ewing tree as drawn by Anatole shows three parallel DYS 576 = 19
mutations and parallel mutations at DYS 391 = 10 and DYS 576 = 17. Anatole
has repeatedly reassured us that there is no need to worry about parallel
mutations, but 7 of the 21 mutations in this fraction of the tree he sent to
me are parallel mutations, which is a full third of them. I did not do the
exercise of deconstructing the entire tree, but I can see a number of other
mutations in this fraction of the tree that are found elsewhere in the Ewing
data, so we will be able to demonstrate that although Anatole does not
believe parallel mutations happen very often, his phylogeny tree drawing
program is not at all embarrassed about adducing them in drawing trees.
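
The "full third" figure can be checked with a short Python sketch. The three
DYS 576 = 19 events are as stated above. The counts of two each for
DYS 391 = 10 and DYS 576 = 17 are an inference from the total of seven, and
the fourteen "unique_N" labels are hypothetical placeholders for mutations
that occur only once in this fraction of the tree (they are not listed in
this message).

```python
from collections import Counter
from fractions import Fraction


def parallel_fraction(events):
    """Return (parallel, total, fraction) for a list of mutation labels.

    A mutation counts as parallel when the same label (marker plus
    resulting value) appears on more than one branch of the tree.
    """
    counts = Counter(events)
    parallel = sum(n for n in counts.values() if n > 1)
    total = len(events)
    return parallel, total, Fraction(parallel, total)


events = (
    ["DYS576=19"] * 3                       # stated: three parallel events
    + ["DYS391=10"] * 2                     # assumed: two, inferred from the total
    + ["DYS576=17"] * 2                     # assumed: two, inferred from the total
    + [f"unique_{i}" for i in range(14)]    # hypothetical single-occurrence labels
)

parallel, total, share = parallel_fraction(events)
print(parallel, total, share)  # 7 21 1/3
```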

Any of you who have done very much with network diagrams know that with
surname project data sets of any size you end up with highly reticulated
networks which are essentially impossible to reconcile into nice, consistent,
linear sequences of mutations. I think this is because there are so many
parallel mutations, but I still do not understand why this should be so.

The fact remains that we are finding many more parallel mutations than the
mathematics suggests we should. I am too worn out today to make another
argument that we are also finding some back mutations, but my earlier
observation about DYS 391 being 11 in the R:M222 common ancestor, then 10 in
the common ancestor of Ewing Groups 1&2, and then back to 11 again in the
common ancestor of Ewing Group 1 remains as one example that has not been
responded to. However, Anatole's family tree implicitly claims that the
common ancestor of all Ewings in the data set had DYS 391 = 11, and that
there was a mutation DYS 391 11 > 10 giving rise to Anatole's third group
(roughly equivalent to my Ewing Group 2). That eliminates the back mutation,
but raises the question of how the child can be so much older than the

David Ewing

