From: "TK Boyd" <>
Subject: [DNA] mtDNA descriptions- set in stone?
Date: Wed, 23 Jun 2010 18:06:37 +0100

I've been struggling with "what could be done" to help people use
their DNA profiles to trace their family trees. I'm trying to devise
a schema for a database.... but to do so, I need to understand the
things I want to put into it.

I'm more of a computer expert than a DNA expert... in spite of a
human genetics course along the way to my B.Sc. at Cornell. But that
was "a while" ago... Aside from what I've forgotten, there have been
new discoveries. (And yes, dear niece, if you are reading this:
Watson and Crick HAD published their paper BEFORE my university

I understand sundry constraints and opportunities from the computing
side. What I seek help with is EXACTLY (computers don't cope well
with "little details") what we know when we look at mtDNA results we
get from, say, FamilyTreeDNA. And I need help with the terminology
being used.

As I understand mtDNA, it is a single strand of about 16,000
nucleotides... for my purposes "letters", and those letters can
(happily!) only be a, c, g or t. (In DNA. Yes, I know about uracil,
but it doesn't come into what we need, does it?)

And we can (for a price!) get a lab to determine for us the full
sequence of our personal mtDNA. And the report comes back to us in a
shorthand which tells us not, directly, all 16,000 letters, but tells
us where we DIFFER from the CRS.

Those differences can be substitutions, deletions and insertions.
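
To make the idea concrete, here is a minimal sketch (in Python) of how
a full sequence might be rebuilt from a reference plus a list of
differences. Everything in it is an assumption for illustration only:
the names Difference and apply_differences are hypothetical, the
four-letter "reference" is a toy and not the real CRS, and positions
are taken to be counted from 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Difference:
    """One difference from the reference (hypothetical record layout)."""
    position: int   # 1-based position in the reference sequence
    kind: str       # "sub", "del" or "ins"
    base: str = ""  # new letter for "sub"/"ins"; unused for "del"


def apply_differences(reference, differences):
    """Rebuild a full sequence from a reference plus its differences."""
    subs = {d.position: d.base for d in differences if d.kind == "sub"}
    dels = {d.position for d in differences if d.kind == "del"}
    # An insertion goes AFTER the reference letter at its position.
    inserts = {d.position: d.base for d in differences if d.kind == "ins"}

    out = []
    for pos, letter in enumerate(reference, start=1):
        if pos not in dels:
            out.append(subs.get(pos, letter))
        if pos in inserts:
            out.append(inserts[pos])
    return "".join(out)


toy_reference = "catg"  # toy stand-in, NOT the real CRS
profile = [
    Difference(position=1, kind="ins", base="t"),
    Difference(position=3, kind="sub", base="c"),
    Difference(position=4, kind="del"),
]
rebuilt = apply_differences(toy_reference, profile)
```

This version allows only one inserted letter after any given
position, which is exactly the limitation question 3 below asks
about.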

If you see errors in any of the above, I'd be delighted if you were
to explain them for me and the other readers of this thread. No
offense taken!

I hope, however, nothing above is wide of the EXACT truth of how it
all works, and that none of the following questions will be rendered
moot by errors in my premises. I've numbered my questions to help
anyone answering just a selection of them...

1) Is the CRS "set in stone"? (As far as any such thing ever will
be!!) Does anyone know how long the current CRS has been "the" CRS?
If I understand it properly, it is only a point of reference.... no
one is saying that it is "the right" sequence... it is, isn't it,
just a "pattern" on which we can base statements like "your mtDNA
sequence is as the CRS, except you have "a" at position 256"? If
that's right, it seems unnecessary and unhelpful to make changes from
time to time? Or would such a simple scenario be too simple?

2) Is the system of reporting WHERE an individual's mtDNA differs
from the CRS pretty stable? Have instances of "re-numbering" the
sequence, say to allow for deciding that an extra "t" ought to go in
between the old position 432 and 433, been few, and the most recent
one a long time ago, with new ones unlikely?

3) Have instances of multiple insertions between two "standard" CRS
nucleotides been discovered? I.e., if the first four letters in the
sequence are catg, I have no doubt that there could be someone out
there with ctatg... i.e. someone with a "t" inserted between the
"standard" c at posn 1 and the "standard" a at posn 2.

Further in that vein: Have instances been seen, say, of ctcatg... a
"t" AND a "c" inserted between the "c" at 1 and the "a" at "2"? And
if not, there's no fundamental reason, is there(?), which would make
such an odd and unlikely "double insertion" impossible? (I realize
that a strand with a double insertion at one point has an even lower
chance of being VIABLE DNA... but it could(?) happen, couldn't it?)
(For the computer database schema design to be satisfactory, it has
to be able to cope with anything that COULD happen... not just most
of the things that will probably happen.)
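
As a rough sketch of a schema that would cope with the double
insertion, assuming it can happen: give each difference a second key
column that counts inserted letters after a reference position. The
table name, column names and the "-" marker for a deletion are all my
own assumptions, and the reference is again the toy "catg", not the
real CRS. (As I understand it, reports often write insertions in a
similar position-dot-counter style, but I have not relied on that
here.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE difference (
        profile_id INTEGER NOT NULL,
        position   INTEGER NOT NULL,  -- reference position, counted from 1
        ins_index  INTEGER NOT NULL,  -- 0 = the position itself; 1, 2, ... = letters inserted after it
        base       TEXT    NOT NULL,  -- 'a', 'c', 'g', 't', or '-' for a deletion
        PRIMARY KEY (profile_id, position, ins_index)
    )
""")

# Profile 1: a "t" AND a "c" inserted after the reference letter at position 1.
conn.executemany(
    "INSERT INTO difference VALUES (?, ?, ?, ?)",
    [(1, 1, 1, "t"), (1, 1, 2, "c")],
)


def rebuild(conn, profile_id, reference):
    """Rebuild one profile's full sequence from the reference and its rows."""
    rows = conn.execute(
        "SELECT position, ins_index, base FROM difference "
        "WHERE profile_id = ? ORDER BY position, ins_index",
        (profile_id,),
    ).fetchall()
    at_position, inserted = {}, {}
    for position, ins_index, base in rows:
        if ins_index == 0:
            at_position[position] = base  # substitution or deletion
        else:
            inserted.setdefault(position, []).append(base)

    out = []
    for pos, letter in enumerate(reference, start=1):
        letter = at_position.get(pos, letter)
        if letter != "-":
            out.append(letter)
        out.extend(inserted.get(pos, []))
    return "".join(out)


sequence = rebuild(conn, 1, "catg")
```

Because the counter is part of the primary key, any number of
inserted letters after one position can be stored, each as its own
row, without renumbering the reference positions.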

Thanks for any guidance you can offer.

A digression: To illustrate the perils of a programmer's life: I once
was responsible for maintaining a school's database of its pupils. I
had built into it the necessary provisions for the fact that a boy
who came to us as Billy Brown might leave as Billy Jones. What I
hadn't seen the need for was a way to record more than one date of
birth for a given child.

One boy came from a nation our government was hostile towards. He
flew out of his home country on a first passport... first dob... to an
intermediary country, changed planes, passports and dates of birth,
and then completed his journey to us! Two dates of birth. Simple when
the records were 3x5 cards in a box. Not so simple when the records
went into a computer!

TK Boyd's site with freeware and shareware for kids, parents,
schools... and others.

