From: "Diana Gale Matthiesen" <>
Subject: Re: [Y-DNA-projects] Y-DNA-PROJECTS Which mutation path is more likely?
Date: Thu, 24 Jun 2010 23:06:30 -0400
References: <mailman.16209.1277360324.3593.y-dna-projects@rootsweb.com><DDDB43EB231745D38CC86377276BF278@Ralphs>
In-Reply-To: <DDDB43EB231745D38CC86377276BF278@Ralphs>

> -----Original Message-----
> From: On Behalf Of Ralph Taylor
> Sent: Thursday, June 24, 2010 9:00 PM
> To:
> Subject: Re: [Y-DNA-projects] Y-DNA-PROJECTS Which mutation
> path is more likely?
> What little I know about this "mystery marker" CDY (a & b)
> comes from a bit of research to answer your query. It boils
> down to:
> 1. CDY is a multi-copy marker (Thus, "a" & "b".)

Yes, that's correct.

> 2. It is highly volatile, subject to more frequent mutation
> (as demonstrated here).

Yes, that's correct, too. It's said DYS464 is the most volatile marker. If so,
CDY must be a close second.

> 3. It is equally likely to move in either direction (up or down).

In general, yes. If the number is "high," it is more likely to go down; if
"low," it's more likely to go up. But that bias rarely comes into play because
the value is mostly moving up and down in the mid-range.

> 4. It is "palindromic". (Which, if I guess correctly, means
> it reads the same way forward & backward.)

Yes. Wikipedia has a nice explanation and illustration of palindromic:

> If anyone has better knowledge, please step in. I'm wondering whether:
> A. CDYa & CDYb have constant places on the Y-chromosome?

Yes. The "marker" is the name of the location or "locus" (plural "loci"). The
actual address of the locus on the chromosome is called its "cytogenetic
location," which is given as the chromosome number, arm, and band number(s);
however, you won't find genealogists using these.

> B. The stepwise or infinite alleles mutation models are more
> appropriate?

First, a quick explanation of what the two models mean because the names make
them sound more exotic than they really are.

The "infinite alleles" model assumes all mutations are equal, so it's simply a
count of matching markers. It's what we're doing when we say two people match
35/37. However...

The 35/37 figure may be deceiving in that while they match on 35 markers, the
mismatch on the other two may be more than one count each. For example, the
difference at one marker may be one and at the other marker may be two.
Therefore the GD (genetic distance) between the two individuals may actually be
three, not just two. As you can see, a problem with the infinite alleles model
is that it may *underestimate* the true genetic difference.

The stepwise method takes into account not just that the values at a particular
marker are different, but *how* different they are. In its simplest form, you
assume every change in the count is a single-step mutation. The case above
giving a GD of three is an example of the one-step, symmetrical (equal chance of
up or down) method, where each change in a count is considered a separate
mutation event. One problem with assuming every one-step change is a separate
mutation event is that you may *overestimate* the true genetic difference.
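
To make the difference between the two counts concrete, here is a minimal
Python sketch. The function names and the five-marker haplotypes are invented
for illustration, and it assumes single-copy markers compared position by
position; multicopy markers such as DYS464 or CDY would need separate handling.

```python
def infinite_alleles_distance(a, b):
    """Count the markers whose values differ, ignoring by how much."""
    return sum(1 for x, y in zip(a, b) if x != y)


def stepwise_distance(a, b):
    """Sum the absolute differences, treating every step as one mutation."""
    return sum(abs(x - y) for x, y in zip(a, b))


# Hypothetical 5-marker haplotypes (values invented for illustration).
cousin_1 = [13, 24, 14, 11, 30]
cousin_2 = [13, 25, 14, 11, 28]

mismatches = infinite_alleles_distance(cousin_1, cousin_2)
gd = stepwise_distance(cousin_1, cousin_2)

matched = len(cousin_1) - mismatches
print(f"{matched}/{len(cousin_1)} match, GD = {gd}")
```

The two cousins differ at two markers, by one step and by two steps, so the
match count reports two mismatches while the simple stepwise count gives a GD
of three.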

While most mutation events are, indeed, one-step (ca. 98%), a small percentage
(ca. 2%) may be two-step, plus there may be other kinds of mutation events that
can't be scored by simple counting (e.g., recLOH -- a recombinant loss of
heterozygosity). Palindromic markers are particularly prone to multi-step
mutation events and recLOHs.

So, in casual conversation, I'm apt to say something like, "these cousins have a
65/67 match" (infinite alleles model). However, when it comes to seriously
assessing their relationship, I will use the stepwise method, including the
assessment of any possible multi-step mutations, and say something like, "These
cousins have a genetic distance of 3 at 67 markers."

Bottom line: you use both, but the important one is the stepwise model. In
genealogical time frames, the two methods will often yield the same result, that
is, cousins who are a 35/37 match will have a GD of 2 (because they have two
one-step mutations). You just need to be aware that there are situations where
that is not the case.

> C. FTDNA reporting might not be similar to 464a-d, where "a"
> is always the lowest count found regardless of location?
> ("464a" for one may be "464b" for another.)

The "a" and "b", etc., do not refer to the locations of the individual 464
alleles; we don't know their individual locations. By convention, the alleles
are simply reported lo-hi. The same is true of the other multicopy markers. In
the case of DYS385a/b, the order can be ascertained with a Kittler test.
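
As a rough illustration of the lo-hi convention, the Python sketch below sorts
a set of observed copy values and then attaches the "a" through "d" labels. The
function name and the allele values are hypothetical; the point is only that
the letters follow the sorted order, not any physical position.

```python
def report_lo_hi(marker, alleles):
    """Label the copies of a multicopy marker a, b, c, ... in ascending order."""
    labels = "abcdefgh"
    return {f"{marker}{labels[i]}": value
            for i, value in enumerate(sorted(alleles))}


# Hypothetical DYS464 copy values, listed in no particular order.
observed = [17, 15, 16, 15]
reported = report_lo_hi("DYS464", observed)
print(reported)
```

Two men whose copies sit at different physical positions but carry the same
set of values would therefore get identical reports.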

> I was wrong about one thing, at least: No one addressed the
> above statements or questions with this marker. I was hoping
> to hide my ignorance <G> and maybe learn something.
> And that may help explain why your earlier query didn't
> receive a response. Most people are reluctant to answer,
> "Beats the heck out of me." Puzzlement doesn't easily
> convert to words.

The questions you asked are quite technical and have taken me a long time to
answer, which is probably the reason you didn't get a lot of responses. It's
often better to ask one question at a time.

> Statistical note: With a sample of 35,36,37 (n=3) -- the mean
> is 36 and the standard error is 1. That would put the 90%
> confidence limit (CL) for the value in the range of 35 to 37
> & 95% CL in the 34-38 range.

A sample of three individuals is too small for any statistics about them to be
meaningful. Keep in mind, always, that mutations are random, and random is the
opposite of even. IMO, when working with single families in genealogical time,
statistics are simply never going to be of any practical use beyond ruling out
the impossible. I discuss this further on this page:

> I still can't say which of the two scenarios you presented is
> more likely. Does it make a real difference? Or, is it one of
> those "How many angels can dance on the head of a pin?"
> questions. My take: Call the bet a draw and buy each other
> beers.

For the genealogist, it definitely makes a difference. Genealogists are trying
to strengthen their pedigrees, not weaken them.

