GENEALOGY-DNA-L Archives

From: "Diana Gale Matthiesen" <>
Subject: Re: [DNA] exact match at 67
Date: Thu, 11 Aug 2011 19:02:56 -0400
References: <005801cc574b$96bb1af0$c43150d0$@dgmweb.net> <CAKWx04SbrJFe_UDuKKBnSY2wxTX2b6BE6KVJmjhjb27M5gHD-g@mail.gmail.com> <950D783B573A4FE78DD7CD26F8C8DB53@jimpc> <CAKWx04SAWri+7oaHsDsp3dUfjY10AY3He7gCgbiNOmdY-o0K1A@mail.gmail.com> <4e43c01b.46b0340a.182f.4774@mx.google.com> <009e01cc5853$e3c54d30$ab4fe790$@dgmweb.net><4E442386.3030500@gmail.com>
In-Reply-To: <4E442386.3030500@gmail.com>


> From: David Johnston
> Sent: Thursday, August 11, 2011 2:47 PM
>
> On 8/11/11 1:24 PM, Diana Gale Matthiesen wrote:
> > The answer is you cannot, and you cannot because
> > mutations are random.
>
>
> Come on Diana.

You can cut the condescension; I'm not wilted by it.

> You keep making this argument but it doesn't make
> any sense.

It makes perfect sense.

> Assuming we trust the mutation rates, we can write
> down the probability at each generation.

Yes, of course you can calculate probabilities from mutation rates,
which are based on large sample sizes. But in the case at hand (a
67/67 match between two persons), those probabilities are of virtually
no *practical* use to the genealogist or project admin because all
they tell you is that these two individuals almost certainly have a
common ancestor in genealogical time.
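
As a rough illustration of the kind of calculation being discussed, the sketch below computes the chance of an exact 67/67 match for a common ancestor at various depths. It assumes a single average rate of 0.0025 mutations per marker per father-to-son transmission (an illustrative figure, not one taken from this thread), treats markers as independent, and ignores back mutations. The names `p_exact_match` and `MU` are hypothetical.

```python
# Sketch only: MU is an assumed, illustrative average mutation rate,
# not a value quoted in this thread. Real rates vary by marker.
MU = 0.0025


def p_exact_match(markers: int, transmissions: int, mu: float) -> float:
    """Probability that no marker mutates in any of the father-to-son
    transmissions separating two men (independent markers, no back
    mutations)."""
    return (1.0 - mu) ** (markers * transmissions)


# Two men whose common ancestor is g generations back from each of them
# are separated by 2 * g transmissions.
for g in (1, 2, 4, 8, 12):
    p = p_exact_match(67, 2 * g, MU)
    print(f"common ancestor {g:2d} generations back: P(67/67) = {p:.3f}")
```

Under these assumptions the probability falls steadily with depth but stays above zero throughout genealogical time, so an exact match by itself does not single out one generation.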

> What you do with those probabilities is
> up to you.

Of course.

> But it is indeed useful to know what is likely
> and what is not and to have quantitative
> assessments of those probabilities.

Yes, but beyond the division between the possible and impossible,
knowing the probability at each generation is of no practical use.

> > I have brothers who match 66/67, a father-son
> > who match 66/67, and an uncle-nephew who match
> > 65/67. At the other extreme, I have eighth
> > cousins matching 111/111 (and 134/136).
>
>
> You have an enormous database of people in your projects.

I do? I thought my surname projects were small (my five projects have
3, 16, 28, 31, and 61 members). What I do have is two well-tested
families: one CARRICO family with all 17 members tested to 67 markers
and a few maxed out, and one STRAUB family with all 25 members tested
to 67 markers, nearly all tested to 111 markers, and many tested to
136 markers. I do think these families give me some perspective on
what the test results of the descendants of a near common ancestor
look like.

One obvious fact is that the number of mutations the descendants of a
common ancestor in genealogical time can accumulate is NOT consistent
or uniform. It does you no earthly good whatsoever to know the
percent probability of a connection in each generation because the
reality could be so different from the prediction. Some descendants
accumulate no mutations over the generations, while others accumulate
one, two, or three, and there's no pattern to who accumulates them.
In other words, their appearance doesn't conform to the probability
statistics.
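
To make the spread concrete, here is a minimal sketch of how widely the genetic distance (GD) can vary for a single pair under a simple binomial model. It assumes the same illustrative rate of 0.0025 per marker per transmission, independent markers, and one step of GD per mutation with no back mutations. The function `gd_pmf` and the relationship table are hypothetical names introduced for illustration.

```python
from math import comb


def gd_pmf(k: int, transmissions: int, markers: int = 67,
           mu: float = 0.0025) -> float:
    """Binomial probability of exactly k mutations between two men
    separated by the given number of father-to-son transmissions.
    mu is an assumed, illustrative rate."""
    n = markers * transmissions
    return comb(n, k) * mu**k * (1.0 - mu) ** (n - k)


# Transmissions separating each pair (nth cousins: 2 * (n + 1)).
relationships = {
    "father-son": 1,
    "brothers": 2,
    "uncle-nephew": 3,
    "eighth cousins": 18,
}

for name, t in relationships.items():
    probs = [gd_pmf(k, t) for k in range(4)]
    rest = 1.0 - sum(probs)  # probability of GD 4 or more
    cells = "  ".join(f"GD{k}={p:.3f}" for k, p in enumerate(probs))
    print(f"{name:15s} {cells}  GD4+={rest:.3f}")
```

Even in this simplified model, close relatives have a real chance of showing one or two differences and distant cousins a real chance of showing none, so any one pair can sit far from the expected value.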

> Occasionally you are going to find cases where
> unlikely things happened.

Not "occasionally," usually. And it's "usually" because a single
family is such a small sample. You keep implying that I'm not
statistically sophisticated, but you seem to keep forgetting the most
fundamental aspect of statistics: sample size. These probabilities
are great for population geneticists, where the sample size is large.
They're of little use to genealogists, beyond drawing the line between
the possible and impossible.

> That doesn't mean the statement that they
> were unlikely beforehand was incorrect.

I have never said the statistics were "incorrect." What I have said
is that they're not "applicable" to individual families in
genealogical time.

> It is just that you cherry picked a few coincidences
> to argue that math said something was unlikely, yet
> it happened.

I only gave the extremes, but it's the case all along the spectrum.
For individual families in genealogical time, the GD does *not*
correlate well with the number of generations from the common
ancestor. That is my central point.
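
One way to put a number on this is to ask how often a closer pair would show a GD at least as large as a more distant pair. The sketch below does that for second cousins versus fourth cousins at 67 markers, again under an assumed rate of 0.0025 per marker per transmission, independent markers, and no back mutations. The helper names are hypothetical, and the two pairs are treated as independent, which real pairs sharing part of a lineage are not.

```python
from math import comb


def gd_pmf(k: int, transmissions: int, markers: int = 67,
           mu: float = 0.0025) -> float:
    """Binomial probability of exactly k mutations over the given
    number of transmissions (mu is an assumed, illustrative rate)."""
    n = markers * transmissions
    return comb(n, k) * mu**k * (1.0 - mu) ** (n - k)


def p_closer_not_smaller(t_close: int, t_far: int, max_k: int = 30) -> float:
    """Probability that the closer pair's GD is greater than or equal to
    the more distant pair's GD, treating the two pairs as independent.
    Terms beyond max_k are negligible at these rates."""
    total = 0.0
    for far_gd in range(max_k + 1):
        p_far = gd_pmf(far_gd, t_far)
        p_close_ge = sum(gd_pmf(k, t_close) for k in range(far_gd, max_k + 1))
        total += p_far * p_close_ge
    return total


# Second cousins: 6 transmissions apart; fourth cousins: 10.
p = p_closer_not_smaller(6, 10)
print(f"P(GD of second cousins >= GD of fourth cousins) = {p:.2f}")
```

If that probability comes out anywhere near one half, as it does under these assumptions, then ranking two pairs by GD says very little about which pair is more closely related.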

> It is very unlikely that anyone will get struck by
> lightning in their lifetime. We have enormous amounts
> of data which show that. Yet people get struck by
> lightning and when they do, this statement, while
> still true, won't comfort them very much. Yet it
> doesn't mean that the statement is wrong or worthless.

You're messaging someone who has been struck by lightning, so I
definitely appreciate the probability of rare events. But that's not
the point here. The events we're talking about are not that rare.
What would really be rare -- even more rare than being hit by
lightning -- is if everyone descended from the same common ancestor in
genealogical time had accumulated the same number of mutations.

> > All you can say about your two READING individuals
> > is that they are almost certainly related in
> > genealogical time. Beyond that, they could connect,
> > literally, anywhere along the line.
>
> Yes it could be anywhere but some are much more likely
> than others and that should be important to know.

Actually, it's not a help, not in individual cases. The probability
statistics say the greater the genetic distance, the more distant the
common ancestor, which is probable, in general, but it's not true for
individual members of single families in genealogical time -- and that
is what we project admins are dealing with.

In other words, for a single family in genealogical time, the GD is
*not* an accurate indicator of relative relatedness. I wish it were
because then my job would be easier, but it's not. And it's not rare
that it's not; it's routine that it's not. And if you truly
understand the statistics here, you understand why it is not.

Diana




