GENEALOGY-DNA-L Archives

From: David Johnston <>
Subject: Re: [DNA] exact match at 67
Date: Thu, 11 Aug 2011 21:29:52 -0500
References: <005801cc574b$96bb1af0$c43150d0$@dgmweb.net> <CAKWx04SbrJFe_UDuKKBnSY2wxTX2b6BE6KVJmjhjb27M5gHD-g@mail.gmail.com> <950D783B573A4FE78DD7CD26F8C8DB53@jimpc> <CAKWx04SAWri+7oaHsDsp3dUfjY10AY3He7gCgbiNOmdY-o0K1A@mail.gmail.com> <4e43c01b.46b0340a.182f.4774@mx.google.com> <009e01cc5853$e3c54d30$ab4fe790$@dgmweb.net> <4E442386.3030500@gmail.com><00f201cc587a$cf36c7f0$6da457d0$@dgmweb.net>
In-Reply-To: <00f201cc587a$cf36c7f0$6da457d0$@dgmweb.net>


On 8/11/11 6:02 PM, Diana Gale Matthiesen wrote:
> Yes, of course you can calculate probabilities from mutation rates,
> which are based on large sample sizes. But in the case at hand (a
> 67/67 match between two persons), those probabilities are of virtually
> no *practical* use to the genealogist or project admin because all
> they tell you is that these two individuals almost certainly have a
> common ancestor in genealogical time.
>
They are not based on large sample sizes. There is no sample. There is
just one unknown variable: the TMRCA of two individuals. Before looking
at DNA data, you know almost nothing about the TMRCA. Every number of
generations (T = 1, 2, 3, 10, 100) is about equally likely. If you then
determine that they are a 67/67 match, you can recalculate the
probabilities using Bayes' theorem. Now you can effectively rule out T >
10 or so. As you say, that shows they have a common ancestor on
genealogical time scales.
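
For reference, the update described above can be written out as a worked
equation. This is only a sketch under assumptions that are not stated in
this message: a flat prior over T, a single per-marker, per-generation
mutation rate (called \mu here, a hypothetical symbol), and no
back-mutations. With two lines of descent of T generations each, an exact
67-marker match means no mutation in 2 x 67 x T transmissions.

```latex
% Bayes' theorem for the TMRCA T, given an exact 67/67 match
P(T = t \mid \text{match})
  = \frac{P(\text{match} \mid T = t)\, P(T = t)}
         {\sum_{s} P(\text{match} \mid T = s)\, P(T = s)}

% Assumed likelihood: no mutation on any of 67 markers over 2t transmissions
P(\text{match} \mid T = t) = (1 - \mu)^{2 \cdot 67 \cdot t}

% With a flat prior, P(T = t) cancels, leaving the normalized likelihood
P(T = t \mid \text{match})
  = \frac{(1 - \mu)^{134\, t}}{\sum_{s} (1 - \mu)^{134\, s}}
```

Because the likelihood shrinks geometrically in t, large values of T end
up with very little posterior probability.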

BUT that isn't all you can say. If it were, you would be right, but you
are not. In addition to saying that T is between 1 and 10, you can
calculate the probability of each value. T = 1 and T = 10 are not equally
likely. Far from it. The ratio is actually more like 66: it is 66 times
more likely that T is 1 than that it is 10. Some people would consider
that interesting information. I guess you would not.

You keep talking about samples. There is no sample. There is a single
unknown variable that you are trying to constrain. It seems to me
like you are very uncomfortable with the idea of a distribution of
probabilities rather than absolute knowledge. Because it doesn't give
you absolute knowledge, you claim it is useless.

Do you ever look at a weather forecast for the next day? What if the
weather people say that there is an 80% chance of rain? Do you find that
completely useless information? I don't. To me that means it will
probably rain, so I probably won't plan a picnic. Yes, it might not
rain. If they are good at forecasts, it will rain about 80% of the time
when they say that. Yet, there is only one tomorrow. You want to know what
will happen tomorrow, not just the probability. Yet, that exact
knowledge isn't available. Nonetheless people look at forecasts and make
plans based on probabilities and that is sensible behavior. It is far
from useless.
Dave










