GENEALOGY-DNA-L Archives



From: "Bonner, Gregg" <>
Subject: [DNA] Poisson vs. others
Date: Sat, 6 Jul 2002 06:40:01 -0400


Hi Folks,

John wrote:

If you assume an average rate of mutation PER GENERATION, the Poisson
distribution is a decent approximation in the long run, but the binomial
distribution is the EXACT representation for all cases. I cannot see a
motivation to use an approximation when the exact formula is so simple
and easy.

My reply:

The most obvious motivation to use Poisson rather than the exact binomial is
that the exact binomial will, as a practical matter, be impossible to use in
many cases.

Take my Lentz/Lance scenario, where we test 12 markers through a total of 36
transmission events. The exact binomial in this case would be:

P(observing k mutations out of n [in this case, 432] chances) =
[n! / (k! (n-k)!)] (p^k) (q^(n-k))

where p is the probability of a mutation at each chance and q = 1 - p.

Rather than plug in any number of hypothetically observed mutations and the
other values, I will only point out that the first part will crash many
calculators/computers - try getting a value for 432 factorial. Older
calculators generally max out at 69 factorial (1.711E98), because 70
factorial is greater than 1E100 (1.198E100); that is the point where the
exponent crosses over into 3 digits. On my home computer, I can calculate
factorials up to 170 (7.257E306). So my home computer cannot get me even
near the required n=432, and my sample is small at only 5 people. I realize
that people can call someone up with a bigger, better computer to get
factorial values, but this is not terribly convenient.
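
A minimal Python sketch of these limits, assuming ordinary double-precision
floating point like that of a typical home computer. The function name
float_factorial is made up for this illustration; it multiplies in floating
point on purpose, since Python's built-in integers would not overflow.

```python
import math

def float_factorial(n):
    """Compute n! by repeated multiplication in double precision."""
    result = 1.0
    for i in range(2, n + 1):
        result *= i  # silently becomes inf once the double range is exceeded
    return result

# 69! still has a two-digit exponent; 70! is the first to pass 1E100.
print(f"69!  = {float_factorial(69):.3E}")
print(f"70!  = {float_factorial(70):.3E}")

# 170! is the largest factorial a double can hold; 171! and beyond overflow.
print(f"170! = {float_factorial(170):.3E}")
print(f"171! = {float_factorial(171)}")
print(f"432! = {float_factorial(432)}")
```

The last two lines should print inf rather than a number, which is the
double-precision version of the calculator giving up.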

Since Poisson is P(k) = (e^(-np)) ((np)^k) / k!, the only factorial it needs
is k!, the factorial of the number of mutations observed. It has no such
large factorials to handle, unless you really expect to have a dataset that
shows greater than 69 mutations.
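
A rough Python sketch of the Poisson calculation for the 432-chance scenario.
The mutation rate p = 0.002 per marker per transmission is a hypothetical
value chosen only for illustration; no rate is given in this message. The
function names are likewise made up. For comparison, the exact binomial is
computed with math.comb (Python 3.8 or later), which relies on
arbitrary-precision integers and so is not the kind of tool the message
assumes.

```python
import math

def poisson_pmf(k, n, p):
    """Poisson probability of k mutations when the expected count is n*p."""
    mean = n * p
    return math.exp(-mean) * mean ** k / math.factorial(k)

def binomial_pmf(k, n, p):
    """Exact binomial probability of k mutations in n chances."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

n = 12 * 36   # 12 markers through 36 transmission events = 432 chances
p = 0.002     # ASSUMED mutation rate, for illustration only

for k in range(5):
    print(k, poisson_pmf(k, n, p), binomial_pmf(k, n, p))
```

The largest factorial the Poisson function ever needs is k!, so it stays
well inside the range of an ordinary calculator for any plausible number of
observed mutations.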

Cheers,

Gregg






