GENEALOGY-DNA-L Archives
From: "Jim Cullen" <>
Subject: Re: [DNA] Haplogroup Predictor
Date: Tue, 8 Apr 2008 05:48:38 -0400
References: <mailman.659.1207639417.7333.genealogy-dna@rootsweb.com>
I've already received several emails with
questions, which I'll answer all at once in one
reply 'on-list'. I'll be in touch with each of you
off-list as well.
There are two sets of modals that I'm
using: one set for haplogroups and
sub-haplogroups, and another set for the Haplo
'I' subclades. The (sub)haplogroup set of
modals is 37 markers. In the beginning it
was meant to include only 20-some groups
but that has of course changed. I may ratchet
these modals up to 67 markers as I include
some of the finer divisions to increase the
resolution.
Haplo-I modals are all 67 markers; at the
subclade level, the extra markers are needed
to resolve the clades and subclades. The
marker ordering and repeat convention are
all those of FTDNA.
The predictor is not quite finished yet. It
will certainly be posted on the Cullen
Genealogy Homepage website and may also
be posted elsewhere. Ken is
co-owner of the predictor and so it may be
found on his site as well - I haven't heard
yet - in any case we'd both have to agree to
postings elsewhere.
I am using no markers beyond FTDNA's
first four panels; at least right now there are
no plans for them. This may change in time.
On occasion there will be a modal value of,
for example, 10.5 which is my way of putting
bimodal markers into the database. In the
predictor, the basic genetic distance routine starts
by just comparing haplotype to modal. A
genetic distance of "less than one" is considered
zero. Ten or eleven minus 10.5 results in a
distance of 1/2, which is less than one and is
therefore treated as zero. A modal of 10.5 thus
indicates a marker that is bimodal at 10 or 11
repeats.
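The bimodal rule above can be sketched in a few lines of Python. This is only an illustration, not the predictor's actual code: the function names marker_distance and genetic_distance are hypothetical, the per-marker differences are assumed to be summed, and differences of one or more are assumed to be left unchanged, since neither detail is described here.

```python
def marker_distance(observed, modal):
    """Distance for one marker; a difference below one counts as zero."""
    diff = abs(observed - modal)
    # A modal of 10.5 puts both 10 and 11 at a difference of 0.5,
    # which is less than one and is therefore treated as zero.
    return 0 if diff < 1 else diff


def genetic_distance(haplotype, modal_set):
    """Sum per-marker distances (summing is an assumption of this sketch)."""
    return sum(marker_distance(h, m) for h, m in zip(haplotype, modal_set))
```

With a modal of 10.5, marker_distance(10, 10.5) and marker_distance(11, 10.5) both give zero, while a haplotype value of 12 against a modal of 10 gives 2.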
There are zeros for several markers; a zero
indicates that a marker is not used. These are
DYS570, DYS576, and CDYa,b. CDYa,b are of
course too variable to be of any use. The other
two are not used at all in Ken Nordtvedt's data
for Haplo-I subclade modals so these two
markers were also stricken from the list.
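One way to picture the zero convention is a filter that drops any marker whose modal value is zero before any comparison is made. The sketch below is an assumption about how this could be done, not the predictor's code: the name used_markers is hypothetical, and the value 13 for DYS393 is made up for illustration.

```python
UNUSED = 0  # a modal value of zero marks a marker that is not used


def used_markers(names, modal_values):
    """Return (name, modal) pairs, skipping markers flagged as unused."""
    return [(n, m) for n, m in zip(names, modal_values) if m != UNUSED]


# Illustrative data only: the modal for DYS393 is invented for this example.
names = ["DYS393", "DYS570", "DYS576", "CDYa", "CDYb"]
modals = [13, 0, 0, 0, 0]
kept = used_markers(names, modals)
```

Here kept contains only the DYS393 entry; the four zeroed markers never reach the distance routine.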
I'm using a (-1) to indicate a 'null' result on a
marker, most notably DYS425. Other values,
also negative, perform other functions. Decimal
portions of marker repeats besides 0.5 will
also perform other functions if unusual modal
behavior is observed that is beyond simply
'bimodal'.
Thank you to Victor and Robert for their
assistance with haplogroup E3b, and to Ellen
and John and Steven for their help with R2. A
list of credits in the program will include
everyone who has been kind enough to lend a hand.
Modal sets added since last night bring the
totals to 55 (sub)haplogroups and 55 Haplo-I
subclades for a total of 110 modal sets.
Jim Cullen