GENBOX-L Archives



From: "Geir Thorud" <>
Subject: Re: [GENBOX] Match Finder
Date: Thu, 24 Feb 2005 01:28:17 +0100
In-Reply-To: <CHEMKKAJHOKJPIEOPLKDOEJIFLAA.clcasper@sprynet.com>


Hi Cheri,

Are you using the default settings?

As a result of your request, I did some tests.

I have the opposite problem: too many matches. With the default
settings I get 3660 matches out of 12500 persons, but I guess
that fewer than 20 persons in the database are actually
duplicates.

If this function is to be of any use in large databases,
I guess the user must be able to control the Phonetic
match or set up lists of equivalent spellings of the same
name; otherwise it can only be used with exact matching.
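
To illustrate the "lists of equivalent spellings" idea, here is a minimal Python sketch. It is not how Genbox works; the spelling groups and the function names (build_lookup, names_equivalent) are made up for the example, and a real list would have to be supplied by the user.

```python
# Hypothetical user-maintained groups of spellings to treat as one name.
# The entries are illustrative only, not taken from any real database.
EQUIVALENT_NAMES = [
    {"olsen", "olson", "olssen"},
    {"hansen", "hanssen", "hanson"},
]


def build_lookup(groups):
    """Map every spelling to one representative spelling of its group."""
    lookup = {}
    for group in groups:
        representative = min(group)  # any deterministic choice works
        for name in group:
            lookup[name] = representative
    return lookup


def names_equivalent(a, b, lookup):
    """True if the names are equal or listed in the same group."""
    a = a.strip().lower()
    b = b.strip().lower()
    # Names not found in any group fall back to exact comparison.
    return lookup.get(a, a) == lookup.get(b, b)


lookup = build_lookup(EQUIVALENT_NAMES)
print(names_equivalent("Olsen", "Olson", lookup))
print(names_equivalent("Olsen", "Hansen", lookup))
```

Compared with a phonetic match, a list like this only equates spellings the user has explicitly approved, which should keep the number of false matches down.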

But even exact matching on Surname and Given name gives 2450
matches. The problem looks to be the use of patronyms:
more than 50% of the persons in my database have one of the
names Olsen, Hansen or Larsen (used as a patronym, but
registered as a surname). This could be solved by
an option that requires at least two surname variants
to match if more than one is present, since most patronyms
are entered together with a second surname (often the name
of a place). Filtering on Identifier Type=patronym is
no good: since the Identifier Type cannot be transferred
in GEDCOM, the patronym value is not used.
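
The "at least two surname variants" rule could look something like the Python sketch below. This is only my reading of the suggestion, not an existing Genbox option: the function name surnames_match and the min_shared parameter are hypothetical, and I assume the stricter rule applies only when both persons have more than one surname recorded.

```python
def surnames_match(a, b, min_shared=2):
    """Compare the surname variants of two persons (hypothetical rule).

    a, b: lists of surname strings recorded for each person.
    If both persons have more than one surname, at least `min_shared`
    of them must agree; otherwise one shared surname is enough.
    """
    set_a = {s.strip().lower() for s in a if s and s.strip()}
    set_b = {s.strip().lower() for s in b if s and s.strip()}
    shared = set_a & set_b
    if len(set_a) > 1 and len(set_b) > 1:
        return len(shared) >= min_shared
    return len(shared) >= 1


# Same patronym but different place names: no longer a match.
print(surnames_match(["Olsen", "Haugen"], ["Olsen", "Berg"]))
# Patronym and place name both agree: still a match.
print(surnames_match(["Olsen", "Haugen"], ["Olsen", "Haugen"]))
```

With a rule like this, a shared Olsen, Hansen or Larsen alone would not be enough to flag two persons as possible duplicates when a second surname is available for both.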

I tried matching surnames on the first 4 letters, with
"Treat a blank value as a matching value" unchecked. No response
after 20 minutes - Ctrl-Alt-Del! (But the program does not hang
with Phonetic match, so "blank value" is not the problem.)

Another issue: I am not sure that unchecking the
"Treat a blank value as a matching value" box works, since it
does not reduce my number of matches below 3660. Shouldn't it,
in general or "on average" - that is, with most data sets?

It also looks like matching on birth date (within 5 years)
with "match blanks" unchecked still gives a match, even if no
birth date is present for either of the two persons. I think the
help file should document what happens if zero, one or both
values are present, depending on the "match blanks" setting.
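
For reference, this is the behaviour I would expect, written as a small Python sketch. It is an assumption about what the option should mean, not a description of what Genbox actually does; dates_match and its parameters are hypothetical names, and dates are reduced to years for simplicity.

```python
def dates_match(year_a, year_b, tolerance_years, match_blanks):
    """Expected outcome of one date criterion (assumed semantics).

    year_a, year_b: int year, or None when the date is blank.
    """
    if year_a is None or year_b is None:
        # Zero or one value present: the result depends only on the
        # "treat a blank value as a matching value" setting.
        return match_blanks
    # Both values present: compare within the tolerance.
    return abs(year_a - year_b) <= tolerance_years


# Both dates blank, "match blanks" unchecked: should not match.
print(dates_match(None, None, 5, match_blanks=False))
# One date blank, "match blanks" checked: should match.
print(dates_match(1850, None, 5, match_blanks=True))
# Both present, 14 years apart, tolerance 100 years: should match.
print(dates_match(1850, 1864, 100, match_blanks=False))
```

Under these semantics, unchecking "match blanks" can only reduce the number of matches, and two dates 14 years apart always pass a 100 year tolerance.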

One option that would be useful is to be able to match death
dates, but only if present for both persons. Or maybe that
is what currently happens if matching on death date is
selected and "match blank values" is checked?

If I add a match on death date, within 100 years, the number of
matches goes from 2450 to ZERO, with "match blank values" unchecked.
For example, I have two persons that match if matching on death date
is not selected. These persons died 14 years apart, but do not
match with "death date 100 years" - something is wrong.
(My reminder: persons # 5061 and 7784 match)

There should be options to use christening date if birth is missing,
and burial date if death date is missing.
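
A minimal Python sketch of such a fallback, assuming each person is a plain dictionary with hypothetical keys "birth", "christening", "death" and "burial" holding a year or None (this says nothing about how Genbox stores events):

```python
def effective_year(primary, fallback):
    """Use the primary event year, or the fallback if the primary is blank."""
    return primary if primary is not None else fallback


def birth_year(person):
    # Christening stands in for a missing birth date.
    return effective_year(person.get("birth"), person.get("christening"))


def death_year(person):
    # Burial stands in for a missing death date.
    return effective_year(person.get("death"), person.get("burial"))


person = {"birth": None, "christening": 1801, "death": 1870, "burial": 1870}
print(birth_year(person), death_year(person))
```

The substituted year would then go into the normal date comparison, so fewer persons end up with a blank value for the criterion.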

(I have a vague memory of having reported problems with
the Match Finder before, so my apologies if I have repeated
some of that.)

Geir



> -----Original Message-----
> From: Cheri Casper [mailto:]
> Sent: 23 February 2005 20:21
> To:
> Subject: [GENBOX] Match Finder
>
>
> Has anyone used this in a while with successful results? I just
> imported an 8,000+ person database into my existing database. There
> are many many duplicates (I know this because I had been meticulously
> hand entering some of the data from this 8000 person database into my
> own), yet the match finder tells me that out of some 14,000 people
> there are no duplicates. Yet I can go to the Individuals Pick List
> and *see* matches.
>
> What's up?
>
> CheriC

