TMG-L Archives



From: "Grawrock, David" <>
Subject: RE: [TMG] Copious Ancestors
Date: Fri, 8 Sep 2000 07:47:04 -0700


Tim,

While I agree that some of the numbers indicate better quality, you have to
be careful about what they mean.

For instance, I'm currently going through my database and using my
"Assumption" source. This is for information that I believe to be true but
have not yet documented. For example, I may have no marriage source but have
found the couple on 4 census reports, so I assume they were married and use
the assumption as my source along with the census citations. That's 5
citations for one event; however, the surety is only 0, so it's still just a
guess.

If I have a marriage certificate, then the marriage will have only a single
citation, but its surety will be 3. So this marriage has better
documentation than the first one, but it has 4 fewer citations.

The real number that shows a well documented database is the number of
citations that have a surety of 3.

So Bob, in the hope of obtaining really good statistics (so those who've
done a good job can brag), can we have a statistic in V5 that computes the
average surety value? We would end up with stats that look like this:

1.71 events per person
2.35 citations per person
2.13 average surety (this person's doing a good job of research)

or what you'd see in my database:
0.03 average surety (well, maybe not that bad, but it sure seems like it!)
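
The two marriages above can be worked through as a rough sketch. It assumes each citation carries a single surety value from 0 to 3 and that all five citations on the assumed marriage are rated 0; the function name and the sample lists are hypothetical and are not part of TMG.

```python
from statistics import mean

def surety_stats(sureties):
    """Return (average surety, number of citations with surety 3)."""
    if not sureties:
        return 0.0, 0
    return mean(sureties), sum(1 for s in sureties if s == 3)

# Hypothetical data: one surety value per citation.
assumed_marriage = [0, 0, 0, 0, 0]   # "Assumption" source plus 4 census reports
certified_marriage = [3]             # a single marriage certificate

average, top_rated = surety_stats(assumed_marriage + certified_marriage)
print(f"{average:.2f} average surety, {top_rated} citation(s) at surety 3")
```

Under these assumptions the event with five citations adds nothing to either figure, while the single certificate accounts for the only surety-3 citation.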

David

-----Original Message-----
From: Tim Doyle [mailto:]
Sent: Thursday, September 07, 2000 7:58 PM
To:
Subject: Re: [TMG] Copious Ancestors


> I have read people stating "I have 30.000 in my records or 45.000
> accumulated" ...

I would rather have one well-documented person in my database than 100
non-documented individuals imported from a correspondent or an online
database. It's not the quantity, but the quality that matters. It's very
easy to quote the number of people in your database. Perhaps we need
another number that indicates the quality?

I checked two of my databases, my main database known as DOYLE and a
specialty database known as OTTERBG. The OTTERBG database is a project in
which I am extracting all of the early records of the town of Otterberg,
Germany and am being very, very good with recording sources for
everything entered. The DOYLE database is the accumulation of my research
over the years with varying degrees of citation. Let's see what I came up
with:

DOYLE: 11,613 people
OTTERBG: 1,479 people

Conclusion: The Doyle database "wins" as it is MUCH bigger - but is this
meaningful? Let's look further...

DOYLE:   41,255 citations
         23,548 events

OTTERBG:  6,187 citations
          2,527 events

By sheer numbers, the DOYLE database wins again, but that's just volume,
not value. If we look further, we see something more:

DOYLE:   2.02 events per person
         1.75 citations per event

OTTERBG: 1.71 events per person
         2.45 citations per event

We can now see that the OTTERBG database far exceeds the DOYLE database in
citations per event - a much better documented database! The DOYLE
database still has an advantage in the number of events per person, a
good indication of the volume of data per person entered.
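
For anyone who wants to reproduce the two ratios, here is a minimal sketch using the counts quoted above. The function name is hypothetical, and the totals are assumed to have been obtained separately (for example, from a report written to disk). Note that with ordinary rounding the DOYLE events-per-person figure comes out as 2.03 rather than the 2.02 shown, which suggests the original figure was truncated.

```python
def quality_ratios(people, events, citations):
    """Return (events per person, citations per event), rounded to 2 places."""
    return round(events / people, 2), round(citations / events, 2)

# Counts as quoted in the message above.
doyle = quality_ratios(people=11_613, events=23_548, citations=41_255)
otterbg = quality_ratios(people=1_479, events=2_527, citations=6_187)

for name, (per_person, per_event) in (("DOYLE", doyle), ("OTTERBG", otterbg)):
    print(f"{name}: {per_person:.2f} events per person, "
          f"{per_event:.2f} citations per event")
```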

In review, I believe that the number of people in a database is the LEAST
important number to consider. The best numbers to examine are the citations
per event, which show how well documented the database is, and the events
per person, which show how 'well rounded' the database is.

Comments would be greatly appreciated.

Tim Doyle

P.S. Bob: I had a tough time getting the numbers of events and citations
and eventually ran a report to disk and then used my editor to obtain a
count. Is there an easier way? I believe that there is a person on this
list that maintains a set of utilities for TMG - perhaps he could add a
function to examine a database and evaluate the quality of it based upon
these numbers? If these numbers were quickly available, perhaps even from
within TMG, people could quickly use them to evaluate the relative value
of databases.



