
From: "Diana Gale Matthiesen" <>
Subject: Re: [DNA] Handling data in bigger projects. Was RED. Call to participants
Date: Sun, 28 Feb 2010 11:46:45 -0500
References: <0B37177B19B64203B534711185F452B2@PC>
In-Reply-To: <0B37177B19B64203B534711185F452B2@PC>

I didn't say a project admin *couldn't* make use of a SQL database, I said I
could see no reason why an admin would *have* to make use of one, and I don't.
In any case, SQL is the query language used by a DBMS (database management
system), not a spreadsheet program. Whether you use a DBMS or a spreadsheet
depends on what you want to do with the data. The only thing I've ever used
either for in my projects, with regard to test data, is to have Excel figure out
the modal values.
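
For illustration only, the modal-value calculation is simple enough to sketch
outside Excel as well. The Python below is a minimal sketch, not anyone's actual
project workflow; the kit IDs and repeat counts are invented, and the function
name modal_values is hypothetical.

```python
from collections import Counter

# Hypothetical sample data: kit ID -> repeat count at each marker.
# Kit IDs and values are invented for illustration.
results = {
    "K001": {"DYS393": 13, "DYS390": 24, "DYS19": 14},
    "K002": {"DYS393": 13, "DYS390": 23, "DYS19": 14},
    "K003": {"DYS393": 12, "DYS390": 24, "DYS19": 14},
}

def modal_values(results):
    """Return the most common value at each marker across all kits."""
    by_marker = {}
    for markers in results.values():
        for name, value in markers.items():
            by_marker.setdefault(name, []).append(value)
    # most_common(1) gives the single most frequent value; on a tie,
    # the value seen first wins.
    return {name: Counter(vals).most_common(1)[0][0]
            for name, vals in by_marker.items()}

modal = modal_values(results)
print(modal)  # {'DYS393': 13, 'DYS390': 24, 'DYS19': 14}
```

This does the same job as a MODE formula applied down each marker column of a
spreadsheet.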

One obvious way to deal with large projects -- surname or regional -- is to
break them down by haplogroup. People in different haplogroups haven't had a
common ancestor in thousands of years, so I see no earthly reason to keep them
in the same project, unless the project is small and easily managed as it is.

There are some large surname projects that have apparently reached the point
where the project admin is unable to even do something as simple as subgroup
their members. It's time for them to spin off haplogroup-specific surname
projects (e.g., Surname-I, Surname-J, Surname-R) to bring the project back to a
manageable size. People of that surname would still join the original project
to be tested, then move to the appropriate haplogroup-specific project once
their haplogroup is known. Rare or uncommon haplogroups would not be spun off.
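
The spin-off rule described above can be sketched roughly as follows. This is
an assumption-laden illustration: the roster is invented, the function name
split_by_haplogroup is hypothetical, and the min_size threshold for deciding
that a haplogroup is too rare to spin off is my own placeholder, not a
recommended figure.

```python
from collections import defaultdict

# Hypothetical roster: (kit ID, haplogroup) pairs, invented for illustration.
# None means the haplogroup is not yet known.
members = [
    ("K001", "R1b1b2"), ("K002", "R1b1b2"), ("K003", "R1a1"),
    ("K004", "I1"), ("K005", "I2a"), ("K006", "J2"), ("K007", None),
]

def split_by_haplogroup(members, surname, min_size=2):
    """Assign kits to Surname-X projects by major haplogroup letter."""
    groups = defaultdict(list)
    for kit, haplogroup in members:
        # Use the leading letter as the major haplogroup.
        key = haplogroup[0] if haplogroup else None
        groups[key].append(kit)

    projects = {surname: []}
    for key, kits in groups.items():
        if key is not None and len(kits) >= min_size:
            # Large enough to spin off, e.g. "Surname-R".
            projects[f"{surname}-{key}"] = kits
        else:
            # Untested kits and rare haplogroups stay in the original project.
            projects[surname].extend(kits)
    return projects

projects = split_by_haplogroup(members, "Surname")
for name, kits in projects.items():
    print(name, kits)
```

With this sample roster, the R and I kits move to Surname-R and Surname-I,
while the lone J kit and the untested kit stay in the original project.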

Likewise with regional projects. Once they become very large, it makes no
sense, to me, to keep them together because people in different haplogroups have
no recent genetic connection.

All IMO, of course. And YMMV.


> -----Original Message-----
> From: On Behalf Of Lancaster-Boon
> Sent: Sunday, February 28, 2010 10:47 AM
> To:
> Subject: [DNA] Handling data in bigger projects. Was RED. Call to participants
> Kirsten,
> I agree. We will indeed be keeping an eye out for anyone who has ideas about
> this aspect of the project. Just to give an example of one problem we face:
> we have not even begun to try to work out how we would absorb data from
> other testing companies.
> Diana,
> I think it would be very interesting to hear more ideas about how to handle
> data. Certainly I have always used plain old-fashioned spreadsheets whenever
> I needed to go beyond test company supply tables (which is increasingly
> important).
> I think there are two reasons for considering going beyond spreadsheets:
> 1. The sheer size of the database when you look at a case like the British
> Isles Project.
> 2. With a spreadsheet, you can play with the data offline, extract and
> format it, and post sections of it on the web, but when a project has the
> more complex goal of allowing participants and others to play with the data
> themselves, you cannot simply post the whole database on a webpage, so you
> need some setup whereby a person can ask for certain data matching certain
> criteria.
> A really excellent example is the E-M35 database which Victor Villareal
> set up:
> I understand it to be SQL based.
> Best Regards
> Andrew
> ----
> From: "Diana Gale Matthiesen"
> SQL is a computer database query/maintenance language. I can think of no
> reason why a DNA project admin would have to know how to use it.
> ----
> From:
> With a project that large, you could use some help from people who are not
> as knowledgeable about the DNA aspects as you are just to help you with the
> mundane tasks. You might want to keep your eyes open for enthusiastic
> people, especially newbies, since most of them are not already administering
> projects. ;-)
