
Subject: Re: [DNA] Call to participants in ALL geographic and haplogroup projects: fill in your ancestry
Date: Sun, 28 Feb 2010 02:34:05 +0000 (UTC)

Hi Andrew

At least you know what SQL is. I don't!

Using haplogroup information would be one way to sort the data. It will help a little. Actually, it will help a lot for those who aren't in R1b. I'm not sure how much it will help out in R1b at this time, but it should be more of a help as more SNPs dividing R1b are found.

I'm not sure about the answers to some of the questions you have in your mind. FTDNA should be able to help with some of them, and you might find that some data miners out there will solve some of the problems of dealing with large data sets. I think some of them have very good computers. That R-L21 spreadsheet of Mike Walsh's is getting pretty large, and while my computer does not handle it all that well, Mike's computer apparently is doing just fine with it.

So I guess one trick would be to let others solve some of the problems. Some might do good work as data miners, and some might become co-administrators. With a project that large, you could use some help from people who are not as knowledgeable about the DNA aspects as you are just to help you with the mundane tasks. You might want to keep your eyes open for enthusiastic people, especially newbies, since most of them are not already administering projects. ;-)

Best Regards


----- Original Message -----
From: "Lancaster-Boon" <>
Sent: Saturday, February 27, 2010 3:05:31 PM GMT -05:00 US/Canada Eastern
Subject: [DNA] Call to participants in ALL geographic and haplogroup projects: fill in your ancestry

Hi Kirsten

Splitting up the data is one possibility I guess we have to constantly keep
in mind, but I understand that in the past this could not be sustained.
Maybe we can set up a control somehow to do this better and compartmentalize
the work somehow.

Actually it would be easy to do this if the database were not in itself so
big. Pasting 4400 individuals into a spreadsheet and then playing around
using sort and filter does not sound like a good idea. It gets mentioned to
me every now and then that big projects require a knowledge of SQL. Gulp.
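
As a rough illustration of what "a knowledge of SQL" might amount to here, the sketch below loads a handful of member rows into an in-memory SQLite database and counts members per haplogroup. It assumes the data could be exported as rows at all. The table name, column names, kit numbers and ancestor labels are all hypothetical and are not taken from any FTDNA export format.

```python
import sqlite3

# Hypothetical member rows: (kit number, haplogroup, furthest-back ancestor).
# These values are invented for illustration only.
ROWS = [
    ("K001", "E-M35", "Ancestor A"),
    ("K002", "R1b", "Ancestor B"),
    ("K003", "E-M35", "Ancestor C"),
    ("K004", "R1b", "Ancestor D"),
    ("K005", "I1", "Ancestor E"),
]


def count_by_haplogroup(rows):
    """Load rows into an in-memory SQLite table and count members per haplogroup."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE members (kit TEXT PRIMARY KEY, haplogroup TEXT, ancestor TEXT)"
    )
    conn.executemany("INSERT INTO members VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT haplogroup, COUNT(*) FROM members "
        "GROUP BY haplogroup ORDER BY haplogroup"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for haplogroup, n in count_by_haplogroup(ROWS):
        print(haplogroup, n)
```

The same query would look no different with 4400 rows instead of five, which is the main attraction over sorting and filtering a spreadsheet by hand.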

Even then I think there are many practical challenges I hope FTDNA is
looking at. If you want to make your own big database then things like
furthest back ancestor, SNP tests, etc. are not, I think, available in any
tabular form for admins? (CSV or otherwise.) It is quite often on the E-M35
project that people ask what the latest SNP results are for a new SNP for
example, but to answer such a question means being able to sort out not
only positives but also negatives. But for many such jobs the only way to do
it is to manually open each personal account and go through it.
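
If SNP results were available in tabular form (which, as noted above, they apparently are not for admins), sorting out positives and negatives for one SNP would be a short job. The sketch below assumes a hypothetical CSV layout with `kit`, `snp` and `result` columns, where `result` is `+` or `-`. The kit numbers and the SNP names `SNP_A` and `SNP_B` are placeholders, not real results.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export: one row per kit per SNP tested.
# Column names and all values are assumptions for illustration.
SAMPLE = """kit,snp,result
K001,SNP_A,+
K002,SNP_A,-
K003,SNP_A,+
K001,SNP_B,-
K003,SNP_B,-
"""


def tally_snp(csv_text, snp):
    """Return the kits testing positive and negative for the given SNP."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["snp"] == snp:
            groups[row["result"]].append(row["kit"])
    return {"positive": sorted(groups["+"]), "negative": sorted(groups["-"])}


if __name__ == "__main__":
    print(tally_snp(SAMPLE, "SNP_A"))
```

Kits that were never tested for the SNP simply do not appear in either list, which keeps "negative" distinct from "no result".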

Or are there tricks I have not learnt?

Best Regards
