GENEALOGY-DNA-L Archives


From: "Alastair Greenshields" <>
Subject: Re: [DNA] Haplogroup fields at Ybase
Date: Sun, 2 Mar 2003 04:08:08 -0000


Hi,

Non-standard measurement of markers is always a problem, and DYS454 and
DYS455 are notable examples.

Collaboration is the best route toward standardization and it should be a
matter of course for any laboratory to obtain pre-typed DNA verified in
several labs with which to compare and/or to construct allelic ladders using
DNA of known sequences.

The current situation is detrimental and I would trust that corrections
will be issued to past customers as soon as this is cleared up.

It does indeed cause a problem when cross-comparing haplotypes and, as Orin
pointed out, the number of overlapping markers is effectively reduced by
two.

A possible stop-gap is simply to exclude those markers from your Ybase
haplotype search.
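
As a rough illustration of that stop-gap, here is a minimal Python sketch of
comparing two haplotypes while leaving out the disputed markers. It is not
how Ybase itself works; the function names and the repeat values are made up
purely for the example.

```python
# Hypothetical sketch: a haplotype is a dict of marker name -> repeat count.
# The repeat values below are invented for illustration only.
EXCLUDED = {"DYS454", "DYS455"}  # markers currently measured inconsistently


def shared_markers(a, b, excluded=EXCLUDED):
    """Markers typed in both haplotypes, minus the excluded ones."""
    return sorted((set(a) & set(b)) - set(excluded))


def mismatches(a, b, excluded=EXCLUDED):
    """Shared, non-excluded markers on which the two haplotypes differ."""
    return [m for m in shared_markers(a, b, excluded) if a[m] != b[m]]


hap_a = {"DYS19": 14, "DYS390": 24, "DYS454": 11, "DYS455": 11}
hap_b = {"DYS19": 14, "DYS390": 23, "DYS454": 12, "DYS455": 8}

print(shared_markers(hap_a, hap_b))  # overlap is reduced by two markers
print(mismatches(hap_a, hap_b))
```

The cost is the one Orin described: the comparison runs on two fewer
markers, but apparent mismatches caused only by lab conventions drop out.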

Orin did bring up the idea of lab identification. I could have put data
fields in the submission form for this purpose from the start, but there are
always some cases where people have been tested by more than one lab.

Also, the idea of Ybase was to share information across the board -
searching by one company would have defeated the object and possibly
polarised the issue. I am all for gathering as much useful info as
possible, but pinning down one lab per record has its problems.

An interesting nomenclature case is GATA-H4. As I understand it, this was
first published by Gonzalez-Neira (Forensic Science International 122 (2001)
19-26), along with a suggested nomenclature. Later the US National
Institute of Standards and Technology (of which Dr. John Butler is a part)
developed a multiplex capable of amplifying 20 markers in one go.

However, primers can be pesky little things and the same primers used in one
multiplex cannot always be used for another multiplex.

This was the case (I presume) with the NIST 20 multiplex. The original
primers used in the Gonzalez-Neira multiplex amplify a larger region than
the NIST primers. Hence the difference in reported sizes for this marker.
Both nomenclatures are correct for their respective primers, and although
there is a strong case for going with the original nomenclature, the NIST
primers simply don't cover the extra DNA stretches that result in the higher
figures.

Hence the confusion.
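
If the extra stretch really is non-variable, the two scales should differ by
a constant number of repeats, and converting between them is simple
arithmetic. The Python sketch below shows the idea only. The scheme names
and the offset of 5 are placeholders I have invented, not the real
difference for GATA-H4 or any other marker.

```python
# Hypothetical sketch: each nomenclature is described by its offset, in
# repeats, from an arbitrary reference scale. The figures are placeholders,
# NOT published values for any real marker.
EXAMPLE_OFFSETS = {
    "original": 0,           # longer amplicon, higher reported figures
    "shorter_amplicon": -5,  # placeholder: reports 5 repeats fewer
}


def normalise(value, scheme, target, offsets=EXAMPLE_OFFSETS):
    """Convert a repeat count from one nomenclature to another.

    Assumes the extra stretch is non-variable, so the two scales differ
    by a constant.
    """
    return value - offsets[scheme] + offsets[target]


print(normalise(20, "original", "shorter_amplicon"))
print(normalise(15, "shorter_amplicon", "original"))
```

A conversion like this only works when you know which nomenclature a result
was reported in, and it silently breaks if the constant-offset assumption
fails.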

But do you stop measuring the larger, and apparently non-variable, region,
or might this blinker the possibility that rare mutations could occur in
this extra stretch? Testing our close primate relatives can throw up
results that question the standards we adopt.

Haplogroups have also had to undergo nomenclature standardization. For
several years, there were many systems in use. In the end, a new system has
been adopted.

The markers GATA-A4 and DYS439 are also a case in point. The primers used
were effectively amplifying the same region, and the naming of GATA-A4 was
dropped.

As new information becomes available, naming systems are re-examined. So,
there is always a bit of play before standardization (or should that be
standardi's'ation) comes through. Even the standards adopted today will
evolve and have to be re-examined at a later date, which is why I am not
typing away at a Spectrum 16K computer.

I'm not disputing FTDNA's particular point on DYS454 and DYS455, but you can
see the present flux will eventually settle down.

The best solution is to push for the correct numbers for the above markers
to be given to each customer (who, after all, have paid for correct
results).

As soon as this is done, the problem should fade (provided people update
their Ybase entries and various websites).

To answer the enquiry from Jim Hull regarding how many to put in the 'no. of
participants' field at Ybase, I would say that the more information the
better, but it is entirely up to you. It might, for example, show the modal
haplotype for a particular surname line. Whilst it does boost the number of
overall haplotypes counted at Ybase, it has another benefit: people can see
which are the largest (and most successful) projects.
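
For anyone unsure what a modal haplotype amounts to in practice, here is a
small Python sketch: for each marker, take the most common value among the
participants. The function name and the sample values are invented for
illustration, and ties are broken arbitrarily (first value seen), which a
real project would want to handle more carefully.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Return the most common repeat count per marker across participants.

    Each haplotype is a dict of marker name -> repeat count. Markers
    missing from a participant are simply skipped for that participant.
    """
    markers = set().union(*haplotypes)
    modal = {}
    for marker in sorted(markers):
        counts = Counter(h[marker] for h in haplotypes if marker in h)
        modal[marker] = counts.most_common(1)[0][0]
    return modal


# Made-up values for a three-participant surname project.
project = [
    {"DYS19": 14, "DYS390": 24},
    {"DYS19": 14, "DYS390": 23},
    {"DYS19": 15, "DYS390": 24},
]

print(modal_haplotype(project))
print(len(project))  # the figure that would go in 'no. of participants'
```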

The default is 1 and the field goes up to 20, but the great majority keep
the default.

On a side note, when the statistics page was drawn up, all records were
considered to only have one haplotype, thus reducing the amount of
'skewing'.

Hope this answers some questions.

Kind Regards,
Alastair

