GENMSC-L Archives

From: Ian Goddard <>
Subject: Re: Beyond GEDCOM
Date: Wed, 18 Jul 2007 14:05:21 +0100
References: <slrnf9ol6e.cvf.usenet@goodwill.larseighner.com><1184689853.767732.28500@x35g2000prf.googlegroups.com><2h6ni.9164$zA4.5373@newsread3.news.pas.earthlink.net><slrnf9pttu.fnh.usenet@goodwill.larseighner.com><2t7ni.8570$Od7.2064@newsread1.news.pas.earthlink.net><9Kdni.12915$LH5.785@trnddc02><9Beni.9298$zA4.985@newsread3.news.pas.earthlink.net><iuvq93105be5ochbsi73dh7u0rforplirg@4ax.com><Hagni.8705$Od7.4359@newsread1.news.pas.earthlink.net>


Robert Melson wrote:

> In article <>,
> Denis Beauregard <> writes:
>> On Wed, 18 Jul 2007 02:07:01 GMT, (Robert
>> Melson) wrote in soc.genealogy.computing:
>>
>>>As I understand it, though, XML permits you to add/delete
>>>entities locally while still preserving the "universal"
>>>entities defined elsewhere. If that's true, then you
>>>should be able to add a "hippie marriage" entity in your
>>>local DTD that would satisfy your requirements.
>>
>> From a structural point of view, where is the difference
>> with GEDCOM ?
>>
>>
>> Denis
>>
>
> I'm not sure I can answer that question. Anything I think I
> know about XML comes from the O'Reilly "XML in a Nutshell"
> and is rusty as hell. (Hmmm, now I think of it, though, both
> V5.X and V6 are GEDCOM - the realization is different from one
> to the other, though.)
>
> I think the major difference between 5 and 6, however, is not in
> the representation but in the entities included and how they're
> defined. I haven't really looked at the BNF definitions of
> the GED language, so I can't really tell you specifically how
> the two versions differ.
>
> The advantage of XML, though, as I understand it, is that there can
> be a centrally defined stylesheet which can be imported into the

I think you mean DTD or schema. That's what specifies the data format. DTDs
are pretty well moribund; schemas are more powerful. Stylesheets are what
you'd use to specify the transformations between, say, a standard version
and a vendor-specific version.

> local environment and modified locally without affecting the base
> stylesheet. You could even, if I remember correctly, have multiple
> stylesheets - the base, a vendor version, an o/s specific version,
> one for your company and your personal version. In terms of the

Apart from the terminology, this is correct. I did a number of projects for
a client who provided personalised print services, where their clients would
specify the XML they were sending to be processed. I used stylesheets to
transform the various client schemas into a common in-house schema so
that I could handle them all with a common application code-base.

> GED standard, you might have the central "standard" maintained at
> and by the LDS, a FTM version maintained by the vendor, and a
> local version you maintain yourself to fix problems you see in
> the other two.
>
> If there's anybody here who understands this, I'd love to see/hear
> what you have to say about the GED V.6 beta standard and what it
> will mean for us mere hackers.
>
> Bob Melson
>
I took a quick look at the proposed new standard a few weeks ago. As I
recall it had actually dropped some components of the original. This would
make it far less able to deal with some of the issues raised in a
long-running thread in s.g.methods earlier this year.

XML has several advantages.

Firstly, its notions of well-formedness and validity ensure that a corrupt
file, or the output of a program generating off-spec XML, is easily picked
up.
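
As a minimal sketch of that first point, assuming Python's standard
xml.etree.ElementTree parser: it checks well-formedness only (validity
against a DTD or schema needs a validating parser), but that is already
enough to reject a damaged file at parse time. The element names are made
up for illustration.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as well-formed XML, else False."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        # Raised for unclosed or mismatched tags, stray characters, etc.
        return False

# Hypothetical records, invented for this example.
good = "<person><name>John Smith</name></person>"
corrupt = "<person><name>John Smith</person>"  # <name> is never closed

print(is_well_formed(good))
print(is_well_formed(corrupt))
```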

Secondly the use of XSLT allows transformation into XML of a different
schema, HTML or text formats, possibly even existing non-XML Gedcom.
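
To give a feel for the XML-to-Gedcom direction: Python's standard library
has no XSLT processor, so the sketch below is a hand-written stand-in using
ElementTree rather than a real stylesheet, which would express the same
mapping declaratively. The XML layout (people/person/given/surname) is
invented for the example; only the output lines follow Gedcom's
"level tag value" form.

```python
import xml.etree.ElementTree as ET

# Hypothetical source layout, not taken from any published schema.
SOURCE = """<people>
  <person id="I1"><given>John</given><surname>Smith</surname></person>
  <person id="I2"><given>Mary</given><surname>Jones</surname></person>
</people>"""

def to_gedcom_lines(xml_text):
    """Map each <person> element to a Gedcom INDI record with a NAME line."""
    root = ET.fromstring(xml_text)
    lines = []
    for person in root.findall("person"):
        lines.append(f"0 @{person.get('id')}@ INDI")
        given = person.findtext("given", default="")
        surname = person.findtext("surname", default="")
        # Gedcom marks the surname by wrapping it in slashes.
        lines.append(f"1 NAME {given} /{surname}/")
    return lines

for line in to_gedcom_lines(SOURCE):
    print(line)
```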

Thirdly there are well-established libraries in all major languages for
parsing XML so there's no reason why any application developer should have
to roll their own and get it wrong.

I doubt it's a passing fad. It's the language I wish I'd had available
donkey's years ago for exchanging data.

But that's what it is - a language for data exchange. And the issues being
raised here are essentially about the data model behind Gedcom. In order
to produce a worthwhile replacement one would really have to produce a new
data model which is accepted as better than that in use.

As far as I can see, one weakness of existing data models is that entities
are linked directly to one another. In another post in this thread I gave an
example of how handling of uncertainties could be a property of an improved
system, but as far as I can see this would be impossible to implement with
direct links: a direct link has nowhere to hold properties of its own, so
one would need additional link entities/classes/tables.
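
A rough sketch of what a link entity might look like, in Python. The class
name, the fields and the 0-to-1 confidence scale are all assumptions made
for illustration, not part of Gedcom or of any proposed standard. The point
is only that once the link is a record in its own right it can carry an
uncertainty and a source, and one child can have several competing
candidate parents.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ParentLink:
    """A parent-child link held as its own record so it can carry properties."""
    child_id: str
    parent_id: str
    confidence: float  # hypothetical scale: 0.0 (guess) to 1.0 (certain)
    source: str

# Two competing claims about the father of individual I1 (invented data).
links = [
    ParentLink("I1", "I3", 0.4, "family tradition"),
    ParentLink("I1", "I2", 0.9, "baptism register"),
]

def candidate_parents(child_id, links):
    """Return candidate parent ids for a child, most confident first."""
    matches = [link for link in links if link.child_id == child_id]
    matches.sort(key=lambda link: link.confidence, reverse=True)
    return [link.parent_id for link in matches]

print(candidate_parents("I1", links))
```

With a direct pointer from child to parent there would be room for only one
of these claims and nowhere to record how far it is trusted.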

The development process would, then, be:

1. Data model.

2. Data exchange specification (XML or otherwise).

3. Implementation.

--
Ian Goddard

Hotmail is for the benefit of spammers. The email address that I actually
read is igoddard and that's at nildram dot co dot uk

