---------- Forwarded message ----------
Date: Wed, 24 Apr 1996 11:05:37 CDT
From: Liam Quin
To: Multiple recipients of list TEI-L
Subject: Re: EAD

Stephen Davis brought up some important
philosophical issues.  Unusually for philosophical issues, these ones
have immediate practical import :-)

> Realistically, how often do we want to wrestle with why a particular
> element isn't defined within another element?  This seems to let the
> container drive the content, where it should really be the opposite.
> WHEREVER I need a <persname> I should be able to use it.  At this rate
> it looks as though it would be best to define every element as possibly
> appearing within any other element, in any order!  And, actually, why
> not?

I've seen two main uses for SGML.

The first (and the one that's always interested me the most, I ought
to add) is the use of SGML to describe something -- whether it be a
data structure, a 17th C. dictionary, or my left ankle.
This I shall call descriptive SGML.

The second is to provide constraints on authors.  This latter use is
usually seen in corporate environments, and in the sorts of places
whence SGML first did spring.  This I shall call prescriptive SGML.

But these are not my own terms -- they are common in the SGML industry.

I have painted a black and white picture, but there are shades of grey
(and perhaps a little pink -- maybe I need different glasses, but I
_like_ the rose-tinted ones).

A certain amount of enforced structure can be a great help.  Taking it
too far can be a distraction and an annoyance.

In an SGML dictionary intended for commercial publishing, it is common
to see a requirement that every character be accounted for: every comma
must be there for a reason, or be expelled, expunged or prohibited.

On the other hand, if you're transcribing a document, you might only
be interested in adding a very few tags.  You might have no way to decide
why a comma is there, let alone have the editorial authority to delete it.
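
To make the contrast concrete, here is a rough sketch in Python rather than
in a DTD.  Everything in it is invented for illustration: the element names,
the two content models and the violations() helper are assumptions, not part
of any real DTD.  It checks element nesting only, ignores character data and
expects well-formed XML rather than full SGML.

```python
import xml.etree.ElementTree as ET

# Hypothetical prescriptive model: element name -> set of allowed child names.
PRESCRIPTIVE = {
    "entry": {"headword", "pos", "sense"},
    "headword": set(),
    "pos": set(),
    "sense": {"def", "example"},
    "def": set(),
    "example": set(),
}

# Hypothetical descriptive model: every known element may contain any other.
# A value of None plays roughly the role of ANY in a DTD.
DESCRIPTIVE = dict.fromkeys([*PRESCRIPTIVE, "persname"])


def violations(element, model):
    """Return (parent, child) pairs that the given model does not allow."""
    found = []
    allowed = model.get(element.tag, set())
    for child in element:
        unknown = child.tag not in model
        forbidden = allowed is not None and child.tag not in allowed
        if unknown or forbidden:
            found.append((element.tag, child.tag))
        # Keep walking so that nested problems are reported too.
        found.extend(violations(child, model))
    return found


# A made-up dictionary entry with a <persname> inside a definition.
SAMPLE = ET.fromstring(
    "<entry><headword>ankle</headword>"
    "<sense><def>as described by <persname>John Owen</persname></def></sense>"
    "</entry>"
)

if __name__ == "__main__":
    print("prescriptive:", violations(SAMPLE, PRESCRIPTIVE))
    print("descriptive: ", violations(SAMPLE, DESCRIPTIVE))
```

The prescriptive model objects to the <persname> inside <def>; the
descriptive one accepts the same document without complaint.  Neither is
wrong, they simply serve different purposes.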

> Perhaps we will need to rethink a good part of the structure of SGML
> documents, e.g., to use broad hierarchies reflecting significant
> structural components of the text, and then simply defining an extended
> data dictionary that can be applied wherever needed under any of the
> hierarchical levels.  (Frankly, given that there are no "rules" for the
> content of a finding aid, how can all elements NOT be valid under all
> other elements, in any order desired??)  See below for a gross
> representation of what I'm getting at.

This may well make sense for a data dictionary.

> This kind of generalized strategy might have the added benefit of
> allowing the creation of a few, more generic blanket DTDs, reducing the
> DTD-proliferation we're starting to see.  A name is a name is a name, right?

Well, no.

Even in simple documents, <TITLE> is often used for the title on the title
page and spine of a book as well as a chapter title.

We've seen A used for Author and for Anchor.

And in the proceedings of the House of Lords, perhaps we might
even see
    <TITLE>The Right Worshipful Sir</TITLE> <Name>John Owen</Name>

There is nothing wrong with this.  A Data Dictionary is usually shared
only within a single organisation or domain for which it makes sense.
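
As a small illustration of why the container matters, the following Python
sketch reads the same element name differently depending on its parent.
The MEANINGS table, the interpret() function and the sample document are
all hypothetical, and the sample is well-formed XML rather than SGML.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: (parent name, element name) -> what the element means.
MEANINGS = {
    ("titlepage", "title"): "title of the book",
    ("chapter", "title"): "title of a chapter",
    ("peer", "title"): "honorific of a person",
}


def interpret(root):
    """Yield (text, meaning) for each element listed in MEANINGS."""
    for parent in root.iter():
        for child in parent:
            meaning = MEANINGS.get((parent.tag, child.tag))
            if meaning is not None:
                yield (child.text, meaning)


# A made-up document in which <title> appears in three different containers.
DOC = ET.fromstring(
    "<book>"
    "<titlepage><title>Proceedings</title></titlepage>"
    "<chapter><title>Opening</title>"
    "<peer><title>The Right Worshipful Sir</title><name>John Owen</name></peer>"
    "</chapter>"
    "</book>"
)

if __name__ == "__main__":
    for text, meaning in interpret(DOC):
        print(f"{text!r}: {meaning}")
```

The element name alone does not say which of the three meanings applies;
the pair of container and name does.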

For the TEI, perhaps something more could be done, although I am tempted
to suggest that a registry of architectural forms might be more appropriate
than a list of element names.


Liam Quin, SoftQuad Inc +1 416 239 4801