> [snip] EAD documents (or "instances" as I often see them called; why?)

"Instance" is the formal term used in the ISO 8879 SGML standard,
defined as "the data and markup for a hierarchy of elements that
conforms to a document type definition" [def 4.160]. Personally, I
quite like the term "instance" as it suggests a move from the
generalisation of the DTD to a "realisation" in the instance, and
emphasises the one-to-many relationship.

The standard defines a "document" rather more generally, as "a
collection of information that is processed as a unit. A document
is classified as being of a particular document type" [def 4.96].
That is another definition I like and find myself citing
frequently, especially to try to shift people's focus away from
the printed renditions of those units.

My interpretation of this first distinction is one between:

(i) the "information unit" (document), which belongs to a type (i.e.
has a set of structural characteristics which have been previously
identified), and which may exist in various physical forms; and

(ii) a particular encoded form of that document which conforms to
the SGML standard and to a specific SGML DTD (which itself is a
particular formalisation of the structural characteristics which
identify the document type - characteristics which could in theory be
equally well described using some other syntax/notation).

So (IMHO!) an instance is always a document, but not all documents
are instances....

Further, the standard defines "SGML document" separately: "[first
sentence snipped]...An SGML document consists of data characters,
which represent its information content, and markup characters, which
represent the structure of the data and other information useful for
processing it. In particular, the markup describes at least one
document type definition, and an instance of a structure conforming
to the definition." [def 4.282]

This suggests to me a second distinction: that the instance excludes
the DTD(s), whereas the SGML document includes it/them (explicitly or
through the reference in a DOCTYPE declaration).
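
To make that second distinction concrete, here is a rough sketch of my
own (not anything from the standard). It assumes a toy "memo" document
type that I have made up, and it uses Python's non-validating
xml.dom.minidom parser on XML syntax as a stand-in, since the Python
standard library has no SGML parser. The parsed document exposes the
document type declaration and the element hierarchy (the instance) as
separate parts:

```python
from xml.dom import minidom

# Hypothetical toy document: a DOCTYPE declaration with an internal
# subset, followed by the instance itself. The "memo" type is invented
# purely for illustration.
SOURCE = """<!DOCTYPE memo [
<!ELEMENT memo (to, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT body (#PCDATA)>
]>
<memo><to>EAD list</to><body>Instance or document?</body></memo>"""

# minidom does not validate; it only records what was declared.
doc = minidom.parseString(SOURCE)

# The document type declaration: part of the document, not of the instance.
doctype = doc.doctype

# The instance: the hierarchy of elements that is meant to conform to the DTD.
instance = doc.documentElement

print("declared document type:", doctype.name)
print("root element of instance:", instance.tagName)
print("instance alone:", instance.toxml())
```

Serialising only the root element gives the instance without its
declarations, whereas the whole parsed object corresponds more closely
to what the standard calls the SGML document.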

Having said all this, the standard itself doesn't sustain the
distinction between "document" as information unit and "SGML
document" (and neither do I, nor most SGML users, I'm quite sure!).
As a note acknowledges, "the term [document] invariably means
(without loss of accuracy) an SGML document" [note to def 4.96].

Incidentally, I don't see the same use of "instance" in the XML
Recommendation. I guess the reason is that, since the use of a
DTD is optional in XML, you can have a (well-formed) XML document
which isn't an instance of any DTD and which can exist without
reference to one.
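
As a small illustration of that last point (again only a sketch, using
a made-up "note" fragment and Python's non-validating xml.dom.minidom
parser), a document with no DOCTYPE declaration at all still parses as
well-formed XML, and the parser simply reports that there is no
document type to be an instance of:

```python
from xml.dom import minidom

# Hypothetical fragment with no DOCTYPE declaration at all.
SOURCE = "<note><to>EAD list</to><body>No DTD here.</body></note>"

# Parsing succeeds because the text is well-formed; no DTD is consulted.
doc = minidom.parseString(SOURCE)

# With no declaration present, the doctype attribute is None.
has_doctype = doc.doctype is not None

print("root element:", doc.documentElement.tagName)
print("has a document type declaration:", has_doctype)
```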

All just my reading of Dr Goldfarb, I hasten to add....

Pete Johnston
Pete Johnston (Effective Records Management Project)
Archives & Business Records Centre
University of Glasgow
77-81 Dumbarton Road
Glasgow G11 6PP
Scotland, U.K.

Tel:  (UK) 0141 339 8855 ext. 2166 or (UK) 0141-330-4159
Fax:  (UK) 0141-330-4158