At 10:48 AM 4/14/96 -0400, Helena Zinkham wrote:
>Very interesting and helpful to have an overall reaction to EAD. (That's
>a hint; if other early implementors have general comments, it'd be great
>to hear them.)
>One thing came to mind for point (3) "indexes," and perhaps the DTD
>overall. There's a lot in EAD to help with converting data in existing
>finding aids, e.g., indexes that can't be built automatically because the
>data only appears in the index. Imagine literary manuscript collections,
>for which the container list says "Correspondence A through An"; only in
>the index are the individual correspondents listed.
One of the things we're running up against here is the long-recognized
difference between "description" and "access," a problem that doesn't just
go away in the SGML world.
On the one hand, a finding aid could simply be considered a somewhat
structured, narrative description of the archive; on the other, it should
form the basis not only for narrative display, but also various kinds of
retrieval. Unfortunately, the latter objective will often require data that
is formatted as non-narrative, normalized access points (or links to them).
This issue, by the way, underscores again the fact that there is no real
"content standard" (cataloging rule) for a finding aid, only a developing
"container standard" (EAD) that is largely non-prescriptive, though with
some implications for content. (Won't somebody please write a content
standard, if only skeletal, to help determine what in fact needs to be
encoded??? Where have you gone, Steve Hensen?)
In any event, several possible scenarios might accommodate both normalized
and narrative data in finding aids, all of which must surely have at least
been noted at one of the BFAP/EAD workshops or conferences. In each case,
however, the finding aid author would have to include additional data in the
record to convey the "normalized form" of the name, location or subject.
Here's a starter list of possible approaches:
1. Embed an SGML-MARC cataloging record within the finding aid and
include normalized forms for all the names, locations, and subjects you can
afford to create. This allows for good indexing at the record level within
a recognized standard, but doesn't get you to the exact place in the finding
aid in which the name is mentioned. (Some kind of internal tag-reference
could possibly be contrived for this.) It would lead, however, to even more
gigantic catalog records than we've seen already, which, trapped in the SGML
world, would never again load smoothly into a MARC-based system.
2.1. Embed elements within the finding aid that are explicitly for
the normalized form of the name, location or subject, e.g.
<name>W. Averell Harriman<name.norm>Harriman, W. Averell (William Averell),
(where "name.norm" is the normalized form of the name, and dates
are defined within that element)
If there's a desire to say what normalizing scheme is being used, it would
probably be done best at the record level rather than redundantly for each
2.2. Same technique as 2.1, but using interpolated SGML-MARC
elements rather than inventing new elements, e.g.,
<name>W. Averell Harriman<mfb100 I1=1 I2=0><mbs.a>Harriman, W.
(where the normalized form is encoded according to the MARC DTD,
if and when that is completed.)
I presume a way could be found to encapsulate MARC elements scattered about
the EAD record without having to add them to the EAD DTD ...
3. Embed, instead, where feasible, a reference to a local, national
or international authority record number for the name, etc. An institution
would then presumably maintain an electronic 'register' of these numbers
linked to the standard form of name, with pointers to the specific locations
in finding aids in which that number appeared, e.g.,
<name>W. Averell Harriman<an type=DLC>n50057502</an>
(where "an" is an element for "authority record number" and "type"
is an attribute indicating which authority system was being used)
n50057502 Harriman, W. Averell (William Averell), 1891-1986
with a system index link to the location of the authority number in the
finding aid. When retrieving names, locations, and subjects, the user would
search not the finding aid itself or its "raw" index, but the authorized
heading index, which would then lead to all the places in all the candidate
finding aids in which that name/number appeared.
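
For what it's worth, the 'register' in approach 3 might be sketched roughly as
follows in Python. This is only an illustration under stated assumptions: the
finding-aid ids ("fa-001", "fa-002"), the sample sentences, the function names,
and the use of a regular expression in place of a real SGML parser are all
hypothetical. Only the "an" markup and the n50057502 heading come from the
example above.

```python
import re
from collections import defaultdict

# Hypothetical authority file: authority record number -> authorized heading.
AUTHORITY_FILE = {
    "n50057502": "Harriman, W. Averell (William Averell), 1891-1986",
}

# Matches markup shaped like the example: <name>display form<an type=SYSTEM>number</an>
# (a regular expression stands in for a real SGML parser; that is an assumption).
AN_PATTERN = re.compile(
    r"<name>(?P<display>[^<]*)<an\s+type=(?P<system>\w+)>(?P<number>[^<]+)</an>"
)


def build_register(finding_aids):
    """Map each authority number to the places where it occurs.

    finding_aids: dict of finding-aid id -> encoded text.
    Returns: dict of number -> list of (finding-aid id, character offset, display form).
    """
    register = defaultdict(list)
    for aid_id, text in finding_aids.items():
        for match in AN_PATTERN.finditer(text):
            register[match.group("number")].append(
                (aid_id, match.start(), match.group("display"))
            )
    return dict(register)


def heading_index(register, authority_file):
    """Re-key the register by authorized heading, which is what the user searches."""
    return {authority_file.get(num, num): locs for num, locs in register.items()}


# Two made-up finding-aid fragments that transcribe the name differently.
finding_aids = {
    "fa-001": "Letters from <name>W. Averell Harriman<an type=DLC>n50057502</an>, 1941.",
    "fa-002": "Memo to <name>Gov. Harriman<an type=DLC>n50057502</an> on rail policy.",
}

register = build_register(finding_aids)
index = heading_index(register, AUTHORITY_FILE)
```

Looking up the authorized heading in "index" returns both locations, even
though the two finding aids transcribe the name differently.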
* * *
In the retrieval of names, locations, and subjects --as with telephone
numbers-- a near match is as good as a miss, except in a very limited, known
domain. The finding aid creator either has to accept some overhead for
authority control, preferably using a widely accepted authority system, or
resign h'self to messy and imperfect keyword-based retrieval. (One can't
just transcribe one's cake and effectively retrieve it too, however much one
might long to.)
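
A small Python sketch may make the near-match point concrete. The list of
mentions is invented for illustration, and the authority number given to the
second person is made up; only n50057502 comes from the example above. The
three functions are hypothetical stand-ins for exact-string, keyword, and
authority-controlled retrieval.

```python
# (authority number, name as transcribed in the finding aid)
mentions = [
    ("n50057502", "W. Averell Harriman"),
    ("n50057502", "Averell Harriman"),
    ("n50057502", "Gov. Harriman"),
    ("x-hypothetical-1", "E. H. Harriman"),  # a different person; number is made up
]


def exact_match(query, mentions):
    """Retrieve only mentions transcribed exactly like the query."""
    return [name for _, name in mentions if name == query]


def keyword_match(keyword, mentions):
    """Retrieve every mention containing the keyword, case-insensitively."""
    return [name for _, name in mentions if keyword.lower() in name.lower()]


def authority_match(number, mentions):
    """Retrieve every mention carrying the given authority record number."""
    return [name for num, name in mentions if num == number]


exact_hits = exact_match("W. Averell Harriman", mentions)
keyword_hits = keyword_match("Harriman", mentions)
authority_hits = authority_match("n50057502", mentions)
```

Exact matching misses two of the three mentions of the same person, keyword
matching also pulls in the other Harriman, and only the authority number
retrieves exactly the three intended mentions.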
Stephen Davis
Director of Library Systems phone: (212) 854-8584
219M Butler Library fax: (212) 222-0331
New York, NY 10027 http://www.columbia.edu/~daviss/