I'm writing to respond to both Peter Verhayen's initial post and to Tom La
Porte's reply. Peter, when I first heard about the SGML initiative I was
skeptical whether it would bring much additional value to the full text
indexing of ASCII text finding aids such as we had mounted on Yale's gopher
several years ago. I was then, and remain now, skeptical about how much true
value will be gained by elaborate tagging and coding - particularly of proper
nouns (whether personal or corporate names, places, subjects, forms, or
genres). The problem as I see it is that the cost of establishing authorized
entries for the descriptive sections of large archival collections and of
dealing with authorized forms in free text descriptive sections as well as in
folder headings will be too great to sustain. While a few collections with
small backlogs of unprocessed material and low acquisition activity MIGHT be
able to justify the labor expenses associated with such careful coding, I
think that most archival institutions are in a position of trying to "catch
up" with accumulations of unprocessed or badly under-described collections
and that the gains of tagged retrieval over general free text searching will
be relatively small. I should add that I am adamant that free text searching
engines be capable of doing proximity searching that is bi-directional - that
is to say that I should be able to ask the search engine to find Ezra within
3 words of Pound and retrieve files that contain the following kinds of
entries:
Ezra Pound
Ezra Loomis Pound
Ezra L. Pound
Pound, Ezra
Pound, Ezra L.
Pound, Ezra Loomis, etc.
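To make the requirement concrete, here is a minimal sketch in Python of the kind of bi-directional proximity test I have in mind. It is only an illustration, not a description of how any particular search engine works: the function names are hypothetical, and I am assuming that "within 3 words" means the two terms sit at most 3 word positions apart, in either order, after punctuation is stripped.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def within_proximity(text, term_a, term_b, max_distance=3):
    """Return True if term_a and term_b occur within max_distance word
    positions of each other, regardless of which one comes first."""
    tokens = tokenize(text)
    a, b = term_a.lower(), term_b.lower()
    positions_a = [i for i, tok in enumerate(tokens) if tok == a]
    positions_b = [i for i, tok in enumerate(tokens) if tok == b]
    # abs() makes the comparison bi-directional: order does not matter.
    return any(abs(i - j) <= max_distance
               for i in positions_a
               for j in positions_b)


entries = [
    "Ezra Pound",
    "Ezra Loomis Pound",
    "Ezra L. Pound",
    "Pound, Ezra",
    "Pound, Ezra L.",
    "Pound, Ezra Loomis",
]

for entry in entries:
    print(entry, "->", within_proximity(entry, "Ezra", "Pound"))
```

Every entry in the list above satisfies the test, while a passage in which the two words merely appear far apart in the same file would not.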
In this regard my enthusiasm for SGML and EAD is different from Tom's. What
persuaded me of the value of SGML was the existence of a wide variety of
support tools that provide for "Navigators" (to use the Panorama term). I
have always felt that one of the weaknesses of paper form finding aids,
especially for large collections, is that they seem inevitably to hide the
forest for the trees, by which I mean that by the time a researcher moves into
the container listing, the level of detail that needs to be provided usually
obscures the overall architecture of the finding aid. Machine readable finding
aids are even worse. Trying to scroll back and forth through long text files
is tiresome and confusing. When we mounted our files on the gopher we decided
to mimic our paper format so that we left page headers in place (about every
66 lines) in an effort to provide some "Continuation" messages locating
particular folders within the series, sub-series, heading structure of the
finding aid. But computer screens tend to display 25 lines of text at most
and so even here some scrolling has been inevitable. This is where the
split screen navigators of the SGML world change the model. One can have the
full text (or some portion of it) in the right hand window while displaying
the architecture in the left window. Furthermore, one can expand the branches
of the architecture at will, allowing readers to adjust their reference
window as they like. They can also point and click in the left window to go
quickly to sections of the text they want to view rather than scrolling through
the text. This kind of functionality is not easily done in HTML although
I suppose that the developing frames technology might provide some analogous
capabilities. On the other hand, by the time you start to design a frames
based system you are talking about working on the bleeding edge of Web design
and in relatively uncharted waters. For all of these reasons, I have become
a big fan of employing true SGML tools that provide for direct delivery over
the Web through Web browsers. I wish Panorama had some competition because I
believe it would push SoftQuad's development efforts beyond their already
novel and pathbreaking efforts (consider how Netscape and Microsoft are
pushing each other's browser AND server development), but I am convinced that
SoftQuad has provided a breakthrough tool (or perhaps a concept) in designing
Panorama as a cheap (free even) Netscape helper app. Yes, we need a Mac
version, and we need them to address the tables problem as soon as possible,
but in the meantime, we can tell many, many users that all they need in order
to see finding aids in a new way is to download a free copy of Panorama and
add it to their Netscape, Mosaic, or other Web browser.
Finally, while I am not a big fan of elaborate subject and noun tagging, I
second Tom's points about the value of tagging the structural elements of a
finding aid using an internationally recognized standard (EAD) that employs
a highly sophisticated and internationally recognized language (SGML). As
Daniel Pitti has pointed out on numerous occasions, SGML is going to be a big
time standard for a long time - if only because of the U.S. Dept of Defense.
There are and will be more tools for coding and for translating SGML files
and this should greatly enhance the portability of such files over the long
haul. Instead of looking solely at the initial software costs, consider the
cost over the lifetime of the files. Buying a $750 editor such as Author/Editor
from SoftQuad is a capital cost that will, in my opinion, pay for itself
several times over during the years it is used.
I've gone on a little longer than I planned, but I think that using HTML
instead of SGML is a case of "penny wise - pound foolish."
George Miles
Curator, Yale Collection of Western Americana
Beinecke Rare Book & Manuscript Library, Yale University
PHONE - 203-432-2958
FAX - 203-432-4047