Peter Verheyen raises some important general concerns. I put these remarks
together before seeing what Tom La Porte and George Miles wrote, and they've
largely beaten me to the punch, but here anyway--
> One of the things which quickly became apparent was that the easiest
> way to distribute these [finding aids] electronically is via the web.
> Of course we could (potentially) also have a copy of Panorama on our
> server that researchers could download, use to view whatever finding
> aids they were interested in (but can't print from it, and they have
> to use Windows. For myself I would have to admit that I would be
> initially quite leery of having to download, yet again, another piece
> of software just to be able to view something. I'm also playing the
> devil's advocate here, but only a bit.
It's a fair criticism, at least of the state of the art with SGML software.
For EAD applications, the main advantage of Panorama 1.5 over a web browser is
that the navigators are excellent: if your SGML is rationally implemented,
navigators are simple to set up, and quite readily represent the structure of
your finding aid. Alternative style sheets -- say, one which shows control
information, another which hides it for easier browsing -- are another nice
feature.
> In order to be able to convert the finding aids on the fly, so to
> speak, we would need to get Dynaweb, which I found out costs $22,000+
> (anyone know of a cheaper yet comparable product).
They don't have to be converted on the fly. If the texts are stable, they can
be converted once and set side-by-side on the server, the EAD SGML as an
archival version which offers better functionality, the HTML as a
lower-function browsing version for those who can't be troubled with Panorama.
This is the approach of a number of leading sites using TEI SGML to support
Humanities applications. (The Humanities Text Initiative at Michigan and the
Victorian Women Writers Project at Indiana are two good examples.)
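To make the convert-once idea concrete, here is a minimal Python sketch of a batch job that writes an HTML copy beside each SGML file and skips files whose HTML copy is already current. The function name, the `.sgm` and `.html` suffixes, and the `convert` callback are all assumptions made for illustration; the actual conversion would be supplied by whatever tool or script you settle on.

```python
from pathlib import Path


def convert_tree(root, convert, source_suffix=".sgm", target_suffix=".html"):
    """Write a converted copy next to every source file under root.

    `convert` is any function taking the source text and returning the
    converted text (hypothetical; supply your own). Files whose converted
    copy is at least as new as the source are skipped.
    """
    written = []
    for src in sorted(Path(root).rglob(f"*{source_suffix}")):
        dst = src.with_suffix(target_suffix)
        # Skip stable texts that were already converted.
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue
        text = src.read_text(encoding="utf-8")
        dst.write_text(convert(text), encoding="utf-8")
        written.append(dst)
    return written
```

Run once after a finding aid is finished (or from a nightly job), this leaves the SGML and the HTML side-by-side on the server with no on-the-fly conversion at all.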
Conversion like this is still not as easy as one would like. A good hacker can
do stuff like this in Perl or, I understand, OmniMark (whatever that is). [Now
we've heard about OCLC's FRED, another promising option.] The strong automated
tools, however, are a bit slow to come -- if any of this can be called "slow".
A standard currently emerging, DSSSL (Document Style Semantics and
Specification Language), will support, in addition to other specifications such
as style design, the conversion of data from one DTD to another. Given DSSSL,
it will be possible to develop "off the shelf" conversion tools.
Archival finding aids, at least of the traditional kind, are a good example of
texts that ought to be very stable. There are many other text applications
which would be more dynamic. One of the main attractions of SGML is that the
stability of the standard itself offers the chance to integrate systems more
fully between stable and dynamic texts and applications.
> The costs for the SGML editor we were made aware of, but raised the
> eyebrows of our administrators. Could any one give me any idea of
> what other hidden costs (outside of staff training) might still be
> out there; advise us of anything else we should be aware of, good or
> bad; generally advise us.
An SGML editor may be a bit pricey now, but it'll be far more versatile than
application-specific programs. Be aware of what you're buying: a way into using
a standard which no software company owns. [As has already been pointed out]
the self-starter who understands systems and encoding can do SGML for free,
using a plain ASCII editor and some freeware tools.
The real investment to be made, however, is in learning. Ask questions like
you're asking. Buy one copy of an editor and a browser, do a pilot project, let
it raise the questions and prompt the answers. Do some comparison testing.
Experience is the best and only instructor. Inform yourself _why_ this encoding
and not that, so you can address the real issues intelligently and as free of
hype as can be managed in these garish, benighted times. Otherwise you have no
hope of spending the money wisely.
> In essence, what is the value to our institution to do all this, when
> we can just as easily make our finding aids available as preformatted
> html text. In either case they are about equally as searchable
> without the added expense of a search engine which works across
It'll be far less expensive in the longer term to use a DTD which will be
stable and which is optimized for your application. With smart scripting, you
should be able to convert the same EAD document into HTML 2.0 this year, HTML
3.2 next year, and HTML 7.65 three years after. Converting information that's
encoded in an inappropriate, non-descriptive tag set is costly, labor-intensive
and boring. ("What's this <H4> doing here, I forget, is it a series or a box?
Is that <I> a title we need to index? *yawn*") It'll be far easier to update a
script to follow accurate descriptive tagging in the source document and churn
out the latest web tags. And we may not even have to do that, for long. The
beta release of HotJava, for example, Sun's experimental web browser, has a
\DTDs subdirectory. Emphasize the plural.
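A rough sketch of what such a script might look like, in Python. The element names (`series`, `unittitle`, `container`) and the two rule tables are illustrative assumptions, not a statement of what the EAD DTD requires, and the sketch assumes a fully tagged source with no omitted end tags. It leans on Python's tolerant `html.parser` module as a stand-in for a real SGML parser, which a production job would want.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical rule tables: descriptive source element -> presentational tag.
# When the web's tag set changes, only a table like this needs updating.
RULES_THIS_YEAR = {"unittitle": "h3", "container": "p"}
RULES_NEXT_YEAR = {"unittitle": "h2", "container": "p"}


class DescriptiveToHtml(HTMLParser):
    """Rewrite descriptive tags using a rule table; drop unmapped tags."""

    def __init__(self, rules):
        super().__init__(convert_charrefs=True)
        self.rules = rules
        self.out = []

    def handle_starttag(self, tag, attrs):
        target = self.rules.get(tag)  # tag names arrive lowercased
        if target:
            self.out.append(f"<{target}>")

    def handle_endtag(self, tag):
        target = self.rules.get(tag)
        if target:
            self.out.append(f"</{target}>")

    def handle_data(self, data):
        # Text content is kept, re-escaped for the HTML output.
        self.out.append(escape(data, quote=False))


def convert(source, rules):
    parser = DescriptiveToHtml(rules)
    parser.feed(source)
    parser.close()
    return "".join(parser.out)
```

The source document never changes; calling `convert(text, RULES_NEXT_YEAR)` instead of `convert(text, RULES_THIS_YEAR)` is the whole migration. Going the other way, from `<H4>` back to "series or box", has no such table.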
Then too, properly descriptive SGML will support printing far better than HTML.
This is another area which isn't quite there yet at the small-to-moderate scale
-- WordPerfect/SGML is still something of a tough go; Corel Ventura's promised
SGML support hasn't materialized yet.
In general, this is because while the concept of SGML is sound, actually
implementing integration between native electronic media and older media such
as print, or hybrid media such as word processors, is tougher than we guess at
the outset. The World Wide Web has grown explosively because it could bypass
this integration. Nevertheless, the trends are clear.
Really, it's not an either/or thing. What we need to be thinking about is not
"HTML" or "EAD" but _encoding_: understanding the way these things work, what we
are actually getting for our investments and what the implications are, down
the line, of our encoding decisions today.
It's very much analogous to buying a computer: hold off, and you're sure to get
more bang for your buck. Early implementors have the opportunity to shape the
future, but you may not want or need to participate in that. Some
institutions will want to be further out front than others, and the decision is
strategic and should account for many local factors and contingencies. If
you're reading this list, you're on the right track.
If you want to start now, however, don't do HTML and leave aside SGML. Don't
just learn about the Model T: learn about internal combustion.
For further references, especially for those new to the SGML scene, let me
suggest my web page "SGML and TEI Resources" at
> Thank you,
> Peter Verheyen
Thanks for raising the issues. They're important.
CETH (Center for Electronic Texts in the Humanities)