The Minnesota Historical Society is facing the same situation that David de
Lorenzo described in his post on this topic but is taking a different tack,
which might be informative.
APEX will be converting between 3,000 and 4,000 pages of finding aids for us
this year. We have been very happy to date with the accuracy of their
double-entry keying. It lets us get a lot of conversion done that we could
not manage in-house, especially for older, shall we say messier-appearing,
inventories. Because we have a lot of inventories (estimates are between
15,000 and 20,000), we need to pay a lot of attention to the cost overhead
of such an operation.
And again, like David, we see the need for three versions: the EAD master
version that we receive from APEX, an HTML version for our Web customers,
and a nicely printed version for our in-house patrons.
We are treating the EAD-encoded version of inventories as the master file
for all the reasons that Daniel Pitti described in an earlier message:
portability, platform and software neutrality, and reusability. In a
phrase that I think Bill Landis coined, we are treating our finding aids as
data and not as text. We have chosen to create and maintain our files in
The creation of a derivative html version is extremely simple using the
transformation capabilities of XSL stylesheets. The processing assistant
copies the xml file into the proper directory, activates the xsl processor
(we use James Clark's free XT program), types in the source file name, the
stylesheet file name, and the output file name, and presses the return key.
Once the processor has done its thing, the output file is mailed to our
Webmaster for posting to the Web site. The whole process takes no more than
2-3 minutes per collection. We think that we produce a fairly sophisticated
Web output, one that integrates completely into the look and feel of the
Society's overall Web site design. For an example see,
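The three inputs the processing assistant types in (source file, stylesheet,
output file) can be sketched as a small script. This is only an illustration,
not our actual procedure: it assumes XT is run through its Java command-line
driver class, com.jclark.xsl.sax.Driver, with the XT classes already on the
Java classpath, and the file and directory names in the usage note are
hypothetical.

```python
from pathlib import Path
import subprocess

# Assumption: XT is invoked through its Java command-line driver class
# and the XT classes are already on the Java classpath.
XT_DRIVER = "com.jclark.xsl.sax.Driver"


def xt_command(source, stylesheet, output):
    """Build the command line: source file, stylesheet, output file."""
    return ["java", XT_DRIVER, str(source), str(stylesheet), str(output)]


def html_name(source, out_dir):
    """Derive the output file name from the EAD file name (hypothetical rule)."""
    return Path(out_dir) / (Path(source).stem + ".html")


def convert(source, stylesheet, out_dir, run=False):
    """Return the XT command for one finding aid; run it only if asked."""
    cmd = xt_command(source, stylesheet, html_name(source, out_dir))
    if run:
        # Raises CalledProcessError if the processor reports a failure.
        subprocess.run(cmd, check=True)
    return cmd
```

Calling convert("ead/00123.xml", "mhs.xsl", "html") only builds the command
list; passing run=True would actually start the processor. Looping over a
directory of EAD files with the same function would turn the per-collection
step into a batch job.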
The final step is to produce nicely printed copies for our reading room. At
this very moment we are experimenting with two options. One is to simply
print out the html version. With a browser like IE 5.0 that supports CSS2,
we can even get page breaks where we want in the document. However,
because we want to add running headers and footers to the document, we
probably will want to use MS Word. Word 97 and Word 2000 both can readily
import an HTML document (like the one generated in our previous step)
and convert it into Word format. No need to strip out tags and reformat
anything. We use lots of tables in our HTML output to format the text.
These are converted directly into Word tables and retain the same
formatting, useful for container lists that have a highly indented
structure. It's simple: just go to the menu bar in Word, open the HTML file,
and wait 10 seconds while Word does the work. Then it's a matter of adding
headers and footers, appropriate page breaks, and other light editing to get
the file looking the way you want. I suspect that the same functionality
will be available in WordPerfect 9 when it comes out later this month for
those of you of that persuasion.
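For the first option, printing the HTML version directly, the CSS2 page
breaks mentioned above could be added to a generated file along these lines.
This is a rough sketch under stated assumptions: page-break-before is a
standard CSS2 property, but the choice of h2 as the element that starts a new
page is hypothetical, since the markup our stylesheet actually produces is
not shown here.

```python
# Assumption: each major section of the finding aid starts with an <h2>.
# The selector is hypothetical; page-break-before is standard CSS2.
PRINT_CSS = (
    '<style type="text/css" media="print">\n'
    "h2 { page-break-before: always; }\n"
    "</style>\n"
)


def add_print_breaks(html, css=PRINT_CSS):
    """Insert a print-only style block just before the closing head tag."""
    pos = html.lower().find("</head>")
    if pos == -1:
        # No head element found; return the page unchanged.
        return html
    return html[:pos] + css + html[pos:]
```

Because the style block is limited to print media, the on-screen Web version
is unaffected. In practice the same rule could simply be written into the XSL
stylesheet so that every generated file carries it.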
The overhead on each of the latter two steps is minimal, which keeps them
cost-effective. They are possible, in fact, because EAD is about structure and
not content and is therefore reusable for other purposes in a way that
word-processing, HTML, and PDF files are not.
Head of Processing
Minnesota Historical Society
345 Kellogg Blvd West
St. Paul MN 55102-1906
**NOTE NEW AREA CODE EFFECTIVE JULY 12, 1998**