In addition to the NoteTab software that Stephen Yearl mentions, there is
a Perl utility that strips out markup.
It should also be relatively easy to do with a macro in your word
processor (or even a search and replace that replaces each tag with
nothing). The syntax would vary with the software.
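For anyone who would rather script it, here is a rough sketch of the
search-and-replace-with-nothing approach in Python. The function name
strip_markup and the sample record are made up for illustration. It
assumes no ">" character appears inside an attribute value, and it only
resolves the standard HTML entity names, so any SGML entities declared
in a DTD would need separate handling. It substitutes a space rather
than nothing so that words on either side of a tag do not run together.

```python
import re
from html import unescape

# Matches comments first, then any other tag or declaration.
TAG = re.compile(r"<!--.*?-->|<[^>]*>", re.DOTALL)

def strip_markup(text):
    """Return text with SGML/XML tags removed and whitespace collapsed."""
    no_tags = TAG.sub(" ", text)       # a space keeps adjacent words apart
    no_entities = unescape(no_tags)    # &amp; -> & (HTML entity names only)
    return re.sub(r"\s+", " ", no_entities).strip()

# Hypothetical, heavily simplified EAD fragment.
sample = (
    "<ead><eadheader><titleproper>Smith Family Papers</titleproper>"
    "</eadheader><archdesc><did><unitdate>1850-1900</unitdate></did>"
    "</archdesc></ead>"
)
print(strip_markup(sample))
```

Note that collapsing whitespace this way also throws away line and
paragraph breaks, which may or may not matter for a finding aid.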
But don't throw away those tags if you want to convert the SGML encoding to
HTML. I would strongly second Stephen's suggestion to use XSL. There are
a number of free transformers that use XSL syntax to convert SGML/XML
encoding to another encoding syntax such as HTML. The overhead to set this
up would be a fraction of what it would take to encode it all from
scratch in HTML.
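To give a feel for why keeping the tags pays off, here is a small
sketch of the underlying idea: each source element is mapped to an HTML
element. This is not XSL and not a replacement for a real transformer;
it is plain Python using the standard library, and it assumes the
finding aid is well-formed XML. The element-to-element mapping in
TAG_MAP and the function name to_html are hypothetical choices for
illustration, not anything prescribed by EAD.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from EAD element names to HTML element names.
TAG_MAP = {"titleproper": "h1", "head": "h2", "unittitle": "h3", "p": "p"}

def to_html(node):
    """Copy an element tree, renaming mapped tags; unmapped tags become <div>."""
    out = ET.Element(TAG_MAP.get(node.tag, "div"))
    out.text = node.text
    for child in node:
        converted = to_html(child)
        converted.tail = child.tail   # keep text that follows the child
        out.append(converted)
    return out

# Hypothetical, heavily simplified finding aid.
source = (
    "<ead><titleproper>Smith Family Papers</titleproper>"
    "<scopecontent><head>Scope</head><p>Letters and diaries.</p>"
    "</scopecontent></ead>"
)
html = ET.tostring(to_html(ET.fromstring(source)), encoding="unicode")
print(html)
```

An XSL stylesheet expresses the same kind of mapping declaratively, one
template per element, which is why the setup cost is small next to
re-keying everything in HTML.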
Head of Processing
Minnesota Historical Society
345 Kellogg Blvd West
St. Paul MN 55102-1906
**NOTE NEW AREA CODE EFFECTIVE JULY 12, 1998**
> From: David Delorenzo
> Sent: Wednesday, April 21, 1999 9:38 AM
> To: Multiple recipients of list EAD
> Subject: HTML or ASCII?
> I need your advice on a problem I have encountered with our project to
> convert and encode our finding aids.
> None of our 3,000+ finding aids are available in electronic form. I have
> received a grant which I hope can kill several birds with one stone. My
> goals for the project are: 1) convert the finding aids into electronic
> form, 2) acquire an electronic text version (ASCII or something else
> that I can manipulate in a word processing software (MS WORD) and an
> HTML writer/editor (Netscape Composer)), and 3) acquire an EAD-encoded
> We have hired Apex (as we are an RLG member) to convert and encode the
> finding aids. We plan to send the EAD versions to Archival Resources.
> Because I want more flexibility for future uses of the finding aids
> (whatever they later may be given advances in technology), I would like
> to maintain locally a text version (which for now could be manipulated
> using MS WORD). Because Archival Resources is available only
> fee-for-service, and I don't have the technical support necessary to
> maintain SGML documents, I am also planning on maintaining at our WEB
> site an HTML encoded version meeting our specifications for structure,
> Here is the problem. Apex has provided me with a first batch of one
> hundred EAD encoded finding aids. I had hoped to be able to use the
> encoded versions in other ways by stripping them of the coding but alas,
> as with most grand ideas, I have been unsuccessful! Of course, for more
> money (which I'd like to spend on other issues), I am sure Apex would be
> happy to resolve this matter for me. Before pursuing this option with
> the remaining 2,900 finding aids, however, I wanted to know if there
> was a "de-babble-izer" that I could purchase to magically remove the
> I am happy to pay the vendor for the deliverables I need but I wanted to
> check with you all first! I look forward to hearing from you.
> David de Lorenzo 201 West Monument Street
> Library Director Baltimore, MD 21201-4674
> Maryland Historical Society (410) 685-3750 Ext. 309
> Library of Maryland History FAX: (410) 385-2105