In addition to the NoteTab software to which Stephen Yearl refers, there is
a Perl utility that strips out markup.
Additionally, it should be relatively easy to do with a macro within your
word processor (or even a search and replace with null). The syntax would
vary with the software.
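
As a rough illustration of the "search and replace with null" idea, here is a
minimal Python sketch. It is not the Perl utility mentioned above; the function
name strip_markup and the sample string are made up for the example. It assumes
the tags contain no literal ">" inside attribute values and that any entities
are the standard named ones; SGML-specific entity sets would need extra handling.

```python
import re
from html import unescape

# Anything from "<" up to the next ">" is treated as a tag.
# Assumption: no ">" characters appear inside attribute values.
TAG = re.compile(r"<[^>]+>")

def strip_markup(text):
    """Return text with tags replaced by nothing and entities decoded."""
    without_tags = TAG.sub("", text)   # the "replace with null" step
    return unescape(without_tags)      # e.g. &amp; becomes &

if __name__ == "__main__":
    sample = "<unittitle>Smith <emph>family</emph> papers</unittitle>"
    print(strip_markup(sample))
```

The same regular expression could likely be adapted to a word processor's
search and replace, though the wildcard syntax will differ by product.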
But don't throw away those tags if you want to convert the SGML encoding to
HTML. I would strongly second Stephen's suggestion to use XSL. There are
a number of free transformers that use XSL syntax to convert SGML/XML
encoding to another encoding syntax such as HTML. The overhead to set this
up would be a fraction of what it would take to encode it all from
scratch in HTML.
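
To show why the tags are worth keeping, here is a small Python sketch of the
underlying idea: each source element is mapped to an HTML element, so the
structure carries over instead of being retyped. This is not XSL and not any
particular transformer; it uses Python's standard xml.etree module in place of
a stylesheet. The mapping table, the function names, and the choice of HTML
elements are all assumptions for illustration, and the input must be
well-formed XML rather than SGML.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from EAD element names to HTML element names.
# A real stylesheet would cover far more elements and attributes.
TAG_MAP = {
    "ead": "div",
    "titleproper": "h1",
    "head": "h2",
    "p": "p",
    "emph": "em",
}

def to_html(elem):
    """Recursively copy an element, renaming it according to TAG_MAP."""
    # Elements not listed in TAG_MAP fall back to a generic <div>.
    out = ET.Element(TAG_MAP.get(elem.tag, "div"))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(to_html(child))
    return out

def ead_to_html(xml_text):
    """Parse well-formed XML and return the mapped markup as a string."""
    root = ET.fromstring(xml_text)
    return ET.tostring(to_html(root), encoding="unicode")

if __name__ == "__main__":
    sample = "<ead><titleproper>Smith Papers</titleproper><p>Letters.</p></ead>"
    print(ead_to_html(sample))
```

An XSL stylesheet expresses the same kind of element-to-element rules
declaratively, which is why setting one up once is cheaper than re-encoding
every finding aid by hand.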
Michael
Michael Fox
Head of Processing
Minnesota Historical Society
345 Kellogg Blvd West
St. Paul MN 55102-1906
phone: 651-296-1014
fax: 651-296-9961
[log in to unmask]
**NOTE NEW AREA CODE EFFECTIVE JULY 12, 1998**
> ----------
> From: David Delorenzo[SMTP:[log in to unmask]]
> Sent: Wednesday, April 21, 1999 9:38 AM
> To: Multiple recipients of list EAD
> Subject: HTML or ASCII?
>
> Colleagues--
>
> I need your advice on a problem I have encountered with our project to
> convert and encode our finding aids.
>
> None of our 3,000+ finding aids are available in electronic form. I have
> received a grant which I hope can kill several birds with one stone. My
> goals for the project are: 1) convert the finding aids into electronic
> form, 2) acquire an electronic text version (ASCII or something else
> that I can manipulate in a word processing software (MS WORD) and an
> HTML writer/editor (Netscape Composer)), and 3) acquire an EAD-encoded
> version.
>
> We have hired Apex (as we are an RLG member) to convert and encode the
> finding aids. We plan to send the EAD versions to Archival Resources.
> Because I want more flexibility for future uses of the finding aids
> (whatever they later may be given advances in technology), I would like
> to maintain locally a text version (which for now could be manipulated
> using MS WORD). Because Archival Resources is available only
> fee-for-service, and I don't have the technical support necessary to
> maintain SGML documents, I am also planning on maintaining at our WEB
> site an HTML encoded version meeting our specifications for structure,
> etc.
>
> Here is the problem. Apex has provided me with a first batch of one
> hundred EAD encoded finding aids. I had hoped to be able to use the
> encoded versions in other ways by stripping them of the coding BUT alas,
> as with most grand ideas, I have been unsuccessful! Of course, for more
> money (which I'd like to spend on other issues), I am sure Apex would be
> happy to resolve this matter for me. Before pursuing this option with
> the remaining 2,900 finding aids, however, I wanted to know if there
> was a "de-babble-izer" that I could purchase to magically remove the
> encoding.
>
> I am happy to pay the vendor for the deliverables I need but I wanted to
> check with you all first! I look forward to hearing from you.
>
> David
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> David de Lorenzo 201 West Monument Street
> Library Director Baltimore, MD 21201-4674
> Maryland Historical Society (410) 685-3750 Ext. 309
> Library of Maryland History FAX: (410) 385-2105
> http://www.mdhs.org
>