David Delorenzo wrote:
> if  there
> was a  "de-babble-izer" that I could purchase to magically remove the
> encoding.

You're on Windows, right? If so:

1. Download NoteTab Light from http:\\ (it's free!)
2. Add the following clip to any of the clipbooks you find once the
program is unzipped and running on your system:

H="strip all tags"
^!Replace "<.*>" >> " " RASW
^!IfError End
^!GoTo Start

This may look cryptic now, but after you have NoteTab installed and
have spent a few minutes with it, it will make sense. If you have
problems, reply off-list.

This is the fastest way I know to strip tags on Windows. The <.*> thing
is known as a regular expression and tells NoteTab to match any
character (.) any number of times (*) between, and including, an
opening angle bracket (<) and a closing angle bracket (>).
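
For anyone not on Windows, or who just wants to see what the pattern
does, here is a rough Python sketch of the same replace. It is not the
NoteTab clip, only an illustration; the names strip_tags, GREEDY and
PER_TAG are made up for this example. One thing it shows: in Python's
regex engine, at least, <.*> is greedy and swallows everything from the
first < to the last > on a line, so a pattern that stops at the first >
is usually safer. I have not checked whether NoteTab behaves the same
way.

```python
import re

# The pattern from the clip, as Python's engine reads it (greedy).
GREEDY = re.compile(r"<.*>")
# A per-tag alternative: stop at the first closing angle bracket.
PER_TAG = re.compile(r"<[^>]*>")

def strip_tags(text, pattern=PER_TAG):
    """Replace every match of `pattern` in `text` with a single space."""
    return pattern.sub(" ", text)

sample = "<title>Finding aid</title>"
print(repr(strip_tags(sample, GREEDY)))  # ' '  (the text between the tags is lost)
print(repr(strip_tags(sample)))          # ' Finding aid '
```

If your tags never share a line with text you want to keep, the two
patterns give the same result.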

This regular expression will not work if the tag is split by a
newline (E.g. <!DOCTYPE ead PUBLIC "-//Society of American
Archivists//DTD ead.dtd
(Encoded Archival Description (EAD) Version 1.0)//EN" "ead.dtd"[

]> will not work), but it will work on the *vast* majority of your tags
(on everything but !DOCTYPE, probably).
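
To make the newline problem concrete, here is a small Python sketch
(again, not NoteTab syntax; the sample text and the names LINE_BOUND
and ANY_CHAR are invented for illustration). In Python, "." does not
match a newline by default, which reproduces the failure described
above, while a character class such as [^>] does match newlines and so
copes with a tag split across lines. A DOCTYPE whose internal subset
contains its own > characters would still defeat this simple pattern.

```python
import re

# "." stops at line ends, so a tag split over two lines is missed.
LINE_BOUND = re.compile(r"<.*>")
# "[^>]" matches any character except ">", including newlines.
ANY_CHAR = re.compile(r"<[^>]*>")

# Hypothetical sample: the opening tag is broken across two lines.
doc = '<archdesc\n level="collection">Papers</archdesc>'

print(repr(LINE_BOUND.sub(" ", doc)))  # opening tag survives
print(repr(ANY_CHAR.sub(" ", doc)))    # ' Papers '
```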

If it helps, I am working (slowly) on NoteTab extensions to map EAD
tags to RTF encoding for easy entry into a format that can later be
manipulated in M$ Word (the idea was to use RTF as an intermediary to
PDF). Equally, EAD tags can be mapped to HTML tags in NoteTab using the
same ^!Replace... syntax. But to convert EAD to HTML I would recommend
you consider using XSL.
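
As a rough idea of what tag mapping by search-and-replace amounts to,
here is a Python sketch. The mapping table is purely hypothetical (I
picked two EAD element names and two HTML targets for illustration),
map_tags is an invented name, and attributes on mapped tags are simply
dropped. It is no substitute for a proper XSL transformation.

```python
import re

# Hypothetical EAD-to-HTML mapping, for illustration only.
TAG_MAP = {"unittitle": "h2", "emph": "em"}

# Captures: optional "/", the element name, and the rest of the tag.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)([^>]*)>")

def map_tags(text, tag_map=TAG_MAP):
    """Rename tags listed in `tag_map`; leave all other tags untouched."""
    def swap(match):
        slash, name, _rest = match.groups()
        new_name = tag_map.get(name)
        if new_name is None:
            return match.group(0)      # unknown tag: keep as-is
        return f"<{slash}{new_name}>"  # mapped tag: attributes dropped
    return TAG.sub(swap, text)

sample = '<unittitle>Letters, <emph render="italic">1850</emph></unittitle>'
print(map_tags(sample))  # <h2>Letters, <em>1850</em></h2>
```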

For translations from HTML to RTF, consider Ishtar.
It does not handle tables too well, but it is a good first stop.
From RTF to HTML, you are all set using "save-as-html" in Word.



Stephen Yearl, Project Archivist
 Connecticut Historical Society
      1 Elizabeth Street
      Hartford, CT, 06105