You did not mention in what form you gave the finding aids to APEX.  I am
assuming that you gave them paper copies which they scanned, OCRed, and
converted.  If they are going to all that effort, they could surely give you
an electronic version without the EAD encoding, and the cost cannot be much
greater, since the unencoded text is just one of the steps along the way
toward encoding.  I would pursue that route before attempting to strip the
coding out manually at your end.
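
If you do end up stripping the coding at your end, it need not be done by
hand.  Here is a minimal sketch in Python, not a finished tool.  It assumes
the files are plain text with SGML/XML-style tags, that no DOCTYPE
declaration or attribute value contains a ">" character, and that entity
names follow the common HTML/ISO set.  The name strip_markup is my own
invention for illustration; I have not tried this on APEX's output.

```python
import re
from html import unescape

# Comments and declarations such as <!DOCTYPE ...>.
# Assumption: the DOCTYPE has no internal subset containing ">" characters.
NON_CONTENT = re.compile(r"<!--.*?-->|<![^>]*>", re.DOTALL)

# Start and end tags, e.g. <unittitle> or </c01>.
# Assumption: attribute values never contain a ">" character.
TAG = re.compile(r"</?[A-Za-z][^>]*>")


def strip_markup(encoded: str) -> str:
    """Return the text of an encoded finding aid with the tags removed."""
    text = NON_CONTENT.sub("", encoded)
    # Replace each tag with a space so adjacent elements do not run together.
    text = TAG.sub(" ", text)
    # Decode entities such as &amp;; unrecognised entities are left untouched.
    text = unescape(text)
    # Collapse runs of whitespace within each line and drop empty lines.
    lines = (" ".join(line.split()) for line in text.splitlines())
    return "\n".join(line for line in lines if line)


if __name__ == "__main__":
    sample = (
        "<c01><unittitle>Diaries &amp; Letters</unittitle>\n"
        "<unitdate>1900</unitdate></c01>"
    )
    print(strip_markup(sample))
```

Because each tag becomes a space, a tag that sits directly before
punctuation leaves a stray space behind, and all of the structure the
encoding carried (series, boxes, folders) is lost.  That is one more reason
to ask the vendor for the pre-encoding text first.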

On a related note, and for future planning, we are scanning paper-based
finding aids with an HP scanner (cost of $700) and Caere's Omnipage 8.0 (an
OCR program, cost about $400, less if it's an upgrade from software included
with the scanner; we paid only $100), with very good results.  It saves the
file as a Word file and also converts it to HTML.  (Once the file is in
Word, you could save it as an ASCII file as well.)
Last summer we employed a student to do this nearly full time, and she was
able to scan, OCR, and convert nearly 100 pages per day.  This rate varied
according to each finding aid's "page density" but is based on 20 finding
aids totalling over 3500 pages which were completed in 37 working days.  I
mention this because the cost of the student labor is significantly less
than APEX's hourly rate, I suspect, and if you provide them with an
electronic version rather than paper, you may stretch your grant dollars
that much more.  Of course, you need to find a good student who is

Hope this helps.

David Delorenzo wrote:

> Colleagues--
> I need your advice on a problem I have encountered with our project to
> convert and encode our finding aids.
> None of our 3,000+ finding aids are available in electronic form. I have
> received a grant which I hope can kill several birds with one stone. My
> goals for the project are: 1) convert the finding aids into electronic
> form, 2) acquire an electronic text version (ASCII or something else
> that I can manipulate in a word processing software (MS WORD) and an
> HTML writer/editor (Netscape Composer)), and 3) acquire an EAD-encoded
> version.
> We have hired Apex (as we are an RLG member) to convert and encode the
> finding aids. We plan to send the EAD versions to Archival Resources.
> Because I want more flexibility for future uses of the finding aids
> (whatever they later may be given advances in technology), I would like
> to maintain locally a text version (which for now could be manipulated
> using MS WORD). Because Archival Resources is available only
> fee-for-service, and I don't have the technical support necessary to
> maintain SGML documents, I am also planning on maintaining at our WEB
> site an HTML encoded version meeting our specifications for structure,
> etc.
> Here is the problem. Apex has provided me with a first batch of one
> hundred EAD encoded finding aids. I had hoped to be able to use the
> encoded versions in other ways by stripping them of the coding BUT alas,
> as with most grand ideas, I have been unsuccessful! Of course, for more
> money (which I'd like to spend on other issues), I am sure Apex would be
> happy to resolve this matter for me. Before pursuing this option with
> the remaining 2,900 finding aids, however, I wanted to know if there
> was a "de-babble-izer" that I could purchase to magically remove the
> encoding.
> I am happy to pay the vendor for the deliverables I need but I wanted to
> check with you all first! I look forward to hearing from you.
> David
