>
> Both Word 2003 and Open Office <http://www.openoffice.org> will enable you
> to save your file as XML. So, if your Word document is well structured you
> might not be too far off your goal. Getting to XML is the easy part; getting
> to your particular flavour of XML, i.e. EAD, may be a little harder, again
> depending on how the *.doc file is structured. The good news, though, is
> that you're 'just' an XSLT transform away.
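
To give a feel for what "one transform away" means, here is a minimal
sketch in Python rather than XSLT, using only the standard library. It is
not anyone's production workflow. It assumes the guide was saved as Word
2003 XML, that each series title carries the paragraph style "Heading1",
and that every other paragraph is a file-level entry; the function names
(paragraphs, to_dsc) and the series/file mapping are made up for
illustration, and the output is an EAD-like fragment, not a validated
finding aid.

```python
import xml.etree.ElementTree as ET

# Namespace used by Word 2003 "Save as XML" (w:p = paragraph, w:t = text run).
W = "http://schemas.microsoft.com/office/word/2003/wordml"
NS = {"w": W}


def paragraphs(wordml: str):
    """Yield (style, text) for every non-empty paragraph in the Word XML."""
    root = ET.fromstring(wordml)
    for p in root.iter(f"{{{W}}}p"):
        style_el = p.find("w:pPr/w:pStyle", NS)
        style = style_el.get(f"{{{W}}}val") if style_el is not None else None
        text = "".join(t.text or "" for t in p.iter(f"{{{W}}}t")).strip()
        if text:
            yield style, text


def to_dsc(wordml: str, series_style: str = "Heading1") -> ET.Element:
    """Build a <dsc> fragment: styled headings become <c01>, the rest <c02>.

    Assumption: paragraphs that appear before the first series heading are
    front matter and are skipped.
    """
    dsc = ET.Element("dsc")
    current = None
    for style, text in paragraphs(wordml):
        if style == series_style:
            current = ET.SubElement(dsc, "c01", level="series")
            did = ET.SubElement(current, "did")
            ET.SubElement(did, "unittitle").text = text
        elif current is not None:
            c02 = ET.SubElement(current, "c02", level="file")
            did = ET.SubElement(c02, "did")
            ET.SubElement(did, "unittitle").text = text
    return dsc


if __name__ == "__main__":
    sample = (
        f'<w:wordDocument xmlns:w="{W}"><w:body>'
        '<w:p><w:pPr><w:pStyle w:val="Heading1"/></w:pPr>'
        "<w:r><w:t>Series I: Correspondence</w:t></w:r></w:p>"
        "<w:p><w:r><w:t>Box 1, Folder 1</w:t></w:r></w:p>"
        "</w:body></w:wordDocument>"
    )
    print(ET.tostring(to_dsc(sample), encoding="unicode"))
```

How far this gets you depends, as said above, on how consistently styles
were applied in the *.doc file; box and folder numbers typed as free text
would still need their own parsing.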


Regarding file size, we have EAD instance documents as large as 4MB, and
dynamic application of a stylesheet absolutely breaks our application
(xsltproc is the XSLT engine): for any instance over 2MB, therefore, we
redirect to static HTML. A 4MB instance results in ca 2.5MB of HTML*, but
with non-significant whitespace removed this can come in at under 2MB (still
really too large for those with dialup connexions).

* the HTML is rather verbose, however, peppered as it is with <div
class="c01"> &c.
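
The size rule and the whitespace trimming described above can be sketched
roughly as follows. This is an illustration in Python, not our actual
application code: the names (STATIC_THRESHOLD, delivery_mode,
strip_insignificant_whitespace) are invented, the threshold is read as
2 x 1024 x 1024 bytes, and the regular expression only removes
indentation and line breaks between tags, so it would need more care
around <pre> blocks.

```python
import re

# Assumed reading of the "2MB" cut-off: instances larger than this are
# served as pre-generated static HTML rather than transformed on request.
STATIC_THRESHOLD = 2 * 1024 * 1024


def delivery_mode(instance_size_bytes: int) -> str:
    """Return "static" for instances over the threshold, else "dynamic"."""
    return "static" if instance_size_bytes > STATIC_THRESHOLD else "dynamic"


def strip_insignificant_whitespace(html: str) -> str:
    """Remove whitespace runs between tags when the run contains a newline.

    A single space between inline elements (e.g. "</b> <i>") is kept,
    because removing it would change how the text renders.
    """
    return re.sub(r">\s*\n\s*<", "><", html)


if __name__ == "__main__":
    page = '<div class="c01">\n    <p>Box 1</p>\n</div>'
    print(delivery_mode(4 * 1024 * 1024))
    print(strip_insignificant_whitespace(page))
```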

St.

Stephen Yearl
Systems Archivist
Yale University Library::Manuscripts and Archives




>>> [log in to unmask] 03/13/06 2:59 PM >>>
> I have a very large guide in Word format which I need to convert to EAD
> XML. I am especially concerned about the Series section as we have over
> 800 boxes to tag.
>
> I am unable to find any way to do this (other than cutting and pasting
> from Word to an XML editor) except for using Text conversion software
> shown on the EAD site http://www.loc.gov/ead/ag/agauthor.html#sec2c
>
> Is this kind of software the only answer or is there another clever
> way? If commercial software is the solution, what products have been
> used and recommended?
>
> Also, is there a recommendation for the length/size of an EAD guide? I
> have some concern about download time for users.
>
> Thanks in advance for help.
>
> M.J. Figard
> Digital Initiatives Librarian
> McGovern Historical Collections and Research Center
> Houston Academy of Medicine - Texas Medical Center Library
> 1133 John Freeman Blvd.
> Houston, Texas  77030
> 713.799.7141 fax 713.790.7052
> NOTE new email address: [log in to unmask]