> Both Word 2003 and Open Office <http://www.openoffice.org> will enable you
> to save your file as XML. So, if your Word document is well structured you
> might not be too far off your goal. Getting to XML is the easy part; getting
> to your particular flavour of XML, i.e. EAD, may be a little harder, again
> depending on how the *.doc file is structured. The good news, though, is
> that you're 'just' an XSLT transform away.

Regarding file size, we have EAD instance documents as large as 4MB, and
dynamic application of a stylesheet absolutely breaks our application
(xsltproc is the XSLT engine). For any instance over 2MB, therefore, we
redirect to static HTML. A 4MB instance results in ca. 2.5MB of HTML*, but
with non-significant whitespace removed this can come in at under 2MB (still
really too large for those with dialup connexions).

* The HTML is rather verbose, however, peppered as it is with
  <div class="c01"> &c.

St.

Stephen Yearl
Systems Archivist
Yale University Library::Manuscripts and Archives

>>> [log in to unmask] 03/13/06 2:59 PM >>>
> I have a very large guide in Word format which I need to convert to EAD
> XML. I am especially concerned about the Series section as we have over
> 800 boxes to tag.
>
> I am unable to find any way to do this (other than cutting and pasting
> from Word to an XML editor) except for using the text conversion software
> shown on the EAD site http://www.loc.gov/ead/ag/agauthor.html#sec2c
>
> Is this kind of software the only answer or is there another clever way?
> If commercial software is the solution, what products have been used and
> recommended?
>
> Also, is there a recommendation for the length/size of an EAD guide? I have
> some concern about download time for users.
>
> Thanks in advance for help.
>
> M.J. Figard
> Digital Initiatives Librarian
> McGovern Historical Collections and Research Center
> Houston Academy of Medicine - Texas Medical Center Library
> 1133 John Freeman Blvd.
> Houston, Texas 77030
> 713.799.7141 fax 713.790.7052
> NOTE new email address: [log in to unmask]
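
For anyone wanting to try the size cut-off and whitespace trimming mentioned in the reply above, here is a rough Python sketch. It is not the Yale setup: the function names are made up for illustration, "2MB" is assumed to mean 2 * 1024 * 1024 bytes, and the whitespace step only collapses runs of whitespace to a single space, which is a conservative stand-in for removing non-significant whitespace and assumes the HTML has no <pre> or similar whitespace-sensitive content.

```python
import re

# Assumption: "2MB" in the message is read as 2 * 1024 * 1024 bytes.
STATIC_THRESHOLD_BYTES = 2 * 1024 * 1024


def choose_rendering(instance_size_bytes: int) -> str:
    """Return 'static' for EAD instances over the threshold, else 'dynamic'."""
    if instance_size_bytes > STATIC_THRESHOLD_BYTES:
        return "static"
    return "dynamic"


def collapse_whitespace(html: str) -> str:
    """Collapse every run of whitespace to one space and trim the ends.

    Hypothetical helper: safe only when no <pre>, <textarea>, or inline
    script/style content depends on exact whitespace.
    """
    return re.sub(r"\s+", " ", html).strip()
```

Collapsing rather than deleting whitespace keeps the space between adjacent inline elements, so the rendered text should not change; the saving comes mostly from indentation and line breaks in deeply nested <div class="c01"> markup.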