Michele R.,

1. We have a finding aid manual for data collection methods, display,
and standards of bibliographic description. There is also an index of
example pages from finding aids.

2. Data collection is usually done in word processors (WordPerfect).
However, there are some collections that are done in Access
databases. Special characters (multi-byte) are chosen from the
character-map utility. Arial Unicode MS is used as the font for mdb
files. Times New Roman is used for WordPerfect. If MS Word is used,
Arial is the font.

3. Each finding aid goes through a review by experienced
archivists in order to comply with the standards (see 1).

4. Coding can be done only by an experienced finding aid creator and
collection processor, since only such a person can interpret such a
document. I code the finding aids. I have over 10 years of experience
with archival processing and I am one of the co-editors of the finding
aid manual.
I also process the word processor documents into their final form.
The hard copy is now a PDF (created using XSL-FO from the EAD 2002
XML document). I write the stylesheets and have created the workflow
for the finding aids.

5. I have developed macros that tag the finding aid container list in
an XML tag set I developed based on MARC 21 standards. This method is
quick and efficient. The finished container list is then converted
using Unicode tools into UTF-8 (0020 to 007E). Other steps are taken
for cross references, etc. The MARC record is brought into the
workflow and constitutes the front matter, excepting any finding aid
information (scopecontent, bioghist, etc.). A stylesheet then renders
the XML into the current EAD tag set. Once in EAD 2002 XML there is
one stylesheet for HTML and one for XSL-FO (PDF). These stylesheets
are based on the Library of Congress best practices. I have created
them to work with 99% of the valid EAD 2002 XML finding aids.
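
One reading of the "UTF-8 (0020 to 007E)" step is that every character
outside printable ASCII ends up as a numeric character reference, so
the file stays inside that range while still being valid UTF-8. Here is
a minimal sketch of that idea in Python. It is not the actual tool used
in this workflow, and the function name and sample string are made up
for illustration.

```python
def to_ascii_safe_xml_text(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference.

    Assumption: the goal is output restricted to ASCII, with everything
    else written as &#NNN; so an XML parser restores the original text.
    """
    # The "xmlcharrefreplace" error handler turns U+00E9 into "&#233;".
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


sample = "Dvořák, Antonín. Études"
print(to_ascii_safe_xml_text(sample))
# prints: Dvo&#345;&#225;k, Anton&#237;n. &#201;tudes
```

Note that this only handles non-ASCII characters. Markup characters
such as & and < still need normal XML escaping, and control characters
below 0020 would need separate treatment.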

I have created a demo of this process (though a bit dated):
  Maximize browser window
  Paste in URL
  Temporary window opens for 20 seconds
  Click on "Next Slide" bottom right.
  Refresh to begin again

6. Critique of data collection:

a) My preference is that multi-format material be described in a word
processor. The word processor I would choose (because of budget,
availability and experience of data collectors) is Notepad. If you
have an updated Windows operating system (2000, XP, etc.), then this
will work. You should choose Arial Unicode MS as the font, and use the
character-map utility for multi-byte characters. Everything else is
the same. Sure, you don't get the fancy formatting options, but data
collection does not really need them.

b) Databases. I think databases can do "almost" anything. Your
database is usually only as good as your support staff has designed
it to be. I think that most offices have only workstation database
software. These programs are really toys. They are very unstable and
can be easily corrupted. (Are they on general use PCs?) Working over
a network can really screw up your database unless you are configured
correctly. Such configuration takes time and support and mostly
money. Thus, we have only workstation databases.

c) Learning curve. Learning bibliographic description according to
AACR2 and SAA is difficult enough, but then learning to use the
database as well? Strictly speaking, data collection is easily
relegated to word processors. Text is easily entered, edited, printed,
and viewed, and the format (tabs, bold, italic, etc.) is more
universally understood. Databases usually result in a mess, unless you
have staff to QC all the button pushing and other user errors (not to
mention the nasty auto grammar issues: auto caps, replacing ellipses
with special characters, etc. You fixed those... Whoops, they updated
the software...). Databases take a lot of time to set up and configure
(who has admin rights?), and more time to correct and clean up.
Further, there are many parts of a finding aid that simply cannot be
carried into any reasonable practice in a database (scopecontent,
etc.). (Finding aids are usually hierarchical, databases flat.)
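
To illustrate the hierarchy point, here is a rough Python sketch of how
a tab-indented plain text outline could be turned into nested EAD 2002
components (c01, c02, ...). This is not the macro set described in 5.
The function name and the sample outline are hypothetical, and a real
container list would also carry containers, dates, and so on.

```python
import xml.etree.ElementTree as ET


def outline_to_dsc(lines):
    """Turn tab-indented titles into nested <c01>, <c02>, ... components."""
    dsc = ET.Element("dsc")
    # stack[d] is the parent element for a component at depth d.
    stack = [dsc]
    for raw in lines:
        if not raw.strip():
            continue  # ignore blank lines
        depth = len(raw) - len(raw.lstrip("\t"))  # number of leading tabs
        if depth > len(stack) - 1:
            raise ValueError(f"indentation skips a level: {raw!r}")
        if depth >= 12:
            raise ValueError("EAD 2002 numbered components stop at c12")
        del stack[depth + 1:]  # forget deeper levels from earlier lines
        comp = ET.SubElement(stack[depth], f"c{depth + 1:02d}")
        did = ET.SubElement(comp, "did")
        ET.SubElement(did, "unittitle").text = raw.strip()
        stack.append(comp)
    return dsc


outline = [
    "Correspondence",
    "\tFamily letters",
    "\tBusiness letters",
    "Scores",
]
dsc = outline_to_dsc(outline)
print(ET.tostring(dsc, encoding="unicode"))
```

The tabs a data collector already types carry the hierarchy, which is
exactly the information a flat table tends to lose.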

d) Coding. I would completely separate coding from data collection.
The coder should be coding datatypes using a standard (MARC 21). The
coder should clearly understand the elements of the EAD tag set as
they relate to these datatypes. The coder should attempt to establish
consistent methods in approaching difficult if not impossible
conversion issues (double titles when uniform titles are given; use
of attributes PERSNAME/@NORMAL, etc.). Mostly, the coder needs to be
aware of the stylesheet that will transform the EAD into some display
markup in order to get the most out of the coding methods developed.
This is not a weekend experience for anyone, even experienced
archivists. A best practices document is really needed. The Library of
Congress best practices document is awesome.
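
As a small illustration of coding datatypes against MARC 21, the sketch
below maps a few MARC fields to EAD 2002 elements and records the source
field in the encodinganalog attribute. The three pairings (245, 520,
545) follow my understanding of the common MARC/EAD crosswalk and should
be checked against your own best practices. The function names and
sample values are hypothetical, not part of any existing tool.

```python
import xml.etree.ElementTree as ET

# MARC 21 field -> (EAD element, child element that holds the text or None).
# Assumed pairings; verify against the crosswalk your institution follows.
CROSSWALK = {
    "245": ("unittitle", None),
    "520": ("scopecontent", "p"),
    "545": ("bioghist", "p"),
}


def marc_field_to_ead(tag, value):
    """Build one EAD element for a MARC field, keeping the field number."""
    name, wrapper = CROSSWALK[tag]
    elem = ET.Element(name, {"encodinganalog": tag})
    target = ET.SubElement(elem, wrapper) if wrapper else elem
    target.text = value
    return elem


def persname(display, normal):
    """Build <persname>, with the authorized form in the normal attribute."""
    elem = ET.Element("persname", {"normal": normal, "encodinganalog": "100"})
    elem.text = display
    return elem


# Made-up sample data, for illustration only.
for element in (
    marc_field_to_ead("245", "Jane Example papers"),
    marc_field_to_ead("520", "Letters and scores."),
    persname("Jane Example", "Example, Jane"),
):
    print(ET.tostring(element, encoding="unicode"))
```

Keeping the mapping in one table is the point: the coder decides once
how a datatype is coded, and the stylesheet can rely on that decision.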

e) Archiving documents. This document workflow requires a backup
system (separate originals from working copies). My goal was to
create a single master document for each finding aid. I have achieved
that in my workflow now that I can produce PDFs from the EAD. These
documents will need revision, correction, and sometimes additions.
How will this be done, and by whom? Who will hold the archived
electronic documents, and where will they be stored? I regularly burn
CD-ROMs of these finding aids (XML). My experience is that most
institutions don't have a backup system at all. If there is one, it
is hardly comprehensible to anyone outside the immediate backup
person. Finally, if that "person" is "not here today", there is no
paper trail, nor any second person that can produce any archive
copies for uploading (in case of virus, etc.).
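
For what it is worth, separating originals from working copies does not
need much machinery. The following Python sketch copies every XML file
from a working folder into a dated archive folder and writes a checksum
manifest, so a second person can later confirm the copies are intact.
It is only an illustration under my own assumptions: the folder layout,
the MANIFEST.txt name, and the function name are all hypothetical.

```python
import hashlib
import shutil
import tempfile
from datetime import date
from pathlib import Path


def archive_masters(working_dir, archive_root, stamp=None):
    """Copy every *.xml in working_dir into a dated folder and log checksums."""
    stamp = stamp or date.today().isoformat()
    dest = Path(archive_root) / stamp
    dest.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for src in sorted(Path(working_dir).glob("*.xml")):
        copied = Path(shutil.copy2(src, dest / src.name))
        # Hash the copy, not the source, so the manifest describes the archive.
        digest = hashlib.sha256(copied.read_bytes()).hexdigest()
        manifest_lines.append(f"{digest}  {src.name}")
    manifest = dest / "MANIFEST.txt"
    manifest.write_text("\n".join(manifest_lines) + "\n", encoding="utf-8")
    return dest


# Demo in a throwaway directory so no real files are touched.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "working"
    work.mkdir()
    (work / "example.xml").write_text("<ead/>", encoding="utf-8")
    out = archive_masters(work, Path(tmp) / "archive", stamp="2004-10-21")
    manifest_text = (out / "MANIFEST.txt").read_text(encoding="utf-8")
    print(manifest_text)
```

The manifest doubles as the paper trail: anyone can see what was
archived on which date without asking the one person who ran the backup.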

These are reflections on my experience as a coder for the past 3-4
years.

Again, I profited greatly from Daniel Pitti's class.

Mike Ferrando
Library Technician
Music Division
Library of Congress
Washington, DC

--- MicheleR wrote:
> Hello -- another question, this one regarding process of encoding
> finding aids.  I'm interested in (a) who does the encoding and (b) at
> what point in the process the encoding is done.  For example, I can
> see at least three options right off the bat:
> a) finding aids are originally written in EAD using authoring
> software (e.g. XMeTaL)
> b) finding aids are written in regular form (MSWord, etc) and then
> encoded at the end by the processor
> c) finding aids are written in regular form (MSWord, etc) and then
> encoded at the end by a dedicated encoder (either in-house or
> out-sourced)
> How do most people do it?  What pitfalls have you encountered with
> the way you chose?
> Any and all information is appreciated.  We're investigating starting
> this process and would like to benefit from combined wisdom as much as
> possible!
> Thanks
> Michele Rothenberger