I just finished reviewing the Rostovzeff finding aid that
Berkeley posted on its ftp site last week. Thanks, Daniel (and
Alvin, Campbell, and Gabriela), for giving us a more complex
example to ponder. I mentioned in a separate message to Daniel
that I had some questions about the use of specific tags, and he
encouraged me to post my comments on the list. I don't think
that either he or I anticipated such a long list of comments, but
here it is--the good, the bad, and the trivial.
Mostly I confined my remarks to issues of tag selection and usage
and decided to leave questions relating to <tspec>s, Panorama,
and other formatting issues for later discussion. (Daniel
promises to give us a report on Berkeley's experiments with
different approaches.) I realize that the absence of a tag does
not mean that the Berkeley folks were unaware of its existence.
They are undoubtedly experimenting with levels of tagging in the
same way that I and others at LC are doing. Some of my comments
are definitely "nit-picking," but others are intended to
highlight the options available in various circumstances. I hope
that this is of interest to the list.
Janice E. Ruth
Library of Congress
A. EAD Header
1. I was surprised that you elected not to include more
information in the header. You included only the title
of the finding aid, and you were required to open four
tags in order to capture this one piece of information.
I probably would have included more of the information
from the title page (and from elsewhere in the finding
aid), especially since I would have already opened the
<titlestmt> and <filedesc> elements to include the
<titleproper>. Although the dtd currently requires
only a <titleproper>, it strikes me that we may want to
develop some consensus about the type of information
that everyone will provide in their headers. For
example, I might have included the following for
Rostovzeff (this of course assumes that I have
understood correctly the use of the <eadheader>):
<titleproper> Register of the Papers of
Michael Ivanovitch Rostovzeff </titleproper>
<author> Daniel Daily assisted by Sallie
Locke and Brent Johanson </author>
<publisher> Special Collections Library,
Duke University </publisher>
<address> <addressline> Durham, North
Carolina </addressline> </address>
<date> November 20, 1992 </date>
<creation> Finding aid encoded by [Alvin
Pollack? Campbell Crabtree? Gabriela
Montoya?], University of California,
Berkeley, 1996. </creation>
<langusage> Finding aid is written in
<language> English. </language> The
archival materials in the Rostovzeff
Papers are written in <langmaterial>
English, Latin, French, German, Spanish,
Russian, and Italian. </langmaterial> </langusage>
2. Where do you record the name of the person who encoded
the finding aid and the date when the encoding was
done? Should the November 20, 1992, date be part of
the author statement, and the <publicationstmt> <date>
be reserved for the date of encoding? Or should we
leave as written above, and put details about the date
and name of encoder under <profiledesc> <creation>, or
perhaps under <revisiondesc> <change>?
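To make the question concrete, here is a rough sketch of how the
pieces above might nest if the 1992 date stays in the
<publicationstmt> and the encoder is recorded under <profiledesc>
<creation>. The nesting is only my reading of the dtd, not
something I have validated, and the bracketed encoder name is a
placeholder.

```sgml
<!-- Sketch only: nesting assumed from my reading of the dtd -->
<eadheader>
  <filedesc>
    <titlestmt>
      <titleproper> Register of the Papers of
        Michael Ivanovitch Rostovzeff </titleproper>
      <author> Daniel Daily assisted by Sallie Locke
        and Brent Johanson </author>
    </titlestmt>
    <publicationstmt>
      <publisher> Special Collections Library,
        Duke University </publisher>
      <!-- date the finding aid was completed, not encoded -->
      <date> November 20, 1992 </date>
    </publicationstmt>
  </filedesc>
  <profiledesc>
    <!-- placeholder: name of encoder not confirmed -->
    <creation> Finding aid encoded by [name of encoder],
      University of California, Berkeley, 1996. </creation>
  </profiledesc>
</eadheader>
```

The alternative would be to move the encoding statement out of
<creation> and into a <revisiondesc> <change>, leaving the rest as
shown.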
B. Title Page
1. I am still a little unclear about the role of the title
page since much of the same information is captured in
the <eadheader>. I thought that the purpose of the
<titlepage> element was to facilitate formatting of the
information for printing or other means of display. If
that is correct, don't you need to use linebreaks <lb>
to achieve a desired look?
2. I noticed that you moved information about the author
and date from the last page of the finding aid to the
title page. You retained the headings "Processed by:,"
"Assisted by:," and "Date completed:" by enclosing them
within <label> tags inside a <list>, inside a <p>.
Would it have been possible to achieve the same result
by typing those labels as text contained within
<author> and <date> content tags? For example:
<author> Processed by: Daniel Daily <lb>
Assisted by: Sallie Locke and Brent Johanson </author> <lb>
<date> Date completed: November 20, 1992 </date>
3. I suppose the identification of authorship will vary
from one repository to another, but I wonder whether
Duke would prefer to list individual staff members as
authors and the institution as the publisher.
Regardless of that decision, however, I would only use
one <author> element for a corporate entity, i.e., I
probably would not list, as you did, Special
Collections Library as an <author> and Duke University
also as an <author>. I would type <author> Special
Collections Library, Duke University </author>.
4. What does "©:" mean? It appears just before the
<titlepage> element is closed.
C. Descriptive Identification (<did>)
I just wanted to point out that the use of the label
attribute is optional. The tagged text may appear without
the heading or label preceding it.
D. Administrative Information (<admininfo>)
1. Again, the use of headings is optional. I imagine that
my division might use a head for each major section of
the finding aid, e.g., Administrative Information,
Biographical Note, Scope and Content Note, etc., but
skip the use of headings for paragraphs within those
sections, e.g., Provenance and Copyright under
Administrative Information.
2. Do you really want a period after Administrative
Information in the heading?
E. Biographical Note
1. You used the <chronlist> just as I imagined it would be
used.
2. I noticed that you did not tag published works as
<title>s nor indicate that they should be rendered in
italics. Was this a conscious decision to minimize
tagging, a reflection of the bug in the <emph> element,
or an oversight?
F. Scope and Content Note
1. The word "note" was omitted from the heading.
2. The titles of the series and subseries appeared in
boldface in the original printed version. I presume
that this could have been replicated through the <emph>
element if <emph> accepted pcdata as intended. Also, I
presume that you could put in <ref> tags to link a
section of the Scope and Content Note to the related
section of the Container List.
G. Combined Series Description/Container List
1. If desired, you could have added level attributes: <c01
level=series> and <c02 level=subseries>.
2. I believe that "n.d." should appear within the
<unitdate type="inclusive"> tags.
For example, you typed <unitdate type="inclusive">
1918-1968, </unitdate> <unitdate type="bulk"> n.d.
(bulk 1926-1954). </unitdate>
I think it would be more accurate to type <unitdate
type="inclusive"> 1918-1968, n.d. </unitdate> <unitdate
type="bulk"> (bulk 1926-1954). </unitdate>
3. I dislike having to open a new <p> to use the
<arrangement> element within the <scopecontent>
element. The arrangement statement often consists of
no more than a single short sentence or phrase. As
originally written, the scope notes in the series
descriptions of the Rostovzeff finding aid are one
paragraph each. In the encoded version, they have been
divided into at least two paragraphs so that the
<arrangement> element could be used. The result is
several one-sentence paragraphs. To remedy this, I
guess that we would have to add <arrangement> to the
list of elements that may be used within a <p>.
4. I am not sure I agree with the use of <origination>
within the folder headings for the named individuals in
the Individuals Subseries (Boxes 1-2), but I think I
understand why you chose it. The definition for
<origination> is "the individual or organization
responsible for the creation or assembly of the
materials. . ." Since the Individuals Subseries
includes incoming as well as outgoing letters, the
<origination> element seems inaccurate. Also, I would
think that you would want to use the tag <unittitle>
for consistency with the other folder headings. I
assume that you chose not to use <unittitle> because
the only elements available under <unittitle> are
<unitdate> and the Basic Phrase-Level Elements (cross
reference elements, linking and formatting,
abbreviation, and expansion). In other words, the dtd
currently does not allow the following, which I happen
to think is a more accurate content description:
<drow> <dentry> <unittitle> <persname> Arangio-Ruiz,
Vincenzo, </persname> <unitdate> 1938, Feb.-1953, May
</unitdate> </unittitle> </dentry> </drow>
To remedy this, we may want to make available under
<unittitle> All the Phrase-Level Elements.
5. Container numbers were mistagged <extent> rather than
<unitloc type="container"> on two occasions. See the
entries for "Wilammitz" in Box 2 and "Russian" in Box
6. Did you intend to render the <title>, The Cambridge
Ancient History, in "bolditalics" or did you want
7. I noticed that in the original finding aid, the
container number is recorded only once, even if it
spans several series. In the encoded version, a
container location was given each time a new series was
described. Was this something the dtd required or
simply a formatting preference of the encoder? We have
not yet tried the combined <dsc> in my division, so I
have not run into this situation before.
8. For consistency with the other series, you may want to
add a type attribute (value= "inclusive") to the
<unitdate> in the <did> under "Miscellaneous Series."
9. When would we use the <dogrp> element? I wonder if the
entityref="rostovt3" and entityref="rostovt4" make up a
<dogrp> or just two separate <do>s? (See Pictures
10. Perhaps we need another value under the <unitdate> type
attribute for "undated". Currently, the values are
"inclusive," "bulk," "single," and "questionable." I
don't think that we intended "questionable" to be used
for undated material, but rather for items with a
supplied or questionable date.
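For what it is worth, the two dtd changes suggested in items 3 and
10 above might look something like the following. Neither fragment
is valid under the current dtd. The bracketed text is placeholder
wording rather than text from the Rostovzeff finding aid, and
"undated" is a hypothetical attribute value.

```sgml
<!-- Item 3 (hypothetical): <arrangement> allowed inside a <p>,
     so the scope note can remain a single paragraph -->
<scopecontent>
  <p> [Text of the series scope note.]
    <arrangement> [One-sentence arrangement statement.]
    </arrangement>
  </p>
</scopecontent>

<!-- Item 10 (hypothetical): a separate type value for
     undated material -->
<unitdate type="undated"> n.d. </unitdate>
```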
H. Oversize Materials
I was very interested to see how you handled the oversize
materials. For reasons which are not altogether clear to
me, I and others in my division usually treat oversize
materials as a separate series and list them as such in the
series description and container list. Intellectually this
is a bit of a problem, and I can foresee tagging
complications where a <c01> series in the main container
list becomes a <c02> subseries in the Oversize list because
we tag the <unittitle> "Oversize" as <c01>.
By removing the Oversize Materials from the combined <dsc>
for Rostovzeff, you avoided this problem but still ran into
a tagging problem. You opened a <c01> element but do not
appear to have identified it. It strikes me that the <c01>
tags in the Oversize list should correspond to the <c01>
tags in the combined <dsc> since they are indeed the same
series. Thus, the Oversize list would contain only <c01>s
(Correspondence Series and Miscellaneous Series) and <c02>s
(Individuals Subseries, Biographical Materials Subseries,
Writings Subseries, Financial Papers Subseries, and General
Subseries) since the description does not extend beyond the
subseries headings. There are no <c03>s listed.
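A rough sketch of what I have in mind for the Oversize list
follows. It shows only one series and one subseries to illustrate
the pattern. Placing the Individuals Subseries under the
Correspondence Series is my assumption, as is the use of the level
attributes and of <did> <unittitle> to identify each component.

```sgml
<!-- Sketch only: the <c01> repeats the series from the
     combined <dsc>; the series/subseries pairing is assumed -->
<c01 level="series">
  <did>
    <unittitle> Correspondence Series </unittitle>
  </did>
  <c02 level="subseries">
    <did>
      <unittitle> Individuals Subseries </unittitle>
    </did>
  </c02>
</c01>
```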
That's all folks!