This is a delayed response to Michael Fox's post of March 10 (appended below). After a lot of thought and discussion about the <physloc> and <container> issue with the other members of Harvard's Digital Finding Aids Project (DFAP), I offer the following feedback and a new proposal.

The first point is about equating box and folder numbers with information about the contents of a file, such as date or form/genre. In the examples of content that Michael describes, he compares:

Correspondence, 1900-1910

with:

Box 1    1    Adams

We would argue that while in some ways these are similar types of information, in a fundamental way they are not. Namely, the two parts of "Correspondence, 1900-1910" will always be linked. "Correspondence" will never become "flyers," for example. On the other hand, "Box 1" is merely a description of where certain materials happen to be housed at a specific moment in time. If we rebox the collection, "Adams" will be separated from "Box 1" and become associated with "Box 2" instead.

Secondly, DFAP participants discussed both of the options Michael proposes for markup. The second option, that of adding the box number to every single file description in the finding aid, is exactly the type of extra keystroking that we have decided not to employ. Especially in light of the firestorm about how "EAD is forcing us to do a lot of extra keying in creating our finding aids," we would like to avoid that option.

The first option (including <container> within a file's <did>) doesn't seem quite right, either.
From the tagging Michael posted, we would surmise that Box 1 contains only one file:

<c LEVEL="file"><did><container>Box 1</container> <unitid>1.</unitid><unittitle>[Adams, 1934]</unittitle></did></c>
<c LEVEL="file"><did><unitid>2.</unitid><unittitle>Albany Literary Gazette [1934]</unittitle></did></c>

The tag library for the Beta version of EAD says that <did> "bundles eight elements identifying fundamental descriptive information needed to identify the ...<c> being described...." Since the box is not a *component of* the unit being described (it is a container that houses it), it seems to me that the use of <container> within <did> here is not appropriate (although admittedly it will validate). Isn't it misleading to place container information within a <c> that describes one file, when the container contains many files?

Finally, we do not think the discussion on this topic so far has addressed all of DFAP's concerns about the issue. Michael says that "In having to choose between the two, EAD has privileged intellectual structure over the physical ...." While we agree with the basic decision, we wonder whether there might be a way to describe the physical that doesn't conflict with the intellectual, rather than having to "choose" between the two.

We have looked into the possibility of using the <odd> tag for the purpose of inserting information about containers wherever it happens to fall in the finding aid. Currently the DTD requires that <odd> *must* follow the <did> and precede any nested <c>s. Perhaps altering the DTD slightly to allow the use of <odd> interspersed with the various levels of <c>s would solve the problem.
It would allow, for example, the following:

<DSC><HEAD>INVENTORY</HEAD>
<C01 level="collection"><DID></DID>
  <ODD><P>Box 1</P></ODD>
  <C02><DID><UNITID>1.</UNITID><UNITTITLE>Things</UNITTITLE></DID></C02>
  <C02><DID><UNITID>2-4:</UNITID><UNITTITLE>Contents of wooden box</UNITTITLE></DID>
    <C03><DID><UNITID>2.</UNITID><UNITTITLE>Stuff</UNITTITLE></DID></C03>
    <ODD><P>Box 2</P></ODD>
    <C03><DID><UNITID>3.</UNITID><UNITTITLE>Tiny stuff</UNITTITLE></DID></C03>
    <C03><DID><UNITID>4.</UNITID><UNITTITLE>Love note</UNITTITLE></DID></C03></C02>
  <ODD><P>Box 3</P></ODD>
  <C02><DID><UNITID>5.</UNITID><UNITTITLE>Stuff</UNITTITLE></DID></C02>
</C01></DSC>

In this example, we have a variety of <c>s, some nested, but all are in the <c01> which represents the entire group. The current DTD will validate only Box 1 (because it immediately follows a <did>), but not the others.

[If you find the above example difficult to follow, I suggest you take a look at one of our finding aids (available in either HTML or SGML), such as that for Helen Buttenwieser, encoded in the Beta version of the DTD and available from http://findingaids.harvard.edu (look under Schlesinger Library).]

We propose the following change to the DTD to allow for the above tagging.

Change from:

<!ELEMENT c ((head?, did, (%m.desc.elems;)*, (thead?, c+)*) | (drow+, c*))>

To:

<!ELEMENT c ((head?, did, (%m.desc.elems;)*, (thead?, c+, odd?)*) | (drow+, c*))>

This change would make it relatively easy to include in the EAD document the information one needs about physical location (information that is not intellectually wedded to descriptive data), without compromising the integrity of the DTD (in light of the above-mentioned decision about intellectual vs. physical).

We hope there will be some discussion on this list about our suggestion; we do expect to submit it formally, and we value the input of any and all EAD users out there.
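
As a rough illustration of what the proposed change would and would not permit, the following Python sketch treats each content model as a regular expression over the names of a component's children. It is a simplification under stated assumptions, not a substitute for an SGML parser: %m.desc.elems; is reduced to <odd> alone, the (drow+, c*) branch is left out, numbered components are all written as "c", and the names CURRENT_MODEL, PROPOSED_MODEL and allowed are invented for this example.

```python
import re

# Simplified stand-ins for the two <c> content models quoted above.
# Assumptions: %m.desc.elems; is reduced to "odd" only, and the
# (drow+, c*) alternative is omitted.
CURRENT_MODEL = re.compile(r"(head )?did (odd )*((thead )?(c )+)*")
PROPOSED_MODEL = re.compile(r"(head )?did (odd )*((thead )?(c )+(odd )?)*")


def allowed(model, children):
    """Return True if the sequence of child element names fits the model."""
    # Each name is followed by a space so that whole names are matched.
    text = "".join(name + " " for name in children)
    return model.fullmatch(text) is not None


# Children of the outer component in the example:
# <did>, Box 1, two components, Box 3, one more component.
outer = ["did", "odd", "c", "c", "odd", "c"]

# Children of the "wooden box" component: Box 2 falls between its files.
wooden_box = ["did", "c", "odd", "c", "c"]

# A layout with only the leading Box 1 note, which both models accept.
box_1_only = ["did", "odd", "c", "c"]

for label, children in [("outer", outer),
                        ("wooden box", wooden_box),
                        ("Box 1 only", box_1_only)]:
    print(label,
          "current:", allowed(CURRENT_MODEL, children),
          "proposed:", allowed(PROPOSED_MODEL, children))
```

Under these assumptions the current model rejects both the outer component and the "wooden box" component, because an <odd> appears after a nested component, while the proposed model accepts them. The layout with only the leading Box 1 note passes under either model.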

Susan von Salis
Schlesinger Library
Radcliffe College

=============================
Original post:

Leslie Morris asks a very important question: how to encode the following example.

Example A:

> File List
> Box 1
>    1. [Adams, 1934]
>    2. Albany Literary Gazette [1934]
>    3. Alden
>    4. American Council
> Box 2
>    5. American and Foreign Anti-Slavery Reporter [1934]
>    6. Amesbury Villager [1934-36]

This issue is extremely important because it goes directly to a fundamental structural concept in EAD. There is an inherent tension in container listings between hierarchies of intellectual order (collection, series, file, item) and hierarchies of physical organization (boxes and folders). This topic was extensively analyzed during the development of EAD, has been the topic of numerous communications on this list, is raised at every EAD workshop, and, I hasten to convey to Leslie, was carefully reconsidered by the EAD Working Group during its meeting last fall when changes to the DTD for version one were considered.

In having to choose between the two, EAD has privileged intellectual structure over the physical for many good reasons that need not be rehashed here. But that is not to suggest that there is no relationship between the two. Box and folder numbers are, after all, characteristics of a particular file just as the title and date are.

Harvard's desire to be able to insert container numbers AT ANY POINT WITHIN THE FINDING AID suggests that this data is just some sort of free-floating, disembodied information that has no structural relationship to the rest of the inventory description. This is not correct. Container data relates precisely and significantly to other descriptive data. In fact, such container information makes no sense at all except in relation to other descriptive elements. Consider this recasting of Leslie's sample.

Example B:

Container   Id   Contents
Box 1       1    Adams
Box 1       2    Albany Literary Gazette
Box 1       3    Alden
Box 1       4    American Council
Box 2       5    American and Foreign Anti-Slavery Reporter
Box 2       6    Amesbury Villager

There are two differences between examples A and B. One has to do with presentation on the page. The other is more interesting and significant. In example A, the researcher is asked to infer that Adams and what follows are in Box 1, until one comes to another implicit statement that what follows after American Council is in Box 2. The structural relationship between the box number and the ID and title data that follows is exactly the same in both examples, except that in one it is implicit and in the other it is spelled out. The only real difference is in presentation. This is what EAD is about: content and structure, not presentation.

Inventories are full of examples of such implicit inheritance.

Example C:

Correspondence
    1900-1910
    1911-1915
    1916-1920
Subject Files
    1911-1912
    1913-1917
    1918-1920

This really means the same as

Example D:

Correspondence, 1900-1910
Correspondence, 1911-1915
Correspondence, 1916-1920
Subject Files, 1911-1912
Subject Files, 1913-1917
Subject Files, 1918-1920

There is a fundamental, structural relationship between the <container> element and other descriptive data such as <unittitle>. Page presentation tends to mask that association, but it is there. In our discussions about encoding here at the Minnesota Historical Society, most of our problems have been in analyzing and understanding legacy finding aids: in sorting out the implicit understandings about the relationships of different materials that we have tried to convey to the user through physical cues on the finding aid page, distinctions that are very obvious to us but must often be very subtle to others.

Finally, let me respond by offering two examples of encoding of Leslie's example. The first was written by Kris Kiesling.

Example E:

<dsc TYPE="in-depth">
<head>File List</head>
<c LEVEL="file"><did><container>Box 1</container> <unitid>1.</unitid><unittitle>[Adams, 1934]</unittitle></did></c>
<c LEVEL="file"><did><unitid>2.</unitid><unittitle>Albany Literary Gazette [1934]</unittitle></did></c>
<c LEVEL="file"><did><unitid>3.</unitid><unittitle>Alden</unittitle></did></c>
<c LEVEL="file"><did><unitid>4.</unitid><unittitle>American Council</unittitle></did></c>
<c LEVEL="file"><did><container>Box 2</container><unitid>5.</unitid> <unittitle>American and Foreign Anti-Slavery Reporter [1934]</unittitle></did></c>
<c LEVEL="file"><did><unitid>6.</unitid> <unittitle>Amesbury Villager [1934-36]</unittitle></did></c>
<c LEVEL="file"><did><unitid>7.</unitid><unittitle>etc. etc.</unittitle></did></c>

Here's another option that some people who have attended our workshops seem to like.

Example F:

<dsc TYPE="in-depth">
<head>File List</head>
<c LEVEL="file"><did><container>Box 1</container> <unitid>1.</unitid><unittitle>[Adams, 1934]</unittitle></did></c>
<c LEVEL="file"><did><container>Box 1</container> <unitid>2.</unitid><unittitle>Albany Literary Gazette [1934]</unittitle></did></c>
<c LEVEL="file"><did><container>Box 1</container> <unitid>3.</unitid><unittitle>Alden</unittitle></did></c>
<c LEVEL="file"><did><container>Box 1</container> <unitid>4.</unitid><unittitle>American Council</unittitle></did></c>
<c LEVEL="file"><did><container>Box 2</container><unitid>5.</unitid> <unittitle>American and Foreign Anti-Slavery Reporter [1934]</unittitle></did></c>
<c LEVEL="file"><did><container>Box 2</container><unitid>6.</unitid> <unittitle>Amesbury Villager [1934-36]</unittitle></did></c>
<c LEVEL="file"><did><container>Box 2</container><unitid>7.</unitid><unittitle>etc. etc.</unittitle></did></c>

The reason for the explicit markup of container numbers in Example F has to do with an anticipation of issues that might arise with retrieval and display of the inventory.
If a search finds a match in the item "Amesbury Villager," the system can retrieve the necessary descriptive data from the <c> that wraps up that item's information, except for its location, which in examples A and E it inherits implicitly from a sibling. This is very different from examples C and D, where the dates inherit data from their explicitly encoded parents. Now some of the new linking aspects of version 1.0 of EAD will make it possible to make the connections in Examples A and E with a bit of encoding and programming, but it seems to many to be clearer to explicitly code the information, even if one uses the stylesheet to suppress the actual display of all but the first instance. Of course, if one were to make containers free-floating and unconnected to the item descriptions, as Harvard's proposal would do, it would be impossible to pull this information together at all.

Michael Fox
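
To make the sibling inheritance discussed for Examples A and E concrete, here is a minimal Python sketch of the "bit of programming" a retrieval system might need: it walks the components in document order and carries the most recent <container> value forward. It is only an illustration under assumptions. The markup of Example E is rewritten as well-formed XML with a lowercase level attribute so that a standard XML parser can read it, the seventh "etc." entry is dropped, and the names SAMPLE and resolve_containers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Example E, rewritten as well-formed XML (an assumption made so that
# ElementTree can parse it; the original post uses SGML).
SAMPLE = """<dsc type="in-depth">
<head>File List</head>
<c level="file"><did><container>Box 1</container><unitid>1.</unitid><unittitle>[Adams, 1934]</unittitle></did></c>
<c level="file"><did><unitid>2.</unitid><unittitle>Albany Literary Gazette [1934]</unittitle></did></c>
<c level="file"><did><unitid>3.</unitid><unittitle>Alden</unittitle></did></c>
<c level="file"><did><unitid>4.</unitid><unittitle>American Council</unittitle></did></c>
<c level="file"><did><container>Box 2</container><unitid>5.</unitid><unittitle>American and Foreign Anti-Slavery Reporter [1934]</unittitle></did></c>
<c level="file"><did><unitid>6.</unitid><unittitle>Amesbury Villager [1934-36]</unittitle></did></c>
</dsc>"""


def resolve_containers(dsc_xml):
    """Return (container, unitid, unittitle) for each <c>, in document order.

    A component without its own <container> takes the value most recently
    seen on an earlier component, i.e. the implicit sibling inheritance.
    """
    rows = []
    current = None  # last container value encountered so far
    for component in ET.fromstring(dsc_xml).iter("c"):
        did = component.find("did")
        container = did.findtext("container")
        if container is not None:
            current = container
        rows.append((current, did.findtext("unitid"), did.findtext("unittitle")))
    return rows


for row in resolve_containers(SAMPLE):
    print(row)
```

With the explicit markup of Example F no such pass is needed, since every <did> already carries its own <container>. The sketch also depends on each box note being tied to some component's <did>, which is the connection at issue in this thread.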