Geoff,

I'm in agreement with many aspects of your proposal. Details below.

On Fri, 29 Mar 2002, Geoff Mottram wrote:

> I would like to propose an alternative declaration for the name, subject
> and title elements that tackles the various issues raised with the
> current implementation. It requires the general design philosophy that
> if an element may contain sub-elements, it should not also contain
> regular content. To make an analogy to MARC, you can have fixed fields
> that only have data content and you can have variable fields that only
> contain subfields, but you cannot mix the two models. This philosophy
> should be applied throughout the MODS schema and will lead to data that
> is much easier to process with standard software components (including
> relational databases).

Agree with this structural change to MODS. What other fields are
currently defined with both subelements and content?

>
> ----------
> NAME FIELD
> ----------
> For the name field, I propose renaming it to "creator" with the
> following structure:
>
> creator
>     ID (attribute)
>     role (attribute)
>     authority (attribute)
>     link (attribute)
>     typeAuthority (attribute)
>     typeLink (attribute)
>
>     affiliation (element)
>     displayForm (element)
>     name (element)
>         type (attribute)
>     description (element)
>     type (element)
>
> All attributes are optional (to make record creation as easy as possible
> but to allow for more sophisticated cataloging if desired) and there
> must be at least one "name" element which is repeated for each level of
> a hierarchical (structured) name. Dashes are not permitted and all names
> must be broken out according to their hierarchical structure.
> The "description" element is used for cases like "Abrams, Michael
> (American artist, 20th c.)":
>
> <creator>
>   <type>personal</type>
>   <name>Abrams, Michael</name>
>   <description>(American artist, 20th c.)</description>
> </creator>
>
> A corporate name would look like this:
>
> <creator>
>   <type>corporate</type>
>   <type>government</type>
>   <name>Library of Congress</name>
>   <name>National Digital Library Program</name>
>   <displayForm>Music Division, Library of Congress</displayForm>
> </creator>
>
> The "link" attribute would allow a link to an authority record for the
> whole name. This would satisfy Andrew's concerns about typos in the
> "name" elements and also allow for cases where the authority entry does
> not match exactly the version in this field (because of additional
> authority subheadings, punctuation, etc.).
>
> You will notice that I have made "type" an element and it is also
> repeating. This allows for a multi-level approach to describing the
> type of name without limiting an implementation to a fixed number of
> levels.
>
> The "type" element would support the following values: "personal",
> "corporate", "conference", etc. The list would not be closed and users
> would be free to add to it. Subtypes could be defined by adding a second
> "type" field that might include the following values for a "corporate"
> type name: "profit", "nonprofit", "government", etc. Subtypes could
> also be used to distinguish between forms of a personal name
> ("forename", "surname", "family name") but I have another suggestion for
> that problem, below.
>
> The "typeAuthority" and "typeLink" attributes allow for the definition
> of an authority list for a hierarchy of type terms and a link to a
> particular entry.
>
> There is also a "type" attribute for the "name" element for cases where
> a user would like to make some sort of distinction here. This has
> greater applicability in the subject field.
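
As an illustration of how records in the proposed creator structure
might be processed, here is a minimal Python sketch. It assumes only the
element names from the proposal above (creator, type, name,
displayForm). The function name summarize_creator, the fallback of
joining name levels when displayForm is absent, and the case-folded sort
key are assumptions made for this example, not part of the proposal.

```python
import xml.etree.ElementTree as ET

# Sample record copied from the corporate-name example in the proposal.
CREATOR_XML = """
<creator>
  <type>corporate</type>
  <type>government</type>
  <name>Library of Congress</name>
  <name>National Digital Library Program</name>
  <displayForm>Music Division, Library of Congress</displayForm>
</creator>
"""


def summarize_creator(xml_text):
    """Return the type levels, name levels, display string and sort key."""
    creator = ET.fromstring(xml_text)

    # findall() with a bare tag matches direct children only, so the
    # "type" attribute on <name> is never confused with <type> elements.
    types = [(t.text or "").strip() for t in creator.findall("type")]
    names = [(n.text or "").strip() for n in creator.findall("name")]
    if not names:
        raise ValueError("a creator needs at least one <name> element")

    display = creator.findtext("displayForm")
    if display is None:
        # Assumption for this sketch: join the levels when no displayForm.
        display = ". ".join(names)

    return {
        "types": types,
        "names": names,
        "display": display.strip(),
        # Assumption: sort on the name levels in order, ignoring case.
        "sort_key": tuple(n.casefold() for n in names),
    }


if __name__ == "__main__":
    print(summarize_creator(CREATOR_XML))
```

Because each level of the name sits in its own element, the sort key can
be built without splitting strings on dashes or punctuation.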
>
> The "name" field would always be entered as it should be sorted. If
> this is different from how it is displayed, a "displayForm" element
> should be included. This solves the first name, last name debate and
> also allows for other language related sorting situations that we have
> not accounted for. For example, in Dutch phone books, the "van" at the
> start of a last name is ignored in the sorting sequence because so many
> people in Holland have a last name starting with "van". This is not
> unlike non-filing characters in titles. My point is, this proposed
> technique is the most flexible in terms of user needs without having to
> anticipate what those needs are. It also avoids having to sub-divide
> names any more than necessary.

I think there might be a requirement for both a displayForm and a
nonfiling technique. The displayForm might be used to give the name in
non-inverted order, a shortened form, a name from an author statement,
etc., while the nonfiling technique might be used to skip over the "van"
in the example above, or the "al-" in an Arabic name, etc. It might be
more consistent to have one non-filing technique for all MODS elements.

Also, open-ended lists of "types" make me a bit nervous about being able
to validate a MODS record or a name in a MODS record.

>
> ----------
> SUBJECT FIELD
> ----------
> The subject has an identical design, except the "name" element is called
> "term", as follows:
>
> subject
>     ID (attribute)
>     role (attribute)
>     authority (attribute)
>     link (attribute)
>     typeAuthority (attribute)
>     typeLink (attribute)
>
>     affiliation (element)
>     displayForm (element)
>     term (element)
>         type (attribute)
>     description (element)
>     type (element)
>
> All attributes and elements are optional except for "term", of which
> there must be at least one. The "type" element might include the
> following values: "personal", "corporate", "conference", "topic",
> "title", "geographic", "temporal" and "classification".
> Subtypes might include those needed for each of the main types. For
> "geographic" this might include: "city", "continent", "country",
> "county", "island", "province", "region", "state" and "territory".
> For "classification" it might include: "lcc", "ddc", etc.
>
> It is in the subject field that the utility of a multi-level type
> becomes apparent. It illustrates why even two levels of types will be
> insufficient for some users. In the case of Dewey numbers, there might
> be three type elements, as follows:
>
> <type>classification</type>
> <type>ddc</type>
> <type>Edition 19</type>
>
> The "term" element contains an optional "type" attribute to distinguish
> between different types of terms, if desired. For example:
>
> <subject authority="lcsh">
>   <term type="topic">Journalism</term>
>   <term type="topic">Political aspects</term>
>   <term type="geographic">United States.</term>
> </subject>

This might work well. Will it be confusing to have both an element
"type" and an attribute "type"? Maybe the element could be called
something else, "category" perhaps (ugh)? In the case of subjects where
a class number is given instead of a term, is "term" the right element,
or should there be another element name for classification schemes that
aren't composed of words?

>
> ----------
> TITLE FIELD
> ----------
> With regard to the title field, it needs to be redesigned to support
> non-sorting characters in a manner that would be easy to implement with
> off-the-shelf software. The idea of a pair of non-sorting character
> codes as defined in MARC-21 is an interesting solution but one that is
> highly specific to the library market. I would like to suggest an idea
> that would have worked just as well in pre-MARC-21: a separate subfield
> for the leading article.
> Thus the title field would be defined as follows:
>
> title
>     ID (attribute)
>     role (attribute)
>     authority (attribute)
>     link (attribute)
>     typeAuthority (attribute)
>     typeLink (attribute)
>
>     part (element)
>     nonsort (element)
>     type (element)
>
> The "title" element contains most of the same attributes as the "name"
> and "subject" fields with the same meanings. The "part" element is
> required and every title must contain one or more "part" elements. The
> "nonsort" element is used to surround the non-filing portion of the
> title as in the following example:
>
> <title>
>   <nonsort>The</nonsort>
>   <part>Unbearable Lightness of Being</part>
> </title>
>
> Notice how easy it is to sort titles in this format -- you just sort the
> "part" elements. However, when displaying the title, you don't suppress
> this data. Notice also, that you can sprinkle the "nonsort" element
> throughout a title (although I can't think of an application for this
> yet).
>
> An alternative definition for the title field would use a "displayForm"
> element (as in the name and subject fields) instead of the "nonsort"
> element. This would have the advantage of consistency, if nothing else,
> and will also support other strange sorting situations we cannot
> anticipate.
>
> ----------
> UNIVERSAL FIELD
> ----------
> A final possibility would be to create a single field definition for
> creators, subjects and titles as follows:
>
> creatorSubjectTitleType
>     ID (attribute)
>     role (attribute)
>     authority (attribute)
>     link (attribute)
>     typeAuthority (attribute)
>     typeLink (attribute)
>
>     affiliation (element)
>     displayForm (element)
>     part (element)
>         type (attribute)
>     description (element)
>     type (element)
>
> There would still be separate "creator", "subject" and "title" elements
> but they would all share the same list of attributes and elements.
> Note that I have used the generic term "part" to contain the name,
> subject, or title, as in:
>
> <creator>
>   <type>corporate</type>
>   <part>United States</part>
>   <part>Dept. of Agriculture</part>
>   <part>Economics, Statistics, and Cooperatives Service</part>
> </creator>
> <title>
>   <part>Asia agricultural situation; review and outlook</part>
> </title>
> <subject authority="lcsh">
>   <part type="topic">Agriculture</part>
>   <part type="topic">Economic aspects</part>
>   <part type="geographic">Asia</part>
>   <part type="topic">Periodicals</part>
> </subject>

I don't see any great advantage of the UNIVERSAL field approach over
what's been outlined for the three fields already.

>
> Whether you like this proposal or not, would you please comment on it.
> LC can't gauge the popularity of any of the comments on the MODS list
> without more participation.
>
> Thank you.
> Geoff Mottram
>

And thank you, Geoff, for your thoughtful and thought-provoking
contributions.

Dick Thaxter

*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
* Dick Thaxter   202 707-7208                                  *
* Automation Specialist                                        *
* Motion Picture, Broadcasting & Recorded Sound Division       *
* Library of Congress                                          *
* The usual disclaimers apply                                  *
*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
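
A footnote on the question raised near the top of this message, about
which fields have both subelements and content: one way to find them
would be to scan sample records for elements that contain child elements
and also non-whitespace text. The Python sketch below does that. The
function name mixed_content_elements and the sample record are
hypothetical, made up for this illustration; the sample is not an actual
MODS record and says nothing about how MODS currently defines any field.

```python
import xml.etree.ElementTree as ET


def mixed_content_elements(xml_text):
    """Return the tags of elements holding both child elements and text."""
    root = ET.fromstring(xml_text)
    offenders = []
    for elem in root.iter():
        children = list(elem)
        if not children:
            continue  # leaf elements may hold text freely
        # Text before the first child lives in elem.text; text after
        # each child lives in that child's .tail.
        has_text = bool((elem.text or "").strip())
        has_tail_text = any((c.tail or "").strip() for c in children)
        if has_text or has_tail_text:
            offenders.append(elem.tag)
    return offenders


# Hypothetical sample: <name> mixes text with a subelement, <title> does not.
SAMPLE = """
<record>
  <name>Abrams, Michael <role>artist</role></name>
  <title><nonsort>The</nonsort><part>Unbearable Lightness of Being</part></title>
</record>
"""

if __name__ == "__main__":
    print(mixed_content_elements(SAMPLE))  # prints ['name']
```

Whitespace used only for indentation is ignored, so pretty-printed
records are not reported as mixed content.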