On Sat, 2003-12-20 at 10:55, Bruce D'Arcus wrote:

> I guess my question then becomes, what is the logic that gets from MARC
> to MARCXML to MODS (which is ultimately what I'm interested in)?  In
> this example:
> 100 1#$aChurchill, Winston,$cSir,$d1874-1965.
> ...would this end up as so in MODS?
> <name type="personal">
>         <namePart type="termsOfAddress">Sir</namePart>
>         <namePart>Churchill, Winston</namePart>
> </name>

Presumably, yes, although when parsed from MARC the term of address may
well come after the name, since that's the order used in the library
catalog. MARC names provide not only the name and its parts; the
elements are also arranged in the order in which the names will be
sorted. I have a personal gripe with forcing the sort order into the
MARC record, but that's how it is.

Oh, and note that the MARC record has a comma after "Winston", so a
literal conversion gives <namePart>Churchill, Winston,</namePart>. The
comma is there only because the subfield is followed by another one
containing "Sir". Otherwise it would end in
a period. Including the punctuation in the subfields is another area of
MARC that gets criticism. UNIMARC, the IFLA format that is widely used
in Europe, doesn't include the punctuation in the text but generates it
automatically based on the subfielding.
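
As a rough illustration of the MARC-to-MODS step for this one field, here
is a minimal Python sketch. It is not the official conversion. The
function names are made up for the example, the subfield mapping ($a to
an untyped namePart, $c to termsOfAddress, $d to date) is my assumption
about how the mapping works, and the punctuation stripping is
deliberately naive.

```python
import re
import xml.etree.ElementTree as ET

# Assumed mapping from MARC 100 subfield codes to MODS namePart types.
# None means an untyped <namePart>. Other subfields are ignored here.
SUBFIELD_TYPES = {"a": None, "c": "termsOfAddress", "d": "date"}


def parse_subfields(field_data):
    """Split '$aChurchill, Winston,$cSir,' into (code, value) pairs."""
    return re.findall(r"\$(\w)([^$]*)", field_data)


def strip_marc_punctuation(text):
    """Drop the trailing comma or period that MARC stores in the subfield.

    Naive on purpose: it would also strip the period from a value that
    legitimately ends in an abbreviation or initial.
    """
    return text.strip().rstrip(",.").strip()


def marc_name_to_mods(field_data):
    """Build a MODS <name> element from the data of a MARC 100 field."""
    name = ET.Element("name", {"type": "personal"})
    for code, value in parse_subfields(field_data):
        if code not in SUBFIELD_TYPES:
            continue
        part = ET.SubElement(name, "namePart")
        part_type = SUBFIELD_TYPES[code]
        if part_type is not None:
            part.set("type", part_type)
        part.text = strip_marc_punctuation(value)
    return name


element = marc_name_to_mods("$aChurchill, Winston,$cSir,$d1874-1965.")
print(ET.tostring(element, encoding="unicode"))
```

Because the subfields are processed in record order, the term of address
comes out after the name, which is the catalog order described above.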

> Also, I wonder if there ought to be support in MODS for "numeration"?

Like "III" meaning "the third"? Those go into $c along with things like
"Sir", at least when they are an add-on to a name, like Tom Jones, III.
Generally, things like III and Jr. are left off of names in library
catalogs because they aren't really a legal part of the name. MARC's
numeration subfield ($b) is used for Pope John Paul II or Charles IV,
where the numeration is an actual part of the name.

In other words, names as recorded in library catalogs don't always
reflect general practice. They are "authoritative" and have been
manipulated based on a certain set of rules. You won't find a one-to-one
correspondence between names in a library catalog and names in a general

> I'm just trying to understand how to get names like the following
> properly coded so they can be reliably formatted for
> citation/bibliographies:
> Dr. Jennifer Jones
> Jane Smith Jr.
> John Q. Whoever III
> Baron von Hausman
> The first is easy.  The other less so.

Yes, and to do it correctly would take artificial intelligence. You can
write algorithms to parse out some of the elements, like "Jr" and "Mr.",
you can link a "von" to the family name, but there are some names that
only a person, with knowledge of the context (including the language
being used), can figure out. It turns out that figuring out names takes
up to half of the time used by library catalogers when cataloging a

If you have a limited context you can do pretty well. But a universal
name parser may be on the order of machine translation of poetry. ;-)
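
To make the "limited context" case concrete, here is a small Python
sketch of the kind of algorithm described above: peel known titles off
the front, known suffixes off the back, and attach particles such as
"von" to the family name. The word lists and the parse_name function are
hypothetical and far from complete, and the sketch assumes Western name
order with the family name last.

```python
# Hypothetical, deliberately short word lists. A real list would depend
# on the language and context of the names being processed.
PREFIXES = {"dr", "mr", "mrs", "ms", "sir", "baron"}
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}
PARTICLES = {"von", "van", "de", "di"}


def _key(token):
    """Normalize a token for list lookups: 'Jr.' becomes 'jr'."""
    return token.rstrip(".").lower()


def parse_name(raw):
    """Split a display-order name into prefix, given, family and suffix."""
    tokens = raw.replace(",", " ").split()

    # Titles come off the front.
    prefix = []
    while tokens and _key(tokens[0]) in PREFIXES:
        prefix.append(tokens.pop(0))

    # Generational suffixes come off the back.
    suffix = []
    while tokens and _key(tokens[-1]) in SUFFIXES:
        suffix.insert(0, tokens.pop())

    # The last remaining token is taken as the family name, and any
    # particles directly before it are linked to it.
    family = [tokens.pop()] if tokens else []
    while tokens and _key(tokens[-1]) in PARTICLES:
        family.insert(0, tokens.pop())

    return {
        "prefix": " ".join(prefix),
        "given": " ".join(tokens),
        "family": " ".join(family),
        "suffix": " ".join(suffix),
    }


for example in ["Dr. Jennifer Jones", "Jane Smith Jr.",
                "John Q. Whoever III", "Baron von Hausman"]:
    print(example, "->", parse_name(example))
```

This handles the four examples in the question, but it is easy to break:
a name written family-name-first, such as Mao Zedong, is split the wrong
way round, and nothing in the string itself tells the code so.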


> Bruce
Karen Coyle
Digital Library Specialist
Ph: 510-540-7596 Fax: 510-848-3913