Roy, this is *totally* sensible in terms of sorting, and a solution I
would much prefer. There's a recent change to MARC, which I believe the
LC folks were trying to accommodate with nonSort, but I think your
solution works for it as well. The MARC change states that in the future
(i.e. when we all get around to implementing it, which no one has yet,
as far as I know) there will be two distinct characters that you place
around non-sort elements in a field, and non-sort will no longer be
limited to the beginning of the field. For example (I'm using braces
here, but the actual characters are new values and do not have a
display form):

  {The }way we were.
  Hamlet, {the }Prince of Denmark

This could easily be turned into your sort field based on those
characters. It couldn't as easily be turned back into the MARC field,
but since MODS is not as detailed as MARC the round-trip aspect has
already been lost, hasn't it? 
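
A minimal sketch of that conversion, in Python. The braces stand in for
the new MARC characters, as in the examples above, and the function
names are made up for illustration:

```python
import re

# Braces stand in for the two new MARC non-sort characters (an assumption
# carried over from the examples above; the real characters have no
# display form).
NONSORT_SPAN = re.compile(r"\{[^{}]*\}")


def sort_form(marked: str) -> str:
    """Drop every marked non-sort span, wherever it occurs in the field."""
    return NONSORT_SPAN.sub("", marked)


def display_form(marked: str) -> str:
    """Keep the non-sort text but remove the marker characters."""
    return marked.replace("{", "").replace("}", "")


print(sort_form("{The }way we were."))               # way we were.
print(sort_form("Hamlet, {the }Prince of Denmark"))  # Hamlet, Prince of Denmark
print(display_form("{The }way we were."))            # The way we were.
```

Going the other way is the hard part: once the markers are gone, nothing
in the sort form says where the non-sort text used to be.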

Also note that in the early implementations of the online catalog at U
of California we used a table to determine the non-sort beginnings of
titles (coordinated with the language code in the MARC record) and it
was quite accurate. So when people say: I don't want to have to key the
title twice (sort and non-sort) I always answer that an algorithm can do
it for them and get it right in most cases, and they'd just have to make
modifications for the really odd titles or when the language isn't
known. So much of this now can be done "on the fly", including inverting
names, that we shouldn't be afraid of repetition. And it is much easier,
as you say, to go out and grab a sort title than to rummage through
attributes, and I bet it will be much less prone to error.
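
To give a rough idea of the table approach, here is a sketch in Python.
It is not the code we ran at UC; the table entries and the function name
are invented for illustration, and a real table would be much longer:

```python
# Hypothetical table: MARC language code -> leading articles to skip.
# These few entries are illustrative only.
LEADING_ARTICLES = {
    "eng": ("the ", "an ", "a "),
    "fre": ("l'", "le ", "la ", "les ", "un ", "une "),
    "ger": ("der ", "die ", "das ", "ein ", "eine "),
    "ara": ("al-",),
}


def split_nonsort(title: str, lang: str) -> tuple[str, str]:
    """Return (nonsort, sort_title); nonsort is "" when nothing matches."""
    # Try longer articles first so the most specific entry wins.
    for article in sorted(LEADING_ARTICLES.get(lang, ()), key=len, reverse=True):
        head = title[:len(article)]
        # Case-insensitive match; the title must continue after the article.
        if head.lower() == article and len(title) > len(article):
            return head, title[len(article):]
    return "", title


print(split_nonsort("The Best of Times", "eng"))  # ('The ', 'Best of Times')
print(split_nonsort("The Best of Times", "und"))  # ('', 'The Best of Times')
```

A title like "A is for Apple" is the kind of odd case that would come
out wrong here and need a manual fix, and an unknown language code
simply leaves the title untouched.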

As for the punctuation, you may have heard a cheer go up among the
Unimarc folks when they read your post. This is a long-standing conflict
between how Unimarc and MARC handle fields. Unimarc does more detailed
coding and inserts the punctuation at the time of display; MARC includes
the punctuation in the field. The former is much more sensible; the
latter is a huge "legacy" problem. (Here we could bring out the old saw
of "the future is longer than the past" but when that came up at the
recent MARC standards committee a voice in the back of the room chimed
in: "Not for some of us." ;-)
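
The display-time approach amounts to something like the following Python
sketch. The function name and the default separator are assumptions for
illustration, not anything taken from Unimarc or MODS:

```python
def display_title(title: str, subtitle: str = "", separator: str = ": ") -> str:
    """Join separately coded title parts, adding punctuation only for display."""
    return f"{title}{separator}{subtitle}" if subtitle else title


print(display_title("The Best of Times", "An Essay on Entertaining"))
# The Best of Times: An Essay on Entertaining
print(display_title("The Best of Times"))
# The Best of Times
```

Because the stored data carries no punctuation, a display that wants the
two parts on separate lines, or with a different separator, just doesn't
call this.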

kc

On Thu, 2004-01-22 at 16:14, Roy Tennant wrote:
> Call me crazy, and probably one or the other of us will regret me 
> wading in here, but for the life of me I can't figure out why something 
> like this wouldn't work, and it strikes me as being less prone to 
> problems (but I could be overlooking some which I'm sure someone will 
> point out).
> 
> <titleInfo>
>         <title>The Best of Times</title>
>         <titleSort>Best of Times</titleSort>
>         <titleSub>An Essay on Entertaining</titleSub>
>         <titleAbbrev>Best of Times</titleAbbrev>
>         <titleTrans lang="ger">German title here</titleTrans>
> </titleInfo>
> 
> <titleInfo> could be required to have <title>, with everything else 
> being optional. The benefit of this from a processing viewpoint is that 
> it is very easy -- if you are sorting, and a <titleSort> exists, just 
> grab and go; if it doesn't, grab <title> and go. I think constructs 
> like this:   <mainTitle nonSort="The ">Best of times</mainTitle> are 
> just asking for trouble. As much as possible, I think we need to strive 
> for a spec that makes it darn difficult to flub it up -- either from 
> the "data entry" viewpoint (realizing that many of these won't 
> necessarily be keyed in), or the processing viewpoint.
> 
> Also, punctuation, such as a colon between the title and subtitle 
> SHOULD NOT be there, unless you wanted to do something like 
> <titleDisplay>Title: Subtitle</titleDisplay>. The problem with 
> including punctuation is that there are times when you don't want it, 
> then what do you do? For example, see the way we handle titles and 
> subtitles on this book: <http://ark.cdlib.org/ark:/13030/ft1s20045x/>. 
> Because MARC includes the punctuation, we have to use the title and 
> subtitle from a different source. I think that's one problem from MARC 
> days we would do well to leave behind. Allow punctuation to be put in 
> at the time of processing, unless it is unalterably a part of the 
> information being captured.
> Roy
> 
> On Jan 22, 2004, at 3:46 PM, Karen Coyle wrote:
> 
> > On Thu, 2004-01-22 at 09:48, Bruce D'Arcus wrote:
> >
> >> Can you give a marked up example Karen?  One of the nice things about
> >> the current situation is that you can have:
> >>
> >> <titleInfo>
> >>     <title>A Long Title that Could be Shortened</title>
> >>     <subTitle>Subtitle</subTitle>
> >> </titleInfo>
> >> <titleInfo type="abbreviated">
> >>     <title>A Long Title</title>
> >> </titleInfo>
> >
> > I'm thinking along these lines:
> >
> > Create a <title> complex element that has
> >   <mainTitle>
> >   <subTitle>
> >   <partNumber>
> >   <partName>
> >   <nonSort> [although below I'll talk about how I would rather see
> > nonSort handled]
> >
> > Create a <titleOther> element that extends <title> and adds the
> > attributes "abbreviated", "translated", etc.
> >
> > Then if there was a desire to make <title> a required element, that
> > could be done by changing the definition of modsType or someone could
> > extend modsType to make it more "strict" (as Andy has pointed out).
> >
> > It probably ends up being 6 of one and half dozen of another, but for
> > some reason it feels cleaner to me this way.
> >
> > As for nonSort, just to be contrary, THAT I would like to see as an
> > attribute. I'm uneasy with nonSort just floating around amid a bunch of
> > other elements. My definition would limit it to the beginning of the
> > string. So:
> >
> >   <title>
> >     <mainTitle nonSort="The ">Best of times</mainTitle>
> >     <subTitle nonSort="an ">essay on entertaining</subTitle>
> >   </title>
> >
> > This is all very debatable. Some folks want to do nonSort in the middle
> > of a string (and I'm making up all of these examples, so don't get too
> > hung up on them):
> >
> >    <nonSort>an </nonSort>
> >    <subTitle>essay</subTitle>
> >    <nonSort> on </nonSort>
> >    <subTitle>entertaining</subTitle>
> >
> > I'd rather see that as:
> >    <subTitle nonSort="an ">essay</subTitle>
> >    <subTitle nonSort=" on ">entertaining</subTitle>
> >
> > although in fact I'd prefer that people not do nonSort designations
> > within an element. I think we get into all kinds of dangerous ground
> > there.
> >
> > Among the reasons that the free-floating nonSort worries me is that
> > implementations may not retain the spaces in the elements (whereas they
> > are more likely to in a quoted string), and I think it's easier to input
> > (just my gut). Note that a nonSort element is not always a full word and
> > doesn't always get spaces, such as in 17th and 18th century works in
> > French where the apostrophe was not used: Lhistoire.... In this case,
> > the nonSort is "L" and there are no spaces; or in Arabic, where the
> > nonSort is "al-", as in: al-ʻArabah al-dhahabīyah lā taṣʻad.
> >
> > I'm assuming of course that for display you are wanting to put the
> > nonSort back together with the title, so you'll get:
> >   The Best of times an essay on entertaining
> > (and depending on your rules, you may put punctuation between a title
> > and a subtitle -- I have no idea what people are doing about that. MARC
> > records include the punctuation in the data element.)
> >
> > More on names:
> >
> >> While we're at it, I've been thinking about name-markup a lot, because
> >> it's so critical for citations.  Leaving aside that I wish MODS used
> >> element names instead of attributes for family, given, etc., I do have
> >> a suggestion: an attribute to indicate abbreviation on the namePart
> >> element.  I also think element order is going to be important
> >>
> >> Examples:
> >>
> >> <name type="personal">
> >>     <namePart type="given">Jane</namePart>
> >>     <namePart abbrev="yes">Q</namePart>
> >>     <namePart type="family">Doe</namePart>
> >> </name>
> >>
> >> The Q above is of course commonly (in the U.S.) understood as a middle
> >> initial.
> >
> > I was about to suggest that you can consider a single letter or a letter
> > followed by a period to be an initial if that is important for your
> > processing, when I thought about "Wm.". In any case, I'm still not sure
> > what the extra mark-up is going to get you that you can't divine
> > algorithmically, which is how you would probably be arriving at the
> > coding to begin with.
> >
> >>
> >> A corporate name:
> >>
> >> <name type="corporate">
> >>     <namePart abbrev="yes">FBI</namePart>
> >>     <namePart>Federal Bureau of Investigation</namePart>
> >> </name>
> >
> > I think this is unclear -- you don't know if you have a name with two
> > parts - one that's abbreviated and one that isn't, i.e.
> >   U. S. Department of Commerce
> >   <namePart abbrev="yes">U. S.</namePart>
> >   <namePart>Department of Commerce</namePart>
> >
> > or two versions of the same name, as you have. This kind of situation is
> > better handled with authority records rather than in the bibliographic
> > record because there is a way to associate variations on a name. The
> > other option would be to have an attribute for "name variation", which
> > to me is clearer than having two name parts that may or may not
> > represent the whole name.
> >
> > [Note: I have done a first pass at a version of the authority format in
> > what I hope is a MODS-compatible schema, and have given it to LC for
> > review. That might be a solution for some of the problems that are
> > coming up around names.]
> >
> >
> >>
> >> In the end, I suppose this is how I'd do things if I was designing a
> >> new schema:
> >>
> >> <creator role="editor">
> >>     <person ID="doej">
> >>       <name>
> >>         <termOfAddress>Sir</termOfAddress>
> >>         <given>John</given>
> >>         <other abbrev="yes">Q</other>
> >>         <articular>van</articular>
> >>         <family>Doe</family>
> >>         <termOfAddress>Duke of X</termOfAddress>
> >>         <full>Sir John Q. van Doe, Duke of X</full>
> >>       </name>
> >>       <note>some notes ...</note>
> >>     </person>
> >> </creator>
> >>
> >> The advantage is that role is separated from the person, and person
> >> from name, allowing additional elements to be wrapped in there as well
> >> that are apart from "names."  This is a bit beyond the realm of MODS,
> >> though.
> >
> > Yes, but I like this structure. It gets role out of the name area. That
> > actually makes sense in a system with a separate authority file for
> > names because the same person (read: same name) will be in different
> > roles in different bibliographic records, but is always him/herself as a
> > person.
> >
> > -- 
> > -------------------------------------
> > Karen Coyle
> > Digital Library Specialist
> > http://www.kcoyle.net
> > Ph: 510-540-7596 Fax: 510-848-3913
> > --------------------------------------
> >
-- 
-------------------------------------
Karen Coyle
Digital Library Specialist
http://www.kcoyle.net
Ph: 510-540-7596 Fax: 510-848-3913
--------------------------------------