I'd like to add another bit of info to Caroline's very clear statement
of support for the sort title element: machine sorting is only really
easy when you limit your title to the basic ASCII characters. It gets
more complex as you add accented characters, and then ramps up in
complexity through non-Latin alphabets (e.g. Arabic and Hebrew) and the
non-alphabetic scripts (Chinese, etc.).
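
As a minimal sketch of the accented-character case (Python, standard
library only): the titles and the fold_key helper are made up for
illustration. Stripping accents is only a rough stand-in for real
collation, since the correct order depends on the language (Swedish, for
example, sorts its accented vowels after "z").

```python
import unicodedata


def fold_key(title: str) -> str:
    """Hypothetical sort key: decompose, drop combining marks, ignore case."""
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


titles = ["Zebra", "Émile", "apple", "Eclipse"]

# Plain sorting compares code points: uppercase before lowercase,
# and the accented capital after every unaccented ASCII letter.
by_code_point = sorted(titles)

# Sorting on the folded key gives the order most readers would expect.
by_fold_key = sorted(titles, key=fold_key)

print(by_code_point)  # ['Eclipse', 'Zebra', 'apple', 'Émile']
print(by_fold_key)    # ['apple', 'Eclipse', 'Émile', 'Zebra']
```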

The library system I worked on until recently did the usual normalization
of fields for sorting, but it also created a sort order for Chinese,
Japanese and Korean (which cannot simply be sorted in their Unicode byte
order). To do this it created a special sort field that carried a
translation of the ideographic text into an ASCII-based phonetic
equivalent (for Chinese, the romanization known as "pinyin"). Although it
was reasonably efficient to do
this translation once for each record as it went into the database, as
we get more into ad hoc exchange of bibliographic records it would
probably be best to include these sort fields in the record as it is
exchanged. So although it may seem unnecessary to carry a separate sort
field in order to drop "The " off of the front of a title, more
challenging problems in sorting are in our future.
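
The idea of computing a sort field once and carrying it in the record
can be sketched as follows. This is an illustration under stated
assumptions, not the system described above: the PINYIN table is a tiny
hand-made stand-in for a real romanization dictionary, and
make_sort_field, ingest, and the record layout are hypothetical names.

```python
# Hypothetical, hand-made romanization table. A real system would need a
# full dictionary and a way to handle characters with several readings.
PINYIN = {
    "北": "bei", "京": "jing",
    "上": "shang", "海": "hai",
    "广": "guang", "州": "zhou",
}

ENGLISH_ARTICLES = ("the ", "an ", "a ")


def make_sort_field(title: str, lang: str) -> str:
    """Build an ASCII sort string for a title (illustrative rules only)."""
    if lang == "zh":
        # Raises KeyError for characters missing from the toy table.
        return " ".join(PINYIN[ch] for ch in title)
    lowered = title.lower()
    if lang == "en":
        for article in ENGLISH_ARTICLES:
            if lowered.startswith(article):
                return lowered[len(article):]
    return lowered


def ingest(title: str, lang: str) -> dict:
    """Compute the sort field once, when the record is created."""
    return {"title": title, "lang": lang, "titleSort": make_sort_field(title, lang)}


records = [ingest(t, "zh") for t in ["上海", "北京", "广州"]]

# Any recipient can sort on the carried field without a romanization table.
by_sort_field = [r["title"] for r in sorted(records, key=lambda r: r["titleSort"])]
by_code_point = sorted(r["title"] for r in records)

print(by_sort_field)  # ['北京', '广州', '上海']
print(by_code_point)  # ['上海', '北京', '广州']
print(make_sort_field("A shield in space?", "en"))  # shield in space?
```

Because the sort string travels with the record, a receiving system only
needs a plain ASCII sort to reproduce the intended order.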

kc

On Sat, 2004-01-24 at 07:30, Caroline Arms wrote:
> I'd like to support an explicit element for the title to be used for
> sorting (when present), as in the example Bruce provided (with minor
> adjustments to end tags) based on Roy's suggestion.
>
> <titleInfo>
>     <title>A shield in space?</title>
>     <titleSub>technology, politics, and the strategic defense initiative:
> how the Reagan Administration set out to make nuclear weapons "impotent
> and obsolete" and succumbed to the fallacy of the last move</titleSub>
>     <titleSort>shield in space?</titleSort>
>     <titleAbbrev>A shield in space?</titleAbbrev>
> </titleInfo>
>
> To me this is simple and unambiguous.
>
> Rebecca says:
>
> > Roy's suggestion of including a
> > sort title is a possibility, although, as already expressed, requires
> > redundant keying or extra programming.
>
> True, but, in practice, I see no likelihood that it would be achieved by
> redundant keying other than in exceptional cases.  In most cases, it would
> be achieved by simple programming (extra, maybe, but worthwhile when
> considering the overall economy -- including building systems that use the
> records to help users find the content described and the productivity of
> those users).  If done explicitly at data entry, the worst case would be
> copy-and-paste and automated population of the titleSort element for the
> bulk of cases should be feasible, especially if the language of the title
> is known.  Transformation from MARC can take advantage of the existing
> explicit coding.
>
> I used the phrase "overall economy" deliberately.  The newly published
> draft STATEMENT OF INTERNATIONAL CATALOGUING PRINCIPLES from IFLA includes
> the following:
>
> Economy. When alternative ways exist to achieve a goal, preference should
> be given to the way that best furthers overall economy (i.e., the least
> cost or the simplest approach).
>
> It is not clear as stated, that "overall" should include the use of the
> records and not just their creation, but I would personally argue that if
> that is not the intent it should be.
>
>     Caroline Arms
>     Office of Strategic Initiatives
>     Library of Congress
>
> Views are my own.
--
-------------------------------------
Karen Coyle
Digital Library Specialist
http://www.kcoyle.net
Ph: 510-540-7596 Fax: 510-848-3913
--------------------------------------