Rick
This certainly solves the problem elegantly from an XML point of view, but
I'm not sure it deals with Rebecca's underlying issue. If I'm interpreting it
correctly, she is asking whether MODS should allow both 2- and 3-letter
codes or try to mandate something more restrictive (and thus more
interoperable). If it is decided that both the 2- and 3-letter codes need to
be there, it would be nice to have the distinction clearly defined in the
XML.
Mark H Needleman
Sirsi Corporation
Product Manager - Standards
1276 North Warson Road
P.O. Box 8495
St Louis, MO 63132-1806
USA
Phone: 800 325-0888 (US/Canada)
314 432-1100 x318
Fax: 314 993-8927
Email: [log in to unmask]
---------- Forwarded message ----------
Date: Wed, 11 Dec 2002 09:34:40 -0800
From: Rick Beaubien <[log in to unmask]>
Reply-To: Metadata Object Description Schema List <[log in to unmask]>
To: [log in to unmask]
Subject: Re: [MODS] language: comments please
Given that the MODS language element supports both ISO 639-2 and RFC3066, I
feel that any provision for language attributes should as well, just for
the sake of consistency. However, to make the authority explicit and to
avoid having two parallel language attributes to contain the language
value, you might want to consider defining a language attribute group that
includes both a LANG and a LANGTYPE attribute, along the lines of the
following:
<xsd:attributeGroup name="LANGUAGE">
  <xsd:attribute name="LANG" type="xsd:string" use="optional"/>
  <xsd:attribute name="LANGTYPE" use="optional">
    <xsd:simpleType>
      <xsd:restriction base="xsd:string">
        <xsd:enumeration value="RFC3066"/>
        <xsd:enumeration value="ISO639-2b"/>
      </xsd:restriction>
    </xsd:simpleType>
  </xsd:attribute>
</xsd:attributeGroup>
Such handling would, I think, be most consistent with the current language
element.
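As a rough sketch of how an element might pick up this group: the schema below repeats the attribute group and attaches it to a single text element. The "title" element, and the choice to declare everything without a target namespace, are assumptions made purely for illustration and are not a proposal for any specific MODS element.

```xml
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">

  <!-- Attribute group as suggested above: the code value plus its authority -->
  <xsd:attributeGroup name="LANGUAGE">
    <xsd:attribute name="LANG" type="xsd:string" use="optional"/>
    <xsd:attribute name="LANGTYPE" use="optional">
      <xsd:simpleType>
        <xsd:restriction base="xsd:string">
          <xsd:enumeration value="RFC3066"/>
          <xsd:enumeration value="ISO639-2b"/>
        </xsd:restriction>
      </xsd:simpleType>
    </xsd:attribute>
  </xsd:attributeGroup>

  <!-- Hypothetical element, for illustration only: string content
       that carries the LANG and LANGTYPE attributes -->
  <xsd:element name="title">
    <xsd:complexType>
      <xsd:simpleContent>
        <xsd:extension base="xsd:string">
          <xsd:attributeGroup ref="LANGUAGE"/>
        </xsd:extension>
      </xsd:simpleContent>
    </xsd:complexType>
  </xsd:element>

  <!-- Instances this would accept, using French as the example language:
       <title LANG="fre" LANGTYPE="ISO639-2b">Les Miserables</title>
       <title LANG="fr" LANGTYPE="RFC3066">Les Miserables</title> -->

</xsd:schema>
```

Note that a sketch like this only records which authority the code comes from; because LANG is a plain xsd:string, the schema itself would not check that the value is a valid code in the named list.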
Rick Beaubien
At 11:30 AM 12/11/2002 -0500, you wrote:
>There was much discussion on the list about being able to designate
>language, script and transliteration for MODS elements, not just at the
>record level. As we are incorporating these changes, I find that language
>is a difficult one for various reasons and would like opinions.
>
>The MARC language codes have been used in library cataloging since 1968.
>The 3-character (bibliographic) language code based on MARC language codes
>became part 2 of ISO 639 in 1998. This is an official ISO standard.
>
>We all know that the Internet world has specified use of the ISO
>2-character code. RFC 1766 specified this but was revised recently as
>RFC3066 to incorporate use of the 3-character code where a 2-character one
>does not exist. (ISO 639-1, the 2-character code list, only has some 170+
>languages defined while ISO 639-2 has 450+ languages, so the latter is a
>much more granular list.) xml:lang references RFC3066 (in an erratum), so
>that means that you could see 2 or 3 character codes using the spec.
>
>All of the 2-character codes in ISO 639-1 have equivalent 3-character
>codes, and these are to be considered synonyms. (If you want more
>information about the relationship between these lists, see
>http://lcweb.loc.gov/standards/iso639-2/faq.html)
>
>The question is what to allow for in MODS. In the language element itself,
>there is an authority attribute that specifies whether the language code
>is from RFC3066 or ISO 639-2. Certainly MARC records would use the ISO
>639-2 code.
>
>If designating language at the element level (which is not currently in
>MARC, although a mechanism to do this has been discussed) what should be
>allowed? Options are:
>
>1. define xml:lang (to include what is specified in RFC3066) and lang (to
>allow for the 639-2 code) for each element
>2. define only lang and let the application decide which code to use
>(there wouldn't be any clashes since there is a one-to-one mapping)
>3. define only xml:lang and not allow for the 639-2 code at the element
>level
>
>The problem I see in only defining xml:lang is that the 3-character code
>is better known in the library world. If converting from MARC records,
>the 3-character code would be in the language field; would it then be
>strange to use the 2-character one at the element level? Also, are there
>other options I haven't thought of?
>
>Please comment.
>
>Rebecca
>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>^^ Rebecca S. Guenther ^^
>^^ Senior Networking and Standards Specialist ^^
>^^ Network Development and MARC Standards Office ^^
>^^ 1st and Independence Ave. SE ^^
>^^ Library of Congress ^^
>^^ Washington, DC 20540-4402 ^^
>^^ (202) 707-5092 (voice) (202) 707-0115 (FAX) ^^
>^^ [log in to unmask] ^^
>^^ ^^
>^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-----------------------------------------------------
Rick Beaubien
Lead Software Engineer: Research and Development
Library Systems Office
Rm 386 Doe Library
University of California
Berkeley, CA 94720-6000
510-643-9776