Ack, speaking of not realizing things, I thought Internet language
specifications were based on ISO 639-1 with an optional country code
(e.g. en-US). Checking RFC 3066, I see this is not the case; xml:lang
can be used for two-letter or three-letter language codes and more:
- All 2-letter subtags are interpreted according to assignments found
  in ISO standard 639, "Code for the representation of names of
  languages" [ISO 639], or assignments subsequently made by the ISO
  639 part 1 maintenance agency or governing standardization bodies.
  (Note: A revision is underway, and is expected to be released as
- All 3-letter subtags are interpreted according to assignments found
  in ISO 639 part 2, "Codes for the representation of names of
  languages -- Part 2: Alpha-3 code [ISO 639-2]", or assignments
  subsequently made by the ISO 639 part 2 maintenance agency or
  governing standardization bodies.
- The value "i" is reserved for IANA-defined registrations
- The value "x" is reserved for private use. Subtags of "x" shall
  not be registered by the IANA.
- Other values shall not be assigned except by revision of this
So I must agree, the MODS lang attribute can go.
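
For what it's worth, the primary-subtag rules quoted above can be sketched in a few lines of Python. This is only an illustration, not anything from MODS or the RFC itself: the function name classify_primary_subtag and the category labels are made up here, the pattern reflects my reading of the RFC 3066 tag syntax (a primary subtag of 1-8 letters, then hyphen-separated subtags of 1-8 letters or digits), and it checks only the shape of the tag, not whether a code is actually assigned in the ISO 639 lists.

```python
import re

# Assumed tag shape per RFC 3066: a primary subtag of 1-8 letters, followed
# by zero or more hyphen-separated subtags of 1-8 letters or digits.
_TAG_RE = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def classify_primary_subtag(tag):
    """Return which rule governs the primary subtag of a language tag.

    Hypothetical helper: the returned labels are illustrative only, and no
    lookup against the real ISO 639 code lists is performed.
    """
    if not _TAG_RE.fullmatch(tag):
        return "malformed"
    primary = tag.split("-")[0].lower()
    if len(primary) == 2:
        return "ISO 639-1"        # 2-letter subtags
    if len(primary) == 3:
        return "ISO 639-2"        # 3-letter subtags
    if primary == "i":
        return "IANA-registered"  # "i" is reserved for IANA registrations
    if primary == "x":
        return "private use"      # "x" is reserved for private use
    return "unassigned"           # other values are not assigned


if __name__ == "__main__":
    for tag in ("en-US", "eng", "i-klingon", "x-local", "abcd", "en_US"):
        print(tag, "->", classify_primary_subtag(tag))
```

The point is just that a consumer of xml:lang has to be ready for both two-letter and three-letter primary subtags, plus the "i" and "x" cases.
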
>>> [log in to unmask] 01/17/05 1:31 PM >>>
On Jan 17, 2005, at 12:46 PM, Andrew E Switala wrote:
> MODS guidelines say the lang attribute's value comes from ISO 639-2
I didn't realize this.
> (The former is valid for the xml:lang attribute; why MODS has two
> different language attributes is another matter.)
But an important one. I don't understand this sort of thing. Why not
just use THE XML standard for language coding, instead of once again
relying on library-specific stuff?