I have already been working on the relationships between individual language, macrolanguage, and collective language group codes. I am down to the problem codes: of the 600+ code elements that I could not map via automated queries, some 350 remain. After that I need to verify the draft mappings. I may be through that within the next two weeks (it is not the only project on my plate). I will send the mappings to you at that point.

I am also very interested to read about the MARC code set in XML. I have just started modeling an RDF-based expression of ISO 639. I have looked into SKOS in this regard, as well. I'd be interested in working with you all on that (sooner rather than later). I just downloaded the languages.xml file, so I will look into that and take up that conversation separately with you.

Re: decision required: "other" collections

Having reread the recent posts on this issue, I think that Peter has been
convincing in his arguments that the distinction between Other and
Languages is no longer clear, since we have added languages that once were
only under a collective language group. I am prepared to accept the
suggestion that we change the names of those called (Other) to Languages.
I would be in favor of making available documentation that shows which
individual languages have their own codes under the language groups in ISO
639-2 and which use the collective code. We have already documented the
individual languages that use a collective code in MARC, but under each
collective code we do not show the languages in that group that have
their own codes.
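
As a rough illustration of the kind of documentation I mean, here is a
minimal Python sketch. The CollectiveGroup class and describe function are
hypothetical names, and the Berber sample entries are only an illustrative
sample that I have not checked against the published lists.

```python
from dataclasses import dataclass, field


@dataclass
class CollectiveGroup:
    """One collective code plus the two kinds of member languages."""
    code: str
    name: str
    # Languages in the group that have their own code: code -> name.
    own_codes: dict = field(default_factory=dict)
    # Languages that are coded with the collective code itself.
    uses_collective: list = field(default_factory=list)


def describe(group):
    """Return a plain-text listing for one collective code."""
    lines = [f"{group.code}  {group.name}"]
    lines.append("  Languages in this group with their own codes:")
    for code, name in sorted(group.own_codes.items()):
        lines.append(f"    {code}  {name}")
    lines.append("  Languages coded with the collective code:")
    for name in sorted(group.uses_collective):
        lines.append(f"    {group.code}  {name}")
    return "\n".join(lines)


# Illustrative sample only; not an authoritative list.
berber = CollectiveGroup(
    code="ber",
    name="Berber (Other)",
    own_codes={"kab": "Kabyle", "tmh": "Tamashek", "zen": "Zenaga"},
    uses_collective=["Rif", "Shilha"],
)

print(describe(berber))
```

The point of keeping the two lists separate is that a reader can see at a
glance which languages are covered by the collective code and which are not.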

We have recently made the MARC language code list available in XML, which
will allow us to transform the list to more expressive syntaxes. Since the
same set of codes is used in MARC, this should help us with the 639-2
documentation. We (i.e. my office at LC) are looking into establishing
registries for some of our controlled lists of values, perhaps using
semantic web technologies, such as RDF/OWL or SKOS. Relationships between
entities in the list are an important feature of these schemes-- e.g.
showing which languages are subparts of another more general name.
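
To make the idea of relationships concrete, here is a small Python sketch
that writes a collective code and its members as SKOS in Turtle, using
skos:broader for the "subpart of" relation. The base URI, the to_turtle
function, and the sample entries are assumptions for illustration only; no
such registry or URI scheme exists yet.

```python
# Hypothetical base URI; a real registry would define its own.
BASE = "http://example.org/iso639-2/"


def to_turtle(collective_code, collective_label, members):
    """Build a Turtle document linking each member to its collective code.

    members maps a language code to its English label.
    """
    lines = [
        "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .",
        f"@prefix ex: <{BASE}> .",
        "",
        f'ex:{collective_code} a skos:Concept ; '
        f'skos:prefLabel "{collective_label}"@en .',
    ]
    for code, label in sorted(members.items()):
        # skos:broader points from the specific language to the group.
        lines.append(
            f'ex:{code} a skos:Concept ; skos:prefLabel "{label}"@en ; '
            f"skos:broader ex:{collective_code} ."
        )
    return "\n".join(lines)


# Illustrative sample only; not an authoritative list.
turtle = to_turtle(
    "ber",
    "Berber languages",
    {"kab": "Kabyle", "tmh": "Tamashek", "zen": "Zenaga"},
)
print(turtle)
```

The same relations could equally be expressed in RDF/OWL; SKOS is used here
only because it has a ready-made broader/narrower vocabulary.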

Obviously we will need to coordinate this work with the ISO 639 as
database effort. If we want to document the individual languages that have
their own codes and are not part of the "remainder group", we may need
some help identifying these. Will SIL be able to help with this sort of

So if we were to change the (Other) to Languages, would that help with the
IETF concerns?


On Thu, 20 Dec 2007, Peter Constable wrote:

> > From: ISO 639 Joint Advisory Committee On Behalf Of Håvard Hjulstad
> > I think that this will have to be an important part of the maintenance
> > of 639-2 (and other defined subsets of 639) in the future. All this
> > needs to be put into place when 639-5 is finished. I think we can live
> > with an "imperfect" solution in the meantime.
> One of the ISO 639 customers with an ISO 639 implementation is IETF
> (IETF Language Tags), and the working group working on revising that
> spec to incorporate 639-3 asked me if I could get this matter resolved
> within the JAC. They have referenced 639-2 in the past, and need to
> continue to do so because the 639-2 collections have been supported.
> (There is the option to change to 639-5 in the future, but that is
> outside the plan of current work.) The instability caused by "(Other)"
> for some 639-2 collections is of concern for them, so they'd like to
> see that change in 639-2 sooner than later. (If it were done now, that
> would allow them to cover the stability issues in this version.)
> Peter