> From: ISO 639 Joint Advisory Committee On Behalf Of
> Milicent K Wewerka
> I have a few comments from the historical perspective since I have been
> involved with the development of 639-2 from the beginning.
Thanks for the very helpful historical review. I'll offer some comments
in response in support of a position that adding S-C to 639-2 is
feasible; I hope in doing so I don't leave an impression that I discount
any of your concerns.
> Parts 639-1 and 639-2 are intended to supply only one choice for a
> particular text or other application.
I believe that intent is largely realized, but as you note there is an
exception in the case of Norwegian; some might argue that
Moldavian/Romanian and Akan/Fanti/Twi are also exceptions.
> Part 639-1 has entries only for individual languages. Part 639-2 has
> entries for some individual languages and collective entries for other
> languages.
I agree that this has been the intent, though I think this too has
exceptions: one of the pending issues to resolve is the treatment of
"qu" (Quechua), which really must be reanalyzed as a macrolanguage or a
collection.
> The addition of Serbo-Croatian to 639-2 would produce confusion in
> selection of a code for Serbian, Croatian, and Bosnian. Some users would
> apply the individual codes; others would apply the code for
> Serbo-Croatian.
We need to distinguish two issues here:
- When someone is tagging an information object, will they have problems
knowing what tag to use?
- When someone is searching for an information object with particular
language properties, will the query retrieve all the items of interest
for them?
I think the first is reasonably straightforward: if it is unclear which
specific variety is used or if the application scenario does not require
distinguishing between specific varieties, then the more generic category
is used; if the specific variety is known and the distinctions are
useful in the application scenario, then use the more specific
categories.
The second is less straightforward: it isn't a difficult problem, but it
does require investment. This issue requires software processes to be
designed so as to provide appropriate matching. E.g., if a user requests
"Serbo-Croatian" items, then items tagged as being in "Serbian" or
"Bosnian" or "Croatian" will match; if the user asks for "Croatian",
then items tagged as "Serbo-Croatian" may or may not be considered a match
depending upon particular application implementation choices (or
possibly a user-selectable option).
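
To make the matching rule concrete, here is a minimal sketch in Python. It
is only an illustration under stated assumptions: the three-letter
identifiers ("hbs" for Serbo-Croatian, "bos", "hrv", "srp") are used for
illustration and are not named anywhere in this message, and the table, the
function name `matches`, and the `broaden` flag are all hypothetical.

```python
# Hypothetical macrolanguage table. The codes are illustrative assumptions;
# the discussion above names the languages, not these identifiers.
MACROLANGUAGE_MEMBERS = {
    "hbs": {"bos", "hrv", "srp"},  # Serbo-Croatian -> Bosnian, Croatian, Serbian
}


def matches(query: str, tag: str, broaden: bool = False) -> bool:
    """Return True if an item tagged `tag` satisfies a request for `query`."""
    # An exact match always succeeds.
    if query == tag:
        return True
    # A request for the macrolanguage retrieves items tagged with any of
    # its individual languages.
    if tag in MACROLANGUAGE_MEMBERS.get(query, set()):
        return True
    # A request for an individual language retrieves items tagged with the
    # enclosing macrolanguage only if the application (or user) opts in.
    if broaden and query in MACROLANGUAGE_MEMBERS.get(tag, set()):
        return True
    return False
```

With this sketch, a request for "hbs" matches an item tagged "hrv"; a
request for "hrv" matches an item tagged "hbs" only when `broaden` is set,
which corresponds to the implementation choice or user-selectable option
mentioned above.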
> The existence of a code for Serbo-Croatian would imply that there are
> some language or dialect forms that are not included in the codes for
> Serbian, Croatian, and Bosnian.
Not necessarily; I'd say no, in fact: Serbo-Croatian would be considered
a macrolanguage, and there would be normative macrolanguage mappings in
639-3 indicating that the macrolanguage Serbo-Croatian encompasses the
three individual languages Bosnian, Croatian and Serbian. I realize that
these mappings are not part of 639-2; but I think we have been looking
ahead toward a point at which the alpha-3 space now divided between
639-2, 639-3 and 639-5 will be treated as a whole.
> The situation of Norwegian, which has been cited as similar, has caused
> difficulties in applying the code in the library world. The Library of
> Congress had to issue an announcement that it would use "nor" and that
> the other codes would not be used.
Presumably any application of ISO 639-2, such as the MARC language code
list (I'm assuming it's reasonable to characterize it as such) can
choose to prohibit the use of any given entry in 639-2 in that
application, including "nor" or Serbo-Croatian.
> For the library community adding a code for Serbo-Croatian in 639-2
> would be problematic.
In view of the above, it's not clear to me to what extent it would be
problematic. Also, I recall Rebecca on some occasion commenting that
librarians had problems with the lack of a code for Serbo-Croatian since
even native speakers could not always determine how to catalogue items.
It would seem that the addition of S-C might solve some problems for
librarians, though in the short term it might raise issues that needed
to be dealt with.
> The best solution seems to be to deprecate the entry for Serbo-Croatian
> in 639-1 as was originally intended.
I don't know to what extent this was discussed when 639-2 was being
developed. I hope it would be considered only with great caution as I
think there is a reasonable likelihood that "sh" may be in current usage
somewhere.
Peter Constable