Dear colleagues,

As you may remember, the issues involved in the question "same or two different languages" have been discussed in the past, but they have to be discussed again in view of
- impact on ISO 639-1
- impact on ISO 639-2
- problems to be solved for further standards.

In view of emerging m-commerce, where speech-to-text and text-to-speech conversion will play a crucial role, the issues discussed here are less of an academic nature and more of a VERY PRACTICAL nature - not only languages have to be identified, but also dialects etc.

I agree with Havard that we need to set/identify rules (even the unavoidably conflicting or mutually contradictory ones) and then establish priorities in 'weighting' the rules. Should we not - after the long and well-proceeding email discussions - plan a meeting, preferably in conjunction with the ISO/TC 37/SCs meetings in Oslo at the end of August?

Best regards

-----Original Message-----
From: ISO 639 Joint Advisory Committee [mailto:[log in to unmask]]On Behalf
Of Håvard Hjulstad
Sent: Tuesday, 01 April 2003 23:56
To: [log in to unmask]
Subject: Re: Boundaries are in the eye of the beholder

I agree 100 % with you, John (see his message below). What this is about is
NOT the concrete examples (Moldavian vs Romanian, Valencian vs Catalan,
Serbo-Croatian vs Serbian vs Croatian vs Bosnian, English vs English, etc.).
What we need to discuss is the rules and the criteria. I am sorry that very
few JAC members have "dared" throw themselves into the discussion. I tried
to trigger a discussion, but it is obviously very difficult to discuss
principles without getting too focussed on details of the examples.

We have the concepts of "individual language", "language group", "language
variant", etc. We have a number of criteria by which to assess what we are
dealing with in each individual case, but we constantly have the same kinds
of problems.

Some of the criteria are:

- Purely linguistic on the level of phonology and morphology. These are
normally fairly straightforward to deal with. It would be possible to
"measure" phonological and morphological differences.

- Writing system, including orthographic principles. A high level of
orthographic stability makes it simpler to "count languages". Unfortunately
many orthographies are quite unstable and/or allow for considerable

- Vocabulary. In some cases neighbouring and closely related
languages/variants have had different cultural influences that may weigh
when we are "measuring" the difference.

- Legal or de-facto regulation. Many languages have some sort of legal
"protection", which also needs to be considered.

- Cultural split or unity. I think this is an important factor, but it is
quite difficult to deal with.

I am sure that we cannot come up with a formula that can be used objectively
to determine whether a "speak" (or a "write") is an "individual language".
But we need to put some effort into the question. Maybe some of our
"individual languages" would end up having "meta-names" as their primary
names, like "Romanian+Moldavian", "Catalan+Valencian+Balear", etc. Both
Ethnologue and Linguasphere have a number of such cases. These "meta-names"
would have to be a separate category, and "real" names would be included in
addition. I am certain that the current list of
identifier+English-name+French-name+indigenous-name needs to be changed.

There are many ways forward. And there are many decisions to be made. Among
them are: (1) How can we improve our criteria for assessing where the
"individual language" boundaries go? (2) Which elements of additional
information are needed to enhance our tables? (Don't think "table"; it is
going to be a database and/or a complex XML structure anyway.)


Håvard Hjulstad    mailto:[log in to unmask]
Chairman ISO/TC37 (Terminology and other language resources)
Convener of ISO/TC37/SC2/WG1 (Language coding)
Acting chairman of ISO 639 RA-JAC
  Solfallsveien 31
  NO-1430  Ås, Norway
  tel: +47 64963684
  fax: +47 64944233
  mob: +47 90145563

-----Original Message-----
From: ISO 639 Joint Advisory Committee [mailto:[log in to unmask]]On Behalf
Of John Clews
Sent: 1 April 2003 23:26
To: [log in to unmask]
Subject: Boundaries are in the eye of the beholder

Boundaries are in the eye of the beholder

Hi all

In message  <[log in to unmask]>
[log in to unmask] writes [Re: A question about a language]:

> John Clews:
> > The official language of Moldova is now of course Romanian, in Latin
> > script, and I imagine that there may be similar cross-border
> > initiatives to allow for standardization while recognizing potential
> > different local uses, as still exist between the Netherlands and
> > Belgium for standardizing the use of Dutch.

I stand by that description, but it's not worth arguing over a lot:
it occurs to me that it's another of these "where's the boundary"
issues, which will never get satisfactorily resolved.

    Many people will say (not just Michael) that the two are
    essentially the same language ...

> To quote the Moldovan constitution, Article 13: Limba de stat a Republicii
> Moldova este limba moldoveneasca, functionind pe baza grafiei latine. (I am
> sure there should be some diacritics here and there.)

... while others, to quote the above will also state that they are

Actually, the constitution only describes the Moldavian language and
doesn't say how Romanian differs (or doesn't differ) from Moldavian.

> I don't see that either John Clews or the JAC has any authority
> over the Moldovan constitution (or over the laws of any region).

Nor did I suggest that JPC or JAC should have any authority beyond
that :-)

This was just an example of a similar thing, which some view

Others are arguably
Scots and English
Valencian and Catalan,
Bosniak and Croatian, and arguably also
Nynorsk and Bokmal.

In relation to the last, certainly when I was in West Norway last
year in the heart of Nynorsk country, the kommune librarian
considered that Nynorsk was essentially the same as Bokmal, while the
Nynorsk enthusiast I met the next day was understandably incensed
when I told him that (and I have to agree with him). The three
Norwegian - Nynorsk - Bokmal codes don't make it any easier for those
who are less familiar with the languages concerned to deal with it,
and I note that the Library of Congress is only applying the code for
Norwegian, if I have my JAC history correct.

Again, I repeat that boundaries are in the eye of the beholder, and
it's almost inevitable that those most familiar with the "dominant"
one of a pair will tend to see mostly similarities, while those most
familiar with the less "dominant" one of the same pair will tend to
see differences as being more significant.

The above isn't a criticism of any approach, just a statement that
boundaries are difficult, and perhaps we should just acknowledge

Best regards

John Clews

John Clews,
Keytempo Limited (Information Management),
8 Avenue Rd, Harrogate, HG2 7PG
Tel:    +44 1423 888 432
mobile: +44 7766 711 395
Email:  [log in to unmask]

Committee Member of ISO/IEC/JTC1/SC22/WG20: Internationalization;
Committee Member of ISO/TC37/SC2/WG1: Language Codes