Christian's email [originally quoted in Re: Walloon: resolution
needed] has raised some very useful points. Most importantly, he has
suggested that it is important to distinguish languages from dialects.

NB: that has prompted me to identify 4 criteria (a) - (d) which
I think can be consistently applied both to existing codes, and
potential new codes, fairly simply.

In my view, any linguistic entity which meets _all_ of criteria (a) -
(d) (described below) should be accepted as a separate language, and
ideally allocated a code, while if fewer than all four are met, the
linguistic entity in question probably should not be accepted as a
separate language, or allocated a code.

If JAC members could look at the discussion below, and think of any
cases where legitimate languages do not meet all of (a) - (d), I'd be
grateful.

> May I reformulate my arguments:
> I did not approve inclusion in ISO 639-1 for the following reasons:
> If we include Walloon, we also have to include all kinds of variants for
> major languages, like
> enUS, enAU, enNZ, enUK etc.
> deAT, deDE, deCH, etc.
> not to mention Chinese, French, etc...
> On the other hand, we have cases, like Bosnian, Slowakian, etc. If
> I am wrong, please correct me.

I think you are wrong, so I am correcting you :-)

Let me explain why. I think that two things are being confused.
Hopefully the criteria proposed below will provide a means of
overcoming this.

Talking principally of written languages, and ignoring spoken
languages, helps solve this problem, and this is the approach taken
in both parts of ISO 639.

Using the proposed criteria, it is possible to distinguish

1. Separate languages, which have
   (a) an established orthography,
   (b) a separate usage,
   (c) a separate language name and
   (d) a body of works using that orthography over a significant
   period of time, against

2. Dialects of any of the above, which don't have all four of
   Criteria (a) - (d).
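
To make the test concrete, the rule can be sketched in a few lines of
Python. This is only an illustration, not part of ISO 639 or any JAC
procedure. The names Entity and classify are hypothetical. The
true/false values for the US English example are illustrative
assumptions: only the lack of a separate language name is taken from
the discussion in this message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """A linguistic entity scored against proposed criteria (a) - (d)."""
    name: str
    established_orthography: bool  # (a)
    separate_usage: bool           # (b)
    separate_name: bool            # (c)
    body_of_works: bool            # (d) works in that orthography over time


def classify(entity: Entity) -> str:
    """Return "language" only if all four criteria are met, else "dialect"."""
    criteria = (
        entity.established_orthography,
        entity.separate_usage,
        entity.separate_name,
        entity.body_of_works,
    )
    return "language" if all(criteria) else "dialect"


# Walloon is stated in this message to meet all of (a) - (d).
walloon = Entity("Walloon", True, True, True, True)

# Illustrative values only: users call it "English", so (c) fails;
# the other flags are assumptions made for the sake of the example.
us_english = Entity("English as used in the US", True, False, False, True)

print(walloon.name, "->", classify(walloon))
print(us_english.name, "->", classify(us_english))
```

The point of using all() is that a single unmet criterion is enough to
keep an entity out of group 1, which is the rule proposed above.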

Taking those criteria, this works so that


1. Bosnian, Slovak, Nynorsk, Bokmaal, and Walloon all fit criteria
   (a) - (d) while (a) - (d) do not apply to group 2 below.


2.1 What you describe as enUS, enAU, enNZ, enUK etc., are all
    described as English by their users, and it is difficult to pick
    out even any language variants from a short sample. Even for
    English as used in the US, the major differences are only some of
    spelling (color/colour etc) and usage (carpark/parking lot) and
    there is no mutual unintelligibility.

    Similarly, what you describe as deAT, deDE, deCH, etc., are all
    described as German by their users, and it is difficult to pick
    out even any language variants from a short sample. Even for
    German as used in Switzerland, the major differences are only some
    of spelling (use of ESSZET/SHARP ESS or not) and usage
    (Kartoffel/Erdapfel, etc) and there is no mutual
    unintelligibility.

> not to mention Chinese, French, etc.

    For Chinese, users of Hakka Chinese, Mandarin Chinese, Cantonese
    Chinese etc. all think of themselves as writing Chinese, and as
    Chinese people. NB: it is normal practice in the People's
    Republic of China to subtitle historical TV dramas etc, so that the
    drama can be followed in whatever part of China viewers are
    watching. They are reading Chinese, even if they speak the same
    written words using different pronunciation and different
    synonyms which predominate in their own (very large) areas.
    Criteria (a) - (d) do not apply here.

    French is a slightly different kettle of fish, but the same
    criteria apply. There are various languages of France, some of
    which have dialects. I would refer you to the JAC document N19
    (February 2002) which lists several different language families.
    It lists the main related _languages_ of metropolitan France as
    Franco-provencal, Occitan, and French, and also lists various
    dialects of each. Criteria (a) - (d) apply to each of
    Franco-provencal, Occitan, and French.

    However, criteria (a) - (d) do not apply to each of the dialects
    listed (see below), though they do apply to at least Walloon.

NB: There is some work to do here, in both ISO 639-1 and ISO 639-2.

Current Occitan and current Franco-provencal need separate codes,
as they currently share only one code. Older Provencal is not used
currently, but has a large written repertoire, and should retain the
code it has. Occitan is much less influenced by Italian than is

For current languages, there should be three codes for three
languages (Occitan, Franco-provencal and French). Users may need
guidance on distinguishing Occitan and Franco-provencal, which may
be done by providing links to sample texts in those languages.


I have not yet looked into the language/dialect status of linguistic
entities related to Occitan (JAC N19 lists Gascon, Languedocian,
Provencal, Auvergnat-Limousin, Alpin Dauphinois) but the (a) - (d)
criteria should be useful in sorting them out.

Similarly, I have not yet looked into the language/dialect status of
linguistic entities related to French (JAC N19 lists as langues d'oil
the entities Franc-Comtois, Walloon, Picard, Norman,
Poitevin-Saintongeais, Bourguignon-Morvandiau, and Lorraine). Again,
the (a) - (d) criteria should be useful in sorting them out.

Walloon is listed in 1, not in 2, as it meets criteria (a) - (d).
The JAC's recent decision on Walloon also fits in with this.

But anyway using (a) to (d) as criteria should enable the JAC to
apply consistent benchmarks that also fit in with existing practice
of ISO 639, ISO 639-2 and the various registrations already made by
both RAs and the JAC.

Are there any problems with that? I'd be glad to see comments.

Best regards

John Clews

