I worked through the next draft code table for ISO 639-3 today, updating the previous draft to incorporate the revisions for the 15th edition of Ethnologue (soon to go to press), the proposed resolution of items in the “Issues” document, and some existing usage of alpha-3 codes in RFC 3066.
Along the way, I discovered one typo in my disposition of Milicent’s comments doc: for 5.45, I say I accept her comments, but then say the proposed solution will be changed to 2, which is what it had been before. It should have said “1”.
Also along the way, working through the new Ethnologue data had a few implications for some of the proposed solutions. In 2 or 3 cases where a macrolanguage was proposed, Ethnologue had a new split that resulted in bumping up the number of constituent-member languages by 1 or 2. In one case (Dogri / Kangri), the split necessitated adding one more “issue” and proposing another macrolanguage. That’s been added at the end of section 5 of the “Issues” document. Also, I discovered a problem involving overlap between the proposed solutions for Rajasthani and Marwari, which led me to make some adjustments there.
Anyway, I’ve made this handful of changes to the “Issues” document. The new version is available at the same location: http://scripts.sil.org/PCUnicodeDocs (link at the bottom of the page).
You might be interested in some stats related to the code table:
The database Gary and I have been working from has 17576 rows.
ISO 639-2/T codes: 471
Distinct ISO 639-2/B codes: 23
ISO 639-2 local-use codes: 520
SIL additions: 6991
Linguist List additions: 226
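As a rough sanity check on the figures above, here is a minimal Python sketch. It shows that 17576 is the number of possible three-letter lowercase identifiers (26 cubed) and that 520 is the size of the qaa to qtz range that ISO 639-2 reserves for local use. The variable names are made up for illustration, and treating the five listed categories as disjoint sets of rows is an assumption, not something stated in the table itself.

```python
from itertools import product
from string import ascii_lowercase

# Every possible three-letter lowercase identifier, aaa through zzz.
all_codes = ["".join(p) for p in product(ascii_lowercase, repeat=3)]

# ISO 639-2 reserves the range qaa through qtz for local use.
local_use = [c for c in all_codes if "qaa" <= c <= "qtz"]

# Counts quoted above. Assumption: the categories do not overlap.
listed = {
    "ISO 639-2/T": 471,
    "distinct ISO 639-2/B": 23,
    "ISO 639-2 local use": 520,
    "SIL additions": 6991,
    "Linguist List additions": 226,
}

# Rows not covered by any listed category (under the assumption above).
unlisted = len(all_codes) - sum(listed.values())

print(len(all_codes), len(local_use), unlisted)
```

Under that assumption, a little over half of the three-letter code space is not accounted for by the listed categories.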
ISO 639-3 will include 7561 individual language entries, 56 macrolanguages, and 364 macrolanguage mappings.
There are 68 collections; these will not be included in ISO 639-3. Of these, 56 have names in ISO 639-2 that look like collections, and 12 are items that ISO 639-2 names like individual languages. Of those 12, one (“North American Indian”) is clearly intended to be a collection but is not named like one. For the other 11, nothing in ISO 639-2 suggests they are collections; they have been reanalyzed as collections (or at least that’s the pending proposal).
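The collection counts can be cross-checked with a few lines of Python. This is only an arithmetic sketch with made-up variable names. It also assumes that the 56 macrolanguages are counted separately from the 7561 individual language entries, which the figures suggest but do not state outright.

```python
# Figures quoted in the message.
individual_languages = 7561
macrolanguages = 56
collections = 68

# Collections split by how ISO 639-2 names them.
named_like_collections = 56
named_like_languages = 12
clearly_intended_as_collection = 1   # "North American Indian"
reanalyzed_as_collections = 11

# The two breakdowns should each add up to their parent figure.
breakdown_ok = (
    named_like_collections + named_like_languages == collections
    and clearly_intended_as_collection + reanalyzed_as_collections
    == named_like_languages
)

# Assumption: individual languages and macrolanguages are separate entries.
iso_639_3_entries = individual_languages + macrolanguages

print(breakdown_ok, iso_639_3_entries)
```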
Aligning Ethnologue with ISO 639 has required changing 732 Ethnologue three-letter codes. That figure assumes certain open issues in ISO 639 are resolved in particular ways, and it is based on the Ethnologue 15th edition, which is soon to go to the publisher.