BCP 47 offers a lot of options, some more useful than others.  We do need best practices for applying it to library metadata.  I created a draft of some practices last year:




It is time for a standards body (PCC?) to take up the issue and move it forward.



Joe Kiegel



From: Bibliographic Framework Transition Initiative Forum [mailto:[log in to unmask]] On Behalf Of Hakala, Juha E
Sent: Sunday, December 17, 2017 11:13 PM
To: [log in to unmask]
Subject: Re: [BIBFRAME] CC:AAM Statement in Support of the Internationalization of BIBFRAME




Since BCP 47 has been published as RFC 5646, it is already included as a code source in http://www.loc.gov/standards/sourcelist/language.html. To make this clear, the source list could be modified to say “A language identifier as specified by the Internet Engineering Task Force Best Current Practice 47 (RFC 5646)”.


In principle it is quite simple to use ISO 639, and especially ISO 639-2, for cataloguing (being a member of ISO 639/JAC, I know only too well that maintaining ISO 639-2 can be difficult at times). But if libraries start using BCP 47 instead of or in addition to ISO 639, it might be necessary to provide guidelines on how it should be used and how the resulting codes should be indexed. It is easy for library systems to deal with ISO 639-2 codes, but the full functionality of BCP 47 is not trivial to support well. There are situations in which it would be useful to specify the region where the language variant is spoken and/or the script in which the text is written, but decoding the resulting BCP 47 tags for OPAC display could be a challenge. Some examples of language tags from BCP 47:


en-US represents American English


de-CH represents Swiss German; note that the existing ISO 639-2 code gsw for Schwiizerdütsch does not need to be used


de-CH-1901 means Swiss German using the 1901 variant orthography


Private extensions are possible with “x-”:


de-CH-x-phonebk might mean Swiss German used in phone books.


Mandarin Chinese can be encoded in two ways: cmn (the code for Mandarin from ISO 639-3) or zh-cmn (the macrolanguage code zh for Chinese from ISO 639-1, followed by the ISO 639-3 code cmn as an extended language subtag). With script and region information added, the BCP 47 tag might become zh-cmn-Hans-CN (Chinese, Mandarin, Simplified script, as used in China).


es-419 represents Spanish as used in the UN-defined Latin America and Caribbean region


en-scotland-fonipa represents a text in a Scottish dialect of English, written in the International Phonetic Alphabet.


sr-Latn-RS represents Serbian (sr) written in Latin script (Latn) as used in Serbia (RS). Note that the two-letter language code from ISO 639-1 must be used if one exists.


sl-rozaj-biske-1994 represents the San Giorgio dialect of the Resian dialect of Slovenian, in the standardized Resian orthography
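To illustrate why decoding such tags is harder than looking up a three-letter code, here is a minimal Python sketch that splits a tag into its subtags. It is an assumption-laden simplification, not a full RFC 5646 parser: the names TAG_RE and parse_tag are hypothetical, only one extended language subtag is accepted, and grandfathered tags and extensions other than "x-" private use are not handled.

```python
import re

# Simplified pattern for common BCP 47 tag shapes (assumption: no
# grandfathered tags, at most one extlang, no extensions except "x-").
TAG_RE = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<extlang>[a-z]{3}))?"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)"
    r"(?:-x-(?P<private>[a-z0-9]{1,8}(?:-[a-z0-9]{1,8})*))?$",
    re.IGNORECASE,
)


def parse_tag(tag):
    """Split a language tag into named subtags, normalizing case."""
    m = TAG_RE.match(tag)
    if m is None:
        raise ValueError(f"unsupported or malformed tag: {tag!r}")
    return {
        "language": m.group("language").lower(),
        "extlang": m.group("extlang").lower() if m.group("extlang") else None,
        "script": m.group("script").title() if m.group("script") else None,
        "region": m.group("region").upper() if m.group("region") else None,
        # The variants group is a run like "-rozaj-biske-1994" (or empty).
        "variants": [v.lower() for v in m.group("variants").split("-") if v],
        "private": m.group("private").lower() if m.group("private") else None,
    }


if __name__ == "__main__":
    for example in ("en-US", "de-CH-1901", "zh-cmn-Hans-CN", "sl-rozaj-biske-1994"):
        print(example, parse_tag(example))
```

A display layer would still need the IANA Language Subtag Registry to turn each subtag into a human-readable label; the sketch only shows the structural step.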


Using BCP 47 at its simplest level (two-letter codes from ISO 639-1) would not bring much added value over the current ISO 639-2 usage. Having both two- and three-letter language codes in MARC records might be confusing. If and when BCP 47 is used, it should be used where it provides added value, for instance to describe a resource written in a dialect that cannot be described in ISO 639 (note that ISO 639-6, which was supposed to cover dialects, has been discontinued) or in a script that is not typical for the language or the region. But just asking cataloguers to start using BCP 47 without any additional guidelines may not be the ideal solution, given the rich functionality supported by the specification.
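One possible indexing policy, sketched in Python purely as an illustration: index every tag twice, once under a coarse ISO 639-2/B key so existing language limits keep working, and once under the full tag. The function names and the policy itself are assumptions, not something any guideline prescribes, and the mapping table is a small illustrative subset of ISO 639-1 to ISO 639-2/B.

```python
# Illustrative subset only: ISO 639-1 code -> ISO 639-2/B code.
ISO639_1_TO_2B = {
    "de": "ger",
    "en": "eng",
    "es": "spa",
    "sl": "slv",
    "sr": "srp",
    "zh": "chi",
}


def index_keys(tag):
    """Return (coarse ISO 639-2/B key or None, normalized full tag).

    The coarse key is None when the primary subtag is not in the
    illustrative table; a real system would need a complete mapping.
    """
    primary = tag.split("-")[0].lower()
    return ISO639_1_TO_2B.get(primary), tag.lower()


def adds_value(tag):
    """True if the tag carries more than a bare primary language subtag."""
    return len(tag.split("-")) > 1


if __name__ == "__main__":
    for example in ("en", "de-CH-1901", "sr-Latn-RS"):
        print(example, index_keys(example), adds_value(example))
```

Under this sketch a bare "en" adds nothing beyond "eng", while "sr-Latn-RS" is indexed under "srp" and also keeps its script and region information.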


Best regards,





From: Bibliographic Framework Transition Initiative Forum [mailto:[log in to unmask]] On Behalf Of Andrew Cunningham
Sent: 18 December 2017 2:09
To: [log in to unmask]
Subject: Re: [BIBFRAME] CC:AAM Statement in Support of the Internationalization of BIBFRAME




Using ISO 639-1, ISO 639-2/B or ISO 639-3 alone is insufficient.


BCP 47 includes subtags for language, script, region and variant, as well as a series of extension mechanisms such as -t- sequences.


BCP 47 also addresses duplication between ISO 639-1, ISO 639-2/T and ISO 639-3.

On Saturday, 16 December 2017, Rebecca Guenther <[log in to unmask]> wrote:

The need to be able to use ISO 639-1 and 639-3 language codes was recognized by the MARC community some time ago (actually in 2001), and there is a mechanism to record them in field 041, e.g.


041 07 $a en $a fr $2 iso639-1


I’m not sure how widely implemented it is. Subfield $2 contains the source of the code and the sources are in the list at http://www.loc.gov/standards/sourcelist/language.html. At the time, the RFCs were being used (and as you can see from the list of source codes, they were revised a lot). BCP 47 could be added to the source code list so that it could be used in MARC records for now.
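As a rough illustration of how a system might read such a field, the following Python sketch parses the display form of the 041 example above and pairs the $a codes with the source named in $2. The function names are hypothetical, and the input is assumed to be the "$"-delimited display notation used in this message, not a binary ISO 2709 record.

```python
def parse_marc_line(line):
    """Split a display-form MARC line into tag, indicators and subfields."""
    head, _, rest = line.partition("$")
    tag, indicators = head.split()
    subfields = []
    for chunk in rest.split("$"):
        # First character is the subfield code, the remainder is its value.
        subfields.append((chunk[0], chunk[1:].strip()))
    return tag, indicators, subfields


def language_codes(line):
    """Return the code source from $2 and the language codes from $a."""
    tag, _indicators, subfields = parse_marc_line(line)
    if tag != "041":
        raise ValueError(f"expected field 041, got {tag}")
    source = next((value for code, value in subfields if code == "2"), None)
    codes = [value for code, value in subfields if code == "a"]
    return {"source": source, "codes": codes}


if __name__ == "__main__":
    print(language_codes("041 07 $a en $a fr $2 iso639-1"))
```

If BCP 47 were used in the same way, only the $2 value and the shape of the $a values would change; the structure of the field would stay the same.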




Rebecca Squire Guenther
215 W. 75th St. Apt. 16H
New York, NY 10023

On Dec 14, 2017, at 2:48 AM, Osma Suominen <[log in to unmask]> wrote:


Most existing MARC data incorporates use of the language codes found in ISO 639-2/B. While the codes in this standard are useful, it may be necessary in implementation to accommodate the codes from ISO 639-1 (2-letter codes) and ISO 639-3 as well.

Adhering to BCP 47 would already provide a mechanism for expressing ISO 639-1 and 639-3 language codes, so I don't see why the implementation-level consideration is not simply to use BCP 47 as was already stated under General considerations.


Robert J. Rendall wrote on 13.12.2017 at 22:30:

Colleagues -
The ALA/ALCTS Committee on Cataloging: Asian and African Materials (CC:AAM) has voted to approve a Statement in Support of the Internationalization of BIBFRAME, containing recommendations on character encoding, the representation of original script and romanization, normalization, and language tags:
Robert Rendall
Chair, CC:AAM 2017-2018
Robert Rendall
Principal Serials Cataloger
Original and Special Materials Cataloging, Columbia University Libraries
102 Butler Library, 535 West 114th Street, New York, NY 10027
tel.: 212 851 2449  fax: 212 854 5167

Osma Suominen
D.Sc. (Tech), Information Systems Specialist
National Library of Finland
P.O. Box 26 (Kaikukatu 4)
Tel. +358 50 3199529



Andrew Cunningham
