Hi Joseph,

There are two approaches to tagging transliterated strings:

1) Using a variant subtag. Currently only alalc97 has been registered, but
others could potentially be registered; or

2) Using the -t- extension, i.e. -m0-alaloc-<date>.

But both approaches really require detailed versioning information for the
romanisation tables.
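
To make the two tag shapes concrete, here is a minimal Python sketch of how they might be assembled. The function names, the Russian example, and the "2012" date are my own placeholders for illustration, not values taken from any registry, and the check is only a rough approximation of the subtag length rule in RFC 6497:

```python
import re

# A subtag inside a -t- field value is 3 to 8 alphanumerics (RFC 6497).
_FIELD_SUBTAG = re.compile(r"^[A-Za-z0-9]{3,8}$")


def variant_tag(language, target_script="Latn", variant="alalc97"):
    """Approach 1: mark the romanisation scheme with a variant subtag."""
    return f"{language.lower()}-{target_script.title()}-{variant.lower()}"


def t_extension_tag(language, source_script, mechanism="alaloc",
                    table_date=None, target_script="Latn"):
    """Approach 2: use the -t- extension with an m0 (mechanism) field.

    table_date is a hypothetical version marker for the romanisation table.
    """
    field = [mechanism] + ([table_date] if table_date else [])
    for subtag in field:
        if not _FIELD_SUBTAG.match(subtag):
            raise ValueError(f"not a valid field subtag: {subtag!r}")
    parts = [
        language.lower(), target_script.title(),  # what the string is now
        "t",
        language.lower(), source_script.lower(),  # what it was transformed from
        "m0",
    ] + [s.lower() for s in field]
    return "-".join(parts)


print(variant_tag("ru"))
print(t_extension_tag("ru", "Cyrl", table_date="2012"))
```

Under these assumptions the first call gives ru-Latn-alalc97 and the second gives ru-Latn-t-ru-cyrl-m0-alaloc-2012. Only the second form has an obvious slot for a table version, which is why the versioning question matters.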


On 11 January 2018 at 09:19, Joseph Kiegel wrote:
> - 2-char or 3-char codes, or BCP47? Which are used, and under what
> The larger goal is to make library data play well on the web. Thus we
> should use internet standards, which in this case is BCP47. In my view, “2
> character vs 3 character codes” is not the right question. For example,
> BCP47 uses both 2 and 3 letter codes, so it is not a matter of doing it one
> way or the other.
> BCP47 generally works well for library data but is not completely suited
> for our needs. The most significant gap concerns language tagging of
> transliterated strings. BCP47 has a code for ALA/LC romanization tables
> (alalc97) but it is for an old version and does not reflect how tables are
> now maintained online. Aside from pinyin, there does not seem to be support
> for other romanization schemes used in libraries today, e.g. French or
> German romanization of Cyrillic, or transliteration into non-roman scripts.
> This issue requires some thought and then working with internet standard
> setters to improve BCP47.