Hi Joseph,

There are two approaches to tagging transliterated strings:

1) Using a variant subtag. Currently only alalc97 has been registered, but others could potentially be registered; or

2) Using the -t- extension with a mechanism field, i.e. -m0-alaloc-<date>

But both approaches really require detailed versioning information for the romanisation tables.
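
To make the two options concrete, here is a rough sketch of how such tags could be assembled. The helper names and the "2012" date are made up for illustration and do not refer to any actual version of the ALA-LC tables; the tag shapes follow my reading of BCP47 and the -t- extension (RFC 6497), so treat them as an assumption rather than a recommendation.

```python
def variant_tag(lang, script="Latn", variant="alalc97"):
    """Option 1: language-script-variant, e.g. ru-Latn-alalc97."""
    return f"{lang}-{script}-{variant}"


def t_extension_tag(lang, source_script, mechanism="alaloc",
                    date=None, target_script="Latn"):
    """Option 2: the -t- extension with an m0 (mechanism) field.

    The source language tag after -t- is written in lowercase.
    `date` is an optional version date for the romanisation table.
    """
    source = f"{lang}-{source_script}".lower()
    fields = ["m0", mechanism]
    if date:
        fields.append(date)
    return f"{lang}-{target_script}-t-{source}-" + "-".join(fields)


if __name__ == "__main__":
    print(variant_tag("ru"))
    # "2012" is a hypothetical version date, for illustration only.
    print(t_extension_tag("ru", "Cyrl", date="2012"))
```

Note that only the second form has a slot for the table version, which is why the versioning question matters for both approaches.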

Andrew

On 11 January 2018 at 09:19, Joseph Kiegel wrote:
>
> - 2-char or 3-char codes, or BCP47? Which are used, and under what circumstances?
>
> The larger goal is to make library data play well on the web. Thus we should use internet standards, which in this case means BCP47. In my view, “2 character vs 3 character codes” is not the right question. For example, BCP47 uses both 2 and 3 letter codes, so it is not a matter of doing it one way or the other.
>
> BCP47 generally works well for library data but is not completely suited for our needs. The most significant gap concerns language tagging of transliterated strings. BCP47 has a code for ALA/LC romanization tables (alalc97) but it is for an old version and does not reflect how tables are now maintained online. Aside from pinyin, there does not seem to be support for other romanization schemes used in libraries today, e.g. French or German romanization of Cyrillic, or transliteration into non-roman scripts. This issue requires some thought and then working with internet standard setters to improve BCP47.
>
>