Hi Karen

On 1 July 2014 06:51, Karen Coyle wrote:

>  Although I also have regularly encountered two-character tags in RDF
> statements, the RDF concepts document [1] clearly does not preclude the use
> of 3-character tags or even complex tags like "zh-yue" or
> "tlh-Kore-AQ-fonipa" (phonetic transcription of Klingon using Korean script
> :-)).
In BCP-47 terms it should be "yue" rather than "zh-yue".

As for tlh-Kore-AQ-fonipa, you could not have a document that simultaneously
uses the -Kore and -fonipa subtags:

tlh-Latn-AQ-fonipa or tlh-Kore-AQ, but not tlh-Kore-AQ-fonipa.
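
To make the subtag structure concrete, here is a minimal Python sketch that
splits a tag into its language, extlang, script, region and variant subtags.
The pattern is a simplification of the BCP 47 syntax (one extlang at most, no
extensions, private-use or grandfathered tags), and the name split_tag is my
own. It only checks the shape of a tag. It does not consult the subtag
registry, so it cannot tell you whether a combination such as -Kore with
-fonipa makes sense.

```python
import re

# Simplified BCP 47 shape: language[-extlang][-script][-region][-variant...]
# Assumption: extensions, private-use and grandfathered tags are out of scope.
_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<extlang>[A-Za-z]{3}))?"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}))*)$"
)


def split_tag(tag):
    """Split a tag into subtags, using the conventional casing for each."""
    match = _TAG.match(tag)
    if match is None:
        raise ValueError(f"not recognised by this simplified pattern: {tag!r}")
    parts = match.groupdict()
    return {
        "language": parts["language"].lower(),
        "extlang": parts["extlang"].lower() if parts["extlang"] else None,
        "script": parts["script"].title() if parts["script"] else None,
        "region": parts["region"].upper() if parts["region"] else None,
        "variants": [v.lower() for v in parts["variants"].split("-") if v],
    }


for tag in ("yue", "zh-yue", "tlh-Latn-AQ-fonipa", "tlh-Kore-AQ"):
    print(tag, split_tag(tag))
```

A real application would check each subtag against the IANA Language Subtag
Registry rather than rely on shape alone.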

The biggest problem with library data is actually romanisations and the
inability to tag romanisation data according to the romanisation scheme
being used. For most cases that is

> The RDF document states that any valid language tag (referring to the
> relevant IETF doc, BCP47 [2]) can be used. That IETF document instructs one
> to tag languages at the level at which the information is useful, but not
> beyond. That obviously makes good sense. The fact is that there are
> languages (MANY!) that have no 2-letter code, at which point a three-letter
> code, or a tag and subtag, must be used. I suspect that the prevalence of
> two-letter codes has to do with who is providing linked data. Stats,
> however, show that some three-letter codes are being used. [3]
The key is "valid language tag" by BCP47 definition.

And where a language has both, BCP47 gives preference to the two-letter
code rather than one of the three-letter codes.
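
As a rough illustration of that preference, the Python sketch below maps a
three-letter code to its two-letter equivalent where one exists and leaves
other codes alone. The table is a tiny hand-picked excerpt for illustration,
not the full ISO 639 mapping, and the names THREE_TO_TWO and shortest_code are
my own.

```python
# Illustrative excerpt only: a real table would cover all of ISO 639-1.
# Both bibliographic and terminologic three-letter forms are listed.
THREE_TO_TWO = {
    "ara": "ar",
    "eng": "en",
    "fre": "fr",
    "fra": "fr",
    "ger": "de",
    "deu": "de",
    "chi": "zh",
    "zho": "zh",
}


def shortest_code(code):
    """Return the two-letter code if this excerpt knows one, else the input."""
    code = code.lower()
    return THREE_TO_TWO.get(code, code)


print(shortest_code("ara"))  # two-letter form exists
print(shortest_code("tlh"))  # no two-letter form, stays three-letter
```

Codes such as "tlh" or "yue" have no two-letter form, so they pass through
unchanged.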

As you indicate, tags should be short and indicate only what needs to be
indicated. E.g.
The language tag for Arabic would be "ar" (three-letter codes would only
be needed to distinguish between colloquial varieties of Arabic; the "ar"
tag is a sufficient identifier for Modern Standard Arabic written in the
Arabic script).

A language tag for romanised Arabic based on the ALA-LC romanisation
tables as published in 1997 would be ar-Latn-alalc97.

It is not possible to construct a language tag for the current ALA-LC
Arabic romanisation scheme, since there is no appropriate subtag registered.
A language tag ar-Latn ... is insufficient, since there are many widely
different romanisation schemes for Arabic and the language tag does not
have enough specificity.
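
The three levels of specificity for Arabic can be sketched in Python as
follows. The helper names make_tag and tagged_literal are my own, the
helpers do no registry validation, and the literal is written in the
"text"@tag form used by Turtle. The sample word "kitāb" ("book") is only
there to have something to tag.

```python
def make_tag(language, script=None, variants=()):
    """Join subtags into a tag, using the conventional casing for each."""
    parts = [language.lower()]
    if script:
        parts.append(script.title())
    parts.extend(v.lower() for v in variants)
    return "-".join(parts)


def tagged_literal(text, tag):
    """Format a language-tagged string literal in Turtle-style syntax."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{tag}'


arabic_script = make_tag("ar")                       # ar
any_romanisation = make_tag("ar", "Latn")            # ar-Latn
alalc_1997 = make_tag("ar", "Latn", ["alalc97"])     # ar-Latn-alalc97

print(tagged_literal("kitāb", alalc_1997))  # "kitāb"@ar-Latn-alalc97
```

Only the last tag says which romanisation scheme was used. Data romanised
under a scheme with no registered variant subtag can go no further than
ar-Latn.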

> kc
> [1]
> [2]
> [3]
> On 6/30/14, 11:41 AM, Simon Spero wrote:
> This falls under the general problem of the use of strings instead of
> IRIs; different forms of code that are associated with the same "language"
> could be associated with an IRI referring to that "language".
> Alternatively, two Identifiers could be declared and asserted to be
> sameAs, but that approach is more complicated.
> Simon
> "Language" left unpacked to avoid issues of extended language tags
> On Jun 29, 2014 4:26 PM, "Stuart Yeates" wrote:
>> On 06/28/2014 01:25 AM, Jody L. DeRidder wrote:
>>> I just saw this posted on Twitter.
>>> Rob Sanderson is concerned about the ways in which Bibframe does NOT
>>> work in the linked data environment, and is trying to effectively
>>> communicate the issues.  He's asking for feedback:
>> My biggest issue (that's not covered in the doc, but which I've already
>> fed to the doc's authors) is that BIBFRAME mandates three-letter language
>> codes, where available, while core RDA mandates two-letter language codes,
>> where available.
>> This requires every app that wants to interoperate BIBFRAME with
>> anything else (and indeed any app that wants to compare BIBFRAME language
>> codes with the language codes on RDF plain-text labels) to have extensive
>> lookup tables.
>> cheers
>> stuart
> --
> Karen [log in to unmask]
> m: 1-510-435-8234
> skype: kcoylenet

Andrew Cunningham
Project Manager, Research and Development (Social and Digital Inclusion)
Public Libraries and Community Engagement
State Library of Victoria
328 Swanston Street
Melbourne VIC 3000

Ph: +61-3-8664-7430
Mobile: 0459 806 589