On 6/30/14, 7:40 PM, Andrew Cunningham wrote:
> Hi Karen
>
> On 1 July 2014 06:51, Karen Coyle <[log in to unmask]> wrote:
>
>     Although I also have regularly encountered two-character tags in
>     RDF statements, the RDF concepts document [1] clearly does not
>     preclude the use of 3-character tags or even complex tags like
>     "zh-yue" or "tlh-Kore-AQ-fonipa" (phonetic transcription of
>     Klingon using Korean script :-)).
>
>
> In BCP-47 terms it should be "yue" rather than "zh-yue"
>
> As for tlh-Kore-AQ-fonipa, you could not have a document that is 
> simultaneously using the -Kore and -fonipa subtags:
>
> tlh-Latn-AQ-fonipa or tlh-Kore-AQ, but not tlh-Kore-AQ-fonipa

Andrew, these are all examples from the IETF document.

kc
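
To make the structure of the tags under discussion concrete, here is a minimal Python sketch that splits a tag into language, extlang, script, region and variant subtags purely by the shape of each subtag. The function name `split_tag` is invented for illustration. It assumes a simple tag with no extensions, private-use parts or grandfathered forms, and it does not consult the IANA subtag registry, so it says nothing about whether a combination is sensible.

```python
import re

def split_tag(tag):
    """Classify the subtags of a simple BCP 47 tag by shape only.

    Assumption: no extension, private-use or grandfathered forms,
    and no lookup in the IANA subtag registry.
    """
    parts = tag.split("-")
    result = {"language": None, "extlang": [], "script": None,
              "region": None, "variants": []}

    # Primary language subtag: two or three letters.
    if not re.fullmatch(r"[A-Za-z]{2,3}", parts[0]):
        raise ValueError("unsupported primary language subtag: " + parts[0])
    result["language"] = parts[0].lower()
    i = 1

    # Extended language subtags: up to three 3-letter subtags.
    while (i < len(parts) and len(result["extlang"]) < 3
           and re.fullmatch(r"[A-Za-z]{3}", parts[i])):
        result["extlang"].append(parts[i].lower())
        i += 1

    # Script: four letters, conventionally title case.
    if i < len(parts) and re.fullmatch(r"[A-Za-z]{4}", parts[i]):
        result["script"] = parts[i].title()
        i += 1

    # Region: two letters or three digits, conventionally upper case.
    if i < len(parts) and re.fullmatch(r"[A-Za-z]{2}|[0-9]{3}", parts[i]):
        result["region"] = parts[i].upper()
        i += 1

    # Variants: 5-8 alphanumerics, or 4 characters starting with a digit.
    while (i < len(parts)
           and re.fullmatch(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}", parts[i])):
        result["variants"].append(parts[i].lower())
        i += 1

    if i != len(parts):
        raise ValueError("unrecognised subtag: " + parts[i])
    return result

print(split_tag("tlh-Kore-AQ-fonipa"))
print(split_tag("zh-yue"))
```

In "zh-yue" the "yue" part is picked up as an extended language subtag, while a bare "yue" is a primary language subtag on its own.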

>
> The biggest problem with library data is actually romanisations and 
> the inability to tag romanisation data according to the romanisation 
> scheme being used. For most cases that is
>
>     The RDF document states that any valid language tag (referring to
>     the relevant IETF doc, BCP47 [2]) can be used. That IETF document
>     instructs one to tag languages at the level at which the
>     information is useful, but not beyond. That obviously makes good
>     sense. The fact is that there are languages (MANY!) that have no
>     2-letter code, at which point a three-letter code, or a tag and
>     subtag, must be used. I suspect that the prevalence of two-letter
>     codes has to do with who is providing linked data. Stats, however,
>     show that some three-letter codes are being used. [3]
>
>
> The key is "valid language tag" by BCP47 definition.
>
> And BCP47 gives a preference for the two letter code, rather than one 
> of the three letter codes.
>
> The tags, as you indicate, should be short and indicate only what 
> needs to be indicated. E.g. the language tag for Arabic would be "ar" 
> (three-letter codes would only be needed to distinguish between 
> colloquial varieties of Arabic; the "ar" tag is a sufficient 
> identifier for Modern Standard Arabic written in the Arabic script).
>
> A language tag for romanised Arabic based on the ALA-LC romanisation 
> tables as published in 1997 would be ar-Latn-alalc97
>
> It is not possible to construct a language tag for the current ALA-LC 
> Arabic romanisation scheme, since there is no appropriate subtag 
> registered. A language tag ar-Latn ... is insufficient, since there are 
> many widely different romanisation schemes for Arabic and the 
> language tag does not have enough specificity.
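
The specificity point can be illustrated with the prefix matching described in RFC 4647 (basic filtering), where a language range matches a tag if the two are equal or the tag begins with the range followed by a hyphen. This is a minimal Python sketch; the function name `basic_filter` and the sample tag list are invented for illustration and are not taken from any real data set.

```python
def basic_filter(language_range, tag):
    """RFC 4647 basic filtering: case-insensitive prefix match on whole subtags."""
    r = language_range.lower()
    t = tag.lower()
    return r == "*" or t == r or t.startswith(r + "-")

# Hypothetical tags attached to three title strings in a record.
tags = ["ar", "ar-Latn", "ar-Latn-alalc97"]

# Asking for "ar-Latn" returns every romanised string, whatever the scheme.
print([t for t in tags if basic_filter("ar-Latn", t)])

# Only the variant subtag singles out the 1997 ALA-LC romanisation.
print([t for t in tags if basic_filter("ar-Latn-alalc97", t)])
```

A string tagged only "ar-Latn" can never be selected by scheme, which is why a missing variant subtag for a romanisation scheme cannot be worked around with the shorter tag.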
>
>
>
>
>     kc
>
>     [1] http://www.w3.org/TR/rdf11-concepts/
>     [2] http://tools.ietf.org/html/bcp47
>     [3] http://stats.lod2.eu/languages
>
>
>     On 6/30/14, 11:41 AM, Simon Spero wrote:
>>
>>     This falls under the general problem of the use of strings
>>     instead of IRIs; different forms of code that are associated with
>>     the same "language" could be associated with an IRI referring to
>>     that "language".
>>
>>     Alternatively, two identifiers could be declared and asserted to
>>     be sameAs, but that approach is more complicated.
>>
>>     Simon
>>     "Language" left unpacked to avoid issues of extended language tags
>>
>>     On Jun 29, 2014 4:26 PM, "Stuart Yeates" <[log in to unmask]> wrote:
>>
>>         On 06/28/2014 01:25 AM, Jody L. DeRidder wrote:
>>
>>             I just saw this posted on Twitter.
>>
>>             Rob Sanderson is concerned about the ways in which
>>             Bibframe does NOT
>>             work in the linked data environment, and is trying to
>>             effectively
>>             communicate the issues.  He's asking for feedback:
>>
>>             https://docs.google.com/document/d/1yyVKeYQkBucZqSoQ2qY17vrER46-S6Tw6lY8uqA5xxQ/edit#heading=h.sp1548qks85h
>>
>>
>>         My biggest issue (that's not covered in the doc, but which
>>         I've already fed to the doc's authors) is that BIBFRAME
>>         mandates three-letter language codes, where available, while
>>         core RDA mandates two-letter language codes, where available.
>>
>>         This requires every app that wants to interoperate BIBFRAME
>>         with anything else (and indeed any app that wants to compare
>>         BIBFRAME language codes with the language codes on RDF
>>         plain-text labels) to have extensive lookup tables.
>>
>>         cheers
>>         stuart
>>
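
The kind of lookup table described above can be sketched in a few lines of Python. The table here is a tiny hand-picked excerpt of ISO 639-2 codes (both bibliographic and terminology forms) mapped to their ISO 639-1 equivalents; a real application would need the full code list. The names `THREE_TO_TWO`, `to_shortest_code` and `same_language` are invented for illustration.

```python
# Tiny excerpt only: ISO 639-2 three-letter codes -> ISO 639-1 two-letter codes.
# Some languages have two three-letter forms (bibliographic and terminology).
THREE_TO_TWO = {
    "eng": "en",
    "fre": "fr", "fra": "fr",
    "ger": "de", "deu": "de",
    "ara": "ar",
    "chi": "zh", "zho": "zh",
    "mao": "mi", "mri": "mi",
}

def to_shortest_code(code):
    """Return the two-letter code where the excerpt has one, else the code unchanged."""
    code = code.lower()
    return THREE_TO_TWO.get(code, code)

def same_language(a, b):
    """Compare two language codes after normalising both to the shortest form."""
    return to_shortest_code(a) == to_shortest_code(b)

print(same_language("fre", "fr"))
print(same_language("mao", "mri"))
print(to_shortest_code("tlh"))
```

Codes with no two-letter equivalent, such as "tlh", pass through unchanged. A code that is simply missing from the excerpt also passes through, so an incomplete table fails silently rather than loudly.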
>
>     -- 
>     Karen Coyle
>     [log in to unmask]  <mailto:[log in to unmask]>  http://kcoyle.net
>     m:1-510-435-8234  <tel:1-510-435-8234>
>     skype: kcoylenet
>
>
>
>
> -- 
> Andrew Cunningham
> Project Manager, Research and Development (Social and Digital Inclusion)
> Public Libraries and Community Engagement
> State Library of Victoria
> 328 Swanston Street
> Melbourne VIC 3000
> Australia
>
> Ph: +61-3-8664-7430
> Mobile: 0459 806 589
> Email: [log in to unmask] <mailto:[log in to unmask]>
>
> http://www.openroad.net.au/
> http://www.mylanguage.gov.au/
> http://www.slv.vic.gov.au/

-- 
Karen Coyle
[log in to unmask] http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet