The maintenance agency for transliteration is the Unicode Consortium; reading the Unicode docs on transliteration may be helpful. They care about more than emoji 😁

On Jul 1, 2016 11:48 PM, "Andrew Cunningham" <[log in to unmask]> wrote:
Although the -t- mechanism will create quite complex language tags ... and versioning would depend on the romanisation tables being appropriately versioned and archived.

sr-Cyrl-t-sr-Latn-m0-alaloc-<year/version>

or maybe

sr-Cyrl-t-und-Latn-m0-alaloc-<year/version>

But I could have the syntax wrong. I would need to double-check when I am at a computer.
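
To make the shape of such tags concrete, here is a minimal sketch that splits a tag around the -t- singleton. It assumes the RFC 6497 layout: the subtags before -t- describe the content as it now stands, the subtags after it describe the source, and fields such as m0 carry the mechanism. On that reading, romanised Serbian would put Latn on the left of the -t-. The function name is made up for illustration, and the version value 2012 is only a placeholder for <year/version>.

```python
import re

# A tfield key in RFC 6497 is one letter followed by one digit, e.g. "m0".
FIELD_KEY = re.compile(r"[a-z][0-9]")


def parse_t_extension(tag):
    """Return (target, source, fields) for a tag that may carry a -t- extension.

    Simplified: no validation against the subtag registry, and a "t" inside a
    private-use ("-x-") section would be misread as the extension singleton.
    """
    subtags = tag.lower().split("-")
    if "t" not in subtags[1:]:
        return "-".join(subtags), None, {}

    t_index = subtags.index("t", 1)
    target = "-".join(subtags[:t_index])
    source, fields, key = [], {}, None
    for subtag in subtags[t_index + 1:]:
        if len(subtag) == 1:
            break  # a new singleton starts a different extension
        if FIELD_KEY.fullmatch(subtag):
            key = subtag
            fields[key] = []
        elif key is None:
            source.append(subtag)  # still inside the source language tag
        else:
            fields[key].append(subtag)  # value subtag of the current field
    return target, "-".join(source) or None, fields


# "2012" is a placeholder version, not a real edition of the ALA-LC tables.
print(parse_t_extension("sr-Cyrl-t-sr-Latn-m0-alaloc-2012"))
```

The sketch lower-cases everything, which is harmless because language tags are case-insensitive, but it drops the conventional casing (Cyrl, Latn).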



On Saturday, 2 July 2016, Simon Spero <[log in to unmask]> wrote:
> Transliteration can be indicated using the t mechanism described in RFC 6497 —
> https://tools.ietf.org/html/rfc6497
>
> On a side note, SPARQL supports basic language filtering (langMatches). OWL 2 supports extended matching when applying the rdf:langRange facet to PlainLiterals.
>
> (Script suppression is a pain in the neck, because it doesn't get reversed before extended matching, making it impossible to use a suppressed script to exclude, e.g., English in braille.)
>
> Simon
>
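
The script-suppression problem mentioned above can be seen in a small sketch of the two matching schemes in RFC 4647, written from my reading of sections 3.3.1 (basic) and 3.3.2 (extended). The function names are made up for illustration. English has Suppress-Script Latn in the registry, so ordinary print English is tagged "en", not "en-Latn".

```python
def basic_filter(lang_range, tag):
    """RFC 4647 basic filtering: the range must be a prefix of the tag."""
    lang_range, tag = lang_range.lower(), tag.lower()
    return lang_range == "*" or tag == lang_range or tag.startswith(lang_range + "-")


def extended_filter(lang_range, tag):
    """RFC 4647 extended filtering: range subtags may skip over tag subtags."""
    r = lang_range.lower().split("-")
    t = tag.lower().split("-")
    if r[0] != "*" and r[0] != t[0]:
        return False
    ri, ti = 1, 1
    while ri < len(r):
        if r[ri] == "*":
            ri += 1              # wildcard: move to the next range subtag
        elif ti >= len(t):
            return False         # range asks for more than the tag has
        elif r[ri] == t[ti]:
            ri += 1
            ti += 1
        elif len(t[ti]) == 1:
            return False         # never skip past a singleton
        else:
            ti += 1              # skip a non-matching tag subtag
    return True


# "en" selects braille English along with everything else ...
print(extended_filter("en", "en-Brai"))
# ... but asking for Latin script misses print English tagged plain "en".
print(extended_filter("en-Latn", "en"))
```

Neither range gives "English, but not braille" unless the suppressed script is restored on the tag before matching.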
> On Jul 1, 2016 6:51 PM, "Andrew Cunningham" <[log in to unmask]> wrote:
>>
>> Although
>>
>> rdfs:label "Ljiljana Vukić"@sr-Latn ;
>>
>> Would be more correct than
>>
>> rdfs:label "Ljiljana Vukić"@en ;
>>
>> This tag would imply that it is an anglicised version of the name, rather than a transliterated version.
>>
>> Serbian does have an alternative Latin orthography.
>>
>> Or maybe
>>
>> rdfs:label "Ljiljana Vukić"@sr-Latn ;
>> rdfs:label "Љиљана Вукић"@sr-Cyrl ;
>>
>> Would be more accurate, since Serbian doesn't have a Suppress-Script field.
>>
>> But it is probably more important to differentiate for languages like Arabic, where there is a range of romanisation schemes in use.
>>
>> Language tags are used in lots of different ways for different purposes.
>>
>> Yes, browsers use language tags. For some languages they are used for font fallback. With CJK text, for example, most browsers by default use a Simplified Chinese font to display CJKV data, unless explicit fonts are specified in the relevant element's font stack or the elements are appropriately tagged.
>>
>> Library catalogues rarely have HTML language tagging, so Traditional Chinese, Japanese kanji and Korean hanja are often displayed using a Simplified Chinese font.
>>
>> Some language tags are linked to corresponding OpenType language system tags (which aren't language tags) and will trigger the appropriate locl features in OpenType fonts for correct rendering of a language.
>>
>> Search tools, accessibility tools, etc. all use language tagging.
>>
>> On 2 Jul 2016 4:53 am, "Young,Jeff (OR)" <[log in to unmask]> wrote:
>>>
>>> Aside,
>>> At least some transliterations (particularly "names") can be treated using existing BCP-47 tokens.
>>> For example, in OCN:100011210: 
>>> 700 [0] 1_   [33$6] 880-04   [1$a] Vukić, Ljiljana.   
>>> 880 [0] 1_   [33$6] 700-04/(N   [1$a] Вукић, Љиљана.   
>>> Mapping these fields (and using other clues in the record) can reasonably produce:
>>>
>>> rdfs:label "Ljiljana Vukić"@en ;
>>> rdfs:label "Љиљана Вукић"@sr ;
>>>
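
A minimal sketch of the 700/880 mapping described above, under several assumptions: the fields are represented as plain dictionaries (a hypothetical shape, not any particular MARC library), the record's language is supplied by the caller from the "other clues", and the regular field is assumed to hold the romanised form. It emits the script-qualified tags suggested elsewhere in this thread (sr-Latn, sr-Cyrl) rather than @en and @sr. The script codes are a small subset of the MARC 21 $6 script identification codes.

```python
# Subset of MARC 21 $6 script identification codes mapped to ISO 15924 codes.
SCRIPT_CODES = {"(N": "Cyrl", "(S": "Grek", "(3": "Arab", "(2": "Hebr"}


def parse_linkage(value):
    """Split a $6 value such as '700-04/(N' into (tag, occurrence, script_code)."""
    link, _, rest = value.partition("/")
    tag, _, occurrence = link.partition("-")
    script_code = rest.split("/")[0] or None  # ignore any orientation code
    return tag, occurrence, script_code


def display_name(heading):
    """Turn 'Vukić, Ljiljana.' into 'Ljiljana Vukić' (naive inversion)."""
    surname, _, forename = heading.rstrip(". ").partition(", ")
    return f"{forename} {surname}".strip()


def label_triples(fields, language):
    """Build one rdfs:label line per linked field, tagged language-script."""
    lines = []
    for field in fields:
        _, _, script_code = parse_linkage(field["6"])
        if field["tag"] == "880":
            script = SCRIPT_CODES.get(script_code)
        else:
            script = "Latn"  # assumption: the regular field is the romanised form
        lang_tag = f"{language}-{script}" if script else language
        lines.append(f'rdfs:label "{display_name(field["a"])}"@{lang_tag} ;')
    return lines


record = [
    {"tag": "700", "6": "880-04", "a": "Vukić, Ljiljana."},
    {"tag": "880", "6": "700-04/(N", "a": "Вукић, Љиљана."},
]
for line in label_triples(record, "sr"):
    print(line)
```

The name inversion is deliberately naive and would need real handling for dates, titles and other subfields.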
>>> Romanization rules may have been used to generate the 700 form, but capturing that fact doesn’t seem very important.
>>> This mechanism doesn’t work well for non-names, which end up being more about capturing phonetics than “language”. In those cases, the literals would have to be translated instead of transliterated in order to attach more useful “languagy” language tags.
>>> I would also note that web browsers are configured to use BCP-47 tokens which servers can leverage for display. It’s unlikely that someone would choose extension language tags to control displays of the data.
>>> Jeff
>>> From: Bibliographic Framework Transition Initiative Forum <[log in to unmask]> on behalf of Andrew Cunningham <[log in to unmask]>
>>> Reply-To: Bibliographic Framework Transition Initiative Forum <[log in to unmask]>
>>> Date: Friday, July 1, 2016 at 12:48 PM
>>> To: "[log in to unmask]" <[log in to unmask]>
>>> Subject: Re: [BIBFRAME] Language tags
>>>
>>> But if the record is being consumed by a system that requires BCP 47, or is being output to the web, you still have the issue of passing along or generating an appropriate language tag.
>>>
>>> On 2 Jul 2016 1:58 am, "Young,Jeff (OR)" <[log in to unmask]> wrote:
>>>>
>>>> Another approach might be to use SKOS-XL instead of language tags like so:
>>>> :A1 a skosxl:Label ;
>>>> rdfs:label "Ελληνική Δημοκρατία"@el ;
>>>> skosxl:literalForm "Hellēnikē Dēmokratia" ;
>>>> bf:romanizationRule <https://www.loc.gov/catdir/cpso/romanization/oriya.pdf> ;
>>>> .
>>>> It’s a little heavy, but a construct like this would tie the key pieces together. A custom language tag on the transliteration can’t tie in the native term.
>>>> Jeff
>>>> From: Bibliographic Framework Transition Initiative Forum <[log in to unmask]> on behalf of Andrew Cunningham <[log in to unmask]>
>>>> Reply-To: Bibliographic Framework Transition Initiative Forum <[log in to unmask]>
>>>> Date: Friday, July 1, 2016 at 11:39 AM
>>>> To: "[log in to unmask]" <[log in to unmask]>
>>>> Subject: Re: [BIBFRAME] Language tags
>>>>
>>>>
>>>> On 1 Jul 2016 2:19 am, "Joseph Kiegel" <[log in to unmask]> wrote:
>>>> >
>>>> > The approach of creating variant subtags for specific editions of the ALA-LC romanization tables is outmoded, since the tables are now on the Web. 
>>>> >
>>>> >  
>>>>
>>>> It is less than ideal, yes, but it was the only one that was proposed as a variant subtag. And no one has proposed a way to tag anything else.
>>>>
>>>> It doesn't help that romanisation tables lacked good versioning information, etc.
>

--
Andrew Cunningham
[log in to unmask]