Technically they are the maintenance agency for the -t- and -u- extension
mechanisms within BCP 47. Other agencies are responsible for each
transliteration scheme.

On Sunday, 3 July 2016, Simon Spero <[log in to unmask]> wrote:
> The maintenance agency for transliteration is the Unicode Consortium;
reading the Unicode docs on transliteration may be helpful. They care about
more than emoji 😁
>
> On Jul 1, 2016 11:48 PM, "Andrew Cunningham" <[log in to unmask]>
wrote:
>>
>> Although the -t- mechanism will create quite complex language tags ... and
versioning would be dependent on romanisation tables being
appropriately versioned and archived.
>>
>> sr-Cyrl-t-sr-Latn-m0-alaloc-<year/version>
>>
>> or maybe
>>
>> sr-Cyrl-t-und-Latn-m0-alaloc-<year/version>
>>
>> But I could have syntax wrong. I would need to double check when I am at
a computer.
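
To make the structure of the -t- tags above easier to check, here is a minimal Python sketch that splits a tag into its content language, its source language, and its fields. The function name `split_t_extension` is made up for illustration, and the parser assumes a well-formed tag with no private-use (`x-`) section; it is not a full BCP 47 validator. As I read RFC 6497, the subtags that follow -t- identify the source of the transformation, so the first tag above would describe Cyrillic content derived from Latin; romanized Serbian would then put Latn first, as in sr-Latn-t-sr-Cyrl-m0-alaloc.

```python
import re

# Field keys in the -t- extension are one letter followed by one digit, e.g. "m0".
TKEY = re.compile(r"^[a-z][0-9]$")

def split_t_extension(tag):
    """Return (content_language, source_language, fields) for a BCP 47 tag.

    Simplified sketch: assumes a well-formed tag without a private-use section.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "t" not in lowered:
        return tag, None, {}
    t_at = lowered.index("t")
    content = "-".join(subtags[:t_at])
    source, fields, key = [], {}, None
    for sub in subtags[t_at + 1:]:
        if len(sub) == 1:            # another extension singleton: -t- is over
            break
        if TKEY.match(sub.lower()):  # start of a new field such as m0
            key = sub.lower()
            fields[key] = []
        elif key is None:            # still inside the source language tag
            source.append(sub)
        else:                        # value subtag of the current field
            fields[key].append(sub)
    joined = {k: "-".join(v) for k, v in fields.items()}
    return content, "-".join(source) or None, joined

print(split_t_extension("sr-Cyrl-t-sr-Latn-m0-alaloc"))
print(split_t_extension("sr-Latn-t-sr-Cyrl-m0-alaloc"))
```

Splitting the tag this way makes it easy to see which side of -t- a script ended up on before the tag is written into a record.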
>>
>>
>>
>> On Saturday, 2 July 2016, Simon Spero <[log in to unmask]> wrote:
>> > Transliteration can be indicated using the t mechanism described in
RFC 6497 —
>> > https://tools.ietf.org/html/rfc6497
>> >
>> > On a side note, SPARQL supports basic language filtering. OWL 2
supports extended matching when applying facets to PlainLiterals.
>> >
>> > (Script suppression is a pain in the neck, because it doesn't get
reversed before extended matching, making it impossible to use a suppressed
script to exclude, e.g., English in braille.)
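
As a rough illustration of the basic filtering mentioned above, the following Python sketch implements RFC 4647 basic filtering, which is the matching scheme SPARQL's langMatches() is defined in terms of. The function name `basic_filter` is hypothetical. The last two checks show one reading of the script suppression problem: text tagged plain "en" is not selected by the range "en-Latn", while the range "en" also selects "en-Brai".

```python
def basic_filter(tag, language_range):
    """RFC 4647 basic filtering: does language_range match tag?

    The range matches if it equals the tag, or is a prefix of the tag that
    ends at a subtag boundary. Comparison is case-insensitive, and "*"
    matches any non-empty tag.
    """
    if not tag:
        return False
    if language_range == "*":
        return True
    tag, language_range = tag.lower(), language_range.lower()
    return tag == language_range or tag.startswith(language_range + "-")

print(basic_filter("sr-Latn", "sr"))       # range "sr" covers both scripts
print(basic_filter("sr-Cyrl", "sr-Latn"))  # a script-specific range excludes the other script
print(basic_filter("en", "en-Latn"))       # suppressed script: plain "en" is not matched
print(basic_filter("en-Brai", "en"))       # and the range "en" still pulls in braille
```

Under this scheme there is no range that selects Latin-script English while leaving out en-Brai, unless the suppressed script is written back into the tag before matching.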
>> >
>> > Simon
>> >
>> > On Jul 1, 2016 6:51 PM, "Andrew Cunningham" <[log in to unmask]>
wrote:
>> >>
>> >> Although
>> >>
>> >> rdfs:label "Ljiljana Vukić"@sr-Latn ;
>> >>
>> >> Would be more correct than
>> >>
>> >> rdfs:label "Ljiljana Vukić"@en ;
>> >>
>> >> This tag would imply that it is an anglicised version of the name,
rather than a transliterated version.
>> >>
>> >> Serbian does have an alternative Latin orthography.
>> >>
>> >> Or maybe
>> >>
>> >> rdfs:label "Ljiljana Vukić"@sr-Latn ;
>> >> rdfs:label "Љиљана Вукић"@sr-Cyrl ;
>> >>
>> >> Would be more accurate since Serbian doesn't have a Suppress-Script
field.
>> >>
>> >> But it is probably more important to differentiate for languages like
Arabic where there is a range of romanisation schemes in use.
>> >>
>> >> Language tags are used in lots of different ways for different
purposes.
>> >>
>> >> Yes, browsers use language tags. For some languages they are used for
font fallback, e.g. for CJK text: by default most browsers automatically use a
Simplified Chinese font for displaying CJKV data, unless explicit fonts are
specified in the relevant element's font stack or the elements are
appropriately tagged.
>> >>
>> >> Library catalogues rarely have HTML language tagging, so Traditional
Chinese, Japanese kanji and Korean hanja are often displayed using a
Simplified Chinese font.
>> >>
>> >> Some language tags are linked to corresponding OpenType language
system tags (which aren't language tags) and will trigger the
appropriate locl features in OpenType fonts for correct rendering of a
language.
>> >>
>> >> Search tools, accessibility tools, etc. all use language tagging.
>> >>
>> >> On 2 Jul 2016 4:53 am, "Young,Jeff (OR)" <[log in to unmask]> wrote:
>> >>>
>> >>> Aside,
>> >>> At least some transliterations (particularly "names") can be
treated using existing BCP-47 tokens.
>> >>> For example, in OCN:100011210:
>> >>> 700 [0] 1_   [33$6] 880-04   [1$a] Vukić, Ljiljana.
>> >>> 880 [0] 1_   [33$6] 700-04/(N   [1$a] Вукић, Љиљана.
>> >>> Mapping these fields (and using other clues in the record) can
reasonably produce:
>> >>>
>> >>> rdfs:label "Ljiljana Vukić"@en ;
>> >>> rdfs:label "Љиљана Вукић"@sr ;
>> >>>
>> >>> Romanization rules may have been used to generate the 700 form, but
capturing that fact doesn’t seem very important.
>> >>> This mechanism doesn’t work well for non-names, which end up being
more about capturing phonetics as opposed to “language”. In those cases,
the literals would have to be translated instead of transliterated in order
to attach more useful “languagy” language-tags.
>> >>> I would also note that web browsers are configured to use BCP-47
tokens which servers can leverage for display. It’s unlikely that someone
would choose extension language tags to control displays of the data.
>> >>> Jeff
>> >>> From: Bibliographic Framework Transition Initiative Forum <
[log in to unmask]> on behalf of Andrew Cunningham <
[log in to unmask]>
>> >>> Reply-To: Bibliographic Framework Transition Initiative Forum <
[log in to unmask]>
>> >>> Date: Friday, July 1, 2016 at 12:48 PM
>> >>> To: "[log in to unmask]" <[log in to unmask]>
>> >>> Subject: Re: [BIBFRAME] Language tags
>> >>>
>> >>> But if the record is being consumed by a system that requires BCP 47, or
is being output to the web, you still have the issue of passing along or
generating an appropriate language tag.
>> >>>
>> >>> On 2 Jul 2016 1:58 am, "Young,Jeff (OR)" <[log in to unmask]> wrote:
>> >>>>
>> >>>> Another approach might be to use SKOS-XL instead of language tags
like so:
>> >>>> :A1 a skosxl:Label ;
>> >>>> rdfs:label "Ελληνική Δημοκρατία"@el ;
>> >>>> skosxl:literalForm "Hellēnikē Dēmokratia" ;
>> >>>> bf:romanizationRule <https://www.loc.gov/catdir/cpso/romanization/oriya.pdf> ;
>> >>>> .
>> >>>> It’s a little heavy, but a construct like this would tie the key
pieces together. A custom language tag on the transliteration can’t tie in
the native term.
>> >>>> Jeff
>> >>>> From: Bibliographic Framework Transition Initiative Forum <
[log in to unmask]> on behalf of Andrew Cunningham <
[log in to unmask]>
>> >>>> Reply-To: Bibliographic Framework Transition Initiative Forum <
[log in to unmask]>
>> >>>> Date: Friday, July 1, 2016 at 11:39 AM
>> >>>> To: "[log in to unmask]" <[log in to unmask]>
>> >>>> Subject: Re: [BIBFRAME] Language tags
>> >>>>
>> >>>>
>> >>>> On 1 Jul 2016 2:19 am, "Joseph Kiegel" <[log in to unmask]> wrote:
>> >>>> >
>> >>>> > The approach of creating variant subtags for specific editions of
the ALA-LC romanization tables is outmoded, since the tables are now on the
Web.
>> >>>> >
>> >>>> >
>> >>>>
>> >>>> It is less than ideal, yes, but it was the only one that was
proposed as a variant subtag. And no one has proposed a way to tag anything
else.
>> >>>>
>> >>>> It doesn't help that romanisation tables lacked good versioning
information, etc.
>> >
>>
>> --
>> Andrew Cunningham
>> [log in to unmask]
>

-- 
Andrew Cunningham
[log in to unmask]