The maintenance agency for transliteration is the Unicode Consortium;
reading the Unicode docs on transliteration may be helpful. They care about
more than emoji 😁
On Jul 1, 2016 11:48 PM, "Andrew Cunningham" <[log in to unmask]>
wrote:

> Although the -t- mechanism will create quite complex language tags ... and
> versioning would depend on the romanisation tables being appropriately
> versioned and archived.
>
> sr-Cyrl-t-sr-Latn-m0-alaloc-<year/version>
>
> or maybe
>
> sr-Cyrl-t-und-Latn-m0-alaloc-<year/version>
>
> But I could have syntax wrong. I would need to double check when I am at a
> computer.
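
To make the shape of these tags concrete, here is a minimal Python sketch that splits a tag carrying the RFC 6497 -t- extension into its parts. The function name `parse_t_extension` and the returned dictionary layout are illustrative assumptions, not part of any library, and the sketch does not validate subtags against the registry. In RFC 6497 the subtags before -t- describe the content as it now stands and the subtags after it describe the source, so the example uses a romanised Serbian string tagged sr-Latn-t-sr-Cyrl-m0-alaloc (without the version placeholder).

```python
import re


def parse_t_extension(tag):
    """Split a BCP 47 tag with an RFC 6497 '-t-' extension into its parts.

    Returns a dict with the tag of the content itself ("target"), the tag
    of the source it was transformed from ("source"), and any two-character
    fields such as m0 ("fields"). Purely syntactic: nothing is validated.
    """
    subtags = tag.split("-")
    lowered = [s.lower() for s in subtags]
    if "t" not in lowered:
        return {"target": tag, "source": None, "fields": {}}

    t_index = lowered.index("t")
    target = "-".join(subtags[:t_index])

    source = []
    fields = {}
    current_key = None
    for subtag in subtags[t_index + 1:]:
        if len(subtag) == 1:
            break  # another singleton starts a different extension
        if re.fullmatch(r"[A-Za-z][0-9]", subtag):
            # Field separators are a letter followed by a digit, e.g. m0.
            current_key = subtag.lower()
            fields[current_key] = []
        elif current_key is None:
            source.append(subtag)  # still inside the source language tag
        else:
            fields[current_key].append(subtag)

    return {"target": target, "source": "-".join(source) or None, "fields": fields}


print(parse_t_extension("sr-Latn-t-sr-Cyrl-m0-alaloc"))
```

Any year or version subtag placed after the mechanism name ends up in the same m0 list, which is where a versioned romanisation table would be recorded.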
>
>
>
> On Saturday, 2 July 2016, Simon Spero <[log in to unmask]> wrote:
> > Transliteration can be indicated using the t mechanism described in RFC 6497:
> > https://tools.ietf.org/html/rfc6497
> >
> > On a side note, SPARQL supports basic language filtering. OWL 2 supports extended matching when applying facets to PlainLiterals.
> >
> > (Script suppression is a pain in the neck, because it doesn't get reversed before extended matching, making it impossible to use a suppressed script to exclude, e.g., English in braille.)
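
As a rough illustration of what basic filtering does, here is a small Python sketch of RFC 4647 basic filtering, the scheme that SPARQL's langMatches() follows. The helper names `lang_matches` and `select` are hypothetical, and the label list simply reuses the strings discussed in this thread.

```python
def lang_matches(tag, language_range):
    """RFC 4647 basic filtering: the range must equal the tag or be a
    prefix of it that ends at a subtag boundary. Case-insensitive."""
    if not tag:
        return False  # untagged literals never match
    if language_range == "*":
        return True   # wildcard matches any non-empty tag
    tag, language_range = tag.lower(), language_range.lower()
    return tag == language_range or tag.startswith(language_range + "-")


def select(labels, language_range):
    """Return the label texts whose tag matches the given range."""
    return [text for text, tag in labels if lang_matches(tag, language_range)]


labels = [
    ("Ljiljana Vukić", "sr-Latn"),
    ("Љиљана Вукић", "sr-Cyrl"),
    ("Ljiljana Vukić", "en"),
]

print(select(labels, "sr"))       # both Serbian labels
print(select(labels, "sr-Latn"))  # only the Latin-script label
```

Because the range is only ever a prefix, a range of "en" also matches "en-Brai", while a tag of plain "en" (script suppressed) gives a filter nothing to distinguish it by, which is the difficulty with braille mentioned above.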
> >
> > Simon
> >
> > On Jul 1, 2016 6:51 PM, "Andrew Cunningham" <[log in to unmask]>
> wrote:
> >>
> >> Although
> >>
> >> rdfs:label "Ljiljana Vukić"@sr-Latn ;
> >>
> >> Would be more correct than
> >>
> >> rdfs:label "Ljiljana Vukić"@en ;
> >>
> >> This tag would imply that it is an anglicised version of the name, rather than a transliterated version.
> >>
> >> Serbian does have an alternative Latin orthography.
> >>
> >> Or maybe
> >>
> >> rdfs:label "Ljiljana Vukić"@sr-Latn ;
> >> rdfs:label "Љиљана Вукић"@sr-Cyrl ;
> >>
> >> Would be more accurate since Serbian doesn't have a Suppress-Script field.
> >>
> >> But it is probably more important to differentiate for languages like Arabic, where there is a range of romanisation schemes in use.
> >>
> >> Language tags are used in lots of different ways for different purposes.
> >>
> >> Yes, browsers use language tags. For some languages they are used for font fallback, e.g. for CJK text: by default most browsers use a Simplified Chinese font for displaying CJKV data, unless explicit fonts are specified in the relevant element's font stack or the elements are appropriately tagged.
> >>
> >> Library catalogues rarely have html language tagging, so Traditional
> Chinese, Japanese kanji and Korean hanja are often displayed using a
> Simplified Chinese font.
> >>
> >> Some language tags are linked to corresponding OpenType language system tags (which aren't language tags) and will trigger the appropriate locl features in OpenType fonts for correct rendering of a language.
> >>
> >> Search tools, accessibility tools, etc. all use language tagging.
> >>
> >> On 2 Jul 2016 4:53 am, "Young,Jeff (OR)" <[log in to unmask]> wrote:
> >>>
> >>> Aside,
> >>> At least some transliterations (particularly "names") can be treated using existing BCP-47 tokens.
> >>> For example, in OCN:100011210:
> >>> 700 [0] 1_   [33$6] 880-04   [1$a] Vukić, Ljiljana.
> >>> 880 [0] 1_   [33$6] 700-04/(N   [1$a] Вукић, Љиљана.
> >>> Mapping these fields (and using other clues in the record) can
> reasonably produce:
> >>>
> >>> rdfs:label "Ljiljana Vukić"@en ;
> >>> rdfs:label "Љиљана Вукић"@sr ;
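
The mapping described above can be sketched in Python roughly as follows. This is only a sketch under stated assumptions: the fields are plain dictionaries rather than objects from a real MARC library, the names `parse_linkage`, `direct_order` and `paired_labels` are hypothetical, the language ("sr") is supplied by the caller as one of the "other clues", and the tags follow the sr-Latn / sr-Cyrl pairing suggested elsewhere in this thread rather than @en / @sr. The $6 script identification codes come from the MARC 21 control subfield definitions.

```python
import re

# Subset of MARC 21 $6 script identification codes mapped to ISO 15924.
SCRIPT_CODES = {"(N": "Cyrl", "(S": "Grek", "(3": "Arab", "(2": "Hebr"}

# $6 looks like "880-04" or "700-04/(N": linked tag, occurrence, optional script.
LINKAGE = re.compile(r"^(?P<tag>\d{3})-(?P<occ>\d{2})(?:/(?P<script>[^/]+))?")


def parse_linkage(value):
    match = LINKAGE.match(value)
    if match is None:
        raise ValueError(f"unrecognised $6 linkage: {value!r}")
    return match.group("tag"), match.group("occ"), match.group("script")


def direct_order(heading):
    """'Vukić, Ljiljana.' -> 'Ljiljana Vukić' (simple two-part names only)."""
    surname, _, forename = heading.rstrip(". ").partition(",")
    return f"{forename.strip()} {surname.strip()}".strip()


def paired_labels(fields, language):
    """Pair each 880 field with its linked regular field and return
    (text, language tag) tuples for the romanised and vernacular forms."""
    regular = {}
    for field in fields:
        if field["tag"] != "880":
            _, occurrence, _ = parse_linkage(field["6"])
            regular[(field["tag"], occurrence)] = field

    labels = []
    for field in fields:
        if field["tag"] != "880":
            continue
        linked_tag, occurrence, script = parse_linkage(field["6"])
        partner = regular.get((linked_tag, occurrence))
        if partner is None:
            continue  # unlinked 880: nothing to pair it with
        script_subtag = SCRIPT_CODES.get(script)
        vernacular_tag = f"{language}-{script_subtag}" if script_subtag else language
        # Assumption: the regular field holds the romanised (Latin-script) form.
        labels.append((direct_order(partner["a"]), f"{language}-Latn"))
        labels.append((direct_order(field["a"]), vernacular_tag))
    return labels


record = [
    {"tag": "700", "6": "880-04", "a": "Vukić, Ljiljana."},
    {"tag": "880", "6": "700-04/(N", "a": "Вукић, Љиљана."},
]
for text, tag in paired_labels(record, "sr"):
    print(f'rdfs:label "{text}"@{tag} ;')
```

Deriving the language itself from the record is left out on purpose; the sketch only shows how the $6 linkage ties the two forms together and where a script subtag could come from.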
> >>>
> >>> Romanization rules may have been used to generate the 700 form, but
> capturing that fact doesn’t seem very important.
> >>> This mechanism doesn’t work well for non-names, which end up being more about capturing phonetics as opposed to “language”. In those cases, the literals would have to be translated instead of transliterated in order to attach more useful “languagy” language-tags.
> >>> I would also note that web browsers are configured to use BCP-47
> tokens which servers can leverage for display. It’s unlikely that someone
> would choose extension language tags to control displays of the data.
> >>> Jeff
> >>> From: Bibliographic Framework Transition Initiative Forum <
> [log in to unmask]> on behalf of Andrew Cunningham <
> [log in to unmask]>
> >>> Reply-To: Bibliographic Framework Transition Initiative Forum <
> [log in to unmask]>
> >>> Date: Friday, July 1, 2016 at 12:48 PM
> >>> To: "[log in to unmask]" <[log in to unmask]>
> >>> Subject: Re: [BIBFRAME] Language tags
> >>>
> >>> But if the record is being consumed by a system that requires BCP 47, or is being output to the web, you still have the issue of passing along or generating an appropriate language tag.
> >>>
> >>> On 2 Jul 2016 1:58 am, "Young,Jeff (OR)" <[log in to unmask]> wrote:
> >>>>
> >>>> Another approach might be to use SKOS-XL instead of language tags
> like so:
> >>>> :A1 a skosxl:Label ;
> >>>>     rdfs:label "Ελληνική Δημοκρατία"@el ;
> >>>>     skosxl:literalForm "Hellēnikē Dēmokratia" ;
> >>>>     bf:romanizationRule <https://www.loc.gov/catdir/cpso/romanization/oriya.pdf> .
> >>>> It’s a little heavy, but a construct like this would tie the key
> pieces together. A custom language tag on the transliteration can’t tie in
> the native term.
> >>>> Jeff
> >>>> From: Bibliographic Framework Transition Initiative Forum <
> [log in to unmask]> on behalf of Andrew Cunningham <
> [log in to unmask]>
> >>>> Reply-To: Bibliographic Framework Transition Initiative Forum <
> [log in to unmask]>
> >>>> Date: Friday, July 1, 2016 at 11:39 AM
> >>>> To: "[log in to unmask]" <[log in to unmask]>
> >>>> Subject: Re: [BIBFRAME] Language tags
> >>>>
> >>>>
> >>>> On 1 Jul 2016 2:19 am, "Joseph Kiegel" <[log in to unmask]> wrote:
> >>>> >
> >>>> > The approach of creating variant subtags for specific editions of
> the ALA-LC romanization tables is outmoded, since the tables are now on the
> Web.
> >>>> >
> >>>> >
> >>>>
> >>>> It is less than ideal, yes, but it was the only one that was proposed
> as a variant subtag. And no one has proposed a way to tag anything else.
> >>>>
> >>>> It doesn't help that romanisation tables lacked good versioning
> information, etc.
> >
>
> --
> Andrew Cunningham
> [log in to unmask]
>
>