On 2/15/13 1:08 PM, Andrew Cunningham wrote:

The more I consider Internationalization and bibframe, the more I realise that adopting RDF places bibframe in a different data ecosystem, inheriting a lot of internationalisation features from the Unicode and W3C sides.

So slabs of stuff like collation may not and should not be part of bibframe ever, but should be addressed in other forums like CLDR.

I like the idea that collation, collocation, and order are application issues, not data issues. However, that is quite a leap from previous library practice where the goal of heading creation was precisely creating an ordered list of identifiers. If this non-collocation concept were to be accepted, wouldn't that also set bibframe apart from RDA (which I believe still has quite a bit of textual heading creation in its rules)?

I'm concerned that we seem to be heading in a few different directions, with no clarity as to how those may or may not ever work together. Quite honestly, moving to RDA at a time when we don't even know *when* (and perhaps *if*) we will be able to accommodate it in a machine-readable form [1] doesn't sound like a great idea.

[1] And, no, I don't think that coding "RDA in MARC" is anything more than lipstick on a ... well, on a whatever. It seems like a square-peg, round-hole exercise, more pain than gain.

On Feb 16, 2013 12:54 AM, "Tom Emerson" wrote:
Andrew Cunningham writes:
> * sorting/collation in the Unicode context occurs within either a
> languageless multilingual context, i.e. DUCET (which has just undergone a
> few very interesting changes), or locale-specific collations identified in
> CLDR, where one or more collations are defined per locale

The engineering complexity for systems like EBSCOhost and EBSCO Discovery
Service is that many collections are multilingual and are sold
internationally. A customer in Sweden will want their data in Swedish
sort order, while another in Egypt will want a tailoring that uses
English with a preference for Arabic. Supporting all these possibilities
in a scalable fashion is a real challenge.
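
To make the Swedish case concrete, here is a minimal sketch of why one
list needs different orders for different customers. It is only an
illustration: the alphabet string and both key functions are hypothetical
stand-ins written for this example, not CLDR tailorings or the Unicode
Collation Algorithm, and a production system would rely on a real
collation library instead.

```python
import unicodedata

# Hypothetical hand-written alphabet for illustration only; a real system
# would load the Swedish tailoring from CLDR rather than hard-code it.
SWEDISH_ALPHABET = unicodedata.normalize("NFC", "abcdefghijklmnopqrstuvwxyzåäö")


def swedish_key(word):
    """Toy primary-level key: å, ä and ö are separate letters after z."""
    folded = unicodedata.normalize("NFC", word).casefold()
    # Characters outside the toy alphabet sort after everything else.
    return [
        SWEDISH_ALPHABET.index(ch) if ch in SWEDISH_ALPHABET else len(SWEDISH_ALPHABET)
        for ch in folded
    ]


def untailored_key(word):
    """Toy stand-in for an untailored order: accents only break ties."""
    decomposed = unicodedata.normalize("NFD", word.casefold())
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base, decomposed)


words = ["zebra", "ärlig", "apa", "öl", "orm", "ål"]
swedish_sorted = sorted(words, key=swedish_key)
untailored_sorted = sorted(words, key=untailored_key)

print(swedish_sorted)     # ['apa', 'orm', 'zebra', 'ål', 'ärlig', 'öl']
print(untailored_sorted)  # ['ål', 'apa', 'ärlig', 'öl', 'orm', 'zebra']
```

The same six strings come out in two different orders, which is the
sense in which the sort order belongs to the customer's locale rather
than to the data.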

> * matching is more problematic, since it brings in both the need for
> normalisation and matching grapheme clusters. Although ideally for some
> languages these would need to be custom rather than default grapheme
> clusters.

Indeed... internationalized sorting is a very tricky thing to get
right. You're pretty much guaranteed to annoy everyone.
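
On the matching point quoted above, a small sketch of the normalisation
half of the problem. The helper name match_key is made up for this
example, and the fold-then-normalise recipe is my reading of Unicode's
canonical caseless matching, so treat it as an assumption rather than a
reference implementation. Grapheme cluster segmentation is not covered
here, since the Python standard library does not provide it.

```python
import unicodedata


def match_key(text):
    """Hypothetical helper: decompose, case-fold, then decompose again."""
    decomposed = unicodedata.normalize("NFD", text)
    return unicodedata.normalize("NFD", decomposed.casefold())


composed = "Am\u00e9lie"     # é as one precomposed code point
decomposed = "Ame\u0301lie"  # e followed by COMBINING ACUTE ACCENT

# The two strings look identical to a reader but differ as code points.
raw_equal = composed == decomposed
key_equal = match_key(composed) == match_key(decomposed)
caseless_equal = match_key("AM\u00c9LIE") == match_key(decomposed)

print(raw_equal, key_equal, caseless_equal)  # False True True
print(len(composed), len(decomposed))        # 6 7
```

The differing lengths are a reminder that code points are not
user-perceived characters, which is where grapheme clusters come in.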

Obligatory BibFrame Tie-in: I think collation is way out of scope for
this project. Obviously filing rules are necessary in any cataloging
system and will need to be addressed, but anything beyond that is not
worth discussing. Adopting RDF has pretty much ensured that we are moving
to Unicode (and good riddance to MARC-8).


Tom Emerson
Principal Software Engineer, Search
EBSCO Publishing

The opinions in this message do not necessarily constitute those of
EBSCO Publishing.

Karen Coyle
http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet