I'm saying that you have data in some character encoding (MARC-8,
ISO-8859-1, etc.) that needs to go into Bibframe.

The conversion would, presumably, happen at the time the data goes in or
before.

Output is based on the standard used to serialize. Presumably the process
would use tools/libraries designed to handle this serialization to ensure
this.
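
A minimal sketch of what I mean, assuming Python and a made-up record with
a single "title" field (the function names and the record shape are
hypothetical, not anything Bibframe defines). The legacy bytes are decoded
to Unicode once, on the way in, and the JSON library takes care of escaping
on the way out. ISO-8859-1 is used because Python ships a codec for it;
MARC-8 is not among the built-in codecs, so it would need a dedicated
library.

```python
import json


def ingest(raw: bytes, source_encoding: str) -> str:
    """Decode legacy bytes to a Unicode string at the point of ingest."""
    return raw.decode(source_encoding)


def serialize(record: dict) -> bytes:
    """Let the JSON library handle escaping, then emit UTF-8 bytes."""
    # ensure_ascii=False keeps non-ASCII characters literal in the output
    # instead of writing them as \uXXXX escapes.
    return json.dumps(record, ensure_ascii=False).encode("utf-8")


# "Café society" as ISO-8859-1 bytes (0xE9 is "é" in that encoding).
legacy_title = b"Caf\xe9 society"

record = {"title": ingest(legacy_title, "iso-8859-1")}
output = serialize(record)
```

Everything between ingest and serialization only ever sees Unicode strings,
which is the "consistent encoding internally" part.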

Bibframe just needs to use consistent encoding internally, right?

I guess what I mean is, this seems like an issue any conversion process
would have, what is specific to Bibframe?

-Ross.

On Monday, January 7, 2013, Andrew Cunningham wrote:

> I think we'll have to agree to disagree then, since I think the issues lie
> with the conversion between MARC and Bibframe and specifically information
> that doesn't exist in MARC, while you believe it's an RDF or HTML issue with
> no need for it to enter the Bibframe realm.
>
> A.
>
>
> On 8 January 2013 15:01, Ross Singer <[log in to unmask]> wrote:
>
> N-Triples is a plain-text serialization of RDF, nothing more. This is RDF's
> problem.
>
> JSON's RFC (4627) says "JSON text SHALL be encoded in Unicode. The default
> encoding is UTF-8."
>
> Again, why wouldn't you be using a JSON parser/serializer to handle this?
> And if it's RDF/JSON (or JSON-LD), again, it's RDF's problem.
>
> "RDF in HTML guise" (I assume this is referring to RDFa?) would defer to
> the charset declaration of the page, as would any other HTML-based
> serialization.
>
> HTML is certainly likely to be the most error-prone (since it's also the
> most democratic), but it, again, is an "HTML problem". If the character
> encoding matches the declared charset and it's valid HTML... what else can
> you do?
>
> -Ross.
>
> On Monday, January 7, 2013, Andrew Cunningham wrote:
>
>  if RDF is going to be used exclusively, but if you have N-triples and
> JSON in the mix as container formats ... different issue.
>
> And no, they aren't RDF's problems. RDF in XML format inherits a lot of
> features from XML and can leverage off ITS etc. Likewise RDF in HTML5 guise
> can leverage off internationalisation features in HTML5 or HTMLNext (Living
> HTML or whatever it's called this week). The issues are more related to
> Bibframe and the conversion process from MARC formats to Bibframe,
> regardless of the container.
>
> For instance, RDF and XML would use one system for language tagging a
> record, and MARC, and possibly Bibframe, use a different system for tagging
> the language of the item/object being described.
>
> Two different functions and two different language tag schemes ... BCP47
> vs ISO-639-2 (B)
>
> And in theory the record might consist of multiple "language" tags, since
> the original script and the romanisation would be different language tags
> in the BCP-47 sense.
>
> This level of complexity would become an issue when records are being
> transformed into XML or HTML5 formats to be used by user agents, since
> accessibility requirements will kick in in various jurisdictions.
>
> It will also impact font rendering of content: in IE10 and the latest
> versions of Firefox, content marked up with a language tag of "tr" will kick
> in the Turkish language system in the font, if present in the font's OT
> tables, etc.
>
> At least that's my high level tack on it.
>
> In MARC-8 and MARC-21 you didn't have to concern yourselves with this,
> they essentially lived in isolation. In theory internationalisation was
> based on a 40+ year old model.
>
> But RDF in XML, RDF in HTML5, N-triple and JSON each bring their own
> requirements to Bibframe;
>
> As do the programming languages used;
>
> Accessibility requirements;
>
> etc.
>
> Bibframe is a movement into a model where there are many inter-dependencies
> and external requirements on the model, going from an isolated industry
> standard to leveraging off international standards.
>
> Andrew
>
>
>
>
>
>
> On 8 January 2013 13:36, Ross Singer <[log in to unmask]> wrote:
>
> Why are they issues, though?  They're RDF's problems, not Bibframe's.
> Isn't that part of the point of using existing standards?
>
> -Ross.
>
>
> On Monday, January 7, 2013, Andrew Cunningham wrote:
>
> Although those legacy encodings specific to the library industry would not
> exist in Bibframe
>
> Ultimately the issues are more related to how the parsed content is going
> to be consumed or going to be used. If it is to be human editable or
> presented to user agents then more complex processing that inserts markup
> or formatting control characters that are not present in the MARC records
> would sometimes be required.
>
> A lot of this is just tip of the iceberg ... esp if t
>
> --
> Andrew Cunningham
> Project Manager, Research and Development
> Social and Digital Inclusion Team
> Public Libraries and Community Engagement
> State Library of Victoria
> 328 Swanston Street
> Melbourne VIC 3000
> Australia
>
> Ph: +61-3-8664-7430
> Mobile: 0459 806 589
> Email: [log in to unmask]
>
> http://www.openroad.net.au/
> http://www.mylanguage.gov.au/
> http://www.slv.vic.gov.au/
>