
BIBFRAME Archives



BIBFRAME@LISTSERV.LOC.GOV



BIBFRAME November 2011

Subject: Re: Introduction (@W3C)
From: Tom Emerson
Reply-To: Bibliographic Framework Transition Initiative Forum
Date: Wed, 9 Nov 2011 16:31:18 -0500
Content-Type: text/plain

On Nov 9, 2011, at 3:39 PM, Riley, Charles wrote:
> Bibliographic data is largely built on the MARC-8 character set, in essence a subset of UTF-8; thus a loss of data for the preponderance of materials in non-Latin scripts has already occurred by the time data becomes bibliographic.

I don't think MARC-8 is properly a "subset" of UTF-8: I'm not sure what that would mean. MARC-8, as I understand it, is more similar to ISO 2022, where you can switch between multiple character sets within a single text stream. UTF-8 is an encoding form of Unicode: a different beast entirely.

I would hope that Unicode would be used for any future bibliographic representation: the choice of encoding then depends on the particular serialization format used. There is little we can do if the original data has been lost, but having the foundation to represent the world's current and historical scripts is a vital requirement, and Unicode fits the bill here.
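As a rough sketch of what "the choice of encoding depends on the serialization" means in practice, the Python snippet below takes one Unicode string and serializes it three ways. The title string is a made-up example, not real catalogue data. The bytes differ per encoding, but each round-trips to the same sequence of code points.

```python
# Hypothetical title mixing Cyrillic and Traditional Chinese.
title = "Война и мир / 戰爭與和平"

# Three Unicode encoding forms a serialization format might choose.
encodings = ["utf-8", "utf-16-le", "utf-32-le"]
serialized = {name: title.encode(name) for name in encodings}

for name, data in serialized.items():
    # The byte representation differs, the decoded text does not.
    print(name, len(data), "bytes, round-trips:", data.decode(name) == title)
```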

In addition to specifying language (whether ISO 639-2/B or 639-3 I don't have a preference) we should also consider specifying script details. ISO 15924 works well for this, e.g., to distinguish a title in Simplified Chinese vs. one in Traditional.
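A minimal sketch of the idea in Python, assuming a hypothetical `TaggedTitle` record (not part of any existing bibliographic schema) that carries an ISO 639-3 language code alongside an ISO 15924 script code. The two titles share a language code and differ only in script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedTitle:
    """Hypothetical record: a title plus language and script codes."""
    text: str
    language: str  # ISO 639-3 code, e.g. "zho" for Chinese
    script: str    # ISO 15924 code, e.g. "Hans" or "Hant"


# Same language, different scripts (sample titles for illustration).
simplified = TaggedTitle("战争与和平", language="zho", script="Hans")
traditional = TaggedTitle("戰爭與和平", language="zho", script="Hant")

# The language code alone cannot tell the two apart; the script code can.
print(simplified.language == traditional.language)
print(simplified.script, traditional.script)
```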

    -tree

P.S. All opinions are my own and do not necessarily represent my employer.

Tom Emerson
Principal Software Engineer --- Search
EBSCO Publishing
10 Estes Street
Ipswich, MA 01938, USA
Phone: +1-978-356-6500 x2185