UNICODE-MARC Archives


UNICODE-MARC  December 2005

Subject: Re: MARC Filing and Unicode Exclusion
From: Daniel Lovins
Reply-To: UNICODE-MARC Discussion List
Date: Thu, 8 Dec 2005 17:09:31 -0500
Content-Type: text/plain

Dear group,

I have a question about whether certain Unicode characters (in my case,
certain Hebrew ones) are encoded as two bytes each in UTF-8 and, if so,
whether this would explain the following situation:

While testing the Unicode release of Endeavor Voyager, a member of my team,
Jerry Anne Dickel, found that Hebrew-script titles beginning with definite
(and, for Yiddish, also indefinite) articles no longer indexed properly:
the titles were failing to show up in browse displays. Jerry Anne was able
to implicate the second indicator of the 245 field (the number of nonfiling
characters). Ordinarily, with the Hebrew article "ha" (a one-character
prefix), the second indicator of the 245 would be 1, but the title indexed
correctly again only when Jerry Anne changed it to 2. The same thing
happened with the Yiddish definite article "der" (three letters plus a
space, as in the Latin script), where the second indicator would normally
be 4; in Voyager Unicode, however, the title would index only if the 4 were
replaced by a 7, i.e., counting each of the three Hebrew letters twice
(3 x 2 = 6) and the space once.
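
The arithmetic above can be checked with a short Python sketch. It assumes the article "ha" is the single letter HE (U+05D4) and that "der" is spelled DALET, AYIN, RESH (U+05D3, U+05E2, U+05E8) followed by an ordinary space. The helper name utf8_len is made up for illustration and is not part of any library system.

```python
# Hypothetical helper: how many bytes a nonfiling prefix occupies in UTF-8.
def utf8_len(prefix: str) -> int:
    return len(prefix.encode("utf-8"))

# Assumed spellings of the two articles discussed in the message.
ha = "\u05d4"                  # HE: the one-letter Hebrew article "ha"
der = "\u05d3\u05e2\u05e8 "    # DALET, AYIN, RESH plus a space: Yiddish "der "

for name, prefix in (("ha", ha), ("der", der)):
    # len() counts characters (code points); utf8_len() counts encoded bytes.
    print(name, "characters:", len(prefix), "UTF-8 bytes:", utf8_len(prefix))

# Prints:
# ha characters: 1 UTF-8 bytes: 2
# der characters: 4 UTF-8 bytes: 7
```

Each Hebrew letter falls in the range U+0080 to U+07FF, which UTF-8 encodes in two bytes, while the space takes one byte. The byte counts of 2 and 7 match the indicator values that made the titles index again.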

We replicated the problem in LC's Unicode-compliant Voyager and in OCLC 
WorldCat.

Interestingly, there did not seem to be a problem in RLIN21.

Did RLG anticipate (what I'm assuming is) the doubled bytes and apply a
fix? Alternatively, do you think it might be something other than the byte
count that's causing the problem?
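
To make the hypothesis concrete, here is a minimal Python sketch contrasting two ways a system might apply the nonfiling indicator: skipping that many characters, or skipping that many UTF-8 bytes. Both functions and the sample title (HE, SAMEKH, PE, RESH) are invented for illustration. This is not a description of how Voyager, WorldCat, or RLIN21 actually process the 245 field.

```python
# Hypothetical: skip the nonfiling prefix by counting characters.
def filing_form_by_chars(title: str, nonfiling: int) -> str:
    return title[nonfiling:]

# Hypothetical: skip the nonfiling prefix by counting UTF-8 bytes.
# A cut that lands inside a character decodes as U+FFFD (replacement char).
def filing_form_by_bytes(title: str, nonfiling: int) -> str:
    raw = title.encode("utf-8")[nonfiling:]
    return raw.decode("utf-8", errors="replace")

title = "\u05d4\u05e1\u05e4\u05e8"   # sample title: HE + SAMEKH, PE, RESH
expected = "\u05e1\u05e4\u05e8"      # the title without the leading article

print(filing_form_by_chars(title, 1) == expected)   # True
print(filing_form_by_bytes(title, 1) == expected)   # False: cut mid-character
print(filing_form_by_bytes(title, 2) == expected)   # True
```

If a system counted bytes, an indicator of 1 would leave half of the first letter in the filing string, which could plausibly keep the title out of the browse index, while an indicator of 2 would work. A system that counted characters would behave correctly with the standard value of 1.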

Thank you very much for your help.

Daniel


------------------------------------
Daniel Lovins
Hebraica Team Leader
Catalog Department
Sterling Memorial Library
Yale University
PO Box 208240
New Haven, CT 06520
tel: 203/432-1707
fax: 203/432-7231  
