Hello - are there any institutions that are using UTF-8 characters that 
aren't supported by OCLC, such as many of the characters in the "Latin 
Extended B" set?

We have staff working with African materials who would like to use 
characters such as the reversed E, and our local policy has been to 
discourage use of characters that are not supported by OCLC, even though 
our ILS (Aleph) supports them.

I'm interested in hearing about the policies at other libraries. In the 
past, records with these characters would be rejected by OCLC, but my 
understanding is that OCLC's more recent batch processing programs 
accept these characters and turn them into NCRs (numeric character 
references). If staff retrieve one of these records, they convert each 
NCR back into its UTF-8 character. An example of an OCLC record with NCRs 
is #436225032, which contains the NCR for ѧ (&#x0467;) in the Cyrillic 
parallel 245 tag.  (I believe the OCLC treatment of these characters may 
vary depending on whether they are encountered in an 880 or not.)
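To make the NCR round trip concrete, here is a minimal Python sketch. It is an illustration only, not OCLC's actual processing: the function names are hypothetical, the NCR form is assumed to be the hexadecimal "&#xXXXX;" style, and "unsupported" is approximated as "falls in the Latin Extended-B block (U+0180 to U+024F)", which is a stand-in for whatever repertoire a given system really accepts.

```python
import re

# Assumption: NCRs are written as "&#x" + 4 to 6 hex digits + ";".
NCR_PATTERN = re.compile(r"&#x([0-9A-Fa-f]{4,6});")


def is_supported(ch: str) -> bool:
    """Hypothetical repertoire check: treat Latin Extended-B as unsupported."""
    return not (0x0180 <= ord(ch) <= 0x024F)


def to_ncr(text: str, supported=is_supported) -> str:
    """Replace each unsupported character with a hexadecimal NCR."""
    return "".join(
        ch if supported(ch) else f"&#x{ord(ch):04X};" for ch in text
    )


def from_ncr(text: str) -> str:
    """Turn hexadecimal NCRs back into the characters they stand for."""

    def decode(match: re.Match) -> str:
        code_point = int(match.group(1), 16)
        # Leave anything beyond the Unicode range untouched.
        if code_point > 0x10FFFF:
            return match.group(0)
        return chr(code_point)

    return NCR_PATTERN.sub(decode, text)


if __name__ == "__main__":
    # U+018E is LATIN CAPITAL LETTER REVERSED E (Latin Extended-B).
    print(to_ncr("\u018Ewe"))      # prints: &#x018E;we
    print(from_ncr("&#x0467;") == "\u0467")  # prints: True
```

A real workflow would swap is_supported for a check against the actual supported character list, and would need to decide how to handle text that already contains a literal "&#x" sequence.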

Now that the MARC standard allows the full UCS repertoire (1), I wonder 
to what extent libraries are using it.

Thank you,

Corinna Baksik

Systems Librarian
Harvard University Library
Office for Information Systems
90 Mt. Auburn St.
Cambridge, MA 02138

617.495.3724



(1) "To facilitate the movement of records between MARC-8 and Unicode 
environments, it was recommended for an initial period that the use of 
Unicode be restricted to a repertoire identical in extent to the MARC-8 
repertoire. In 2007, however, such a restriction is no longer 
appropriate. The full UCS repertoire, as currently defined at the 
Unicode web site, is valid for encoding MARC 21 records, subject only to 
the constraints described below."  
http://www.loc.gov/marc/specifications/speccharucs.html


