Hello - are there any institutions using UTF-8 characters that aren't supported by OCLC, such as many of the characters in the "Latin Extended-B" block?

We have staff working with African materials who would like to use characters such as the reversed E. Our local policy has been to discourage the use of characters that OCLC does not support, even though our ILS (Aleph) supports them.

I'm interested in hearing about the policies at other libraries. In the past, records with these characters would be rejected by OCLC, but my understanding is that OCLC's more recent batch processing programs accept these characters and turn them into NCRs (numeric character references). If staff retrieve one of these records, they convert the NCR back into its UTF-8 character. An example of an OCLC record with NCRs is #436225032, which contains ѧ in the Cyrillic parallel 245 tag. (I believe the OCLC treatment of a character may vary depending on whether it's encountered in an 880 field or not.)
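
To make the NCR round trip concrete, here is a minimal Python sketch. It is not OCLC's actual processing: the hexadecimal form &#xXXXX; is assumed, the function names are made up for illustration, and the is_supported check is a hypothetical stand-in (here simply "code point below U+0100") for whatever repertoire a given system really accepts.

```python
import re

# Matches a hexadecimal NCR such as &#x0467; (4 to 6 hex digits assumed).
_NCR = re.compile(r"&#x([0-9A-Fa-f]{4,6});")


def is_supported(ch):
    """Hypothetical repertoire check: only code points below U+0100."""
    return ord(ch) < 0x0100


def to_ncr(text, supported=is_supported):
    """Replace each unsupported character with a hexadecimal NCR."""
    return "".join(
        ch if supported(ch) else "&#x{:04X};".format(ord(ch))
        for ch in text
    )


def _decode(match):
    code_point = int(match.group(1), 16)
    # Leave the text untouched if the value is not a valid code point.
    if code_point > 0x10FFFF:
        return match.group(0)
    return chr(code_point)


def from_ncr(text):
    """Turn hexadecimal NCRs back into the characters they stand for."""
    return _NCR.sub(_decode, text)


if __name__ == "__main__":
    # U+018E LATIN CAPITAL LETTER REVERSED E, U+0467 CYRILLIC SMALL LETTER LITTLE YUS
    sample = "\u018E and \u0467"
    encoded = to_ncr(sample)
    print(encoded)
    print(from_ncr(encoded) == sample)
```

A record stored with NCRs stays within the supported repertoire, and the original characters can be recovered as long as the NCR text itself was never a literal part of the data.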

Now that the MARC standard allows the full UCS repertoire (1), I wonder to what extent libraries are using it.

Thank you,

Corinna Baksik

Systems Librarian
Harvard University Library
Office for Information Systems
90 Mt. Auburn St.
Cambridge, MA 02138

617.495.3724



(1) "To facilitate the movement of records between MARC-8 and Unicode environments, it was recommended for an initial period that the use of Unicode be restricted to a repertoire identical in extent to the MARC-8 repertoire. In 2007, however, such a restriction is no longer appropriate. The full UCS repertoire, as currently defined at the Unicode web site, is valid for encoding MARC 21 records, subject only to the constraints described below."  http://www.loc.gov/marc/specifications/speccharucs.html


