If MODS elements carry a language attribute indicating the language of the field's content, that provides both flexibility in the content and better search and retrieval capabilities for systems.

For example, we could have a book in English but have titles in different languages:

   <title lang="en">Good Morning, New York</title>
   <title lang="fr" type="translated">Bonjour New York</title>
   <title lang="zh" type="alternative">早上好,纽约</title>
    (do not worry if your email system cannot display the Chinese characters above)
   <title lang="zh" type="transliteration">zao shang hao, New York</title>
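
To illustrate the retrieval side, here is a minimal Python sketch of how a system might select titles by language. It assumes the titles sit under a wrapper element, here called titleInfo, and that the attribute is a plain `lang` as written above; the function name `titles_for` is hypothetical and not part of MODS.

```python
import xml.etree.ElementTree as ET

# Three of the titles from the example above, under an assumed wrapper element.
RECORD = """
<titleInfo>
  <title lang="en">Good Morning, New York</title>
  <title lang="fr" type="translated">Bonjour New York</title>
  <title lang="zh" type="alternative">早上好,纽约</title>
</titleInfo>
""".strip()


def titles_for(record_xml, lang, title_type=None):
    """Return the text of every <title> whose lang (and, optionally, type) matches."""
    root = ET.fromstring(record_xml)
    return [
        title.text
        for title in root.iter("title")
        if title.get("lang") == lang
        and (title_type is None or title.get("type") == title_type)
    ]


print(titles_for(RECORD, "fr"))                 # titles tagged as French
print(titles_for(RECORD, "zh", "alternative"))  # Chinese alternative titles only
```

Without the attribute, a system would have to guess the language of each title from its characters, which is unreliable for languages that share a script.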

The purpose of the languagetype element in MODS is to identify the language used for the entire book or journal. MODS should also allow any cataloger to describe the object in their native language, so we need to allow a different language to be specified at the element level.

I agree that Pinyin is not a language but a transliteration of Chinese words. Transliteration serves two needs: 1) supporting systems that cannot store and display the language, and 2) helping people learn the language's pronunciation. With systems that support Unicode, and with XML using Unicode for data transfer, the need for transliteration in bibliographic data is greatly reduced.

Some localized systems need to use Pinyin and/or radical-stroke order to help build indexes; this is outside the scope of this discussion.

Foster Zhang
===============================================                                    
Systems Department         (650) 725-7924
Green Library East, 2nd Fl.(650) 723-3038 (fax)
Stanford University        [log in to unmask]
Stanford, CA  94305-6069   library.stanford.edu




-----Original Message-----
From: Metadata Object Description Schema List [mailto:[log in to unmask]]On Behalf Of Karen Coyle
Sent: Tuesday, November 5, 2002 7:31
To: [log in to unmask]
Subject: Re: [MODS] MODS revisions


At 06:23 PM 11/4/2002 +0100, Yves Pratter wrote:

> >Allowing researchers to use Pinyin at the keyboard rather than
> >forcing them through an alternate keyboard is still considered a "service"
> >by some.
>if you want to provide such a facility, OK, you could provide an optional field
>that says the original data were written in Pinyin (Pinyin is a language,
>not a charset?).

That's exactly what I want, an optional field. But Pinyin is neither a
language nor a character set, it's a transliteration standard. Here's a
Pinyin field for a Chinese book title:
    Ho Ching-ming ts?ung k?ao /  Pai Jun-te chu
In the vernacular, instead of those Latin characters you would see Chinese
characters. The Chinese has been rendered more or less phonetically to put
it into the Latin character set. That's not a two-way street, however.

>But all MODS data should be in unicode.

It is. Pinyin, or other transliterations, are written in Latin characters,
which are represented in Unicode. The entire MODS record can be in Unicode
and use only letters A-Z,a-z. Let's not confuse "Unicode" with "scripts".


> >The fact is that people using MARC *do* have both kinds of fields in their
> >records, so by not including them we run the risk of making MODS less
> >useful and therefore less used.
>Currently, MARC software doesn't yet support MODS.
>So when software engineers provide import/export modules for MODS,
>they will provide automatic (if possible) transliteration from/to Unicode.

With Western European languages it is (often) possible to translate from a
character set like ISO 8859-1 to the Unicode equivalent. But it is not
possible to translate from a *transliteration* of Chinese or Russian to the
vernacular characters of the original language. So my concern is not with
languages that use a Latin-based script but with ones that do not.
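
The difference between the two kinds of conversion can be sketched in a few lines of Python. The romanization table below is a toy assumption for illustration only (toneless Pinyin for three characters), not a real transliteration library; the helper name `candidates` is likewise hypothetical.

```python
# Character-set conversion is lossless: every ISO 8859-1 byte maps to exactly
# one Unicode code point, so the round trip recovers the original bytes.
latin1_bytes = "Bibliothèque".encode("iso-8859-1")
as_unicode = latin1_bytes.decode("iso-8859-1")
assert as_unicode.encode("iso-8859-1") == latin1_bytes

# Transliteration is many-to-one. In this toy table (an assumption for
# illustration), two different characters share the toneless romanization "hao".
TOY_ROMANIZATION = {"好": "hao", "号": "hao", "早": "zao"}


def candidates(romanized):
    """Return every character in the toy table that romanizes to `romanized`."""
    return sorted(ch for ch, r in TOY_ROMANIZATION.items() if r == romanized)


# More than one candidate comes back, so there is no unique reverse mapping.
print(candidates("hao"))

# The romanized text is plain ASCII, which is still valid Unicode.
print("hao".isascii())
```

Even in this tiny table the reverse lookup is ambiguous, which is why a record that holds only the transliteration cannot be converted back to the vernacular automatically.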
*********************************************
Karen Coyle           [log in to unmask]

            http://www.kcoyle.net
**********************************************