In reply to Foster and Karen:
 
 
I read some articles in French and English to understand the problem of transliteration (romanization) of languages such as Chinese, Arabic, and Hebrew.
 
First, for people like me who are not familiar with this concept, here is the definition of transliteration from Wikipedia: http://www.wikipedia.org/wiki/Transliteration
Transliteration is a systematic way to represent the sounds of words in one language using the writing system of another language.
Transliteration is not translation:
北京 is romanized as bei3 jing1 or Beijing in pinyin, but its French translation is Pékin.
 
As Foster noticed, transliteration is very useful to "help people learn the language's pronunciation". In China, many people do not read or write vernacular Chinese because it is too difficult, but they could easily learn pinyin.
 
>>But all MODS data should be in unicode.
>It is. Pinyin, or other transliterations, are written in latin characters,
>which are represented in Unicode. The entire MODS record can be in Unicode
>and use only letters A-Z,a-z. Let's not confuse "Unicode" with "scripts".
 
As I understand it, the problem is a bit more complex:
The Chinese word 拼音 is romanized as Pinyin. OK, it uses "only" roman characters, so there is no problem with Unicode.

But in fact the tone marks are missing.
The exact romanization should be PĪN YĪN, using Unicode characters that carry the macron ("bar"), which represents the high level tone.
With "only" ASCII characters, each tone is represented by a number, so here the romanized version is Pin1 Yin1.
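
To make the relation between the two forms concrete, here is a minimal Python sketch that turns numbered pinyin into tone-marked pinyin. The function names are my own, and the placement rule it uses is the usual one as I understand it ("a" or "e" takes the mark, in "ou" the "o" takes it, otherwise the last vowel), so please treat it as an illustration rather than a reference implementation. It lowercases its input and does not handle "v" or "u:" as spellings of "ü".

```python
import re

# Tone-marked vowels indexed by tone number 1-4 (index 0 is the unmarked vowel).
TONE_MARKS = {
    "a": "aāáǎà",
    "e": "eēéěè",
    "i": "iīíǐì",
    "o": "oōóǒò",
    "u": "uūúǔù",
    "ü": "üǖǘǚǜ",
}

def mark_syllable(syllable: str) -> str:
    """Convert one numbered syllable such as 'bei3' to 'běi'."""
    match = re.fullmatch(r"([a-zü]+)([1-5])", syllable.lower())
    if not match:
        return syllable  # leave anything unexpected untouched
    letters, tone = match.group(1), int(match.group(2))
    if tone == 5:  # neutral tone: written without a mark
        return letters
    # Placement rule (assumed): 'a' or 'e' first, then the 'o' of 'ou',
    # otherwise the last vowel of the syllable.
    if "a" in letters:
        pos = letters.index("a")
    elif "e" in letters:
        pos = letters.index("e")
    elif "ou" in letters:
        pos = letters.index("o")
    else:
        vowels = [i for i, ch in enumerate(letters) if ch in TONE_MARKS]
        if not vowels:
            return letters
        pos = vowels[-1]
    marked = TONE_MARKS[letters[pos]][tone]
    return letters[:pos] + marked + letters[pos + 1:]

def numbered_to_marked(text: str) -> str:
    """Convert space-separated numbered syllables, e.g. 'pin1 yin1'."""
    return " ".join(mark_syllable(s) for s in text.split())

print(numbered_to_marked("pin1 yin1"))
print(numbered_to_marked("bei3 jing1"))
```

The reverse direction (marks to numbers) is also mechanical, which suggests that the ASCII and Unicode forms carry the same information and differ only in encoding.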
 
 
See Wikipedia for more about pinyin: http://www.wikipedia.org/wiki/Pinyin
 
So the problem is how to specify which transliteration is used:
pinyin (Unicode with tone marks, ASCII with tone numbers, ASCII without tones), bopomofo, Wade-Giles...
Should we always use the Unicode version in MODS?
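
As a rough illustration of why the variant needs to be stated explicitly, here is a small Python sketch that guesses which of the three pinyin forms a string uses. The function name and the labels "no-tones" and "not-latin" are hypothetical. It only looks at the script and at trailing digits, so it cannot tell pinyin from any other Latin-script text (a French title with accents would be labelled "pinyin"), which is exactly the argument for recording the scheme in the data.

```python
import re
import unicodedata

def classify_romanization(text: str) -> str:
    """Guess the form of a romanized string (heuristic, not a validator)."""
    letters = [ch for ch in text if ch.isalpha()]
    # Every letter must belong to the Latin script, with or without diacritics.
    if not letters or not all(
        unicodedata.name(ch, "").startswith("LATIN") for ch in letters
    ):
        return "not-latin"
    if not text.isascii():
        return "pinyin"        # Latin letters with diacritics: tone marks assumed
    if re.search(r"[a-z][1-5]\b", text.lower()):
        return "pinyin-ascii"  # tone numbers after syllables
    return "no-tones"          # plain ASCII, tones not recorded

for sample in ["北京", "běijīng", "bei3 jing1", "Beijing"]:
    print(sample, "->", classify_romanization(sample))
```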
 
   <title lang="en">Good Morning, New York</title>
   <title lang="zh" type="alternative">早上好,纽约</title>
   <title lang="zh" type="transliteration">zao shang hao, New York</title>

Foster's proposal to use attributes on MODS elements is a good approach.
But given the subtleties of transliteration, I think the attribute should look like this:
 
<title lang="zh">北京</title>
<title lang="zh" transliteration="pinyin-ascii">bei3 jing1</title>
<title lang="zh" transliteration="pinyin">běijīng</title>
 
or
 
<title lang="zh" transliteration="běijīng">北京</title>
 
or
 
<title lang="zh">
    北京
    <transliteration type="pinyin">běijīng</transliteration>
</title>
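
To compare how a program would read these shapes, here is a minimal Python sketch using the standard xml.etree.ElementTree module. The `<record>` wrapper element and the function names are only assumptions for the example, not part of MODS. It covers the first and third forms; the second form stores the romanized text in the attribute itself, so it leaves no place to say which scheme was used.

```python
import xml.etree.ElementTree as ET

# First form: sibling <title> elements, the attribute names the scheme.
SIBLING_TITLES = """<record>
  <title lang="zh">北京</title>
  <title lang="zh" transliteration="pinyin-ascii">bei3 jing1</title>
  <title lang="zh" transliteration="pinyin">běijīng</title>
</record>"""

# Third form: the transliteration is nested inside the vernacular title.
NESTED_TITLE = """<record>
  <title lang="zh">
    北京
    <transliteration type="pinyin">běijīng</transliteration>
  </title>
</record>"""

def read_sibling_titles(xml_text):
    """Return {scheme: text}; the key None marks the vernacular title."""
    root = ET.fromstring(xml_text)
    return {
        title.get("transliteration"): (title.text or "").strip()
        for title in root.iter("title")
    }

def read_nested_title(xml_text):
    """Return {scheme: text}; the key None marks the vernacular title."""
    root = ET.fromstring(xml_text)
    result = {}
    for title in root.iter("title"):
        result[None] = (title.text or "").strip()
        for item in title.findall("transliteration"):
            result[item.get("type")] = (item.text or "").strip()
    return result

print(read_sibling_titles(SIBLING_TITLES))
print(read_nested_title(NESTED_TITLE))
```

Both forms can carry several transliterations of the same title. The nested form keeps them attached to the vernacular text, while the sibling form relies on the reader to pair them up.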
 

>Providing more options allows users to make their own choices (in this case making both vernacular and transliteration data elements available, and either or both can be used).

I understand now that transliteration is not a gadget: it could be very useful for authorities (personal names, geographical names).
For example, here is a pseudo authority record for the "leader of the Chinese Communist Party":
 
<name type="personal" authority="lcsh" id="78087649" transliteration="pinyin">Mao Zedong</name>
<name type="personal" authority="lcsh" see="78087649" transliteration="wade-giles">Mao Tse-tung</name>
<name type="personal" authority="lcsh" see="78087649" >毛澤東</name>
...
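
Here is a minimal Python sketch of how such a pseudo authority could be used: every form of the name, whatever its script or transliteration, resolves to the heading that carries the id. The `<names>` wrapper and the function name are assumptions made for the example, and the `id`/`see` attributes are just the ones from my pseudo record above, not existing MODS attributes.

```python
import xml.etree.ElementTree as ET

AUTHORITY_XML = """<names>
  <name type="personal" authority="lcsh" id="78087649" transliteration="pinyin">Mao Zedong</name>
  <name type="personal" authority="lcsh" see="78087649" transliteration="wade-giles">Mao Tse-tung</name>
  <name type="personal" authority="lcsh" see="78087649">毛澤東</name>
</names>"""

def build_lookup(xml_text):
    """Map every name form to the heading its id or see attribute points to."""
    root = ET.fromstring(xml_text)
    names = list(root.iter("name"))
    # Headings are the entries that carry an id of their own.
    headings = {n.get("id"): n.text for n in names if n.get("id")}
    lookup = {}
    for n in names:
        target = n.get("id") or n.get("see")
        lookup[n.text] = headings.get(target)
    return lookup

lookup = build_lookup(AUTHORITY_XML)
print(lookup["Mao Tse-tung"])
print(lookup["毛澤東"])
```

A search on any of the three forms would then lead to the same record.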
 
 
see
http://authorities.loc.gov/cgi-bin/Pwebrecon.cgi?AuthRecID=1417334&v2=1&HC=1&SEQ=20021106102939&PID=13845
http://www.wikipedia.org/wiki/Mao_Zedong
Yves
 
PS: In my "Unicode" examples, I put the tone mark on a vowel, and it may not be on the right one.
I am using HTML format with Unicode characters so that I can display Chinese characters.
To make the tone marks easy to see, I used a large font size.
If your mail client only handles ASCII characters, my email will be harder to understand. In that case, I can send you a PDF.