Reply to Foster and Karen:
I read some articles in French and English to understand the problem of
transliteration (romanization) of languages like Chinese, Arabic,
a systematic way to represent the sounds of words in one language using the
writing system of another language.
Transliteration is not
a translation:
北京 is romanized bei3 jing1 or Beijing in pinyin, but the translation in French is
As Foster noticed,
transliteration is very useful to "help people learn the language's pronunciation". In China, many people do not read or write
vernacular Chinese because it is too difficult, but they could easily learn
MODS data should be in Unicode.
is. Pinyin, or other transliterations, are written in Latin characters,
>which are represented in Unicode. The entire
MODS record can be in Unicode
use only letters A-Z,a-z. Let's not confuse "Unicode" with
understood that the problem is a bit more complex:
The Chinese word 拼音 is
romanized Pinyin. OK, it uses "only" Roman characters, so no problem with
But in fact the tone marks are missing.
The exact romanization should be PĪN YĪN with
Unicode characters that display the macron ("bar"), which represents the high level
With "only" ASCII characters, the tones are represented by
a number, so here the romanized version is Pin1
problem is how to specify the transliteration used?
(Unicode with tones, ASCII with numbers, ASCII without tones), bopomofo,
we always use the Unicode version in MODS?
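To make the three pinyin forms listed above concrete (Unicode with tone marks, ASCII with tone numbers, ASCII without tones), here is a minimal Python sketch that converts between them. The function names `mark_syllable`, `numbered_to_marked` and `strip_tones` are hypothetical, and the placement rule is the usual simplified one ('a' or 'e' takes the mark, then the 'o' of 'ou', otherwise the last vowel). It assumes syllables are separated by spaces and is not a complete pinyin implementation.

```python
import unicodedata

# Combining diacritics for tones 1-4: macron, acute, caron, grave.
TONE_MARKS = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}


def mark_syllable(syllable: str) -> str:
    """Convert one numbered syllable such as 'bei3' to 'běi'."""
    if not syllable or not syllable[-1].isdigit():
        return syllable  # no tone number: leave unchanged
    body, tone = syllable[:-1], int(syllable[-1])
    if tone not in TONE_MARKS:
        return body  # neutral tone or unknown digit: just drop the digit
    lower = body.lower()
    # Simplified placement rule: 'a' or 'e' first, then the 'o' of 'ou',
    # otherwise the last vowel of the syllable.
    if "a" in lower:
        pos = lower.index("a")
    elif "e" in lower:
        pos = lower.index("e")
    elif "ou" in lower:
        pos = lower.index("ou")
    else:
        vowels = [i for i, ch in enumerate(lower) if ch in "iouü"]
        if not vowels:
            return body
        pos = vowels[-1]
    marked = body[:pos + 1] + TONE_MARKS[tone] + body[pos + 1:]
    # NFC merges the letter and the combining mark into one character.
    return unicodedata.normalize("NFC", marked)


def numbered_to_marked(text: str) -> str:
    """ASCII with numbers -> Unicode with tone marks."""
    return " ".join(mark_syllable(s) for s in text.split())


def strip_tones(text: str) -> str:
    """Unicode with tone marks -> ASCII without tones.

    Note: this removes every combining mark, so 'ü' also becomes 'u'.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(numbered_to_marked("bei3 jing1"))       # běi jīng
print(strip_tones("běi jīng"))                # bei jing
print(numbered_to_marked("pin1 yin1").isascii())  # False
```

The last line shows the point under discussion: the form with tone marks uses Latin letters but is not plain ASCII, so a record has to say which form it carries.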
Morning, New York</title>
type="transliteration">zhao shang hao, New
proposal of Foster to use attributes for MODS elements is a good
with the knowledge of the subtleties of transliteration, I think that the attribute
should be like this:
<title lang="zh" transliteration="pinyin-ascii">bei3 jing1</title>
<title lang="zh" transliteration="běijīng">北京</title>
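As an illustration of how a program could use such an attribute, here is a small Python sketch that reads a simplified, namespace-free record and picks the title by transliteration scheme. The `transliteration` attribute is used here as proposed in this email, with the attribute naming the scheme. The value `pinyin-unicode`, the label `vernacular` for titles without the attribute, and the function name `titles_by_scheme` are assumptions made only for the example.

```python
import xml.etree.ElementTree as ET

# Simplified record: one vernacular title and two romanized variants.
# "pinyin-unicode" is a hypothetical scheme name, not an agreed value.
RECORD = """<titleInfo>
  <title lang="zh">北京</title>
  <title lang="zh" transliteration="pinyin-ascii">bei3 jing1</title>
  <title lang="zh" transliteration="pinyin-unicode">běijīng</title>
</titleInfo>"""


def titles_by_scheme(xml_text: str) -> dict:
    """Map each transliteration scheme to the title text it labels.

    Titles without the attribute are filed under "vernacular"
    (a label chosen for this sketch only).
    """
    root = ET.fromstring(xml_text)
    result = {}
    for title in root.iter("title"):
        scheme = title.get("transliteration", "vernacular")
        result[scheme] = title.text
    return result


titles = titles_by_scheme(RECORD)
print(titles["vernacular"])    # 北京
print(titles["pinyin-ascii"])  # bei3 jing1
```

With the scheme named explicitly, a client limited to ASCII can ask for `pinyin-ascii` while another displays the vernacular form, without guessing from the characters.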
>Providing more options
allows users to make their own choices (in
this case making both vernacular and
transliteration data elements available, and either
or both can be used).
I understand that
transliteration is not a mere gadget: it could be very useful for authorities
(personal names, geographical names).
authority="lcsh" id="78087649" transliteration="pinyin">Mao
type="personal" authority="lcsh" see="78087649"
<name type="personal" authority="lcsh" see="78087649"
PS: In my "Unicode"
examples, I put the tone mark on the vowel, and maybe that is not right.
I use the HTML format
with Unicode characters, so I could display Chinese
In order to see
the tone signs correctly, I use a big font size.
So if your mail client uses
only ASCII characters, it will be more difficult to understand my email. In this
case, I could send you a