There are two issues that we have been discussing under this email subject:

1. transliteration, and how it should be used with MODS, and
2. language attributes for MODS elements.

For No. 1, we should move the discussion to a different email subject, so
that people who are not interested in it can skip it. I am also willing to
continue it by private email if most people on this list are not interested.

For No. 2, I do not think I have made clear what the real problem is.
My mistake was to use a made-up example. Let us look at a real LC record.

It is a MARC record for a Japanese material, with the correct language code
assigned to the record. The bibliographic data itself is in English, with
the title transliterated into romanized form.

It has seven 880 fields: four of them carry Japanese vernacular
representation, and three carry Chinese vernacular representation. By
default, the metadata itself is in English. (The same material could be
cataloged by Japanese or Chinese catalogers using their own "default"
language.)

Basically, we need:

1. a language indication for the material we are describing;
2. a language indication for the metadata itself, and
3. a language indication for any field/element that is not in the "default"
metadata language.
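
As a rough illustration of these three levels, here is a minimal Python
sketch. The class names, field values, and language codes are hypothetical
placeholders of mine, not part of MARC or MODS; the only point is that a
field-level language, when present, overrides the record default.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataField:
    tag: str
    value: str
    # Level 3: set only when the field is NOT in the default metadata language.
    lang: Optional[str] = None


@dataclass
class Record:
    material_lang: str   # Level 1: language of the material being described
    metadata_lang: str   # Level 2: default language of the metadata itself
    fields: List[DataField] = field(default_factory=list)

    def effective_lang(self, f: DataField) -> str:
        # A field-level language overrides the record-level default.
        return f.lang or self.metadata_lang


# Hypothetical record: Japanese material, cataloged in English.
record = Record(
    material_lang="jpn",
    metadata_lang="eng",
    fields=[
        DataField("245", "(romanized title)"),
        DataField("880", "(Japanese vernacular form)", lang="jpn"),
        DataField("880", "(Chinese vernacular form)", lang="chi"),
    ],
)

langs = [record.effective_lang(f) for f in record.fields]
print(langs)  # ['eng', 'jpn', 'chi']
```

Without the per-field value, both 880 fields would fall back to "eng",
which is exactly the information loss I am worried about.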

Why do we need so many language indicators for the metadata?

1. For the users you intend to serve: people who do not read Japanese or
Chinese have no other way to know which language the data in each field of
the record is in.

2. For search to work correctly: how else can the system index the data
correctly? Indexing Chinese data as Japanese, or Japanese data as Chinese,
will only produce incorrect search results.

3. For the library system to display correctly: if you show Chinese data in
a Japanese font, or Japanese data in a Chinese font, you will offend one
group of users or the other. It is incorrect, it is a culturally sensitive
issue, and it tells users how poorly the system was designed.

4. Most importantly, for global information access: the same material should
be matched by any metadata search that carries the original vernacular
representation, no matter where the metadata was created. (This issue is
very close to the topic of next year's IFLA.)

Last, we should not depend on the encoding system to tell us what language
the data being processed is in. With Unicode especially, the language and
its associated properties (ordering/sorting, display font) are no longer
implied by the encoding, as they were with ASCII or GB. It is up to the
metadata standard to maintain this important information for global
information accessibility (by providing language information with the
data).
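
To illustrate the point about Unicode, here is a small sketch using Python's
standard unicodedata module. The code point U+76F4 is my own choice of
example: a unified Han ideograph that occurs in both Chinese and Japanese
text. The character database gives it a name and a general category, but
nothing that says which language a given occurrence belongs to.

```python
import unicodedata

# U+76F4: a unified Han ideograph used in both Chinese and Japanese text.
ch = "\u76f4"

name = unicodedata.name(ch)
category = unicodedata.category(ch)
print(name)      # CJK UNIFIED IDEOGRAPH-76F4
print(category)  # Lo (Letter, other)

# Hypothetical field values: the same code point tagged with two languages.
# Only the tag carried alongside the data tells them apart.
tagged = [("jpn", ch), ("chi", ch)]
same_code_point = tagged[0][1] == tagged[1][1]
print(same_code_point)  # True
```

The language tags in the last step have to come from somewhere outside the
character data, which is why I think the metadata standard must carry them.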

If we want something that closely matches MARC21 but in XML schema format,
there is MARCXML. I hope that MODS can provide a bridge that enables us to
move to the next generation of descriptive metadata standards: one that
carries the principles of MARC, but with additional capabilities that the
MARC standard has limited. Along this line, I am suggesting that the
language attribute be made a generic attribute that can be used with any
element of MODS.
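
As a sketch of what this could look like, the following Python snippet
builds a small MODS-like fragment with a language attribute on individual
elements. Using the standard xml:lang attribute for this is my assumption,
not something the schema discussion has settled; the element content and
the language codes are placeholders, and the MODS namespace is omitted for
brevity.

```python
import xml.etree.ElementTree as ET

# Qualified name of the standard xml:lang attribute (an assumed choice
# for the generic language attribute proposed above).
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

mods = ET.Element("mods")

# Placeholder content; a real record would carry the vernacular strings.
title_info = ET.SubElement(mods, "titleInfo")
title = ET.SubElement(title_info, "title", {XML_LANG: "ja"})
title.text = "(Japanese vernacular title)"

note = ET.SubElement(mods, "note", {XML_LANG: "zh"})
note.text = "(Chinese vernacular note)"

xml_text = ET.tostring(mods, encoding="unicode")
print(xml_text)
```

Because the attribute is attached per element, a consumer can pick the
index, sort order, or font for each value without guessing from the
characters themselves.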

Foster Zhang
Systems Department         (650) 725-7924
Green Library East, 2nd Fl.(650) 723-3038 (fax)
Stanford University        [log in to unmask]
Stanford, CA  94305-6069

-----Original Message-----
From: Metadata Object Description Schema List [mailto:[log in to unmask]]On Behalf
Of Karen Coyle
Sent: Thursday, November 7, 2002 17:23
To: [log in to unmask]
Subject: Re: [MODS] language attributes for MODS element.

At 04:44 PM 11/6/2002 +0100, you wrote:
>I understood that the problem is a bit more complex:
>The Chinese word 拼音 is romanized Pinyin. Ok, it uses "only" roman
>characters, so no problem with unicode.
>But in fact the tone marks are missing.

Yves, I don't think it is up to the MODS format to determine what is and
what isn't "correct" transliteration. Different communities will use
different ones. The US library community is in the process of moving from
one transliteration method to another for Chinese. (Kind of the "it used
to be Peking now it's Beijing" analogy.) What the format needs to do is to
allow creators of metadata to carry and identify fields that represent
different choices in how the data is presented.

Karen Coyle           [log in to unmask]