At 10:33 AM 7/17/2003 -0500, you wrote:
>But if we are going to rely on unicode alone, then we might as well drop
>the lang, xml:lang and script attributes completely!

Unfortunately - or otherwise - you are talking to the person (moi) who argued against xml:lang in descriptive fields (those that are copied from the piece, such as author, title, publisher, etc.). What would you do with "Italian Cuisine" or "The Tao of Pooh"? What about a book title like "Siddhartha"? I don't think we want folks to have to determine whether a word that originates in another language is or isn't now considered part of English. And I also don't think we can expect people to make these distinctions for works in languages other than their own. Do you exclude proper nouns? Can you even positively determine what is a proper noun?

Sometimes this is easy:

   Andy Warhol : Ausstellung der Deutschen Gesellschaft für Bildende Kunst

Sometimes less so:

   On the effects of gypsum, or plaster of paris, as a manure;

Language distinctions make sense in some areas, such as subject headings, when there are subject heading schemes in different languages. In that case you need the language coding or some other coding that translates to a language; for example, the subject heading scheme of the Bibliothèque nationale de France will be assumed to be in French. Using this, you can ask your users what language they wish to search in and run their searches only against that set of headings, so that the English "the" and the French "thé" are not confused.

You don't need script attributes for Unicode, or so the documentation says. You can tell what script it is from the code range. Does anyone know if this really works? And if it works, is it practical?

kc
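
On the code-range question, here is a minimal Python sketch of the idea, not a tested answer. It assumes a small, hand-picked table of Unicode block ranges and treats a block as a stand-in for a script, which is only approximately right. The names SCRIPT_RANGES, script_of and scripts_in are made up for illustration and do not come from any library.

```python
# Abbreviated, illustrative table of Unicode block ranges.
# Assumption: blocks approximate scripts; the real standard has many more.
SCRIPT_RANGES = [
    (0x0000, 0x024F, "Latin"),     # Basic Latin through Latin Extended-B
    (0x0370, 0x03FF, "Greek"),     # Greek and Coptic
    (0x0400, 0x04FF, "Cyrillic"),
    (0x0590, 0x05FF, "Hebrew"),
    (0x0600, 0x06FF, "Arabic"),
    (0x3040, 0x309F, "Hiragana"),
    (0x30A0, 0x30FF, "Katakana"),
    (0x4E00, 0x9FFF, "CJK"),       # CJK Unified Ideographs
]


def script_of(ch):
    """Return the script name for one character, based only on its code point."""
    code = ord(ch)
    for low, high, name in SCRIPT_RANGES:
        if low <= code <= high:
            return name
    return "Unknown"


def scripts_in(text):
    """Return the set of scripts used by the letters in text.

    Digits, spaces and punctuation are skipped, because they are shared
    between scripts and would otherwise all be counted as Latin here.
    """
    return {script_of(ch) for ch in text if ch.isalpha()}


if __name__ == "__main__":
    for title in ("Siddhartha",
                  "Andy Warhol : Ausstellung der Deutschen Gesellschaft "
                  "für Bildende Kunst"):
        print(title, "->", sorted(scripts_in(title)))
```

Under these assumptions the lookup reports the script only: "Siddhartha" and the German exhibition title both come out as Latin, so the code range by itself says nothing about the language of a field.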