Karen Coyle wrote:
>> The section symbol should just be a section symbol, not ALSO a marker for a removed character, because it has meaning of its own. In that sense, the backslash is a better suggestion. <<
I agree with respect to the section symbol, but the backslash is also a poor candidate. While not extensively used in bibliographic data, it is heavily used in environments that deal with bibliographic data. We don't need to contribute to the confusion of \ as data and \ as a syntactic character. The use of hexadecimal DF is attractive precisely because it has no meaning in the MARC-8 environment. Depending on the environment, it may display as ß or € unless the displaying system replaces the default glyph with something else. I don't believe that either would cause a problem.
>> That said, I admit to having little enthusiasm for these replaced characters because they will make indexing and display less than functional. <<
The unmappable characters themselves are what cause the indexing and display problems. No solution applied on the sending end can resolve these problems, and I don't believe that any option under discussion would make them worse. Certainly dropping characters would not help.
>> Any "solution" of this kind needs to be seen as temporary, as most libraries with a need for the extended character set will move up to technology that allows it within a short period of time. Many have already done so. <<
It's not so much libraries with a need for the extended character set that we're considering here, but rather libraries that neither feel the need nor have the resources to support more characters, yet receive them anyway. Sure, the situation is temporary, but temporary in this case could mean 15 years or so.
>> (ps: Have we discussed whether each removed character would get a marker? In other words, would a 5-character sequence be replaced by "\" or by "\\\\\"?) <<
It never occurred to me that we would do anything other than replace on a character-by-character basis, so the 5-character sequence would become "ßßßßß" or "€€€€€" or something like that. Collapsing runs of unmappable characters wouldn't be worth the effort.
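A minimal sketch of that one-for-one replacement, in Python. It rests on assumptions that are not part of the proposal: the names `to_target` and `encode_with_markers` are hypothetical, plain ASCII stands in for the real MARC-8 repertoire (Python ships no MARC-8 codec), and a "character" is taken to be a single Unicode code point.

```python
MARKER = b"\xdf"  # hexadecimal DF, the placeholder byte under discussion


def to_target(ch):
    """Hypothetical stand-in for a real Unicode-to-MARC-8 mapping.

    Assumption: only ASCII is treated as mappable. Returns the encoded
    bytes, or None when the character has no mapping.
    """
    return ch.encode("ascii") if ord(ch) < 0x80 else None


def encode_with_markers(text):
    """Encode text, emitting exactly one marker byte per unmappable character."""
    out = bytearray()
    for ch in text:
        mapped = to_target(ch)
        # No collapsing of runs: every unmappable character gets its own marker.
        out += mapped if mapped is not None else MARKER
    return bytes(out)


# A run of five unmappable characters yields five marker bytes.
sample = "ab" + "\u4e2d\u6587\u5b57\u7b26\u4e32" + "cd"
encoded = encode_with_markers(sample)
```

Because nothing is collapsed, the count of unmappable characters in the original survives in the output, and the loop needs no lookahead or run tracking.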
>> Like other standards, we need to start thinking in terms of profiles and practices that layer on top of a basic standard like Unicode. We *do* however need to have a way to convey what standard or profile the data adheres to, and the way that we have currently defined Leader 09 is not sufficient to communicate these details. <<
Nothing that's coded within the individual record can adequately communicate such details. They need to be handled by a protocol at a level higher than the record.
Gary L. Smith
Software Architect
Product Architecture and Development
OCLC