
Maybe I'm missing something, but I don't think field delimiters are much of an issue. Any given serialization requires certain characters to be escaped when they occur in the content of a field or property. In XML, BIBFRAME would have to escape angle brackets; in JSON, it would have to escape quotes, and so on.

marc2bibframe would expect a valid MARC record and transform/serialize it according to the rules of the target serialization.
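
To illustrate the escaping point, here is a minimal Python sketch using only the standard library. The sample title string and the variable names are made up for illustration, and this is not how marc2bibframe itself is implemented; it only shows that each serialization escapes its own reserved characters while a letter such as ǂ passes through as ordinary content.

```python
import json
from xml.sax.saxutils import escape

# Hypothetical field content containing characters that are
# reserved in XML (<, >, &) and in JSON (the double quote).
title = 'The "<ǂ>" click letter & its uses'

# XML: escape() rewrites &, < and > as entities; ǂ is left alone.
xml_fragment = "<title>" + escape(title) + "</title>"

# JSON: dumps() backslash-escapes the quotes; ensure_ascii=False
# keeps ǂ as a literal character instead of a \u escape.
json_fragment = json.dumps({"title": title}, ensure_ascii=False)

print(xml_fragment)
print(json_fragment)
```

In both cases the escaping belongs to the output format, not to the MARC data being transformed.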



If an existing source record can't be expressed in valid MARC, how _is_ it being stored? MARCXML would be able to handle it, and we could transform from MARCXML.



In any case, the MARC docs (http://www.loc.gov/marc/specifications/specrecstruc.html) show that the 2709 delimiter in question (strictly speaking, the subfield delimiter) is:

-----------------

    ASCII control character 1F(hex) (represented graphically in MARC 21 documentation as  ǂ or $), which is combined with a data element identifier to make up the subfield code which precedes each individual data element within a variable field. The ASCII name for the delimiter is unit separator (US).

-----------------



So there shouldn't be any confusing it with U+01C2 or U+2021, by machines anyway.
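
A small Python sketch of why a machine cannot mix these up: the delimiter in the stored record is the control character 1F(hex), a different code point from either display glyph. The helper split_subfields and the sample field data are hypothetical, written only for this illustration, and the sketch assumes the field data has already been decoded to a Unicode string with the indicators stripped off.

```python
import unicodedata

# ASCII unit separator (US), 1F(hex), per the spec text quoted above.
SUBFIELD_DELIMITER = "\x1f"

def split_subfields(field_data):
    """Split variable-field data into (code, value) pairs.

    Hypothetical helper: assumes the indicators were already removed,
    so the data starts with a delimiter followed by a one-character code.
    """
    chunks = field_data.split(SUBFIELD_DELIMITER)[1:]
    return [(chunk[0], chunk[1:]) for chunk in chunks if chunk]

# Made-up field content that contains both display glyphs as real data.
field_data = (
    SUBFIELD_DELIMITER + "a" + "ǂHoan language"
    + SUBFIELD_DELIMITER + "b" + "notes ‡ marked"
)

subfields = split_subfields(field_data)
print(subfields)

# Three different code points, with three different Unicode categories.
for ch in (SUBFIELD_DELIMITER, "\u01c2", "\u2021"):
    print(f"U+{ord(ch):04X}", unicodedata.category(ch))
```

The click letter and the double dagger survive as content inside the subfield values, because splitting only ever looks for the control character.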



Nate



---------------------------------------------------------------

Nate Trail

---------------------------------------------------------------

Network Development and MARC  Standards Office

Technology Policy Mail stop 4402

Library Services

Library of Congress

202-707-2193

[log in to unmask]







-----Original Message-----
From: Bibliographic Framework Transition Initiative Forum [mailto:[log in to unmask]] On Behalf Of Riley, Charles
Sent: Tuesday, February 05, 2013 1:21 PM
To: [log in to unmask]
Subject: Re: [BIBFRAME] Punctuation



> The character used for a field delimiter on one system, ǂ, is the
> alveolar click letter used in print in Khoesan languages, supported in
> ISO 6438 and therefore, by extension, in UNIMARC.
>
> Other systems use ‡ as the field delimiter.



>>Was this intentional? U+01C2 and U+2021 could be easily confused if the font is lame enough.



The guidance within MARC21 is vague on how the delimiter should be represented graphically; all it requires, I think, is that the control character 1F be the underlying stored value for the delimiter.  So there's no intentionality to be ascribed; it just happens to hinder the use of characters that are semantically encoded as letters in certain languages.  Putting the font question aside, if it can be understood that there's now an opportunity to fix this in the design of a new framework, we could be on the right track going forward.