This is why S.l. and S.n. should have been left alone in RDA. And relator codes should be preferred. Let the machines deal with them not fallible humans.

Michael Mitchell
Technical Services Librarian
Brazosport College
Lake Jackson, TX
Michael.mitchell at brazosport.edu

From: Bibliographic Framework Transition Initiative Forum [mailto:[log in to unmask]] On Behalf Of Diane Hillmann
Sent: Wednesday, May 29, 2013 12:00 PM
To: [log in to unmask]
Subject: Re: [BIBFRAME] Consistency

Roy:

I found the same kinds of things when aggregating NSDL data about a decade ago, though of course on a smaller scale! (Defaults with various misspellings of 'unknown' were my particular trigger).

I think that what would help us avoid having to cope with crappy text into our dotage is to build tools that help us serve up standardized text when we think we still need it, while not actually creating or storing it as text. We know humans will continue to make these kinds of errors if we ask them to enter text during the cataloging process, but if users need to see these kinds of notes, we need to build smarter tools to make it happen. Continuing to rant about the imperfect humans around us doesn't help at all.

Diane

On Wed, May 29, 2013 at 10:49 AM, Tennant,Roy <[log in to unmask]<mailto:[log in to unmask]>> wrote:

On 5/28/13 10:48 PM, "Bernhard Eversberg" <[log in to unmask]<mailto:[log in to unmask]>> wrote:

>Consistency is not hugely important for purely descriptive data...
>
>Consistency is of utmost importance for access-related data.

Agreed. But we nonetheless seem to have focused too much on consistency of descriptive data (for example, "ill." in collation statements) and yet not enough in access-related data (for example, we are unable to consistently determine when a URL will take the user to the full item). And as the table that my colleague Ralph LeVan provided earlier demonstrates, our data is horribly inconsistent in the aggregate.

Here is but a beginning list of the problems we face in trying to be consistent:

1) Rules that are inexact or difficult to understand.
2) An unclear understanding, or an imperfect use (whether deliberate or inadvertent), of those rules.
3) Typographical errors.
4) Data acceptance systems (either single record or batch) that fail to validate appropriate elements.
5) Violation of rules for local purposes (for example, putting data in a different element so it will display in a particular system; or adding HTML markup to elements for local display purposes).
6) etc.

I'd like to assert that these problems are in our past, but I clearly cannot. Let's take the 264 field for example[1]. Recently created, these fields are now pouring into WorldCat (in Jan. we found 56,706 such fields and in April we found 158,019 -- nearly three times as many). Meanwhile, the rules seem fairly specific about what one should do if the place of publication is not apparent[2]: put "[Place of publication not identified] :" in the $a. Not any of these:

[Place of publication not identified :
[place of publication not identified] :
Place of publication not identified :
[Place of publication unknown] :
[Place of publication not given] :
Unknown place of publication :
[place of publication not indicated] :
[Place of publication not known] :
Unknow place of publication :
No place of publication :
Place of publication unknown :

All of which (and more) already occur[3], and more still as they continue to pour in.

So I guess my point is this: we all need to own this problem and work against the forces of inconsistency outlined above and others that may occur to you. These will include a wide variety of techniques that must encompass the entire library metadata ecosystem -- from the individual cataloger to the massive aggregators like my employer.

Roy Tennant
OCLC Research

P.S. And please don't get me started on that colon. One rant per day is quite enough.

[1] http://experimental.worldcat.org/marcusage/264.html
[2] http://www.loc.gov/marc/bibliographic/bd264.html
[3] http://experimental.worldcat.org/marcusage/pob.txt and for additional amusement, see all the ways "New York" has already been entered here: http://experimental.worldcat.org/marcusage/ny.txt
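
As an illustration of the kind of validation a data acceptance system could apply to the 264 $a variants listed in this thread, here is a minimal Python sketch. It is not how WorldCat or any cataloging client actually works: the function name normalize_place and the regular expression are hypothetical, and the pattern covers only the variants quoted above, so real data would need a broader rule set and human review.

```python
import re

# Canonical wording named in the thread for an unidentified place of publication.
CANONICAL = "[Place of publication not identified]"

# Hypothetical pattern: matches only the variant wordings quoted in the thread,
# with or without brackets, in any letter case.
_PLACEHOLDER = re.compile(
    r"\[?\s*(?:"
    r"unknown?\s+place\s+of\s+publication"
    r"|no\s+place\s+of\s+publication"
    r"|place\s+of\s+publication\s+"
    r"(?:not\s+(?:identified|given|indicated|known)|unknown)"
    r")\s*\]?",
    re.IGNORECASE,
)


def normalize_place(subfield_a: str) -> str:
    """Return the canonical placeholder if $a is a known variant of it.

    Any other value (a real place name, for instance) is returned unchanged.
    """
    text = subfield_a.strip()
    has_colon = text.endswith(":")       # remember trailing ISBD punctuation
    core = text.rstrip(":").strip()      # compare without the colon
    if _PLACEHOLDER.fullmatch(core):
        return CANONICAL + (" :" if has_colon else "")
    return subfield_a


if __name__ == "__main__":
    for value in ("[Place of publication not identified :",
                  "Unknow place of publication :",
                  "New York :"):
        print(repr(value), "->", repr(normalize_place(value)))
```

A check like this could flag or rewrite variants at load time. Storing a coded value and generating the display text from it, as Diane suggests, would avoid the free-text problem altogether.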