On 5/28/13 10:48 PM, "Bernhard Eversberg" <[log in to unmask]>
wrote:
>Consistency is not hugely important for purely descriptive data...
>
>Consistency is of utmost importance for access-related data.
Agreed. But we nonetheless seem to have focused too much on consistency of
descriptive data (for example, "ill." in collation statements) and yet not
enough on consistency of access-related data (for example, we are unable to
consistently determine when a URL will take the user to the full item).
And as the table that my colleague Ralph LeVan provided earlier
demonstrates, our data is horribly inconsistent in the aggregate.
Here is but a beginning list of the problems we face in trying to be
consistent:
1) Rules that are inexact or difficult to understand.
2) An unclear understanding, or an imperfect use (whether deliberate or
inadvertent), of those rules.
3) Typographical errors.
4) Data acceptance systems (either single record or batch) that fail to
validate appropriate elements.
5) Violation of rules for local purposes (for example, putting data in a
different element so it will display in a particular system; or adding
HTML markup to elements for local display purposes).
6) etc.
I'd like to assert that these problems are in our past, but I clearly
cannot. Let's take the 264 field for example[1]. Recently created, these
fields are now pouring into WorldCat (in Jan. we found 56,706 such fields
and in April we found 158,019 -- nearly three times as many). Meanwhile,
the rules seem fairly specific about what one should do if the place of
publication is not apparent[2]: put "[Place of publication not identified]
:" in the $a. Not any of these:
[Place of publication not identified :
[place of publication not identified] :
Place of publication not identified :
[Place of publication unknown] :
[Place of publication not given] :
Unknown place of publication :
[place of publication not indicated] :
[Place of publication not known] :
Unknow place of publication :
No place of publication :
Place of publication unknown :
All of these (and more) already occur[3], and still more appear as records
continue to pour in.
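To make the validation gap in item 4 above concrete, here is a minimal
Python sketch of the kind of check a data acceptance system could run on a
264 $a value. It is only an illustration: the function name classify_264a
and the "contains the phrase 'place of publication'" heuristic are
assumptions made for this example, not anything taken from the MARC
documentation or from how WorldCat actually processes records.

```python
from collections import Counter

# The exact placeholder text cited above for 264 $a (without the colon).
EXPECTED = "[Place of publication not identified]"


def classify_264a(value: str) -> str:
    """Classify a 264 $a value as 'ok', 'variant', or 'other'.

    Hypothetical heuristic: any value that mentions "place of publication"
    but does not match EXPECTED exactly is treated as a malformed variant.
    """
    # Drop surrounding whitespace and the trailing ISBD colon, if present.
    core = value.strip().rstrip(":").strip()
    if core == EXPECTED:
        return "ok"
    if "place of publication" in core.lower():
        return "variant"
    # Anything else is presumably a real place name (or some other problem).
    return "other"


def tally(values):
    """Count how many values fall into each class."""
    return Counter(classify_264a(v) for v in values)


if __name__ == "__main__":
    samples = [
        "[Place of publication not identified] :",
        "[place of publication not identified] :",
        "Unknow place of publication :",
        "New York :",
    ]
    for s in samples:
        print(f"{classify_264a(s):8} {s}")
```

A check like this would only flag suspect values for review; deciding
whether to reject or normalize them is a separate policy question.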
So I guess my point is this: we all need to own this problem and work
against the forces of inconsistency outlined above and others that may
occur to you. This will require a wide variety of techniques that must
encompass the entire library metadata ecosystem -- from the individual
cataloger to the massive aggregators like my employer.
Roy Tennant
OCLC Research
P.S. And please don't get me started on that colon. One rant per day is
quite enough.
[1] http://experimental.worldcat.org/marcusage/264.html
[2] http://www.loc.gov/marc/bibliographic/bd264.html
[3] http://experimental.worldcat.org/marcusage/pob.txt and for additional
amusement, see all the ways "New York" has already been entered here:
http://experimental.worldcat.org/marcusage/ny.txt