When we have a personal name that conflicts with another heading,
LCRI 22.17-22.20 tells us to add subfield $q if some part of the
heading is an initial or other abbreviation and we have the full form
for that initial or abbreviation. Note that both parts of this
condition have to be fulfilled in order for us to add $q at this
point: we not only have to know what the full name is, but we have to
have an abbreviation or initial in subfield $a.
(Although it's somewhat beside the point, I can't resist mentioning
at the outset the careful use in the RI of the terms "initial" and
"abbreviation." These terms do not, I think, include shortened or
familiar forms such as "Bill" or "Bea" or "Tom" or "Rudy" or "Greg"
or "Steve", or nicknames such as "Bull" or "Red". For shortened or
familiar forms, $q is not authorized by clause 1a of the rule interpretation.)
If adding subfield $q giving the full form for an initial or
abbreviation doesn't make the heading unique (or if we don't have
full forms for abbreviations or initials, or if the name contains no
abbreviations or initials), we add dates in $d if available. (I'm
quite aware that under the separate LCRI 22.17, we will actually have
already added dates to a new heading if they are available; but this
if anything reinforces the point I want to make about 22.17-20, and
$q in particular.)
I have always assumed that this RI presents things in hierarchical
fashion: you start at the top, and you stop as soon as the exercise
of one of the possibilities produces a unique heading. Were this not
the case (i.e., if we're supposed to apply all of the possibilities
even where not needed), then there wouldn't be any need for the
explicit instruction to add $q for abbreviation/initials and $d for
dates when both are available.
I'll throw in this aside for completeness and to avoid confusion:
Later on in the rule interpretation, we're told that (if all of the
above stuff has failed to produce a unique heading) we can add
subfield $q for parts of the name not present in subfield $a even
though abbreviations are not involved. But we would only do this, I
hasten to emphasize, if the application of foregoing instructions has
not already given us a unique heading. (The RI mentions yet other
possibilities for disambiguating headings, which are beside the point
of this diatribe.)
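The "stop as soon as the heading is unique" reading can be sketched in a few lines of Python. This is only an illustration of the hierarchy as described above, not a statement of what the RI literally says: the function name build_heading, its parameters, and the plain-string rendering of headings are all assumptions made for the example. Because dates are added up front under LCRI 22.17, the $q provisions reduce here to "add $q only if the dated heading still conflicts."

```python
def build_heading(name, fuller_form=None, dates=None, existing=()):
    """Return the first candidate heading that does not conflict.

    Hypothetical sketch: headings are plain strings, and a "conflict"
    is simply an exact match against a heading already in the file.
    """
    taken = set(existing)

    def render(use_q):
        heading = name
        if use_q:
            heading += f" ({fuller_form})"   # subfield $q
        if dates:
            heading += f", {dates}"          # subfield $d, added when available
        return heading

    # Try the least elaborate form first; add $q only if still needed.
    steps = [False]
    if fuller_form:
        steps.append(True)
    for use_q in steps:
        candidate = render(use_q)
        if candidate not in taken:
            return candidate
    return None  # other means of disambiguation are not modeled here


existing = ["Strawn, Robert, 1945", "Strawn, Robert, 1949-2003"]
print(build_heading("Strawn, Robert", "Robert Michael", "1946-", existing))
```

With those two existing headings, the dated form is already unique, so the sketch never reaches the step that adds $q.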
There's a reason we shouldn't use $q unless necessary, although I
have no way of knowing whether this reason was part of the design of
the RI: it usually produces a better filing order. To illustrate, let's assume
we have an existing, neatly-ordered file along these lines (I'm
making up this example to protect the guilty; this list should be
assumed to contain some things that represent 100 fields, and some
that represent 400 fields; beside the point):

    Strawn, Robert, 1793-1872
    Strawn, Robert, 1915-
    Strawn, Robert, 1945
    Strawn, Robert, 1949-2003
    Strawn, Robert, 1951-
    Strawn, Robert, 1978
    Strawn, Robert A., 1936-
    Strawn, Robert B.
    Strawn, Robert C., 1947-
    Strawn, Robert Conrad, Mrs., 1895-1977
    <dozens more Robert Strawns here>

So far, so lovely. Now we've got a new Robert Strawn, and at the
time we're establishing the heading we know that his middle name is
Michael, and that he was born in 1946. Applying LCRI 22.17-22.20
(or, to be more precise, not applying it, because in fact we've
already added dates according to 22.17 so our heading is already
unique), we will not add subfield $q, and we will end up with this
heading, which falls very neatly into the above sequence:

    Strawn, Robert, 1946-

If, on the other hand, we were to throw into the heading everything
we know about the person even if not necessary, we would end up with
this heading, which is going to end up at some point in the list
(given current sorting regimes) that is probably less than helpful:

    Strawn, Robert (Robert Michael), 1946-

(Warning: Don't even get me started on the sort order provided by the
current group of library automation vendors.)
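To make the filing problem concrete, here is a small Python sketch. The normalization in sort_key (lowercase, punctuation treated as a space) is purely an assumption for illustration; it is not the sort rule of any particular system, and a different rule would put the $q heading somewhere else again.

```python
import re

def sort_key(heading):
    # Hypothetical normalization: lowercase, then collapse every run of
    # punctuation or whitespace into a single space.
    return re.sub(r"[^a-z0-9]+", " ", heading.lower()).strip()

existing_file = [
    "Strawn, Robert, 1793-1872",
    "Strawn, Robert, 1915-",
    "Strawn, Robert, 1945",
    "Strawn, Robert, 1949-2003",
    "Strawn, Robert, 1951-",
    "Strawn, Robert, 1978",
    "Strawn, Robert A., 1936-",
    "Strawn, Robert B.",
    "Strawn, Robert C., 1947-",
    "Strawn, Robert Conrad, Mrs., 1895-1977",
]

without_q = "Strawn, Robert, 1946-"
with_q = "Strawn, Robert (Robert Michael), 1946-"

ordered_without_q = sorted(existing_file + [without_q], key=sort_key)
ordered_with_q = sorted(existing_file + [with_q], key=sort_key)

print(ordered_without_q.index(without_q))  # position among the dated Roberts
print(ordered_with_q.index(with_q))        # position when $q is present
```

Under this assumed rule, the heading without $q lands between the 1945 and 1949-2003 entries, while the heading with $q files after "Strawn, Robert Conrad," far from the other plain Robert Strawns.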
So far, so clear, I hope. In the absence of an abbreviation/initial
we don't use $q if we have $d, unless nothing else will serve to
produce a unique heading; and that's for a good reason.
A recent traversal through new LC/NACO records issued to date in 2007
turns up 284 cases of personal names that do not have a full stop in
subfield $a (and are therefore assumed not to involve an abbreviation
or initial) and contain both subfield $q and $d. (I didn't consider
name/title headings in this tabulation. We're talking about name
headings, so things with subject subdivisions don't come into the
equation, either.) My working assumption is that these 284 personal
name headings were constructed in error.
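The selection test described above is simple enough to state as code. The following Python sketch assumes a heading is available as a list of (subfield code, value) pairs; that representation, the function name is_likely_error, and the set of subject-subdivision codes are assumptions for illustration, not the program actually used for the count.

```python
SUBJECT_SUBDIVISIONS = {"v", "x", "y", "z"}

def is_likely_error(subfields):
    """Flag a personal name heading that has $q and $d but no full stop in $a.

    subfields: list of (code, value) pairs, e.g. [("a", "Strawn, Robert"), ...]
    """
    codes = {code for code, _ in subfields}
    # Name/title headings and headings with subject subdivisions are out of scope.
    if "t" in codes or codes & SUBJECT_SUBDIVISIONS:
        return False
    name = "".join(value for code, value in subfields if code == "a")
    # No full stop in $a is taken to mean no initial or abbreviation.
    return "." not in name and {"q", "d"} <= codes


print(is_likely_error([("a", "Strawn, Robert"),
                       ("q", "(Robert Michael),"),
                       ("d", "1946-")]))
```

A flagged heading is only a candidate: as the manual check described below shows, some combinations of $q and $d turn out to be warranted.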
To make things easier (on me if not on you), I concentrated on the
contributing institutions with 10 or more headings in the "likely
error" pile; there were only 4 of these. (No, I'm not going to tell
you who they are, although 2 might be obvious enough. The point here
isn't to jump on any particular institution.) I'll call them A, B, C
and D. I manually checked each of the likely errors for these four
institutions against headings in the LC/NACO authority file. I found
that a few instances of co-occurring $q and $d were in fact warranted
by existing headings. (In other words, for a few of the "likely
errors" we have two different people using the same basic name; these
people were born in the same year but we do not have month and day of
birth for either; and we know about some unused parts of name for one
of them.) I removed these from my counts. (For institution A I
discarded 3 reported potential errors; for institution D, I discarded
2; none discarded for B and C.)
In the following tabulation, "contributed" records are: new personal
name records with no subfield $t. What I'm trying to tease out is
the ratio of erroneous headings to the total number of records
created: the rate for this particular kind of error. (The count of
errors doesn't include "likely errors" that turned out to be correct.)

    A: contributed 83,058 records, of which 76 are errors: error rate of 0.0915%
    B: contributed 346 records, of which 10 are errors: error rate of 2.89%
    C: contributed 957 records, of which 11 are errors: error rate of 1.149%
    D: contributed 23,505 records, of which 40 are errors: error rate of 0.1702%

For these four institutions taken as a group, the average error rate
is 0.127%. So one large contributor is doing a bit better than
average, another large contributor is not doing quite so well, and
the two smaller contributors are well above the average. My
impression, from spot-checking headings for institutions with a
smaller number of likely errors (including those produced by my own
institution, I hasten to add), is that the error rates for these
would prove in most cases to be above the average as well, because
the average is weighted so heavily by the records generated by
institution A.
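For anyone who wants to check the arithmetic, the figures above can be reproduced with the short Python sketch below. The counts are the ones given in the tabulation; the variable names are just for the example. Note that the 0.127% "average" is the pooled rate (total errors over total records), which is why institution A dominates it.

```python
# (records contributed, errors) per institution, from the tabulation above
counts = {
    "A": (83058, 76),
    "B": (346, 10),
    "C": (957, 11),
    "D": (23505, 40),
}

# Per-institution error rate, as a percentage
rates = {name: 100 * errors / records
         for name, (records, errors) in counts.items()}

# Pooled rate: total errors divided by total records
total_records = sum(records for records, _ in counts.values())
total_errors = sum(errors for _, errors in counts.values())
pooled_rate = 100 * total_errors / total_records

for name, rate in rates.items():
    print(f"{name}: {rate:.4f}%")
print(f"pooled: {pooled_rate:.3f}%")
```

A simple unweighted mean of the four rates would come out far higher (a little over 1%), which is not the figure used here.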
So, finally, I come to my point: could we please restrict the use of
subfield $q to those cases where it is necessary and called for by
the rules we're supposed to be following?
Gary L. Strawn, Authorities Librarian, etc.
Northwestern University Library, 1970 Campus Drive, Evanston IL 60208-2300
voice: 847/491-2788 fax: 847/491-8306
Forsan et haec olim meminisse iuvabit. BatchCat version: 2006.51.826