Hal is right. Over the last decade the WorldCat database has changed
greatly in nature due to the batch loading of millions of records from
non-member libraries.
The 2009 OCLC annual report [1] has a chart on page 12 showing that
WorldCat had 39 million records in 1998 and 139 million in 2009, most
of which came from batch loading. The 2012 report gives a figure of
237 million records. That's about 6x growth in less than 15 years.
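As a quick sanity check on that growth figure, here is a minimal Python sketch that uses only the record counts quoted above. The function names are my own, and the annual rate assumes smooth compound growth between 1998 and 2012, which batch loading almost certainly did not follow.

```python
def growth_factor(start_count, end_count):
    """Return how many times larger end_count is than start_count."""
    return end_count / start_count


def annual_growth_rate(start_count, end_count, years):
    """Return the compound annual growth rate as a fraction.

    Assumes smooth year-over-year growth, which is only an approximation.
    """
    return (end_count / start_count) ** (1 / years) - 1


# Record counts in millions, as quoted from the OCLC annual reports.
records_1998 = 39
records_2009 = 139
records_2012 = 237

factor_2009 = growth_factor(records_1998, records_2009)
factor_2012 = growth_factor(records_1998, records_2012)
rate = annual_growth_rate(records_1998, records_2012, 2012 - 1998)

print(f"1998 -> 2009: {factor_2009:.1f}x")
print(f"1998 -> 2012: {factor_2012:.1f}x")
print(f"Implied average annual growth, 1998-2012: {rate:.1%}")
```

The 1998 to 2012 ratio comes out just over 6, which matches the rough figure given above.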
OCLC has 22.5K member libraries (which I read as being libraries that do
their cataloging on OCLC), but over 74K "participating libraries."
Another annual report gives the actual figures for member records vs.
non-member records, and member records are in the minority. [citation
needed] The 2012 report gives good stats on the number of records batch
loaded, and it's quite impressive: hundreds of millions.
As we saw with the list of subject heading terms that Roy produced
(which I don't have a link to, sorry), many of the terms ("geschichte"
was a notable one) come from data that originated outside the
Anglo-American world.
That said, the OCLC WorldCat database is a good measure of the
bibliographic universe beyond AACR and MARC, although the
"MARC-ification" of the data may mask some of the qualities of the
original data.
I highly recommend looking at the annual reports for good data about
OCLC's growth and contents. There are stats on record numbers by
language, etc.
kc
[1] The annual reports are listed on this page:
https://www.oclc.org/en-CA/about/sustainability.html
On 3/8/13 9:20 PM, Hal Cain wrote:
> On Fri, 8 Mar 2013 16:12:48 -0500, Simon Spero wrote:
>
>
>> Field 245 shows some other curiosities:
>> http://experimental.worldcat.org/marcusage/245.html
>>
>> Subfield 245 $k has 469,891 occurrences, but only 427,311 holdings; this
>> suggests that there are records included in the counts which have zero
>> holdings. These might be worth filtering out.
> Something else that probably has little impact on the totals for the whole
> database, but which should be taken into account if investigation is
> segmented by publication date: my experience suggests that there is a
> sizeable number of duplicate records for pre-AACR cataloguing (that is, more
> or less, pre-1970 publications) -- and I would say the period till 1980 and
> the onset of AACR2 also includes more duplicates than later. I ascribe this
> to far less uniformity in cataloguing practice before AACR, and many such
> records having been converted retrospectively with little review, and loaded
> in bulk. In addition, many foreign records, and British Library files, and
> the like, totally non-AACR/AACR2 and without any subject access, have been
> loaded in recent years. The WorldCat database is quite a mixture; beware of
> relying too heavily on any simple tabulations.
>
> Hal Cain
> Melbourne, Australia
--
Karen Coyle
http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet