Hal is right. Over the last decade the WorldCat database has changed 
greatly in nature due to the batch loading of millions of records from 
non-member libraries.

The 2009 OCLC annual report [1] has a chart on page 12 that shows that 
in 1998 WorldCat had 39 million records; in 2009 it had 139 million -- 
most of which came from batch loading. The 2012 report has a figure of 
237 million records. That's about 6x growth in less than 15 years.
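
For anyone who wants to check the arithmetic, here is a minimal sketch in 
Python. The three record counts are the ones quoted above; the function 
names are my own, and the compound annual rate is only a rough summary, 
since batch loading did not happen at a steady pace.

```python
# Record counts in millions, as quoted above from the OCLC annual reports.
records_millions = {1998: 39, 2009: 139, 2012: 237}


def growth_factor(counts, start, end):
    """Return how many times larger the count is in `end` than in `start`."""
    return counts[end] / counts[start]


def annual_growth_rate(counts, start, end):
    """Return the compound annual growth rate between two years.

    This assumes smooth year-over-year growth, which is a simplification.
    """
    years = end - start
    return growth_factor(counts, start, end) ** (1 / years) - 1


if __name__ == "__main__":
    factor = growth_factor(records_millions, 1998, 2012)
    rate = annual_growth_rate(records_millions, 1998, 2012)
    print(f"1998-2012 growth factor: {factor:.2f}x")
    print(f"1998-2012 compound annual growth rate: {rate:.1%}")
```

The same two functions can be pointed at 1998-2009 or 2009-2012 to compare 
the two periods.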

OCLC has 22.5K member libraries (which I read as being libraries that do 
their cataloging on OCLC), but over 74K "participating libraries." 
Another annual report gives the actual figures for member vs. non-member 
records, and member records are in the minority. [citation needed] The 
2012 report gives good stats on the number of records batch loaded, and 
it's quite impressive: hundreds of millions.

As we saw with the list of subject heading terms that Roy produced 
(which I don't have a link to, sorry), many of the terms ("geschichte" 
was a notable one) come from data that originates outside of the 
Anglo-American world.

That said, the OCLC WorldCat database is a good measure of the 
bibliographic universe beyond AACR and MARC, although the 
"MARC-ification" of the data may mask some of the qualities of the 
original data.

I highly recommend looking at the annual reports for good data about 
OCLC's growth and contents. There are stats on record numbers by 
language, etc.

kc

[1] Annual reports are listed on this page: 
https://www.oclc.org/en-CA/about/sustainability.html

On 3/8/13 9:20 PM, Hal Cain wrote:
> On Fri, 8 Mar 2013 16:12:48 -0500, Simon Spero <[log in to unmask]> wrote:
>
>
>> Field 245 shows some other curiosities:
>> http://experimental.worldcat.org/marcusage/245.html
>>
>> Subfield 245 $k has 469,891 occurrences, but only 427,311 holdings;  this
>> suggests that there are records included in the counts which have zero
>> holdings.  These might be worth filtering out.
> Something else that probably has little impact on the totals for the whole
> database, but which should be taken into account if investigation is
> segmented by publication date: my experience suggests that there is a
> sizeable number of duplicate records for pre-AACR cataloguing (that is, more
> or less, pre-1970 publications) -- and I would say the period till 1980 and
> the onset of AACR2 also includes more duplicates than later.  I ascribe this
> to far less uniformity in cataloguing practice before AACR, and many such
> records having been converted retrospectively with little review, and loaded
> in bulk.  In addition, many foreign records, and British Library files, and
> the like, totally non-AACR/AACR2 and without any subject access, have been
> loaded in recent years.  The WorldCat database is quite a mixture; beware of
> relying too heavily on any simple tabulations.
>
> Hal Cain
> Melbourne, Australia
> [log in to unmask]

-- 
Karen Coyle
[log in to unmask] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet