USMARC subscribers might be interested in this message forwarded from the
AUTOCAT list.
---------- Forwarded message ----------
[log in to unmask] wrote:
>
> Catalogers ...
> Yesterday we ran across something we had never seen before in an OCLC
> record ... #32822869, Mimbres mogollon archaeology. It has an 856 with a
> URL for a table of contents on lcweb
> (http://lcweb.loc.gov/catdir/toc/95-4395.html).
> Is this something LC is doing on a routine basis? I couldn't see anything
> on the main catdir page that explained the project.
> I tried searching Autocat archives to see if this had been mentioned before.
> Could someone please enlighten me?
> Thanks much!
> Margo
> *************************
>
> Margo Warner Curl
> Technical Services Librarian
> The College of Wooster Libraries
> phone: 330/263-2154
> fax: 330/263-2253
------------------------------
Date: Tue, 9 Dec 1997 13:37:56 -0500
From: David Williamson <[log in to unmask]>
Subject: LC TOC explanation (long)
I wanted to respond to the discussion regarding tables of contents
showing up either in 505 fields or else by a link in the 856 field. LC
has several projects going on involving TOC, and I have written to AUTOCAT
about some of them in the past. This work is growing, both here at LC and
among other institutions and vendors such as OCLC, WLN, Yankee Book
Peddler, BNA, and others. I want to summarize our projects, offer a few
thoughts, and ask a few questions. If you respond to anything contained
here, please send me a copy at my personal address ([log in to unmask]) as I
usually only have time to skim AUTOCAT these days. Also, as usual, I am
not conveying official LC policy here, just my views based on the work I
am doing in this area.
The first TOC project we have is the one that initiated the message
by Margo Curl. This is the Electronic CIP Experiment. The experiment
began in November 1993 with the first electronic manuscript submitted for
cataloging. This was just an ASCII text of the paper book to be published
(as are all E-CIPs). Since that time, we have received over 2500
manuscripts for electronic processing. You can read about the mechanics
of E-CIP processing in CCQ (Vol. 22, No. 3/4, 1996, pp. 179-196) or
chapter 9 of _Planning and Implementing Technical Services Workstations_,
edited by Michael Kaplan, ALA Publications, c1997. Almost since the
beginning, we have been including the TOC information in the 505 field
when possible. There are 2 criteria:
1. Does the TOC provide useful information (good search terms, good
contents indication, access to individual articles, etc.)?
2. Can this data be manipulated quickly (5 minutes maximum) and easily
into something the E-CIP cataloging program can manipulate into a 505
field?
If the answer to both is yes, the data are added in the 505 field. If
not, the TOC is ignored. At the moment, we are processing about 1000
E-CIPs per year in the experimental mode. We are providing 505 access to
TOC data for about half of these items. The other half are not worth 505
fields (TOCs for novels) or are complex TOCs that would require more than
5 minutes to edit into a useable form. Also, LC is not using the enhanced
505 for monograph cataloging at the moment.
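As a rough illustration of the second criterion (not the actual E-CIP cataloging program, which is not shown in this message), the sketch below turns a list of chapter titles into a basic, non-enhanced 505 contents note. The function name make_505, the "0#" indicator display (# standing for a blank), and the " -- " separator are assumptions based on common MARC contents-note practice.

```python
def make_505(chapter_titles):
    """Build a basic (non-enhanced) 505 note as a display string.

    Hypothetical helper: returns None when no usable titles are present.
    """
    # Drop empty lines and trailing periods so separators stay uniform.
    titles = [t.strip().rstrip(".") for t in chapter_titles if t.strip()]
    if not titles:
        return None
    note = " -- ".join(titles)
    # Close the note with a period unless it already ends in ? or !
    if not note.endswith(("?", "!")):
        note += "."
    # First indicator 0 = complete contents; '#' shown for a blank indicator.
    return "505 0# $a " + note


print(make_505(["Introduction", " Markets. ", ""]))
```

A TOC that needs more hand editing than this kind of cleanup is the sort of case described as taking longer than 5 minutes.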
It became evident that we were losing some good TOC information with
complex TOCs. A method was developed to take this TOC information from
the E-CIP manuscript, wrap minimal HTML coding around it, save it to the
WWW server, and add the 856 field to the catalog record. This required
only a few clicks of the mouse and only a few seconds to accomplish. This
is the type of TOC information found by Margo Curl.
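The wrap-and-link step described above might look something like the following sketch. It is not LC's program: the function names, the <pre> layout, and the 856 indicator and subfield choices ($3 for materials specified, $u for the URL, first indicator 4 for HTTP) are assumptions drawn from general MARC usage. The URL in the usage line is the one quoted in Margo Curl's message.

```python
from html import escape


def wrap_toc_html(title, toc_text):
    """Wrap plain-text TOC data in minimal HTML (hypothetical layout)."""
    return (
        "<html>\n<head><title>Table of contents for "
        + escape(title)
        + "</title></head>\n<body>\n<pre>\n"
        + escape(toc_text.strip())
        + "\n</pre>\n</body>\n</html>\n"
    )


def make_856(url):
    """Build an 856 field as a display string; '#' stands for a blank indicator."""
    return "856 4# $3 Table of contents $u " + url


page = wrap_toc_html("Mimbres Mogollon archaeology", "1. Introduction\n2. Sites")
field = make_856("http://lcweb.loc.gov/catdir/toc/95-4395.html")
print(field)
```

Saving the page to the WWW server and writing the field into the catalog record are left out, since the message does not describe how those steps are done.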
A few points related to this TOC project:
1. Contents notes are not normally given for monographs. However, if the
data are available electronically and if these data can be manipulated by
a program into a 505 field or into a WWW file with a link, shouldn't they
be provided as an enhancement to the record?
2. Normally, 505 fields are not provided for contents with more than 10 or
so elements. Again, if the data are available and can be quickly
manipulated, we will provide it (chapter titles only) with E-CIPs even if
more than 10 elements are present.
3. Since there is no rekeying, it is faster and more accurate and thus
worth the effort.
4. When E-CIP goes into production (hopefully early 1998), the number of
records with 505s or 856s will definitely increase as this becomes our
primary source for TOC/record enhancement data for monographs.
Now let's get into the 505/856 debate. I think many of us agree that
TOC information can be very helpful to the user in determining if a
particular book is of interest. I have explained the parameters for 505
fields in the E-CIP record. What if the TOC is complex but has a lot of
very useful information? Should it just be ignored? Should we not
create the WWW file and link with the 856? This is a real question I
would like your opinion about (to my email address: [log in to unmask]).
Whenever I demonstrate the 505/856 TOC work we are doing, people always
say they would prefer the 505 (as do I) but within the parameters we have
to work with, the 856 link is better than nothing, isn't it?
That's how our 856 TOC link came about. If you want to try a quick
demo of how it can be used, go to http://lcweb.loc.gov/z3950, go to
"Search Library of Congress Catalog" and select "Simple Search (title or
personal name)". Enter 2 words in the box: GLOBALIZATION HASSAN (not in
caps) and click on the search button. You'll get a brief record for the
book with a link to the TOC. This is not the MUMS search system (which
is not Internet aware), but it is used by many people around the world,
as is the Experimental Search System. Both are Internet aware and will
automatically link to the TOC, so the user does not need to start other
software or go somewhere else to access the TOC information.
There are two other projects involving TOC data and 856 fields that I
want to mention quickly. The first is to buy TOC data from a vendor. In
my research into how we will deal with vendor supplied TOC data, I have
found that this TOC data tends to be unsuitable for 505 use (not the
enhanced 505, remember). There are 2 problems:
1. Some might be OK for a 505, some not. We need an automatic way to
determine this and so far I can't find a way without involving a
human--not good. With shrinking staff, fewer funds, etc., this must be an
automatic process.
2. Since we are not using the enhanced 505 (not for me to decide or debate
here), we would only record chapter titles in the 505. In the TOC data I
have seen, it is not possible to determine chapter titles from section
titles from part titles without human intervention. The MARC-like coding
I have seen used in vendor TOC files tends to use a hierarchy but does not
establish a base level for the hierarchy. In one TOC record, the chapter
titles are the first level in the hierarchy. In another, they are the
third level behind part titles and section titles. Again, this has to be
an automatic process.
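The hierarchy problem in point 2 can be illustrated with a small sketch. The three TOC records below are invented for illustration (they are not vendor data), and the (level, title) representation is an assumption. The point is only that neither a fixed level nor the deepest level reliably picks out chapter titles.

```python
# Hypothetical TOC records as (hierarchy level, title) pairs.
toc_a = [(1, "Chapter 1"), (1, "Chapter 2")]
toc_b = [(1, "Part I"), (2, "Section A"), (3, "Chapter 1"), (3, "Chapter 2")]
toc_c = [(1, "Chapter 1"), (2, "A subsection"), (1, "Chapter 2")]


def titles_at_level(toc, level):
    """Return the titles recorded at one hierarchy level."""
    return [title for lvl, title in toc if lvl == level]


def deepest_level(toc):
    """Return the largest level number used in a TOC record."""
    return max(lvl for lvl, _ in toc)


# Rule 1: "chapters are level 1" works for toc_a but returns a part title for toc_b.
print(titles_at_level(toc_a, 1), titles_at_level(toc_b, 1))

# Rule 2: "chapters are the deepest level" fixes toc_b but breaks on toc_c.
print(titles_at_level(toc_b, deepest_level(toc_b)))
print(titles_at_level(toc_c, deepest_level(toc_c)))
```

Without a stated base level in the data, some person still has to decide which rule applies to each record.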
The second project is just getting underway and shows promise. Under
the auspices of our Bibliographic Enrichment Advisory Team (BEAT) which
was started with a grant from the Edward Lowe Foundation, we are
experimenting with scanning TOCs from books already published in business
& economics. What we are doing is photocopying the TOC, adding the LC
card number to the top of the page, scanning it in a batch mode, and
running the TOCs through character recognition (which adds a hard page
break between each TOC to separate them). After that, the TOCs are run
through a program which "reads" the LC card number, opens up the
bibliographic record, and adds the 856. The program then takes the
first subject heading from the record, adds it to the metadata keyword
field, wraps HTML around the TOC data, and saves the TOC file to the WWW
server. The technician is still learning the process but can do 7 per
hour (photocopying, scanning, everything) and is getting better. We hope
to reach 10 per hour. So far about 100-150 are done, and if we can reach
that rate, we'll expand
out into other areas. Reports will be presented in issues of LCCN.
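The batch steps described above could be sketched roughly as follows. This is not the program in use at LC. It assumes that the hard page break is a form feed character, that the LC card number sits on the first line of each scanned TOC in a form like 95-4395, and that the first subject heading has already been fetched from the bibliographic record. The heading "Globalization" in the usage lines is a made-up example.

```python
import re
from html import escape

# Assumed card number shape: two-digit year, hyphen, serial number.
LCCN_RE = re.compile(r"\b\d{2}-\d{1,6}\b")


def split_batch(ocr_text):
    """Split OCR output into one text block per TOC (form feed assumed)."""
    return [page.strip() for page in ocr_text.split("\f") if page.strip()]


def read_lccn(page):
    """Read the LC card number from the first line of a scanned TOC."""
    match = LCCN_RE.search(page.splitlines()[0])
    return match.group(0) if match else None


def build_toc_page(page, first_subject_heading):
    """Wrap the TOC in HTML with the subject heading as a keywords meta tag."""
    toc_body = "\n".join(page.splitlines()[1:]).strip()
    return (
        '<html>\n<head>\n<meta name="keywords" content="'
        + escape(first_subject_heading, quote=True)
        + '">\n</head>\n<body>\n<pre>\n'
        + escape(toc_body)
        + "\n</pre>\n</body>\n</html>\n"
    )


batch = "95-4395\nContents\n1. Intro\n\f96-12345\nContents\n1. Trade\n"
for page in split_batch(batch):
    print(read_lccn(page), len(build_toc_page(page, "Globalization")))
```

Opening the bibliographic record, adding the 856, and saving the file to the WWW server are omitted because the message does not say how those steps are implemented.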
In this last project, why are we using subject headings in the
metadata tag? It is part of an experiment to see whether it is useful to
lead people into the catalog from Internet search engines. Do the same
"GLOBALIZATION HASSAN" search on Yahoo. Yahoo will pass it off to Alta
Vista. When you see the response from Alta Vista, look at the first
hit--the LC TOC WWW file. When you click on it, you go to the TOC
file on the WWW. If the TOC looks interesting, you can click on the two
buttons at the top of the screen, one for the Z3950 search you did earlier
and one that does the same search in the Experimental Search System. If
you go to the ESS, you see that many of the elements of the OPAC display
are hotlinked. Want to see more by this author? Click on his name. Want
to see more on this subject matter? Click on the subject heading. We
are hoping that, because the metadata uses the controlled vocabulary of
the subject headings, users who happen to enter those terms in a search
engine will find the TOC files, ideally higher up the search results
list. Whether that actually happens remains to be seen, but if so, we
can then lead the user into the catalog and to the bibliographic record in
ways we couldn't before. So now we can take people from the catalog out
and from outside the catalog in.
Will all of this be of any use? You tell me (another real question I
am asking). We are experimenting with new ways and methods not tried
before now (that we know of). With OPACs becoming Internet aware and all
of the electronic resources out there that need organizing, I am hoping
that these techniques will help us deal with them and get ourselves in
shape for whatever is the next step. I hope this explains a lot about
those TOC records and why we are doing them. Please write me if you have
any further questions.
David Williamson
|=====================================|
| |
| David Williamson |
| Cataloging Automation Specialist |
| Cataloging Directorate |
| Library of Congress |
| Washington, D.C. 20540-4300 |
| 202.707.5179 (voice) |
| 202.707.2824 (fax) |
| [log in to unmask] Team OS/2 |
| |
|=====================================|