On 12/22/05, Ray Denenberg, Library of Congress wrote:
> From: "Karen Coyle"
> > Since CQL doesn't correspond to particular fields (which I must say
> > strikes me as being abstract to the point of non-utility, but oh well),
>
> Perhaps I overstated the "abstractness".
>
> There is the Z39.50 bib-1 search point 'title' - it's supposed to correspond
> to whatever you consider a title to be for records in a particular
> database -- and if it happens to be a MARC database it is supposed to
> correspond to marc fields 130, 21X-24X, 440, 490, 730, 740, 830, 840,
> subfield $t; 400, 410, 411, 600, 610, 611, 700, 710, 711, 800, 810, 811.
>
> So 'title' is quite concretely defined, if MARC is your *reference* format.
> The *abstractness* comes into play because we want the search point to be
> useful whether it's MARC records or not. So the definition is extended to
> say (in effect) for "non-MARC records, 'title' corresponds to whatever you
> think a title is, and for reference, here is what it is for MARC."
>
> So we're trying to come up with search points corresponding to MODS, that
> is, where MODS is the reference format, analogous to MARC for bib-1.
>
>
> > is there any reason why you couldn't produce search points for things
> > like:
> >
> > keyword
> > name
> > subject
>
> We absolutely can define these search points. It takes some work. And it
> might be beyond the scope of MODS. Or it might not. Or part of it might and
> part not.

We (PINES/OpenILS/Evergreen) are using MODS as an intermediate format
for indexing our catalog. The records are stored in MARCXML, but we
transform them to MODS for indexing and display (or, more properly,
for extracting displayable data), because MODS /greatly/ simplifies
working with records when you're a programmer who is not also an
experienced cataloger. That said, our catalogers have vetted the
scheme. Although we don't yet have a Z39.50 or SRU/W server built, we
have done essentially what Karen describes for our internal search
methods: we define searchpoints in MODS for title, author, subject,
keyword and series.

Here is a simplified version of the XPath we use to extract the
indexable data:

title   -> mods:mods/mods:titleInfo
           (also separated by title type)
author  -> mods:mods/mods:name[mods:role/mods:text[text()="creator"]]/mods:namePart
           (separated by name type)
subject -> mods:mods/mods:subject/*
           (separated by node name: geographic, name, temporal, topic)
series  -> mods:mods/mods:relatedItem[@type="series"]/mods:titleInfo
keyword -> mods:mods/*[not(local-name()='originInfo')]
           (everything except originInfo and its subnodes)

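To make the mapping above concrete, here is a minimal Python sketch of
the same extraction. It is an illustration only, not the
PINES/Evergreen code: the function names, the "class.subtype" key
convention and the sample record are all made up for this example, and
it uses the limited XPath subset in Python's standard library rather
than a full XPath engine.

```python
import xml.etree.ElementTree as ET

NS = {"mods": "http://www.loc.gov/mods/v3"}

# Invented sample record, for illustration only.
SAMPLE = """<mods:mods xmlns:mods="http://www.loc.gov/mods/v3">
  <mods:titleInfo><mods:title>An Example Title</mods:title></mods:titleInfo>
  <mods:titleInfo type="translated">
    <mods:title>Ein Beispieltitel</mods:title>
  </mods:titleInfo>
  <mods:name type="personal">
    <mods:namePart>Author, Example</mods:namePart>
    <mods:role><mods:text>creator</mods:text></mods:role>
  </mods:name>
  <mods:name type="personal">
    <mods:namePart>Artist, Example</mods:namePart>
    <mods:role><mods:text>illustrator</mods:text></mods:role>
  </mods:name>
  <mods:subject><mods:topic>Whaling</mods:topic></mods:subject>
  <mods:relatedItem type="series">
    <mods:titleInfo><mods:title>Example Series</mods:title></mods:titleInfo>
  </mods:relatedItem>
  <mods:originInfo><mods:publisher>Example Press</mods:publisher></mods:originInfo>
</mods:mods>"""


def local_name(elem):
    """Element name without its '{namespace}' prefix."""
    return elem.tag.rsplit("}", 1)[-1]


def text_of(elem):
    """All text under elem, with runs of whitespace collapsed."""
    return " ".join(" ".join(elem.itertext()).split())


def extract(record):
    """Map search class (and hypothetical 'class.subtype') to text values."""
    index = {}

    def add(cls, value, subtype=None):
        if not value:
            return
        index.setdefault(cls, []).append(value)
        if subtype:
            index.setdefault(cls + "." + subtype, []).append(value)

    # title: every titleInfo, also filed under its type if it has one
    for info in record.findall("mods:titleInfo", NS):
        add("title", text_of(info), info.get("type"))

    # author: nameParts of names whose role text is "creator"
    for name in record.findall("mods:name", NS):
        roles = [(r.text or "").strip()
                 for r in name.findall("mods:role/mods:text", NS)]
        if "creator" in roles:
            for part in name.findall("mods:namePart", NS):
                add("author", text_of(part), name.get("type"))

    # subject: each child of subject, filed under its node name
    for child in record.findall("mods:subject/*", NS):
        add("subject", text_of(child), local_name(child))

    # series: titleInfo of relatedItem elements typed as series
    for info in record.findall(
            "mods:relatedItem[@type='series']/mods:titleInfo", NS):
        add("series", text_of(info))

    # keyword: everything except originInfo and its subnodes
    for child in record:
        if local_name(child) != "originInfo":
            add("keyword", text_of(child))

    return index


index = extract(ET.fromstring(SAMPLE))
```

In this sketch each value is filed twice when a subtype exists, once
under the class ("title") and once under the narrower key
("title.translated"), which is one simple way to get the "separated
by type" behaviour described above.
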
This allows us to search by title or even title.translated, and the
same is true of the other search "classes". Because MODS does,
generally speaking, the "right thing" with MARCXML, it's extremely easy
to filter.

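As a rough illustration of what searching by class versus class.subtype
could look like, here is a small Python sketch. The index layout
(record id mapped to qualifier mapped to values), the search() helper
and the sample data are assumptions made up for this example; they are
not how our internal search is actually implemented.

```python
# Hypothetical in-memory index: record id -> qualifier -> indexed values.
INDEX = {
    1: {
        "title": ["An Example Title", "Ein Beispieltitel"],
        "title.translated": ["Ein Beispieltitel"],
    },
    2: {
        "title": ["Beispiel Handbook"],
    },
}


def search(index, qualifier, term):
    """Return sorted ids of records with `term` in any value under `qualifier`.

    `qualifier` is either a whole search class ("title") or a narrower
    class.subtype key ("title.translated"). Matching is a simple
    case-insensitive substring test.
    """
    needle = term.casefold()
    return sorted(
        rec_id
        for rec_id, fields in index.items()
        if any(needle in value.casefold()
               for value in fields.get(qualifier, []))
    )


broad = search(INDEX, "title", "beispiel")
narrow = search(INDEX, "title.translated", "beispiel")
```

Here the broad "title" search matches both records, while the narrower
"title.translated" search matches only the record that carries a
translated title.
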
> Take 'name' for example. If you want to define a CQL search point 'name'
> you first want to decide what the scope is. Let's say we use MODS as a
> reference format in making that decision. Do you want to define a 'name'
> search point such that if you search on 'name' you'll be searching all of
> the following:
> name - personal
> name - corporate
> name - conference
> name - part
> name - affiliation
> name - role
>
> Or is it really a different set that you'd have in mind? (Rhetorical
> question.)
>
> And the same question for 'subject'.
>
> For 'keyword' I'd be hard-pressed to make any correspondence to MODS.
> That doesn't at all mean that CQL shouldn't define a 'keyword' search
> point, it just means that it would fit somewhere else in the search
> architecture.

I realize that because we're using MARC as the underlying format we're
at an advantage as far as defining searchpoints, if only because we
have the MARC searchpoints to abstract out to MODS based on the
transformation. However, I personally think those searchpoints are
fairly transparent for MODS on its own.

In fact, that's how I developed them. I learned MODS, developed those
searchpoints (or something very close), and then went to my catalogers
to vet the design based on the MARC->MODS transform and the
established MARC searchpoints.

--
Mike Rylander
GPLS -- PINES Development
Database Developer
http://open-ils.org