We would like to see developed a set of search points based on MODS.

In particular, this would be used when searching MODS records via SRU
(though not intended exclusively for this purpose), so in SRU parlance, this
would be a CQL context set (see "Additional Notes" below).

The SRU implementors will hold a meeting in March. I would like to propose
(prior to that meeting) a draft set of bibliographic search points based on
MODS, and I'm asking this group (the MODS list) to help develop it.

To begin, I've listed a set of search points. The list is a first cut and
needs to be pruned, as I've included most everything. So I would like to
ask those of you who might be willing to help on this: (1) which of these
search points are candidates for pruning, that is, they are less- (or not-)
likely to be searched; (2) which are most likely to be searched; (3) what
search points likely to be searched are missing. Please note that even
though I've probably included too many search points, I've also arbitrarily
excluded several, so I'm hoping that some of you will give this critical
review.

Thanks for your help.

Ray Denenberg

BEGIN LIST
------------------------------------------
title
title - abbreviated
title - translated
title - alternative
title - uniform
title - sub
title - part number
name - personal
name - corporate
name - conference
name - part
name - affiliation
name - role
resource type (enumerated)
genre (controlled)
place of origin (controlled)
publisher
date issued
date created
date captured
date valid
date modified
copyright date
edition
issuance (enumerated)
frequency
language (controlled)
physical form (controlled)
reformatting quality (enumerated)
internet media type
extent
digital origin (enumerated)
abstract
table of contents
target audience (controlled)
note
subject - topic
subject - geographic (controlled)
subject - temporal
subject - title
subject - name
subject - cartographic scale
subject - cartographic projection
subject - cartographic coordinates
subject - occupation
classification (controlled)
identifier-hdl
identifier-doi
identifier-isbn
identifier-isrc
identifier-ismn
identifier-issn
physical location
location URL
access condition
part - detail - number
part - detail - caption
part - detail - title
part - extent - start
part - extent - end
part - extent - total
part - date
-----------------------------------------
END LIST

Additional Notes. (Which you don't have to read but which may be helpful.)

The SRU protocol specifications are available at http://www.loc.gov/sru/;
for CQL see http://www.loc.gov/sru/cql/.

CQL calls search points "indexes", actually "abstract indexes" -- "abstract"
for two reasons: (1) There is no implementation implication -- support of a
search point does not necessarily require that you implement an index. (2)
The index does not necessarily correspond directly to an element in the
record.

In this case, each index *does* correspond to a MODS element, but the names
are different, deliberately so, in order to maintain this independence -- in
theory, these indexes could be used to search records other than MODS if
there is an appropriate mapping.  But in this case it should be clear for
each index what MODS element it corresponds to (if not, then the index is
poorly named).

Note also, these are *flat* indexes that do not directly reflect the XML
structure.
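
To illustrate what "flat" means here, the following is a minimal sketch
(not part of any specification) of how a server might populate a few of
these indexes from a MODS record. The index names on the left are
hypothetical, as are the sample record and the helper function; the paths
on the right assume the MODS version 3 namespace and element names.

```python
import xml.etree.ElementTree as ET

# MODS version 3 namespace (assumed for this sketch).
MODS_NS = {"m": "http://www.loc.gov/mods/v3"}

# Hypothetical flat index names mapped to paths inside a MODS record.
# The index names are illustrative only; no context set defines them yet.
INDEX_PATHS = {
    # Matches every titleInfo, whatever its type attribute.
    "title": "m:titleInfo/m:title",
    "titleAbbreviated": "m:titleInfo[@type='abbreviated']/m:title",
    "namePersonal": "m:name[@type='personal']/m:namePart",
    "publisher": "m:originInfo/m:publisher",
    "dateIssued": "m:originInfo/m:dateIssued",
    "identifierIsbn": "m:identifier[@type='isbn']",
}

# Made-up sample record, for illustration only.
SAMPLE = """<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>Sound and Fury</title></titleInfo>
  <titleInfo type="abbreviated"><title>Snd. fury</title></titleInfo>
  <name type="personal"><namePart>Doe, Jane</namePart></name>
  <originInfo>
    <publisher>Example Press</publisher>
    <dateIssued>1999</dateIssued>
  </originInfo>
  <identifier type="isbn">0000000000</identifier>
</mods>"""


def extract_index_terms(mods_xml):
    """Return {index name: [values]} for one MODS record given as a string."""
    root = ET.fromstring(mods_xml)
    terms = {}
    for index, path in INDEX_PATHS.items():
        values = [
            el.text.strip()
            for el in root.findall(path, MODS_NS)
            if el.text and el.text.strip()
        ]
        if values:
            terms[index] = values
    return terms


if __name__ == "__main__":
    for index, values in extract_index_terms(SAMPLE).items():
        print(index, values)
```

The nesting of the record (titleInfo, originInfo and so on) disappears on
the index side: a client searches "publisher" without needing to know where
the publisher sits in the XML.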

CQL indexes are analogous to Z39.50 Use attributes; see
http://www.loc.gov/z3950/agency/defns/bib1.html#use

The motivation for this effort is twofold:

(1) There are current projects where MODS records are being harvested (e.g.
via OAI) and the harvested records are then searched. For example see the
DLF Aquifer project (http://www.diglib.org/aquifer/).

(2) SRU/CQL has not really developed a coherent set of bibliographic search
points.

Some observations about point (2).
- CQL has defined a "dublin core" index set.  It is useful for some
applications but of little or no value for bibliographic searching.
- The Z39.50 bib-1 set is not a good place to start.  It began to grow
out of control a number of years ago because of some deficiencies in the
protocol and more particularly some of the implementations. We've addressed
these deficiencies in SRU/CQL.
- There is a Bath profile for CQL which specifies a number of bibliographic
indexes, but it doesn't seem rich enough.
- There is also a MARC index set in development. We think that a
bibliographic and a MARC index set will complement each other: the first is
abstract and the second is concrete.

Finally, the concept of an "index set" -- a set of related abstract indexes
(or search points) -- actually has been generalized into the concept of a
"context set" in CQL, which is described at
http://www.loc.gov/standards/sru/cql/index.html#context. A context set in
CQL is loosely analogous to a Z39.50 attribute set, and an index set would
be analogous to the set of Use attributes for an attribute set.
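
As a rough illustration of how a context set shows up in practice, the
sketch below builds a CQL clause that qualifies an index with a context set
prefix and wraps it in an SRU searchRetrieve URL. The "mods" prefix, the
index name "title", the base URL and both helper functions are assumptions
made for the example; only the request parameter names (version, operation,
query, maximumRecords) come from SRU 1.1.

```python
from urllib.parse import urlencode


def cql_clause(index, term, context_set="mods", relation="="):
    """Build one CQL search clause of the form: prefix.index relation "term".

    The default prefix "mods" is hypothetical; a real prefix would be
    assigned when the context set is defined.
    """
    # CQL uses a backslash to escape a double quote inside a quoted term.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'{context_set}.{index} {relation} "{escaped}"'


def sru_search_url(base_url, cql_query, max_records=10):
    """Wrap a CQL query in an SRU 1.1 searchRetrieve request URL."""
    params = {
        "version": "1.1",
        "operation": "searchRetrieve",
        "query": cql_query,
        "maximumRecords": max_records,
    }
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    # Two clauses joined with the CQL boolean "and".
    query = (
        cql_clause("title", "moby dick")
        + " and "
        + cql_clause("publisher", "harper")
    )
    # sru.example.org is a placeholder host, not a real server.
    print(sru_search_url("http://sru.example.org/mods", query))
```

The prefix is what ties an index name to its context set, so "mods.title"
and a "title" index from some other set can coexist in one query without
ambiguity.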