> Date: Fri, 20 Sep 2002 00:43:28 +0100
> From: Robert Sanderson <[log in to unmask]>
>
> If you want to search for dublin core, then either do a proximity
> search or an unanchored word based string search.
Proximity works; trying to hack it with strings won't, as we have
already more than demonstrated in the last couple of weeks' to-ing and
fro-ing. On top of all the cases we've already discussed, how about
if you string-search for "dublin core" but my record had
"dublin  core" (with two spaces)? Or a tab or newline? Clearly finding
the record would be The Right Thing, but then you're forever kludging
extra special cases onto what you mean by "string matching".
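
To make the whitespace point concrete, here is a minimal Python sketch.
The function names and the deliberately naive tokenizer are my own
assumptions for illustration, not anything defined by Z39.50 or BIB-1:

```python
import re

def string_match(term, field):
    """Unanchored substring match, roughly what a string index offers."""
    return term in field

def tokenize(text):
    """Naive word splitter: lowercase, split on runs of non-alphanumerics."""
    return [w for w in re.split(r"[^0-9a-z]+", text.lower()) if w]

def word_list_match(term, field):
    """True if the words of term occur consecutively, in order, in field."""
    want, have = tokenize(term), tokenize(field)
    if not want:
        return False
    return any(have[i:i + len(want)] == want
               for i in range(len(have) - len(want) + 1))

# Four records that differ only in the whitespace between the two words.
records = [
    "the dublin core set",    # single space
    "the dublin  core set",   # two spaces
    "the dublin\tcore set",   # tab
    "the dublin\ncore set",   # newline
]
string_hits = [string_match("dublin core", r) for r in records]
word_hits = [word_list_match("dublin core", r) for r in records]
```

The substring test finds only the single-space record, while the
word-list test finds all four without any special-casing.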
Ralph is right on this one -- word indexing is a fundamentally
different kettle of fish. I know, because of my doomed attempts to
implement word-searching in zSQLgate based on LIKE matching :-)

So phrase searching _must_ be done in word indexes, not string
indexes. You are quite right that the "dublin core" search can be
couched as a proximity search, but I think we all agree that this is
rather heavyweight. So why not just Say What We Mean and go with a
structure attribute that says "this search term is a list of words"?
Which is what adjacencyWordList is, if I understand it right. Good.
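
For what it's worth, here is roughly how I picture a "list of words"
term being answered from a word index. This is only a sketch under my
own assumptions: a toy positional index held in memory, with
hypothetical names (build_word_index, adjacent_word_search) that are
not part of any attribute set or server:

```python
import re
from collections import defaultdict

def build_word_index(records):
    """Map each word to {record_id: [word positions]} (toy positional index)."""
    index = defaultdict(lambda: defaultdict(list))
    for rec_id, text in records.items():
        words = [w for w in re.split(r"[^0-9a-z]+", text.lower()) if w]
        for pos, word in enumerate(words):
            index[word][rec_id].append(pos)
    return index

def adjacent_word_search(index, words):
    """Return ids of records where the words occur consecutively, in order."""
    if not words:
        return set()
    words = [w.lower() for w in words]
    hits = set()
    # Only records containing the first word can possibly match.
    for rec_id, starts in index.get(words[0], {}).items():
        for start in starts:
            # Word k of the term must sit at position start + k.
            if all(start + k in index.get(w, {}).get(rec_id, [])
                   for k, w in enumerate(words)):
                hits.add(rec_id)
                break
    return hits

records = {
    1: "Dublin  Core metadata",     # two spaces
    2: "core of Dublin",            # both words, wrong order
    3: "Dublin\nCore element set",  # newline between the words
}
index = build_word_index(records)
matches = adjacent_word_search(index, ["dublin", "core"])
```

Here matches comes out as records 1 and 3: whitespace never enters
into it, because the index only knows about words and their positions.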
As for adding adjacencyWordList to the BIB-1 attribute set: I have no
objection to this, since it's a useful thing. But if the BIB-1 old
timers don't like it, I agree with Ray that we should just shrug and
use the AA utility-set attribute. And since we pretty much know that
they won't like it, let's skip the stage of proposing it, and just go
with the AA as God (or at least Clifford Lynch) intended.

And now on to the main event --
> My second impression is that if it takes this much effort for people
> -extremely- familiar with Z39.50 to write a specification for what
> Dublin Core means, which is about as simple as it gets, then the bar
> is intolerably high for anyone not in the Z39.50 community already
> to take up SRW.
Yes. No-one will read this document.
> I suggest -not- requiring a semantic definition in terms of Z39.50
> attributes for SRW indexes, as this step will be almost universally
> ignored as too hard and for no appreciable gain.
Indeed -- people who don't already know what "titleWord" means are not
going to be able to find out by reading this document. So I humbly
offer the following Dublin Core/SRW qualifier-set semantics document:
Qualifier  | Semantics
-----------+--------------
title      | What it says.
titleWord  | What it says.
author     | What it says.
authorWord | What it says.
subject    | What it says.
...        | etc.
_/|_ _______________________________________________________________
/o ) \/ Mike Taylor <[log in to unmask]> www.miketaylor.org.uk
)_v__/\ Never look a gift-chicken in the beak.
|