These are some of my thoughts on the issues raised
during recent discussion of cql.
There's some misunderstanding of why we're developing
the Bath-index definitions. There are two reasons:
1. Bath defines searches in terms of explicit
attribute combination; no other profile seems to
do this, not as directly anyway. As such, it's a
good fit for srw, because the primary premise
behind cql index definitions is that they imply a
specific attribute combination. Bath of course
serializes and transmits individual attributes (it
has no choice; it's Z39.50) while srw sends the
symbolic name instead.
2. One of the original premises of srw was that
queries would be based on or map to bath searches.
There's the question of why these definitions
don't call out 6 attributes, as bath does. The
current draft includes 4 attributes, because the
other two will be explicitly part of the syntax --
truncation and relation (at least, that's the
current thinking). If we wanted to make
completeness, position, and structure explicitly
part of the cql syntax as well, then the index
definition wouldn't need four attributes (it would
need only one, the use attribute), but we don't want
to do that. I think the consensus is that these
are index-related attributes.
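To make the idea concrete, here is a minimal sketch in Python of an
index definition as a symbolic name that implies four attributes.
The table name, the function name, and the attribute values are my
assumptions for illustration; they are not taken from the draft list
or from the bath profile text. Only the Bib-1 attribute type numbers
(1 use, 3 position, 4 structure, 6 completeness) are standard.

```python
# Bib-1 attribute type numbers for the four index-related attributes.
USE, POSITION, STRUCTURE, COMPLETENESS = 1, 3, 4, 6

# Hypothetical definition table: symbolic index name -> implied attributes.
# The values below are illustrative assumptions, not taken from the draft.
INDEX_DEFINITIONS = {
    "bath.titleFirstPart": {
        USE: 4,           # title
        POSITION: 1,      # first in field
        STRUCTURE: 1,     # phrase
        COMPLETENESS: 1,  # incomplete subfield
    },
}


def attributes_for(index_name):
    """Return a copy of the attribute combination an index name implies."""
    if index_name not in INDEX_DEFINITIONS:
        raise KeyError("no definition for index: " + index_name)
    return dict(INDEX_DEFINITIONS[index_name])


if __name__ == "__main__":
    # Print each attribute type and value implied by the symbolic name.
    for attr_type, value in sorted(attributes_for("bath.titleFirstPart").items()):
        print(attr_type, value)
```

The srw client would send only the name; a gateway holding a table
like this could expand it before talking to a Z39.50 server.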
We don't want to include a relation attribute as
part of an abstract index definition. In the
query (bath.titleFirstPart = "tasmanian tiger"),
the equal character (=) is explicit in the cql.
If we want to put ">" instead, then I suppose
Alan's concern is that this doesn't represent a
real bath search, so calling it a bath search is
misleading. But the index definition alone isn't
the "bath search"; the cql string is. We need to
illustrate a cql-string class that maps to a
particular bath search. The cql string would
include the "=", and if a cql string is sent that
includes ">", then it's not a bath search (even
though it used a bath abstract index). Similar
argument for truncation.
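A rough sketch of that argument in Python: the relation comes from
the cql string, gets added to the attributes the index implies, and
only the "=" form counts as the bath search. The function names are
hypothetical, and the check assumes (as in the example above) that
the bath search in question uses the equal relation. The relation
values are the Bib-1 ones; truncation would be handled the same way
and is left out to keep this short.

```python
RELATION = 2  # Bib-1 attribute type number for relation

# Bib-1 relation attribute values for the comparison symbols.
RELATION_VALUES = {"<": 1, "<=": 2, "=": 3, ">=": 4, ">": 5}


def full_attributes(index_attributes, relation_symbol):
    """Add the relation from the cql string to the index's own attributes."""
    attrs = dict(index_attributes)
    attrs[RELATION] = RELATION_VALUES[relation_symbol]
    return attrs


def is_bath_search(index_name, relation_symbol):
    """Assumption: a cql clause is a bath search only if it uses a bath
    index AND the equal relation; the index name alone is not enough."""
    return index_name.startswith("bath.") and relation_symbol == "="


if __name__ == "__main__":
    index_attrs = {1: 4, 3: 1, 4: 1, 6: 1}  # illustrative values only
    print(full_attributes(index_attrs, ">"))
    print(is_bath_search("bath.titleFirstPart", "="))
    print(is_bath_search("bath.titleFirstPart", ">"))
```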
So I think that the four attributes are the right
set, but I would like to hear whether others agree
on this point.
As to Alan's point of whether bib-1 needs to be
stated explicitly as part of the bath index
definitions -- or more generally, the attribute
set for a given attribute in an index definition
-- of course it does (that was just laziness on my
part). There is no premise whatever that all the
attributes that comprise a cql index definition be
bib-1, and they don't have to all be from the same
attribute set. It just happens that we haven't
defined any indexes yet that include attributes
from other sets.
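One way to carry the attribute set along with each attribute,
sketched in Python. The tuple layout, the names, and the second
attribute set ("other-set") are all hypothetical; nothing like the
mixed index below has actually been defined.

```python
from collections import namedtuple

# Each attribute names its own attribute set, so sets can be mixed.
Attribute = namedtuple("Attribute", ["attribute_set", "attribute_type", "value"])

# Hypothetical index definition mixing two attribute sets.
MIXED_INDEX = [
    Attribute("bib-1", 1, 4),      # use attribute from bib-1
    Attribute("other-set", 3, 1),  # made-up set name, for illustration
]


def attribute_sets_used(definition):
    """Return the sorted names of the attribute sets a definition draws on."""
    return sorted({attr.attribute_set for attr in definition})


if __name__ == "__main__":
    print(attribute_sets_used(MIXED_INDEX))
```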
Now about the cql syntax. I know we want to keep
it as simple and informal as possible but from the
recent discussion, I've come to the opinion that
cql needs to provide (not mandate) explicit
exposure of operands and terms. I'm not trying to
turn this into an rpn query, but something simple
like (optional) parentheses around operands and
(optional) quotes around terms would directly
address some of the problems cited. Something
like:
(bath.titleFirstPart = "tasmanian tiger") and
(bath.titlePhrase = "hobart Zoo")
[Note bath.titlePhrase isn't currently included in
the list I drafted but probably needs to be
added.]
And if the quote character needs to be represented
in a term, use an extra quote to escape it (that is,
double it).
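A small Python sketch of the quoting idea, assuming "an extra quote"
means doubling the quote character inside the term. The function
names are mine, and this is not an agreed cql rule, just one way it
could work.

```python
def quote_term(term):
    """Wrap a term in quotes, doubling any quote character inside it."""
    return '"' + term.replace('"', '""') + '"'


def unquote_term(quoted):
    """Reverse quote_term; reject text that is not a properly quoted term."""
    if len(quoted) < 2 or quoted[0] != '"' or quoted[-1] != '"':
        raise ValueError("term is not enclosed in quotes")
    inner = quoted[1:-1]
    # After removing doubled quotes, any quote left over was unescaped.
    if '"' in inner.replace('""', ''):
        raise ValueError("unescaped quote inside term")
    return inner.replace('""', '"')


def clause(index_name, relation_symbol, term):
    """Build a parenthesized operand with a quoted term."""
    return "(" + index_name + " " + relation_symbol + " " + quote_term(term) + ")"


if __name__ == "__main__":
    print(clause("bath.titleFirstPart", "=", "tasmanian tiger"))
    print(quote_term('the "tiger" book'))
```

With the parentheses and quotes in place, the operand and term
boundaries are explicit, so a server doesn't have to guess where a
multi-word term ends.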
I don't have access to the ccl or 8777 standards, so
someone please translate. If these standards don't
provide this capability then let's make it up. The
parentheses and quotes (or whatever we use)
wouldn't be mandatory but if omitted then the
client would have to live with the resulting
ambiguity. If there were no potential ambiguity
then there would be no need to include them, or we
could have precedence rules.
I don't have any insight to offer on the
punctuation problem.
--Ray