> Date: Fri, 3 Dec 2004 14:31:42 +0100
> From: Hedzer Westra <[log in to unmask]>
>
> Mike Taylor answered some of my questions about the CQL/SRW
> implementation details I sent in my previous message. Here is my reply,
> along with some more questions.
>
> >> - how are words separated? The description hints at splitting on
> >> (white)space only.
> I looked it up in the HTML documentation:
> CQL tutorial, Section 2:
> [space] (separates words of a CQL expression)
Yes; but this refers to the words that make up the entire query, not
those embedded within a term. So this is talking about breaking up
the query
        dc.author any "kernighan ritchie"
into the three tokens
        index-name: dc.author
        relation: any
        term: "kernighan ritchie"
and not at all about how that term "kernighan ritchie" is to be
interpreted.
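
To make the distinction concrete, here is a minimal sketch of that
query-level split. It is an illustration only, not a CQL parser: the
function name split_search_clause is hypothetical, and it assumes a
single clause whose index name, relation and term are separated by
spaces. The quoted term is kept whole and is not interpreted further.

```python
import re

# A token is either a double-quoted string (backslash escapes allowed)
# or a run of non-whitespace characters.
TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|\S+')


def split_search_clause(clause):
    """Split one simple search clause into index name, relation and term.

    Hypothetical helper: handles only the three-token form shown above.
    """
    tokens = TOKEN.findall(clause)
    if len(tokens) != 3:
        raise ValueError("expected index-name, relation and term")
    index_name, relation, term = tokens
    return {"index-name": index_name, "relation": relation, "term": term}


if __name__ == "__main__":
    print(split_search_clause('dc.author any "kernighan ritchie"'))
```

How the words inside the returned term are then separated is left
entirely to the server.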
> BTW: section 4 of the CQL tutorial doesn't mention cql.anywhere
No; the tutorial is a little out of date, as it still describes CQL
version 1.0. What we have now is version 1.1.
> [...] people writing client code will assume [parsing of terms is]
> part of the default CQL semantics and not (as I understand now)
> implementation (i.e. profile) dependent. Perhaps that might be
> mentioned in the tutorial and CQL language description?
Yes, I think it should be.
> b. operator cql.exact -> default modifier is cql.string.
Well. We've not been talking about it in these terms. To say "default
modifier" is misleading, as there may legitimately be zero, one or more
modifiers on a relation. But, yes, the term _structure_ implied by
the cql.exact relation is indeed "string".
> Question: does this refer to
> 1. exact searching w.r.t. splitting of words (which would imply that
> cql.word and cql.string are mutually exclusive)
Yes, they are. String vs. Words is a fundamental dichotomy that we've
thrashed out neatly on this list and which should be described in both
the official documentation and the tutorial.
> 2. exact searching w.r.t. pattern matching (which would imply that
> cql.masked and cql.string are mutually exclusive),
No, a masked string is just fine. (Why would we prohibit such a
useful thing?)
        dc.title exact "the adventures of *"
will find
        The Adventures of Hulk
        The Adventures of Baron Munchausen
        The Adventures of the Famous Five
but _not_
        The Amazing Adventures of Captain Gladys Stoatpamphlet and her
        Intrepid Spaniel Stig.
because the extra word "amazing" breaks the "exact" condition.
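
A rough sketch of how a server might evaluate such a masked exact
match follows. The function name masked_exact_match is hypothetical,
and the sketch rests on assumptions not stated above: "*" stands for
any run of characters, "?" for exactly one character, comparison
ignores case, and backslash escaping of the masking characters is
not handled.

```python
import re


def masked_exact_match(pattern, value):
    """Return True if the whole of value matches the masked pattern.

    Assumed masking rules: '*' matches zero or more characters and
    '?' matches exactly one. Everything else must match literally.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    regex = "".join(parts)
    # fullmatch anchors both ends, which is what makes the match "exact".
    flags = re.IGNORECASE | re.DOTALL
    return re.fullmatch(regex, value, flags) is not None


if __name__ == "__main__":
    pattern = "the adventures of *"
    print(masked_exact_match(pattern, "The Adventures of Hulk"))
    print(masked_exact_match(pattern, "The Amazing Adventures of Stig"))
```

Anchoring at both ends is what rejects the "Amazing" title: the mask
only relaxes the tail of the string, not the start.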
> c. operator = with a single term and all other operators -> default
> modifier is cql.masked
The masked-vs.-unmasked dichotomy is orthogonal to string-vs.-words.
> But then you'd also need to be able to specify cql.unmasked or
> something to disable pattern matching.
Yes; there should be a cql.unmasked relation modifier.
> e. only one of word, string, isoDate, number and uri can be set at the
> same time for one searchClause
Correct, because these particular modifiers all represent alternative
points along the same axis.
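
One way to picture the two axes is as a validity check on a relation's
modifiers. This is only a sketch: check_modifiers is a hypothetical
name, and cql.unmasked is included on the strength of the suggestion
above rather than any published definition.

```python
# Term-structure axis: at most one of these may apply to a search clause.
STRUCTURE_AXIS = {"cql.word", "cql.string", "cql.isoDate",
                  "cql.number", "cql.uri"}

# Masking axis: independent of structure (cql.unmasked is assumed here).
MASKING_AXIS = {"cql.masked", "cql.unmasked"}


def check_modifiers(modifiers):
    """Raise ValueError if two modifiers from the same axis are combined."""
    for axis in (STRUCTURE_AXIS, MASKING_AXIS):
        chosen = sorted(m for m in modifiers if m in axis)
        if len(chosen) > 1:
            raise ValueError("mutually exclusive modifiers: "
                             + ", ".join(chosen))


if __name__ == "__main__":
    check_modifiers(["cql.string", "cql.masked"])    # different axes: fine
    try:
        check_modifiers(["cql.word", "cql.string"])  # same axis: rejected
    except ValueError as err:
        print(err)
```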
> > So, no, you are not obliged to implement proximity.
> But:
> CQL tutorial, Section 2:
> In general, multi-word terms are interpreted as requesting records
> in which a single field contains all the specified words, in the
> specified order, with no other words in between.
Yes. "In general".
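
For reference, the general-case behaviour the tutorial describes could
be sketched as below. The name contains_phrase is hypothetical, and
the sketch assumes words are separated on whitespace and compared
without regard to case, which is exactly the kind of detail a profile
may decide differently.

```python
def contains_phrase(field_value, term):
    """Return True if every word of term occurs in field_value,
    in the same order, with no other words in between.

    Assumes whitespace word separation and case-insensitive comparison.
    """
    field_words = field_value.lower().split()
    term_words = term.lower().split()
    n = len(term_words)
    if n == 0:
        return False
    # Slide a window of n words across the field and compare.
    return any(field_words[i:i + n] == term_words
               for i in range(len(field_words) - n + 1))


if __name__ == "__main__":
    print(contains_phrase("The C Programming Language",
                          "programming language"))
    print(contains_phrase("Programming in the C Language",
                          "programming language"))
```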
>> I see that Marc has already answered your questions about open
>> source clients.
>
> Did he?
Yes. He recommended the fine YAZ command-line client ("yaz-client")
for SRW, and the web-browser of your choice, or wget, for SRU.
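
For SRU the request is just a URL, so it is easy to build one by hand
or with a few lines of code. The sketch below is illustrative only:
the base URL http://example.org/sru is a placeholder, build_sru_url is
a hypothetical helper, and the parameter set shown (operation, version,
query, maximumRecords) assumes an SRU 1.1 searchRetrieve request.

```python
from urllib.parse import urlencode


def build_sru_url(base_url, query, maximum_records=10):
    """Build an SRU searchRetrieve URL for a CQL query.

    base_url is whatever address the server is published at; the one
    used below is a placeholder, not a real service.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": query,
        "maximumRecords": str(maximum_records),
    }
    # urlencode takes care of escaping spaces and quotes in the query.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_sru_url("http://example.org/sru",
                        'dc.author any "kernighan ritchie"'))
```

The resulting URL can be pasted into a browser or passed to wget.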
_/|_ _______________________________________________________________
/o ) \/ Mike Taylor <[log in to unmask]> http://www.miketaylor.org.uk
)_v__/\ "Saying GPL is less free because it forbids proprietary
derivatives is like saying the United States is less free
because it forbids slavery" -- Ogerman at Slashdot.
--
Listen to free demos of soundtrack music for film, TV and radio
http://www.pipedreaming.org.uk/soundtrack/