> Date: Fri, 3 Dec 2004 14:31:42 +0100
> From: Hedzer Westra <[log in to unmask]>
> Mike Taylor answered some of my questions about the CQL/SRW
> implementation details I sent in my previous message, here is my reply,
> along with some more questions.
> >> - how are words separated? The description hints at splitting on
> (white)space only.
> I looked it up in the HTML documentation:
> CQL tutorial, Section 2:
> [space] (separates words of a CQL expression)
Yes; but this refers to the words that make up the entire query, not
those embedded within a term. So this is talking about breaking up
    dc.author any "kernighan ritchie"
into the three tokens
    index: dc.author
    relation: any
    term: "kernighan ritchie"
and not at all about how that term "kernighan ritchie" is to be
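
The clause-level split described above can be sketched in a few lines of Python. This is only an illustration, not how any particular CQL parser does it: the function name split_search_clause is hypothetical, and it handles only a single "index relation term" clause, with no booleans, parentheses or relation modifiers.

```python
import shlex


def split_search_clause(clause):
    """Split a simple CQL search clause into (index, relation, term).

    Assumption: the clause is exactly 'index relation term'; booleans,
    parentheses and relation modifiers are out of scope for this sketch.
    """
    # shlex honours double quotes, so a quoted multi-word term stays one token.
    tokens = shlex.split(clause)
    if len(tokens) != 3:
        raise ValueError(f"expected index, relation and term, got {tokens!r}")
    index, relation, term = tokens
    return index, relation, term


print(split_search_clause('dc.author any "kernighan ritchie"'))
```

Note that the term comes back as the single string "kernighan ritchie". Nothing at this level decides how the words inside it are treated.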
> BTW: section 4 of the CQL tutorial doesn't mention cql.anywhere
No; the tutorial is a little out of date, as it still describes CQL
version 1.0. What we have now is version 1.1.
> [...] people writing client code will assume [parsing of terms is]
> part of the default CQL semantics and not (as I understand now)
> implementation (i.e. profile) dependent. Perhaps that might be
> mentioned in the tutorial and CQL language description?
Yes, I think it should be.
> b. operator cql.exact -> default modifier is cql.string.
Well. We've not been talking about it in these terms. To say "default
modifier" is misleading, as there may legitimately be zero, one or more
modifiers on a relation. But, yes, the term _structure_ implied by
the cql.exact relation is indeed "string".
> Question: does this refer to
> 1. exact searching w.r.t. splitting of words (which would imply that
> cql.word and cql.string are mutually exclusive)
Yes, they are. String vs. Words is a fundamental dichotomy that we've
thrashed out neatly on this list and which should be described in both
the official documentation and the tutorial.
> 2. exact searching w.r.t. pattern matching (which would imply that
> cql.masked and cql.string are mutually exclusive),
No, a masked string is just fine. (Why would we prohibit such a
thing?) For example, the query
    dc.title exact "the adventures of *"
matches
    The Adventures of Hulk
    The Adventures of Baron Munchausen
    The Adventures of the Famous Five
but not
    The Amazing Adventures of Captain Gladys Stoatpamphlet and her
    Intrepid Spaniel Stig.
because the extra word "amazing" breaks the "exact" condition.
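
One way to picture a masked "exact" match is as a whole-string pattern match. The sketch below makes several assumptions that a server profile could decide differently: "*" stands for any run of characters, "?" for exactly one, and comparison is case-insensitive. The function names are hypothetical.

```python
import re


def masked_term_to_regex(term):
    """Translate a masked term into a case-insensitive regular expression.

    Assumptions: '*' masks any run of characters (including none) and
    '?' masks exactly one character; everything else is literal.
    """
    parts = []
    for ch in term:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def exact_match(term, field_value):
    # fullmatch requires the pattern to cover the whole field value,
    # so no extra words may appear before or after the literal part.
    return masked_term_to_regex(term).fullmatch(field_value) is not None


for title in ("The Adventures of Hulk",
              "The Amazing Adventures of Captain Gladys Stoatpamphlet"):
    print(title, "->", exact_match("the adventures of *", title))
```

The first title is accepted and the second rejected, because "Amazing" sits where the literal text "adventures" is required.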
> c. operator = with a single term and all other operators -> default
> modifier is cql.masked
The masked-vs.-unmasked dichotomy is orthogonal to string-vs.-words.
> But then you'd also need to be able to specify cql.unmasked or
> something to disable pattern matching.
Yes; there should be a cql.unmasked relation modifier.
> e. only one of word, string, isoDate, number and uri can be set at the
> same time for one searchClause
Correct, because these particular modifiers all represent alternative
points along the same axis.
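
A server could enforce this "one point per axis" rule with a check along the following lines. This is a minimal sketch: the function name and the error handling are hypothetical, and modifier names are assumed to be written with the cql. prefix and compared case-insensitively.

```python
# Modifiers that all describe the term structure, i.e. lie on the same axis.
STRUCTURE_MODIFIERS = {"cql.word", "cql.string", "cql.isodate",
                       "cql.number", "cql.uri"}


def structure_modifier(modifiers):
    """Return the single term-structure modifier on a relation, or None.

    Raises ValueError if more than one is present. Modifiers on other
    axes (for example cql.masked) are ignored by this check.
    """
    chosen = [m for m in modifiers if m.lower() in STRUCTURE_MODIFIERS]
    if len(chosen) > 1:
        raise ValueError("mutually exclusive modifiers: " + ", ".join(chosen))
    return chosen[0] if chosen else None


print(structure_modifier(["cql.string", "cql.masked"]))
```

Here cql.masked passes through untouched, reflecting that masking is orthogonal to the string-vs.-words axis.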
> > So, no, you are not obliged to implement proximity.
> CQL tutorial, Section 2:
> In general, multi-word terms are interpreted as requesting records
> in which a single field contains all the specified words, in the
> specified order, with no other words in between.
Yes. "In general".
>> I see that Marc has already answered your questions about open
>> source clients.
> Did he?
Yes. He recommended the fine YAZ command-line client ("yaz-client")
for SRW, and the web-browser of your choice, or wget, for SRU.
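
Since SRU is just an HTTP GET, the URL to paste into a browser or hand to wget can be built with a few lines of Python. This is a sketch under assumptions: the base URL below is a made-up placeholder, the helper name is hypothetical, and the protocol version is assumed to be 1.1.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sru_search_url(base_url, cql_query, maximum_records=10):
    """Build an SRU searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",            # assumed protocol version
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    # urlencode percent-encodes quotes, spaces and '*' in the query.
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, for illustration only.
url = sru_search_url("http://sru.example.org/catalogue",
                     'dc.title exact "the adventures of *"')
print(url)

# Decoding the query string gives back the original CQL unchanged.
print(parse_qs(urlsplit(url).query)["query"][0])
```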
/o ) \/ Mike Taylor <[log in to unmask]> http://www.miketaylor.org.uk
)_v__/\ "Saying GPL is less free because it forbids proprietary
derivatives is like saying the United States is less free
because it forbids slavery" -- Ogerman at Slashdot.