> Royal Library of Sweden, which will probably also be used by the ONE
> Association. I'd like to share my thoughts on SRU and CQL. I realize
> that there are reasons for some of the things I criticize, but I thought
> that my first impressions might be valuable since I guess most of you
> have implemented and created the protocol in parallel.
First impressions are the ones that last :)
The answer to most of the questions is:
Yes that's the way it works, but don't feel like you have to implement
it.
> * caret/hat in word lists (from the examples)
> I would say that 'dc.title any "^cat ^dog eats rat"' matches "cat eats
> dog". "cat" and "eats" match part of the string, "dog" is not at the
> beginning, but since the relation is "any" it's still a match. Right?
> (But honestly, do we really really need left-anchored words in word
> lists? It's neat, but is it necessary?)
It's not necessary, but if we don't define what happens when the client
sends ^ in a word list, there'll be interoperability problems as
developers make up their own minds as to what it might mean.
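To make the question concrete, here is a rough Python sketch of the reading
proposed in the quoted example: a leading ^ anchors a word to the start of
the field, and "any" succeeds if at least one word matches. The function name
and the whitespace tokenisation are assumptions for illustration only, not
something taken from the spec.

```python
def matches_any(field_value, term_list):
    """Return True if at least one word in term_list matches field_value.

    Assumed semantics (the reading from the quoted example): a word with a
    leading '^' must be the first word of the field; an unanchored word may
    appear anywhere in the field.
    """
    field_words = field_value.lower().split()
    for term in term_list.lower().split():
        if term.startswith("^"):
            # Anchored: only the first word of the field can satisfy it.
            if field_words and field_words[0] == term[1:]:
                return True
        elif term in field_words:
            return True
    return False
```

Under this reading, matches_any("cat eats dog", "^cat ^dog eats rat") is true
because "^cat" and "eats" both match, even though "^dog" does not.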
> * the CQL BNF
> According to the current BNF "cat prox/>/2//ordered hat" is not a valid
> query since (1) a modifier cannot be just "/>" and (2) two consecutive
> slashes are not allowed. Is it supposed to be:
Yes. Where was that example? It's out of date.
It should now be:
cat prox/distance=3/unit=word/ordered hat
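As a rough illustration of why the new form is easier to handle, here is a
small Python sketch that splits a boolean token with named modifiers into its
parts. The function name and the regular expression are my own assumptions,
not taken from the BNF; it only shows that each slash-separated piece is
either a bare name or a name, a comparison and a value.

```python
import re

# Assumed shape of one modifier: name, optionally followed by a
# comparison symbol and a value (e.g. "distance=3", "unit=word", "ordered").
_MODIFIER = re.compile(r"([A-Za-z][\w.]*)(?:(<=|>=|<>|=|<|>)(.+))?")

def parse_boolean(token):
    """Split e.g. 'prox/distance=3/unit=word/ordered' into
    ('prox', [(name, comparison, value), ...])."""
    base, *raw_modifiers = token.split("/")
    modifiers = []
    for raw in raw_modifiers:
        match = _MODIFIER.fullmatch(raw)
        if match is None:
            # Covers both a bare comparison such as '>' and the empty
            # piece produced by two consecutive slashes.
            raise ValueError(f"invalid modifier: {raw!r}")
        modifiers.append(match.groups())
    return base, modifiers
```

With this sketch the new form parses into three modifiers, while the old
"prox/>/2//ordered" is rejected at the first piece.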
> The problem with Z39.50 was(/is) that you need a lot of toolkits
> implementing standards that are not widely used. If you want to write a
> server from scratch, it's quite a lot of work. Which is silly for just a
> search/retrieve protocol. And that is *before* you start worrying about
> profiles and indices. Z39.50 failed to *keep it simple*.
I agree, but would like to highlight 'not widely used'. The toolkits that
SRW relies on are pretty widespread and available for free in many
different languages.
> In my opinion the following features add to the perceived complexity
> of the protocol:
> * lists
> Confusing syntax, sometimes a string is a string, sometimes it's a list
> of (unordered) words. I would prefer an explicit list syntax. For
> example:
>
> "A B C" --> [A, B, C]
> "\"A B\" C" --> ["A B", C]
In CQL?
> * encloses and within (and partial)
> It seems to me that what you are trying to achieve is to do tests on
> server-side n-dimensional objects. Although a geometrical search engine
> would be totally awesome, I doubt that it is something that will be
> widely implemented (across different communities), and therefore should
> not be in the cql context set.
That's one of the applications of encloses/within/partial. The other
significantly more widespread use is date searching.
> If you want to access a two-dimensional object's members, couldn't you
> just use different index names? Like this:
Two reasons:
1) That multiplies date into three indexes: date, dateStart and dateEnd.
2) It doesn't actually work, as the query might be matched by two separate
dates within the same record, each of which fulfils one of the clauses but
not the other.
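A small Python sketch of point 2, using made-up data: one record holding two
date ranges, searched for a year that falls in the gap between them. The
record layout and function names are assumptions for illustration only.

```python
# Hypothetical record with two separate date ranges (start, end).
record_dates = [(1900, 1910), (1950, 1960)]

def match_split_indexes(ranges, year):
    """Emulate 'dateStart <= year and dateEnd >= year' on two flat indexes.

    Each clause is checked against every start or end value in the record,
    so the two clauses can be satisfied by different ranges.
    """
    starts = [start for start, _ in ranges]
    ends = [end for _, end in ranges]
    return any(s <= year for s in starts) and any(e >= year for e in ends)

def match_encloses(ranges, year):
    """Emulate 'date encloses year': one single range must contain it."""
    return any(start <= year <= end for start, end in ranges)
```

For the year 1930 the split-index version reports a match (1900 satisfies the
start clause, 1960 the end clause) although no range in the record contains
1930; the encloses version correctly reports no match.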
> Again, it's kinda neat, but is it really necessary?
This one really is necessary.
> * the "relevant" term function
> How can the term function order the result set? What happens if two
> terms have the "relevant" term function?
Then the relevancies are merged, and the result set is re-sorted.
> * XCQL
> Why, oh why is this necessary? If it's only used for debugging, then put
> it somewhere else, like in the diagnostics.
It's only used in the echoed request within the response now (and for
debugging).
> * prefix maps
> I honestly do not see the need for them. It makes the query harder to
> read, and since the server tells the client what, for example,
> "bath" in "bath.title" means, there is no need for the client to specify
> it. It only makes sense when the server supports more than one context
> set with the same name, forcing the client to explicitly choose the one
> that is not used by default. Forcing server implementors to *not* use
> context sets with the same prefix seems better than forcing everyone to
> handle the prefix map syntax.
Prefixes are useful in the following situation:
You have a gateway which sends the same query out to multiple servers,
which may or may not use the same default names for context sets.
In this way, you can name the context set maps yourself to ensure that dc
is dublin core, not the dark custard context set.
The counter position to this is that this is the job of profiles, and is
in the Explain record anyway.
The counter-counter argument was that the Explain record may conceivably
change between when you retrieve it and when you use the information in
it (as there are no persistent connections).
Overall, they're there because they are/were thought to be needed for
ensuring the query is interpreted as expected.
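For the gateway case, a rough Python sketch of what the client side might do:
prepend an explicit prefix assignment to the query before fanning it out, so
that every server reads "dc" the same way. The function name is hypothetical,
and the Dublin Core context set identifier shown in the usage note is given
from memory, so treat it as an assumption and check it against the context
set registry.

```python
def with_prefix_map(query, prefixes):
    """Prepend CQL prefix assignments of the form '> name="uri"' to a query.

    prefixes maps the short name used in the query to the context set
    identifier the gateway wants it bound to.
    """
    assignments = " ".join(
        f'> {name}="{uri}"' for name, uri in prefixes.items()
    )
    return f"{assignments} {query}" if assignments else query
```

Calling with_prefix_map('dc.title any fish',
{"dc": "info:srw/cql-context-set/1/dc-v1.1"}) gives
'> dc="info:srw/cql-context-set/1/dc-v1.1" dc.title any fish', which can be
sent unchanged to each server regardless of its default prefix names.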
Rob
--
,'/:. Dr Robert Sanderson ([log in to unmask])
,'-/::::. http://www.o-r-g.org/~azaroth/
,'--/::(@)::. Special Collections and Archives, extension 3142
,'---/::::::::::. Nebmedes: http://nebmedes.o-r-g.org:8000/
____/:::::::::::::.
I L L U M I N A T I