There is one server talking SRU to the user. This server accesses a
database and an index via whatever protocol, and the indexer does not
store the actual records. Parsing can work the way you describe. I
think it is just a matter of taste whether or not to make a
distinction between searching in an index and retrieving the actual
record. It is not a real issue, so I accept it the way it is.
>>> [log in to unmask] 06/03 4:17 >>>
> My main reason for a distinction between requesting records via SRW and
> searching via CQL is that at least in our case both are requests to
> different servers and I expect that there will be many more situations
> in which the index server that gets the CQL query is quite different
> from the database server(s) that gets the record request. In that
> situation the CQL may have to be parsed twice.
I don't understand.
You have an architecture where one server doesn't know anything about
records, but does have access to the indexes generated from the
record data and metadata.
Then there's another server which has access to the records, but not
indexes. On a search request, you parse the query, consult the indexes
and get back ... a list of matching document identifiers? ... and then
need to retrieve those records from the second database.
Why are you using SRW for the record database at all if it doesn't support
searching? If you're going to use URIs as identifiers, why have a
/database/ at all, just mirror them into a webserver and retrieve them
using the URI. If it doesn't support CQL, then it's not SRW /anyway/
according to the minimum specifications.
Or is the search server a metasearcher, which somehow indexes the
documents but doesn't store them, and the record server is a full SRW
engine but for a limited number of the records indexed by the search server?
Either way, the computational expense of parsing the CQL is negligible
compared to the other expenses of such a transaction (host name lookup,
socket connect, database lookup of record, etc.).
There are no booleans to process, and rec.id exact "foo" is just one rule
to use. Assuming that your CQL parser is similar to mine, Adam's and
Mike's, it'd go something like:
Lookahead one token
Second token is a relation, therefore first is an index and third is a term
Instantiate internal searchClause representation
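
To make those three steps concrete, here is a rough sketch in Python. It is not taken from any of the parsers mentioned above: the names SearchClause, RELATIONS, tokenize and parse_search_clause are hypothetical, the relation list is abbreviated, and the cql.serverChoice default for a bare term is an assumption. It handles a single search clause only (no booleans, no relation modifiers).

```python
import re
from dataclasses import dataclass

# Abbreviated, hypothetical relation set; a real parser would know more.
RELATIONS = {"=", "==", "<>", "<", ">", "<=", ">=", "exact", "any", "all"}

# A token is a quoted string, a symbolic relation, or a bare word.
TOKEN_RE = re.compile(r'"(?:[^"\\]|\\.)*"|<=|>=|<>|==|[=<>]|[^\s=<>"]+')


@dataclass
class SearchClause:
    """Internal representation of one index/relation/term triple."""
    index: str
    relation: str
    term: str


def tokenize(query):
    """Split a query string into quoted strings, relations and words."""
    return TOKEN_RE.findall(query)


def unquote(token):
    """Strip surrounding double quotes and unescape embedded ones."""
    if len(token) >= 2 and token[0] == '"' and token[-1] == '"':
        return token[1:-1].replace('\\"', '"')
    return token


def parse_search_clause(query):
    """Parse a single clause such as: rec.id exact "foo"."""
    tokens = tokenize(query)
    # Look ahead one token: if the second token is a relation, then the
    # first is an index and the third is a term.
    if len(tokens) == 3 and tokens[1].lower() in RELATIONS:
        return SearchClause(tokens[0], tokens[1].lower(), unquote(tokens[2]))
    # A bare term: fall back to an assumed default index and relation.
    if len(tokens) == 1:
        return SearchClause("cql.serverChoice", "=", unquote(tokens[0]))
    raise ValueError("not a single search clause: %r" % query)
```

Calling parse_search_clause('rec.id exact "foo"') does one regex pass and one set lookup before building the clause, which is the sense in which the parsing cost is small next to a network round trip.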
,'/:. Rob Sanderson ([log in to unmask])
,'--/::(@)::. Special Collections and Archives, extension 3142
,'---/::::::::::. Twin Cathedrals: telnet: liverpool.o-r-g.org
____/:::::::::::::. WWW: http://liverpool.o-r-g.org:8000/
I L L U M I N A T I