There is one server talking SRU to the user. This server accesses a
database and an index via whatever protocol, and the indexer does not
store the actual records. Parsing can work the way you describe. I
think it is just a matter of taste whether or not to make a
distinction between searching in an index and retrieving the actual
record. It is not a real issue, so I accept it the way it is.

Theo


>>> [log in to unmask] 06/03 4:17  >>>
> My main reason for a distinction between requesting records via SRW and
> searching via CQL is that at least in our case both are requests for
> different servers and I expect that there will be many more situations
> in which the index server that gets the CQL query is quite different
> from the database server(s) that gets the record request. In the current
> situation the CQL may have to be parsed twice.

I don't understand.

You have an architecture where one server doesn't know anything about the
records, but does have access to the indexes generated from the
record data and metadata.
Then there's another server which has access to the records, but not the
indexes.  On a search request, you parse the query, consult the indexes
and get back ... a list of matching document identifiers? ... and then
need to retrieve those records from the second database.

Why are you using SRW for the record database at all if it doesn't support
searching? If you're going to use URIs as identifiers, why have a
/database/ at all, just mirror them into a webserver and retrieve them
using the URI.  If it doesn't support CQL, then it's not SRW /anyway/
according to the minimum specifications.

Or is the search server a metasearcher, which somehow indexes the
documents but doesn't store them, and the record server is a full SRW
engine but for a limited number of the records indexed by the
metasearcher?

Either way, the computational expense of parsing the CQL is negligible
compared to the other expenses of such a transaction (host name lookup,
socket connect, database lookup of record, etc.).
There are no booleans to process, and rec.id exact "foo" is just one rule to
use.  Assuming that your CQL parser is similar to mine, Adam's and
Mike's, it'd go something like:

Instantiate parser
Lookahead one token
Second token is a relation, therefore first is an index and third is a term.
Instantiate internal searchClause representation
Return representation
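
The steps above could be sketched roughly as follows in Python. This is only an illustration, not the code of any of the parsers mentioned: the names SearchClause, tokenize, parse_search_clause and the RELATIONS set are made up for the example, the relation list is an assumed subset, and booleans, prefixes and relation modifiers are left out entirely.

```python
import re
from dataclasses import dataclass

# Assumed subset of relations, for illustration only; real CQL has more
# relations plus modifiers, none of which are handled here.
RELATIONS = {"=", "==", "<>", "<", ">", "<=", ">=", "exact", "any", "all"}

# One token is a double-quoted string, a symbolic relation, or a bare word.
TOKEN_RE = re.compile(r'"((?:[^"\\]|\\.)*)"|(<=|>=|<>|==|[=<>])|([^\s"=<>]+)')


@dataclass
class SearchClause:
    """Hypothetical internal representation of a single search clause."""
    index: str
    relation: str
    term: str


def tokenize(query):
    """Split a query into tokens, stripping the quotes from quoted terms."""
    tokens = []
    for match in TOKEN_RE.finditer(query):
        quoted, symbol, word = match.groups()
        if quoted is not None:
            tokens.append(quoted)
        elif symbol is not None:
            tokens.append(symbol)
        else:
            tokens.append(word)
    return tokens


def parse_search_clause(query):
    """Parse 'index relation term' with one token of lookahead."""
    tokens = tokenize(query)
    # Look ahead one token: if the second token is a relation, the first
    # must be an index and the third must be the term.
    if len(tokens) == 3 and tokens[1].lower() in RELATIONS:
        return SearchClause(index=tokens[0], relation=tokens[1], term=tokens[2])
    raise ValueError("this sketch only handles 'index relation term': %r" % query)


clause = parse_search_clause('rec.id exact "foo"')
print(clause.index, clause.relation, clause.term)
```

For a record request such as rec.id exact "foo", the work amounts to one regular-expression pass and one set lookup, which is small next to a socket connect or a database lookup.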

Rob

--
      ,'/:.          Rob Sanderson ([log in to unmask])
    ,'-/::::.        http://www.o-r-g.org/~azaroth/
  ,'--/::(@)::.      Special Collections and Archives, extension 3142
,'---/::::::::::.    Twin Cathedrals:  telnet: liverpool.o-r-g.org 7777
____/:::::::::::::.              WWW:  http://liverpool.o-r-g.org:8000/

I L L U M I N A T I