I expect they already do something to take a query, dispatch it
over several databases and give a response. As such it is a
black box, and we only have to specify how to indicate the databases to
be searched and how to recognize the originating databases in the
response. How a query is split up into different queries is the
responsibility of the target system, and we do not have to worry about
it. For the response I see three possibilities (I assume a single response):
1) Different searchRetrieveResponses in a single message
2) One searchRetrieveResponse, but with an SRW field for each record
indicating the original collection
3) The originating database is part of the metadata and we do not have
to worry at all.
I prefer 2), although 3) will always be possible.
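As a sketch of what option 2 might look like on the client side, assuming a hypothetical per-record element naming the source collection (the element names below are illustrations, not part of any published SRW binding):

```python
import xml.etree.ElementTree as ET

# Sketch only: one searchRetrieveResponse in which each record carries
# an extra field naming its originating database. The
# extraRecordData/collection element names are purely illustrative.
response = """\
<searchRetrieveResponse>
  <numberOfRecords>2</numberOfRecords>
  <records>
    <record>
      <recordData>...</recordData>
      <extraRecordData><collection>db1</collection></extraRecordData>
    </record>
    <record>
      <recordData>...</recordData>
      <extraRecordData><collection>db2</collection></extraRecordData>
    </record>
  </records>
</searchRetrieveResponse>
"""

tree = ET.fromstring(response)
# The client can group records back by originating collection.
origins = [rec.findtext("extraRecordData/collection")
           for rec in tree.iter("record")]
print(origins)  # ['db1', 'db2']
```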
For the request I do not think it will be a problem to allow an extra
parameter containing the list of databases, or a reference to the list
of databases.
In SRU it can even be solved without that: each request consists of a
base URL and the SRU parameters. We include local parameters in the
base URL. We have two parameters for this purpose: collection (specifies
the collection you want to search in) and base (specifies the base part
of the query, enabling virtual collections).
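A minimal sketch of such an SRU request, assuming the parameter names described above and a hypothetical endpoint (the exact spellings are assumptions, not a published binding):

```python
from urllib.parse import urlencode

# Sketch only: "collection" is a local parameter carried alongside the
# standard SRU parameters; endpoint and parameter names are hypothetical.
base_url = "http://example.org/sru"       # hypothetical endpoint
local = {"collection": "db1,db2"}         # databases to search
sru = {"version": "1.1",
       "operation": "searchRetrieve",
       "query": "dc.title = fish"}

request = base_url + "?" + urlencode({**local, **sru})
print(request)
```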
Theo
>>> [log in to unmask] 9/17/03 10:03:17 nm >>>
I thought the requirement was to send the *same* query to multiple
databases (at the same server). Sending *different* queries would be
complex, but I don't understand that to be a firm requirement. Is it?
And if not, wouldn't sending the *same* query be significantly simpler --
all we need to do is allow specification of multiple destinations (and
there are a number of ways to do that).
And on the response: is it necessary to bundle responses together, or
are multiple responses OK?
--Ray
----- Original Message -----
From: "Sebastian Hammer" <[log in to unmask]>
To: <[log in to unmask]>
Sent: Wednesday, September 17, 2003 1:46 PM
Subject: Agenda Item: Metasearching - Multi-database search
> Hi All,
>
> Ray asked Brad Buckley of Gale to provide his input on Metasearch-specific
> requirements. His response to Ray echoes discussion from the Metasearch
> workshop in Denver earlier this year, and I think it's a good point to raise.
>
> Content providers like Gale, which host large numbers of databases, are
> faced with a peculiar challenge from metasearch engines. Specifically, it
> involves metasearchers automatically launching parallel searches against
> several Gale-owned databases at once. This is perceived as a useful service
> by the developers and operators of metasearch engines, but it creates
> performance issues for the operator of the databases. Picture a
> situation where a single user fires off a search in a metasearch portal
> which is automatically turned into 10, 20, or 100 parallel search
> operations against the server.. imagine this kind of activity from a
> popular metasearch portal, and things might start to get pretty hot in the
> server pit at Gale (for example).
>
> In a nutshell, the proposal is for a mechanism that allows a metasearch
> engine to 'bundle' these searches into a single request, and similarly to
> allow the server to bundle the responses into a single package.
>
> We're talking about something that goes beyond mere multi-database
> searching in the original Z39.50 sense of the word.. we need individual
> hit-counts (and result sets) back from the server, and we will probably in
> some cases wish to send different query expressions to different
> databases... it comes much closer to the idea of a compound PDU model
> which was discussed for Z39.50 a while back. One way to implement this
> would be to introduce a "wrapper" element to allow multiple SRW requests
> to go in the same SOAP package. Another benefit for the content provider
> would be the ability to manage resources by refusing individual component
> requests (eg. during peak loads) without having to fail all requests.
> Presumably this would allow greater "fairness" in managing scant resources.
>
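The wrapper idea above can be sketched roughly as follows. This is an illustration only, assuming an invented `requestBundle` element; no such wrapper exists in SRW, and the child element names are likewise hypothetical:

```python
import xml.etree.ElementTree as ET

SOAP = "http://schemas.xmlsoap.org/soap/envelope/"
ET.register_namespace("soap", SOAP)

# Build one SOAP envelope whose body holds a hypothetical <requestBundle>
# wrapping several searchRetrieveRequest elements, one per target database.
envelope = ET.Element(f"{{{SOAP}}}Envelope")
body = ET.SubElement(envelope, f"{{{SOAP}}}Body")
bundle = ET.SubElement(body, "requestBundle")  # invented wrapper element

for db, query in [("db1", "dinosaur"), ("db2", "fossil")]:
    req = ET.SubElement(bundle, "searchRetrieveRequest")
    ET.SubElement(req, "database").text = db
    ET.SubElement(req, "query").text = query

# The server could then answer each component request independently --
# refusing one (e.g. under peak load) while still serving the others.
print(ET.tostring(envelope, encoding="unicode"))
```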
> I am personally ambivalent about this, and even as a metasearch engine
> developer, I feel that there are some thorny issues here. But this is an
> attempt to honestly represent a requirement that was put on the table by
> representatives of the content providers in Denver.
>
> --Sebastian
> --
> Sebastian Hammer, Index Data <http://www.indexdata.dk/>
> Ph: +45 3341 0100, Fax: +45 3341 0101