Ray asked Brad Buckley of Gale to provide his input on Metasearch-specific
requirements. His response to Ray echoes discussion from the Metasearch
workshop in Denver earlier this year, and I think it's a good point to raise.

Content providers like Gale, which hosts a large number of databases, are
faced with a peculiar challenge from metasearch engines. Specifically, it
involves metasearchers automatically launching parallel searches against
several Gale-owned databases at once. This is perceived as a useful service
by the developers and operators of metasearch engines, but it creates
performance problems for the operator of the databases. Picture a
situation where a single user fires off a search in a metasearch portal
which is automatically turned into 10, 20, or 100 parallel search
operations against the server. Imagine this kind of activity from a
popular metasearch portal, and things might start to get pretty hot in the
server pit at Gale (for example).

In a nutshell, the proposal is for a mechanism that allows a metasearch
engine to 'bundle' these searches into a single request, and similarly to
allow the server to bundle the responses into a single package.
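
As a rough illustration of the bundling idea, the sketch below groups the
component searches generated by one user query by the server that hosts
each database, so that each server sees one request instead of many. The
`ComponentSearch` type, the `bundle_by_host` helper, and the host and
database names are all hypothetical; nothing here is part of SRW.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentSearch:
    """One search against one database (hypothetical structure)."""
    host: str       # server that hosts the database
    database: str   # database name on that server
    query: str      # query expression for this database


def bundle_by_host(searches):
    """Group component searches so each host receives a single bundle."""
    bundles = defaultdict(list)
    for search in searches:
        bundles[search.host].append(search)
    return dict(bundles)


# One user query fanned out to 20 databases on one host, plus one elsewhere.
searches = [ComponentSearch("gale.example", f"db{i}", "dinosaurs") for i in range(20)]
searches.append(ComponentSearch("other.example", "catalog", "dinosaurs"))

bundles = bundle_by_host(searches)
print(len(searches), "component searches ->", len(bundles), "bundled requests")
```

Without bundling, the first host would receive 20 separate parallel
requests; with it, the host receives one request carrying 20 components.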

We're talking about something that goes beyond mere multi-database
searching in the original Z39.50 sense of the word: we need individual
hit counts (and result sets) back from the server, and in some cases we
will probably wish to send different query expressions to different
databases. It comes much closer to the idea of a compound PDU model,
which was discussed for Z39.50 a while back. One way to implement this would
be to introduce a "wrapper" element to allow multiple SRW requests to go in
the same SOAP package. Another benefit for the content provider would be
the ability to manage resources by refusing individual component requests
(e.g. during peak loads) without having to fail all requests. Presumably
this would allow greater "fairness" in managing scarce resources.
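
A minimal sketch of what such a wrapper might look like, assuming a
hypothetical `bundledRequest` element (in a made-up namespace) that carries
one `component` per database, each holding an ordinary SRW
`searchRetrieveRequest`. The wrapper and component elements, the `database`
attribute, the `capacity` limit, and the `search` callback are all
assumptions for illustration; none of them exists in SRW today.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope
SRW_NS = "http://www.loc.gov/zing/srw/"                # SRW 1.1; verify against the schema
BUNDLE_NS = "urn:example:srw-bundle"                   # hypothetical wrapper namespace


def build_bundle(components):
    """Wrap several (database, query) pairs in one SOAP envelope."""
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    wrapper = ET.SubElement(body, f"{{{BUNDLE_NS}}}bundledRequest")
    for database, query in components:
        comp = ET.SubElement(wrapper, f"{{{BUNDLE_NS}}}component",
                             {"database": database})
        req = ET.SubElement(comp, f"{{{SRW_NS}}}searchRetrieveRequest")
        ET.SubElement(req, f"{{{SRW_NS}}}version").text = "1.1"
        ET.SubElement(req, f"{{{SRW_NS}}}query").text = query
    return envelope


def handle_bundle(envelope, search, capacity):
    """Answer at most `capacity` components and refuse the rest one by one."""
    results = []
    for n, comp in enumerate(envelope.iter(f"{{{BUNDLE_NS}}}component")):
        database = comp.get("database")
        query = comp.findtext(
            f"{{{SRW_NS}}}searchRetrieveRequest/{{{SRW_NS}}}query")
        if n < capacity:
            # Each component gets its own hit count.
            results.append((database, "ok", search(database, query)))
        else:
            # Refuse this component only; the rest of the bundle still succeeds.
            results.append((database, "refused", None))
    return results


# Different query expressions can go to different databases.
fake_hits = {"db1": 12, "db2": 0, "db3": 7}  # stand-in for real searching
bundle = build_bundle([("db1", "dinosaurs"),
                       ("db2", "dinosaurs"),
                       ("db3", "dc.title = fossils")])
results = handle_bundle(bundle, lambda db, q: fake_hits[db], capacity=2)
for row in results:
    print(row)
```

Here the server answers the first two components with their own hit counts
and refuses the third, rather than failing the whole bundle. A real server
would presumably choose which components to refuse by a fairer policy than
arrival order.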

I am personally ambivalent about this, and even as a metasearch engine
developer, I feel that there are some thorny issues here. But this is an
attempt to honestly represent a requirement that was put on the table by
representatives of the content providers in Denver.

Sebastian Hammer, Index Data <http://www.indexdata.dk/>
Ph: +45 3341 0100, Fax: +45 3341 0101