Comments from another lurker.
Metasearch engines generally do search over diverse sources, so any
commonality (from Z39.50 or SRU/W, etc.) is only a bonus; it cannot be
considered the norm. Consequently the engines will have to undertake all the
heavy lifting for sites which do their own thing anyway.
Marc is correct that there is still the issue of what to do with the
results. It is bad enough at the moment. Adding a set of results of unknown
type (from a set of say 5) would mean a lot more conversion logic in the
engine.
Supporting this functionality at the search engine (target) side will mean
quite major extensions to the processing for those systems to choose the
"best" action to take. Unfortunately this may also be client dependent, so
things could move away from a decent result towards something very much less
desirable.
One of the perennial big complaints from the content providers is that the
metasearch engines are "dumbing down" both their search capabilities and the
quality of the data returned. Unfortunately this proposal could have a bad
effect in both of these areas, so I think the content providers would not be
too excited about it. For us as a metasearch engine provider, it will not
really help either, as we have to take account of the non-standard guys as well.
I also can't quite see how any of this could be considered as a computing
grid arrangement, but that seems like another thread.
I will be in NC so there will be plenty of us to discuss this in person.
Peter Noerr
MuseGlobal
-----Original Message-----
From: Z39.50 Next-Generation Initiative [mailto:[log in to unmask]]On Behalf Of
Marc Cromme
Sent: Tuesday, April 20, 2004 8:04 AM
To: [log in to unmask]
Subject: Re: metasearch
Interesting proposal
I'll comment in between your lines:
Theo van Veen wrote:
>Currently it is not easy with SRU/W to broadcast the same query to many
>SRU/W servers because one has to take into account all the differences
>between different servers.
>
Definitely true - this is due to the fact that SRW is a client-server
protocol, whereas metasearch broadcasting as you are describing it is more
a grid computing task. Currently, a client has to have good knowledge of
the multiple servers asked, and one has to program the metasearch client
logic according to the capabilities of the SRW servers used.
The real solution might be an SRW-like grid computing protocol, not an
extension to a client-server protocol.
>Especially in metasearching I think it would
>be convenient if there were a possibility to send a query saying "give
>me what is closest to this query" and allow different servers to respond
>with a server's choice according to one or more predefined responses. The
>responses could, among others, be:
>1) searchRetrieveResponse
>2) scanResponse
>3) results of a fuzzy match
>4) number of hits for different access points
>5) etc.
>
>Without having to find out how to translate a query for different
>targets, such a "give me the best you can" request returns one or more
>response blocks and the client can use the ones that it understands to
>generate guidance to the user to improve his search.
>
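To make the "use the ones that it understands" idea concrete, here is a
minimal sketch of a client that dispatches on the type of each response
block and silently skips types it does not know. Everything in it is an
assumption for illustration: the block kind names (taken loosely from the
list above), the payload fields, and the handler functions are hypothetical
and not part of SRU/SRW.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical response block: (kind, payload). Not an SRU/SRW structure.
Block = Tuple[str, dict]


def handle_search(payload: dict) -> str:
    return f"{payload['hits']} records found"


def handle_scan(payload: dict) -> str:
    return "Nearby terms: " + ", ".join(payload["terms"])


def handle_hit_counts(payload: dict) -> str:
    parts = [f"{index}: {count}" for index, count in sorted(payload["counts"].items())]
    return "Hits per access point: " + ", ".join(parts)


# The client registers only the block kinds it understands.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "searchRetrieveResponse": handle_search,
    "scanResponse": handle_scan,
    "hitCounts": handle_hit_counts,
}


def guidance(blocks: List[Block]) -> List[str]:
    """Turn the understood blocks into user guidance; ignore the rest."""
    messages = []
    for kind, payload in blocks:
        handler = HANDLERS.get(kind)
        if handler is None:
            continue  # unknown block kind: skip rather than fail
        messages.append(handler(payload))
    return messages
```

Note that this only covers generating guidance from one server's answer;
it does not address merging answers across servers.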
I think a give-me-the-best-you-can answer does not resolve the problem,
since the client still must merge result sets from multiple servers, and
does not even know if the server considered the hit set to be "the real
thing" or "just the best I can". The problem is still the logic inside
the client - how does it know what to merge, and how?
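As a rough illustration of the merge problem: if each server's answer
carried an explicit flag saying whether it was an exact answer or a
best-effort one, the client could at least keep the two apart when merging.
The sketch below assumes such a flag exists; the `ServerResult` type and its
`exact` field are hypothetical and not something SRW provides today.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ServerResult:
    server: str
    exact: bool          # hypothetical flag: True = "the real thing"
    records: List[str]   # record identifiers, simplified to strings


def merge(results: List[ServerResult]) -> Dict[str, List[str]]:
    """Merge per-server results, keeping exact and best-effort hits apart."""
    merged: Dict[str, List[str]] = {"exact": [], "best_effort": []}
    for result in results:
        bucket = merged["exact" if result.exact else "best_effort"]
        for record in result.records:
            if record not in bucket:  # drop duplicates within a bucket
                bucket.append(record)
    return merged
```

Without such a flag the client has no basis for choosing a bucket, which is
exactly the difficulty described above.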
>It is not the same
>as the "x-scanOnSearchFail" parameter, because it can also apply to
>other situations. For example when there are thousands of hits a server
>could provide a response block in which it gives the number of hits for
>different indexes. The client can use this to propose new searches, even
>with indexes that it would not have offered otherwise.
>
>
>
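The hits-per-index idea can be sketched as follows: given the counts a
server reports for each index, the client proposes narrower searches on the
indexes whose counts are non-zero and manageable. The function name, the
`max_hits` threshold, and the example index names are assumptions for
illustration; the proposed queries use the simple CQL form
`index = "term"`.

```python
from typing import Dict, List


def propose_searches(term: str, counts: Dict[str, int], max_hits: int = 500) -> List[str]:
    """Suggest CQL queries for indexes with a usable number of hits.

    Indexes with zero hits or more than max_hits are left out; the
    remaining ones are ordered from fewest hits to most.
    """
    usable = [(count, index) for index, count in counts.items() if 0 < count <= max_hits]
    return [f'{index} = "{term}"' for count, index in sorted(usable)]
```

For example, with hypothetical counts of 120 for `dc.title`, 15 for
`dc.creator`, 0 for `dc.subject` and 5000 for `cql.anywhere`, the client
would offer the `dc.creator` search first and the `dc.title` search second.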
This means spreading the indexes from server to client - IMHO it smells
like distributed hash tables or distributed inverted indexes in grid and
peer-to-peer networks. If you do not want to write a considerable amount
of logic for each client, it might be better to throw out the
client-server philosophy entirely and let the network merge and keep
track of indexes. But this will not be possible with a client-server
centric protocol like SRW/SRU.
>I remember having proposed something like this earlier and we will
>implement this as a private extension. However, in the context of the
>NISO metasearch meeting there may be more support for this concept.
>
>1) Who would support a proposal for extending SRU/SRW with such an
>operation?
>2) Should this be done via a new x-parameter or via a new operation?
>
>BTW Who of this group is attending the NISO metasearch meeting?
>
>Theo
>
>
>
That said, I would like to see more specifically how you plan to do your
extensions. Then I would probably understand your motivation better.
I look forward to hearing more about your ideas.
Marc Cromme, Index Data