Sebastian - Thanks. There might be a compelling case for building multi-database capability
into SRW. Then again maybe not. Ralph's insight on this would be useful and I hope he'll
weigh in here, as I strongly suspect that he would be passionately opposed to the idea. And (I
think) he was at the workshop?
From: Z39.50 Next-Generation Initiative [mailto:[log in to unmask]]On Behalf Of Sebastian Hammer
Sent: Saturday, May 24, 2003 5:15 AM
To: [log in to unmask]
Subject: Re: SRW/SRU and Metasearch products
At 19:11 23-05-2003 -0400, Ray Denenberg wrote:
>Ok let me ask my question another way. First, is there one of the
>formalizes the metasearch model? (Apologies if there is one but I can't find
I could have missed it too, but to the best of my knowledge, no such model
has been formalised. I believe that one of the short-term goals of the NISO
metasearch effort is to arrive at just such a formalization -- one that
incorporates the rights management, authentication, search and business
model requirements of the different actors. However, at present, this is
viewed as a fairly new business area and there's generally not much of an
established terminology. Even the term "metasearcher" is kind of grabbed
arbitrarily from a handful of competing terms.
>I'm assuming it's an extension of the client/server model with an
>in the middle, the "metasearcher", so the model is
>client/metasearcher/server(s), where the metasearcher is a server to the
>end-client and acts as multiple clients to multiple end-servers. (This is a
>model we've attempted to formalize in Z39.50 but never could.) And if so,
I actually think that we have succeeded extremely well in formalising one
of the central elements -- the search part... no-one challenges the fact
that Z39.50 is a comprehensive approach to solving the problem (even if the
content vendors still see it as incomplete). When Z39.50 is not at the top
of the list it is because of a perception among some people (perhaps a
majority, certainly a *vocal* majority) that Z39.50 is overly complex and
mired in obsolete technology.
>there is a search protocol between end-client and metasearcher, and a protocol
That would, in the majority of cases, be HTTP/HTML. However, it's certainly
possible to imagine that down the road, there will be a desire for a
web-services-like formalisation of the interface/API to a metasearcher.
>between metasearcher and end-servers. And these are not necessarily the same
>protocol. Are there names for these in the model, like "metasearch" protocol
>and "access" protocol? (I'm asking out of ignorance; surely there's a model
>somewhere that I've overlooked.) For the metasearch protocol, clearly the
I don't believe there is presently a hard formalisation of these terms.
Again, this forum is less than a year old and represents a get-together of
many varied, sometimes conflicting business interests attempting to find a
common platform. I'd expect to see a "vocabulary of concepts" arise from
the email/telecon work of the group over the next half year or so.
>multi-database issue is important, and for the access protocol it's not. (Does
>this make sense or am I missing the big picture?)
If by "access protocol" you mean the protocol between metasearcher and
end-servers (content providers), then you're wrong -- the multi-database
issue is seen as a core one. I know this runs against the grain of our
discussions in ZiNG, but bear in mind that the *new* players on the scene,
the large, commercial content vendors, very frequently run very large
numbers (hundreds or even tens of thousands) of logical databases. For some
of these, they allow cross-searching via their own interface, for some they
don't, for technical or other reasons. However, the metasearcher is
capable of cross-searching *any* combination of databases on their servers,
basically by emulating individual users against each one. So consider
Elsevier offering 50 different searchable, logical databases within a given
subject area. The simple-minded metasearcher is capable of launching
individual searches against all of these 50 databases in response to a
single search, and this creates a very noticeable load against the server
-- potentially a 50-fold increase in server load if a large percentage of
users access their databases through metasearch agents.
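
To make the arithmetic above concrete, here is a minimal sketch in Python. It is an illustration only, not any vendor's or metasearcher's actual code: the EndServer class, the two search functions and the "db0".."db49" names are all hypothetical, and counting incoming requests is used as a rough stand-in for server load.

```python
from dataclasses import dataclass


@dataclass
class EndServer:
    """Hypothetical content provider hosting several logical databases."""
    databases: list
    requests_handled: int = 0

    def search(self, query, databases):
        # Each call counts as one incoming request, however many
        # databases it names. Request count is only a proxy for load.
        self.requests_handled += 1
        return {db: f"results for {query!r} in {db}" for db in databases}


def fan_out_search(server, query):
    """Simple-minded metasearcher: one request per logical database."""
    results = {}
    for db in server.databases:
        results.update(server.search(query, [db]))
    return results


def multi_database_search(server, query):
    """One request naming every database (assumes the access protocol allows it)."""
    return server.search(query, list(server.databases))


names = [f"db{i}" for i in range(50)]   # 50 logical databases, as in the example
server_a = EndServer(names)
server_b = EndServer(names)

fan_out_results = fan_out_search(server_a, "heart disease")
multi_results = multi_database_search(server_b, "heart disease")

print(server_a.requests_handled)  # 50
print(server_b.requests_handled)  # 1
```

Both approaches return results for the same 50 databases; the difference is only in how many requests the end-server has to accept per user search.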
This is why, in a nutshell, there is suddenly a requirement for
multi-database searching in the end-server access protocol (for lack of a
>Which protocol are we considering when we talk about SRW in this model?
>the other or both? My reading leads me to think to me it's the metasearch
SRW in this context, I believe, is being considered only for access to the
end-servers, or content providers.
>protocol, so I'm inclined to think we should reconsider the database issue for
>SRW (which is a different view from my earlier posting now as I've re-thought
>this). I also think we should start looking at the Z39.50 dedup service,
>significant intellectual resources went into it, and nobody has
>since it's really a metasearch function.
I agree that a discussion about dedup is interesting, and that SRW could
potentially evolve into a useful protocol for user access to metasearch
engines, but that's not where the NISO group is at right now. Basically,
the metasearch engines are developed by competing companies trying to stake
out a claim in an evolving market, and there's not presently a great
interest in standardizing access to the metasearcher. Another concern is
that the type of application is still so new and poorly understood that I
think the scope of that interface would be very hard to agree on.
Sebastian Hammer, Index Data <http://www.indexdata.dk/>
Ph: +45 3341 0100, Fax: +45 3341 0101