On Wed, 22 Jan 2003, LeVan,Ralph wrote:
> The purpose of Scan is to provide a window into the dictionary of query
> terms for a database. It was NEVER intended to be used for thesaurus work,
> though I admit that some tried to use it for such.
Agreed.
When used with authority data, there are a lot of interesting things that
could be done with an extended scan service, but the basic functionality
is to expose the index in cursorable chunks to the client in such a way as
to enable discovery.
As the client [assuming that we agree that we do need to send all of a
searchClause] has already sent a query structure, generating the
equivalent search is trivial by replacing the term in the query with the
term from the returned list.
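
A minimal sketch of that substitution, assuming a searchClause can be modelled as an (index, relation, term) triple. The SearchClause class, the searches_for helper and the sample terms are hypothetical names for illustration only, not anything defined by the protocol:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SearchClause:
    """Hypothetical model of a searchClause: index, relation and term."""
    index: str      # e.g. "dc.subject"
    relation: str   # e.g. "="
    term: str

    def to_cql(self) -> str:
        # Quote the term, backslash-escaping embedded backslashes and quotes.
        escaped = self.term.replace("\\", "\\\\").replace('"', '\\"')
        return f'{self.index} {self.relation} "{escaped}"'


def searches_for(clause, scanned_terms):
    """Build one query per scanned term, reusing the index and relation."""
    return [replace(clause, term=t).to_cql() for t in scanned_terms]


# The clause the client sent with its scan request (illustrative values).
scan_request = SearchClause("dc.subject", "=", "cat")

# Terms as they might come back in the scan response (illustrative values).
queries = searches_for(scan_request, ["cat", "cats", "cattle"])
```

The client keeps the index and relation it already sent and swaps in only the term, so the server does not have to return a ready-made query for each entry.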
I see the generic need for a 'displayTerm' to present to end users if the
term is not readable (due to stemming, normalisation, entity substitution
or whatever).
But how to enable the interesting bits? Currently there's no way to know
if a given index is controlled vocabulary or not. There's nowhere to put
this information in the protocol, so what about having an entry in explain
which points at the related database?
For example, an lcsh subject index might have something like:
<related type="authority">http://srw.o-r-g.org:8080/lcsh/</related>
Then we don't burden the PDUs^H^H^H^H SOAP messages with extraneous
information but still alert the client that this data is structured and
there may be interesting things to be done with it.
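
As a rough sketch of the client side, assuming the 'related' element sits inside some per-index element of the explain record. The wrapping 'index' and 'title' elements and the related_databases helper are assumptions made for illustration; only the 'related' element itself is the suggestion above:

```python
import xml.etree.ElementTree as ET

# Hypothetical explain fragment: only the <related> element is the proposal,
# the surrounding <index> and <title> elements are assumed for illustration.
EXPLAIN_FRAGMENT = """
<index>
  <title>LCSH subject</title>
  <related type="authority">http://srw.o-r-g.org:8080/lcsh/</related>
</index>
"""


def related_databases(index_xml, rel_type):
    """Return the URLs of related databases of the given type."""
    root = ET.fromstring(index_xml)
    return [
        el.text.strip()
        for el in root.iter("related")
        if el.get("type") == rel_type and el.text
    ]


authority_urls = related_databases(EXPLAIN_FRAGMENT, "authority")
```

A client that finds no such entry simply treats the index as uncontrolled and falls back to plain scan.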
That said, I still like Jannifer's idea that instead of stepsize there
should be a more structured way to limit the terms returned. Perhaps a
request parameter that carries structureLevel, which would permit limiting
by the hierarchy, but otherwise still respects the idea that scan is just
a window on the index?
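
One possible reading of structureLevel, sketched under heavy assumptions: that it means the maximum number of heading components to return, and that subdivisions are marked with "--" as in LCSH-style headings. Neither the semantics nor the limit_by_level helper is defined anywhere yet, and the sample terms are made up:

```python
def limit_by_level(terms, structure_level, separator="--"):
    """Keep terms with at most `structure_level` heading components.

    Assumes (hypothetically) that subdivisions are joined by `separator`.
    """
    kept = []
    for term in terms:
        depth = len(term.split(separator))
        if depth <= structure_level:
            kept.append(term)
    return kept


# Illustrative slice of a subject index, in index order.
index_terms = [
    "Cats",
    "Cats--Anatomy",
    "Cats--Anatomy--Atlases",
    "Cats--Behavior",
    "Cattle",
]

top_level = limit_by_level(index_terms, 1)   # collapsed view
two_levels = limit_by_level(index_terms, 2)  # one heading "opened"
```

The terms stay in index order, so the result is still just a window on the index, only with the deeper entries filtered out.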
I think that between these two, everything suggested so far can be
performed.
Rob
Joe:
> > or direct searching of the used form; if there are related terms,
> > these should be available; if term is a node in a hierarchical
> > structure, it should be possible to navigate that structure. AND it
> > should provide an indication of how many documents (or, preferably
> > works) a search on the term is likely to retrieve.
Jannifer:
> > On step size, I've never used it but I think it is designed primarily
> > for subject browse. There are better ways to do an expanded and
> > collapsed scan, e.g. by browsing headings that have no subdivisions
> > then allowing them to be "opened" (Windows explorer metaphor). Step
--
,'/:. Rob Sanderson
,'-/::::. http://www.o-r-g.org/~azaroth/
,'--/::(@)::. Special Collections and Archives, extension 3142
,'---/::::::::::. Twin Cathedrals: telnet: liverpool.o-r-g.org 7777
____/:::::::::::::. WWW: http://liverpool.o-r-g.org:8000/
I L L U M I N A T I