Quoting Ray Denenberg:
>
> Dr R. Sanderson wrote:
> * Be able to specify term level data to filter on
> ...eg browse terms that match a given pattern (Sand*, R*)
I've implemented this. I support globbing on scan terms, and other operations
too, such as phonetic and thesaurus matching, since my scan is "realtime"
(it runs over all the terms in the index, in contrast to a "static dictionary").
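
A minimal sketch of what a glob-filtered scan over live index terms could look like. The in-memory index and the name scan_glob are hypothetical, chosen only for illustration; this is not the implementation described above.

```python
from fnmatch import fnmatchcase

# Hypothetical in-memory index: term -> set of record ids.
INDEX = {
    "Rand": {1, 4},
    "Sanderson": {1, 2, 3},
    "Sandoval": {2},
    "Zimmermann": {3, 5},
}

def scan_glob(index, pattern):
    """Return (term, record_count) pairs for terms matching a glob pattern.

    Terms are walked in sorted order, as a scan over an ordered term list would.
    Matching is case-sensitive (fnmatchcase).
    """
    return [
        (term, len(index[term]))
        for term in sorted(index)
        if fnmatchcase(term, pattern)
    ]

print(scan_glob(INDEX, "Sand*"))
print(scan_glob(INDEX, "R*"))
```

Because the filter runs against whatever terms are in the index at call time, no separate dictionary has to be kept in sync.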
>
> * Be able to specify term level metadata to filter on
> ...eg browse terms that occur in more than 100 records or have a term
> weight higher than a given threshold
>
I don't have this but give me 5 minutes! :-)
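
For illustration only, filtering on term-level metadata could be as small as the sketch below. It assumes a hypothetical in-memory index mapping each term to its set of record ids, and uses the record count as the only piece of term metadata; a term weight threshold would follow the same shape.

```python
# Hypothetical in-memory index: term -> set of record ids.
INDEX = {
    "Rand": {1, 4},
    "Sanderson": {1, 2, 3},
    "Sandoval": {2},
    "Zimmermann": {3, 5},
}

def scan_min_records(index, threshold):
    """Return (term, record_count) for terms occurring in more than `threshold` records."""
    return [
        (term, len(index[term]))
        for term in sorted(index)
        if len(index[term]) > threshold
    ]

print(scan_min_records(INDEX, 1))
```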
> * Be able to specify record level data/metadata to filter on
> ...eg browse terms that occur in records from a regular query
> ... and get back global and query specific term metadata
Yes. I've implemented this too. I called it ScanSearch instead of Scan :-)
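
A rough sketch of the idea of scanning only the terms that occur in the records of a query result, returning both the global and the query-specific count for each term. The index layout and the function scan_search are assumptions made for this example; they do not describe how ScanSearch is actually implemented.

```python
# Hypothetical in-memory index: term -> set of record ids.
INDEX = {
    "Rand": {1, 4},
    "Sanderson": {1, 2, 3},
    "Sandoval": {2},
    "Zimmermann": {3, 5},
}

def scan_search(index, result_set):
    """Scan terms restricted to the records in `result_set`.

    Returns (term, global_count, count_within_result_set) for each term
    that occurs in at least one record of the result set.
    """
    rows = []
    for term in sorted(index):
        hits = index[term] & result_set  # records shared with the query result
        if hits:
            rows.append((term, len(index[term]), len(hits)))
    return rows

# Suppose a regular query returned records 2 and 3.
print(scan_search(INDEX, {2, 3}))
```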
> Over the years, both for Z39.50 and to some extent SRU, people have proposed
> functionality similar to this (derived indexes) and the response seems to
> always have been these are sound theoretical functions but nobody will
> implement them. Has that changed?
I don't know. I've not only implemented them but use them too!
On the query level I've tried to make my scan "smart" and good at guessing
what kind of scan people are trying to execute. It's not hard.
> --Ray
>
I don't think it's a good idea to try to clump scan and search together since
they are really different.
- Scan is looking for search terms (that can be used to create queries)
- Search is looking for a list of search results (these MIGHT be used to
create queries, however, as in "query by example" or "relevance feedback")
-- not really the topic, but I'd like to emphasize again that we should not
view the result of a search as a static body, but understand that the unit
of retrieval might be a unit of appropriate information (a fragment) and
not always the whole corpus. In this regard we see even more clearly how
searching for information and searching for terms are very different and will,
as we bring this thing along, have very different technical demands.
They might both have the word "search" in them, but that's really insufficient
reason, I think, to throw them into the same sink. Worse still, I think, it
could create more complications and confusion than "simplicity".
--
E. Zimmermann, BSn/Munich R&D Unit
Leopoldstrasse 53-55, D-80802 Munich,
Federal Republic of Germany
http://www.nonmonotonic.net